Challenges with Nomenclature and Compatibility Matrix in Molecule Building Blocks #121

Open
ggoetten opened this issue Apr 24, 2023 · 6 comments

@ggoetten

Dear author,

I appreciate your hard work on the software and the regular updates. However, I have been encountering issues with the nomenclature of the building blocks, which has caused me some confusion. Although you have published a previous paper on the subject, the compatibility matrix is not very intuitive, and it requires some effort to understand which molecules match with which APClasses. This can lead to errors, and there is no easy way to check if we have assigned the correct links to the molecule.

In Tutorial 1.2, you provide examples of all the APClasses required for a new compatibility matrix and import another file without explaining why it is necessary to do so (Pt-Co fragments). However, the most challenging part for me is understanding the compatibility rules. The tutorial adds all these rules, but the nomenclature is complicated, making it difficult to understand why we are adding them in the first place. The same is true for the Forbidden Ends.

In conclusion, I suggest adding more examples of organic molecules to the tutorial, in addition to the organometallic examples. While the fragments provide an accurate depiction of APClass for organometallic complexes, there is much greater flexibility with organic compounds, and this is not currently addressed in the tutorial. Overall, I believe that these suggestions will help improve the user experience of the software and make it more accessible to a wider audience.

Regards.

@marco-foscato
Member

Hi,

Thanks for your constructive comments! And, yes, I agree: the compatibility of APClasses and the associated machinery are not the most straightforward thing to understand. I'll try to improve Tutorial 1.2 according to your comments.

More generally: for a standard organic-chemistry use case, the definition of APClasses and compatibility rules should ideally be part of the default settings of the software. In a parallel repository, we have been working on defining such defaults for organic chemistry as well, and the results should be ready in a few weeks.

I'll post an update here when I have news on either topic.

Marco

@marco-foscato marco-foscato self-assigned this Apr 24, 2023
@marco-foscato marco-foscato added the documentation Improvements or additions to documentation label Apr 24, 2023
@marco-foscato
Member

Added more explanation. See denoptim-project/tutorials@c033a2a

@ggoetten
Author

ggoetten commented Apr 27, 2023

Dear Marco,

I believe that the explanation you provided has been significantly improved and is now much clearer.

I have some other questions, and if you could provide an answer it would be great. I was hoping to use your software to produce a route for obtaining the compound Bicyclopentadione after the photodegradation of curcumin. However, I encountered an issue with the fragmentation of the structure. Since curcumin contains aromatic rings, the fragmentation does not recognize them as individual fragments and incorporates them into the final structure. I saw a previous post where you discussed the applicability of a scaffold field. Have you considered adding this field to the software? Alternatively, there is an open-source software called MORTAR that fragments molecules by different methods and includes a scaffold field https://github.com/FelixBaensch/MORTAR.

I also have a question about using aromatic rings as scaffolds for virtual screening. When I attempted to run the software with an aromatic ring as a scaffold, the ring structure did not form more complex structures and remained stable. I believe this may be due to the need to assign cyclic graphs, which requires Tinker.

Lastly, I was wondering how to assemble structures into JSON for the initial population in GA experiments. I have tried using the "Make fragments" function and importing files, but it has not been successful. Can you provide guidance on the correct procedure?

Thank you for your time, and I apologize for the multiple questions.

Best regards,

@marco-foscato
Member

Uhm... you do have quite a few things to discuss: OK, you asked for it! ;-)
I'll try to address them one by one, following the same order as in your message.

About producing a "route" from curcumin (CM) to bicyclopentadione (BCP). To test my understanding of this use case, I'll try to reformulate your needs: you want to use the fragmentation feature to break CM into "fragments" that you later combine to generate potential intermediates along the photodegradation of CM to BCP. If this is the case, I would say our fragmentation is not what you want to use. This is because our "fragments" do not rearrange their (implicit) electrons in any way: they are meant to be stiff arrangements of atoms. There cannot be a change of hybridization, charge, or connectivity within such stiff fragments.

About "the fragmentation does not recognize them as individual fragments". This could mean many things, and I'd need to reproduce your experience to really understand what it is. Still, here are a couple of hints:

  • the fragmentation uses cutting rules, but none of the loaded cutting rules matches the bond that you would expect to cut. This could also derive from inconsistent handling of H atoms: our default rules assume explicit H, and using them on structures with implicit H can lead to unexpected results.
  • bonds are matched by cutting rules, but not as many as needed to generate isolated (i.e., continuously connected) fragments. In that case the cut is not performed, to avoid generating fragments in which one or more pairs of attachment points come from the same cut. For example, cutting one bond inside a ring is not sufficient to make two fragments, so that cut leads to fragments that are rejected.
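
The ring example in the second hint can be illustrated with a minimal sketch in plain Python. This is not DENOPTIM code and does not reflect how DENOPTIM implements fragmentation: the helper names `count_components` and `cut_makes_fragments` and the toluene-like heavy-atom graph are assumptions made only for illustration. It simply checks whether removing one bond increases the number of connected pieces of a molecular graph.

```python
from collections import defaultdict


def count_components(atoms, bonds):
    """Count connected components of an undirected molecular graph."""
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = 0
    for start in atoms:
        if start in seen:
            continue
        components += 1
        stack = [start]
        # Depth-first walk over everything reachable from `start`.
        while stack:
            atom = stack.pop()
            if atom in seen:
                continue
            seen.add(atom)
            stack.extend(neighbours[atom] - seen)
    return components


def cut_makes_fragments(atoms, bonds, cut):
    """Return True if removing the bond `cut` splits the graph further."""
    remaining = [b for b in bonds if set(b) != set(cut)]
    return count_components(atoms, remaining) > count_components(atoms, bonds)


# Hypothetical toluene-like heavy-atom graph:
# ring atoms 0-5, plus a methyl carbon (6) attached to ring atom 0.
atoms = list(range(7))
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]

ring_cut = cut_makes_fragments(atoms, bonds, (0, 1))   # bond inside the ring
chain_cut = cut_makes_fragments(atoms, bonds, (0, 6))  # ring-to-methyl bond
print(ring_cut, chain_cut)
```

In this sketch the single in-ring cut leaves one connected piece, whereas cutting the exocyclic bond yields two, which is the situation described above where a matched ring bond alone cannot produce separate fragments.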

Then you mention "scaffold field". I'm not sure, but you are possibly hinting at the desire to treat some fragments (what in med chem one would call scaffolds) differently from others. Although I see the potential utility, this approach is strongly biased toward specific use cases, which are not our first priority right now. So, if you see a use case, please open a new issue describing the proposed feature in detail.

Next, about ring formation and Tinker. Tinker is needed only if you want to build 3D models or check ring closability in 3D. If you choose "Constitution" as the ring-closability condition, you should be good to go without Tinker. Still, you would need to set up the ring-closing compatibility matrix (in addition to the usual compatibility matrix), have ring-closing vertexes, and explicitly allow denoptim to form cyclic graphs. Also, graph Templates (see tutorial 3.0) come in very handy when dealing with ring systems.

Lastly, "assemble structures into JSON for the initial population". If you have a set of cutting rules that do what you need (see the point above about curcumin), you can convert molecular representations into graphs directly. This is a new feature of version 4.1. See the upgraded tutorial 1.1 (it is just a couple of days old, so you may not have seen it before) and the corresponding GA-InitMolsToFragmentFile keyword for GA runs.
Still, let me stress that this works only to the extent that the cutting rules are suitable for your chemistry: this is a critical requirement to verify. If the default cutting rules are not a good fit for your use case, which sounds very likely to me given the curcumin project you mentioned, then you should use other cutting rules. As I mentioned in my first message, we are about to release a new set of cutting rules that are better suited for organic chemistry. So, hold on a few more days to try those...

Cheers

@ggoetten
Author

Dear Marco,

Thank you for your detailed response, which addressed all my doubts even though my initial message was not very clear.

I appreciate your insight regarding the fragmentation method and the fact that the current version may not be suited to my needs. I will wait for the new set of cutting rules before proceeding further, and I am also interested in trying the method for converting structures into JSON for the initial population.

Regarding the scaffold field, I understand that it is not a priority for your team at the moment, and I appreciate your suggestion to write a new issue if I see a use case in the future.

Also, please let me know if it would be okay to close this thread after I try out the new cutting rules and provide my final comments. Thank you again for your help.

Best regards.

@marco-foscato
Member

Hi,
I know I said "a few days" and that a few days have passed, but I had to prioritize other things. I hope to deal with this next week. Sorry for the delay.

Marco
