Fit / Conformer Generation for Long Flexible Molecules #255

Open
MaTiHo opened this issue Apr 26, 2023 · 1 comment

MaTiHo commented Apr 26, 2023

Dear OpenFF-Bespokefit Team,
First and foremost I would like to say thank you for the great work you did with BespokeFit!

I would like to ask for your help with a problem I ran into while testing openff-bespokefit:

For long alkyl chains, conformer generation within the fitting workflow throws an error. It most likely stems from the same problem one faces when using RDKit directly (see, for example, rdkit/rdkit#3323).
One workaround that seems to work within RDKit is increasing the maxAttempts keyword:
AllChem.EmbedMolecule(molecule, maxAttempts=50000)
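
For anyone who wants to try the RDKit-side workaround in isolation, here is a minimal sketch. It assumes undecane (`CCCCCCCCCCC`) as a stand-in, since the attached molecule is not reproduced here, and the helper name `embed_with_retries` is hypothetical. The `useRandomCoords=True` fallback is a swapped-in extra that is often suggested for flexible molecules; it is not something BespokeFit itself does.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_with_retries(smiles: str, max_attempts: int = 50000, seed: int = 42) -> Chem.Mol:
    """Embed a single 3D conformer, retrying with random start coordinates.

    Hypothetical helper for illustration only.
    """
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))

    # First try: standard distance-geometry embedding with a raised attempt limit.
    conf_id = AllChem.EmbedMolecule(mol, maxAttempts=max_attempts, randomSeed=seed)

    if conf_id == -1:
        # EmbedMolecule returns -1 on failure; retry from random coordinates.
        conf_id = AllChem.EmbedMolecule(
            mol, maxAttempts=max_attempts, randomSeed=seed, useRandomCoords=True
        )

    if conf_id == -1:
        raise RuntimeError(f"Could not embed {smiles!r} after {max_attempts} attempts")
    return mol


# Stand-in molecule (assumption): an unbranched C11 alkane.
mol = embed_with_retries("CCCCCCCCCCC", max_attempts=1000)
print(mol.GetNumConformers())
```

If this succeeds on the real molecule, the failure is more likely in how the workflow calls the embedding than in RDKit being unable to embed the structure at all.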

I have attached the JSON output and the molecule file below. This is the BespokeFit command I ran:

openff-bespoke executor run --file "tmpC11.sdf" \
    --workflow "default" \
    --force-field "sage-2.1.0rc.offxml" \
    --output "acetaminophen.json" \
    --output-force-field "acetaminophen.offxml" \
    --n-qc-compute-workers 2 \
    --qc-compute-n-cores 3 \
    --default-qc-spec xtb gfn1xtb none

Thank you very much in advance!

output.txt
tmpC11.txt

@mattwthompson
Member

I can reproduce your fragmentation error, but not with OpenEye Toolkits installed. Something is going wrong with fragmentation, but I haven't isolated exactly where.

The toolkit can generate conformers for this molecule using RDKit, albeit slower:

In [1]: from openff.toolkit.utils import *

In [2]: from openff.toolkit import Molecule

In [3]: molecule = Molecule('tmpC11.sdf')

In [4]: %%timeit
   ...: molecule.generate_conformers(n_conformers=800, toolkit_registry=OpenEyeToolkitWrapper())
   ...:
   ...:
6.35 s ± 85.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %%timeit
   ...: molecule.generate_conformers(n_conformers=800, toolkit_registry=RDKitToolkitWrapper())
   ...:
   ...:
2min 17s ± 3.45 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
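
To put the two timings side by side, the arithmetic can be sketched as below. The numbers are copied from the `%%timeit` output above; 800 is the requested `n_conformers`, and the number of conformers actually kept may be smaller, so the per-conformer figures are only a rough guide.

```python
# Mean wall time per loop, copied from the %%timeit output above.
N_CONFORMERS = 800              # requested n_conformers, not necessarily the number kept
openeye_seconds = 6.35
rdkit_seconds = 2 * 60 + 17     # "2min 17s"

# Ratio of RDKit to OpenEye wall time.
slowdown = rdkit_seconds / openeye_seconds

# Approximate cost per requested conformer, in milliseconds.
openeye_ms_per_conformer = 1000 * openeye_seconds / N_CONFORMERS
rdkit_ms_per_conformer = 1000 * rdkit_seconds / N_CONFORMERS

print(f"RDKit / OpenEye wall time: {slowdown:.1f}x")
print(f"OpenEye: {openeye_ms_per_conformer:.1f} ms per requested conformer")
print(f"RDKit:   {rdkit_ms_per_conformer:.1f} ms per requested conformer")
```

So RDKit is roughly 22 times slower on this input, but it does finish.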

I hoped the toolkit would crash on this input, since that would make the fix straightforward. The toolkit's call to AllChem.EmbedMultipleConfs might have room for improvement here, but that's not obviously the issue.

@Yoshanuikabundi is there some magic happening in the fragmentation pathway? I can only find a tiny amount of magic that might cause something like this, and nothing that obviously makes big changes to the molecule. Molecule.generate_conformers is only called once in that package.

https://github.com/openforcefield/openff-fragmenter/blob/0.2.0/openff/fragmenter/chemi.py#L96-L100
