Can't kekulize mol #917

Closed
UnixJunkie opened this issue May 18, 2016 · 13 comments

@UnixJunkie
Collaborator

Using the following script:

#!/usr/bin/env python2

# output the MACCS bitstring of each molecule found in a MOL2 file

import rdkit.Chem
import sys

def RetrieveMol2Block(fileLikeObject, delimiter="@<TRIPOS>MOLECULE"):
    """generator which retrieves one mol2 block at a time
    """
    mol2 = []
    for line in fileLikeObject:
        if line.startswith(delimiter) and mol2:
            yield "".join(mol2)
            mol2 = []
        mol2.append(line)
    if mol2:
        yield "".join(mol2)

from rdkit.Chem import MACCSkeys
with open(sys.argv[1]) as in_file:
    problem_mols = open('problem.mol2', 'w')
    for mol2 in RetrieveMol2Block(in_file):
        mol = rdkit.Chem.MolFromMol2Block(mol2)
        try:
            maccs = MACCSkeys.GenMACCSKeys(mol)
            for bit in maccs:
                if bit:
                    sys.stdout.write('1')
                else:
                    sys.stdout.write('0')
            sys.stdout.write('\n')
        except Exception:  # mol is None when parsing fails, so GenMACCSKeys raises
            sys.stdout.write('0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n')
            problem_mols.write(mol2)
    problem_mols.close()

and rdkit-Release_2016_03_1, I got errors for all of the following molecules:
https://gist.github.com/UnixJunkie/c8c500f9b18d80daf59d0990c8bc964e
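The script above relies on the exception raised when a failed parse is passed on to the fingerprint code. A minimal sketch of a more explicit alternative follows, assuming only that the parser returns `None` for input it cannot handle (which is how `rdkit.Chem.MolFromMol2Block` reports failure). The names `partition_blocks` and `fake_parse` are hypothetical, and `fake_parse` is a stand-in so the example runs without RDKit.

```python
def partition_blocks(blocks, parse):
    """Split blocks into (parsed, failed) using the supplied parser.

    `parse` is expected to return None for input it cannot handle.
    """
    parsed, failed = [], []
    for block in blocks:
        mol = parse(block)
        if mol is None:
            # Keep the raw text so it can be written to a problem file later.
            failed.append(block)
        else:
            parsed.append((block, mol))
    return parsed, failed


def fake_parse(block):
    """Hypothetical stand-in parser: rejects any block containing 'BAD'."""
    return None if "BAD" in block else block.upper()


parsed, failed = partition_blocks(["ok one", "BAD two", "ok three"], fake_parse)
print("%d parsed, %d failed" % (len(parsed), len(failed)))
```

In the real script, `parse` would be `rdkit.Chem.MolFromMol2Block` and `blocks` would come from `RetrieveMol2Block(in_file)`; the fingerprint would then be computed only for the parsed molecules.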

@UnixJunkie
Collaborator Author

Is this a known problem?
Is my bug report incomplete or incorrect in some way?

@greglandrum
Member

Just busy and haven't had a chance to take a look at it.

@UnixJunkie
Copy link
Collaborator Author

OK. For the moment, I will just ignore those molecules.
I hope they will be handled in future versions of RDKit.

@greglandrum
Copy link
Member

One piece of information that would really help: which piece of software produced the mol2 files?

@UnixJunkie
Copy link
Collaborator Author

Conformer generation was performed with OMEGA (from OpenEye).

@greglandrum
Copy link
Member

And that wrote the mol2 file?

@UnixJunkie
Copy link
Collaborator Author

I hope so.

@UnixJunkie
Copy link
Collaborator Author

Hmm. Let me think; there may have been an additional pass with Open Babel to ensure the partial charges were Gasteiger ones.

@greglandrum
Copy link
Member

The reader is not super robust. It really expects that the input files have atom types that match what Corina produces. There's a bit of documentation of that here: http://www.rdkit.org/Python_Docs/rdkit.Chem.rdmolfiles-module.html#MolFromMol2File

@greglandrum
Copy link
Member

If possible, you will have much better luck creating molecules from a Mol (or SDF) file.

@UnixJunkie
Copy link
Collaborator Author

Thanks for the tip.
Maybe the error message should be more explicit (which atom type was not understood,
or even the full problematic atom line from the MOL2 file).
This would give users a chance to fix their input file.
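Until the reader can report this itself, the atom types in a failing block can be listed by hand. The following is a rough sketch, not part of RDKit: it assumes the usual Tripos layout in which the atom type is the sixth whitespace-separated field of each line in the `@<TRIPOS>ATOM` section, and it will misread atom names that contain spaces. The function name `mol2_atom_types` and the example fragment are made up for illustration.

```python
from collections import Counter


def mol2_atom_types(mol2_block):
    """Count the atom types found in the @<TRIPOS>ATOM section of a mol2 block."""
    counts = Counter()
    in_atoms = False
    for line in mol2_block.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Only lines between @<TRIPOS>ATOM and the next record are atoms.
            in_atoms = (stripped == "@<TRIPOS>ATOM")
            continue
        if not in_atoms or not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) >= 6:
            # Fields: id, name, x, y, z, type, ...
            counts[fields[5]] += 1
    return counts


# Made-up fragment, used only to exercise the function.
example = """@<TRIPOS>MOLECULE
fragment
 3 2 1
SMALL
GASTEIGER

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000 C.ar    1  LIG1   -0.0618
      2 C2          1.3900    0.0000    0.0000 C.ar    1  LIG1   -0.0618
      3 H1         -0.5400    0.9400    0.0000 H       1  LIG1    0.0618
@<TRIPOS>BOND
     1     1     2   ar
     2     1     3    1
"""

types = mol2_atom_types(example)
for atom_type, n in sorted(types.items()):
    print("%s %d" % (atom_type, n))
```

Running this over each block written to `problem.mol2` gives a quick view of which atom types appear in the molecules that fail, which can then be compared with the types the reader documentation says it expects.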

@greglandrum
Copy link
Member

That would indeed be nice, but the code doesn't know that. It only knows that it encountered a ring it could not kekulize.

Having some reporting that tells you which ring had the problem may help some.

greglandrum added a commit to greglandrum/rdkit that referenced this issue May 20, 2016
bp-kelley pushed a commit that referenced this issue May 23, 2016
* improve error reporting for kekulization failures
Connected to #917

* better phrasing of the message.

@greglandrum
Copy link
Member

Now that the error reporting has been improved (at least I think so), I'm closing this.
Please re-open if necessary.
