New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LongestAliphaticChainDescriptor doesn't work #362
Comments
The source code for the LongestAliphaticChain descriptor isn’t overly long (attached) and while I currently don’t have time to look at it, as quick inspection shows that it might be one of those beautiful off-by-one bugs :) in the breadth-first-search. Or maybe not.
Enjoy debugging. :)
Cheers,
Chris
—
Prof. Dr. Christoph Steinbeck
Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Secretariat: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814
What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
… On 1 Sep 02017, at 13:20, John Mayfield ***@***.***> wrote:
Personal communication from Nichola McCann
I am relatively new to using the cdk, but I have a question about the descriptor generation. When I calculate the Longest Aliphatic Chain (nAtomLAC) for ethanol, I get different answers, depending on how I enter the molecule. If I use CCO, I get a value of 2, but if I use OCC (the canonical version from cdk), I get 0.
I have attached below the R code that I used to generate these results – could you please explain the discrepancy?
@test
public void debug() throws ClassNotFoundException, CDKException, java.lang.Exception
{
SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.
getInstance());
IAtomContainer mol1 = sp.parseSmiles("CCO"
);
IAtomContainer mol2 = sp.parseSmiles("OCC"
);
System.out.println(descriptor.calculate(mol1).
getValue());
System.out.println(descriptor.calculate(mol2).
getValue());
}
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Here's a correct non-optimal way of doing it but mirrors what is done ATM :-) private boolean isAcyclicCarbon(IAtom atom) {
return atom.getAtomicNumber() == 6 && !atom.isInRing();
}
private int longestAliphaticChain(IAtomContainer mol) {
IAtomContainer part = mol.getBuilder().newAtomContainer();
Cycles.markRingAtomsAndBonds(mol); // ensure ring flags are set
for (IAtom atom : mol.atoms())
if (isAcyclicCarbon(atom))
part.addAtom(atom);
for (IBond bond : mol.bonds())
if (isAcyclicCarbon(bond.getBegin()) && isAcyclicCarbon(bond.getEnd()))
part.addBond(bond);
AllPairsShortestPaths apsp = new AllPairsShortestPaths(part);
int max = 0;
// NxN can be avoided, apsp internally knows longest path
for (int beg = 0; beg < part.getAtomCount(); beg++) {
for (int end = 0; end < part.getAtomCount(); end++) {
int dist = apsp.from(beg).distanceTo(end);
if (dist > max) {
max = dist;
}
}
}
return 1+max;
} |
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Personal communication from Nichola McCann
The text was updated successfully, but these errors were encountered: