A kekulization failure from a SMILES benchmark  #1900

@baoilleach

In the SMILES benchmark results presented at the recent ICCS (https://www.nextmovesoftware.com/products/SMILESBenchmark_ICCS_May2018.pdf), exactly one SMILES string, out of the 47K written out by the CDK, failed to kekulize in RDKit 2018.03.1. Since (at least) CDK itself, OEChem and Open Babel all kekulized it successfully, the structure is very likely kekulizable.

Unfortunately, it's a buckyball derivative (perhaps the same issue reported by Brian C at #1740?). The CDK aromatic SMILES is:

c12c3c4c5c6c7C8c5c-9c%10c%11c%12c%13c%14c%15c%11c9c%16c%17c%15c%18-c%14c%19c%20c%13c(c%12c1c%104)c%21c2c%22c%23c3c6c%24c%23c%25c%26c%22c%21c%20c%27C%19%28C%29(c%18c%30c%17c(C%168)c-%31c%32c%30c%29c(=c%27%26)c%25c%32c%24c7%31)CC=CC%28
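A minimal sketch of how the failure could be checked from Python, assuming RDKit is installed. The helper name `kekulizes` is hypothetical, and the import guard is only there so the snippet still loads on a machine without RDKit. It parses without sanitization and then asks `Chem.SanitizeMol` which step, if any, failed.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit not installed; kekulizes() then returns None
    Chem = None


def kekulizes(smiles):
    """Return False if RDKit's kekulization step fails, True otherwise.

    Returns None when RDKit is not available. Note that True only means
    that kekulization was not the failing sanitization step.
    """
    if Chem is None:
        return None
    # Skip sanitization during parsing so the failure can be inspected below.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        raise ValueError("SMILES could not be parsed")
    # With catchErrors=True, SanitizeMol returns the flag of the failing
    # operation instead of raising (SANITIZE_NONE if everything passed).
    failed = Chem.SanitizeMol(mol, catchErrors=True)
    return failed != Chem.SanitizeFlags.SANITIZE_KEKULIZE


if __name__ == "__main__":
    print(kekulizes("c1ccccc1"))  # benzene, as a baseline
```

Passing the CDK aromatic SMILES above to `kekulizes` in place of benzene should show whether a given RDKit version is affected; the result will depend on the version installed.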

For reference, the original (Kekulé) input to CDK was:

C12=C3C4=C5C6=C7C8C5=C9C5=C%10C%11=C%12C%13=C%14C%10=C9C9=C%10C%14=C%14C%13=C%13C%15=C%12C(=C%11C1=C54)C1=C2C2=C4C3=C6C3=C4C4=C5C2=C1C%15=C1C%132C6(C%14=C%11C%10=C(C98)C8=C9C%11=C6C(=C15)C4=C9C3=C78)CC=CC2
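Strings this long are easy to truncate when copying, so a rough, toolkit-free sanity check may help before feeding them to a parser. The sketch below (the name `smiles_summary` is hypothetical) counts atoms and ring-closure bonds and verifies that every ring-closure label is opened and closed. It is not a SMILES parser and handles only the common organic subset plus bracket atoms; matching counts for the aromatic and Kekulé forms are a necessary condition for them to describe the same molecule, not a proof.

```python
import re

# One token per atom: bracket atoms, two-letter halogens, then one-letter symbols.
ATOM = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnops]")
# Bracket atoms, matched separately so their digits can be ignored.
BRACKET = re.compile(r"\[[^\]]+\]")
# Ring-closure labels: two-digit %nn form or a single digit.
RING = re.compile(r"%\d\d|\d")


def smiles_summary(smiles):
    """Return (atom_count, ring_closure_count) for a SMILES string.

    Raises ValueError if a ring-closure label is opened but never closed.
    """
    atom_count = len(ATOM.findall(smiles))
    # Digits inside brackets are isotopes, charges or H counts, not ring labels.
    outside_brackets = BRACKET.sub("", smiles)
    open_labels = set()
    ring_closures = 0
    for token in RING.findall(outside_brackets):
        label = int(token.lstrip("%"))
        if label in open_labels:
            open_labels.remove(label)  # second occurrence closes the ring bond
            ring_closures += 1
        else:
            open_labels.add(label)  # first occurrence opens it
    if open_labels:
        raise ValueError(f"unclosed ring labels: {sorted(open_labels)}")
    return atom_count, ring_closures


if __name__ == "__main__":
    print(smiles_summary("c1ccccc1"))     # aromatic benzene
    print(smiles_summary("C1=CC=CC=C1"))  # Kekulé benzene
```

Running `smiles_summary` on the two strings above and comparing the tuples gives a quick indication that neither was cut short.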
