Some of the molecule generate smiles exceptions #340

Closed
PrayHopeWish opened this issue Jun 22, 2017 · 4 comments

@PrayHopeWish

CDK version is 2.0

private final static SmilesGenerator smiGen = SmilesGenerator.absolute();
String smiles2 = smiGen.create(molecule); // IAtomContainer molecule

The exception is:
Caused by: org.openscience.cdk.exception.CDKException: An InChI could not be generated and used to canonise SMILES: null
at org.openscience.cdk.smiles.SmilesGenerator.inchiNumbers(SmilesGenerator.java:503)
at org.openscience.cdk.smiles.SmilesGenerator.labels(SmilesGenerator.java:471)
at org.openscience.cdk.smiles.SmilesGenerator.create(SmilesGenerator.java:375)
at org.openscience.cdk.smiles.SmilesGenerator.create(SmilesGenerator.java:325)
... 125 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.openscience.cdk.smiles.SmilesGenerator.inchiNumbers(SmilesGenerator.java:496)
... 129 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at org.openscience.cdk.graph.invariant.InChINumbersTools.parseUSmilesNumbers(InChINumbersTools.java:145)
at org.openscience.cdk.graph.invariant.InChINumbersTools.getUSmilesNumbers(InChINumbersTools.java:73)
... 134 common frames omitted
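
For reading the trace above: the top-level message ends in `: null` most likely because the `InvocationTargetException` in the middle of the chain carries no message of its own, so the real failure is the innermost `ArrayIndexOutOfBoundsException`. A minimal, self-contained sketch of walking such a chain to its root cause follows. The names `RootCauseSketch`, `failing` and `invokeAndWrap` are hypothetical, and the wrapping only imitates the shape of the trace; it is not CDK code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration; none of these names come from CDK.
final class RootCauseSketch {

    // Fails the same way as the innermost frame in the trace.
    static void failing() {
        int[] first = new int[1];
        first[1] = 0; // index 1 is out of bounds for a length-1 array
    }

    // Calls failing() reflectively and wraps the failure, imitating the chain
    // wrapper -> InvocationTargetException -> ArrayIndexOutOfBoundsException.
    static Exception invokeAndWrap() {
        try {
            Method method = RootCauseSketch.class.getDeclaredMethod("failing");
            method.invoke(null);
            return null; // not reached: failing() always throws
        } catch (InvocationTargetException e) {
            // e.getMessage() is null here, hence the trailing "null"
            return new Exception("could not canonise: " + e.getMessage(), e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Follows getCause() until the innermost exception is reached.
    static Throwable rootCause(Throwable error) {
        Throwable current = error;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        Exception wrapped = invokeAndWrap();
        System.out.println(wrapped.getMessage());                    // could not canonise: null
        System.out.println(rootCause(wrapped).getClass().getName()); // java.lang.ArrayIndexOutOfBoundsException
    }
}
```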

I debugged the InChI code.

The failure is in the **parseUSmilesNumbers** method of **org.openscience.cdk.graph.invariant.InChINumbersTools**:

```
if ((index = aux.indexOf("/F:")) >= 0) {
    String[] fixedHNumbers = aux.substring(index + 3, aux.indexOf('/', index + 3)).split(";");
    for (int i = 0; i < fixedHNumbers.length; i++) {

        String component = fixedHNumbers[i];

        // m, 2m, 3m ... need to lookup number in the base numbering
        if (component.charAt(component.length() - 1) == 'm') {
            int n = component.length() > 1 ? Integer
                    .parseInt(component.substring(0, component.length() - 1)) : 1;
            for (int j = 0; j < n; j++) {
                String[] numbering = baseNumbers[i + j].split(",");
                first[i + j] = Integer.parseInt(numbering[0]) - 1;
                for (String number : numbering)
                    numbers[Integer.parseInt(number) - 1] = label++;
            }
        } else {
            String[] numbering = component.split(",");
            first[i] = Integer.parseInt(numbering[0]) - 1;    // error line
            for (String number : numbering)
                numbers[Integer.parseInt(number) - 1] = label++;
        }
```

_**The error occurs while parsing the AuxInfo.**
I cannot find the basis on which the size of the **first** array is set.
The accesses **baseNumbers[i + j]** and **first[i + j]** may not take the size of the arrays into account._
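
The concern above can be illustrated in isolation. The sketch below is a minimal, hypothetical reduction of the else branch. It assumes the **first** array is sized from the number of base-numbering components while the loop runs over the fixed-H components. The excerpt does not show the allocation, so that is an assumption, and the component strings are made up rather than taken from a real AuxInfo. `firstAtomsGuarded` shows one possible guard, not necessarily the fix that went into CDK.

```java
import java.util.Arrays;

// Hypothetical reduction of the indexing pattern; this is not CDK code.
final class LayerSizeMismatchSketch {

    // Unguarded: the array is sized from one layer but indexed by the other.
    static int[] firstAtoms(String[] baseComponents, String[] fixedComponents) {
        int[] first = new int[baseComponents.length]; // assumed allocation
        for (int i = 0; i < fixedComponents.length; i++) {
            String[] numbering = fixedComponents[i].split(",");
            first[i] = Integer.parseInt(numbering[0]) - 1; // fails once i >= first.length
        }
        return first;
    }

    // Guarded: size the array so that every index the loop uses is valid.
    static int[] firstAtomsGuarded(String[] baseComponents, String[] fixedComponents) {
        int[] first = new int[Math.max(baseComponents.length, fixedComponents.length)];
        for (int i = 0; i < fixedComponents.length; i++) {
            String[] numbering = fixedComponents[i].split(",");
            first[i] = Integer.parseInt(numbering[0]) - 1;
        }
        return first;
    }

    public static void main(String[] args) {
        // Made-up layers: one base component, three fixed-H components.
        String[] base = {"2,3,4,5,6,7,1"};
        String[] fixed = {"2,3,4,5,6,7,1", "8", "9"};

        try {
            firstAtoms(base, fixed);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded version failed on the second component");
        }
        System.out.println(Arrays.toString(firstAtomsGuarded(base, fixed))); // [1, 7, 8]
    }
}
```

With a single base component and three fixed-H components, the unguarded loop fails at index 1, which matches the index reported in the trace.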

The molecule is:
  CDK     0520151716

  9  6  0  0  0  0  0  0  0  0999 V2000
    3.3660    1.0000    0.0000 Si  0  0  0  0  0  0  0  0  0  0  0  0
    4.2320    1.5000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    2.5000    0.5000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    2.5000    1.5000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    3.3660    0.0000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    3.3660    2.0000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    4.2320    0.5000    0.0000 F   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    2.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.1160    4.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  3  1  0  0  0  0
  1  4  1  0  0  0  0
  1  5  1  0  0  0  0
  1  6  1  0  0  0  0
  1  7  1  0  0  0  0
M  CHG  1   1  -2
M  CHG  1   8   1
M  CHG  1   9   1
M  END

**The correct SMILES is: F[Si-2](F)(F)(F)(F)F.[H+].[H+]**

@johnmay
Member

johnmay commented Jun 22, 2017

I'm planning on deprecating Universal SMILES (absolute) flavour as it just doesn't work nicely.

You can use

SmilesGenerator smigen = new SmilesGenerator(SmiFlavor.Unique);

for this molecule.

@PrayHopeWish
Author

@johnmay Thanks, I will switch to new SmilesGenerator(SmiFlavor.Unique).

@johnmay
Member

johnmay commented Jun 23, 2017

Now fixed on master also.

@johnmay
Member

johnmay commented Jun 23, 2017

Thanks for reporting.
