Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PiContactDetectionDescriptor doesn't work correctly #509

Closed
k-ujihara opened this issue Nov 18, 2018 · 3 comments · Fixed by #833
Closed

PiContactDetectionDescriptor doesn't work correctly #509

k-ujihara opened this issue Nov 18, 2018 · 3 comments · Fixed by #833
Labels

Comments

@k-ujihara
Copy link
Contributor

I consider that PiContactDetectionDescriptor does not work correctly. For example, the following test137 is failed.
Anyway, I could not find the specification.

@Before
public void setUp() throws Exception {
    setDescriptor(PiContactDetectionDescriptor.class);
}

@Test
public void testButadiene() {
    IAtomContainer mol = TestMoleculeFactory.makeAlkane(4);
    mol.getBond(0).setOrder(IBond.Order.DOUBLE);
    mol.getBond(2).setOrder(IBond.Order.DOUBLE);
    for (int i = 0; i < 4; i++) 
        for (int j = 0; j < 4; j++) 
            Assert.assertEquals(true, checkAtomAtom(mol, i, j));
}
    
@Test
public void test137() {
    IAtomContainer mol = TestMoleculeFactory.makeAlkane(8);
    mol.getBond(0).setOrder(IBond.Order.DOUBLE);
    mol.getBond(2).setOrder(IBond.Order.DOUBLE);
    mol.getBond(6).setOrder(IBond.Order.DOUBLE);        
    for (int i = 0; i < 4; i++) 
        for (int j = 0; j < 4; j++) 
            Assert.assertEquals(true, checkAtomAtom(mol, i, j));
    for (int i = 0; i < 4; i++) 
        for (int j = 4; j < 8; j++) 
            Assert.assertEquals(false, checkAtomAtom(mol, i, j));
    Assert.assertEquals(true, checkAtomAtom(mol, 6, 7));
}

private boolean checkAtomAtom(IAtomContainer mol, int i, int j) {
    DescriptorValue result = descriptor.calculate(mol.getAtom(i), mol.getAtom(j), mol);
    BooleanResult val = (BooleanResult)result.getValue();
    return val.booleanValue();
}
@johnmay
Copy link
Member

johnmay commented Nov 20, 2018

It's only testing aromatic pi-systems. Now whether it's the documentation or algorithm that needs updating is another mater.

Any thoughts? I'm leaning towards documentation update...

@k-ujihara
Copy link
Contributor Author

Current implementation looks trying to detect not only aromatic π-conduction but polyene conduction. At least, later is useful for several purposes.

@johnmay
Copy link
Member

johnmay commented Feb 8, 2022

Fixed - very obvious in the end, Sorry for the delay. @kazuyaujihara the test case wasn't quite right as "contact" is consider if the atom is adjacent to a pi system of the other one. i.e. 0..5 and 5..8 rather than 0..4. Also pi system of two bonds only are not consider. I've update the doc to reflect this.

At the root of the implementation is ConjugatedPiSystemsDetector which could be cleaned up. Rather than pushing atoms around between containers I consider a cleaner implementation would be to pass in an array that we fill with numbers that mark the pi systems.

IAtomContainer mol = TestMoleculeFactory.makeAlkane(8);
mol.getBond(0).setOrder(IBond.Order.DOUBLE);
mol.getBond(2).setOrder(IBond.Order.DOUBLE);
mol.getBond(6).setOrder(IBond.Order.DOUBLE);
int[] visit = new int[mol.getAtomCount()];
ConjugatedPiSystemsDetector.detect(visit, mol);
System.err.println(Arrays.toString(visit));

This is then very easy to debug and check membership. Rather than doing linear lookups we just need to check the atoms are in the same set (have the same value set).

[1, 1, 1, 1, 0, 0, 2, 2]

Here is the implementation:

public static int detect(int[] visit, final IAtomContainer mol) {
    if (visit.length < mol.getAtomCount())
        throw new IllegalArgumentException("int[] visit array is not large enough for the number of atoms in the molecule!");

    Arrays.fill(visit, 0);
    int numPiSystems = 0;
    for (IAtom firstAtom : mol.atoms()) {

        // if this atom was already visited in a previous DFS, continue
        if (visit[firstAtom.getIndex()] != 0 ||
            getType(mol, firstAtom) != Type.CONJUGATED) {
            continue;
        }

        int piSystemId   = ++numPiSystems;
        int piSystemSize = 0;
        visit[mol.indexOf(firstAtom)] = piSystemId;
        Deque<IAtom> queue = new ArrayDeque<>();
        queue.push(firstAtom);

        IAtom atom = null;
        while (!queue.isEmpty()) {
            atom = queue.pop();
            piSystemSize++;
            for (IBond bond : mol.getConnectedBondsList(atom)) {
                IAtom nbor = bond.getOther(atom);
                int nborIdx = mol.indexOf(nbor);
                if (visit[nborIdx] == 0) {
                    Type check = getType(mol, nbor);
                    if (check == Type.CONJUGATED) {
                        queue.add(nbor);
                        visit[nborIdx] = piSystemId;
                    }
                }
            }
        }

        if (piSystemSize == 1) {
            visit[mol.indexOf(firstAtom)] = 0;
            numPiSystems--;
        }
    }

    return numPiSystems;
}

I hold off doing this now as it's unclear whether the limit of 2+ atoms makes sense (a slight adjustment needed) and why there is special handling of the cumulated cases - I would not consider those conjugated but perhaps it for case like:
C=CC=C=CC=C where by atom 3 is two different conjugated systems (that doesn't seem right).

On a side note I always wince when looking at these descriptor implementations as they feel like the "wrong shape". There is always so much copying/cloning and recalculation. Like in this case, it recalculates for every atom pair, really the Atom/Bond/Pair descriptors should take a molecule and give back an array of results.

@egonw egonw closed this as completed in #833 Feb 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants