-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RDKit Rooted Fingerprints Have Wrong Results in KNIME #143
Labels
Comments
greglandrum
pushed a commit
that referenced
this issue
Dec 2, 2023
manuelschwarze
added a commit
that referenced
this issue
Dec 2, 2023
- KNIME-1792: RDKit Binaries updated to 2023_09_2 and adapted tests - changes were detected related to: - Optimize geometry - RDKit rooted fingerprints - SVG generation - Substructure counter stereo-chemistry handling - KNIME-1794: Fixed rooted fingerprints (in KNIME code) - Update version from 4.9.0 => 4.9.1 for RDKit Binary Changes - Added for KNIME 4.6 testing javax.annotation to wizard: Some plugins that the RDKit Nodes Wizard requires have a dependency on javax.annotation, which was fulfilled in lower Java versions automatically, but not anymore when using Java 17. Therefore, it is now referenced additionally.
greglandrum
pushed a commit
that referenced
this issue
Dec 3, 2023
… to master (#141) * KNIME-1725: Version 4.7.1 => 4.9.0 for master (KNIME 4.7+) This version will contain changes (with subsequent commits) for: - Advanced port handling for RDKit nodes - KNIME File Handling Framework integration - RDKit Component Reaction Nodes improvements Commits will be cherry-picked from pre-nightly4.7 branch * KNIME-1309: Support all port objects, not only data tables (#133) * KNIME-1309 Added: support of dynamic ports and arbitrary port types * Dependency fix for RDKit Nodes Wizard --------- Co-authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-1705 FunctionalGroupFilter node: migration to KNIME NIO (#135) Authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-1706 StructureNormalizer node: migration to KNIME NIO (#136) Authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-1704 FingerprintReader FingerprintWriter nodes: migration to KNIME NIO (#137) Authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-1762 Fixed: StructureNormalizer node Options tab layout is bad on Mac (#139) Co-authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-132: Added: handling of additional columns in ComponentReaction nodes (#140) Authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com> * KNIME-1704,KNIME-1705,KNIME-1706: MANIFEST fix for file handling dep * KNIME-1792: RDKit 2023_09_2 Binaries Update and Fixes (issue #143) - KNIME-1792: RDKit Binaries updated to 2023_09_2 and adapted tests - changes were detected related to: - Optimize geometry - RDKit rooted fingerprints - SVG generation - Substructure counter stereo-chemistry handling - KNIME-1794: Fixed rooted fingerprints (in KNIME code) - Update version from 4.9.0 => 4.9.1 for RDKit Binary Changes - Added for KNIME 4.6 testing javax.annotation to wizard: Some plugins that the RDKit Nodes Wizard requires have a dependency on javax.annotation, which was fulfilled in lower Java versions automatically, but not anymore when using Java 17. Therefore, it is now referenced additionally. --------- Co-authored-by: Roman Balabanov <roman-1.balabanov_ext@novartis.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
We have discovered that rooted fingerprints in KNIME have are different from Python and pure Java. The reason we found is that there is a bug how the atom list for the rooting feature is created before it passed to the RDKit native code.
Paolo Tosco's Java code created it like this, as example with atom index 17:
This is correct. It results in [ 17 ]
I had created it many years ago like this:
This is incorrect. It results in [ 0, 17 ], where "0" could also be something that is not well-defined.
In addition to this code, other buggy code also related to the same feature was discovered in InputDataObject class, which processes again such UInt_Vect objects and has the same bug.
Consequences of this bug:
InputDataInfo.getRDKitIntegerVector(DataRow row) – Always was double the size that it should be, and the first half was filled with 0 (or undefined).
RDKit Highlighting Node – Should have been leading to always coloring of atom at index 0, but for some reason this did not happen, hence no problem that a user would actually see
RDKit Highlighting Atoms Node (deprecated) – Influence on SVG creation, but no problem that a user would actually see
InputDataInfo.getRDKitUIntegerVector (DataRow row) – Always was double the size that it should be, and the first half was filled with 0 (or undefined).
RDKit Fingerprint Nodes – Rooted fingerprints (Morgan, FeatMorgan, AtomPair, Torsion, RDKit, Layered, always included the atom indexes 0 (or undefined)
AbstractFingerprintNodeModel.$AbstractRDKitCellFactory.process(…) – When a single atom was selected for rooting, it was always using atom at index 0 and the selected one, so it used 2 atoms instead of a single one, every fingerprint calculated with rooted atoms has in the moment wrong results: There are more bits set in the fingerprint than it would have without the additional badly added atom index.
Tracked at Novartis as KNIME-1794
The text was updated successfully, but these errors were encountered: