SMILES canonicalization like in old versions #8964
I've started working on a legacy project that uses rdkit=2018.03.2.0. We now want to update to a newer version, but as of v2022.09 the canonicalization algorithm has changed. Is there a way to use Chem.MolToSmiles() with the old algorithm, perhaps via some global flag? Obviously, I could write a 'duct-tape' script that uses only Chem.MolFromSmiles() and Chem.MolToSmiles(), run it in a separate environment with the old rdkit version, and call it from my main code. However, I'm looking for a more proper solution.
Replies: 2 comments
The proper solution is, sadly, to recanonicalize. Many of the changes in the canonicalization algorithm were made to fix issues, so the general solution is to canonicalize again from the source information, if you still have it.
To be precise about wording: there have been no changes to the core canonicalization algorithm since it was introduced back in 2015. We have, however, fixed bugs throughout the RDKit and some of those will have an impact on the canonical SMILES you get. As @bp-kelley said: the only way to be sure that canonical SMILES are directly comparable to each other is to ensure that they were generated with the same version of the RDKit. Note that this is true of EVERY piece of software: bug fixes can always change results.
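For anyone landing here later, a minimal sketch of what "recanonicalize and keep track of the version" could look like in practice. This is only an illustration, not an official RDKit recipe: the helper names (`rdkit_canonicalizer`, `recanonicalize`, `comparable`) and the record layout are made up for this example. The only RDKit calls used are `Chem.MolFromSmiles()`, `Chem.MolToSmiles()` and `rdBase.rdkitVersion`.

```python
from typing import Callable, Dict, Optional, Tuple


def rdkit_canonicalizer() -> Tuple[Callable[[str], Optional[str]], str]:
    """Return (canonicalize, version) backed by the installed RDKit.

    Hypothetical helper; RDKit is imported lazily so the bookkeeping
    functions below can be used and tested without it.
    """
    from rdkit import Chem, rdBase

    def canonicalize(smiles: str) -> Optional[str]:
        mol = Chem.MolFromSmiles(smiles)  # None if the SMILES cannot be parsed
        return None if mol is None else Chem.MolToSmiles(mol)

    return canonicalize, rdBase.rdkitVersion


def recanonicalize(
    source: Dict[str, str],
    canonicalize: Callable[[str], Optional[str]],
    version: str,
) -> Dict[str, Dict[str, Optional[str]]]:
    """Regenerate canonical SMILES from the source structures and tag each
    record with the toolkit version that produced it (assumed layout)."""
    return {
        key: {"canonical_smiles": canonicalize(smiles), "rdkit_version": version}
        for key, smiles in source.items()
    }


def comparable(a: Dict[str, Optional[str]], b: Dict[str, Optional[str]]) -> bool:
    """Only treat two canonical SMILES as directly comparable if they were
    generated with the same RDKit version."""
    return a["rdkit_version"] == b["rdkit_version"]
```

With RDKit installed, usage would be along the lines of `canonicalize, version = rdkit_canonicalizer()` followed by `recanonicalize(my_source_smiles, canonicalize, version)`, where `my_source_smiles` is whatever original input you still have. Storing the version next to each string makes it easy to spot records that need regenerating after an upgrade.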