@sfluegel05
Contributor

The ChEBI lookup now parses SMILES without sanitization, so even "wonky" SMILES (i.e., SMILES that can be parsed but not sanitized) get looked up.

For example, [H][C@@]12NC(=O)[C@@H](c3cc(Cl)c(O)c(Cl)c3)NC(=O)[C@H](NC(=O)C(=O)c3cc(Cl)c(O)c(Cl)c3)Cc3cnc4c(cccc34)-c3cc1cc(c3O)Oc1ccc(cc1)C[C@@]([H])(C(=O)N[C@@H](C(=O)O)c1ccc(O)cc1)N(C)C(=O)[C@@H](c1cc(Cl)c(O)c(Cl)c1)NC2=O can now be looked up (CHEBI:65618). In practice, the lookup still fails for v244, because the SMILES that ChEBI stores for this entry in v244 differs from the one shown here. However, if we update the SMILES lookup to a newer ChEBI version in the future, this might actually work.
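To illustrate the difference between the two parsing modes, here is a minimal sketch. It assumes the lookup is built on RDKit (the PR text does not name the toolkit), and `parse_strict` and `parse_lenient` are hypothetical helper names, not functions from this repository. The small test SMILES is a stand-in chosen because it is a well-known case that parses but fails sanitization; it is not the ChEBI entry above.

```python
from rdkit import Chem, RDLogger

# Silence RDKit's parser/sanitization error messages for this demo.
RDLogger.DisableLog("rdApp.*")


def parse_strict(smiles):
    """Parse with RDKit's default sanitization; returns None on failure."""
    return Chem.MolFromSmiles(smiles)


def parse_lenient(smiles):
    """Parse without sanitization (hypothetical helper).

    Valence checks, kekulization, and aromaticity perception are skipped,
    so a syntactically valid but chemically "wonky" SMILES still yields a Mol.
    """
    return Chem.MolFromSmiles(smiles, sanitize=False)


# Neutral nitrogen with four bonds: valid SMILES syntax, invalid valence.
wonky = "CN(C)(C)C"

strict = parse_strict(wonky)
lenient = parse_lenient(wonky)

print("strict parse succeeded: ", strict is not None)
print("lenient parse succeeded:", lenient is not None)
if lenient is not None:
    print("atoms in lenient mol:   ", lenient.GetNumAtoms())
```

One caveat with this approach: an unsanitized molecule has no perceived ring information or computed implicit valences, so downstream steps that rely on those would need to handle that separately.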

@sfluegel05 merged commit c64be19 into dev on Nov 7, 2025
2 checks passed
@sfluegel05 deleted the fix/chebi-lookup branch on November 7, 2025 13:38