-
I'm new to the REINVENT project and am currently encountering an issue where my SMILES sequences contain specific tokens (particularly the phosphorus atom "P") that are not supported by the pre-provided priors in REINVENT. I have an extensive database of over 3 million molecules, which fully covers all the tokens required for my research.
I have attempted to standardize the SMILES using RDKit and even converted them to SELFIES/DeepSMILES formats, but none of these approaches resolved the problem. The system consistently returns the error message:
"Allowed tokens are: ({'c', '[n+]', '[nH]', '7', '5', '%10', '$', 'N', '-', 'n', 'O', '6', '=', '4', 'o', '3', '9', 'Br', '^', '8', 'Cl', '[N+]', 'F', '[S+]', '(', ')', 'C', 'S', '1', '[N-]', '2', '#', '[O-]', 's'}, set())"
I'm now using the provided transfer_learning.toml as a template, and my TOML configuration is as follows:
My questions are:
REINVENT Version: 4.7
Any documentation, examples, or pointers to relevant code would be greatly appreciated!
-
Hi,
if it is only about supported elements, I could provide you with an (experimental) PubChem-based prior, which may solve your problem.
In principle, REINVENT only supports SMILES. You can use our data pipeline through reinvent_datapre to prepare the SMILES. If that interests you, I can provide further details.
#126 refers to LinkInvent. I guess you are planning to train a new classical, de novo Reinvent prior.
Many thanks,
Hannes.
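Before deciding between a different prior and a new model, it can help to see exactly which tokens in a dataset fall outside the prior's vocabulary. Below is a minimal sketch of such a check. The allowed-token set is copied from the error message quoted in the question; the regular-expression tokenizer and the function names `tokenize` and `unsupported_tokens` are assumptions made for illustration and are not REINVENT's own tokenizer, so the results may differ in edge cases.

```python
import re
from collections import Counter

# Tokens copied from the error message quoted in the question.
ALLOWED_TOKENS = {
    "c", "[n+]", "[nH]", "7", "5", "%10", "$", "N", "-", "n", "O", "6",
    "=", "4", "o", "3", "9", "Br", "^", "8", "Cl", "[N+]", "F", "[S+]",
    "(", ")", "C", "S", "1", "[N-]", "2", "#", "[O-]", "s",
}

# Assumed tokenization (not REINVENT's implementation): bracket atoms,
# Br/Cl, two-digit ring closures such as %10, then any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens using the assumed pattern."""
    return SMILES_TOKEN.findall(smiles)


def unsupported_tokens(smiles_list, allowed=ALLOWED_TOKENS):
    """Count every token occurrence that is not in the allowed set."""
    counts = Counter()
    for smiles in smiles_list:
        counts.update(tok for tok in tokenize(smiles) if tok not in allowed)
    return counts


# Small hypothetical sample; replace with the SMILES from your own dataset.
sample = ["CCO", "CP(=O)(O)O", "c1ccccc1Br", "C[C@H](N)C(=O)O"]
report = unsupported_tokens(sample)
for token, count in report.most_common():
    print(f"{token}\t{count}")
```

With this sample, the phosphorus atom and the stereo-annotated bracket atom are reported, since neither appears in the allowed set. Running the same check over the full dataset would show whether phosphorus is the only gap or whether other tokens are missing as well.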
-
You would need to provide me with an online storage location where I can drop a 100 MB binary file.
-
I'm facing the same problem. I tried to create a model using REINVENT4/reinvent/runmodes/create_model/reinvent.toml but got a couple of error messages. Would it be possible to either
-
Hi,
many thanks for your enquiry. You would need to be more specific as to what the actual error messages are. Models are specific to the generator, i.e. you cannot mix, say, a Reinvent prior with a Mol2Mol prior. After you have created an "empty" model, you would need to carry out transfer learning (TL) with a dataset suitable for the respective generator.
Many thanks,
-
Hi, model creation is not supported by the
Cheers,
-
Creating an empty model with