How to train LinkerInvent's prior model using my own data #63
Comments
Hi, many thanks for your interest in REINVENT and welcome to the community! We have four different generator "styles": Reinvent, Libinvent, Linkinvent, and Mol2Mol. Which one are you interested in? Generally, I would not recommend producing new priors for production use, as it takes some knowledge and skill to make a high-quality one. But making new models and getting them to perform well can certainly be an interesting learning exercise. Many thanks,
First of all, thank you for your reply. I'm particularly interested in the Linkinvent generator. My dataset is a set of molecules represented in SMILES format, and I want to train a Linkinvent prior model starting from this data. I'm not sure where to find the scripts for data preprocessing and retraining. I would greatly appreciate it if you could provide them.
Hi again, the original publication describes how the input SMILES were split; it is ultimately the same method as the one applied for Libinvent. The repository of the original code may have some clues and data, but the data splitting code is in this repository. Cheers,
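To make the idea of "splitting" input SMILES more concrete, here is a minimal sketch of one way a molecule could be cut into two warheads and a linker with RDKit. This is not the splitting code from the REINVENT repository or the original publication: the function name `split_into_linker_and_warheads` is hypothetical, and the rule used here (cut every pair of acyclic single bonds) is an assumption chosen for illustration. The actual method may apply different cut rules and additional filters.

```python
from itertools import combinations

from rdkit import Chem


def count_attachment_points(mol):
    """Count dummy atoms (atomic number 0), shown as '*' in SMILES."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)


def split_into_linker_and_warheads(smiles):
    """Hypothetical helper: enumerate (warhead, warhead, linker) SMILES triples.

    Assumption for illustration only: every pair of acyclic single bonds
    is treated as a valid pair of cut points.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # unparsable SMILES
        return []

    # Candidate cut points: single bonds that are not part of a ring.
    cuttable = [
        bond.GetIdx()
        for bond in mol.GetBonds()
        if bond.GetBondType() == Chem.BondType.SINGLE and not bond.IsInRing()
    ]

    triples = []
    for bond_pair in combinations(cuttable, 2):
        # Cut both bonds; each cut leaves an unlabeled dummy atom on both sides.
        fragmented = Chem.FragmentOnBonds(
            mol, list(bond_pair), addDummies=True, dummyLabels=[(0, 0), (0, 0)]
        )
        fragments = Chem.GetMolFrags(fragmented, asMols=True)

        # The linker carries two attachment points, each warhead carries one.
        warheads = [f for f in fragments if count_attachment_points(f) == 1]
        linkers = [f for f in fragments if count_attachment_points(f) == 2]
        if len(warheads) == 2 and len(linkers) == 1:
            triples.append(
                (
                    Chem.MolToSmiles(warheads[0]),
                    Chem.MolToSmiles(warheads[1]),
                    Chem.MolToSmiles(linkers[0]),
                )
            )
    return triples


if __name__ == "__main__":
    # Bibenzyl has three acyclic single bonds, hence three possible cut pairs.
    for warhead_a, warhead_b, linker in split_into_linker_and_warheads(
        "c1ccccc1CCc1ccccc1"
    ):
        print(warhead_a, warhead_b, linker)
```

Exhaustive pairwise cutting grows quadratically with the number of cuttable bonds and yields many chemically uninteresting fragments, so a real preprocessing pipeline would likely restrict the cut rules and filter fragments by size or other properties.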
Thank you very much for open-sourcing the project. I noticed that pre-trained prior models are provided in the 'prior/' directory. I am a beginner and would like to train a new prior using my own data, but I couldn't find any instructions on how to do so. I would greatly appreciate any guidance and assistance on this matter.