We introduce VeGA-RX, conditioned on dual semantic descriptors (SMILES + SMARTS-RX), and VeGA-SCX, which integrates topological guidance (SMILES + Scaffold + SMARTS-RX). Both models efficiently explore chemical space but with distinct profiles: VeGA-SCX prioritizes structural discipline, while VeGA-RX maximizes generative freedom.
This repository provides a complete pipeline for training, fine-tuning, and generating molecules using conditional SMARTS-RX → SMILES models.
The workflow is organized into six notebooks.
First, download the codebase. Then, use conda to set up a new environment for VeGA. If you're new to conda, we recommend working through an introductory conda tutorial before proceeding.
conda env create -f enviroment.yml
conda activate vega
python -m pip install tensorflow[and-cuda]
conda install -c conda-forge jupyter notebook

Now you can launch the notebooks with Jupyter Notebook.
Recommended order of execution:
Training → (Optional) Fine-Tuning → Generation
Train the SMARTS + Scaffold conditioned SMILES generator.
Modify the file paths in the configuration section:
- SMILES dataset
- SMARTS definition file
- Output/cache paths
Run the main training cell.
If train_data.pkl and val_data.pkl do NOT exist, the notebook starts full preprocessing and training:
logger.info("🚀 START TRAINING: SMARTS-RX → SMILES")
If the cached files already exist, the notebook loads them directly:
logger.info("🚀 AVVIO SCRIPT: CARICAMENTO DIRETTO E ADDESTRAMENTO")  # i.e., "SCRIPT START: DIRECT LOADING AND TRAINING"
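The cache check described above can be sketched as follows (file names are taken from the notebook; the cache directory parameter is a hypothetical addition):

```python
import os

def needs_preprocessing(cache_dir="."):
    """Return True when train_data.pkl / val_data.pkl are missing,
    i.e. the notebook must run full preprocessing before training."""
    required = ("train_data.pkl", "val_data.pkl")
    return not all(os.path.exists(os.path.join(cache_dir, name)) for name in required)
```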
In most cases, simply running the main training cell is sufficient. The run produces the following artifacts:
- Trained model (.keras)
- char2idx.pkl
- idx2char.pkl
- vocab.json
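The char2idx / idx2char artifacts map characters to integer token indices and back. A minimal sketch of how such mappings are typically used, with a toy vocabulary rather than the files the notebook writes:

```python
# Toy vocabulary; the real mappings are loaded from char2idx.pkl / idx2char.pkl.
char2idx = {c: i for i, c in enumerate("^$#()=123CNOScnos")}
idx2char = {i: c for c, i in char2idx.items()}

def encode(smiles):
    """Turn a SMILES string into a sequence of token indices."""
    return [char2idx[c] for c in smiles]

def decode(indices):
    """Invert encode(), recovering the SMILES string."""
    return "".join(idx2char[i] for i in indices)
```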
Train a SMARTS-RX conditioned model (alternative implementation).
Update:
- Dataset path
- SMARTS file
- Output directory
Run all cells sequentially.
- GPU acceleration is recommended for training.
Fine-tune a pretrained SMARTS + Scaffold model.
Modify:
- PRETRAINED_MODEL
- VOCAB_PATH
- SMARTS file
- Fine-tuning dataset
- SAVE_DIR
Ensure MAX_LENGTH matches the original training configuration.
Run all cells.
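One way to enforce the MAX_LENGTH requirement is a fail-fast guard at the top of the notebook. This is a sketch; the function and argument names are my own, not the notebook's:

```python
def check_max_length(fine_tune_max_length, pretrained_max_length):
    """Raise early if the fine-tuning sequence length differs from the
    one the pretrained model was trained with."""
    if fine_tune_max_length != pretrained_max_length:
        raise ValueError(
            f"MAX_LENGTH mismatch: fine-tuning uses {fine_tune_max_length}, "
            f"but the pretrained model expects {pretrained_max_length}"
        )
```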
Fine-tune a SMARTS-only conditioned model.
Update:
- Pretrained model path
- Vocabulary files
- Dataset path
- Save directory
Run all cells sequentially.
Interactive molecule generation.
Supports:
- SMARTS conditioning
- Scaffold conditioning
- Combined SMARTS + Scaffold
- Unconditional generation
- Batch generation with CSV export
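The CSV export step can be as simple as the following sketch (the column name and default output path are placeholders, not the notebook's actual values):

```python
import csv

def export_generated(smiles_list, path="generated.csv"):
    """Write one generated SMILES per row under a 'smiles' header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["smiles"])
        writer.writerows([s] for s in smiles_list)
```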
Update:
- Model path (.keras)
- char2idx.pkl
- idx2char.pkl
- vocab.json
Run the notebook.
An interactive menu will appear in the console.
Follow the prompts to:
- Select generation mode
- Set batch size
- Adjust temperature
- Provide SMARTS and/or scaffold inputs
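Temperature controls how sharply the model's next-token distribution is sampled: values below 1 favor high-probability tokens, values above 1 increase diversity. A pure-Python sketch of the idea (not the notebook's implementation):

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Sample a token index from raw logits, sharpened or softened by
    the temperature (lower = greedier, higher = more diverse)."""
    scaled = [l / max(temperature, 1e-8) for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # numerically stable softmax
    total = sum(exps)
    r = random.random() * total
    cumulative = 0.0
    for i, e in enumerate(exps):
        cumulative += e
        if r <= cumulative:
            return i
    return len(exps) - 1
```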
- Keep MAX_LENGTH consistent across all notebooks.
- Vocabulary files must correspond to the specific trained model.
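That correspondence can be checked programmatically before generation. A hedged sketch (the function name is my own; the two things that must agree are the inverse mapping and the vocabulary size the model's embedding layer expects):

```python
def vocabs_consistent(char2idx, idx2char, model_vocab_size):
    """True when the two mappings are mutual inverses and their size
    matches the model's expected vocabulary size."""
    inverse_ok = all(idx2char.get(i) == c for c, i in char2idx.items())
    return inverse_ok and len(char2idx) == model_vocab_size
```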