In [1]:
import os
import json

## Creating a Config JSON

`Saturn` adapts the execution workflow of [REINVENT Version 3.2](https://github.com/MolecularAI/Reinvent). All parameters of the reinforcement learning run are specified using a `JSON` config.

For the initial code release, this notebook is minimal and just walks through how to impose various reaction constraints.

#### Defining the Reward Function

Define the reward function to include all the properties to be optimized.

Below is the reward function used for all Development experiments in the paper. It is composed of:

1. QuickVina2-GPU docking against ClpP
2. QED
3. Number of hydrogen bond donors < 4
4. Imposing various reaction constraints using [Syntheseus](https://github.com/microsoft/syntheseus/)

Below, `<path to saturn>` refers to where the `saturn` codebase is cloned.

In [2]:
# Each dictionary below defines a specific reward function component (for example, QED)
# For a list of supported oracle components, see oracles/utils.py
# Below are details on the key parameters:
# "weight": weight of the oracle component - a higher weight makes the component's reward contribution more important
# "preliminary_check": whether to run this oracle component first and discard any SMILES whose reward does not pass a threshold.
#                       This is useful for computationally inexpensive components as a way to "pre-screen" the batch and not waste oracle calls
# "reward_shaping_function_parameters": reward shaping function to apply to the component. The syntax is exactly the same as REINVENT 3.2. 
#                                       See the following notebook for function visualizations: https://github.com/MolecularAI/ReinventCommunity/blob/master/notebooks/Score_Transformations.ipynb

oracle_components = [
    # QuickVina2-GPU docking
    {
        "name": "quickvina2_gpu",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {
            "binary": "<path to the QuickVina2-GPU binary executable>",
            "force_field": "uff",
            "receptor": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-2-monomers-pdbfixer.pdbqt",
            "reference_ligand": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-reference.pdb",
            "thread": 8000,
            "results_dir": "<where to save docking results (poses and scores)>"
        },
        "reward_shaping_function_parameters": {
            "transformation_function": "reverse_sigmoid",
            "parameters": {
                "low": -16,
                "high": 0,
                "k": 0.15
            }
        }
    },
    # QED
    {
        "name": "qed",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "no_transformation"
        }
    },
    # Number of hydrogen bond donors
    {
        "name": "num_hbd",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "step",
            "parameters": {
                "low": 0,
                "high": 3
            }
        }
    }
]
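The two shaping functions used above can be sketched as follows. This is a minimal illustration that assumes the `reverse_sigmoid` and `step` transformations follow the REINVENT 3.2 definitions; the function names and signatures here are for illustration only and are not Saturn's API. With `low=-16` and `high=0`, the midpoint docking score of -8 maps to a reward of 0.5, and more negative scores map to higher rewards. The `step` function with `low=0` and `high=3` rewards molecules with at most 3 hydrogen bond donors.

```python
def reverse_sigmoid(value: float, low: float, high: float, k: float) -> float:
    """Map lower raw values (e.g. more negative docking scores) to rewards closer to 1.

    Assumed to mirror the REINVENT 3.2 reverse sigmoid transformation.
    """
    exponent = k * (value - (high + low) / 2) * 10 / (high - low)
    return 1.0 / (1.0 + 10.0 ** exponent)


def step(value: float, low: float, high: float) -> float:
    """Return 1.0 inside the closed interval [low, high], otherwise 0.0."""
    return 1.0 if low <= value <= high else 0.0


# Docking scores shaped with the parameters from the config above
for docking_score in (-16.0, -8.0, 0.0):
    reward = reverse_sigmoid(docking_score, low=-16, high=0, k=0.15)
    print(f"docking score {docking_score:6.1f} -> reward {reward:.3f}")

# Hydrogen bond donor counts shaped with the parameters from the config above
for num_hbd in (0, 3, 4):
    print(f"num_hbd {num_hbd} -> reward {step(num_hbd, low=0, high=3)}")
```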

In [3]:
# This is the main block of the notebook as all reaction constraints can be specified by the single dictionary below
reaction_constraints_oracle = {
    "name": "syntheseus",
    "weight": 1,
    "preliminary_check": False,
    "specific_parameters": {
        # This is the syntheseus conda environment
        "syntheseus_env_name": "syntheseus-full",
        
        # This is the retrosynthesis model. All pre-print experiments were run with MEGAN,
        # but "localretro", "retroknn", "rootaligned", and "graph2edits" are also supported
        "reaction_model": "megan",

        # This is the path to the building blocks stock file
        # The eMolecules stock used in the pre-print can be downloaded from: https://doi.org/10.6084/m9.figshare.29040977.v1
        "building_blocks_file": "<path to building blocks stock file>", 

        # -------------------------------------------------------------------------------------------------------------------------------
        # The block below concerns enforcing specific building blocks, based on our TANGO pre-print: https://arxiv.org/abs/2410.11527
        # This means that generated molecules will not only have a predicted synthesis route, but one of the building blocks in 
        # "enforced_building_blocks_file" will be present in the routes. 
        # WARNING: If using this option with your own building blocks, ensure that the "building_blocks_file" also contains the 
        #          "enforced blocks". This is important because otherwise, the retrosynthesis model cannot even propose the 
        #          "enforced blocks" and the reaction constraint would be impossible to satisfy
        # -------------------------------------------------------------------------------------------------------------------------------
        "enforced_building_blocks": {
            # True if you want to enforce blocks
            "enforce_blocks": False,

            # This is the path to the file containing the enforced building blocks
            # The enforced blocks used in the pre-print can be downloaded from: https://doi.org/10.6084/m9.figshare.29040977.v1
            # They are a sub-set of the eMolecules stock
            "enforced_building_blocks_file": "<path to enforced building blocks stock file>",

            # True if you want to enforce the building blocks at the start of the route
            # NOTE: This was not experimented with in the pre-print, but examples of this capability can be seen in the TANGO pre-print: https://arxiv.org/abs/2410.11527
            "enforce_start": False,

            # True if you want to use the TANGO reward to enforce building blocks (should always be True)
            "use_dense_reward": True,

            # These are the default parameters of TANGO
            "reward_type": "tango_fms",
            "tango_weights": {
                "tanimoto": 0.5,
                "fg": 0.5,
                "fms": 0.5
            }
        },
        # -------------------------------------------------------------------------------------------------------------------------------
        # The block below concerns enforcing reaction constraints: the presence of specific reaction classes in the predicted
        # synthesis routes, the avoidance of specific reaction classes, or both.
        # Reactions in the routes are labelled with Rxn-INSIGHT (default) or NameRxn and matched against the lists below.
        # -------------------------------------------------------------------------------------------------------------------------------
        "enforced_reactions": {
            # Must be True if wanting to enforce *any* reaction constraints
            "enforce_rxn_class_presence": True,

            # If True, *only* reactions listed in "enforced_rxn_classes" are permitted
            # NOTE: This naturally implies all other reactions are avoided
            "enforce_all_reactions": True,

            # Rxn-INSIGHT conda environment name
            "rxn_insight_env_name": "rxn-insight",

            # True if you want to use NameRxn to label reactions instead of Rxn-INSIGHT
            # NOTE: This requires a license from NextMove Software
            "use_namerxn": False,
            "namerxn_binary_path": "<path to your namerxn binary executable>/HazELNut/namerxn",

            # List of reactions to enforce
            # NOTE: Reaction matching is performed by case-insensitive substring matching, so be careful with typos.
            #       A reaction is matched if its label contains one of the strings below. For example, "to amide" will match
            #       any reaction with "to amide" in the label. For Rxn-INSIGHT reaction names, see https://github.com/schwallergroup/Rxn-INSIGHT/blob/master/src/rxn_insight/data/smirks.json
            "enforced_rxn_classes": [
                # Example: "to amide"
                # Any number of reactions can be listed here
            ],

            # List of reactions to avoid
            # NOTE: If reactions to avoid are listed (below) *and* the presence of reactions is also enforced (above),
            #       then satisfying the reaction constraint means satisfying both the presence and avoidance constraints
            "avoid_rxn_classes": [
                # Example: ["protection", "deprotection"]
                # Any number of reactions can be listed here
            ],

            # Path to the Rxn-INSIGHT extraction script - other than the saturn clone path, this does not need to be changed
            "rxn_insight_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_rxn_insight_info.py",

            # Path to the NameRxn extraction script - other than the saturn clone path, this does not need to be changed
            "namerxn_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_namerxn_info.py",

            # Whether to perform reaction-based enumeration to yield initial molecules already satisfying the reaction constraints
            # NOTE: In our preliminary testing, this was not strictly beneficial in the case studies of the pre-print
            "seed_reactions": False,
            # Reaction "seeding" requires some pre-computed files. If seed_reactions is True, the pre-computed file will be checked
            # If it does not exist, it will be automatically generated.
            "seed_reactions_file_folder": "dummy"
        },

        # Helper script for extracting route data - other than the saturn clone path, this does not need to be changed
        "route_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_syntheseus_route_data.py",

        # Maximum time allowed for the retrosynthesis model
        "time_limit_s": 180,

        # True means to incentivize shorter routes
        "minimize_path_length": False, 

        # Parallelization is not supported at the moment - in the future, multi-GPU/threading will be supported
        "parallelize": False,  
        "max_workers": 4,


        "results_dir": "<path to save raw syntheseus output (pickled data, rendered PDF routes, etc.)>"
    },
    "reward_shaping_function_parameters": {
        "transformation_function": "no_transformation"
    }
}

# Add this oracle component to the list of reward components
oracle_components.append(reaction_constraints_oracle)
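The WARNING above about enforced building blocks can be checked before launching a run. The sketch below is a hypothetical helper, not part of Saturn. It assumes both stock files are plain text with one SMILES per line (the first whitespace-separated token on each line) and it compares the strings exactly, without canonicalization, so the two files must use the same SMILES representation.

```python
from pathlib import Path


def read_smiles(path: str) -> set:
    """Read the first whitespace-separated token of every non-empty line."""
    lines = Path(path).read_text().splitlines()
    return {line.split()[0] for line in lines if line.strip()}


def missing_enforced_blocks(building_blocks_file: str, enforced_blocks_file: str) -> set:
    """Return the enforced blocks that are absent from the general building blocks stock."""
    return read_smiles(enforced_blocks_file) - read_smiles(building_blocks_file)
```

Calling `missing_enforced_blocks` with the paths set for `"building_blocks_file"` and `"enforced_building_blocks_file"` should return an empty set. Any SMILES it returns cannot be proposed by the retrosynthesis search under the assumptions above.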


In [4]:
config = {
    "logging": {
        # How often to save out generated molecules and a model checkpoint - 5,000 denotes after every 5,000 oracle calls
        "logging_frequency": 5000,
        "logging_path": "<path to save log file>",
        "model_checkpoints_dir": "<path to directory to save model checkpoints>"
    },
    "oracle": {
        # Oracle budget
        "budget": 15000,
        # False denotes that molecules generated previously have their rewards retrieved from the cache, which does not consume an oracle call
        "allow_oracle_repeats": False,

        # Weighted geometric mean to aggregate reward components
        "aggregator": "product",

        # This is the list of reward components defined in the cells above
        "components": oracle_components
    },
    "goal_directed_generation": {
        "reinforcement_learning": {
            "prior": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            "agent": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            # These parameters affect sampling behaviour - default works
            # See https://arxiv.org/abs/2405.17066 for more details
            "batch_size": 64,
            "learning_rate": 0.0001,
            "sigma": 128.0,
            "augmented_memory": True,
            "augmentation_rounds": 2,
            "selective_memory_purge": True
        },
        # Hill-climbing experience replay adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x
        "experience_replay": {
            "memory_size": 100,
            "sample_size": 10,
            "smiles": []
        },
        # Diversity filter adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00473-0
        "diversity_filter": {
            "name": "IdenticalMurckoScaffold",
            "bucket_size": 10
        },
        # This block is for Hallucinated Memory: https://arxiv.org/abs/2405.17066 - details are omitted here
        "hallucinated_memory": {
            "execute_hallucinated_memory": False,
            "hallucination_method": "ga",
            "num_hallucinations": 100,
            "num_selected": 5,
            "selection_criterion": "random"
        },
        # This block is for Beam Enumeration: https://openreview.net/forum?id=7UhxsmbdaQ - details are omitted here
        "beam_enumeration": {
            "execute_beam_enumeration": False,
            "beam_k": 2,
            "beam_steps": 18,
            "substructure_type": "structure",
            "structure_min_size": 15,
            "pool_size": 4,
            "pool_saving_frequency": 1000,
            "patience": 5,
            "token_sampling_method": "topk",
            "filter_patience_limit": 100000
        }
    },
    # This block is for pre-training - details are omitted here
    "distribution_learning": {
        "parameters": {
            "agent": "<unused>",
            "training_steps": 20,
            "batch_size": 512,
            "learning_rate": 0.0001,
            "training_dataset_path": "<path to training dataset>",
            "train_with_randomization": True,
            "transfer_learning": False
        }
    },
    # "goal_directed_generation" denotes a reinforcement learning run
    "running_mode": "goal_directed_generation",
    # The default model architecture is Mamba: https://arxiv.org/abs/2312.00752
    "model_architecture": "mamba",
    # CPU also works but it is *extremely* slow
    "device": "cuda",
    "seed": 0
}

CONFIG_SAVE_PATH = "<path to the directory in which to save config.json>"

In [None]:
with open(os.path.join(CONFIG_SAVE_PATH, "config.json"), "w") as f:
    json.dump(config, f, indent=4)
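The `"aggregator": "product"` setting is described in the config as a weighted geometric mean. A minimal sketch of that aggregation is given here for intuition. It assumes the exponents are the component weights normalized by their sum, following the REINVENT convention; the function name is hypothetical and this is not Saturn's implementation.

```python
import math
from typing import Sequence


def weighted_geometric_mean(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregate shaped rewards in [0, 1] as prod(r_i ** (w_i / sum(w)))."""
    total_weight = sum(weights)
    return math.prod(r ** (w / total_weight) for r, w in zip(rewards, weights))


# Four components with equal weights, as in the config above (reward values are made up)
print(weighted_geometric_mean([0.8, 0.6, 1.0, 1.0], [1, 1, 1, 1]))

# A single component with zero reward zeroes the aggregated reward
print(weighted_geometric_mean([0.8, 0.6, 0.0, 1.0], [1, 1, 1, 1]))
```

Under this form, a molecule that fails a binary component such as the `step` function on hydrogen bond donors receives an aggregated reward of 0 regardless of its docking score.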

## Example Full Config JSONs

The cells below produce two example configuration JSONs used in the pre-print.

**NOTE**: The `saturn` clone path and save paths still need to be specified.
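Since the paths in these configs are written as `<...>` placeholders, it can help to list the ones that are still unfilled before saving. The helper below is a hypothetical sketch, not part of Saturn. It walks a config dictionary and reports every string value that still starts with `<`. Entries that are intentionally left as placeholders, such as `"<unused>"`, are reported as well.

```python
def find_placeholders(node, path=""):
    """Return the key paths of all string values that still start with '<'."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_placeholders(value, f"{path}.{key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.startswith("<"):
        hits.append(path)
    return hits


# Small stand-in for the full config dictionary
example_config = {
    "logging": {"logging_frequency": 5000, "logging_path": "<path to save log file>"},
    "oracle": {"components": [{"name": "qed", "specific_parameters": {}}]},
    "device": "cuda",
}
print(find_placeholders(example_config))  # ['logging.logging_path']
```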

#### Enforce the presence of the Mitsunobu reaction while avoiding protection/deprotection reactions

In [2]:
oracle_components = [
    # QuickVina2-GPU docking
    {
        "name": "quickvina2_gpu",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {
            "binary": "<path to the QuickVina2-GPU binary executable>",
            "force_field": "uff",
            "receptor": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-2-monomers-pdbfixer.pdbqt",
            "reference_ligand": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-reference.pdb",
            "thread": 8000,
            "results_dir": "<where to save docking results (poses and scores)>"
        },
        "reward_shaping_function_parameters": {
            "transformation_function": "reverse_sigmoid",
            "parameters": {
                "low": -16,
                "high": 0,
                "k": 0.15
            }
        }
    },
    # QED
    {
        "name": "qed",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "no_transformation"
        }
    },
    # Number of hydrogen bond donors
    {
        "name": "num_hbd",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "step",
            "parameters": {
                "low": 0,
                "high": 3
            }
        }
    },
    # Syntheseus
    {
    "name": "syntheseus",
    "weight": 1,
    "preliminary_check": False,
    "specific_parameters": {
        "syntheseus_env_name": "syntheseus-full",
        "reaction_model": "megan",
        "building_blocks_file": "<path to building blocks stock file>", 
        "enforced_building_blocks": {
            "enforce_blocks": False,
            "enforced_building_blocks_file": "<path to enforced building blocks stock file>",
            "enforce_start": False,
            "use_dense_reward": True,
            "reward_type": "tango_fms",
            "tango_weights": {
                "tanimoto": 0.5,
                "fg": 0.5,
                "fms": 0.5
            }
        },
        "enforced_reactions": {
            "enforce_rxn_class_presence": True,
            "enforce_all_reactions": False,
            "rxn_insight_env_name": "rxn-insight",
            "use_namerxn": False,
            "namerxn_binary_path": "<path to your namerxn binary executable>/HazELNut/namerxn",
            "enforced_rxn_classes": [
                "mitsunobu"
            ],
            "avoid_rxn_classes": [
                "protection", 
                "deprotection"
            ],
            "rxn_insight_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_rxn_insight_info.py",
            "namerxn_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_namerxn_info.py",
            "seed_reactions": False,
            "seed_reactions_file_folder": "dummy"
        },
        "route_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_syntheseus_route_data.py",
        "time_limit_s": 180,
        "minimize_path_length": False, 
        "parallelize": False,  
        "max_workers": 4,
        "results_dir": "<path to save raw syntheseus output (pickled data, rendered PDF routes, etc.)>"
    },
    "reward_shaping_function_parameters": {
            "transformation_function": "no_transformation"
        }
    }
]

In [3]:
config = {
    "logging": {
        # How often to save out generated molecules and a model checkpoint - 5,000 denotes after every 5,000 oracle calls
        "logging_frequency": 5000,
        "logging_path": "<path to save log file>",
        "model_checkpoints_dir": "<path to directory to save model checkpoints>"
    },
    "oracle": {
        # Oracle budget
        "budget": 15000,
        # False denotes that molecules generated previously have their rewards retrieved from the cache, which does not consume an oracle call
        "allow_oracle_repeats": False,

        # Weighted geometric mean to aggregate reward components
        "aggregator": "product",

        # This is the list of reward components defined in the cells above
        "components": oracle_components
    },
    "goal_directed_generation": {
        "reinforcement_learning": {
            "prior": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            "agent": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            # These parameters affect sampling behaviour - default works
            # See https://arxiv.org/abs/2405.17066 for more details
            "batch_size": 64,
            "learning_rate": 0.0001,
            "sigma": 128.0,
            "augmented_memory": True,
            "augmentation_rounds": 2,
            "selective_memory_purge": True
        },
        # Hill-climbing experience replay adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x
        "experience_replay": {
            "memory_size": 100,
            "sample_size": 10,
            "smiles": []
        },
        # Diversity filter adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00473-0
        "diversity_filter": {
            "name": "IdenticalMurckoScaffold",
            "bucket_size": 10
        },
        # This block is for Hallucinated Memory: https://arxiv.org/abs/2405.17066 - details are omitted here
        "hallucinated_memory": {
            "execute_hallucinated_memory": False,
            "hallucination_method": "ga",
            "num_hallucinations": 100,
            "num_selected": 5,
            "selection_criterion": "random"
        },
        # This block is for Beam Enumeration: https://openreview.net/forum?id=7UhxsmbdaQ - details are omitted here
        "beam_enumeration": {
            "execute_beam_enumeration": False,
            "beam_k": 2,
            "beam_steps": 18,
            "substructure_type": "structure",
            "structure_min_size": 15,
            "pool_size": 4,
            "pool_saving_frequency": 1000,
            "patience": 5,
            "token_sampling_method": "topk",
            "filter_patience_limit": 100000
        }
    },
    # This block is for pre-training - details are omitted here
    "distribution_learning": {
        "parameters": {
            "agent": "<unused>",
            "training_steps": 20,
            "batch_size": 512,
            "learning_rate": 0.0001,
            "training_dataset_path": "<path to training dataset>",
            "train_with_randomization": True,
            "transfer_learning": False
        }
    },
    # "goal_directed_generation" denotes a reinforcement learning run
    "running_mode": "goal_directed_generation",
    # The default model architecture is Mamba: https://arxiv.org/abs/2312.00752
    "model_architecture": "mamba",
    # CPU also works but it is *extremely* slow
    "device": "cuda",
    "seed": 0
}

CONFIG_SAVE_PATH = "./enforce-mitsunobu-avoid-protection-deprotection.json"
with open(CONFIG_SAVE_PATH, "w") as f:
    json.dump(config, f, indent=4)
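The configuration above combines a presence constraint with an avoidance constraint. The sketch below illustrates the matching semantics described in the config comments: case-insensitive substring matching, the `enforce_all_reactions` switch, and the requirement that both presence and avoidance hold. The function names and the reaction labels are hypothetical, and this is an illustration under those assumptions rather than Saturn's implementation.

```python
from typing import Sequence


def contains_class(label: str, rxn_classes: Sequence[str]) -> bool:
    """True if the reaction label contains any of the class strings (case-insensitive)."""
    label = label.lower()
    return any(rxn_class.lower() in label for rxn_class in rxn_classes)


def route_satisfies_constraints(
    route_labels: Sequence[str],
    enforced: Sequence[str],
    avoided: Sequence[str],
    enforce_all: bool,
) -> bool:
    """Check one route (a list of reaction labels) against presence and avoidance constraints."""
    if not enforced:
        presence_ok = True
    elif enforce_all:
        # Only enforced reaction classes are permitted anywhere in the route
        presence_ok = bool(route_labels) and all(contains_class(l, enforced) for l in route_labels)
    else:
        # At least one reaction in the route must belong to an enforced class
        presence_ok = any(contains_class(l, enforced) for l in route_labels)
    avoidance_ok = not any(contains_class(l, avoided) for l in route_labels)
    return presence_ok and avoidance_ok


# Hypothetical reaction labels for a two-step route
route = ["Mitsunobu reaction", "Carboxylic acid to amide conversion"]
print(route_satisfies_constraints(route, ["mitsunobu"], ["protection", "deprotection"], False))
print(route_satisfies_constraints(route, ["to amide"], [], True))
```

With substring matching, the string `"protection"` also matches any label containing `"deprotection"`, so listing both is explicit rather than strictly necessary under this sketch.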

#### Enforce all amide reactions

In [8]:
oracle_components = [
    # QuickVina2-GPU docking
    {
        "name": "quickvina2_gpu",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {
            "binary": "<path to the QuickVina2-GPU binary executable>",
            "force_field": "uff",
            "receptor": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-2-monomers-pdbfixer.pdbqt",
            "reference_ligand": "<path to saturn>/experimental_reproduction/synthesizability/7uvu-reference.pdb",
            "thread": 8000,
            "results_dir": "<where to save docking results (poses and scores)>"
        },
        "reward_shaping_function_parameters": {
            "transformation_function": "reverse_sigmoid",
            "parameters": {
                "low": -16,
                "high": 0,
                "k": 0.15
            }
        }
    },
    # QED
    {
        "name": "qed",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "no_transformation"
        }
    },
    # Number of hydrogen bond donors
    {
        "name": "num_hbd",
        "weight": 1,
        "preliminary_check": False,
        "specific_parameters": {},
        "reward_shaping_function_parameters": {
            "transformation_function": "step",
            "parameters": {
                "low": 0,
                "high": 3
            }
        }
    },
    # Syntheseus
    {
    "name": "syntheseus",
    "weight": 1,
    "preliminary_check": False,
    "specific_parameters": {
        "syntheseus_env_name": "syntheseus-full",
        "reaction_model": "megan",
        "building_blocks_file": "<path to building blocks stock file>", 
        "enforced_building_blocks": {
            "enforce_blocks": False,
            "enforced_building_blocks_file": "<path to enforced building blocks stock file>",
            "enforce_start": False,
            "use_dense_reward": True,
            "reward_type": "tango_fms",
            "tango_weights": {
                "tanimoto": 0.5,
                "fg": 0.5,
                "fms": 0.5
            }
        },
        "enforced_reactions": {
            "enforce_rxn_class_presence": True,
            "enforce_all_reactions": True,  # Enforce all reactions
            "rxn_insight_env_name": "rxn-insight",
            "use_namerxn": False,
            "namerxn_binary_path": "<path to your namerxn binary executable>/HazELNut/namerxn",
            "enforced_rxn_classes": [
                "to amide"
            ],
            "avoid_rxn_classes": [],
            "rxn_insight_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_rxn_insight_info.py",
            "namerxn_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_namerxn_info.py",
            "seed_reactions": False,
            "seed_reactions_file_folder": "dummy"
        },
        "route_extraction_script_path": "<path to saturn>/oracles/synthesizability/utils/extract_syntheseus_route_data.py",
        "time_limit_s": 180,
        "minimize_path_length": False, 
        "parallelize": False,  
        "max_workers": 4,
        "results_dir": "<path to save raw syntheseus output (pickled data, rendered PDF routes, etc.)>"
    },
    "reward_shaping_function_parameters": {
            "transformation_function": "no_transformation"
        }
    }
]

In [9]:
config = {
    "logging": {
        # How often to save out generated molecules and a model checkpoint - 5,000 denotes after every 5,000 oracle calls
        "logging_frequency": 5000,
        "logging_path": "<path to save log file>",
        "model_checkpoints_dir": "<path to directory to save model checkpoints>"
    },
    "oracle": {
        # Oracle budget
        "budget": 15000,
        # False denotes that molecules generated previously have their rewards retrieved from the cache, which does not consume an oracle call
        "allow_oracle_repeats": False,

        # Weighted geometric mean to aggregate reward components
        "aggregator": "product",

        # This is the list of reward components defined in the cells above
        "components": oracle_components
    },
    "goal_directed_generation": {
        "reinforcement_learning": {
            "prior": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            "agent": "<path to saturn>/experimental_reproduction/checkpoint_models/pubchem_mamba_5_retrained.prior",
            # These parameters affect sampling behaviour - default works
            # See https://arxiv.org/abs/2405.17066 for more details
            "batch_size": 64,
            "learning_rate": 0.0001,
            "sigma": 128.0,
            "augmented_memory": True,
            "augmentation_rounds": 2,
            "selective_memory_purge": True
        },
        # Hill-climbing experience replay adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-017-0235-x
        "experience_replay": {
            "memory_size": 100,
            "sample_size": 10,
            "smiles": []
        },
        # Diversity filter adapted from: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-020-00473-0
        "diversity_filter": {
            "name": "IdenticalMurckoScaffold",
            "bucket_size": 10
        },
        # This block is for Hallucinated Memory: https://arxiv.org/abs/2405.17066 - details are omitted here
        "hallucinated_memory": {
            "execute_hallucinated_memory": False,
            "hallucination_method": "ga",
            "num_hallucinations": 100,
            "num_selected": 5,
            "selection_criterion": "random"
        },
        # This block is for Beam Enumeration: https://openreview.net/forum?id=7UhxsmbdaQ - details are omitted here
        "beam_enumeration": {
            "execute_beam_enumeration": False,
            "beam_k": 2,
            "beam_steps": 18,
            "substructure_type": "structure",
            "structure_min_size": 15,
            "pool_size": 4,
            "pool_saving_frequency": 1000,
            "patience": 5,
            "token_sampling_method": "topk",
            "filter_patience_limit": 100000
        }
    },
    # This block is for pre-training - details are omitted here
    "distribution_learning": {
        "parameters": {
            "agent": "<unused>",
            "training_steps": 20,
            "batch_size": 512,
            "learning_rate": 0.0001,
            "training_dataset_path": "<path to training dataset>",
            "train_with_randomization": True,
            "transfer_learning": False
        }
    },
    # "goal_directed_generation" denotes a reinforcement learning run
    "running_mode": "goal_directed_generation",
    # The default model architecture is Mamba: https://arxiv.org/abs/2312.00752
    "model_architecture": "mamba",
    # CPU also works but it is *extremely* slow
    "device": "cuda",
    "seed": 0
}

CONFIG_SAVE_PATH = "./enforce-all-amide-reactions.json"
with open(CONFIG_SAVE_PATH, "w") as f:
    json.dump(config, f, indent=4)

## Output Files

In the syntheseus results directory specified by `results_dir`, the raw syntheseus output is saved and tagged by the oracle calls. Other files are:

1. `matched_generated_smiles_with_rxn.json` - contains all generated SMILES satisfying the reaction constraints

2. `smiles_rxn_tracker` - contains synthesis route information for *all* generated SMILES, not only the ones that satisfy the reaction constraints. The information here is sufficient to reconstruct the synthesis routes.
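A quick way to inspect the first file after a run is sketched below. The internal structure of `matched_generated_smiles_with_rxn.json` is not described in this notebook, so the hypothetical helper only reports the top-level JSON type and the number of entries, and makes no assumption about the keys.

```python
import json
from pathlib import Path


def summarize_matched_smiles(results_dir: str):
    """Return the top-level JSON type name and entry count of the matched SMILES file."""
    path = Path(results_dir) / "matched_generated_smiles_with_rxn.json"
    with open(path) as f:
        data = json.load(f)
    return type(data).__name__, len(data)
```

Here `results_dir` is the same directory given as `"results_dir"` in the syntheseus oracle component.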