# Tour of `rxn4chemistry`

In this tour we will explore the main features of `rxn4chemistry`, the Python wrapper for [RXN for Chemistry](https://rxn.res.ibm.com).
For the full set of features, consult the [online documentation](https://rxn4chemistry.github.io/rxn4chemistry) or refer to the [GitHub repo](https://github.com/rxn4chemistry/rxn4chemistry).

Below are the tools in RXN for Chemistry. Click on a tool name to jump to the relevant section of this Jupyter notebook.

| Tool | Description | Ref |
| --- | --- | --- |
| [**Predict retrosynthesis**](#predict-retrosynthesis) | Predict possible retrosynthetic routes given a target molecule. | 1, 2 |
| [**Predict products**](#predict-products) | Predict the products of a chemical reaction given the starting materials. | 3 | 
| [**Predict reagents**](#predict-reagents) | Predict the reagents needed to convert a given starting material to a given product. | N/A | 
| [**Plan a synthesis**](#plan-a-synthesis) | Plan a synthesis starting from a target molecule, a retrosynthetic route, or an experimental procedure. | 4, 5 | 
| [**Atom mapping**](#atom-mapping) | Map atoms from starting materials to products. | 6 | 
| [**Text to procedure**](#text-to-procedure) | Translate your reaction procedures from text to exact steps to follow. | 5 | 
| [**Reaction digitization**](#reaction-digitization) | Convert images of reaction schemes to machine-readable format. | N/A | 

When publishing results obtained with RXN for Chemistry, we kindly request that you cite the article related to the tool(s) you have used:
1. *"Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy."* Schwaller, P.; Petraglia, R.; Zullo, V.; Nair, V. H.; Haeuselmann, R. A.; Pisoni, R.; Bekas, C.; Iuliano, A.; Laino, T. *Chem. Sci.* **2020**, *11*, 3316. [[link]](https://doi.org/10.1039/C9SC05704H)
2. *"Biocatalysed synthesis planning using data-driven learning"* Probst, D.; Manica, M.; Nana Teukam, Y. G.; Castrogiovanni, A.; Paratore, F.; Laino, T. *Nat. Commun.* **2022**, *13*, 964. [[link]](https://www.nature.com/articles/s41467-022-28536-w)
3. *"Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction."* Schwaller, P.; Laino, T.; Gaudin, T.; Bolgar, P.; Hunter, C. A.; Bekas, C.; Lee, A. A. *ACS Cent. Sci.* **2019**, *5*, 1572. [[link]](https://doi.org/10.1021/acscentsci.9b00576)
4. *"Inferring experimental procedures from text-based representations of chemical reactions."* Vaucher, A. C.; Schwaller, P.; Geluykens, J.; Nair, V. H.; Iuliano, A.; Laino, T. *Nat. Commun.* **2021**, *12*, 2573. [[link]](https://doi.org/10.1038/s41467-021-22951-1)
5. *"Automated extraction of chemical synthesis actions from experimental procedures."* Vaucher, A. C.; Zipoli, F.; Geluykens, J.; Nair, V. H.; Schwaller, P.; Laino, T. *Nat. Commun.* **2020**, *11*, 3601. [[link]](https://doi.org/10.1038/s41467-020-17266-6)
6. *"Extraction of organic chemistry grammar from unsupervised learning of chemical reactions."* Schwaller, P.; Hoover, B.; Reymond, J.-L.; Strobelt, H.; Laino, T. *Sci. Adv.* **2021**, *7*, eabe4166. [[link]](https://www.science.org/doi/10.1126/sciadv.abe4166)



## API access

Users in the free tier of RXN for Chemistry have UI access only.  For programmatic access to RXN for Chemistry, users need an Individual or Team subscription.  These plans feature full API access with no rate limitations. You can view subscription options [here](https://rxn.app.accelerate.science/rxn/user-subscription). 

## Instantiate the wrapper

Set up the wrapper using a valid API key. Your API key can be found on the RXN for Chemistry [profile page](https://rxn.res.ibm.com/rxn/user/profile).



In [1]:
from rxn4chemistry import RXN4ChemistryWrapper
api_key = 'api-key'
rxn = RXN4ChemistryWrapper(api_key=api_key)
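
A key typed directly into a notebook is easy to leak when the notebook is shared. As a minimal sketch of one alternative, the key can be read from an environment variable. The variable name `RXN4CHEMISTRY_API_KEY` and the helper `load_api_key` are assumptions made for this example; the wrapper is not known to read this variable on its own.

```python
import os


def load_api_key(env_var: str = "RXN4CHEMISTRY_API_KEY") -> str:
    """Return the API key stored in `env_var` (a hypothetical variable name)."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return api_key


# Usage (the variable must be set in your shell before starting Jupyter):
# rxn = RXN4ChemistryWrapper(api_key=load_api_key())
```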

### For on-premise installations

You can refer to a custom on-premise installation via an environment variable:

```console
export RXN4CHEMISTRY_BASE_URL="https://some.other.rxn.server"
```

Or by setting a different host in your Python code:

```python
rxn = RXN4ChemistryWrapper(api_key=api_key, base_url='https://some.other.rxn.server')
# NOTE: You can also set the host after wrapper instantiation
# rxn = RXN4ChemistryWrapper(api_key=api_key)
# rxn.set_base_url('https://some.other.rxn.server')
```

## Projects

Results from the four tools below can be saved to a project.  Projects help to organize analyses and can be shared with colleagues.

- Predict retrosynthesis
- Predict products
- Predict reagents
- Plan a synthesis

### Create a project

To create a project, call the ```.create_project()``` method on the wrapper.

This step can be skipped if you want to perform tasks within an already existing project.

In [2]:
rxn.create_project('rxn4chemistry_tour')
print(f'The project ID is {rxn.project_id}')

The project ID is 6553c24b0fb57c001f185dbe


### Set the project

Tell the wrapper which project you are working within.

In [3]:
rxn.set_project('655389f70fb57c001f17f304')

## Predict retrosynthesis

RXN for Chemistry uses a hypergraph exploration approach informed by molecular transformers for backward and forward reaction prediction.

To predict a retrosynthesis using default parameters, simply define a molecule in SMILES format and pass it as an argument to ```predict_automatic_retrosynthesis()```.

In [None]:
smiles = 'CC(=O)NC1=CC=C(Br)C=C1'
predict_automatic_retrosynthesis_response = rxn.predict_automatic_retrosynthesis(product=smiles)

Check on the status of the retrosynthesis prediction. 
- 'NEW': Job is still running.
- 'SUCCESS': Job is complete.

Rerun the cell below until 'SUCCESS' is returned.

In [None]:
predict_automatic_retrosynthesis_results = rxn.get_predict_automatic_retrosynthesis_results(
    predict_automatic_retrosynthesis_response['prediction_id']
)
predict_automatic_retrosynthesis_results['status']

'SUCCESS'
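
Instead of rerunning the cell by hand, the check can be wrapped in a small polling loop. The sketch below is an illustration rather than part of `rxn4chemistry`: `wait_for_success`, its parameters, and the default interval are assumptions, and any status other than 'SUCCESS' is treated as "still running".

```python
import time
from typing import Any, Callable, Dict


def wait_for_success(
    fetch: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
) -> Dict[str, Any]:
    """Call `fetch` until its result reports status 'SUCCESS', then return it."""
    for _ in range(max_attempts):
        results = fetch()
        if results.get("status") == "SUCCESS":
            return results
        time.sleep(interval_seconds)  # job not finished; wait before asking again
    raise TimeoutError(f"No 'SUCCESS' status after {max_attempts} attempts.")


# Usage with the wrapper from this tour:
# prediction_id = predict_automatic_retrosynthesis_response['prediction_id']
# predict_automatic_retrosynthesis_results = wait_for_success(
#     lambda: rxn.get_predict_automatic_retrosynthesis_results(prediction_id)
# )
```

Passing the request as a callable keeps the helper independent of any particular tool, so the same loop can wrap other asynchronous calls that report a top-level `status`.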

Upon 'SUCCESS' we can assess the predicted retrosynthetic paths. 

But first we define a function ```collect_reactions_from_retrosynthesis``` that can parse the results of the retrosynthesis prediction.

In [None]:
from typing import Dict, List

from IPython.display import display
from rdkit import Chem
from rdkit.Chem import AllChem, Draw  # importing Draw makes Chem.Draw available

# Recursively collect one RDKit reaction per retrosynthetic step in the tree
def collect_reactions_from_retrosynthesis(tree: Dict) -> List[Chem.rdChemReactions.ChemicalReaction]:
    reactions = []
    children = tree.get('children', [])  # leaf nodes may lack a 'children' key
    if children:
        reactions.append(
            AllChem.ReactionFromSmarts('{}>>{}'.format(
                '.'.join([node['smiles'] for node in children]),
                tree['smiles']
            ), useSmiles=True)
        )
    for node in children:
        reactions.extend(collect_reactions_from_retrosynthesis(node))
    return reactions
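
The same traversal can be tried without RDKit by collecting plain reaction SMILES strings. The sketch below assumes only the `smiles` and `children` keys used above; `collect_reaction_smiles` is a hypothetical helper and the tree is written by hand for illustration, not taken from an actual prediction.

```python
from typing import Any, Dict, List


def collect_reaction_smiles(tree: Dict[str, Any]) -> List[str]:
    """Return one 'precursors>>product' string per step, parents before children."""
    children = tree.get("children", [])
    steps: List[str] = []
    if children:
        precursors = ".".join(node["smiles"] for node in children)
        steps.append(f"{precursors}>>{tree['smiles']}")
    for node in children:
        steps.extend(collect_reaction_smiles(node))
    return steps


# Hand-written tree for illustration only; not an actual prediction.
example_tree = {
    "smiles": "CC(=O)Nc1ccc(Br)cc1",
    "children": [
        {"smiles": "CC(=O)Cl"},
        {
            "smiles": "Nc1ccc(Br)cc1",
            "children": [{"smiles": "Nc1ccccc1"}, {"smiles": "BrBr"}],
        },
    ],
}

for step in collect_reaction_smiles(example_tree):
    print(step)
# CC(=O)Cl.Nc1ccc(Br)cc1>>CC(=O)Nc1ccc(Br)cc1
# Nc1ccccc1.BrBr>>Nc1ccc(Br)cc1
```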

Then we use this helper function to display the different retrosynthesis routes produced by the tool.

In [None]:
for index, tree in enumerate(predict_automatic_retrosynthesis_results['retrosynthetic_paths']):
    print('Showing path {} with confidence {}:'.format(index, tree['confidence']))
    for reaction in collect_reactions_from_retrosynthesis(tree):
        display(Chem.Draw.ReactionToImage(reaction))
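
When many paths are returned, it can help to look at the most confident ones first. A minimal sketch, assuming each path carries the `confidence` value printed above; `rank_paths_by_confidence` is a hypothetical helper and the numbers are made up.

```python
from typing import Any, Dict, List


def rank_paths_by_confidence(paths: List[Dict[str, Any]]) -> List[int]:
    """Return path indices ordered from highest to lowest confidence."""
    return sorted(range(len(paths)), key=lambda i: paths[i]["confidence"], reverse=True)


# Made-up confidences standing in for
# predict_automatic_retrosynthesis_results['retrosynthetic_paths'].
example_paths = [{"confidence": 0.62}, {"confidence": 0.91}, {"confidence": 0.35}]
print(rank_paths_by_confidence(example_paths))  # [1, 0, 2]
```

Returning indices rather than the paths themselves keeps the result usable for later lookups into `retrosynthetic_paths`.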

## Predict products

RXN for Chemistry uses a forward reaction prediction model based on molecular transformers.  



To run a forward reaction prediction, use the ```.predict_reaction()``` function.

In [None]:
predict_reaction_response = rxn.predict_reaction(
    'BrBr.c1ccc2cc3ccccc3cc2c1'
)

Then we can get the results of the prediction.

In [None]:
predict_reaction_results = rxn.get_predict_reaction_results(
    predict_reaction_response['prediction_id']
)

Define a helper function to parse the output.

In [None]:
from rdkit import Chem
from rdkit.Chem import AllChem

def get_reaction_from_smiles(reaction_smiles: str) -> Chem.rdChemReactions.ChemicalReaction:
    return AllChem.ReactionFromSmarts(reaction_smiles, useSmiles=True)

Then use the helper function to show the predicted product.

In [None]:
get_reaction_from_smiles(predict_reaction_results['response']['payload']['attempts'][0]['smiles'])
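
The `smiles` field used above is a reaction SMILES of the form `precursors>>products`. If only the product is needed as text, the string can be split without RDKit. This is a sketch under simple assumptions: `split_reaction_smiles` is a hypothetical helper, molecules are separated by `.`, and the example product string is illustrative rather than an actual model output.

```python
from typing import List, Tuple


def split_reaction_smiles(reaction_smiles: str) -> Tuple[List[str], List[str]]:
    """Split 'A.B>>C' (or 'A.B>agents>C') into precursor and product SMILES lists."""
    parts = reaction_smiles.split(">")
    if len(parts) != 3:
        raise ValueError(f"Not a reaction SMILES: {reaction_smiles!r}")
    reactants, agents, products = parts
    # Agents (the middle field) are counted as precursors here.
    precursors = [s for s in (reactants + "." + agents).split(".") if s]
    return precursors, [s for s in products.split(".") if s]


# Illustrative string; the product is not an actual prediction.
precursors, products = split_reaction_smiles(
    "BrBr.c1ccc2cc3ccccc3cc2c1>>Brc1c2ccccc2cc2ccccc12"
)
print(precursors)  # ['BrBr', 'c1ccc2cc3ccccc3cc2c1']
print(products)    # ['Brc1c2ccccc2cc2ccccc12']
```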

It is also possible to run forward reaction predictions in batches to use the service in a high-throughput fashion.  Note that this will not store the information in any project.

In [None]:
predict_reaction_batch_response = rxn.predict_reaction_batch(
    precursors_list=['BrBr.c1ccc2cc3ccccc3cc2c1', 'Cl.c1ccc2cc3ccccc3cc2c1']
)

In [None]:
for reaction_prediction in rxn.get_predict_reaction_batch_results(
    predict_reaction_batch_response['task_id']
)['predictions']:
    print(f'Confidence: {reaction_prediction["confidence"]}')
    display(get_reaction_from_smiles(reaction_prediction['smiles']))

It is also possible to request the top-n predicted outcomes for each reaction in a batch:

In [None]:
response = rxn.predict_reaction_batch_topn(
    precursors_lists=[
        ["BrBr", "c1ccc2cc3ccccc3cc2c1"],
        ["BrBr", "c1ccc2cc3ccccc3cc2c1CCO"],
        ["Cl", "CCC(=O)NCCC", "O"],
    ],
    topn=5,
)

In [None]:
result = rxn.get_predict_reaction_batch_topn_results(
    response["task_id"]
)

In [None]:
for i, reaction_predictions in enumerate(result['predictions'], 1):
    print(f'Outcomes for reaction no {i}:')
    for j, prediction in enumerate(reaction_predictions["results"], 1):
        product_smiles = ".".join(prediction["smiles"])
        confidence = prediction["confidence"]
        print(f'  Product(s) {j}: {product_smiles}, with confidence {confidence}')
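
To keep only the most confident outcome per reaction, the nested result can be reduced as sketched below. The code assumes the structure used in the loop above (`predictions`, `results`, `smiles`, `confidence`) and at least one outcome per reaction; `best_outcomes` is a hypothetical helper and the data are made up.

```python
from typing import Any, Dict, List, Tuple


def best_outcomes(result: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (product SMILES, confidence) of the most confident outcome per reaction."""
    best = []
    for reaction_predictions in result["predictions"]:
        top = max(reaction_predictions["results"], key=lambda p: p["confidence"])
        best.append((".".join(top["smiles"]), top["confidence"]))
    return best


# Made-up data with the same shape as the top-n result above.
example_result = {
    "predictions": [
        {"results": [
            {"smiles": ["CCO"], "confidence": 0.2},
            {"smiles": ["CC=O", "O"], "confidence": 0.7},
        ]},
        {"results": [{"smiles": ["CCN"], "confidence": 0.9}]},
    ]
}
print(best_outcomes(example_result))  # [('CC=O.O', 0.7), ('CCN', 0.9)]
```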

**NOTE:** The results for batch predictions are only stored temporarily in our databases, so we strongly recommend saving them elsewhere.
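
One simple way to keep a copy is to write the result dictionary to a JSON file. This is a minimal sketch that assumes the result contains only JSON-serializable values (dictionaries, lists, strings, numbers); `save_predictions`, `load_predictions`, and the file name are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_predictions(result: Dict[str, Any], path: str) -> Path:
    """Write a batch result dictionary to `path` as indented JSON."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return target


def load_predictions(path: str) -> Dict[str, Any]:
    """Read back a result dictionary written by `save_predictions`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Usage with the top-n result from this tour (file name is an example):
# save_predictions(result, 'results/topn_predictions.json')
```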

## Predict reagents
Complete a reaction by predicting the reagents needed to convert a given starting material into a given product.

In [None]:
starting_material_smiles = '[CH3:1][O:2][C:3]1[CH:9]=[C:8]([CH3:10])[CH:7]=[C:6]([O:11][CH3:12])[C:4]=1[NH2:5]'
product_smiles = '[CH3:12][O:11][C:6]1[CH:7]=[C:8]([CH3:10])[CH:9]=[C:3]([O:2][CH3:1])[C:4]=1[NH:5][C:20](=[O:25])[CH2:21][CH:22]([CH3:24])[CH3:23]'
response = rxn.predict_reagents(
    starting_material_smiles,
    product_smiles
)

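Since this call is asynchronous, we poll the response until our result is ready.

In [None]:
# Check continuously to see if the results are ready
# Should print out 'SUCCESS' when ready
# NOTE: `result` must hold the response retrieved for this reagent prediction;
# the retrieval call is not shown in this notebook.
print(result['response']['payload']['status'])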

## Plan a synthesis

Once a retrosynthesis prediction is performed, we can turn that synthesis route into a synthesis plan.

**NOTE:** The **Plan a synthesis** tool can also start from a target molecule or from an experimental procedure in text format (neither is shown here).

In [None]:
create_synthesis_from_sequence_response = rxn.create_synthesis_from_sequence(
    sequence_id=predict_automatic_retrosynthesis_results['retrosynthetic_paths'][1]['sequenceId']
)
print(f'Identifier for the synthesis: {create_synthesis_from_sequence_response["synthesis_id"]}')

Identifier for the synthesis: 6553e68e0fb57c001f18a0d8


Inspect the actions predicted by the AI model.

In [None]:
synthesis_id = create_synthesis_from_sequence_response['synthesis_id']
node_ids = rxn.get_node_ids(synthesis_id=synthesis_id)
node_id = node_ids[-1]

In [None]:
import json

actions_and_product = rxn.get_reaction_settings(synthesis_id=synthesis_id, node_id=node_id)
node_actions, product = actions_and_product['actions'], actions_and_product['product']

for index, action in enumerate(node_actions, 1):
    print(f'Action {index}:\n{json.dumps(action, indent=4)}\n')

The action that adds acetyl chloride must be changed so that it is not dropwise, since solids are added in pins. We also remove the purify action, since it is currently not supported on commonly used robotic hardware.


In [None]:
# update the action so the solid is not added dropwise
node_actions[3]['content']['dropwise']['value'] = False

# remove the purify action 
node_actions.pop(11)

# update the node actions
rxn.update_reaction_settings(synthesis_id=synthesis_id, node_id=node_id, actions=node_actions, product=product)
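
Positions such as `3` and `11` depend on the actions predicted for this particular route, so they may point at different actions for another synthesis. The sketch below filters by action type instead. It assumes each action dictionary has a `name` key identifying its type; that key, its values, and the helper `drop_actions_named` are assumptions for illustration, so check them against the output printed above before relying on them.

```python
from typing import Any, Dict, List


def drop_actions_named(actions: List[Dict[str, Any]], name: str) -> List[Dict[str, Any]]:
    """Return a new list without the actions whose (assumed) 'name' equals `name`."""
    return [action for action in actions if action.get("name") != name]


# Mock actions; the 'name' key and its values are assumptions for illustration.
example_actions = [
    {"name": "add", "content": {"dropwise": {"value": True}}},
    {"name": "stir", "content": {}},
    {"name": "purify", "content": {}},
]
kept = drop_actions_named(example_actions, "purify")
print([action["name"] for action in kept])  # ['add', 'stir']
```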


## Atom mapping

Map atoms from starting materials to product.

In [23]:
reaction_prop = rxn.predict_reaction_properties(
    ['CC(=O)[OH]>>CC(=O)OCC']
)
print(reaction_prop['response']['payload']['content'][0]['value'])

[CH3:1][C:5](=[O:3])[OH:4]>>[CH3:1][C:2](=[O:3])[O:4][CH2:5][CH3:6]
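
In a mapped reaction SMILES, each atom carries its map number as `:n` just before the closing bracket. As a minimal sketch, the numbers on both sides can be compared to list product atoms that have no counterpart among the starting materials. The helper names are hypothetical, and the regular expression only extracts map numbers; it does not parse SMILES in general.

```python
import re
from typing import List, Set

_MAP_NUMBER = re.compile(r":(\d+)\]")


def map_numbers(smiles: str) -> Set[int]:
    """Return the set of atom-map numbers that appear in a mapped SMILES string."""
    return {int(n) for n in _MAP_NUMBER.findall(smiles)}


def unmatched_product_atoms(mapped_reaction: str) -> List[int]:
    """Return product map numbers that do not occur in the starting materials."""
    reactants, _agents, products = mapped_reaction.split(">")
    return sorted(map_numbers(products) - map_numbers(reactants))


mapped = "[CH3:1][C:5](=[O:3])[OH:4]>>[CH3:1][C:2](=[O:3])[O:4][CH2:5][CH3:6]"
print(unmatched_product_atoms(mapped))  # [2, 6]
```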


## Text to procedure

RXN for Chemistry can extract machine-readable actions from descriptions of chemical procedures written as paragraphs.

Extract the actions from a procedure:

In [28]:
actions_from_procedure_results = rxn.paragraph_to_actions(
    'To a stirred solution of '
    '7-(difluoromethylsulfonyl)-4-fluoro-indan-1-one (110 mg, '
    '0.42 mmol) in methanol (4 mL) was added sodium borohydride '
    '(24 mg, 0.62 mmol). The reaction mixture was stirred at '
    'ambient temperature for 1 hour.'
)

The procedure can then be reconstituted as a list of standardized steps.  These procedures can be a useful starting point for passing information to a robotic platform.

In [None]:
for index, action in enumerate(actions_from_procedure_results['actions'], 1):
    print(f'{index}. {action}')

## Reaction digitization
Convert images of reaction schemes into a machine-readable format.

In [None]:
# Upload the picture and get back the file ID
# Use the full (absolute) path to the image file
response = rxn.upload_file('/Users/derek/Documents/RXN/example-data.png')
file_id = response['response']['payload']['id']
print(file_id)
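
Because the upload expects a full path, it can be useful to resolve the path and confirm the file exists before sending it. A minimal sketch using only the standard library; `resolve_image_path` is a hypothetical helper, not part of `rxn4chemistry`.

```python
from pathlib import Path


def resolve_image_path(path: str) -> str:
    """Expand '~', make the path absolute, and fail early if the file is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No such image file: {resolved}")
    return str(resolved)


# Usage (the file name is an example):
# response = rxn.upload_file(resolve_image_path('example-data.png'))
```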

Use the `file_id` to start the digitization process. This may take some time.

In [None]:
result = rxn.digitize_reaction(file_id)
result['response']['payload']['reactions']