# Tour of `rxn4chemistry`

In this quick tour we will explore the main features of `rxn4chemistry`, the Python wrapper for [RXN for Chemistry](https://rxn.res.ibm.com).
For a full set of features check the [GitHub repo](https://github.com/rxn4chemistry/rxn4chemistry) and/or the [online documentation](https://rxn4chemistry.github.io/rxn4chemistry).

In [None]:
import logging
from typing import Dict, List
from rdkit import Chem
from rdkit.Chem import AllChem
from IPython.display import display

logging.basicConfig(level=logging.INFO, format='%(levelname)s : %(message)s')

def get_reaction_from_smiles(reaction_smiles: str) -> Chem.rdChemReactions.ChemicalReaction:
    return AllChem.ReactionFromSmarts(reaction_smiles, useSmiles=True)


def collect_reactions_from_retrosynthesis(tree: Dict) -> List[Chem.rdChemReactions.ChemicalReaction]:
    """Recursively collect one RDKit reaction per expanded node of a retrosynthetic tree."""
    reactions = []
    # Leaf nodes may lack the 'children' key, so fall back to an empty list.
    children = tree.get('children', [])
    if children:
        reactions.append(
            AllChem.ReactionFromSmarts('{}>>{}'.format(
                '.'.join([node['smiles'] for node in children]),
                tree['smiles']
            ), useSmiles=True)
        )
    for node in children:
        reactions.extend(collect_reactions_from_retrosynthesis(node))
    return reactions
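
To make the expected input concrete, here is a minimal sketch of the same traversal that returns plain reaction SMILES strings and needs no RDKit. The toy tree and the name `collect_reaction_smiles` are illustrative assumptions; only the `smiles` and `children` keys are taken from the helper above.

```python
from typing import Dict, List


def collect_reaction_smiles(tree: Dict) -> List[str]:
    """Return one 'precursors>>product' string per expanded node, depth first."""
    children = tree.get('children', [])
    if not children:
        return []
    precursors = '.'.join(node['smiles'] for node in children)
    reactions = [f"{precursors}>>{tree['smiles']}"]
    for node in children:
        reactions.extend(collect_reaction_smiles(node))
    return reactions


# Hypothetical two-step tree, written by hand for illustration only.
toy_tree = {
    'smiles': 'CC(=O)Nc1ccc(Br)cc1',
    'children': [
        {'smiles': 'CC(=O)Cl', 'children': []},
        {
            'smiles': 'Nc1ccc(Br)cc1',
            'children': [
                {'smiles': 'Nc1ccccc1'},  # leaf without a 'children' key
                {'smiles': 'BrBr'},
            ],
        },
    ],
}

toy_reactions = collect_reaction_smiles(toy_tree)
for reaction_smiles in toy_reactions:
    print(reaction_smiles)
```

Each string can then be passed to `get_reaction_from_smiles` if a drawing is needed.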

## Instantiating the wrapper

Set up the wrapper using a valid API key. You can get one from your [profile page](https://rxn.res.ibm.com/rxn/user/profile) on the IBM RXN website.

In [None]:
from rxn4chemistry import RXN4ChemistryWrapper
api_key = 'API_KEY'
rxn4chemistry_wrapper = RXN4ChemistryWrapper(api_key=api_key)

You can also use a custom on-premise installation by controlling an environment variable:

```console
export RXN4CHEMISTRY_BASE_URL="https://some.other.rxn.server"
```

Or set a different host in your Python code:

```python
rxn4chemistry_wrapper = RXN4ChemistryWrapper(api_key=api_key, base_url='https://some.other.rxn.server')
# or set it afterwards
# rxn4chemistry_wrapper = RXN4ChemistryWrapper(api_key=api_key)
# rxn4chemistry_wrapper.set_base_url('https://some.other.rxn.server')
```
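
If you prefer not to hard-code credentials in a notebook, both settings can be read from the environment before building the wrapper. The following is only a sketch: `RXN4CHEMISTRY_API_KEY` is a hypothetical variable name chosen for this example (only `RXN4CHEMISTRY_BASE_URL` is mentioned above), and the fallback URL is assumed to be the public site linked at the top of this tour.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumption: the public service URL linked in the introduction.
DEFAULT_BASE_URL = 'https://rxn.res.ibm.com'


def resolve_settings(env: Mapping[str, str] = os.environ) -> Tuple[Optional[str], str]:
    """Return (api_key, base_url) read from an environment-like mapping."""
    # Hypothetical variable name, not defined by the library.
    api_key = env.get('RXN4CHEMISTRY_API_KEY')
    base_url = env.get('RXN4CHEMISTRY_BASE_URL', DEFAULT_BASE_URL)
    return api_key, base_url


api_key_from_env, base_url_from_env = resolve_settings()
print(f'API key found: {api_key_from_env is not None}, base URL: {base_url_from_env}')
```

The two values can then be passed as `api_key=` and `base_url=` to `RXN4ChemistryWrapper`, as in the snippet above.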

## Create a project

Create a project. Its identifier is returned in the response and stored on the wrapper:

In [None]:
rxn4chemistry_wrapper.create_project('rxn4chemistry_tour')
print(f'Identifier for the project {rxn4chemistry_wrapper.project_id}')
# NOTE: you can create a project or set an existing one using:
# rxn4chemistry_wrapper.set_project('6088fc284fe8920001a58546')

## Product / Reaction prediction

RXN for Chemistry uses the Molecular Transformer as its forward reaction prediction model (more details in the [paper](https://doi.org/10.1021/acscentsci.9b00576)).
![molecular_transformer](https://pubs.acs.org/na101/home/literatum/publisher/achs/journals/content/acscii/2019/acscii.2019.5.issue-9/acscentsci.9b00576/20190918/images/medium/oc9b00576_0009.gif)

Running a reaction prediction is as simple as:

In [None]:
predict_reaction_response = rxn4chemistry_wrapper.predict_reaction(
    'BrBr.c1ccc2cc3ccccc3cc2c1'
)

**NOTE:** the public version of RXN for Chemistry limits the number of calls per second and per minute. The limit is currently set to 5 calls per minute, which is rarely a problem in practice. These limits can be tweaked or removed in on-premise deployments.

In [None]:
predict_reaction_results = rxn4chemistry_wrapper.get_predict_reaction_results(
    predict_reaction_response['prediction_id']
)

In [None]:
get_reaction_from_smiles(predict_reaction_results['response']['payload']['attempts'][0]['smiles'])

It is possible to run reaction prediction in batches (not storing the information in any project) to use the service in a high-throughput fashion:

In [None]:
predict_reaction_batch_response = rxn4chemistry_wrapper.predict_reaction_batch(
    precursors_list=['BrBr.c1ccc2cc3ccccc3cc2c1', 'Cl.c1ccc2cc3ccccc3cc2c1']
)

In [None]:
for reaction_prediction in rxn4chemistry_wrapper.get_predict_reaction_batch_results(
    predict_reaction_batch_response['task_id']
)['predictions']:
    print(f'Confidence: {reaction_prediction["confidence"]}')
    display(get_reaction_from_smiles(reaction_prediction['smiles']))
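
For long precursor lists it can be convenient to submit several smaller batches instead of one large call. The sketch below only splits the list; the helper name `chunked` and the batch size are arbitrary choices for illustration, and whether the service enforces a maximum batch size is not stated here.

```python
from typing import Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError('size must be a positive integer')
    for start in range(0, len(items), size):
        yield items[start:start + size]


all_precursors = [
    'BrBr.c1ccc2cc3ccccc3cc2c1',
    'Cl.c1ccc2cc3ccccc3cc2c1',
    'BrBr.c1ccc2cc3ccccc3cc2c1CCO',
]
batches = list(chunked(all_precursors, size=2))
for batch in batches:
    # Each `batch` could be passed as `precursors_list=` to predict_reaction_batch.
    print(len(batch), batch)
```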

It is also possible to predict multiple outcomes for each forward reaction (in batch):

In [None]:
response = rxn4chemistry_wrapper.predict_reaction_batch_topn(
    precursors_lists=[
        ["BrBr", "c1ccc2cc3ccccc3cc2c1"],
        ["BrBr", "c1ccc2cc3ccccc3cc2c1CCO"],
        ["Cl", "CCC(=O)NCCC", "O"],
    ],
    topn=5,
)

In [None]:
result = rxn4chemistry_wrapper.get_predict_reaction_batch_topn_results(
    response["task_id"]
)

In [None]:
for i, reaction_predictions in enumerate(result['predictions'], 1):
    print(f'Outcomes for reaction no {i}:')
    for j, prediction in enumerate(reaction_predictions["results"], 1):
        product_smiles = ".".join(prediction["smiles"])
        confidence = prediction["confidence"]
        print(f'  Product(s) {j}: {product_smiles}, with confidence {confidence}')

**NOTE:** the results for batch predictions are not stored permanently in our databases and will expire, so we strongly recommend saving them.
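
One simple way to keep a copy is to write the result dictionary to a JSON file. This is a sketch with a hand-written stand-in for `result`: its shape mirrors the keys used in the loop above (`predictions`, `results`, `smiles`, `confidence`), but the values are made up, and the helper names and file name are illustrative.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def save_results(results: Dict, path: Path) -> None:
    """Write a result dictionary to `path` as indented JSON."""
    path.write_text(json.dumps(results, indent=2))


def load_results(path: Path) -> Dict:
    """Read back a result dictionary written by `save_results`."""
    return json.loads(path.read_text())


# Made-up values; only the key layout follows the loop above.
toy_result = {
    'predictions': [
        {'results': [{'smiles': ['CCO'], 'confidence': 0.9}]},
    ]
}

# A temporary directory keeps the example free of side effects.
with tempfile.TemporaryDirectory() as directory:
    output_path = Path(directory) / 'batch_topn_results.json'
    save_results(toy_result, output_path)
    restored = load_results(output_path)

print(restored == toy_result)
```

In a real session you would pass `result` and a path of your choice instead of the toy data.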

## Reaction properties prediction (Atom Mapping)

Map atoms from starting materials to product.
Running an atom-mapping prediction is as simple as:

In [None]:
reaction_prop = rxn4chemistry_wrapper.predict_reaction_properties(
    ['CC(=O)[OH]>>CC(=O)OCC']
)
print(reaction_prop['response']['payload']['content'][0]['value'])
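
An atom-mapped reaction SMILES carries the mapping as `:n` suffixes inside the atom brackets, so it can be inspected with plain string tools. The sketch below lists the map numbers that appear in the product but in none of the precursors. The mapped string is written by hand for illustration and is not output from the service; the function names are likewise illustrative.

```python
import re
from typing import List, Set

# Matches the map number in bracket atoms such as '[CH3:1]'.
ATOM_MAP_PATTERN = re.compile(r':(\d+)\]')


def atom_map_numbers(smiles: str) -> Set[int]:
    """Return the set of atom-map numbers found in a SMILES string."""
    return {int(number) for number in ATOM_MAP_PATTERN.findall(smiles)}


def product_only_map_numbers(mapped_reaction: str) -> List[int]:
    """Return sorted map numbers present in the product but not in the precursors."""
    # 'A>>B' and 'A>agents>B' both split into exactly three parts.
    precursors, _agents, product = mapped_reaction.split('>')
    return sorted(atom_map_numbers(product) - atom_map_numbers(precursors))


# Hand-written example: the ethyl group (atoms 5 and 6) has no precursor.
toy_mapped_reaction = (
    '[CH3:1][C:2](=[O:3])[OH:4]>>'
    '[CH3:1][C:2](=[O:3])[O:4][CH2:5][CH3:6]'
)
missing = product_only_map_numbers(toy_mapped_reaction)
print(missing)
```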

## Actions from procedure description (text-to-procedure)

RXN for Chemistry can extract machine-readable actions from textual descriptions of chemical procedures (see details in the [paper](https://doi.org/10.1038/s41467-020-17266-6)).
![actions_from_procedure](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41467-020-17266-6/MediaObjects/41467_2020_17266_Fig3_HTML.png)

Extract the actions from a recipe:

In [None]:
actions_from_procedure_results = rxn4chemistry_wrapper.paragraph_to_actions(
    'To a stirred solution of '
    '7-(difluoromethylsulfonyl)-4-fluoro-indan-1-one (110 mg, '
    '0.42 mmol) in methanol (4 mL) was added sodium borohydride '
    '(24 mg, 0.62 mmol). The reaction mixture was stirred at '
    'ambient temperature for 1 hour.'
)

In [None]:
for index, action in enumerate(actions_from_procedure_results['actions'], 1):
    print(f'{index}. {action}')

## Predict retrosynthesis route

RXN for Chemistry uses a hyper-graph exploration approach informed by the Molecular Transformer for backward and forward reaction prediction (for details see the [paper](https://doi.org/10.1039/C9SC05704H)).
![retrosynthesis_prediction](https://pubs.rsc.org/en/Image/Get?imageInfo.ImageType=GA&imageInfo.ImageIdentifier.ManuscriptID=C9SC05704H&imageInfo.ImageIdentifier.Year=2020)

Running a retrosynthesis is as easy as picking a molecule and calling a one-liner:

In [None]:
smiles = 'CC(=O)NC1=CC=C(Br)C=C1'
predict_automatic_retrosynthesis_response = rxn4chemistry_wrapper.predict_automatic_retrosynthesis(product=smiles)

Check the status of the retrosynthesis prediction:

In [None]:
predict_automatic_retrosynthesis_results = rxn4chemistry_wrapper.get_predict_automatic_retrosynthesis_results(
    predict_automatic_retrosynthesis_response['prediction_id']
)
predict_automatic_retrosynthesis_results['status']

Upon 'SUCCESS' we can choose one of the returned retrosynthetic paths. The paths are sorted based on the scoring mechanism of the models:

In [None]:
# Import the drawing module explicitly; `Chem.Draw` is not guaranteed to be loaded otherwise.
from rdkit.Chem import Draw

for index, tree in enumerate(predict_automatic_retrosynthesis_results['retrosynthetic_paths']):
    print('Showing path {} with confidence {}:'.format(index, tree['confidence']))
    for reaction in collect_reactions_from_retrosynthesis(tree):
        display(Draw.ReactionToImage(reaction))

## Perform a synthesis using one of the predicted routes

Once a retrosynthesis prediction is performed we can predict a synthesis plan:

In [None]:
create_synthesis_from_sequence_response = rxn4chemistry_wrapper.create_synthesis_from_sequence(
    sequence_id=predict_automatic_retrosynthesis_results['retrosynthetic_paths'][1]['sequenceId']
)
print(f'Identifier for the synthesis: {create_synthesis_from_sequence_response["synthesis_id"]}')
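
The cell above picks a path by position. If you would rather select a path by its score, a small helper can do so explicitly. This sketch uses made-up paths: only the `confidence` and `sequenceId` keys come from the cells above, while the values and the name `most_confident_path` are hypothetical.

```python
from typing import Dict, List


def most_confident_path(paths: List[Dict]) -> Dict:
    """Return the path with the highest 'confidence' value."""
    if not paths:
        raise ValueError('no retrosynthetic paths available')
    return max(paths, key=lambda path: path['confidence'])


# Made-up identifiers and scores, for illustration only.
toy_paths = [
    {'sequenceId': 'sequence-a', 'confidence': 0.71},
    {'sequenceId': 'sequence-b', 'confidence': 0.93},
    {'sequenceId': 'sequence-c', 'confidence': 0.40},
]
chosen = most_confident_path(toy_paths)
print(chosen['sequenceId'])
```

The resulting `sequenceId` is what `create_synthesis_from_sequence` expects as `sequence_id`.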

Inspect the actions predicted by the AI model (for details see the [paper](https://doi.org/10.1038/s41467-021-22951-1)).
![smiles_to_actions](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41467-021-22951-1/MediaObjects/41467_2021_22951_Fig1_HTML.png)


In [None]:
import json
synthesis_id = create_synthesis_from_sequence_response['synthesis_id']
node_ids = rxn4chemistry_wrapper.get_node_ids(synthesis_id=synthesis_id)
node_id = node_ids[-1]

In [None]:
actions_and_product = rxn4chemistry_wrapper.get_reaction_settings(synthesis_id=synthesis_id, node_id=node_id)
node_actions, product = actions_and_product['actions'], actions_and_product['product']

for index, action in enumerate(node_actions, 1):
    print(f'Action {index}:\n{json.dumps(action, indent=2)}\n')

The action that adds acetyl chloride needs to be changed so that it is not added dropwise, since solids are added in pins. We also remove the purify action, since it is currently not supported on the robotic hardware.


In [None]:
# update the action so the solid is not added dropwise
node_actions[3]['content']['dropwise']['value'] = False

# remove the purify action
node_actions.pop(11)

# update the node actions
rxn4chemistry_wrapper.update_reaction_settings(synthesis_id=synthesis_id, node_id=node_id, actions=node_actions, product=product)
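
The positions `3` and `11` used above depend on the actions returned for this particular synthesis. A position-independent variant could filter on the action contents instead. The sketch below assumes that each action is a dictionary with a `name` key (for example `'add'` or `'purify'`); that key, the toy actions, and the function name are assumptions made for this example, while the `content -> dropwise -> value` layout is taken from the cell above. Note that it disables `dropwise` on every action that has the field, which is broader than the single edit above.

```python
import copy
from typing import Dict, List


def adapt_actions(actions: List[Dict], unsupported: str = 'purify') -> List[Dict]:
    """Return a copy of `actions` without `unsupported` steps and without dropwise addition."""
    adapted = []
    for action in copy.deepcopy(actions):  # leave the caller's list untouched
        if action.get('name') == unsupported:  # assumed 'name' key
            continue
        dropwise = action.get('content', {}).get('dropwise')
        if isinstance(dropwise, dict):
            dropwise['value'] = False
        adapted.append(action)
    return adapted


# Hand-written stand-ins for the service output.
toy_actions = [
    {'name': 'add', 'content': {'dropwise': {'value': True}}},
    {'name': 'stir', 'content': {}},
    {'name': 'purify', 'content': {}},
]
adapted_actions = adapt_actions(toy_actions)
print([action['name'] for action in adapted_actions])
```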


## Predict reagents
Plan and execute a reaction completion starting from an incomplete formula:

In [None]:
reagents_smiles = '[CH3:1][O:2][C:3]1[CH:9]=[C:8]([CH3:10])[CH:7]=[C:6]([O:11][CH3:12])[C:4]=1[NH2:5]'
product_smiles = '[CH3:12][O:11][C:6]1[CH:7]=[C:8]([CH3:10])[CH:9]=[C:3]([O:2][CH3:1])[C:4]=1[NH:5][C:20](=[O:25])[CH2:21][CH:22]([CH3:24])[CH3:23]'
response = rxn4chemistry_wrapper.predict_reagents(
    reagents_smiles,
    product_smiles
)

Since this call is asynchronous, we poll the response until our result is ready:

In [None]:
# Check continuously to see if the results are ready
# Should print out 'SUCCESS' when ready
# NOTE: `result` must hold the results fetched for the `response` obtained above
print(result['response']['payload']['status'])
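
Polling can be wrapped in a small generic helper. The sketch below is independent of the wrapper: `fetch` stands for whatever call retrieves the current result, and the demo uses a fake fetcher so that it runs offline. The helper name, the `'RUNNING'` status, and the default delay are assumptions for this example; the 15-second default is chosen only to stay below the 5 calls per minute mentioned earlier.

```python
import time
from typing import Any, Callable


def poll_until_ready(
    fetch: Callable[[], Any],
    is_ready: Callable[[Any], bool],
    delay_seconds: float = 15.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `fetch` until `is_ready(result)` is true or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if is_ready(result):
            return result
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f'result not ready after {max_attempts} attempts')


# Fake fetcher: reports a hypothetical 'RUNNING' status twice, then 'SUCCESS'.
fake_statuses = iter(['RUNNING', 'RUNNING', 'SUCCESS'])


def fake_fetch() -> dict:
    return {'response': {'payload': {'status': next(fake_statuses)}}}


final_result = poll_until_ready(
    fake_fetch,
    is_ready=lambda r: r['response']['payload']['status'] == 'SUCCESS',
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final_result['response']['payload']['status'])
```

In a real session, `fetch` would be a lambda around the wrapper call that retrieves the results for your prediction.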

## Reaction digitization
Convert images of reactions into a machine-readable format.

In [None]:
# Upload the picture and get back the file id
# Use full path
response = rxn4chemistry_wrapper.upload_file('/home/johnsmith/Downloads/example-data.png')
file_id = response['response']['payload']['id']
print(file_id)

Use the `file_id` to start the digitization process. This may take some time.

In [None]:
result = rxn4chemistry_wrapper.digitize_reaction(file_id)
result['response']['payload']['reactions']