In [None]:
%reload_ext openad.notebooks.styles

<!-- Header banner -->
<div class="banner"><div>Working with the RXN Plugin</div><b>OpenAD <span>Tutorial</span></b></div>

### Table of Contents

1. [Getting Started](#Getting-Started)
2. [Forward Reactions](#Forward-Reactions)
3. [Retrosynthesis](#Retrosynthesis)
4. [Interpreting Recipes](#Interpreting-Recipes)
5. [Enriching your Molecules with RXN Results](#Enriching-your-Molecules-with-RXN-Results)

## Getting Started

<div class="alert alert-info">
<b>Note:</b> To avoid re-running the same queries as you practice with this notebook, we use the <code>use cache</code> clause in all our examples.
</div>

### Installation
If you haven't already, you can install the plugin directly from its [GitHub repo](https://github.com/acceleratedscience/openad-plugin-rxn#readme).
    
    pip install git+https://github.com/acceleratedscience/openad-plugin-rxn

### Magic Commands
Magic commands let you interact with the OpenAD shell.
1. `%openad` - Display results directly in your notebook<br>
2. `%openadd` - Store the returned data in a variable

To learn more, check the [OpenAD intro to magic commands](https://github.com/acceleratedscience/openad-toolkit/blob/main/openad/notebooks/magic_commands.ipynb).

### About RXN
To learn about what this plugin does, and to list its available commands, run:

    rxn

In [None]:
%openad rxn ?

### Command Documentation

Every command has detailed documentation where you can find everything you need to know, including optional parameters and examples.

To see the documentation of a command, just run the beginning of the command followed by a question mark.

In [None]:
%openad rxn clear ?

## Forward Reactions

### Single Prediction

Predicting a reaction is as simple as passing a reaction SMILES, with the reactants separated by dots.

    rxn predict reaction '<smiles>.<smiles>'

In [None]:
# Hydrochloric acid + N-propylpropanamide + Water
%openad rxn predict reaction 'Cl.CCC(=O)NCCC.O' use cache
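If your reactants live in separate variables, the dot-separated input can be assembled in plain Python before it is passed to the command. The sketch below is only an illustration; `build_reaction_smiles` is a hypothetical helper and not part of the plugin.

```python
def build_reaction_smiles(reactants):
    """Join individual reactant SMILES into one dot-separated reaction input."""
    # Drop empty entries and surrounding whitespace before joining
    cleaned = [s.strip() for s in reactants if s and s.strip()]
    if not cleaned:
        raise ValueError("at least one reactant SMILES is required")
    return ".".join(cleaned)


# Hydrochloric acid, N-propylpropanamide and water, as in the cell above
reaction = build_reaction_smiles(["Cl", "CCC(=O)NCCC", "O"])
print(reaction)
```

The resulting string can then be passed to `rxn predict reaction` through variable substitution.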

### Batch Predictions

Batch predictions can be run from a list of reaction SMILES, or from a file or DataFrame that contains them.

#### From a List
    
    rxn predict reactions from list ['<smiles>.<smiles>',...]

In [27]:
# Bromine + anthracene / Bromine + 2-anthracen-1-ylethanol
%openadd rxn predict reactions from list ['BrBr.c1ccc2cc3ccccc3cc2c1', 'BrBr.c1ccc2cc3ccccc3cc2c1CCO'] use cache

| | input | input_0 | input_1 | output | reaction | from_cache | confidence | photochemical | thermal |
|---|---|---|---|---|---|---|---|---|---|
| 0 | [BrBr, c1ccc2cc3ccccc3cc2c1] | BrBr | c1ccc2cc3ccccc3cc2c1 | Brc1c2ccccc2cc2ccccc12 | BrBr.c1ccc2cc3ccccc3cc2c1>>Brc1c2ccccc2cc2ccccc12 | True | 0.979795 | False | False |
| 1 | [BrBr, c1ccc2cc3ccccc3cc2c1CCO] | BrBr | c1ccc2cc3ccccc3cc2c1CCO | BrCCc1cccc2c(Br)c3ccccc3cc12 | BrBr.OCCc1cccc2cc3ccccc3cc12>>BrCCc1cccc2c(Br)... | True | 0.649979 | False | False |
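The `reaction` column in the output above uses the `reactants>>products` notation. If you need the two sides separately, a small helper like the one below can split them. This is a minimal sketch that only handles the `>>` form shown here; `split_reaction` is a hypothetical name.

```python
def split_reaction(reaction):
    """Split a 'reactants>>products' string into two lists of SMILES."""
    reactants, separator, products = reaction.partition(">>")
    if not separator:
        raise ValueError("expected a reaction of the form 'reactants>>products'")
    # Components on each side are separated by dots
    return reactants.split("."), products.split(".")


reactants, products = split_reaction(
    "BrBr.c1ccc2cc3ccccc3cc2c1>>Brc1c2ccccc2cc2ccccc12"
)
print(reactants)
print(products)
```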


#### From a DataFrame

    rxn predict reactions from dataframe <dataframe_name>
    
Your DataFrame should have a "Reactions" column.

In [None]:
# Create a Pandas DataFrame with reaction SMILES
import pandas as pd
reactions = ['BrBr.c1ccc2cc3ccccc3cc2c1', 'BrBr.c1ccc2cc3ccccc3cc2c1CCO']
df = pd.DataFrame(reactions, columns=['Reactions'])

In [None]:
# Predict reactions
%openad rxn predict reactions from dataframe df use cache
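If your data holds one reactant per column rather than a ready-made reaction string, you can derive the "Reactions" column first. The sketch below assumes two made-up column names, `reactant_a` and `reactant_b`; adapt them to your own data.

```python
import pandas as pd

# Hypothetical starting point: one column per reactant (column names are made up)
pairs = pd.DataFrame({
    "reactant_a": ["BrBr", "BrBr"],
    "reactant_b": ["c1ccc2cc3ccccc3cc2c1", "c1ccc2cc3ccccc3cc2c1CCO"],
})

# Join the reactants with a dot to form one reaction SMILES per row
pairs["Reactions"] = pairs["reactant_a"] + "." + pairs["reactant_b"]

# Keep only the column the command looks for
df_ready = pairs[["Reactions"]]
print(df_ready)
```

A DataFrame shaped like `df_ready` matches the single "Reactions" column used in the example above.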

#### From a File

    rxn predict reactions from file '<filename.csv>'
    
When using a CSV file, it should contain a "Reactions" column, just like with a dataframe. Alternatively, you can simply use a text file with one reaction per line.

For the purpose of this demo, we'll store both a .txt and a .csv file with reactions in your workspace.

In [None]:
import pandas as pd

# Prep: locate your workspace and define the demo file paths
reactions = ['BrBr.c1ccc2cc3ccccc3cc2c1', 'BrBr.c1ccc2cc3ccccc3cc2c1CCO']
cmd_pointer = %openadd cmd_pointer
workspace_path = cmd_pointer.workspace_path()
csv_file_path = f'{workspace_path}/rxn_demo_reactions.csv'
text_file_path = f'{workspace_path}/rxn_demo_reactions.txt'

# Store reactions in a CSV file in your workspace
df = pd.DataFrame(reactions, columns=['Reactions'])
df.to_csv(csv_file_path)

# Store reactions in a text file in your workspace, one reaction per line
with open(text_file_path, "w") as file:
    for item in reactions:
        _ = file.write(f"{item}\n")

In [None]:
# Inspect the files we just created
%cat {csv_file_path}
%cat {text_file_path}

In [None]:
# Predict reactions from a CSV file
%openad rxn predict reactions from file 'rxn_demo_reactions.csv' use cache

In [None]:
# Predict reactions from a text file
%openad rxn predict reactions from file 'rxn_demo_reactions.txt' use cache
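To double-check that your own files follow one of the two layouts described above, you can read them back with the standard library. The loader below is a hypothetical sketch of those two layouts, not the plugin's actual parser.

```python
import csv
from pathlib import Path


def load_reactions(path):
    """Read reactions from a CSV with a "Reactions" column or a one-per-line text file."""
    path = Path(path)
    if path.suffix.lower() == ".csv":
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            if "Reactions" not in (reader.fieldnames or []):
                raise ValueError('CSV file needs a "Reactions" column')
            # Skip rows where the Reactions cell is empty
            return [row["Reactions"] for row in reader if row["Reactions"]]
    # Plain text: one reaction per line, blank lines skipped
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]
```

Calling `load_reactions(csv_file_path)` and `load_reactions(text_file_path)` on the demo files should give the same two reactions.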

### Calculating top results at once

Sometimes one result per reaction is not enough, and you want to see a list of the most likely outcomes, ranked by confidence.

To do this, you can simply pass the `topn` parameter and set it to however many outcomes you want to see.

This works for both single predictions and batch predictions.

In [None]:
%openad rxn predict reaction 'BrBr.c1ccc2cc3ccccc3cc2c1' using (topn=3) use cache

In [None]:
%openadd rxn predict reactions from list ['BrBr.c1ccc2cc3ccccc3cc2c1', 'BrBr.c1ccc2cc3ccccc3cc2c1CCO'] using (topn=5) use cache
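Conceptually, "ranked by confidence" means sorting the candidate outcomes in descending order of confidence and keeping the first `n`. The sketch below illustrates that idea on made-up values; it does not reproduce how the plugin computes its results, and `top_outcomes` is a hypothetical helper.

```python
def top_outcomes(predictions, n):
    """Return the n predictions with the highest confidence, best first."""
    ranked = sorted(predictions, key=lambda p: p["confidence"], reverse=True)
    return ranked[:n]


# Illustrative placeholder values only, not real RXN output
candidates = [
    {"output": "product_A", "confidence": 0.12},
    {"output": "product_B", "confidence": 0.85},
    {"output": "product_C", "confidence": 0.03},
]

best = top_outcomes(candidates, 2)
print([p["output"] for p in best])
```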

## Retrosynthesis

    rxn predict retro '<smiles>'

Finding the retrosynthesis route of a molecule is again as simple as providing its SMILES.

There are a number of options available, like `max_steps` or `exclude_substructures`.<br>
To see the full list of options you can consult the command's documentation.

In [None]:
%openad rxn predict retro ?

You'll notice below that we are using variable substitution to pass the molecule's SMILES to the command.
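Variable substitution works much like a Python f-string: the `{smiles}` placeholder is replaced with the value of the `smiles` variable before the command runs. The snippet below only builds the resulting command text to show what the shell receives; it does not execute anything.

```python
smiles = "CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(Cl)cc23)c(C#N)cc1F"

# Roughly what the command looks like once {smiles} has been filled in
command = f"rxn predict retro {smiles} using (max_steps=20)"
print(command)
```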

In [28]:
smiles = "CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(Cl)cc23)c(C#N)cc1F"
%openadd rxn predict retro {smiles} using (max_steps=20)


Unnamed: 0,reaction_path_index,result,confidence,compound [step -1],confidence [step -1],compound [step -2],confidence [step -2],compound [step -3],confidence [step -3],compound [step -4],confidence [step -4],compound [step -5],confidence [step -5],compound [step -6],confidence [step -6],compound [step -7],confidence [step -7],compound [step -8]
0,1,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,C1CCOC1,,,,,,,,,,,,,,
1,1,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.996,C1COCCO1,,,,,,,,,,,,
2,1,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.996,CCOC(=O)CC(Nc1nc(Cl)c(C#N)cc1F)C(C)(C)c1ccccn1,1.000,CCN(C(C)C)C(C)C,,,,,,,,,,
3,1,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.996,CCOC(=O)CC(Nc1nc(Cl)c(C#N)cc1F)C(C)(C)c1ccccn1,1.000,CCOC(=O)CC(N)C(C)(C)c1ccccn1,1.0,CC(=O)[O-].[NH4+],,,,,,,,
4,1,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.996,CCOC(=O)CC(Nc1nc(Cl)c(C#N)cc1F)C(C)(C)c1ccccn1,1.000,CCOC(=O)CC(N)C(C)(C)c1ccccn1,1.0,CCO,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
505,30,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.998,Cc1ccc(S(=O)(=O)n2cc(-c3nc(F)c(F)cc3C#N)c3cc(C...,0.937,N#Cc1cc(F)c(F)nc1Br,1.0,Fc1cc(I)c(Br)nc1F,1.0,O=N[O-].[Na+],,,,,,
506,30,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.998,Cc1ccc(S(=O)(=O)n2cc(-c3nc(F)c(F)cc3C#N)c3cc(C...,0.937,N#Cc1cc(F)c(F)nc1Br,1.0,Fc1cc(I)c(Br)nc1F,1.0,[I-].[K+],,,,,,
507,30,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...,0.998,Cc1ccc(S(=O)(=O)n2cc(-c3nc(F)c(F)cc3C#N)c3cc(C...,0.937,N#Cc1cc(F)c(F)nc1Br,1.0,[C-]#N.[C-]#N.[Cu+2],,,,,,,,
508,30,CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(C...,0.991,O,,,,,,,,,,,,,,
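Each row of the result fills a different number of `compound [step -k]` columns. To see how deep a given row goes, you can look for the highest step number that holds a compound. The helper below is a hypothetical sketch that works on plain dictionaries and treats empty strings and `None` as missing; with a pandas DataFrame you would test for missing values with `pd.notna` instead.

```python
import re

# Matches column names such as "compound [step -3]" and captures the step number
STEP_COLUMN = re.compile(r"^compound \[step -(\d+)\]$")


def route_depth(row):
    """Return the deepest step number that has a compound in this row."""
    depth = 0
    for column, value in row.items():
        match = STEP_COLUMN.match(column)
        if match and value not in (None, ""):
            depth = max(depth, int(match.group(1)))
    return depth


# Values copied from rows 0 and 3 of the table above
# (the step -1 value of row 3 is truncated in the display and kept as shown)
row_0 = {"compound [step -1]": "C1CCOC1", "compound [step -2]": ""}
row_3 = {
    "compound [step -1]": "CCOC(=O)CC(Nc1nc(-c2cn(S(=O)(=O)c3ccc(C)cc3)c3...",
    "compound [step -2]": "CCOC(=O)CC(Nc1nc(Cl)c(C#N)cc1F)C(C)(C)c1ccccn1",
    "compound [step -3]": "CCOC(=O)CC(N)C(C)(C)c1ccccn1",
    "compound [step -4]": "CC(=O)[O-].[NH4+]",
    "compound [step -5]": "",
}

print(route_depth(row_0), route_depth(row_3))
```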


## Interpreting Recipes

RXN can also interpret a free-text description of a chemical procedure and return it as a set of defined steps.

#### From a Paragraph

    rxn interpret recipe '<recipe>'

In [None]:
# Interpret recipe from a text paragraph
%openad rxn interpret recipe 'To a stirred solution of 7-(difluoromethylsulfonyl)-4-fluoro-indan-1-one \
(110 mg, 0.42 mmol) in methanol (4 mL) was added sodium borohydride (24 mg, 0.62 mmol). \
The reaction mixture was stirred at ambient temperature for 1 hour.'

#### From a File
    
    rxn interpret recipe '<recipe.txt>'

For the purpose of this demo, we'll store a `rxn_demo_recipe.txt` file in your workspace.

In [None]:
# Prep
cmd_pointer = %openadd cmd_pointer
workspace_path = cmd_pointer.workspace_path()
recipe_file_path = f'{workspace_path}/rxn_demo_recipe.txt'

recipe = 'To a stirred solution of 7-(difluoromethylsulfonyl)-4-fluoro-indan-1-one \
(110 mg, 0.42 mmol) in methanol (4 mL) was added sodium borohydride (24 mg, 0.62 mmol). \
The reaction mixture was stirred at ambient temperature for 1 hour.'

# Store recipe in a text file in your workspace
with open(recipe_file_path, "w") as file:
    _ = file.write(recipe)
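The recipe string in the prep cell relies on backslash line continuations, which join the three source lines into a single line of text. An equivalent way to write it, shown here as an alternative sketch, is to put adjacent string literals inside parentheses; Python concatenates them into the same single-line string.

```python
# Adjacent string literals inside parentheses are concatenated by Python
recipe = (
    "To a stirred solution of 7-(difluoromethylsulfonyl)-4-fluoro-indan-1-one "
    "(110 mg, 0.42 mmol) in methanol (4 mL) was added sodium borohydride (24 mg, 0.62 mmol). "
    "The reaction mixture was stirred at ambient temperature for 1 hour."
)

# The result contains no newline characters
print("\n" in recipe)
print(recipe)
```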

In [None]:
# Inspect the file we just created
%cat {recipe_file_path}

In [None]:
# Interpret recipe from a text file
%openad rxn interpret recipe 'rxn_demo_recipe.txt'

## Enriching your Molecules with RXN Results

After running an RXN query, you can add the results to the related molecules in your molecule working set.

    enrich molecules with analysis

In [None]:
# Clear any previously stored results
%openad clear analysis cache

# Empty your molecule working set
%openad clear mols

In [None]:
# Run retrosynthesis query (using %openadd to skip the printout)
smiles = "CC(C)(c1ccccn1)C(CC(=O)O)Nc1nc(-c2c[nH]c3ncc(Cl)cc23)c(C#N)cc1F"
_ = %openadd rxn predict retro '{smiles}' using (max_steps=5)

# Add the relevant molecule to your molecule working set (MWS)
%openad add molecule {smiles}

# Enrich the MWS with the RXN result
%openad enrich molecules with analysis

# Display the molecule to see the result (scroll down to analysis).
# From here you can export the molecule to a new file.
%openad show molecule {smiles}