### Run the cells below before starting the slide show.

In [None]:
# Import styles
import sys
sys.path.append('./styles')
from init_style import init
init()

In [None]:
#%openad clear sessions
%openad clear sessions
%openad remove toolkit ds4sd
%openad remove toolkit rxn
%openad add toolkit ds4sd
%openad add toolkit rxn
%openad list toolkits
%openad set llm bam
%openad set context rxn
%openad set context ds4sd

## Prerequisites

You must have jupyterlab-rise installed and enabled.

Use the slide show icon at the top of the notebook to start the slide show.

Run the cells above before starting the slide show.

<!-- Header banner -->
<div class="banner"><div>Slide Show Demonstrations</div><b>OpenAD <span>Tutorial</span></b></div>

# Demonstration: Source PFAS Molecules and Find Alternatives

### - Use IBM Deep Search to search for PFAS molecules
### - Use IBM's OpenAD open-source property generation to generate additional properties
### - Merge and collate molecule data with the OpenAD toolkit
### - Generate similar molecules with higher solubility using IBM's open-source Regression Transformer
### - With Deep Search, determine whether the generated molecules are mentioned in a patent, and proceed only with those that are not
### - Take one of the molecules and use the IBM RXN retrosynthesis commands to generate a path to synthesis


## Step 1: Use IBM Deep Search to search for PFAS molecules

In [None]:
%openad set context ds4sd
%openad search collection 'PubChem' for 'PFOA OR PFOS OR PFHxS OR PFNA OR HFPO-DA'


## Step 2: Use IBM's OpenAD open-source property generation to generate additional properties

Load the molecules into an OpenAD molecule set and initialize the list of additional properties to generate.

In [None]:
# Load the data from the dataframe Style object into the molecule set
df_data = %openadd result as dataframe

%openad load molecules using dataframe df_data

# Define the list of additional properties to be inferred
properties = ['is_scaffold', 'bertz', 'tpsa', 'logp', 'qed', 'plogp', 'penalized_logp', 'lipinski', 'sas', 'esol']

Generate the additional properties and merge them into the molecule set.

In [None]:
# Generate SMILES properties and merge with molecules
%openad prop get molecule property {properties} for  @mols merge with mols

### Let's examine the available molecules

In [None]:
%openad show molecules

### Drilling in on the details of a molecule

In [None]:
%openad show molecule 'Perfluorononanoic acid'

##  Step 3: Generate Similar Molecules with IBM's open-source Regression Transformer

In [None]:
datasets = []
mol_list = %openadd export molecules
for row in mol_list.to_dict("records"):
    MY_SMILES= row['canonical_smiles']
    esol = float(row['esol']) + 2  # target solubility: 2 points higher than the current value
    MY_PARAMS = { "fraction_to_mask": 0.1 , "property_goal": { "<esol>": esol} }
    display("Generating molecules for "+MY_SMILES+" with solubility: "+str(row['esol']) )
    result = %openadd gen generate with RegressionTransformerMolecules data for $MY_SMILES sample 10 \
    using(algorithm_version=solubility  search=sample temperature=1.5 tolerance=60.0 sampling_wrapper = "$MY_PARAMS" )
    display(result)
    datasets.append(result)
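
The loop above derives a target solubility for each molecule and passes it to the Regression Transformer through the `sampling_wrapper` parameter. As a minimal sketch of that step in plain Python, the helper below builds the same dictionary from one record. The function name `build_sampling_params`, its keyword arguments, and the example record (including its `esol` value) are hypothetical and only illustrate the shape of the data.

```python
def build_sampling_params(row, esol_shift=2.0, fraction_to_mask=0.1):
    """Build the sampling_wrapper dictionary for one molecule record.

    `row` is assumed to be a dict with an 'esol' entry, like the records
    produced by `mol_list.to_dict("records")` in the cell above.
    """
    # Raise the solubility goal by `esol_shift` relative to the current value
    target_esol = float(row["esol"]) + esol_shift
    return {
        "fraction_to_mask": fraction_to_mask,
        "property_goal": {"<esol>": target_esol},
    }


# Hypothetical record standing in for one row of the molecule set
example_row = {"canonical_smiles": "CCO", "esol": "-1.5"}
params = build_sampling_params(example_row)
print(params)
```

Keeping the parameter construction in a small function makes it easy to try a different solubility shift or masking fraction without editing the generation loop.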

## Step 4: With IBM Deep Search determine if generated molecules are mentioned in a patent

In [None]:
# DataFrame is needed for the isinstance check below
from pandas import DataFrame

x = 0
patent_count=0
patents_to_search=[]
patented_molecules=[]
non_patented_molecules=[]
searched_list=[]

# For every generated molecule, search for patents that mention it
for result in datasets:  
    for mol in result['0'].to_list():
        # remove duplicates
        if mol in searched_list:
            continue
        else:
            searched_list.append(mol)
        # Execute Patent Search  
       
        x = %openadd search for patents containing molecule '{mol}'
     
        # If patents were found, record them and mark the molecule as patented
        if isinstance(x,DataFrame):
            patents_to_search.extend(x["PATENT ID"].to_list())
            patented_molecules.append(mol)
            print(f'patents for molecule {mol}:\n  {x["PATENT ID"].to_list()}')
        else:
            non_patented_molecules.append(mol)
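
The cell above skips molecules that have already been searched and then sorts the rest into patented and non-patented lists. The sketch below shows the same logic in isolation, assuming a hypothetical `find_patents` callable in place of the Deep Search command; the function name `partition_by_patents`, the stub lookup, and the patent ID in it are made up for illustration.

```python
def partition_by_patents(molecules, find_patents):
    """Split SMILES strings into (patented, non_patented), skipping duplicates.

    `find_patents` is a hypothetical callable that returns a list of patent
    IDs for a SMILES string, or None / an empty list when nothing is found.
    """
    seen = set()  # a set gives constant-time duplicate checks
    patented, non_patented = [], []
    for smiles in molecules:
        if smiles in seen:
            continue
        seen.add(smiles)
        if find_patents(smiles):
            patented.append(smiles)
        else:
            non_patented.append(smiles)
    return patented, non_patented


# Stub lookup with a made-up patent ID, standing in for the real search
fake_index = {"CCO": ["EXAMPLE-PATENT-1"]}
patented, non_patented = partition_by_patents(
    ["CCO", "CCN", "CCO"], lambda smiles: fake_index.get(smiles)
)
print(patented, non_patented)
```

Using a set for the already-searched molecules avoids the linear scan that a list membership test performs, which matters once many molecules have been generated.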


## Step 5: Add the non-patented molecules to our list and generate properties for them

In [None]:
# Generate the properties for all of the new molecules
properties_all = ['molecular_weight', 'number_of_aromatic_rings', 'number_of_h_acceptors', 'number_of_atoms','number_of_rings', 'number_of_rotatable_bonds', 'number_of_large_rings', 'number_of_heterocycles', 'number_of_stereocenters','is_scaffold', 'bertz', 'tpsa', 'logp', 'qed', 'plogp', 'penalized_logp', 'lipinski', 'sas', 'esol']
new_props = %openadd prop get molecule property {properties_all} for {non_patented_molecules} merge with mols

# Let's merge the new molecules into our molecule working set

%openad enrich molecules with analysis
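
The `properties_all` list above is the Step 2 property list with a set of structural descriptors placed in front of it. As a sketch, the combined list can be derived from the two parts instead of being typed out again. The variable names `base_properties` and `descriptor_properties` are hypothetical; the property names are copied from the cells above.

```python
# Properties generated in Step 2
base_properties = ['is_scaffold', 'bertz', 'tpsa', 'logp', 'qed', 'plogp',
                   'penalized_logp', 'lipinski', 'sas', 'esol']

# Structural descriptors added for the newly generated molecules
descriptor_properties = ['molecular_weight', 'number_of_aromatic_rings',
                         'number_of_h_acceptors', 'number_of_atoms',
                         'number_of_rings', 'number_of_rotatable_bonds',
                         'number_of_large_rings', 'number_of_heterocycles',
                         'number_of_stereocenters']

# dict.fromkeys drops any duplicates while preserving the original order
properties_all = list(dict.fromkeys(descriptor_properties + base_properties))
print(len(properties_all), properties_all)
```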

In [None]:
%openad show molecules

## Step 6: Let's examine one of the non-patented molecules and generate retrosynthesis paths for it

Use the interactive help to find out how to create the molecule with the IBM RXN Predict Retrosynthesis capability.

In [None]:
%openad tell me about the command predict retrosynthesis providing syntax and list all available parameters

### Run IBM RXN Retrosynthesis 

In [None]:
# Make the RXN toolkit the active context
%openad set context rxn

# Select the last molecule in the list (the command below uses a fixed SMILES string instead)
molecule = non_patented_molecules[-1]

%openad predict retrosynthesis  'N=S(=O)(O)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)F' using (max_steps=6 ai_model = '12class-tokens-2021-05-14' )
%openad enrich molecules with analysis
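
The cell above stores the last non-patented molecule in `molecule` but passes a fixed SMILES string to the command. The sketch below builds the same command text from a variable in plain Python, so the selected molecule and the command stay in sync. It only produces a string; whether it can then be run through the magic command with brace substitution, as Step 4 does with `'{mol}'`, is an assumption. The `command` variable is hypothetical, and the SMILES value stands in for `non_patented_molecules[-1]`.

```python
# Stand-in for `non_patented_molecules[-1]`; the SMILES is the one used above
molecule = 'N=S(=O)(O)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)F'

# Options copied from the retrosynthesis command in the cell above
max_steps = 6
ai_model = '12class-tokens-2021-05-14'

# Assemble the command text with the selected molecule in place of the literal
command = (
    f"predict retrosynthesis '{molecule}' "
    f"using (max_steps={max_steps} ai_model = '{ai_model}' )"
)
print(command)
```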

### Now let's take a look at what we know about the molecule

In [None]:
%openad show molecule 'N=S(=O)(O)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)F'