# Using the MODAC Client

In this tutorial, we use the [MODAC API](https://modac.cancer.gov/swagger-ui/4.14.0/index.html#/) to download and run an existing model from https://modac.cancer.gov/. This is useful when you are running AMPL on a remote machine and cannot easily upload existing models.

## Configuration

In PrecisionFDA, you can declare `MODAC_USER` and `MODAC_PASS` in your `.env` file. See [PFDA.md](https://github.com/mass-matrix/AMPL/blob/master/PFDA.md#L44) for more information about the `.env` file. Alternatively, you can set both variables directly in your environment, as shown below:

In [1]:
import os

os.environ['MODAC_USER'] = 'my-username@example.com'
os.environ['MODAC_PASS'] = 'my-password'
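
If you would rather not hard-code a password in a notebook, one option is to check which variables are already set and prompt only for the missing ones. The following is a minimal sketch; `missing_env_vars` and `prompt_for_missing` are hypothetical helpers written for this tutorial, not part of AMPL or the MODAC client.

```python
import getpass
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def prompt_for_missing(names, env=None):
    """Prompt (without echoing input) for each missing variable and store it in env."""
    env = os.environ if env is None else env
    for name in missing_env_vars(names, env):
        env[name] = getpass.getpass(f"{name}: ")


# Example (interactive): prompt_for_missing(["MODAC_USER", "MODAC_PASS"])
```

Values already provided through a `.env` file or the shell are left untouched, so the same cell works in both setups.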

## Downloading Existing Models

As an example, we use the [QMugs HOMO-LUMO Prediction Model](https://modac.cancer.gov/assetDetails?dme_data_id=NCI-DME-MS01-63856556), but you can choose any other model from the site. Be sure to copy the model's `ASSET PATH`. For this example, it is `/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model`.

In [2]:
from atomsci.modac import MoDaCClient

client = MoDaCClient()

client.download_all_files_in_collection("/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model")

Calling function 'download_all_files_in_collection' with args: ('/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model',) kwargs: {}
Calling function 'get_collection' with args: ('/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model',) kwargs: {}
Get collection. Making requests to https://modac.cancer.gov/api/collection//NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model
Function 'get_collection' returned: {'collectionId': 63856556, 'collectionName': '/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model', 'absolutePath': '/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model', 'collectionParentName': '/NCI_DOE_Archive/ATOM/QMugs', 'collectionOwnerName': 'ncidoesvcp2', 'collectionOwnerZone': 'ncifprodZone', 'collectionMapId': '0', 'collectionInheritance': '1', 'createdAt': 1684528422000, 'specColType': 'NORMAL', 'subCollections': [{'id': 63856585, 'path': '/NCI_DOE_Archive/ATOM/QMugs/QMugs_HOMO-LUMO_Prediction_Model/Documentation', 'dataSize': 0}],

After the download completes, the files are saved in a folder that appears to be named after the model's `ASSET NAME`.

Here we list the files in the directory `QMugs_HOMO-LUMO_Prediction_Model/`:

In [3]:
import os

project_dir = os.path.join(os.getcwd(), 'QMugs_HOMO-LUMO_Prediction_Model')
os.listdir(project_dir)

['QMugs_curatedDFT_model_86052011-b22c-419c-b033-4c67586d319c.tar.gz',
 'QMugs_curatedDFT.csv']
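
The file names above include a model UUID, so they differ from one model to the next. As a rough sketch, the files can be located by extension instead of by exact name. `find_model_files` is a hypothetical helper, and it assumes the download folder holds the model as a `.tar.gz` archive and the data as `.csv`, as in this example.

```python
from pathlib import Path


def find_model_files(project_dir):
    """Return (model_tarballs, csv_files) found directly inside project_dir, sorted by name."""
    root = Path(project_dir)
    tarballs = sorted(str(p) for p in root.glob("*.tar.gz"))
    csv_files = sorted(str(p) for p in root.glob("*.csv"))
    return tarballs, csv_files


# Example: tarballs, csv_files = find_model_files("QMugs_HOMO-LUMO_Prediction_Model")
```

Both lists may be empty or contain more than one entry, so check their lengths before indexing into them.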

## Import the dataset

In [4]:
import pandas as pd

test_datafile = os.path.join(project_dir, 'QMugs_curatedDFT.csv')
test_data = pd.read_csv(test_datafile)
test_data.head()

Unnamed: 0,chembl_id,smiles,GFN2_HOMO_LUMO_GAP,DFT_HOMO_LUMO_GAP,rdkit_smiles,inchi_key,compound_id,VALUE_NUM_mean,VALUE_NUM_std,Perc_Var,Remove_BadDuplicate
0,CHEMBL1,[H]c1c([H])c2c(c([H])c1OC([H])([H])[H])OC([H])...,0.044924,0.24293,COc1ccc2c(c1)OC[C@H]1[C@@H]2C2=C(OC1(C)C)C1=C(...,GHBOEFUAGSHXPO-XZOTUCIWSA-N,GHBOEFUAGSHXPO-XZOTUCIWSA-N,0.242102,0.000833,0.341925,0
1,CHEMBL1000,[H]OC(=O)C([H])([H])OC([H])([H])C([H])([H])N1C...,0.11517,0.340824,O=C(O)COCCN1CCN([C@@H](c2ccccc2)c2ccc(Cl)cc2)CC1,ZKLPARSLTMPFCP-NRFANRHFSA-N,ZKLPARSLTMPFCP-NRFANRHFSA-N,0.333456,0.006397,2.209649,0
2,CHEMBL10000,[H]c1c([H])c(N([H])c2nc3c([H])c([H])c([H])c([H...,0.106596,0.293095,O=c1oc(Nc2ccc(I)cc2)nc2ccccc12,KXLZEFPIBPQEAU-UHFFFAOYSA-N,KXLZEFPIBPQEAU-UHFFFAOYSA-N,0.289124,0.003439,1.373521,0
3,CHEMBL100003,[H]O/C(OC([H])([H])[H])=C1\C(C([H])([H])[H])=N...,0.080096,0.27541,CCC[C@@H]1C(C(=O)OCC)=C(C)N=C(C)/C1=C(\O)OC,FCFUFMMLEUYHMD-CDZMIXDFSA-N,FCFUFMMLEUYHMD-CDZMIXDFSA-N,0.282891,0.007237,2.644696,0
4,CHEMBL100004,[H]O/C(OC([H])([H])C([H])([H])[H])=C1\C(C([H])...,0.069851,0.272307,CCO/C(O)=C1/C(C)=NC(C)=C(C(=O)OCCSc2ccccc2)[C@...,CUHAMGYOMBKPLA-BQOWYSNXSA-N,CUHAMGYOMBKPLA-BQOWYSNXSA-N,0.272592,0.003115,0.104604,0
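
Before running predictions, it can help to confirm that the dataset contains the columns you plan to pass to the model. The sketch below uses a hypothetical helper, `missing_columns`, that works on any iterable of column names, so it can be called with `test_data.columns`.

```python
def missing_columns(columns, required):
    """Return the required column names that are absent from columns, in the order given."""
    present = set(columns)
    return [name for name in required if name not in present]


# A subset of the column names shown in the head() output above
header = ["chembl_id", "smiles", "rdkit_smiles", "compound_id", "VALUE_NUM_mean"]
print(missing_columns(header, ["rdkit_smiles", "compound_id", "VALUE_NUM_mean"]))  # prints []
```

An empty list means every required column is present; otherwise the result names the columns to fix.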


## Run the model for predictions

In [5]:
import logging
import os
from atomsci.ddm.pipeline import predict_from_model as pfm

logger = logging.getLogger()
logger.setLevel(logging.ERROR)

model_file = os.path.join(project_dir, 'QMugs_curatedDFT_model_86052011-b22c-419c-b033-4c67586d319c.tar.gz')
input_df = test_data  # Make sure this matches your test dataset
response_col = "VALUE_NUM_mean"
id_col = "compound_id"
smiles_col = "rdkit_smiles"
# Pass the column names explicitly so they match the dataset loaded above
results_df = pfm.predict_from_model_file(model_path=model_file,
                                         input_df=input_df,
                                         id_col=id_col,
                                         smiles_col=smiles_col,
                                         response_col=response_col)
results_df.head()

Standardizing SMILES strings for 346780 compounds.


INFO:atomsci.ddm.utils.model_version_utils:/home/herman/massmatrix/AMPL/atomsci/ddm/examples/tutorials/QMugs_HOMO-LUMO_Prediction_Model/QMugs_curatedDFT_model_86052011-b22c-419c-b033-4c67586d319c.tar.gz, 1.4.1
INFO:atomsci.ddm.utils.model_version_utils:Version compatible check: /home/herman/massmatrix/AMPL/atomsci/ddm/examples/tutorials/QMugs_HOMO-LUMO_Prediction_Model/QMugs_curatedDFT_model_86052011-b22c-419c-b033-4c67586d319c.tar.gz version = "1.4", AMPL version = "1.6"


ValueError: Version compatible check: /home/herman/massmatrix/AMPL/atomsci/ddm/examples/tutorials/QMugs_HOMO-LUMO_Prediction_Model/QMugs_curatedDFT_model_86052011-b22c-419c-b033-4c67586d319c.tar.gz version: "1.4" not matching AMPL compatible version group: "1.6"
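
The log above reduces the model version `1.4.1` to `"1.4"` before comparing it with the AMPL version `"1.6"`. The sketch below mimics only that reduction with two hypothetical helpers. AMPL's own check works on groups of compatible versions, so a plain `major.minor` comparison is a conservative approximation and not the library's actual rule.

```python
def major_minor(version):
    """Reduce a version string such as '1.4.1' to its 'major.minor' prefix."""
    parts = version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"Unrecognized version string: {version!r}")
    return ".".join(parts[:2])


def same_major_minor(model_version, ampl_version):
    """Return True when both versions share the same major.minor prefix."""
    return major_minor(model_version) == major_minor(ampl_version)


print(same_major_minor("1.4.1", "1.6"))  # prints False
```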

## Notes
- The MODAC API can be unreliable at times. If a download fails, try again.
- You may encounter a version compatibility error when the model was saved with a version of AMPL that is not compatible with the one you are running. In that case, run a version of AMPL that matches the model. For example:
  ```
  ValueError: Version compatible check: <model>.tar.gz version: "1.4" not matching AMPL compatible version group: "1.6"
  ```
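
Because the API can fail intermittently, one way to make a download more robust is to retry it with an increasing delay. The following is a minimal sketch of a generic retry wrapper. `retry` is a hypothetical helper, and since this tutorial does not document which exceptions the client raises, it catches `Exception` by default.

```python
import time


def retry(func, attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Example, assuming `client` and the asset path from the download step:
# retry(lambda: client.download_all_files_in_collection(asset_path))
```

Narrow `exceptions` to network-related errors once you know what the client raises, so that programming errors are not retried.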