<a href="https://colab.research.google.com/github/ravichas/AMPL-Tutorial/blob/master/01_make_AMPL_Google_Drive.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Make AMPL on Google Drive

# Warning: This is an experimental notebook
- AMPL uses Python 3.6.7, while Colab currently uses Python 3.6.9. The patch versions do not match, so your mileage may vary.
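
To make the mismatch concrete, here is a minimal sketch that compares two version strings at the major.minor level. The helper name `same_minor_version` is hypothetical and not part of AMPL or Colab; the two version strings are the ones reported in this notebook.

```python
def same_minor_version(a: str, b: str) -> bool:
    """Return True when two version strings share major and minor numbers."""
    return a.split(".")[:2] == b.split(".")[:2]


ampl_python = "3.6.7"   # reported by /content/AMPL/bin/python -V in this notebook
colab_python = "3.6.9"  # reported by !python --version in this notebook

# Only the patch number differs, so the major.minor comparison passes.
print(same_minor_version(ampl_python, colab_python))
```

A check like this only shows that both interpreters are in the 3.6 series; it does not guarantee that every compiled package behaves identically.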

# Goals
- Create a reusable installation of AMPL in Colab and save it to the user's personal Google Drive

# Requirements
- Datasets are required for testing AMPL. `delaney-processed_curated_fit.csv` and `delaney-processed_curated_external.csv` are downloaded to this runtime from the AMPL-Tutorial GitHub repository.

## Authenticate and mount your Google Drive

When you run the following cell, you will be asked to do the following things:

1. An empty input box and a link will appear
2. Click the link
3. Authenticate your Google account
4. Copy the authorization code and paste it in the input box that appeared in step 1

In [1]:
# Mount Google Drive
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


In [2]:
import io

import pandas as pd
import requests

url = 'https://raw.githubusercontent.com/ravichas/AMPL-Tutorial/master/datasets/delaney-processed_curated_external.csv'
url1 = 'https://raw.githubusercontent.com/ravichas/AMPL-Tutorial/master/datasets/delaney-processed_curated_fit.csv'
download = requests.get(url).content
download1 = requests.get(url1).content

# Read the downloaded content into pandas DataFrames
df = pd.read_csv(io.StringIO(download.decode('utf-8')))
df1 = pd.read_csv(io.StringIO(download1.decode('utf-8')))

# Write each DataFrame to its own file (df is the external set, df1 is the fit set)
df.to_csv('delaney-processed_curated_external.csv', index=False)
df1.to_csv('delaney-processed_curated_fit.csv', index=False)

## Check whether the input files are present

In [3]:
import os
assert(os.path.isfile('/content/delaney-processed_curated_fit.csv'))
assert(os.path.isfile('/content/delaney-processed_curated_external.csv'))

## Get the Python version

In [4]:
!python --version

Python 3.6.9


## Install Miniconda to /content/AMPL
Conda installations contain hard-coded paths, so `/content/AMPL` is the only location this installation can use (takes < 30 seconds).

**TODO: Handle the case where the directory already exists.**
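
One possible way to handle an existing directory is sketched below. The helper `prepare_prefix` and its `overwrite` parameter are hypothetical names introduced for illustration, not part of this notebook or of conda. The idea is to skip the installer when the prefix already exists, unless the caller explicitly asks to replace it.

```python
import shutil
from pathlib import Path


def prepare_prefix(prefix: str, overwrite: bool = False) -> bool:
    """Return True if the installer should run, False if the prefix is kept."""
    path = Path(prefix)
    if not path.exists():
        return True   # nothing there yet, safe to install
    if not overwrite:
        return False  # keep the existing installation untouched
    # Caller asked for a fresh install: remove whatever occupies the prefix.
    if path.is_dir():
        shutil.rmtree(path)
    else:
        path.unlink()
    return True
```

In this notebook, the Miniconda cell would then run only when `prepare_prefix("/content/AMPL")` returns `True`. Whether to overwrite or reuse an existing directory is a design choice left open here.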

In [5]:
!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!time bash Miniconda3-latest-Linux-x86_64.sh -b -p /content/AMPL

--2020-09-23 13:17:15--  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8303, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 93052469 (89M) [application/x-sh]
Saving to: ‘Miniconda3-latest-Linux-x86_64.sh’


2020-09-23 13:17:15 (196 MB/s) - ‘Miniconda3-latest-Linux-x86_64.sh’ saved [93052469/93052469]

PREFIX=/content/AMPL
Unpacking payload ...
Collecting package metadata (current_repodata.json): - \ | done
Solving environment: - \ done

## Package Plan ##

  environment location: /content/AMPL

  added / updated specs:
    - _libgcc_mutex==0.1=main
    - ca-certificates==2020.1.1=0
    - certifi==2020.4.5.1=py38_0
    - cffi==1.14.0=py38he30daa8_1
    - chardet==3.0.4=py38_1003
    - conda-package-handling==1.6.1=py38h7b6447c_0
    - conda==4.8.3=py38_0
    - crypt

In [6]:
!ls 

AMPL					drive
delaney-processed_curated_external.csv	Miniconda3-latest-Linux-x86_64.sh
delaney-processed_curated_fit.csv	sample_data


## Download the AMPL dependency list and save it to AMPL.txt

In [7]:
url='https://raw.githubusercontent.com/ravichas/AMPL-Tutorial/master/datasets/AMPL.txt'

downloaded_obj = requests.get(url)
with open("AMPL.txt", "wb") as file:
    file.write(downloaded_obj.content)

## Install code dependencies using the AMPL.txt file downloaded in the previous step
* ~ 3 minutes

```
real	1m53.391s
user	1m12.443s
sys	0m9.643s
```

In [8]:
!time /content/AMPL/bin/conda install --file AMPL.txt -y


Downloading and Extracting Packages
_libgcc_mutex-0.1    | : 100% 1.0/1 [00:00<00:00,  5.39it/s]
ca-certificates-2020 | : 100% 1.0/1 [00:00<00:00, 14.27it/s]
fftw3f-3.3.4         | : 100% 1.0/1 [00:00<00:00,  1.25it/s]
libgfortran-3.0.0    | : 100% 1.0/1 [00:00<00:00, 11.30it/s]
libgfortran-ng-7.5.0 | : 100% 1.0/1 [00:00<00:00,  3.21it/s]
libstdcxx-ng-9.3.0   | : 100% 1.0/1 [00:00<00:00,  1.27it/s]
pandoc-2.10.1        | : 100% 1.0/1 [00:05<00:00,  5.73s/it]               
libgomp-9.3.0        | : 100% 1.0/1 [00:00<00:00,  8.50it/s]
openblas-0.2.20      | : 100% 1.0/1 [00:04<00:00,  4.45s/it]              
_openmp_mutex-4.5    | : 100% 1.0/1 [00:00<00:00, 25.04it/s]
blas-1.1             | : 100% 1.0/1 [00:00<00:00, 33.75it/s]
libgcc-ng-9.3.0      | : 100% 1.0/1 [00:01<00:00,  1.50s/it]
blosc-1.20.0         | : 100% 1.0/1 [00:00<00:00,  5.40it/s]
bzip2-1.0.8          | : 100% 1.0/1 [00:00<00:00,  7.58it/s]
c-ares-1.16.1        | : 100% 1.0/1 [00:00<00:00, 18.05it/s]
expat-2.2.9        

## Get the AMPL-related Python, pip, and conda versions

* Note the local drive paths

In [9]:
%%bash
/content/AMPL/bin/python -V
/content/AMPL/bin/pip -V
/content/AMPL/bin/conda -V

Python 3.6.7
pip 20.2.2 from /content/AMPL/lib/python3.6/site-packages/pip (python 3.6)
conda 4.8.4


## Clone AMPL source and apply patches (if any)

In [10]:
%%bash
mkdir github
cd github
git clone https://github.com/ATOMconsortium/AMPL.git

Cloning into 'AMPL'...


In [11]:
%%bash
# There is a problem with the UMAP package, so remove the umap import
cat << "EOF" > transformations_py.patch
--- transformations.py  2020-09-14 17:08:22.225747322 -0700
+++ transformations_patched.py  2020-09-14 17:08:07.869651225 -0700
@@ -9,7 +9,7 @@

 import numpy as np
 import pandas as pd
-import umap
+# import umap

 import deepchem as dc
 from deepchem.trans.transformers import Transformer, NormalizationTransformer
EOF

patch -N /content/github/AMPL/atomsci/ddm/pipeline/transformations.py transformations_py.patch

patching file /content/github/AMPL/atomsci/ddm/pipeline/transformations.py


In [12]:
%%bash
# There is a problem with dependency checking on import after install
cat << "EOF" > __init___py.patch
--- /content/AMPL/atomsci/ddm/__init__.py.backup	2020-09-19 18:10:05.264013977 +0000
+++ /content/AMPL/atomsci/ddm/__init__.py	2020-09-19 18:15:37.338771924 +0000
@@ -1,6 +1,6 @@
 import pkg_resources
 try:
     __version__ = pkg_resources.require("atomsci-ampl")[0].version
-except TypeError:
+except:
     pass
EOF

patch -N /content/github/AMPL/atomsci/ddm/__init__.py __init___py.patch

patching file /content/github/AMPL/atomsci/ddm/__init__.py


## Build and install AMPL

In [13]:
%%bash
# Move conda python to beginning of PATH in this cell
PATH=/content/AMPL/bin:$PATH
# Clear PYTHONPATH
PYTHONPATH=

cd /content/github/AMPL
time ./build.sh
time ./install.sh system

running build
running build_py
creating /content/github/AMPL.build/ampl/lib
creating /content/github/AMPL.build/ampl/lib/atomsci
copying atomsci/__init__.py -> /content/github/AMPL.build/ampl/lib/atomsci
creating /content/github/AMPL.build/ampl/lib/atomsci/ddm
copying atomsci/ddm/__init__.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm
creating /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/splitting.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/model_pipeline.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/featurization.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/model_tracker.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/model_datasets.py -> /content/github/AMPL.build/ampl/lib/atomsci/ddm/pipeline
copying atomsci/ddm/pipeline/dist_metrics.py -> /content/githu

Skipping installation of /content/github/AMPL.build/ampl/bdist.linux-x86_64/wheel/atomsci/__init__.py (namespace package)

real	0m0.546s
user	0m0.460s
sys	0m0.085s

real	0m0.866s
user	0m0.669s
sys	0m0.129s


In [14]:
# Remove conda package downloads to decrease package size
# 1 min
!time /content/AMPL/bin/conda clean -a -y

Cache location: /content/AMPL/pkgs
Will remove the following tarballs:

/content/AMPL/pkgs
------------------
_libgcc_mutex-0.1-conda_forge.tar.bz2          3 KB
ipywidgets-7.5.1-py_0.tar.bz2                101 KB
simplejson-3.17.2-py36h8c4c3a4_0.tar.bz2     102 KB
gettext-0.19.8.1-hc5be6a0_1002.tar.bz2       3.6 MB
numba-0.50.1-py36h830a2c2_1.tar.bz2          3.5 MB
setuptools-49.6.0-py36h9f0ad1d_0.tar.bz2     935 KB
python-3.8.3-hcff3b4d_0.conda               49.1 MB
ipython-7.16.1-py36h95af2a2_0.tar.bz2        1.1 MB
bzip2-1.0.8-h516909a_3.tar.bz2               398 KB
numexpr-2.7.1-py36h830a2c2_1.tar.bz2         197 KB
ipython_genutils-0.2.0-py_1.tar.bz2           21 KB
jsonschema-3.2.0-py36h9f0ad1d_1.tar.bz2       89 KB
rdkit-2017.09.1-py36_1.tar.bz2              19.7 MB
certifi-2020.6.20-py36h9f0ad1d_0.tar.bz2     151 KB
jupyter-1.0.0-py_2.tar.bz2                     4 KB
sqlite-3.28.0-h8b20d00_0.tar.bz2             1.9 MB
libstdcxx-ng-9.1.0-hdf63c60_0.conda          3.1 MB
libuui
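
To see how much the cleanup reduces the installation, the directory size can be measured before and after `conda clean`. The following is a minimal sketch using only the standard library; the helper name `directory_size_mb` is hypothetical.

```python
import os


def directory_size_mb(root: str) -> float:
    """Sum the sizes of all regular files under root, in MiB."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total / (1024 * 1024)
```

Calling `directory_size_mb("/content/AMPL")` in one cell before the cleanup and again afterwards gives a rough estimate of the space saved.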

## Compress and store the AMPL installation in Google Drive (~ 600 MB, takes < 6 minutes)

This step saves the installation so that it can be retrieved for later use.

In [15]:
!time tar -cjvf AMPL.tar.bz2 AMPL

Streaming output truncated to the last 5000 lines.
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/po/cs.po
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/autoclean.sh
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/m4/
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/m4/Makefile.am
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/hello.tcl
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-tcl/Makefile.am
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-java-qtjambi/
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-java-qtjambi/INSTALL
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-java-qtjambi/BUGS
AMPL/pkgs/gettext-0.19.8.1-hc5be6a0_1002/share/doc/gettext/examples/hello-java-qtjambi/configure.ac
AMPL/pkgs/gettext-0.19.8.1-hc

In [20]:
# Copy to Google Drive (create the target folder first if it does not exist)
# 1 min
!mkdir -p '/content/drive/My Drive/colab'
!time cp AMPL.tar.bz2 '/content/drive/My Drive/colab'


real	0m3.124s
user	0m0.012s
sys	0m1.064s


In [24]:
!ls '/content/drive/My Drive/colab'

AMPL.tar.bz2


# Note
- Check https://drive.google.com/ for AMPL.tar.bz2. It may take a few minutes to synchronize.
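
In a later Colab session, once Google Drive is mounted, the saved archive could be restored roughly as follows. This is a sketch, not part of the original notebook: the helper `restore_archive` is a hypothetical name, and it assumes the archive was created as shown above, with `AMPL` as its top-level directory.

```python
import tarfile
from pathlib import Path


def restore_archive(archive: str, dest: str) -> Path:
    """Extract a bzip2-compressed tar archive into dest and return dest/AMPL."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        tar.extractall(dest_path)
    # Assumes the archive's top-level directory is named AMPL.
    return dest_path / "AMPL"
```

For this notebook the call would be `restore_archive('/content/drive/My Drive/colab/AMPL.tar.bz2', '/content')`. Because the installation contains hard-coded paths, the archive has to be extracted into `/content` so that it ends up at `/content/AMPL` again.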