# Environment Sanity Check #

Click the _Runtime_ dropdown at the top of the page, then _Change Runtime Type_ and confirm the instance type is _GPU_.

Check the output of `!nvidia-smi` to make sure you've been allocated a Tesla T4, P4, or P100.

In [1]:
!nvidia-smi

Tue Jul 26 18:58:27 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   62C    P8    11W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
!pip install pynvml 
import pynvml

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pynvml
  Downloading pynvml-11.4.1-py3-none-any.whl (46 kB)
     |████████████████████████████████| 46 kB 5.2 MB/s 
Installing collected packages: pynvml
Successfully installed pynvml-11.4.1
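
With `pynvml` installed, the GPU check can also be done programmatically instead of reading the `nvidia-smi` table by eye. The following is a minimal sketch and not part of the RAPIDS tooling: `is_supported_gpu` and `current_gpu_name` are hypothetical helper names, and the accepted-model list is taken only from the T4/P4/P100 guidance above.

```python
import re

# Models named in the guidance above, matched as whole tokens
# so that e.g. "P40" is not mistaken for "P4".
SUPPORTED_PATTERN = re.compile(r"\b(T4|P4|P100)\b")


def is_supported_gpu(name):
    """Return True if the GPU name mentions a T4, P4, or P100."""
    return SUPPORTED_PATTERN.search(name) is not None


def current_gpu_name(index=0):
    """Return the name of GPU `index` via NVML, or None if NVML is unavailable."""
    try:
        import pynvml
        pynvml.nvmlInit()
    except Exception:
        return None  # no driver, no GPU, or pynvml not installed
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        name = pynvml.nvmlDeviceGetName(handle)
    except Exception:
        return None
    finally:
        pynvml.nvmlShutdown()
    # Older pynvml releases return bytes, newer ones return str.
    return name.decode() if isinstance(name, bytes) else name


name = current_gpu_name()
if name is None:
    print("No NVIDIA GPU visible; check the runtime type.")
else:
    verdict = "supported" if is_supported_gpu(name) else "not in the T4/P4/P100 list"
    print(f"{name}: {verdict}")
```

The `env-check.py` script in the next cell remains the authoritative check; this sketch only mirrors the manual inspection described above.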


In [3]:
# This gets the RAPIDS-Colab install files and checks your GPU.  Run only this cell and the next one.
# Please read the output of this cell.  If your Colab instance is not RAPIDS-compatible, it will warn you and give you remediation steps.
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!python rapidsai-csp-utils/colab/env-check.py

Cloning into 'rapidsai-csp-utils'...
remote: Enumerating objects: 300, done.
remote: Counting objects: 100% (129/129), done.
remote: Compressing objects: 100% (74/74), done.
remote: Total 300 (delta 74), reused 99 (delta 55), pack-reused 171
Receiving objects: 100% (300/300), 87.58 KiB | 14.60 MiB/s, done.
Resolving deltas: 100% (136/136), done.
***********************************************************************
Woo! Your instance has the right kind of GPU, a Tesla T4!
***********************************************************************



In [None]:
# This will update the Colab environment and restart the kernel.  Don't run the next cell until you see the session crash.
!bash rapidsai-csp-utils/colab/update_gcc.sh
import os
os._exit(00)

Updating your Colab environment.  This will restart your kernel.  Don't Panic!
Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease [1,581 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/ InRelease [3,626 B]
Get:3 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/ Packages [85.6 kB]
Get:4 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Ign:5 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Get:6 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Packages [824 kB]
Hit:7 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release
Get:9 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu bionic InRelease [15.9 kB]
Hit:10 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:11 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Hit:12 http://ppa.launchpad.net/cran/libgit2/ubu

In [1]:
# This will install CondaColab.  This will restart your kernel one last time.  Run this cell by itself and only run the next cell once you see the session crash.
import condacolab
condacolab.install()

⏬ Downloading https://github.com/jaimergp/miniforge/releases/latest/download/Mambaforge-colab-Linux-x86_64.sh...
📦 Installing...
📌 Adjusting configuration...
🩹 Patching environment...
⏲ Done in 0:00:32
🔁 Restarting kernel...


In [1]:
# you can now run the rest of the cells as normal
import condacolab
condacolab.check()

✨🍰✨ Everything looks OK!


##### This next cell takes around 30 minutes to download, not the advertised 15 mins :'(

In [2]:
# Installing RAPIDS is now 'python rapidsai-csp-utils/colab/install_rapids.py <release> <packages>'
# The <release> options are 'stable' and 'nightly'.  Leaving it blank or adding any other words will default to stable.
!python rapidsai-csp-utils/colab/install_rapids.py stable
import os
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ['CONDA_PREFIX'] = '/usr/local'

Found existing installation: cffi 1.14.5
Uninstalling cffi-1.14.5:
  Successfully uninstalled cffi-1.14.5
Found existing installation: cryptography 3.4.5
Uninstalling cryptography-3.4.5:
  Successfully uninstalled cryptography-3.4.5
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting cffi==1.15.0
  Downloading cffi-1.15.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (427 kB)
Installing collected packages: cffi
Successfully installed cffi-1.15.0
Installing RAPIDS Stable 21.12
Starting the RAPIDS install on Colab.  This will take about 15 minutes.
Collecting package metadata (current_repodata.json): ...working... done
failed with initial frozen solve. Retrying with flexible solve.
failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
done

## Package Plan ##

  environment location: /usr/local

  added / updated specs:
   

# Start

In [3]:
import cudf
import cupy as cp
import cuml
from cuml.svm import SVR
import numpy as np
import pandas as pd
import time
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

In [None]:
type_dict = {
            'Focal_Distance_(um)':'float64',
            'Intensity_(W_cm2)':'float64',
            'Max_Proton_Energy_(MeV)':'float64',
            'Target_Thickness (um)':'float64',
            'Pulse_Duration_(fs)':'float64',
            'Spot_Size_(FWHM um)':'float64'
            }

In [None]:
df = pd.read_csv(r'raw_data.csv', dtype=type_dict)

In [None]:
df.drop('Pulse_Duration_(fs)', axis= 1, inplace= True)
df.drop('Spot_Size_(FWHM um)', axis= 1, inplace= True)

In [None]:
df.head()

|   | Focal_Distance_(um) | Intensity_(W_cm2) | Max_Proton_Energy_(MeV) | Target_Thickness (um) |
|---|---|---|---|---|
| 0 | -3.483857 | 2.363742e+18 | 0.002208 | 3.728562 |
| 1 | 4.143612 | 1.541043e+18 | 0.000524 | 6.975312 |
| 2 | 8.279505 | 3.021621e+18 | 0.00097 | 7.575147 |
| 3 | 5.698734 | 3.57145e+18 | 0.004871 | 2.324687 |
| 4 | 3.022455 | 2.594938e+18 | 0.003454 | 2.489047 |


In [None]:
df2 = df.copy()

In [None]:
df2.shape

(300000, 4)

In [None]:
scaler = StandardScaler()

In [None]:
scaler.fit(df2[['Intensity_(W_cm2)', 'Target_Thickness (um)', 'Focal_Distance_(um)']])

StandardScaler()

In [None]:
temp_list = ['Intensity_(W_cm2)', 'Target_Thickness (um)','Focal_Distance_(um)']
df2[temp_list] = scaler.transform(df2[temp_list])
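
`StandardScaler` replaces each column with its z-score, `(x - mean) / std`: the per-column mean and population standard deviation are learned in `fit` and applied in `transform`. The sketch below reproduces that arithmetic with NumPy on a small made-up array; the values are illustrative and are not taken from `raw_data.csv`.

```python
import numpy as np

# Made-up feature matrix: 4 rows, 2 columns (not from raw_data.csv).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])

mean = X.mean(axis=0)          # what fit() learns per column
std = X.std(axis=0)            # population std (ddof=0), as StandardScaler uses
X_scaled = (X - mean) / std    # what transform() applies

print(X_scaled.mean(axis=0))   # approximately 0 per column
print(X_scaled.std(axis=0))    # approximately 1 per column
```

Because the scaler in this notebook is fitted on every row before the train/test split, the test rows contribute to the learned mean and standard deviation; fitting on the training rows only would avoid that.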

In [None]:
df2.head()

|   | Focal_Distance_(um) | Intensity_(W_cm2) | Max_Proton_Energy_(MeV) | Target_Thickness (um) |
|---|---|---|---|---|
| 0 | -0.603956 | 0.068001 | 0.002208 | -0.553982 |
| 1 | 0.717406 | -0.783266 | 0.000524 | 0.629602 |
| 2 | 1.433896 | 0.748723 | 0.00097 | 0.848268 |
| 3 | 0.986811 | 1.317644 | 0.004871 | -1.065757 |
| 4 | 0.52318 | 0.307224 | 0.003454 | -1.00584 |


In [None]:
X_var = list(df2.columns)
X_var.remove('Max_Proton_Energy_(MeV)')
y_var = ['Max_Proton_Energy_(MeV)']

In [None]:
X_train, X_test, y_train, y_test = train_test_split(df2[X_var], df2[y_var], test_size=0.90, random_state=42)
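
Note that `test_size=0.90` holds out 90% of the rows, so the model is fitted on only 10% of the data. A rough sketch of the resulting sizes, assuming scikit-learn's behaviour of rounding the test count up when `test_size` is a fraction; `split_sizes` is a hypothetical helper, not a scikit-learn function.

```python
from math import ceil


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size."""
    n_test = ceil(test_size * n_samples)   # test count is rounded up
    return n_samples - n_test, n_test      # the remainder is used for training


n_train, n_test = split_sizes(300_000, 0.90)
print(f"train: {n_train}, test: {n_test}")
```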

In [None]:
y_train = np.log(y_train)

X_train_cu = cudf.DataFrame.from_pandas(X_train)
X_test_cu = cudf.DataFrame.from_pandas(X_test)
y_train_cu = cudf.DataFrame.from_pandas(y_train)
y_test_cu = cudf.DataFrame.from_pandas(y_test)

# Training the model

In [None]:
start_time = time.time()

In [None]:
model = SVR(kernel='poly', degree=3, gamma='auto', C=9.4, epsilon=0.001, coef0=1.2, cache_size= 10000, nochange_steps= 5000)

In [None]:
model.fit(X_train_cu, y_train_cu)

SVR()

In [None]:
y_predict = np.exp(cp.asarray(model.predict(X_test_cu)).get())
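
The model is trained on `log(y)` and its predictions are mapped back with `exp`, so the metrics further down compare values on the original MeV scale. A small sketch of that round trip with made-up target values (the numbers are illustrative only):

```python
import numpy as np

# Made-up positive targets spanning a few orders of magnitude.
y = np.array([0.0005, 0.002, 0.005, 0.1])

y_log = np.log(y)        # what the model is fitted on
y_back = np.exp(y_log)   # what is applied to the model's predictions

# The transform is invertible for strictly positive targets.
print(np.allclose(y, y_back))

# A constant error in log space becomes a constant *relative* error after exp.
y_pred = np.exp(y_log + 0.1)
print(y_pred / y)        # every ratio equals exp(0.1)
```

One consequence is that the fit weights relative rather than absolute errors, which suits a target whose values range over several orders of magnitude.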

In [None]:
end_time = time.time()

In [None]:
total_time = end_time - start_time

# Testing the runtime complexity of SVR on the GPU

In [None]:
df3 = df2.copy()

In [None]:
myarr = np.linspace(start= 100, stop= 300_000, num= 10)
tarr = np.zeros(len(myarr))

In [None]:
for i in range(len(myarr)):
    df4 = df3.sample(n= int(myarr[i]), random_state=42)
    ########
    X_train2, X_test2, y_train2, y_test2 = train_test_split(df4[X_var], df4[y_var], test_size=0.33, random_state=42)
    ########
    y_train2 = np.log(y_train2)

    X_train_cu2 = cudf.DataFrame.from_pandas(X_train2)
    X_test_cu2 = cudf.DataFrame.from_pandas(X_test2)
    y_train_cu2 = cudf.DataFrame.from_pandas(y_train2)
    y_test_cu2 = cudf.DataFrame.from_pandas(y_test2)
    ########
    start_time2 = time.time()
    ########
    # tmodel = SVR(kernel='poly', degree=3, gamma='auto', C=9.4, epsilon=0.001, coef0=1.2, cache_size= 10000, nochange_steps= 5000)
    tmodel = SVR(kernel='poly', degree=3, gamma='auto', C=9.4, epsilon=0.001, coef0=1.2)
    ########
    tmodel.fit(X_train_cu2, y_train_cu2)
    ########
    y_predict2 = np.exp(cp.asarray(tmodel.predict(X_test_cu2)).get())
    ########
    end_time2 = time.time()
    ########
    total_time2 = end_time2 - start_time2
    tarr[i] = total_time2
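
One way to turn the timings in `tarr` into a complexity estimate is to fit a straight line in log-log space: if the time grows like `c * n**k`, the slope of `log(t)` against `log(n)` is `k`. The sketch below is an assumption about how the results might be analysed, not something this notebook does; it uses synthetic timings (a hypothetical quadratic cost) so that it runs without a GPU.

```python
import numpy as np

# Same sample sizes as the loop above.
sizes = np.linspace(start=100, stop=300_000, num=10)

# Synthetic stand-in for tarr: a hypothetical quadratic cost, NOT measured data.
fake_times = 2e-9 * sizes**2

# The slope of the log-log line is the empirical exponent k in t ~ c * n**k.
k, log_c = np.polyfit(np.log(sizes), np.log(fake_times), deg=1)
print(f"estimated exponent: {k:.2f}")
```

With real measurements, replace `fake_times` with `tarr`. The smallest sizes may be dominated by fixed overhead such as host-to-device transfer, so the slope is likely to be most meaningful for the larger sizes.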

# Summary

In [None]:
fig = plt.figure()
plt.xlabel("Total number of samples (0.33 test split)")
plt.ylabel('Seconds needed to train and predict')
plt.plot(myarr, tarr)
plt.show()

In [None]:
print(f'Size of training set| {len(X_train)}\nSize of testing set | {len(X_test)}')

In [None]:
mean_squared_error(y_test, y_predict)

In [None]:
print(f'Wall Clock Time\n\nHours  | {total_time // 60**2}\nMinutes| {(total_time % 60**2) // 60}\nSeconds| {total_time % 60}')

In [None]:
r2_score(y_test,y_predict)
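
For reference, both metrics are simple to compute by hand. A minimal NumPy sketch on made-up numbers, assuming the usual definitions (mean of squared residuals for MSE, and one minus the ratio of residual to total sum of squares for R²):

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # made-up values
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

residuals = y_true - y_pred
mse = np.mean(residuals**2)                     # mean squared error

ss_res = np.sum(residuals**2)                   # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean())**2)    # total sum of squares
r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination

print(mse, r2)
```

Since `y_test` is left untransformed and `y_predict` has already been passed through `exp`, both scores above are computed on the original MeV scale.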