<a href="https://colab.research.google.com/github/knc6/jarvis-tools-notebooks/blob/master/jarvis-tools-notebooks/JARVIS_ML_LightGBM_GPUvsCPU.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## This notebook shows how a GPU can accelerate model training relative to a CPU, using the JARVIS-ML CFID dataset.

## Install LightGBM with GPU support
## Make sure you have enabled the GPU via "Edit > Notebook settings > GPU" in the dropdown menu.
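
Before running the install cell, it may help to confirm that the runtime actually has a GPU attached. The sketch below is an illustrative helper (the name `gpu_visible` is not part of the original notebook). It assumes the NVIDIA driver provides the `nvidia-smi` command and simply checks whether that command lists at least one device.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs successfully and lists a GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The driver tools are not on PATH, so no GPU can be queried.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, the GPU runtime is probably not enabled and GPU training is unlikely to work.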

In [1]:
!sudo apt install nvidia-driver-460 nvidia-cuda-toolkit clinfo
!apt-get update --fix-missing
!pip install lightgbm --install-option=--gpu --install-option="--opencl-include-dir=/usr/local/cuda/include/" --install-option="--opencl-library=/usr/local/cuda/lib64/libOpenCL.so"


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
clinfo is already the newest version (3.0.21.02.21-1).
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libnvidia-compute-510 : Depends: libnvidia-compute-535 but it is not installable
 nvidia-cuda-dev : Breaks: libcuda1 (< 495)
                   Recommends: libnvcuvid1 but it is not installable
E: Unable to correct problems, you have held broken packages.
Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]
Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x8

In [2]:
# ! git clone --recursive https://github.com/Microsoft/LightGBM
# ! cd LightGBM && rm -rf build && mkdir build && cd build && cmake -DUSE_GPU=1 ../../LightGBM && make -j4 && cd ../python-package && python3 setup.py install --precompile --gpu;

## GPU information

In [3]:
!nvidia-smi

Wed Jan 22 01:19:42 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Install jarvis-tools

In [4]:
!pip install jarvis-tools

Collecting jarvis-tools
  Downloading jarvis_tools-2024.10.30-py2.py3-none-any.whl.metadata (3.0 kB)
Collecting xmltodict>=0.11.0 (from jarvis-tools)
  Downloading xmltodict-0.14.2-py2.py3-none-any.whl.metadata (8.0 kB)
Downloading jarvis_tools-2024.10.30-py2.py3-none-any.whl (4.2 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m4.2/4.2 MB[0m [31m25.9 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading xmltodict-0.14.2-py2.py3-none-any.whl (10.0 kB)
Installing collected packages: xmltodict, jarvis-tools
Successfully installed jarvis-tools-2024.10.30 xmltodict-0.14.2


## Get example data from JARVIS

In [5]:
from jarvis.ai.pkgs.utils import get_ml_data
from jarvis.ai.pkgs.utils import regr_scores
X,y,jid=get_ml_data()
# Formation energy for 3D materials; other properties/datasets can be chosen as well.
# Take only the first 1000 samples for tutorial purposes.
X=X[0:1000]
y=y[0:1000]
jid=jid[0:1000]

Obtaining 3D dataset 55k ...
Reference:https://www.nature.com/articles/s41524-020-00440-1
Other versions:https://doi.org/10.6084/m9.figshare.6815699


100%|██████████| 446M/446M [00:32<00:00, 13.8MiB/s]


Loading the zipfile...
Loading completed.
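
Taking the first 1000 rows keeps the tutorial fast, but the slice depends on how the dataset happens to be ordered. As a hedged alternative, a seeded random subset could be drawn instead. `subsample_indices` below is a hypothetical helper, the toy lists are stand-ins for the real data, and the sketch assumes that `X`, `y` and `jid` are equally long sequences.

```python
import random


def subsample_indices(n_total, n_keep, seed=0):
    """Return sorted, reproducible indices for a random subset of size n_keep."""
    rng = random.Random(seed)
    n_keep = min(n_keep, n_total)
    return sorted(rng.sample(range(n_total), n_keep))


# Toy stand-ins for the data returned by get_ml_data()
X_demo = [[i, i * 2] for i in range(10)]
y_demo = [float(i) for i in range(10)]
jid_demo = [f"ID-{i}" for i in range(10)]

# Apply the same indices to all three sequences so they stay aligned
idx = subsample_indices(len(X_demo), 4, seed=1)
X_sub = [X_demo[i] for i in idx]
y_sub = [y_demo[i] for i in idx]
jid_sub = [jid_demo[i] for i in idx]
print(idx, jid_sub)
```

Using a different subset would change the MAE values and timings reported in this notebook.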


## Formation energy model training with GPU
### Other hyperparameters for materials properties are available at https://github.com/usnistgov/jarvis/tree/master/jarvis/ai/pkgs/lgbm

In [6]:
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
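
The command above registers NVIDIA's OpenCL driver by writing an ICD file that names the vendor library. A small check can confirm that the file exists and holds the expected library name. The helper name `read_icd` is illustrative and not part of the original notebook.

```python
from pathlib import Path


def read_icd(path="/etc/OpenCL/vendors/nvidia.icd"):
    """Return the stripped contents of an ICD file, or None if it is missing."""
    icd = Path(path)
    if not icd.is_file():
        return None
    return icd.read_text().strip()


print(read_icd())
```

After the cell above has run, this should print `libnvidia-opencl.so.1`; `None` means the file was not written.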


In [7]:
%%capture
import lightgbm as lgb
from sklearn.model_selection import train_test_split

# Gradient-boosted regressor trained on the GPU
lgbm = lgb.LGBMRegressor(device='gpu', n_estimators=1170, learning_rate=0.15375236057119931, num_leaves=273)

# Hold out 10% of the samples for testing
X_train, X_test, y_train, y_test, jid_train, jid_test = train_test_split(X, y, jid, random_state=1, test_size=0.1)
lgbm.fit(X_train, y_train)
pred = lgbm.predict(X_test)
reg_sc = regr_scores(y_test, pred)


In [8]:
print(reg_sc['mae'])

0.31409038719427806
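
The value above is the mean absolute error (MAE) on the held-out 10% test split. One way to judge whether a model is overfitting is to compare it with the MAE on the training split. The sketch below assumes only that the model exposes `predict`. The names `mae`, `mae_gap` and the toy `MeanModel` are illustrative and are not part of jarvis-tools or LightGBM.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    pairs = list(zip(y_true, y_pred))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def mae_gap(model, X_train, y_train, X_test, y_test):
    """Return (train MAE, test MAE, test MAE minus train MAE)."""
    train_mae = mae(y_train, model.predict(X_train))
    test_mae = mae(y_test, model.predict(X_test))
    return train_mae, test_mae, test_mae - train_mae


class MeanModel:
    """Toy stand-in for a fitted regressor: always predicts one constant."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return [self.value for _ in X]


train_mae, test_mae, gap = mae_gap(
    MeanModel(1.0), [[0], [1]], [1.0, 1.0], [[2], [3]], [2.0, 3.0]
)
print(train_mae, test_mae, gap)
```

In this notebook the call would look like `mae_gap(lgbm, X_train, y_train, X_test, y_test)`. A test MAE far above the training MAE suggests that the model is overfitting.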


## The accuracy of the model can be further improved by increasing the number of trees, decreasing the learning rate, and increasing the number of leaves. However, this requires more training time and the model may overfit. You can skip this step if you only want to see the speed-up.

In [19]:
%%capture
import time
import lightgbm as lgb
from sklearn.model_selection import train_test_split

t1 = time.time()
# Lower learning rate and more leaves than the first GPU model; n_estimators is unchanged
lgbm = lgb.LGBMRegressor(device='gpu', n_estimators=1170, learning_rate=0.05375236057119931, num_leaves=573)

# Same 90/10 split as before (random_state=1)
X_train, X_test, y_train, y_test, jid_train, jid_test = train_test_split(X, y, jid, random_state=1, test_size=0.1)
lgbm.fit(X_train, y_train)
pred = lgbm.predict(X_test)
reg_sc = regr_scores(y_test, pred)
t2 = time.time()
total_time = t2 - t1



In [20]:
print(reg_sc['mae'], total_time)

0.2931568537049556 126.60851740837097
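
The `total_time` printed above covers model construction, the train/test split, fitting, prediction and scoring. To isolate the training cost, only the `fit` call can be timed. The following is a minimal sketch using `time.perf_counter`. `timed_fit` and `SleepyModel` are hypothetical names, and any estimator with a `fit` method should work in place of the toy model.

```python
import time


def timed_fit(model, X, y):
    """Fit `model` and return (fitted model, elapsed seconds for fit only)."""
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    return model, elapsed


class SleepyModel:
    """Toy estimator whose fit only sleeps, used to exercise the timer."""

    def __init__(self, delay):
        self.delay = delay
        self.fitted = False

    def fit(self, X, y):
        time.sleep(self.delay)
        self.fitted = True
        return self


model, seconds = timed_fit(SleepyModel(0.05), [[0.0]], [0.0])
print(f"fit took {seconds:.3f} s")
```

With LightGBM this could be used as `lgbm, seconds = timed_fit(lgbm, X_train, y_train)` for both the GPU and the CPU model.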


## Formation energy model training with CPU

In [21]:
%%capture
import time
import lightgbm as lgb
from jarvis.ai.pkgs.utils import regr_scores
from sklearn.model_selection import train_test_split

t1 = time.time()

# Same hyperparameters as the first GPU model, but trained on the CPU
lgbm = lgb.LGBMRegressor(n_estimators=1170, learning_rate=0.15375236057119931, num_leaves=273, device='cpu')
# Split features, targets and JARVIS IDs together so they stay aligned
X_train, X_test, y_train, y_test, jid_train, jid_test = train_test_split(X, y, jid, random_state=1, test_size=0.1)
lgbm.fit(X_train, y_train)
pred = lgbm.predict(X_test)
reg_sc = regr_scores(y_test, pred)
t2 = time.time()
total_time = t2 - t1

In [22]:
print(reg_sc['mae'], total_time)

0.3125744831169388 135.3630268573761
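
The two timings reported in this notebook can be turned into a speed-up ratio. Note that the timed GPU run used a lower learning rate and more leaves (`learning_rate=0.05375236057119931`, `num_leaves=573`) than the CPU run (`learning_rate=0.15375236057119931`, `num_leaves=273`), so the ratio is indicative only. A like-for-like comparison would reuse the same hyperparameters on both devices.

```python
# Wall-clock times (seconds) copied from the outputs in this notebook
gpu_seconds = 126.60851740837097  # timed GPU run (573 leaves)
cpu_seconds = 135.3630268573761   # timed CPU run (273 leaves)

# A ratio above 1 means the GPU run finished faster than the CPU run
speedup = cpu_seconds / gpu_seconds
print(f"GPU speed-up over CPU: {speedup:.2f}x")
```

For these numbers the ratio is about 1.07, even though the GPU model was the larger of the two.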
