# Colab GPU setup for Redox Potential Pipeline

- Use the `Python 3 (Google Colab)` kernel with the `GPU` hardware accelerator (Runtime → Change runtime type).
- This notebook installs dependencies, writes a GPU config, trains, and optionally runs a prediction.
- `PROJECT_DIR` is derived from `default_dir` in the path-setup cell below; change `default_dir` if the repo is in a different location (e.g., after mounting Drive).


In [None]:
!nvidia-smi
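# Skip the clone when the directory already exists so the cell can be rerun safely.
![ -d /content/OOK ] || git clone https://github.com/PrithivP12/Science_Fair-25-26.git /content/OOK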


Mon Jan 19 13:00:58 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   48C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
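
The table above confirms that a Tesla T4 is attached. To fail early when no GPU is available, the same check can be done programmatically. This is a minimal sketch, assuming `nvidia-smi` is on the `PATH`; the helper names `parse_gpu_query` and `list_gpus` are illustrative and not part of the repository.

```python
import shutil
import subprocess


def parse_gpu_query(text):
    """Parse lines such as 'Tesla T4, 15360 MiB' into (name, memory_mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        # Split on the last comma so GPU names containing commas stay intact.
        name, mem = [part.strip() for part in line.rsplit(",", 1)]
        gpus.append((name, int(mem.split()[0])))
    return gpus


def list_gpus():
    """Return the visible GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return parse_gpu_query(result.stdout) if result.returncode == 0 else []


gpus = list_gpus()
print(gpus if gpus else "No GPU detected; switch the runtime type before training.")
```

If the list is empty, it is probably worth changing the runtime type before running the training cell, since the config written later requests GPU execution.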
                                                

In [None]:
import os, pathlib

default_dir = pathlib.Path('/content/OOK').resolve()
project_dir = default_dir if (default_dir / 'configs' / 'default.yaml').exists() else pathlib.Path.cwd().resolve()
os.environ['PROJECT_DIR'] = str(project_dir)
print('PROJECT_DIR =', project_dir)
if not (project_dir / 'configs' / 'default.yaml').exists():
    raise FileNotFoundError(f"{project_dir} is missing configs/default.yaml. Update default_dir above and rerun.")
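
If the repo may live in more than one place (for example on a mounted Drive), the lookup above can be generalized to a list of candidates. The sketch below is one possible approach: `find_project_dir` is a hypothetical helper, and `/content/drive/MyDrive/OOK` is an assumed Drive location, not a path taken from this notebook.

```python
import pathlib


def find_project_dir(candidates, marker="configs/default.yaml"):
    """Return the first candidate directory that contains the marker file, else None."""
    for candidate in candidates:
        path = pathlib.Path(candidate).expanduser()
        if (path / marker).is_file():
            return path.resolve()
    return None


# Candidate locations, checked in order; the Drive path is an assumption.
candidates = ["/content/OOK", "/content/drive/MyDrive/OOK", pathlib.Path.cwd()]
print("Resolved project dir:", find_project_dir(candidates))
```

A `None` result means none of the candidates contain `configs/default.yaml`, which corresponds to the `FileNotFoundError` case above.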


In [None]:
%%bash
set -e
cd "$PROJECT_DIR"
pip install -q --upgrade pip
pip install -q -r requirements.txt


In [None]:
%%bash
set -e
cd "$PROJECT_DIR"
cat > configs/gpu_colab.yaml <<'EOF'
random_seed: 42
test_size: 0.2
val_size: 0.1
n_splits: 5
selection_metric: MAE
selection_split: cv
tuning: true
tune_trials: 20
models: [catboost, xgb, ensemble]
ensemble: true

data:
  path: data/redox_dataset.csv

artifacts:
  dir: artifacts

catboost:
  depth: 8
  learning_rate: 0.05
  iterations: 500
  l2_leaf_reg: 3.0
  loss_function: MAE
  eval_metric: MAE
  subsample: 0.8
  task_type: GPU
  devices: "0"

xgboost:
  max_depth: 6
  learning_rate: 0.05
  n_estimators: 400
  min_child_weight: 1.0
  subsample: 0.8
  colsample_bytree: 0.8
  reg_lambda: 1.0
  objective: reg:absoluteerror
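  # Note: XGBoost 2.0+ deprecates gpu_hist and gpu_predictor in favor of
  # `tree_method: hist` with `device: cuda`; keep these only for older versions.
  tree_method: gpu_hist
  predictor: gpu_predictor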
  eval_metric: mae
EOF
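
To get a feel for what `test_size: 0.2` and `val_size: 0.1` mean in rows, the arithmetic can be sketched as follows. This assumes both values are fractions of the full dataset, which may not match how `scripts/run_train.py` actually interprets them; the 1000-row dataset size is purely illustrative.

```python
def split_counts(n_rows, test_size, val_size):
    """Row counts per split, ASSUMING both sizes are fractions of the full dataset."""
    n_test = round(n_rows * test_size)
    n_val = round(n_rows * val_size)
    return {"train": n_rows - n_test - n_val, "val": n_val, "test": n_test}


# Hypothetical dataset size; the real row count of data/redox_dataset.csv is not stated here.
counts = split_counts(1000, test_size=0.2, val_size=0.1)
print(counts)
```

If the pipeline instead takes `val_size` as a fraction of what remains after the test split, the training and validation counts will differ slightly from this estimate.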


In [None]:
%%bash
set -e
cd "$PROJECT_DIR"
python scripts/run_train.py --config configs/gpu_colab.yaml


In [None]:
%%bash
set -e
cd "$PROJECT_DIR"
python scripts/run_predict.py --model artifacts/models/best_model.pkl --input data/redox_dataset.csv --output artifacts/predictions/redox_preds.csv
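
Before downloading, it can help to confirm that the prediction file was written and is not empty. This is a minimal sketch using only the standard library; `summarize_csv` is an illustrative helper, and no assumption is made about the column names that `run_predict.py` produces.

```python
import csv
import os
import pathlib


def summarize_csv(path):
    """Return (header, number_of_data_rows) for a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)
    return header, n_rows


# Same output path as passed to run_predict.py, resolved against PROJECT_DIR.
pred_path = (
    pathlib.Path(os.environ.get("PROJECT_DIR", "."))
    / "artifacts" / "predictions" / "redox_preds.csv"
)
if pred_path.is_file():
    header, n_rows = summarize_csv(pred_path)
    print(f"{pred_path}: {n_rows} rows, columns = {header}")
else:
    print(f"{pred_path} not found; check the output of the prediction step.")
```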


In [None]:
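import os
from google.colab import files

# Resolve against PROJECT_DIR so the download still works when the repo is not in /content/OOK.
files.download(os.path.join(os.environ['PROJECT_DIR'], 'artifacts', 'predictions', 'redox_preds.csv'))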
