# 🧬 BioFoundry Active Learning - Final Working Version

**All Installation Issues Fixed**

### Fixes Applied:
- ✅ Correct OCP repository (FAIR-Chem/fairchem)
- ✅ Specific version checkout (f83d150)
- ✅ Code modification in utils.py
- ✅ submitit dependency
- ✅ GPU-adaptive config

## Cell 1: GPU Check

In [None]:
import subprocess
import sys

print("=" * 60)
print("GPU Information:")
print("=" * 60)
subprocess.run(["nvidia-smi"], check=False)

import torch
print(f"\nPyTorch: {torch.__version__}")
print(f"CUDA: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    print(f"GPU: {gpu_name}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
    
    if "A100" in gpu_name:
        RECOMMENDED_BATCH_SIZE = 16
        RECOMMENDED_LMAX = [4]
    elif "V100" in gpu_name:
        RECOMMENDED_BATCH_SIZE = 8
        RECOMMENDED_LMAX = [4]
    elif "T4" in gpu_name:
        RECOMMENDED_BATCH_SIZE = 4
        RECOMMENDED_LMAX = [2]
    else:
        RECOMMENDED_BATCH_SIZE = 4
        RECOMMENDED_LMAX = [2]
    
    print(f"\nRecommended Config:")
    print(f"  batch_size: {RECOMMENDED_BATCH_SIZE}")
    print(f"  lmax_list: {RECOMMENDED_LMAX}")
else:
    print("⚠️ No GPU!")
    RECOMMENDED_BATCH_SIZE = 1
    RECOMMENDED_LMAX = [2]
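
The GPU-to-config branches above can also be written as a lookup function, which makes it easier to add a GPU model or to check the logic on a machine without a GPU. This is only a sketch: `GPU_PRESETS`, `DEFAULT_PRESET`, `CPU_PRESET`, and `recommend_config` are hypothetical names, and the values simply mirror the branches in Cell 1.

```python
# Hypothetical helper: mirrors the if/elif chain in Cell 1 as a lookup table.
GPU_PRESETS = {
    "A100": (16, [4]),
    "V100": (8, [4]),
    "T4": (4, [2]),
}
DEFAULT_PRESET = (4, [2])  # GPU present but not in the table
CPU_PRESET = (1, [2])      # no GPU available


def recommend_config(gpu_name=None):
    """Return (batch_size, lmax_list) for a GPU name; None means no GPU."""
    if gpu_name is None:
        preset = CPU_PRESET
    else:
        preset = DEFAULT_PRESET
        for key, value in GPU_PRESETS.items():
            if key in gpu_name:
                preset = value
                break
    # Copy the list so callers cannot mutate the shared preset.
    return preset[0], list(preset[1])


batch_size, lmax_list = recommend_config("Tesla T4")
print(batch_size, lmax_list)
```

In the notebook, the argument would be `torch.cuda.get_device_name(0)` when CUDA is available and `None` otherwise.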

## Cell 2: Install Dependencies

In [None]:
print("Installing Dependencies...")

!pip uninstall -y torch-scatter torch-sparse torch-geometric torch-cluster
!pip install torch==2.1.0 torchvision==0.16.0
!pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv \
    -f https://data.pyg.org/whl/torch-2.1.0+cu121.html
!pip install torch-geometric
!pip install lmdb pyyaml tqdm biopython ase e3nn timm \
    scipy==1.13.1 numba wandb tensorboard submitit \
    scikit-learn matplotlib seaborn

print("\n✅ Dependencies installed")

## Cell 3: Mount Drive & Copy Data

⚠️ **MODIFY `DRIVE_DATA_PATH`**

In [None]:
from google.colab import drive
import os
import shutil

drive.mount('/content/drive', force_remount=True)

# ⚠️ MODIFY THIS
DRIVE_DATA_PATH = "/content/drive/My Drive/BioFoundry/data"

LOCAL_DATA_PATH = "/content/data"
CHECKPOINT_PATH = "/content/checkpoints"
EMBEDDING_PATH = "/content/embeddings.npy"

os.makedirs(LOCAL_DATA_PATH, exist_ok=True)
os.makedirs(CHECKPOINT_PATH, exist_ok=True)

print("Copying LMDB...")
if os.path.exists(DRIVE_DATA_PATH):
    shutil.copytree(DRIVE_DATA_PATH, LOCAL_DATA_PATH, dirs_exist_ok=True)
    print(f"✅ Data copied to {LOCAL_DATA_PATH}")
    !ls -lh {LOCAL_DATA_PATH}
else:
    print(f"❌ {DRIVE_DATA_PATH} not found!")
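
Since the config in Cell 5 points at `train.lmdb` and `val.lmdb` inside the local data directory, it may be worth confirming that both exist before starting a multi-hour run. The sketch below assumes those two names; `missing_lmdbs` is a hypothetical helper, and it uses `os.path.exists` because an LMDB can be stored either as a single file or as a directory.

```python
import os


def missing_lmdbs(data_dir, names=("train.lmdb", "val.lmdb")):
    """Return the expected LMDB entries that are absent from data_dir."""
    return [n for n in names if not os.path.exists(os.path.join(data_dir, n))]


# "/content/data" is the LOCAL_DATA_PATH used in this notebook.
missing = missing_lmdbs("/content/data")
if missing:
    print(f"Missing in /content/data: {missing}")
else:
    print("train.lmdb and val.lmdb found")
```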

## Cell 4: Clone & Install OCP (CRITICAL FIX)

**Key Changes:**
1. Use `FAIR-Chem/fairchem` (not the old OCP repo)
2. Check out commit `f83d150`
3. Modify `utils.py` before installation

In [None]:
import os
import sys

os.chdir("/content")

print("=" * 60)
print("Installing OCP (FAIR-Chem)...")
print("=" * 60)

# Clone FAIR-Chem (new OCP repo)
if not os.path.exists("/content/ocp"):
    print("\n📥 Cloning FAIR-Chem repository...")
    !git clone https://github.com/FAIR-Chem/fairchem.git ocp
    os.chdir("/content/ocp")
    
    # Checkout specific version
    print("\n📌 Checking out version f83d150...")
    !git checkout f83d150
    
    # Modify utils.py (CRITICAL)
    print("\n🔧 Modifying ocpmodels/common/utils.py...")
    utils_path = "/content/ocp/ocpmodels/common/utils.py"
    
    with open(utils_path, 'r') as f:
        content = f.read()
    
    # Add two imports after the `finally:` that precedes registry.register (near line 329)
    target = "finally:\n        registry.register"
    patched = "finally:\n        import nets\n        import oc20.trainer\n        registry.register"
    if "import nets" in content:
        print("⚠️ utils.py already modified")
    elif target in content:
        # Write only when the target was actually found, so success is never reported falsely
        with open(utils_path, 'w') as f:
            f.write(content.replace(target, patched))
        print("✅ utils.py modified")
    else:
        print("⚠️ Patch target not found; utils.py structure changed")
    
    # Install OCP
    print("\n📦 Installing ocpmodels package...")
    !pip install -e .
    print("✅ OCP installed")
else:
    print("✅ OCP already exists")
    os.chdir("/content/ocp")

# Clone EquiformerV2
os.chdir("/content")
if not os.path.exists("/content/equiformer_v2"):
    print("\n📥 Cloning EquiformerV2...")
    !git clone https://github.com/atomicarchitects/equiformer_v2.git
    print("✅ EquiformerV2 cloned")
else:
    print("✅ EquiformerV2 already exists")

# Add to path
sys.path.insert(0, "/content/ocp")
sys.path.insert(0, "/content/equiformer_v2")

# Verify
print("\n" + "=" * 60)
print("Verifying Installation...")
print("=" * 60)

try:
    from ocpmodels.common import distutils
    from ocpmodels.common.registry import registry
    print("✅ ocpmodels imports successful")
except ImportError as e:
    print(f"❌ Import failed: {e}")

print("\n✅ Setup complete")

## Cell 5: Generate Config

In [None]:
import yaml

config = {
    "trainer": "energy_v2",
    "dataset": {
        "train": {"src": f"{LOCAL_DATA_PATH}/train.lmdb", "normalize_labels": False},
        "val": {"src": f"{LOCAL_DATA_PATH}/val.lmdb"}
    },
    "logger": "tensorboard",
    "task": {
        "dataset": "lmdb_v2",
        "description": "BioFoundry Active Learning",
        "type": "regression",
        "metric": "mae",
        "primary_metric": "mae",
        "labels": ["predicted_score"]
    },
    "model": {
        "name": "equiformer_v2",
        "use_pbc": False,
        "regress_forces": False,
        "otf_graph": True,
        "max_neighbors": 20,
        "max_radius": 12.0,
        "max_num_elements": 90,
        "num_layers": 4,
        "sphere_channels": 64,
        "attn_hidden_channels": 64,
        "num_heads": 4,
        "attn_alpha_channels": 64,
        "attn_value_channels": 32,
        "ffn_hidden_channels": 128,
        "norm_type": "layer_norm",
        "lmax_list": RECOMMENDED_LMAX,
        "mmax_list": [2] if RECOMMENDED_LMAX == [4] else [1],
        "grid_resolution": 18 if RECOMMENDED_LMAX == [4] else 8
    },
    "optim": {
        "batch_size": RECOMMENDED_BATCH_SIZE,
        "eval_batch_size": RECOMMENDED_BATCH_SIZE * 2,
        "num_workers": 2,
        "lr_initial": 0.001,
        "optimizer": "AdamW",
        "optimizer_params": {"weight_decay": 0.01},
        "scheduler": "ReduceLROnPlateau",
        "scheduler_params": {"factor": 0.5, "patience": 5, "epochs": 50},
        "mode": "min",
        "max_epochs": 50,
        "energy_coefficient": 1.0,
        "eval_every": 5,
        "checkpoint_every": 10
    }
}

config_path = "/content/colab_config.yml"
with open(config_path, "w") as f:
    yaml.dump(config, f, default_flow_style=False)

print(f"✅ Config saved: {config_path}")
print(f"Batch: {RECOMMENDED_BATCH_SIZE}, Lmax: {RECOMMENDED_LMAX}")
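
Because `mmax_list` and `grid_resolution` are derived from `RECOMMENDED_LMAX`, a quick consistency check on the model section can catch a bad manual edit before training starts. The sketch below assumes the model expects one `mmax` per `lmax` entry and relies on the fact that a spherical-harmonic order cannot exceed its degree; `check_model_config` is a hypothetical helper, not part of OCP or EquiformerV2.

```python
def check_model_config(model_cfg):
    """Return a list of problems in the lmax/mmax settings (empty if none)."""
    problems = []
    lmax_list = model_cfg["lmax_list"]
    mmax_list = model_cfg["mmax_list"]
    if len(lmax_list) != len(mmax_list):
        problems.append("lmax_list and mmax_list differ in length")
    for lmax, mmax in zip(lmax_list, mmax_list):
        if mmax > lmax:
            problems.append(f"mmax {mmax} exceeds lmax {lmax}")
    return problems


# Values that Cell 5 produces when RECOMMENDED_LMAX == [4]
print(check_model_config({"lmax_list": [4], "mmax_list": [2]}))
```

In the notebook, this would be called as `check_model_config(config["model"])` right after the config dictionary is built.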

## Cell 6: Train EquiformerV2

⏰ Estimated runtime: 2-6 hours

In [None]:
os.environ['PYTHONPATH'] = '/content/ocp:/content/equiformer_v2'
os.chdir("/content/equiformer_v2")

print("=" * 60)
print("Starting Training...")
print("=" * 60)

!python main_oc20.py \
    --config-yml {config_path} \
    --mode train \
    --run-dir {CHECKPOINT_PATH} \
    --print-every 10

print("\n✅ Training done")
print(f"Checkpoints: {CHECKPOINT_PATH}")
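
After training, later cells will need a checkpoint file from the run directory. A minimal sketch for locating the newest one follows. It assumes checkpoints are saved with a `.pt` extension somewhere under the run directory, which this notebook does not confirm; `latest_checkpoint` is a hypothetical helper.

```python
from pathlib import Path


def latest_checkpoint(run_dir, pattern="*.pt"):
    """Return the most recently modified file matching pattern under run_dir, or None."""
    root = Path(run_dir)
    if not root.is_dir():
        return None
    candidates = [p for p in root.rglob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# "/content/checkpoints" is the CHECKPOINT_PATH used in this notebook.
print(latest_checkpoint("/content/checkpoints"))
```

Picking by modification time avoids depending on the exact subdirectory naming the trainer uses.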

## Cell 7: TensorBoard (Optional)

In [None]:
%load_ext tensorboard
%tensorboard --logdir {CHECKPOINT_PATH}

---

## Summary of Fixes

### Problem 1: `submitit` missing
**Fix**: Added `submitit` to the `pip install` list in Cell 2

### Problem 2: `ocpmodels` missing
**Root cause**: The wrong OCP repository was used and the package was never installed

**Fix (Cell 4)**:
1. Clone `FAIR-Chem/fairchem` (not old OCP)
2. Checkout version `f83d150`
3. Modify `utils.py` (add 2 lines)
4. Run `pip install -e .`
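
The `utils.py` change in step 3 can be expressed as a pure function on the file's text, which shows exactly which two lines are added and lets the patch be checked without cloning anything. This is a sketch under the same assumption Cell 4 makes, namely that the file contains a `finally:` immediately followed by an 8-space-indented `registry.register` call; `patch_utils_source` and the status strings are hypothetical.

```python
TARGET = "finally:\n        registry.register"
PATCHED = (
    "finally:\n"
    "        import nets\n"
    "        import oc20.trainer\n"
    "        registry.register"
)


def patch_utils_source(source):
    """Return (new_source, status); status is 'patched', 'already', or 'not_found'."""
    if "import nets" in source:
        return source, "already"
    if TARGET not in source:
        return source, "not_found"
    return source.replace(TARGET, PATCHED), "patched"


# Tiny stand-in for the relevant part of utils.py (not the real file contents).
sample = "    try:\n        pass\n    finally:\n        registry.register('x', 1)\n"
patched, status = patch_utils_source(sample)
print(status)
```

Returning a status instead of printing makes it clear when the target was not found, so a structural change in `utils.py` is not mistaken for a successful patch.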

### Expected Output:
```
✅ ocpmodels imports successful
✅ Setup complete
```

---

**Cells 8-14** (Embedding extraction, Active Learning) remain unchanged from previous versions.