# Structural Bioinformatics — Colab Getting Started

This notebook confirms your Colab setup and shows the standard workflow we’ll use all semester:
1. Open a course notebook from GitHub
2. Install required Python packages
3. Get course files/data into the runtime
4. (Optional) Mount Google Drive for persistent storage
5. Save your work and submit

**Tip:** Early exercises do *not* require Google Drive mounting. Final projects *do*.


In [None]:
import os
import platform
import sys
print("Python:", sys.version.split()[0])
print("Platform:", platform.platform())
print("Working dir:", os.getcwd())


## 1) Choose whether to use Google Drive (persistence)

- If you want your edits to persist across sessions, use **Drive**.
- If you're just running a quick exercise, you can skip Drive and use the temporary Colab filesystem (`/content`).



In [None]:
USE_DRIVE = True  # set False if you want to skip Drive mounting

if USE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')
    print("Drive mounted at /content/drive")
else:
    print("Skipping Drive mount.")


In [None]:
from pathlib import Path

# Change this folder name once; everything else uses it.
COURSE_DIR_NAME = "structbio_course"  # you can rename for your course

# Use Drive only if it was requested AND the mount actually succeeded.
# (MyDrive exists only once Drive is mounted.)
if USE_DRIVE and Path("/content/drive/MyDrive").exists():
    ROOT = Path("/content/drive/MyDrive") / COURSE_DIR_NAME
else:
    ROOT = Path("/content") / COURSE_DIR_NAME

ROOT.mkdir(parents=True, exist_ok=True)
(ROOT / "data").mkdir(exist_ok=True)
(ROOT / "outputs").mkdir(exist_ok=True)

print("ROOT:", ROOT)
print("data:", ROOT/"data")
print("outputs:", ROOT/"outputs")


## 2) Get course materials

We will normally pull notebooks/data from a course GitHub repo.
If you opened this notebook from GitHub already, you may still want a local copy of the repo for helper scripts and data.

**Instructor note:** update `REPO_URL` below.


In [None]:
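import subprocess

REPO_URL = "https://github.com/vvoelz/chem5412-spring2026"
REPO_DIR = ROOT / "repo"

def run(args):
    """Echo a command, then run it; raises CalledProcessError on failure."""
    # Pass arguments as a list so paths containing spaces are handled safely.
    args = [str(a) for a in args]
    print(">>", " ".join(args))
    return subprocess.check_call(args)

if not REPO_DIR.exists():
    run(["git", "clone", REPO_URL, REPO_DIR])
else:
    # Repo already cloned (e.g., persisted on Drive): just update it.
    run(["git", "-C", REPO_DIR, "pull"])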

print("Repo at:", REPO_DIR)


## 3) Install Python packages

Colab already includes numpy/scipy/matplotlib/pandas, but we will install extras.
This may take 1–3 minutes.

**Instructor note:** keep this list minimal; install only what you use.
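
As a rough way to decide what actually needs installing, the sketch below checks which modules are already importable in the current runtime. The helper name `missing_modules` is hypothetical (it is not part of the course repo), and it takes *import* names, which can differ from pip names (for example, `biopython` is imported as `Bio`).

```python
from importlib.util import find_spec

def missing_modules(import_names):
    """Return the names from import_names that cannot be found for import.

    Hypothetical helper for illustration; top-level import names only.
    """
    return [name for name in import_names if find_spec(name) is None]

# Import names for the packages this notebook installs (assumed mapping:
# biopython -> Bio; the other three import under their pip names).
wanted = ["Bio", "mdtraj", "nglview", "py3Dmol"]
print("Not yet installed:", missing_modules(wanted))
```

An empty list suggests the install cell can probably be skipped for this session; it does not check package versions.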


In [None]:
# Keep this lightweight; add packages as your course needs.
# Examples: biopython, mdtraj, nglview, py3Dmol, prody
!pip -q install biopython mdtraj nglview py3Dmol

# Optional: if your repo includes a requirements file:
# !pip -q install -r "{REPO_DIR}/environment/colab_requirements.txt"


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import Bio
import mdtraj as md

print("numpy:", np.__version__)
print("biopython:", Bio.__version__)
print("mdtraj:", md.__version__)


## 4) Quick structure visualization check

We’ll use simple viewers in Colab:
- `py3Dmol` (browser-based, easy)
- (optional) `nglview` (more featureful, sometimes finicky)

Below: fetch a PDB and render it.


In [None]:
import requests
import py3Dmol

PDB_ID = "1CRN"  # crambin (small test structure)
url = f"https://files.rcsb.org/download/{PDB_ID}.pdb"
resp = requests.get(url, timeout=30)
resp.raise_for_status()  # stop with a clear error if the download failed
pdb_txt = resp.text

view = py3Dmol.view(width=600, height=450)
view.addModel(pdb_txt, "pdb")
view.setStyle({"cartoon": {}})
view.zoomTo()
view.show()
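
If the viewer comes up empty, it can help to confirm that the download really is PDB text rather than an error page. A minimal sketch, assuming the fixed-column PDB record layout: the hypothetical helper `summarize_pdb_text` counts `ATOM` records, alpha carbons, and chain IDs. The two sample records use made-up coordinates and are included only so the block runs without a network connection; in this notebook you would pass `pdb_txt` instead.

```python
def summarize_pdb_text(pdb_txt):
    """Count ATOM records, alpha carbons, and chain IDs in PDB-format text.

    Hypothetical helper for illustration. Assumes the fixed-column PDB
    layout: atom name in columns 13-16, chain ID in column 22.
    """
    n_atoms = 0
    n_ca = 0
    chains = set()
    for line in pdb_txt.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HEADER, HETATM, TER, END, ...
        n_atoms += 1
        if line[12:16].strip() == "CA":
            n_ca += 1
        if len(line) > 21:
            chains.add(line[21])
    return {"n_atoms": n_atoms, "n_ca": n_ca, "chains": sorted(chains)}

# Placeholder records (coordinates are made up) so this runs offline.
sample = (
    "ATOM      1  N   GLY A   1       0.000   0.000   0.000  1.00  0.00           N\n"
    "ATOM      2  CA  GLY A   1       1.458   0.000   0.000  1.00  0.00           C\n"
    "END\n"
)
print(summarize_pdb_text(sample))
```

This is only a rough check: it counts every `ATOM` line, so multi-model files and alternate locations inflate the totals. An `n_atoms` of 0 suggests the download did not return a structure.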


## 5) Saving your work and submitting

**Saving**
- If you opened from a link, click: `File → Save a copy in Drive` (recommended)
- Or download: `File → Download → .ipynb`

**Submitting**
- You will submit either:
  - a `.ipynb` file (preferred), or
  - a PDF export of the notebook, depending on the assignment.

**If you get stuck**
- Restart the runtime: `Runtime → Restart session` (labeled `Restart runtime` in older Colab versions)
- Re-run the install cell(s)
- Post the full error message and say which cell it came from
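
When posting an error, it also helps to include your environment. The sketch below builds a short report to paste alongside the traceback. `environment_report` is a hypothetical helper, and the package list simply mirrors the install cell in section 3, using pip distribution names.

```python
import platform
import sys
from importlib.metadata import PackageNotFoundError, version

def environment_report(packages):
    """Return a short text report of Python, platform, and package versions."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"Platform: {platform.platform()}",
    ]
    for name in packages:
        try:
            lines.append(f"{name}: {version(name)}")
        except PackageNotFoundError:
            # Report the gap instead of crashing on the first missing package.
            lines.append(f"{name}: not installed")
    return "\n".join(lines)

print(environment_report(["numpy", "biopython", "mdtraj", "nglview", "py3Dmol"]))
```

Unlike importing each package directly, this reads installed metadata, so one missing package does not hide the versions of the others.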


## Next steps
Open the first exercise notebook from the course site and repeat the same workflow:
1. Install packages (if needed)
2. Pull data from the repo (or download as instructed)
3. Run analysis
4. Save a copy to Drive
