# BindCraft AlphaFold2 Local Workflow

This notebook replaces the legacy ColabFold interface with a reproducible workflow that runs directly inside this repository.

## Notebook overview

This local workflow performs the following steps:
1. Verify that the environment and AlphaFold parameter files are available.
2. Configure which `settings`, `filters`, and `advanced` JSON files to use for a run.
3. Launch `bindcraft.py` exactly as it would run from the command line.
4. Inspect CSV summaries and generated structures inside the configured `design_path`.
5. Optionally visualize a ranked structure with `py3Dmol` when the package is installed.

## Prerequisites

- Run `./install_bindcraft.sh` once (or build one of the Docker images) to install BindCraft dependencies locally.
- Run `./download_alphafold_weights.sh` so every file under `params/` is present on disk.
- Keep your NVIDIA driver/CUDA stack up to date; the environment check below runs `nvidia-smi` to confirm that a GPU is visible.

In [None]:
from pathlib import Path
import os
import platform
import shutil
import subprocess

PROJECT_ROOT = Path.cwd()
if not (PROJECT_ROOT / "bindcraft.py").exists():
    if (PROJECT_ROOT.parent / "bindcraft.py").exists():
        PROJECT_ROOT = PROJECT_ROOT.parent
    else:
        raise FileNotFoundError("Run this notebook from within the BindCraftPeptide repository.")

print(f"Project root: {PROJECT_ROOT}")
print(f"Python version: {platform.python_version()}")

def run_and_report(cmd):
    """Run cmd, print it along with its stdout/stderr, and return the exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(f"$ {' '.join(cmd)}")
    if result.stdout:
        print(result.stdout.strip())
    if result.stderr:
        print(result.stderr.strip())
    return result.returncode

python_bin = os.environ.get("PYTHON", "python")
run_and_report([python_bin, "--version"])

if shutil.which("nvidia-smi"):
    run_and_report(["nvidia-smi"])
else:
    print("nvidia-smi not found; GPU acceleration may be unavailable.")

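params_dir = PROJECT_ROOT / "params"
npz_files = sorted(params_dir.glob("*.npz"))
print(f"AlphaFold parameter dir: {params_dir}")
print(f"Found {len(npz_files)} parameter files.")
if not npz_files:
    # Flag the missing prerequisite here instead of failing later inside bindcraft.py.
    print("No .npz files found; run ./download_alphafold_weights.sh before launching BindCraft.")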

## Configure a run

Edit the dictionary below to point at different JSON files, or add optional CLI flags such as `--prefilters` to `extra_args`.
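
A mistyped path otherwise surfaces only once `bindcraft.py` has started, so it can help to check the configured files first. The sketch below is one way to do that. `missing_config_files` is a hypothetical helper, not part of BindCraft, and it assumes only the three path keys used in `CONFIG`.

```python
from pathlib import Path

# Keys in CONFIG that are expected to point at JSON files.
PATH_KEYS = ("settings_path", "filters_path", "advanced_path")

def missing_config_files(config, keys=PATH_KEYS):
    """Return a {key: path} dict for every configured file that does not exist.

    Hypothetical helper for this notebook; it only checks that each path
    is an existing file and does not validate the JSON contents.
    """
    missing = {}
    for key in keys:
        path = Path(config[key]).expanduser()
        if not path.is_file():
            missing[key] = path
    return missing
```

Calling `missing_config_files(CONFIG)` after the configuration cell returns an empty dictionary when all three files are present; a non-empty result names the keys to fix before launching a run.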

In [None]:
from datetime import datetime

CONFIG = {
    "settings_path": PROJECT_ROOT / "example/pdl1/PDL1_partial.json",
    "filters_path": PROJECT_ROOT / "settings_filters/no_filters.json",
    "advanced_path": PROJECT_ROOT / "settings_advanced/test_settings_peptide_betasheet_4stage_multimer.json",
    "extra_args": [],
    "capture_output": True,
}

timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
CONFIG.setdefault("run_id", f"notebook_run_{timestamp}")

for key in ("settings_path", "filters_path", "advanced_path"):
    CONFIG[key] = Path(CONFIG[key]).expanduser().resolve()

CONFIG

In [None]:
import json

with open(CONFIG["settings_path"], "r") as f:
    target_settings = json.load(f)

design_path = Path(target_settings["design_path"]).expanduser().resolve()
design_path.mkdir(parents=True, exist_ok=True)

print(f"Design path: {design_path}")
print(f"Filters file: {CONFIG['filters_path']}")
print(f"Advanced settings file: {CONFIG['advanced_path']}")

csv_candidates = [
    design_path / "AF2_design_stats.csv",
    design_path / "failure_csv.csv",
]
for csv_path in csv_candidates:
    status = "exists" if csv_path.exists() else "will be created"
    print(f"{csv_path.name}: {status}")

## Run BindCraft

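This cell launches `bindcraft.py` as a subprocess with the configuration above. When `CONFIG["capture_output"]` is `True`, output is collected and printed only after the process exits; set it to `False` if you prefer to stream logs directly into the notebook output area.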
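
For long runs it can be useful to see log lines as they arrive and still keep a copy for later inspection. The following is a minimal standard-library sketch of that pattern. `stream_command` is a hypothetical helper name, and merging stderr into stdout is a choice made for this sketch, not something BindCraft requires.

```python
import subprocess
import sys

def stream_command(cmd, cwd=None):
    """Run cmd, echo each output line as it arrives, and return (returncode, lines)."""
    lines = []
    # stderr is merged into stdout so messages keep their relative order.
    with subprocess.Popen(
        cmd,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on the reading side in text mode
    ) as proc:
        for line in proc.stdout:
            print(line, end="")
            lines.append(line.rstrip("\n"))
    # Leaving the with-block waits for the process, so returncode is set here.
    return proc.returncode, lines
```

A Python child process may buffer its own output when writing to a pipe, so lines can still arrive in bursts; running the child with `python -u` or setting `PYTHONUNBUFFERED=1` in its environment avoids that.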

In [None]:
import shlex
import sys

cmd = [
    sys.executable,
    str(PROJECT_ROOT / "bindcraft.py"),
    "--settings", str(CONFIG["settings_path"]),
    "--filters", str(CONFIG["filters_path"]),
    "--advanced", str(CONFIG["advanced_path"])
] + list(CONFIG.get("extra_args", []))

print("Executing:", " ".join(shlex.quote(part) for part in cmd))
result = subprocess.run(
    cmd,
    cwd=PROJECT_ROOT,
    capture_output=CONFIG.get("capture_output", True),
    text=True,
)

if CONFIG.get("capture_output", True):
    if result.stdout:
        print(result.stdout)
    if result.stderr:
        print(result.stderr, file=sys.stderr)

print(f"Return code: {result.returncode}")
if result.returncode != 0:
    raise RuntimeError("bindcraft.py exited with a non-zero status. Inspect the logs above.")

## Inspect outputs

The following cells summarize CSV results, enumerate generated PDB files, and help with downstream inspection.
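
As a quick overview before opening individual files, the sketch below counts PDB files in each immediate subdirectory of a design folder. `count_pdbs_by_subdir` is a hypothetical helper. It makes no assumption about which subdirectories BindCraft creates and simply reports whatever it finds.

```python
from pathlib import Path

def count_pdbs_by_subdir(design_path):
    """Map each immediate subdirectory name to the number of .pdb files directly inside it."""
    counts = {}
    for sub in sorted(Path(design_path).iterdir()):
        if sub.is_dir():
            # Only files directly inside the subdirectory are counted, not nested ones.
            counts[sub.name] = sum(1 for _ in sub.glob("*.pdb"))
    return counts
```

Calling `count_pdbs_by_subdir(design_path)` after a run gives a one-line summary of where structures ended up.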

In [None]:
import pandas as pd

stats_path = design_path / "AF2_design_stats.csv"
failure_path = design_path / "failure_csv.csv"
accepted_dir = design_path / "Accepted"
accepted_dir.mkdir(parents=True, exist_ok=True)

if stats_path.exists():
    stats_df = pd.read_csv(stats_path)
    display(stats_df.tail())
else:
    print(f"Missing {stats_path}; run BindCraft first.")

if failure_path.exists():
    failure_df = pd.read_csv(failure_path)
    display(failure_df)

accepted_models = sorted(accepted_dir.glob("*.pdb"))
print(f"Accepted models: {len(accepted_models)}")
for pdb in accepted_models[:5]:
    print(f"- {pdb.name}")
if len(accepted_models) > 5:
    print("  ...")

## Optional: visualize a structure

If `py3Dmol` is installed locally, this cell renders the first accepted structure. Install it with `pip install py3Dmol` if needed.
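
The first accepted structure in sorted filename order is not necessarily the latest one. If you would rather view the most recently written file, a small helper like the one below could pick it. `newest_pdb` is a hypothetical name, and the sketch relies on file modification times, which may not reflect design order if files were copied or touched afterwards.

```python
from pathlib import Path

def newest_pdb(directory):
    """Return the most recently modified .pdb file in directory, or None if there is none."""
    pdbs = list(Path(directory).glob("*.pdb"))
    return max(pdbs, key=lambda p: p.stat().st_mtime, default=None)
```

Assigning `candidate = newest_pdb(accepted_dir)` in place of the `next(iter(accepted_models), None)` line keeps the rest of the visualization cell unchanged.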

In [None]:
try:
    import py3Dmol
    candidate = next(iter(accepted_models), None)
    if candidate is None:
        raise FileNotFoundError("No PDB files in Accepted/. Run BindCraft to generate structures.")
    view = py3Dmol.view(js='https://3dmol.org/build/3Dmol.js')
    view.addModel(candidate.read_text(), 'pdb')
    view.setStyle({'cartoon': {'colorscheme': 'rainbow'}})
    view.zoomTo()
    view.show()
except ImportError:
    print("py3Dmol is not installed. Run `pip install py3Dmol` to enable 3D visualization.")
except FileNotFoundError as exc:
    print(str(exc))

## Next steps

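- Adjust `CONFIG['extra_args']` to pass optional CLI flags such as `--prefilters`; switch advanced settings through `CONFIG['advanced_path']`, and change the output location through `design_path` in the settings JSON.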
- Use the generated CSV files for downstream analysis or ranking logic.
- Incorporate additional notebook cells (e.g., PyRosetta analysis, plotting) tailored to your project.
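
As a starting point for the ranking logic mentioned above, the sketch below sorts the rows of a CSV file by one numeric column using only the standard library. `rank_rows` is a hypothetical helper, and the column name is passed in by the caller; the sketch does not assume any particular column exists in `AF2_design_stats.csv`.

```python
import csv
import math

def rank_rows(csv_path, column, top_n=5, descending=True):
    """Return up to top_n rows of a CSV, sorted numerically by `column`.

    Rows whose value in `column` is missing, non-numeric, or NaN are skipped.
    """
    scored = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            try:
                value = float(row[column])
            except (KeyError, TypeError, ValueError):
                continue
            if math.isnan(value):
                continue
            scored.append((value, row))
    scored.sort(key=lambda pair: pair[0], reverse=descending)
    return [row for _, row in scored[:top_n]]
```

Each returned row is a dictionary keyed by the CSV header, so the result can be printed directly or passed to `pd.DataFrame` for display. Pass `descending=False` for metrics where lower values are better.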