# Paper Reproduction Launcher (Llama Runner)

This Colab-friendly notebook drives the existing CLI scripts in `summarization/` to run very small LED and Llama jobs. It uses the `train_last_100.json` / `valid_last_100.json` splits so we can exercise the full code paths quickly before launching the long runs described in the paper.

In [1]:
!sudo apt-get update -y
!sudo apt-get install -y python3.10 python3.10-dev python3.10-distutils

# Point 'python' and 'python3' to python3.10
!sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

# Reinstall pip for Python 3.10
!curl -sS https://bootstrap.pypa.io/get-pip.py | sudo python3.10

Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:2 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:3 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:4 https://cli.github.com/packages stable InRelease
Get:5 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InReleas

In [2]:
import sys
sys.version
# assert '3.10' in sys.version
# !python3.10 --version

'3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]'
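
The version string above comes from the notebook kernel, which keeps running on its original interpreter (3.12 here) even after Python 3.10 is installed; only commands launched explicitly as `python3.10` use the new one. A minimal sketch for confirming what each executable reports, where the helper name `interpreter_version` is made up for illustration:

```python
import shutil
import subprocess
import sys
from typing import Optional


def interpreter_version(executable: str) -> Optional[str]:
    """Return the 'major.minor' version reported by `executable`, or None if it cannot be found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-c", "import sys; print('%d.%d' % sys.version_info[:2])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The kernel's own version versus the separately installed interpreter.
print("kernel:", "%d.%d" % sys.version_info[:2])
print("python3.10:", interpreter_version("python3.10"))
```

If the second line prints `None`, the `!python3.10 ...` commands later in the notebook would not find the interpreter.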

In [3]:
# @title Sync the local repository from Google Drive (no git clone)
from pathlib import Path
import os
import importlib.util
import subprocess

try:
    COLAB = importlib.util.find_spec("google.colab") is not None
except ModuleNotFoundError:  # raised when the parent package `google` is absent
    COLAB = False
if COLAB:
    from google.colab import drive  # type: ignore
    drive.mount('/content/drive', force_remount=True)
    REPO_IN_DRIVE = Path('/content/drive/Othercomputers/My Mac/patient_summaries_with_llms/')  # @param {type:"string"}
    if not REPO_IN_DRIVE.exists():
        raise FileNotFoundError(f"Upload/sync the local repo to {REPO_IN_DRIVE} first.")
    TARGET_DIR = Path('/content/patient_summaries_with_llms')
    TARGET_DIR.mkdir(parents=True, exist_ok=True)
    # Copy only what the runs need; check=True surfaces rsync failures instead of silently ignoring them.
    for item in ("summarization", "data", "requirements.txt", "requirements-llama.txt"):
        subprocess.run(
            ["rsync", "--progress", "-a", "--delete", str(REPO_IN_DRIVE / item), f"{TARGET_DIR}/"],
            check=True,
        )
    %cd /content/patient_summaries_with_llms
else:
    print("Running outside Colab; using the current working directory.")
    TARGET_DIR = Path.cwd()

Mounted at /content/drive
/content/patient_summaries_with_llms


In [4]:
# @title Install Llama dependencies from requirements-llama.txt
%pip install -q -r requirements-llama.txt

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for rouge-score (pyproject.toml) ... done


## Llama LoRA smoke run (`summarization/fine_tune_llama.py`)

Runs a tiny LoRA training job on the same 100-example dataset subset. This assumes you already have access to `meta-llama/Llama-2-7b-hf` and are on a GPU runtime; set `HF_TOKEN` in the environment if the model download requires authentication.
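
The two prerequisites named above (a Hugging Face token and a GPU runtime) can be checked before launching anything expensive. A rough sketch: the `preflight` helper is hypothetical, and looking for `nvidia-smi` on `PATH` is only a proxy for GPU availability, not something the repository does itself.

```python
import os
import shutil
from typing import List, Mapping, Optional


def preflight(env: Mapping[str, str], nvidia_smi: Optional[str]) -> List[str]:
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set; downloading meta-llama/Llama-2-7b-hf may fail.")
    if nvidia_smi is None:
        problems.append("nvidia-smi not found on PATH; the runtime may not have a GPU attached.")
    return problems


# Check the current environment and report anything that looks off.
for problem in preflight(os.environ, shutil.which("nvidia-smi")):
    print("WARNING:", problem)
```

Passing the environment and the `nvidia-smi` lookup in as arguments keeps the check easy to exercise without a GPU.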

In [8]:
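# NOTE: `bert` on PyPI is an unrelated serialization library (it pulls in `erlastic`, see the log below); `bert_score` is the metric package.
%pip install bert sacremoses fire bert_score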

Collecting bert
  Downloading bert-2.2.0.tar.gz (3.5 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting sacremoses
  Downloading sacremoses-0.1.1-py3-none-any.whl.metadata (8.3 kB)
Collecting fire
  Downloading fire-0.7.1-py3-none-any.whl.metadata (5.8 kB)
Collecting erlastic (from bert)
  Downloading erlastic-2.0.0.tar.gz (6.8 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting termcolor (from fire)
  Downloading termcolor-3.2.0-py3-none-any.whl.metadata (6.4 kB)
Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.5/897.5 kB 15.2 MB/s 0:00:00
Downloading fire-0.7.1-py3-none-any.whl (115 kB)
Downloading termcolor-3.2.0-py3-no

In [10]:
# @title Configure HF token, W&B mode, and dataset directory
import os
from pathlib import Path
from google.colab import userdata


os.environ["HF_TOKEN"] = userdata.get("HF_TOKEN")
os.environ["WANDB_MODE"] = "offline"
%env MPLBACKEND=Agg

DATA_DIR = Path("data/ann-pt-summ/1.0.1/mimic-iv-note-ext-di-bhc/dataset")
assert DATA_DIR.exists(), f"Missing dataset folder: {DATA_DIR}"

env: MPLBACKEND=Agg
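
Before launching the run, it can help to confirm that the small splits named in the introduction are actually present. The sketch below assumes `train_last_100.json` and `valid_last_100.json` sit directly inside `DATA_DIR`, which the notebook does not state; `missing_splits` is an illustrative helper, not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

SPLITS = ("train_last_100.json", "valid_last_100.json")


def missing_splits(data_dir: Path, names: Iterable[str] = SPLITS) -> List[str]:
    """Return the split file names that are absent from `data_dir`."""
    return [name for name in names if not (data_dir / name).is_file()]


# Assumed location of the splits; adjust if they live elsewhere.
data_dir = Path("data/ann-pt-summ/1.0.1/mimic-iv-note-ext-di-bhc/dataset")
absent = missing_splits(data_dir)
print("missing:", absent if absent else "none")
```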


In [13]:
# @title Llama 2 7B LoRA: evaluate a saved checkpoint on 100 test examples
!python3.10 summarization/fine_tune_llama.py \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --data_path {DATA_DIR} \
    --output_path results/llama_full_run_predict \
    --evaluation \
    --evaluation_model_path data/llama_4000_600_chars/best_val_loss \
    --num_test_examples 100

data: data/ann-pt-summ/1.0.1/mimic-iv-note-ext-di-bhc/dataset
wandb: Tracking run with wandb version 0.16.6
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-hf/snapshots/01c7f73d771dfac7d292323805ebc428287df4f9/config.json
Model config LlamaConfig {
  "_name_or_path": "meta-llama/Llama-2-7b-hf",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e