# Beyond-next-token-probabilities: create 100 HD samples (IMDB)

Run this notebook on a **GPU machine** with CUDA.

It covers the full setup, starting from a fresh machine:
- clones the repository if it is not already present
- installs the Python dependencies
- applies the required patches
- generates **100** HD raw samples (IMDB)

> SLURM tip (recommended): start Jupyter *inside* a GPU allocation:
> ```bash
> srun --gres=gpu:1 --cpus-per-task=4 --mem=24G --time=02:00:00 --pty bash
> conda activate LOS_Net
> cd ~/DL-236781
> jupyter lab --no-browser --ip 127.0.0.1 --port 8891
> ```
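
To confirm that the kernel really runs inside an allocation, it can help to look at the environment before touching CUDA. The sketch below is a minimal check, not part of the original notebook. The helper name `slurm_allocation_summary` is hypothetical. It assumes SLURM exports `SLURM_JOB_ID` inside a job and that the cluster sets `CUDA_VISIBLE_DEVICES` when a GPU is granted, which depends on the site configuration.

```python
import os

def slurm_allocation_summary(env=None):
    """Summarize SLURM/GPU-related environment variables.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    job_id = env.get("SLURM_JOB_ID")           # exported by SLURM inside a job
    visible = env.get("CUDA_VISIBLE_DEVICES")  # assumption: set by the cluster when a GPU is granted
    return {
        "in_slurm_job": job_id is not None,
        "job_id": job_id,
        "cuda_visible_devices": visible,
    }

print(slurm_allocation_summary())
```

If `in_slurm_job` is `False`, Jupyter was probably started on a login node rather than inside the `srun` shell.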

In [1]:
import sys, platform
import torch

print("python:", sys.version)
print("platform:", platform.platform())
print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("cuda device count:", torch.cuda.device_count())
if torch.cuda.is_available():
    print("gpu:", torch.cuda.get_device_name(0))
else:
    raise RuntimeError("CUDA is not available. Run this notebook on a GPU node with CUDA-enabled PyTorch.")

python: 3.10.15 (main, Oct  3 2024, 07:27:34) [GCC 11.2.0]
platform: Linux-5.4.0-214-generic-x86_64-with-glibc2.31
torch: 2.2.0+cu121
cuda available: True
cuda device count: 1
gpu: NVIDIA GeForce RTX 2080 Ti


In [2]:
from pathlib import Path
import os

WORKDIR = Path.home() / "DL-236781"
REPO_DIR = WORKDIR / "Beyond-next-token-probabilities"

WORKDIR.mkdir(parents=True, exist_ok=True)
os.chdir(WORKDIR)
print("WORKDIR:", WORKDIR)

if not REPO_DIR.exists():
    !git clone https://github.com/BarSGuy/Beyond-next-token-probabilities.git
else:
    print("Repo already exists:", REPO_DIR)

os.chdir(REPO_DIR)
!pwd
!git rev-parse --short HEAD

WORKDIR: /home/shaked-adi/DL-236781
Repo already exists: /home/shaked-adi/DL-236781/Beyond-next-token-probabilities
/home/shaked-adi/DL-236781/Beyond-next-token-probabilities
fb7ca51


## Install required Python packages

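Pins `bitsandbytes` to 0.43.1 when the installed torch is older than 2.3; otherwise installs the latest release. `transformers` is pinned to 4.42.3 in both cases.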

In [3]:
import re, sys, subprocess

def pip_install(*args):
    cmd = [sys.executable, "-m", "pip", "install", "-U", *args]
    print(" ".join(cmd))
    subprocess.check_call(cmd)

pip_install("datasets", "accelerate", "nvidia-ml-py3", "transformers==4.42.3")
pip_install("git+https://github.com/davidbau/baukit.git")

import torch
m = re.match(r"(\d+)\.(\d+)\.(\d+)", torch.__version__)
major, minor = (int(m.group(1)), int(m.group(2))) if m else (2, 2)

# torch < 2.3 -> bitsandbytes 0.43.1 works well
if (major, minor) < (2, 3):
    pip_install("--no-deps", "bitsandbytes==0.43.1")
else:
    pip_install("bitsandbytes")

import datasets, baukit, bitsandbytes
print("datasets:", datasets.__version__)
print("bitsandbytes:", bitsandbytes.__version__)
print("baukit import: ok")

/home/shaked-adi/miniconda3/envs/LOS_Net/bin/python -m pip install -U datasets accelerate nvidia-ml-py3 transformers==4.42.3
/home/shaked-adi/miniconda3/envs/LOS_Net/bin/python -m pip install -U git+https://github.com/davidbau/baukit.git
Collecting git+https://github.com/davidbau/baukit.git
  Cloning https://github.com/davidbau/baukit.git to /tmp/pip-req-build-spg3e5lu


  Running command git clone --filter=blob:none --quiet https://github.com/davidbau/baukit.git /tmp/pip-req-build-spg3e5lu


  Resolved https://github.com/davidbau/baukit.git to commit 9d51abd51ebf29769aecc38c4cbef459b731a36e
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
/home/shaked-adi/miniconda3/envs/LOS_Net/bin/python -m pip install -U --no-deps bitsandbytes==0.43.1


datasets: 4.5.0
bitsandbytes: 0.43.1
baukit import: ok
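
The pinning rule in the cell above can be isolated into a small pure function, which makes it easy to check against version strings such as `2.2.0+cu121` without importing torch. This is only a sketch of the same logic. The function name `bitsandbytes_requirement` is hypothetical, and the fallback for unparseable versions mirrors the `(2, 2)` default used in the cell.

```python
import re

PINNED = "bitsandbytes==0.43.1"

def bitsandbytes_requirement(torch_version: str) -> str:
    """Return the pip requirement for bitsandbytes given a torch version string."""
    m = re.match(r"(\d+)\.(\d+)", torch_version)
    if m is None:
        # Unparseable version: behave like the (2, 2) default and use the pin.
        return PINNED
    major, minor = int(m.group(1)), int(m.group(2))
    # Tuple comparison is numeric, so (2, 10) correctly sorts after (2, 3).
    return PINNED if (major, minor) < (2, 3) else "bitsandbytes"

for v in ("2.2.0+cu121", "2.3.1", "2.10.0"):
    print(v, "->", bitsandbytes_requirement(v))
```

Comparing integer tuples rather than strings avoids the classic mistake where `"2.10"` sorts before `"2.3"`.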


## Patch 1: Fix IMDB dataset name (HF)

Replace `load_dataset("imdb")` with `load_dataset("stanfordnlp/imdb")`.

In [4]:
from pathlib import Path

p = Path("utils/datasets_HD_helper.py")
txt = p.read_text()

before = 'load_dataset("imdb")'
after  = 'load_dataset("stanfordnlp/imdb")'

if before in txt:
    p.write_text(txt.replace(before, after))
    print("Patched:", p)
else:
    print("No change needed (already patched or different code).")

No change needed (already patched or different code).
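
The message above cannot tell "already patched" apart from "the expected code is not there". A possible refinement, sketched below, returns an explicit status so that a silent mismatch with a newer upstream file is visible. The helper `replace_once` and its status strings are hypothetical and not part of the repository.

```python
def replace_once(text: str, before: str, after: str):
    """Return (new_text, status).

    status is 'patched', 'already_patched', or 'not_found'.
    """
    if before in text:
        return text.replace(before, after), "patched"
    if after in text:
        return text, "already_patched"
    return text, "not_found"

before = 'load_dataset("imdb")'
after = 'load_dataset("stanfordnlp/imdb")'

sample = 'ds = load_dataset("imdb")'
patched, status = replace_once(sample, before, after)
print(status, "|", patched)
print(replace_once(patched, before, after)[1])
```

This works here because `before` is not a substring of `after`; if it were, re-running the patch would keep rewriting the file and a different guard would be needed.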


## Patch 2: Force 4-bit model loading on GPU

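Overwrites `utils/LLM_helpers.py` with a minimal loader that quantizes the model to 4-bit NF4 via bitsandbytes (with double quantization and float16 compute), places every module on GPU 0, and asserts that nothing was offloaded to CPU or disk.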

In [5]:
from pathlib import Path

content = '''\
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

def load_model_and_validate_gpu(model_name: str):
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_use_double_quant=True,
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map={"": 0},
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
    )

    assert all(d not in ("cpu", "disk") for d in model.hf_device_map.values()), model.hf_device_map
    return model, tokenizer
'''
Path("utils/LLM_helpers.py").write_text(content)
print("Rewrote utils/LLM_helpers.py")

Rewrote utils/LLM_helpers.py
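
Because this patch replaces the whole file, anything else defined in the upstream `utils/LLM_helpers.py` is lost. One cautious variant, sketched here under the assumption that keeping a local copy is acceptable, saves the original once before overwriting. The helper `write_with_backup` and the `.orig` suffix are hypothetical choices, not something the repository provides.

```python
import shutil
from pathlib import Path

def write_with_backup(path, content, suffix=".orig"):
    """Write `content` to `path`, first copying any existing file to `path + suffix`.

    An existing backup is never overwritten, so re-running the cell
    keeps the pristine upstream copy.
    """
    path = Path(path)
    backup = path.with_name(path.name + suffix)
    if path.exists() and not backup.exists():
        shutil.copy2(path, backup)
    path.write_text(content)
    return backup
```

In the cell above, `write_with_backup("utils/LLM_helpers.py", content)` would take the place of the bare `write_text` call. If the file is tracked by git, `git checkout -- utils/LLM_helpers.py` restores the upstream version as well.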


## Optional: Check GPU free memory (NVML)

In [6]:
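import pynvml

pynvml.nvmlInit()
h = pynvml.nvmlDeviceGetHandleByIndex(0)
name = pynvml.nvmlDeviceGetName(h)
# Depending on the pynvml version, the name comes back as bytes or str.
if isinstance(name, bytes):
    name = name.decode()
mem = pynvml.nvmlDeviceGetMemoryInfo(h)
print("GPU:", name)
print("free_GB:", round(mem.free / 1024**3, 2), "/", round(mem.total / 1024**3, 2))
pynvml.nvmlShutdown()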

GPU: NVIDIA GeForce RTX 2080 Ti
free_GB: 10.75 / 11.0
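
The free-memory figure is most useful when compared with a rough estimate of the model's weight footprint. The sketch below is back-of-the-envelope arithmetic only: weights take roughly `n_params * bits / 8` bytes. The 7-billion parameter count and the 2 GiB headroom are hypothetical values chosen for illustration, since the model is not named in this part of the notebook. The estimate ignores quantization constants, activations, the KV cache, and any layers that stay in higher precision.

```python
def estimate_weight_gib(n_params: float, bits_per_param: float = 4.0) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes."""
    return n_params * bits_per_param / 8 / 1024**3

def fits_on_gpu(n_params, free_gib, bits_per_param=4.0, headroom_gib=2.0):
    """True if the estimated weights plus a safety headroom fit into free memory."""
    return estimate_weight_gib(n_params, bits_per_param) + headroom_gib <= free_gib

N_PARAMS = 7e9    # hypothetical model size, not taken from the notebook
FREE_GIB = 10.75  # free memory reported by the NVML cell above

for bits in (4, 16):
    gib = estimate_weight_gib(N_PARAMS, bits)
    print(f"{bits}-bit: ~{gib:.2f} GiB, fits: {fits_on_gpu(N_PARAMS, FREE_GIB, bits)}")
```

Under these assumptions, 4-bit weights fit comfortably on the 11 GB card while float16 weights do not, which is the motivation for forcing 4-bit loading in Patch 2.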
