# BatFit LoRA Training on Colab
This notebook clones the repository, installs the lightweight dependencies from `requirements-colab.txt`, optionally mounts Google Drive for artifact storage, and launches `scripts/train_lora.py` on a Colab GPU runtime. To change hyperparameters, set the corresponding environment variables before running the training cell.

## Clone private repo
The cell below needs a GitHub personal access token (PAT) with `repo` scope. It reads the token from `GIT_TOKEN` and prompts for it if the variable isn't set. Set `GIT_REPO` to clone from a different URL.

In [None]:
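import getpass
import os

# Prompt for the PAT only if it is not already in the environment.
if 'GIT_TOKEN' not in os.environ:
    os.environ['GIT_TOKEN'] = getpass.getpass('GitHub PAT (repo scope): ')
repo_url = os.environ.get('GIT_REPO', 'https://github.com/wahajaslm/batfit.git')
token = os.environ['GIT_TOKEN']
# Embed the token in the HTTPS URL so git can authenticate without a prompt.
# Note: git stores this URL, token included, in batfit/.git/config.
safe_url = repo_url.replace('https://', f'https://{token}:x-oauth-basic@', 1)
# Clone into an explicit directory so the %cd below works for any GIT_REPO.
!git clone $safe_url batfit
%cd batfit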
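
The clone cell builds the authenticated URL with a plain string replace. As an alternative, here is a minimal sketch that builds it with `urllib.parse` and masks the token before anything is printed. The helper names `with_token` and `redact` are hypothetical and are not part of the repository.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_token(repo_url: str, token: str) -> str:
    """Return repo_url with the token embedded as HTTPS basic-auth credentials."""
    parts = urlsplit(repo_url)
    if parts.scheme != "https":
        raise ValueError("expected an https:// repository URL")
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    # Percent-encode the token so reserved characters cannot break the URL.
    netloc = f"{quote(token, safe='')}:x-oauth-basic@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def redact(text: str, token: str) -> str:
    """Mask the token before printing or logging a string that may contain it."""
    return text.replace(token, "***") if token else text


# Illustrative values only; a real token should come from getpass or the environment.
url = with_token("https://github.com/wahajaslm/batfit.git", "example-token")
print(redact(url, "example-token"))
```

Parsing the URL rather than replacing the `https://` prefix rejects non-HTTPS URLs explicitly instead of silently leaving them unauthenticated.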

In [None]:
!pip install -q -r requirements-colab.txt

## (Optional) Mount Drive
Uncomment the lines in the following cell if you want to save checkpoints to Drive.

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')
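
Once Drive is mounted, checkpoints still need a destination directory. The sketch below picks a Drive folder when the mount exists and falls back to a local folder otherwise. It assumes the usual Colab layout, where a mount at `/content/drive` exposes files under `/content/drive/MyDrive`. The folder name `batfit-checkpoints` and the helper `pick_output_dir` are hypothetical, and whether `scripts/train_lora.py` accepts an output directory is not shown in this notebook.

```python
import os


def pick_output_dir(drive_root="/content/drive/MyDrive",
                    fallback="outputs",
                    subdir="batfit-checkpoints"):
    """Return a writable checkpoint directory, preferring Drive when mounted."""
    if os.path.isdir(drive_root):
        # Drive is mounted: keep artifacts where they survive a runtime reset.
        target = os.path.join(drive_root, subdir)
    else:
        # No mount: use a local directory, which is lost when the runtime ends.
        target = fallback
    os.makedirs(target, exist_ok=True)
    return target
```

Calling `pick_output_dir()` after the mount cell returns the Drive path if the mount succeeded and `outputs` otherwise.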

## Configure env vars
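Set the base model with `BATFIT_BASE_MODEL` (TinyLlama by default), and adjust the sequence length (`BATFIT_MAX_LEN`) or the number of epochs (`BATFIT_EPOCHS`) when experimenting. Values already present in the environment are kept.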

In [None]:
import os

# Keep any value that is already set; otherwise fall back to the default.
os.environ.setdefault('BATFIT_BASE_MODEL', 'TinyLlama/TinyLlama-1.1B-Chat-v1.0')
os.environ.setdefault('BATFIT_MAX_LEN', '768')  # maximum sequence length
os.environ.setdefault('BATFIT_EPOCHS', '2')  # number of training epochs
print('Base model:', os.environ['BATFIT_BASE_MODEL'])
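
Environment variables are always strings, so the consumer has to parse them. How `scripts/train_lora.py` actually does this is not shown here. The following is only a sketch of one way to read and validate the three variables, with a hypothetical `read_config` helper.

```python
import os

# Defaults mirror the values set in the configuration cell.
DEFAULTS = {
    "BATFIT_BASE_MODEL": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "BATFIT_MAX_LEN": "768",
    "BATFIT_EPOCHS": "2",
}


def _positive_int(name, raw):
    """Parse raw as an integer and reject zero or negative values."""
    value = int(raw)  # raises ValueError for non-numeric text
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


def read_config(env=None):
    """Build a typed config dict from an environment-like mapping."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "base_model": raw["BATFIT_BASE_MODEL"],
        "max_len": _positive_int("BATFIT_MAX_LEN", raw["BATFIT_MAX_LEN"]),
        "epochs": _positive_int("BATFIT_EPOCHS", raw["BATFIT_EPOCHS"]),
    }


# Passing a plain dict makes the override easy to see without touching os.environ.
print(read_config({"BATFIT_EPOCHS": "3"}))
```

Validating up front means a typo such as `BATFIT_EPOCHS=two` fails immediately rather than partway through a training run.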

## Launch training
Make sure the notebook's working directory is the project root: the cell below runs `scripts/train_lora.py` by relative path, so it fails if run from anywhere else.

In [None]:
!python scripts/train_lora.py
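
If the launch fails because the script cannot be found, the working directory is probably not the project root. A small sketch for locating the root by walking up from the current directory until `scripts/train_lora.py` appears. The helper `find_project_root` is hypothetical and not part of the repository.

```python
import os


def find_project_root(start=".", marker=os.path.join("scripts", "train_lora.py")):
    """Walk upward from start and return the first directory containing marker."""
    path = os.path.abspath(start)
    while True:
        if os.path.isfile(os.path.join(path, marker)):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root without finding the marker.
            return None
        path = parent
```

In a notebook cell, `root = find_project_root()` followed by `os.chdir(root)` when `root` is not `None` restores the expected working directory before the training cell is re-run.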