# Tiny Recursive Model (TRM) Training on Colab

This notebook guides you through setting up and training the TRM on Google Colab.

## 1. Setup

First, clone the repository and install its dependencies.

In [None]:
!git clone https://github.com/dauvannam1804/TinyRecursiveModels.git
%cd TinyRecursiveModels
!pip install -r requirements.txt
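
As a quick sanity check after cloning, the sketch below confirms that the files this notebook refers to are present in the current working directory. The helper name `missing_paths` is hypothetical and not part of the repository; the list contains only paths mentioned in this notebook.

```python
from pathlib import Path

# Paths referenced by the cells in this notebook.
EXPECTED = [
    "requirements.txt",
    "create_sample.py",
    "scripts/train_tokenizer.sh",
    "scripts/run_train.sh",
    "src/config.py",
]


def missing_paths(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


# Hypothetical helper: run from the repository root (after the %cd above).
missing = missing_paths(".", EXPECTED)
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All expected files found.")
```

If anything is reported missing, the most likely cause is that the `%cd` line did not run, so the notebook is still in the parent directory.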

## 2. Generate Data

Run `create_sample.py` to generate the training (20k) and validation (5k) datasets.

In [None]:
!python create_sample.py
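
To confirm the generated sizes, a minimal sketch like the following counts the rows in each CSV. It assumes the script writes CSV files with a header row under `data/processed/` (the directory named in the tokenizer step); the helper `count_csv_rows` is hypothetical, and the path should be adjusted if the script writes elsewhere.

```python
import csv
from pathlib import Path


def count_csv_rows(path):
    """Return the number of data rows (header excluded), or None if the file is missing."""
    path = Path(path)
    if not path.is_file():
        return None
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to exist)
        return sum(1 for _ in reader)


# Assumed output directory; not confirmed by the notebook for create_sample.py.
data_dir = Path("data/processed")
csv_files = sorted(data_dir.glob("*.csv")) if data_dir.is_dir() else []
if not csv_files:
    print(f"No CSV files found in {data_dir}.")
for csv_file in csv_files:
    print(f"{csv_file}: {count_csv_rows(csv_file)} rows")
```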

## 3. Train Tokenizer

Train a custom tokenizer on the dataset. By default, the script uses `data/processed/sample_1k.csv`.

In [None]:
!chmod +x scripts/train_tokenizer.sh
!./scripts/train_tokenizer.sh

## 4. Train Model

Start training the model. To change the training configuration, edit `src/config.py` before running this cell.
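
Before editing, it can help to view the current settings. The sketch below simply prints the contents of `src/config.py`; it makes no assumption about what the file defines, and the helper `read_text_if_exists` is hypothetical.

```python
from pathlib import Path


def read_text_if_exists(path):
    """Return the file's text, or None if the file does not exist."""
    p = Path(path)
    return p.read_text(encoding="utf-8") if p.is_file() else None


config_text = read_text_if_exists("src/config.py")
if config_text is None:
    print("src/config.py not found. Run the setup cell first.")
else:
    print(config_text)
```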

In [None]:
!chmod +x scripts/run_train.sh
!./scripts/run_train.sh

## 5. Plot Results

Plot the training and validation loss recorded in `checkpoints/history.json`.

In [None]:
import json
import os

import matplotlib.pyplot as plt

# Loss history file; expected to exist once training has run.
history_path = "checkpoints/history.json"
if os.path.exists(history_path):
    with open(history_path, "r") as f:
        history = json.load(f)

    # Draw both curves on the same axes so they can be compared directly.
    plt.figure(figsize=(10, 5))
    plt.plot(history["train_loss"], label="Train Loss")
    plt.plot(history["val_loss"], label="Val Loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.title("Training and Validation Loss")
    plt.show()
else:
    print("History file not found. Train the model first.")
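
Alongside the plot, it is often useful to know which epoch reached the lowest validation loss. The sketch below assumes, as the plot does, that `val_loss` is a list with one value per epoch. The helper `best_epoch` is hypothetical, and the numbers in `example_history` are made up for illustration; they are not results from a real run.

```python
def best_epoch(val_losses):
    """Return (1-based epoch, loss) for the lowest validation loss, or None if the list is empty."""
    if not val_losses:
        return None
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


# Illustrative values only; replace with the `history` dict loaded above.
example_history = {
    "train_loss": [2.1, 1.6, 1.2, 1.0],
    "val_loss": [2.0, 1.7, 1.5, 1.6],
}
print(best_epoch(example_history["val_loss"]))  # (3, 1.5)
```

A validation loss that bottoms out and then rises while the training loss keeps falling, as in the illustrative values, is the usual sign of overfitting.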