# 🚀 Manus Fullstack Specialization: Unsloth Training (Colab)

This notebook runs the **Manus Replica** training pipeline using **Unsloth**, which advertises up to 2x faster training and up to 70% less memory usage.

### **Hardware Requirements**
- **Free Tier**: Tesla T4 (16GB) -> Runs Qwen 2.5 7B
- **Pro**: A100 (40GB+) -> May run Qwen 2.5 72B (not guaranteed)

The scripts automatically detect your GPU and switch to the 7B model if you are on the Free Tier.
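The selection logic can be sketched roughly as follows. This is a minimal sketch, not the scripts' actual code: the helper names `pick_model_name` and `detect_vram_gb`, the 38 GB threshold, and the 72B checkpoint name are assumptions for illustration. Only the 7B checkpoint name appears in this notebook (in Step 3).

```python
def pick_model_name(vram_gb: float, pro_threshold_gb: float = 38.0) -> str:
    """Return a checkpoint name based on available GPU memory (hypothetical helper)."""
    # Assumption: the threshold sits a little below the nominal 40 GB because
    # GPUs usually report slightly less memory than their nominal capacity.
    if vram_gb >= pro_threshold_gb:
        # Assumed name for the larger model; verify it before relying on it.
        return "unsloth/Qwen2.5-72B-Instruct-bnb-4bit"
    # Free Tier (e.g. T4, 16 GB): fall back to the 7B checkpoint named in Step 3.
    return "unsloth/Qwen2.5-7B-Instruct-bnb-4bit"


def detect_vram_gb() -> float:
    """Total memory of GPU 0 in GB, or 0.0 when torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


if __name__ == "__main__":
    print(pick_model_name(detect_vram_gb()))
```

Keeping the decision in a pure function of the memory size makes it easy to check without a GPU attached.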

In [None]:
%%capture
# 1. Install Unsloth & Dependencies
# (%%capture is a cell magic and must be the first line of the cell.)
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps trl peft accelerate bitsandbytes

In [None]:
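# 2. Clone/Set Up Repository (simulated for this artifacts folder)
# In a real scenario, you would git clone. Here we assume the files are present or uploaded.
# If running via this artifact, ensure the .py files are in the working directory.
import os
# Every script invoked by the steps in this notebook.
required = ["02_generate_trajectories.py", "03_validate_trajectories.py", "04_sft_training.py",
            "05_rejection_sampling.py", "06_grpo_training.py"]
missing = [name for name in required if not os.path.exists(name)]
if missing:
    print("⚠️ Please upload these Python scripts to the Colab runtime files:", ", ".join(missing))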

## Step 1: Generate Data (Fast Mode)
We will generate a small batch for demonstration. The script supports 1B scale.
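If the script does not already expose a small-batch switch, one could be added along these lines. This is a hypothetical sketch, not the actual interface of `02_generate_trajectories.py`: the `parse_args` helper, the `--num-trajectories` option, and both batch sizes are assumptions, including the reading of "1B scale" as one billion trajectories. Only the `--demo` flag name comes from this notebook.

```python
import argparse

FULL_RUN = 1_000_000_000  # assumption: "1B scale" means one billion trajectories
DEMO_RUN = 100            # hypothetical demo batch size


def parse_args(argv=None):
    """Parse CLI options; --demo selects a small batch unless a count is given."""
    parser = argparse.ArgumentParser(description="Generate training trajectories.")
    parser.add_argument("--demo", action="store_true",
                        help="generate a small demonstration batch")
    parser.add_argument("--num-trajectories", type=int, default=None,
                        help="explicit number of trajectories (overrides --demo)")
    args = parser.parse_args(argv)
    if args.num_trajectories is None:
        args.num_trajectories = DEMO_RUN if args.demo else FULL_RUN
    return args
```

With this shape, `main()` would read `args.num_trajectories` instead of a hard-coded count, so the default run stays at full scale while the notebook stays fast.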

In [None]:
# If 1B scale is the script's default, you may need to edit main() to run a smaller batch.
!python 02_generate_trajectories.py --demo

## Step 2: Validation

In [None]:
!python 03_validate_trajectories.py

## Step 3: SFT Training (Unsloth Optimized)
The script `04_sft_training.py` has been patched to detect a T4 GPU automatically and, when it finds one, to load `unsloth/Qwen2.5-7B-Instruct-bnb-4bit`.

In [None]:
!python 04_sft_training.py

## Step 4: GRPO Reinforcement Learning
Uses `GRPOTrainer` to optimize for stack compliance and replica features.
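To make the idea concrete, here is a minimal sketch of what a stack-compliance reward might look like. TRL's `GRPOTrainer` accepts reward functions that receive the sampled completions and return one score per completion; GRPO then normalizes those scores within each group of samples drawn for the same prompt. The `REQUIRED_STACK` keywords and both function names are hypothetical, the completions are assumed to be plain strings, and the actual rewards in `06_grpo_training.py` may differ.

```python
import statistics

# Hypothetical stack keywords; replace with whatever the target stack requires.
REQUIRED_STACK = ("react", "typescript", "tailwind")


def stack_compliance_reward(completions, **kwargs):
    """Score each completion by the fraction of required keywords it mentions."""
    rewards = []
    for text in completions:
        lowered = text.lower()
        hits = sum(keyword in lowered for keyword in REQUIRED_STACK)
        rewards.append(hits / len(REQUIRED_STACK))
    return rewards


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group: (r - mean) / (std + eps).

    Uses the population standard deviation; library implementations
    may differ in such details.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

Completions that score above their group's mean get a positive advantage and are reinforced, while those below the mean are pushed down, so the reward only needs to rank samples sensibly rather than be calibrated in absolute terms.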

In [None]:
# Simulate Rejection Sampling first (needed for GRPO input)
!python 05_rejection_sampling.py
!python 06_grpo_training.py

In [None]:
# Export Model to GGUF (Optional)
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained("checkpoints/stage3_grpo/final", load_in_4bit=True)
model.save_pretrained_gguf("model_gguf", tokenizer, quantization_method = "q4_k_m")