# Cloud GPU Environment Setup Guide

This notebook provides step-by-step instructions for setting up GPU-accelerated environments for the LLM Spelling Project using:
1. Google Colab
2. Lightning.ai

## Prerequisites
- A Google account (for Colab)
- A Lightning.ai account
- A Weights & Biases account
- A Hugging Face account

## Environment Setup Overview
1. Install required packages
2. Configure authentication
3. Set up data synchronization
4. Test GPU availability
5. Run example fine-tuning code

## 1. Google Colab Setup

In [None]:
# Verify GPU availability
!nvidia-smi
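
`nvidia-smi` prints a table meant for reading, which is awkward to branch on in later cells. The same information can be read from Python. This is a minimal sketch using only the standard library; the helper name `list_gpus` is hypothetical, and it assumes the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi` are available on the runtime.

```python
import shutil
import subprocess


def list_gpus():
    """Return one 'name, total memory' string per visible NVIDIA GPU.

    Returns an empty list when nvidia-smi is missing or exits with an
    error, which is the typical situation on a CPU-only runtime.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
if gpus:
    print("GPUs found:", gpus)
else:
    print("No GPU detected; in Colab, check Runtime > Change runtime type.")
```

An empty list here usually means the runtime was started without a GPU, so it is worth fixing before installing anything else.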

In [None]:
!pip install unsloth torch transformers datasets wandb dspy lightning matplotlib seaborn pandas jupyter notebook ipywidgets

In [None]:
# Weights & Biases setup
import wandb
wandb.login()  # You'll need to enter your API key

# Hugging Face setup
from huggingface_hub import login
login()  # You'll need to enter your token

In [None]:
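# Replace "yourusername" with the account that hosts the repository
!git clone https://github.com/yourusername/raspberry.git
# Use the %cd magic: "!cd" runs in a temporary subshell and does not change the notebook's working directory
%cd raspberry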

In [None]:
from unsloth import FastLanguageModel
import torch

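# Initialize model with Unsloth optimizations
# Example model; it is gated, so your Hugging Face account must have been granted access
model_name = "meta-llama/Llama-2-7b-hf"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=512,
    dtype=None,  # Auto-detect: float16 on T4/V100, bfloat16 on Ampere or newer GPUs
    load_in_4bit=True,
)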

# Example training configuration
training_config = {
    "learning_rate": 2e-4,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
}

# The actual training code will depend on your specific dataset and requirements
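
With gradient accumulation, the optimizer sees a larger batch than the one that has to fit in GPU memory at once. The short calculation below works this out for the values in `training_config`; the device count of 1 is an assumption (a single-GPU runtime), not something stated in the configuration.

```python
# Values copied from training_config above.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: a single GPU

# Number of samples that contribute to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 16
```

If memory runs out, halving `per_device_train_batch_size` while doubling `gradient_accumulation_steps` keeps this product unchanged.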

## 2. Lightning.ai Setup

### 2.1 Initial Setup
1. Go to [Lightning.ai](https://lightning.ai/lars/home)
2. Create a new project or open an existing one
3. Select a GPU-enabled compute instance

In [None]:
%%writefile lightning.app
name: llm-spelling-project
compute:
  type: gpu
  size: medium  # Adjust based on needs
requirements:
  - unsloth
  - transformers
  - wandb
  - dspy

## 3. Syncing Results and Models

### 3.1 Using Weights & Biases
W&B is the primary method for tracking experiments and syncing models:

In [None]:
import wandb

# Initialize a new run
wandb.init(
    project="llm-spelling",
    name="fine-tuning-run-1",
    config=training_config
)

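# Log metrics during training (placeholder values; log the real ones from your training loop)
wandb.log({"loss": 0.5, "accuracy": 0.95})

# Upload a saved model file to the run (assumes ./model.pt already exists)
wandb.save("./model.pt")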

### 3.2 Using Hugging Face Hub
For sharing models and datasets:

In [None]:
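# push_to_hub is a method on the model and tokenizer objects, so no separate import is needed.
# It relies on the login() call from the authentication step and a token with write access.
model.push_to_hub("your-username/model-name")
tokenizer.push_to_hub("your-username/model-name")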

## 4. Troubleshooting Tips

### Common Issues and Solutions

1. **GPU Out of Memory**
   - Reduce batch size
   - Enable gradient checkpointing
   - Use 4-bit or 8-bit quantization
   ```python
   model.gradient_checkpointing_enable()
   ```

2. **Colab Runtime Disconnection**
   - Save checkpoints frequently
   - Use W&B for experiment tracking
   - Keep browser tab active

3. **Package Conflicts**
   - Create a fresh environment
   - Install packages in the correct order
   - Check compatibility matrix

4. **Authentication Issues**
   - Verify API keys in environment variables
   - Check token permissions
   - Ensure proper login sequence
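
For the authentication item above, a quick way to verify API keys in environment variables is to check for them before calling `wandb.login()` or `login()`. This is a sketch under assumptions: `WANDB_API_KEY` and `HF_TOKEN` are the variable names read by recent versions of the two clients (older `huggingface_hub` releases may expect a different name), and the helper `missing_credentials` is hypothetical.

```python
import os

# Assumed variable names; confirm them against the client versions you have installed.
REQUIRED_VARS = ("WANDB_API_KEY", "HF_TOKEN")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_credentials()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both credentials are present in the environment.")
```

The check only confirms that a value is present. An expired token or one without write permission will still fail at login or upload time.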

### Best Practices

1. **Regular Checkpointing**
   - Save model state every N steps
   - Use W&B for experiment tracking
   - Keep local copies of important data

2. **Resource Management**
   - Monitor GPU memory usage
   - Use appropriate batch sizes
   - Clean up unused variables

3. **Version Control**
   - Commit code changes regularly
   - Use meaningful commit messages
   - Tag important versions
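
The "save model state every N steps" practice can be sketched as a small helper that writes a checkpoint at a fixed interval and prunes older files so that disk usage stays bounded. Everything here is illustrative: `save_checkpoint`, the interval of 50, and the JSON payload are hypothetical stand-ins. In a real run the write would be something like `torch.save(...)` or `model.save_pretrained(...)`, and the directory would be persistent storage, not a temporary folder.

```python
import json
import tempfile
from pathlib import Path


def save_checkpoint(state, step, ckpt_dir, keep_last=3):
    """Write state for this step, then keep only the newest keep_last files."""
    ckpt_dir = Path(ckpt_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    path = ckpt_dir / f"step_{step:08d}.json"
    path.write_text(json.dumps(state))
    # Zero-padded step numbers make lexicographic order match step order.
    checkpoints = sorted(ckpt_dir.glob("step_*.json"))
    for old in checkpoints[:-keep_last]:
        old.unlink()
    return path


save_every = 50  # hypothetical value for N
with tempfile.TemporaryDirectory() as tmp:
    for step in range(1, 301):
        loss = 1.0 / step  # placeholder for a real training step
        if step % save_every == 0:
            save_checkpoint({"step": step, "loss": loss}, step, tmp)
    kept = sorted(p.name for p in Path(tmp).glob("step_*.json"))

print(kept)
# ['step_00000200.json', 'step_00000250.json', 'step_00000300.json']
```

Keeping only the last few checkpoints is a trade-off: it limits storage on a small Colab disk but gives up the ability to roll back to much earlier steps.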