# HyLoRADA v0.5.0 - Cost-Efficient Long-Context Learning

**Streamlined parameter-efficient fine-tuning (PEFT) for long-context LLMs**, based on a review of the 2023-2025 literature.

Key Techniques:
- **S²-Attn (LongLoRA)**: 16x training efficiency
- **Trainable Embeddings & Norms**: Critical for >32k context
- **RoPE Scaling (YaRN)**: Extend context to 128k+
- **Sink Tokens (SinkLoRA)**: Stable attention patterns
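
To make the first and last bullets concrete, here is a minimal pure-Python sketch of which tokens end up attending to each other under shifted grouping. It is not HyLoRADA's implementation: `s2_attention_groups` and its parameters are hypothetical names, causal masking is left out, and exposing the sink tokens as extra keys to every group is only an assumption about how sink tokens might be combined with S²-Attn.

```python
def s2_attention_groups(seq_len, group_size, shifted=False, sink_tokens=0):
    """Return one (queries, keys) pair of token-index lists per attention group."""
    if seq_len % group_size != 0:
        raise ValueError("seq_len must be a multiple of group_size")
    # Shifted heads roll the sequence by half a group, so their groups
    # straddle the boundaries of the unshifted groups.
    offset = group_size // 2 if shifted else 0
    order = [(i + offset) % seq_len for i in range(seq_len)]
    sinks = list(range(sink_tokens))
    groups = []
    for start in range(0, seq_len, group_size):
        queries = order[start:start + group_size]
        # Assumption: the first `sink_tokens` tokens are visible to every group.
        extra_keys = [t for t in sinks if t not in queries]
        groups.append((queries, extra_keys + queries))
    return groups


plain = s2_attention_groups(16, 4)
shifted = s2_attention_groups(16, 4, shifted=True, sink_tokens=2)
print(plain[0][0])     # [0, 1, 2, 3]
print(shifted[0][0])   # [2, 3, 4, 5]
print(shifted[0][1])   # [0, 1, 2, 3, 4, 5]
print(shifted[-1][0])  # [14, 15, 0, 1]
```

In LongLoRA the shift is applied to half of the attention heads. The unshifted heads see `[0, 1, 2, 3]` as their first group, while the shifted heads see `[2, 3, 4, 5]`, which straddles the first unshifted boundary so information can flow between neighbouring groups.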

In [None]:
# Setup: silence library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Clone repo (Kaggle)
import os
if os.path.exists('hylorada'):
    %cd hylorada
    !git pull
else:
    !git clone https://github.com/SadiaTabassum1216/hylorada.git
    %cd hylorada

In [None]:
# Install dependencies
!pip install -q transformers datasets accelerate tqdm bitsandbytes peft

In [None]:
# Check GPU
import torch
print(f"CUDA: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

## 1. Quick Start - Standard Configuration

In [None]:
from hylorada import HyLoRADAConfig, HyLoRADAModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "Qwen/Qwen2.5-0.5B"
base_model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Standard Config (efficient, minimal params)
config = HyLoRADAConfig(lora_rank=8)
model = HyLoRADAModel(base_model, config)

print("=== Standard Config ===")
model.print_trainable_params()
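
For a sense of why a rank-8 adapter counts as "minimal params", the sketch below applies the standard LoRA parameter count, `rank * (d_in + d_out)`, to a single square projection. `lora_param_count` is a hypothetical helper, and the width 896 is an illustrative value rather than something read from the loaded model. Which layers HyLoRADA adapts, and what else it trains, is not shown in this notebook, so the totals printed by `print_trainable_params()` will differ.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters in one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


d = 896  # illustrative layer width (assumption, not read from the model)
adapter = lora_param_count(d, d, rank=8)
full = d * d  # parameters in the dense weight the adapter sits beside
print(adapter, full, full // adapter)  # 14336 802816 56
```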

## 2. Long-Context Configuration (>32k Tokens)

For efficient long-context learning:
- `s2_attn_enabled=True`: S²-Attn for 16x training efficiency
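- `train_embeddings=True`, `train_norms=True`: Trainable embeddings and normalization layers (LongLoRA)
- `s2_sink_tokens=4`: Sink tokens for stable attention (SinkLoRA)
- `rope_scaling_type`, `rope_scaling_factor`: RoPE position scaling to extend the context window (the cell below uses `"linear"` with factor 4.0)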
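
The notebook does not say how the 16x figure is derived. One back-of-the-envelope reading, assuming it refers only to the number of query-key scores per head at a 32k sequence with `s2_group_size=2048`, is sketched below. `attention_score_count` is a hypothetical helper, and causal masking and all non-attention costs are ignored, so this is not a measurement of end-to-end training speed.

```python
def attention_score_count(seq_len, group_size=None):
    """Query-key pairs scored per head; group_size=None means full attention."""
    if group_size is None:
        return seq_len * seq_len
    if seq_len % group_size != 0:
        raise ValueError("seq_len must be a multiple of group_size")
    num_groups = seq_len // group_size
    # Each group only scores its own tokens against each other.
    return num_groups * group_size * group_size


seq_len = 32_768
group_size = 2_048
full = attention_score_count(seq_len)
grouped = attention_score_count(seq_len, group_size)
print(full // grouped)  # 16
```

Under these assumptions the saving equals `seq_len / group_size`, so it grows with longer sequences and shrinks with larger groups.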

In [None]:
# Reload base model for long-context config
base_model_long = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Long-Context Config
long_config = HyLoRADAConfig(
    lora_rank=8,
    # S²-Attn (LongLoRA) - 16x training efficiency
    s2_attn_enabled=True,
    s2_group_size=2048,
    # Trainable embeddings & norms (LongLoRA)
    train_embeddings=True,
    train_norms=True,
    # Sink tokens (SinkLoRA)
    s2_sink_tokens=4,
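    # RoPE scaling: this cell uses "linear" with factor 4.0 (4x the base position range).
    # The YaRN variant named above would need a different rope_scaling_type value,
    # if the installed hylorada version supports one.
    rope_scaling_type="linear",
    rope_scaling_factor=4.0,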
)

model_long = HyLoRADAModel(base_model_long, long_config)

print("=== Long-Context Config ===")
model_long.print_trainable_params()
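
As a rough illustration of what `rope_scaling_type="linear"` with `rope_scaling_factor=4.0` means, the sketch below assumes the usual position-interpolation definition, in which each position index is divided by the factor before the rotary angles are computed. `rope_angles`, `head_dim=8`, and `base=10000.0` are illustrative choices, and how HyLoRADA applies the scaling internally is not shown in this notebook. YaRN rescales frequencies non-uniformly and is not modelled here.

```python
import math


def rope_angles(position, head_dim=8, base=10000.0, scaling_factor=1.0):
    """Rotary angles for one position under linear scaling (position / factor)."""
    scaled_position = position / scaling_factor
    return [scaled_position / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


original = rope_angles(1000)
interpolated = rope_angles(4000, scaling_factor=4.0)
# With factor 4.0, position 4000 is rotated like position 1000 in the base model.
print(all(math.isclose(a, b) for a, b in zip(original, interpolated)))  # True
```

Positions up to four times the original range are thus squeezed into the angle range the base model saw during pretraining, which is why fine-tuning is still needed for the model to adapt to the denser spacing.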

## 3. Run Benchmark

In [None]:
# Quick benchmark: Compare LoRA vs HyLoRADA
!python run_benchmark.py --methods lora hylorada --epochs 1 --num_train 200 --num_test 50

In [None]:
# View benchmark results
import json, glob

for f in glob.glob('./**/benchmark*.json', recursive=True):
    print(f"\n=== {f} ===")
    try:
        # Use a context manager so the file handle is closed promptly
        with open(f) as fh:
            data = json.load(fh)
        if 'results' in data:
            for method, r in data['results'].items():
                ppl = r.get('perplexity', 'N/A')
                params = r.get('trainable_params', 'N/A')
                time_s = r.get('train_time', 'N/A')
                if isinstance(ppl, float):
                    ppl = f"{ppl:.2f}"
                print(f"{method}: PPL={ppl}, params={params}, time={time_s}")
    except Exception as e:
        print(f"Error: {e}")
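
For reading the `PPL` column: perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below shows that definition on made-up numbers. Whether `run_benchmark.py` computes its `perplexity` field in exactly this way is an assumption, and `perplexity` here is a hypothetical helper, not a function from the repository.

```python
import math


def perplexity(token_nlls):
    """exp(mean negative log-likelihood) over a list of per-token losses (in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that is uniformly unsure among 4 tokens at every step has perplexity 4.
nlls = [math.log(4)] * 10
print(f"{perplexity(nlls):.2f}")  # 4.00
```

Lower is better, and a difference between two methods is easier to interpret alongside the `trainable_params` and `train_time` fields printed above.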

## 4. Full Benchmark (All Methods)

In [None]:
# Full comparison (takes longer)
!python run_benchmark.py --methods lora dora hylorada --epochs 2 --num_train 500 --dataset wikitext

## 5. Unit Tests

In [None]:
# Run tests to verify installation (-v and -q cancel each other out, so only -v is kept)
!python -m pytest tests/ -v --tb=short