# Mamba-Killer ResNet-BK: One-Click Reproducibility

This notebook provides a complete reproducible environment for training and evaluating Mamba-Killer ResNet-BK on Google Colab.

**Features:**
- Automatic setup and dependency installation
- Dataset preparation (WikiText-2, WikiText-103, C4, The Pile)
- Model training with checkpointing
- Evaluation and visualization
- Google Drive integration for persistence

**Runtime:** ~6-12 hours on Colab T4 GPU (free tier)

**Author:** Mamba-Killer Team  
**License:** MIT

## 1. Setup Environment

In [None]:
# Check GPU availability
!nvidia-smi
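
The `!nvidia-smi` cell only prints a status table; it does not stop the notebook when no GPU is attached. A minimal sketch of a programmatic check follows, using only the standard library. `gpu_visible` is a hypothetical helper name and is not part of the repository.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # NVIDIA driver tools are not installed on this runtime
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if not gpu_visible():
    print("No GPU detected. In Colab: Runtime → Change runtime type → GPU")
```

Running this before the training cell makes a CPU-only runtime obvious early, rather than several minutes into a very slow run.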

In [None]:
# Mount Google Drive for checkpoint persistence
from google.colab import drive
drive.mount('/content/drive')

# Create directories
!mkdir -p /content/drive/MyDrive/mamba_killer_checkpoints
!mkdir -p /content/drive/MyDrive/mamba_killer_results
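
If the Drive mount did not succeed, `mkdir -p` may still create the folders on the local disk, where they disappear with the session. The sketch below creates the same two directories from Python and refuses to proceed when the base folder is absent. `ensure_dirs` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path


def ensure_dirs(base, names=("mamba_killer_checkpoints", "mamba_killer_results")):
    """Create each named directory under `base` and return the resulting paths."""
    base = Path(base)
    if not base.is_dir():
        raise FileNotFoundError(f"{base} not found; is Google Drive mounted?")
    created = []
    for name in names:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
        created.append(path)
    return created


# In Colab, after drive.mount('/content/drive'):
# ensure_dirs('/content/drive/MyDrive')
```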

In [None]:
# Repo setup (clone if needed, add to sys.path)
import os, sys, subprocess, pathlib
REPO_URL = 'https://github.com/neko-jpg/Project-ResNet-BK-An-O-N-Language-Model-Architecture.git'
REPO_DIR = 'Project-ResNet-BK-An-O-N-Language-Model-Architecture'
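cwd = pathlib.Path.cwd()
# Look for an existing checkout: any candidate directory that contains `src/`
candidates = [cwd, cwd.parent, cwd / REPO_DIR, cwd.parent / REPO_DIR]
root = next((p for p in candidates if (p / 'src').exists()), None)
if root is None:
    # No checkout found: clone into the current directory unless the folder already exists
    root = cwd / REPO_DIR
    if not root.exists():
        subprocess.run(['git', 'clone', REPO_URL, str(root)], check=True)
# Work from the repo root so relative paths (requirements.txt, scripts/, configs/) resolve
if root != pathlib.Path.cwd():
    os.chdir(root)
# Make the repo importable from later cells
root_str = str(pathlib.Path.cwd())
if root_str not in sys.path:
    sys.path.insert(0, root_str)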
print('PWD:', root_str)


In [None]:
# Install dependencies
!pip install -q -r requirements.txt
!pip install -q -e .

print("✓ Dependencies installed")
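
A failing `!pip` command does not raise a Python exception, so the confirmation message prints either way. One hedged way to confirm the install is to check that the expected modules can be located. The module names in `required` are assumptions for illustration; the real list depends on `requirements.txt`.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module `names` that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Assumed module list for illustration; adjust to match requirements.txt.
required = ["torch", "yaml"]
missing = missing_modules(required)
print("Missing modules:", missing if missing else "none")
```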

## 2. Prepare Datasets

In [None]:
# Prepare WikiText-2 (smallest, fastest)
!python scripts/prepare_datasets.py \
    --datasets wikitext2 \
    --output_dir /content/data

print("✓ WikiText-2 prepared")
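
The confirmation above prints even if the preparation script exited with an error. A quick sanity check is to list what actually landed in the output directory. This sketch makes no assumption about the file layout the script produces; `summarize_dir` is a hypothetical helper.

```python
from pathlib import Path


def summarize_dir(root):
    """Return (relative_path, size_in_bytes) for every file under `root`, sorted by path."""
    root = Path(root)
    if not root.is_dir():
        return []  # nothing was written, or the path is wrong
    return sorted(
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    )


# Example for this notebook:
# for name, size in summarize_dir('/content/data'):
#     print(f"{size:>12,d}  {name}")
```

An empty listing, or files of zero bytes, suggests the download step failed and should be rerun before training.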

In [None]:
# Optional: Prepare additional datasets (takes longer)
# Uncomment to prepare WikiText-103, C4, and The Pile

# !python scripts/prepare_datasets.py \
#     --all \
#     --output_dir /content/data \
#     --c4_samples 50000 \
#     --pile_samples 25000

## 3. Train Model

In [None]:
# Train with Colab-optimized configuration
!python train.py \
    --config configs/colab_config.yaml \
    --data_dir /content/data \
    --checkpoint_dir /content/drive/MyDrive/mamba_killer_checkpoints \
    --log_dir /content/logs

## 4. Monitor Training (Optional)

In [None]:
# Load TensorBoard
%load_ext tensorboard
%tensorboard --logdir /content/logs

## 5. Evaluate Model

In [None]:
# Evaluate on test set
!python scripts/evaluate.py \
    --checkpoint /content/drive/MyDrive/mamba_killer_checkpoints/latest.pt \
    --dataset wikitext2 \
    --data_dir /content/data \
    --output_dir /content/drive/MyDrive/mamba_killer_results

## 6. Generate Visualizations

In [None]:
# Generate the stability, quantization, and efficiency graphs
!python scripts/generate_stability_graph.py \
    --results_dir /content/drive/MyDrive/mamba_killer_results \
    --output_dir /content/drive/MyDrive/mamba_killer_results/figures

!python scripts/generate_quantization_graph.py \
    --results_dir /content/drive/MyDrive/mamba_killer_results \
    --output_dir /content/drive/MyDrive/mamba_killer_results/figures

!python scripts/generate_efficiency_graph.py \
    --results_dir /content/drive/MyDrive/mamba_killer_results \
    --output_dir /content/drive/MyDrive/mamba_killer_results/figures

In [None]:
# Display results
from IPython.display import Image, display
import os

figures_dir = '/content/drive/MyDrive/mamba_killer_results/figures'

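for fig_name in ['stability_graph.png', 'quantization_graph.png', 'efficiency_graph.png']:
    fig_path = os.path.join(figures_dir, fig_name)
    if os.path.exists(fig_path):
        print(f"\n{fig_name}:")
        display(Image(filename=fig_path))
    else:
        # Report missing figures instead of skipping them silently
        print(f"\n{fig_name}: not found in {figures_dir}")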

## 7. Compare with Mamba (Optional)

In [None]:
# Run the benchmark for the BK model (this invocation covers `--model bk` only)
!python scripts/mamba_vs_bk_benchmark.py \
    --model bk \
    --seq_len 2048 \
    --bits 32 \
    --dataset wikitext2 \
    --data_dir /content/data \
    --output_dir /content/drive/MyDrive/mamba_killer_results

## 8. Download Results

In [None]:
# Create archive of results
!cd /content/drive/MyDrive && \
    tar -czf mamba_killer_results.tar.gz mamba_killer_results/

print("✓ Results archived to Google Drive: mamba_killer_results.tar.gz")
print("  You can download this file from your Google Drive")
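
The same archive can be produced from Python with the standard `tarfile` module, which also makes it easy to list what was packed. This is a sketch of an equivalent, not the notebook's own method; `archive_dir` is a hypothetical helper.

```python
import tarfile
from pathlib import Path


def archive_dir(src_dir, archive_path):
    """Write `src_dir` to a gzip-compressed tar archive and return the member names."""
    src = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src, arcname=src.name)  # store paths relative to the folder name
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Example for this notebook:
# names = archive_dir('/content/drive/MyDrive/mamba_killer_results',
#                     '/content/drive/MyDrive/mamba_killer_results.tar.gz')
# print(len(names), "entries archived")
```

As in the shell version, the archive is written next to the results folder rather than inside it, so the archive never tries to include itself.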

## 9. Cleanup (Optional)

In [None]:
# Clean up temporary files to free space
!rm -rf /content/data
!rm -rf /content/logs

print("✓ Temporary files cleaned up")
print("  Checkpoints and results are preserved in Google Drive")

## Troubleshooting

### Out of Memory (OOM)
- Reduce batch size in config: `training.batch_size: 2`
- Reduce sequence length: `model.n_seq: 512`
- Enable CPU offloading: `model.use_cpu_offload: true`
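
The dotted names above suggest a nested config, although the actual layout of `configs/colab_config.yaml` is an assumption here. Under that assumption, the three settings can be applied to a loaded config dictionary as sketched below. `apply_overrides` is a hypothetical helper, and the starting values in `base` are placeholders.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted-key overrides applied."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections on the way down
        node[leaf] = value
    return result


# Placeholder starting values; the real ones live in configs/colab_config.yaml.
base = {"training": {"batch_size": 8}, "model": {"n_seq": 2048}}
low_memory = apply_overrides(base, {
    "training.batch_size": 2,
    "model.n_seq": 512,
    "model.use_cpu_offload": True,
})
print(low_memory)
```

Working on a copy keeps the original settings available, which helps when trying the options one at a time to see which one resolves the OOM.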

### Colab Timeout
- Checkpoints are automatically saved to Google Drive
- Resume training by running the training cell again
- The script will automatically detect and resume from the latest checkpoint
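
Before rerunning the training cell, it can help to confirm that a checkpoint actually reached Google Drive. The sketch below finds the most recently modified `.pt` file in a directory. It is only an inspection aid and does not reproduce the training script's own resume logic; `newest_checkpoint` is a hypothetical helper.

```python
from pathlib import Path


def newest_checkpoint(checkpoint_dir, pattern="*.pt"):
    """Return the most recently modified file matching `pattern`, or None."""
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return None
    candidates = [p for p in directory.glob(pattern) if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Example for this notebook:
# print(newest_checkpoint('/content/drive/MyDrive/mamba_killer_checkpoints'))
```

A result of `None` means there is nothing to resume from, so the next run would start from scratch.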

### Slow Training
- Ensure GPU is enabled: Runtime → Change runtime type → GPU
- Check GPU utilization: `!nvidia-smi`
- Reduce logging frequency: `training.log_interval: 500`
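
To read GPU utilization as numbers rather than from the printed table, `nvidia-smi` can emit CSV through its `--query-gpu` and `--format` options. The sketch below parses that output; the helper names and dictionary keys are hypothetical, and it assumes every field is reported as an integer (some devices report `[N/A]` for certain fields, which this parser does not handle).

```python
import subprocess


def parse_gpu_stats(csv_text):
    """Parse 'util, mem_used, mem_total' rows into a list of dicts (one per GPU)."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (int(field.strip()) for field in line.split(","))
        stats.append({"util_pct": util, "mem_used_mib": used, "mem_total_mib": total})
    return stats


def query_gpu_stats():
    """Run nvidia-smi in CSV mode and return the parsed rows."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_stats(out)


# Example (requires a GPU runtime):
# print(query_gpu_stats())
```

Utilization that stays low during training usually points to a bottleneck outside the GPU, such as data loading or frequent logging.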

### Dataset Download Fails
- Check internet connection
- Try downloading datasets individually
- Use smaller sample sizes for C4 and The Pile
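
Transient network errors often clear on a second attempt. A minimal retry wrapper with exponential backoff is sketched below. `retry` is a hypothetical helper, and the commented usage reuses only the flags shown in the dataset preparation cell.

```python
import time


def retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `func` until it succeeds, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)


# Example: retry one dataset at a time.
# import subprocess
# retry(lambda: subprocess.run(
#     ["python", "scripts/prepare_datasets.py",
#      "--datasets", "wikitext2", "--output_dir", "/content/data"],
#     check=True), attempts=3, base_delay=5.0)
```

Passing `check=True` matters in the usage example: it turns a non-zero exit code into an exception, which is what triggers the retry.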

## Citation

If you use this code in your research, please cite:

```bibtex
@article{mamba-killer-2024,
  title={Mamba-Killer: Ultra-Scale ResNet-BK with Birman-Schwinger Theory},
  author={Your Name},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2024}
}
```