# Spill the Beans: RAG Privacy Attack - Colab Setup

This notebook reproduces Table 1 from the paper "Follow My Instruction and Spill the Beans" (ICLR 2025).

**GPU Required**: 
- **7B models**: Runtime > Change runtime type > T4 GPU
- **13B models**: Runtime > Change runtime type > A100 GPU (Colab Pro required)

## 1. Clone Repository

In [None]:
!git clone https://github.com/aamangeldi/spill-the-beans.git
%cd spill-the-beans

## 2. Install Dependencies

In [None]:
!pip install -e .

## 3. Verify GPU Availability

In [None]:
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## 4. Quick Test (10 samples)

Test with Mistral-7B to verify everything works (~5 minutes).

In [None]:
!python src/main.py --models mistral-7b --num-samples 10

In [None]:
!python src/view_results.py

## 5. Reproduce Table 1 (50 samples)

Run all models from the paper to reproduce Table 1.

**Total time**: ~4-5 hours on an A100 GPU

**Models tested**:
- **7B**: Llama2-7B, Mistral-7B
- **10.7B**: SOLAR-10.7B
- **13B**: Llama2-13B, Vicuna-13B, WizardLM-13B
- **8x7B**: Mixtral-8x7B (with automatic 4-bit quantization)

**Note**: This run uses 50 samples instead of 230 to shorten the runtime. Scores computed on the smaller sample are noisier, so expect them to deviate somewhat from the reported numbers.
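
To gauge how much noisier a 50-sample run is than a 230-sample one, the sketch below computes how the standard error of a sample mean scales with sample size. It assumes the per-sample scores are independent and identically distributed, which is a simplification. The function name `relative_standard_error` is illustrative and not part of the repository.

```python
import math


def relative_standard_error(n_small: int, n_full: int) -> float:
    """Factor by which the standard error of a sample mean grows when the
    sample shrinks from n_full to n_small (assumes i.i.d. samples)."""
    if n_small <= 0 or n_full <= 0:
        raise ValueError("sample sizes must be positive")
    # The standard error is proportional to 1 / sqrt(n), so the ratio is sqrt(n_full / n_small).
    return math.sqrt(n_full / n_small)


factor = relative_standard_error(50, 230)
print(f"Standard error grows by a factor of about {factor:.2f}")
```

Under that assumption, averaged metrics from 50 samples carry roughly twice the sampling error of averages from 230 samples.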

**Mixtral-8x7B**: Loaded with 4-bit quantization automatically so that it fits on an A100 (40 GB VRAM). Results may differ slightly from the paper's fp16 implementation.
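
As a rough sanity check on the GPU requirements, the sketch below estimates the memory needed for model weights alone at fp16 and at 4-bit precision. It ignores activations, the KV cache, and quantization metadata, so real usage is higher. The parameter counts are nominal figures taken from the model names, and the 46.7B total for Mixtral-8x7B is an assumption rather than a value from this notebook. `weight_memory_gb` is an illustrative helper, not part of the repository.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


# Nominal parameter counts in billions (assumed; Mixtral's total is taken as 46.7B).
NOMINAL_PARAMS_B = {
    "7B models": 7.0,
    "SOLAR-10.7B": 10.7,
    "13B models": 13.0,
    "Mixtral-8x7B": 46.7,
}

for name, params in NOMINAL_PARAMS_B.items():
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"{name:>13}: fp16 ~{fp16:.1f} GB, 4-bit ~{int4:.1f} GB")
```

By this estimate the fp16 weights of Mixtral-8x7B alone exceed 40 GB, while the 4-bit weights fit with room to spare, which matches the reason given for quantizing it.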

In [None]:
!python src/main.py --models llama2-7b mistral-7b solar-10.7b llama2-13b vicuna-13b mixtral-8x7b wizardlm-13b --num-samples 50
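
Because the full run takes several hours, one option is to launch the models one at a time so that a Colab disconnect costs less progress. The sketch below only builds the command lines, using the two flags shown above (`--models` and `--num-samples`). Whether separate invocations accumulate into the same results directory is an assumption to check against the repository, and `build_command` is a hypothetical helper.

```python
import shlex

MODELS = [
    "llama2-7b", "mistral-7b", "solar-10.7b",
    "llama2-13b", "vicuna-13b", "mixtral-8x7b", "wizardlm-13b",
]


def build_command(models, num_samples):
    """Return the argument list for one run of src/main.py."""
    if not models:
        raise ValueError("at least one model name is required")
    return ["python", "src/main.py", "--models", *models,
            "--num-samples", str(num_samples)]


# Print one command per model; in a notebook each line can be run with a leading "!".
for model in MODELS:
    print(shlex.join(build_command([model], 50)))
```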

## 6. View Final Results

Display all experiment results in a summary table.

In [None]:
!python src/view_results.py

## 7. Download Results

Download all experiment results as a zip file.

In [None]:
from google.colab import files

# Zip all results
!zip -r results.zip results/

# Download
files.download('results.zip')
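
If the `zip` command is unavailable, or you want the cell to fail early when no results exist, the archive can also be built with the Python standard library. This is a minimal sketch that assumes results are written to a `results/` directory as in the cell above. `zip_results` is an illustrative helper, not part of the repository.

```python
import shutil
from pathlib import Path


def zip_results(results_dir: str = "results", archive_stem: str = "results") -> str:
    """Zip results_dir into <archive_stem>.zip and return the archive path."""
    path = Path(results_dir)
    # Fail early instead of producing an empty archive.
    if not path.is_dir() or not any(path.iterdir()):
        raise FileNotFoundError(f"no results found in {results_dir!r}")
    # root_dir/base_dir keep the top-level folder name inside the archive.
    return shutil.make_archive(
        archive_stem, "zip", root_dir=str(path.parent), base_dir=path.name
    )


# In Colab, after the experiments have finished:
#   from google.colab import files
#   files.download(zip_results())
```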