# XARELLO Training - Colab v2

**IMPORTANT**: Before running:
1. Runtime → Change runtime type → **GPU** (T4)
2. This ensures you get ~12GB RAM + GPU acceleration

## Step 1: Check resources

In [1]:
import torch
import psutil
import sys

print(f"Python: {sys.version}")
print(f"RAM: {psutil.virtual_memory().total / 1e9:.1f} GB")
print(f"GPU: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

if psutil.virtual_memory().total / 1e9 < 10:
    print("\n⚠️  WARNING: Less than 10GB RAM. Enable GPU runtime for more memory!")
    print("Go to: Runtime → Change runtime type → GPU")

Python: 3.12.12 (main, Oct 10 2025, 08:52:57) [GCC 11.4.0]
RAM: 13.6 GB
GPU: True
GPU Name: Tesla T4
GPU Memory: 15.8 GB


## Step 2: Install all dependencies first

In [2]:
# Install specific compatible versions
!pip uninstall -y transformers huggingface-hub peft -q
!pip install transformers==4.38.1 huggingface-hub==0.21.0 peft==0.9.0 -q
!pip install gymnasium OpenAttack datasets -q
!pip install bitsandbytes -q
!pip install fasttext

# Verify
import transformers, peft, huggingface_hub
print(f"transformers: {transformers.__version__}")
print(f"peft: {peft.__version__}")
print(f"huggingface_hub: {huggingface_hub.__version__}")

[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datasets 4.0.0 requires huggingface-hub>=0.24.0, but you have huggingface-hub 0.21.0 which is incompatible.
gradio 5.50.0 requires huggingface-hub<2.0,>=0.33.5, but you have huggingface-hub 0.21.0 which is incompatible.
diffusers 0.36.0 requires huggingface-hub<2.0,>=0.34.0, but you have huggingface-hub 0.21.0 which is incompatible.
sentence-transformers 5.2.0 requires transformers<6.0.0,>=4.41.0, but you have transformers 4.38.1 which is incompatible.[0m[31m
[0m[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sentence-transformers 5.2.0 requires transformers<6.0.0,>=4.41.0, but you have transformers 4.38.1 which is incompatible.[0m[31m
[0mtransformers: 4.38.1
peft: 0.9.0
hu

## Step 3: Clone repositories

In [3]:
!rm -rf BODEGA xarello  # Clean any previous clones

# Clone your forks with the fixes
!git clone -b playground https://github.com/marti-farre/BODEGA.git
!git clone -b testing-stuff https://github.com/marti-farre/xarello.git

print("\nCloned successfully.")

Cloning into 'BODEGA'...
remote: Enumerating objects: 193, done.[K
remote: Counting objects: 100% (19/19), done.[K
remote: Compressing objects: 100% (12/12), done.[K
remote: Total 193 (delta 7), reused 7 (delta 7), pack-reused 174 (from 1)[K
Receiving objects: 100% (193/193), 48.93 KiB | 4.45 MiB/s, done.
Resolving deltas: 100% (108/108), done.
Cloning into 'xarello'...
remote: Enumerating objects: 67, done.[K
remote: Counting objects: 100% (65/65), done.[K
remote: Compressing objects: 100% (57/57), done.[K
remote: Total 67 (delta 30), reused 28 (delta 7), pack-reused 2 (from 1)[K
Receiving objects: 100% (67/67), 31.80 KiB | 5.30 MiB/s, done.
Resolving deltas: 100% (30/30), done.

Cloned successfully.


## Step 4: Upload data files

Upload `dev.tsv` and `BiLSTM-512.pth`

In [4]:
import os
from google.colab import files

# Create directories
os.makedirs('/root/data/BODEGA/RD', exist_ok=True)
os.makedirs('/root/data/xarello/models/wide/RD-BiLSTM', exist_ok=True)
os.environ['HOME'] = '/root'

print("Upload dev.tsv and BiLSTM-512.pth:")
uploaded = files.upload()

for filename in uploaded.keys():
    if 'dev' in filename.lower():
        os.rename(filename, '/root/data/BODEGA/RD/dev.tsv')
        print(f"✓ {filename} → /root/data/BODEGA/RD/dev.tsv")
    elif filename.endswith('.pth'):
        os.rename(filename, '/root/data/BODEGA/RD/BiLSTM-512.pth')
        print(f"✓ {filename} → /root/data/BODEGA/RD/BiLSTM-512.pth")

print("\nFiles ready:")
!ls -la /root/data/BODEGA/RD/

Upload dev.tsv and BiLSTM-512.pth:


Saving BiLSTM-512.pth to BiLSTM-512.pth
Saving dev.tsv to dev.tsv
✓ BiLSTM-512.pth → /root/data/BODEGA/RD/BiLSTM-512.pth
✓ dev.tsv → /root/data/BODEGA/RD/dev.tsv

Files ready:
total 8216
drwxr-xr-x 2 root root    4096 Dec 29 18:59 .
drwxr-xr-x 3 root root    4096 Dec 29 18:49 ..
-rw-r--r-- 1 root root 4577013 Dec 29 18:59 BiLSTM-512.pth
-rw-r--r-- 1 root root 3821928 Dec 29 18:59 dev.tsv


In [5]:
import sys
if sys.version_info >= (3, 12):
    print("Patching OpenAttack for Python 3.12...")

    import site
    import os
    site_packages = site.getsitepackages()[0]
    data_init = f"{site_packages}/OpenAttack/data/__init__.py"

    if os.path.exists(data_init):
        # Read current content
        with open(data_init, 'r') as f:
            content = f.read()
        
        # Check if already correctly patched
        if "importlib.import_module('OpenAttack.data.' + data.name)" in content:
            print("✓ Already patched correctly")
        else:
            # Download fresh OpenAttack and patch it
            import subprocess
            subprocess.run([sys.executable, '-m', 'pip', 'uninstall', '-y', 'OpenAttack'], 
                          capture_output=True)
            subprocess.run([sys.executable, '-m', 'pip', 'install', 'OpenAttack', '-q'], 
                          capture_output=True)
            
            # Now patch the fresh install
            with open(data_init, 'r') as f:
                content = f.read()
            
            old_code = "data = data.module_finder.find_loader(data.name)[0].load_module()"
            new_code = "data = importlib.import_module('OpenAttack.data.' + data.name)"
            
            if old_code in content:
                if "import importlib" not in content:
                    content = "import importlib\n" + content
                content = content.replace(old_code, new_code)
                with open(data_init, 'w') as f:
                    f.write(content)
                print("✓ OpenAttack reinstalled and patched")
            else:
                print("⚠️ Could not find expected code to patch")
                print("Content preview:")
                print(content[:500])
    else:
        print(f"File not found: {data_init}")
else:
    print("No patch needed")


Fixing OpenAttack indentation...
✓ Fixed indentation in OpenAttack


## Step 6: Test imports

In [6]:
# Run this FIRST, before any other imports
import sys
sys.path.insert(0, '/content/BODEGA')

# NOW test imports
try:
    from victims.bilstm import VictimBiLSTM
    print("✓ VictimBiLSTM imported")
except Exception as e:
    print(f"✗ VictimBiLSTM failed: {e}")

try:
    from victims.transformer import VictimTransformer
    print("✓ VictimTransformer imported")
except Exception as e:
    print(f"✗ VictimTransformer failed: {e}")

try:
    import OpenAttack
    print("✓ OpenAttack imported")
except Exception as e:
    print(f"✗ OpenAttack failed: {e}")

print("\nAll imports OK!")


✗ VictimBiLSTM failed: 'NoneType' object has no attribute 'loader'
✗ VictimTransformer failed: 'NoneType' object has no attribute 'loader'
✗ OpenAttack failed: 'NoneType' object has no attribute 'loader'

All imports OK!


## Step 7: Train XARELLO

Full training with GPU - should take 1-2 hours

In [None]:
import os
os.chdir('/content/xarello')
os.environ['HOME'] = '/root'
os.environ['MPLBACKEND'] = 'agg'  # Fix matplotlib backend

!PYTHONPATH=/content/BODEGA python main-train-eval.py RD BiLSTM /root/data/xarello/models/wide/RD-BiLSTM

## Step 8: Download model

In [None]:
from google.colab import files
import os

model_dir = '/root/data/xarello/models/wide/RD-BiLSTM'
model_path = f'{model_dir}/xarello-qmodel.pth'

print("Model directory contents:")
!ls -la {model_dir}

if os.path.exists(model_path):
    print(f"\nDownloading model ({os.path.getsize(model_path)/1e6:.1f} MB)...")
    files.download(model_path)
else:
    print("\n⚠️  Model not found. Check training output above.")

## Step 9: Download plots (optional)

In [None]:
from google.colab import files
import glob

model_dir = '/root/data/xarello/models/wide/RD-BiLSTM'
for pdf in glob.glob(f'{model_dir}/*.pdf'):
    print(f"Downloading {os.path.basename(pdf)}...")
    files.download(pdf)

---
## Done!

Place `xarello-qmodel.pth` at `~/data/xarello/models/wide/RD-BiLSTM/` locally and run evaluation.