# 🎮 CyborgMind MineRL Training on Google Colab

This notebook trains a MineRL agent with a PMM (Predictive Memory Module) on Google Colab and runs a Cuberite server for visualization.

## Features
- **GPU Training**: Utilizes Colab's free GPU
- **Cuberite Server**: Lightweight Minecraft server for visualization
- **VNC Viewer**: Watch the agent play in real-time
- **WandB Logging**: Track training metrics

## 1️⃣ Setup Environment

In [None]:
# Check GPU availability
!nvidia-smi

# Install system dependencies
# (novnc provides /usr/share/novnc, which the websockify command in the VNC section serves)
!apt-get update -qq
!apt-get install -qq -y openjdk-8-jdk xvfb x11vnc fluxbox novnc websockify > /dev/null

print("✅ System dependencies installed")

In [None]:
# Clone CyborgMind repository
!git clone https://github.com/dawsonblock/cyborg_mind.git
%cd cyborg_mind

# Install Python dependencies
!pip install -q torch torchvision numpy gymnasium wandb pyyaml tqdm matplotlib
!pip install -q minerl

print("✅ Python dependencies installed")
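
Because `pip install -q` suppresses most output, a quick importability check can confirm that the packages actually landed. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the list of import names is an assumption derived from the install commands above (note that `pyyaml` imports as `yaml`).

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed from the pip commands above (pyyaml imports as "yaml").
required = ["torch", "torchvision", "numpy", "gymnasium", "wandb",
            "yaml", "tqdm", "matplotlib", "minerl"]
missing = missing_packages(required)
print("Missing packages:", missing if missing else "none")
```

If `minerl` shows up as missing, the Troubleshooting notes at the end of this notebook are the first place to look.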

## 2️⃣ Setup Cuberite Server

In [None]:
%%bash
# Download and setup Cuberite (lightweight C++ Minecraft server)
cd /content
if [ ! -d "cuberite" ]; then
    echo "Downloading Cuberite..."
    wget -q https://download.cuberite.org/linux-x86_64/Cuberite.tar.gz
    tar -xzf Cuberite.tar.gz
    mv Server cuberite
    rm Cuberite.tar.gz
    echo "✅ Cuberite downloaded"
else
    echo "✅ Cuberite already exists"
fi

In [None]:
# Configure Cuberite for headless operation
cuberite_settings = """
[Authentication]
Authenticate=false
AllowBungeeCord=false
Server=sessionserver.mojang.com
Address=/session/minecraft/hasJoined?username=%USERNAME%&serverId=%SERVERID%

[Server]
Description=CyborgMind Training Server
MaxPlayers=4
HardcoreEnabled=false
AllowMultiLogin=true
Port=25565

[RCON]
Enabled=true
Port=25575
Password=cyborg123
"""

with open('/content/cuberite/settings.ini', 'w') as f:
    f.write(cuberite_settings)

print("✅ Cuberite configured")
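
To double-check what was written, the INI text can be parsed back with Python's `configparser`. This is only a sketch: the helper `read_ini_value` is hypothetical, the sample text is a shortened copy of the settings above, and Cuberite's own INI parser may not treat every edge case the way `configparser` does.

```python
import configparser


def read_ini_value(ini_text, section, key):
    """Parse INI text and return one value as a string."""
    # interpolation=None keeps literal '%' characters (e.g. %USERNAME%) intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser[section][key]


# Shortened copy of the settings written above, for illustration.
sample = """
[Authentication]
Authenticate=false
Address=/session/minecraft/hasJoined?username=%USERNAME%&serverId=%SERVERID%

[Server]
Port=25565

[RCON]
Enabled=true
Port=25575
"""

print("Server port:", read_ini_value(sample, "Server", "Port"))
print("RCON enabled:", read_ini_value(sample, "RCON", "Enabled"))
```

In the notebook, the same helper could be fed the contents of `/content/cuberite/settings.ini` instead of `sample`.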

In [None]:
import subprocess
import time

# Start Cuberite server in background.
# Log to a file rather than a pipe: an unread PIPE can fill up and stall the server.
cuberite_log_path = '/content/cuberite/cuberite_stdout.log'
cuberite_log = open(cuberite_log_path, 'w')
cuberite_process = subprocess.Popen(
    ['./Cuberite'],
    cwd='/content/cuberite',
    stdout=cuberite_log,
    stderr=subprocess.STDOUT
)

print("⏳ Starting Cuberite server...")
time.sleep(10)  # Wait for server to start

if cuberite_process.poll() is None:
    print("✅ Cuberite server running on port 25565")
    print("   The Colab VM is not directly reachable; connect through a TCP tunnel (see Notes)")
else:
    print("❌ Cuberite failed to start")
    with open(cuberite_log_path) as f:
        print(f.read())
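
A fixed sleep either waits too long or not long enough. One alternative is to poll the server's TCP port until it accepts a connection. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, not part of CyborgMind or Cuberite, and an open port only shows that something is listening, not that the world has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly, then retry.
            time.sleep(interval)
    return False
```

In this notebook, a call such as `wait_for_port("127.0.0.1", 25565)` could stand in for the `time.sleep(10)` line.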

## 3️⃣ Setup VNC for Visualization

In [None]:
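%%bash
# Each background process has its output redirected to a log file;
# otherwise the %%bash cell can block waiting on their open stdout.

# Start virtual display
export DISPLAY=:99
Xvfb :99 -screen 0 1024x768x24 > /tmp/xvfb.log 2>&1 &
sleep 2

# Start window manager
fluxbox > /tmp/fluxbox.log 2>&1 &
sleep 2

# Start VNC server
x11vnc -display :99 -forever -nopw -shared -rfbport 5900 > /tmp/x11vnc.log 2>&1 &
sleep 2

# Start websockify for browser-based VNC
websockify --web /usr/share/novnc 6080 localhost:5900 > /tmp/websockify.log 2>&1 &

echo "✅ VNC server started"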

In [None]:
# Get Colab URL for VNC viewer
from google.colab.output import eval_js

print("🖥️ VNC Viewer Setup")
print("="*50)
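print("Option 1: Use ngrok for external access")
print("Option 2: Use Colab's built-in port forwarding")
# Option 2: ask Colab for a proxied URL to the noVNC port (6080).
# Whether the VNC WebSocket works through this proxy is not guaranteed.
print("   Proxied URL:", eval_js("google.colab.kernel.proxyPort(6080)"))
print("\nTo view the agent, you'll need a VNC client or noVNC in browser.")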

## 4️⃣ Configure Training

In [None]:
# Training configuration
CONFIG = {
    # Environment
    'env_name': 'MineRLTreechop-v0',
    'num_envs': 2,
    'frame_stack': 4,
    
    # Model
    'encoder': 'gru',  # 'gru', 'mamba', or 'mamba_gru'
    'hidden_dim': 384,
    'vision_dim': 256,
    
    # PMM (Memory)
    'pmm_enabled': True,
    'pmm_slots': 16,
    'pmm_dim': 256,
    
    # Training
    'total_steps': 100000,  # Reduce for testing
    'horizon': 512,
    'batch_size': 2048,
    'learning_rate': 3e-4,
    
    # Logging
    'use_wandb': False,  # Set to True and login for tracking
    'log_freq': 1000,
}

print("📋 Training Configuration")
for k, v in CONFIG.items():
    print(f"   {k}: {v}")
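
It can help to work out what these numbers imply before training. The sketch below assumes the common PPO layout in which each update consumes one rollout of `num_envs × horizon` transitions; how `PPOTrainer` actually interprets `horizon` and `batch_size` is not shown in this notebook, so treat the result as a sanity check rather than a description of the trainer. The helper `rollout_stats` is hypothetical.

```python
def rollout_stats(total_steps, num_envs, horizon, batch_size):
    """Derive per-update sample counts, assuming one update per rollout."""
    samples_per_rollout = num_envs * horizon
    num_updates = total_steps // samples_per_rollout
    return {
        "samples_per_rollout": samples_per_rollout,
        "num_updates": num_updates,
        # True when a single batch would need more samples than one rollout holds.
        "batch_exceeds_rollout": batch_size > samples_per_rollout,
    }


# Values copied from CONFIG above.
stats = rollout_stats(total_steps=100_000, num_envs=2, horizon=512, batch_size=2048)
print(stats)
```

Under that assumption, one rollout holds 2 × 512 = 1024 transitions and 100,000 steps give 97 updates. The configured `batch_size` of 2048 is larger than a single rollout, so it may be worth checking how the trainer handles that case.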

In [None]:
# Optional: Login to WandB for experiment tracking
# import wandb
# wandb.login()

## 5️⃣ Train the Agent

In [None]:
import os
os.environ['DISPLAY'] = ':99'

import torch
import numpy as np

# Build config dict for trainer
config_dict = {
    'env': {
        'name': CONFIG['env_name'],
        'size': [64, 64],
        'max_steps': 18000,
    },
    'model': {
        'encoder': CONFIG['encoder'],
        'hidden_dim': CONFIG['hidden_dim'],
        'vision_dim': CONFIG['vision_dim'],
    },
    'pmm': {
        'enabled': CONFIG['pmm_enabled'],
        'memory_dim': CONFIG['pmm_dim'],
        'num_slots': CONFIG['pmm_slots'],
        'write_rate_target_inv': 2000,
        'gate_type': 'soft',
        'temperature': 1.0,
        'sharpness': 2.0,
    },
    'train': {
        'device': 'cuda' if torch.cuda.is_available() else 'cpu',
        'num_envs': CONFIG['num_envs'],
        'horizon': CONFIG['horizon'],
        'batch_size': CONFIG['batch_size'],
        'seq_len': 64,
        'total_timesteps': CONFIG['total_steps'],
        'learning_rate': CONFIG['learning_rate'],
        'gamma': 0.99,
        'gae_lambda': 0.95,
        'clip_epsilon': 0.2,
        'value_coef': 0.5,
        'entropy_coef': 0.01,
        'max_grad_norm': 0.5,
        'ppo_epochs': 4,
        'amp': torch.cuda.is_available(),
        'compile': False,
    }
}

print(f"🔧 Device: {config_dict['train']['device']}")
print(f"🧠 Encoder: {config_dict['model']['encoder']}")
print(f"💾 PMM: {'Enabled' if config_dict['pmm']['enabled'] else 'Disabled'}")

In [None]:
# Import and create trainer
from cyborg_rl.trainers.ppo_trainer import PPOTrainer

trainer = PPOTrainer(
    config_dict=config_dict,
    use_wandb=CONFIG['use_wandb']
)

print("✅ Trainer initialized")
print(f"   Observation dim: {trainer.obs_dim}")
print(f"   Action dim: {trainer.action_dim}")

In [None]:
# Start training!
print("🚀 Starting Training...")
print(f"   Total steps: {CONFIG['total_steps']:,}")
print("   This may take a while on Colab free tier.")
print("="*50)

try:
    trainer.train()
    print("\n✅ Training Complete!")
except KeyboardInterrupt:
    print("\n⏹️ Training interrupted by user")
except Exception as e:
    print(f"\n❌ Training error: {e}")
    raise

## 6️⃣ Save & Download Checkpoint

In [None]:
# Save final checkpoint
import torch

checkpoint = {
    'config': config_dict,
    'encoder_state_dict': trainer.encoder.state_dict(),
    'policy_state_dict': trainer.policy.state_dict(),
    'value_state_dict': trainer.value.state_dict(),
    'obs_dim': trainer.obs_dim,
    'action_dim': trainer.action_dim,
}

if trainer.use_pmm:
    checkpoint['pmm_state_dict'] = trainer.pmm.state_dict()
    checkpoint['pmm_proj_state_dict'] = trainer.pmm_proj.state_dict()

save_path = '/content/cyborg_trained_agent.pt'
torch.save(checkpoint, save_path)
print(f"✅ Checkpoint saved to {save_path}")

# Download to local machine
from google.colab import files
files.download(save_path)

## 7️⃣ Evaluate Agent

In [None]:
# Run evaluation
from evaluate_minerl_agent import AgentEvaluator

evaluator = AgentEvaluator(
    checkpoint_path=save_path,
    device='cuda' if torch.cuda.is_available() else 'cpu',
    deterministic=True
)

results = evaluator.evaluate(
    env_name=CONFIG['env_name'],
    num_episodes=5,  # Reduced for speed
)

print("\n📊 Evaluation Results")
print("="*50)
print(f"Mean Reward: {results['summary']['mean_reward']:.2f}")
print(f"Mean Length: {results['summary']['mean_length']:.0f}")
print(f"Success Rate: {results['summary']['success_rate']:.1%}")
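
For readers wondering how summary fields like these are typically derived, here is a small sketch that aggregates per-episode records into `mean_reward`, `mean_length`, and `success_rate`. The record layout (`reward`, `length`, `success`) and the helper `summarize_episodes` are assumptions for illustration; `AgentEvaluator` may compute its summary differently, and the numbers below are made up, not real results.

```python
from statistics import mean


def summarize_episodes(episodes):
    """Aggregate per-episode records into mean_reward, mean_length, success_rate."""
    if not episodes:
        raise ValueError("no episodes to summarize")
    return {
        "mean_reward": mean(ep["reward"] for ep in episodes),
        "mean_length": mean(ep["length"] for ep in episodes),
        # Fraction of episodes flagged as successful.
        "success_rate": sum(1 for ep in episodes if ep["success"]) / len(episodes),
    }


# Hypothetical per-episode records, for illustration only (not real results).
episodes = [
    {"reward": 4.0, "length": 1200, "success": False},
    {"reward": 64.0, "length": 900, "success": True},
]
summary = summarize_episodes(episodes)
print(f"Mean Reward: {summary['mean_reward']:.2f}")
print(f"Mean Length: {summary['mean_length']:.0f}")
print(f"Success Rate: {summary['success_rate']:.1%}")
```

With only 5 evaluation episodes, as configured above, each of these averages is a rough estimate.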

## 🧹 Cleanup

In [None]:
# Stop Cuberite server
if 'cuberite_process' in dir() and cuberite_process.poll() is None:
    cuberite_process.terminate()
    print("✅ Cuberite server stopped")

# Kill VNC processes
!pkill -f x11vnc
!pkill -f Xvfb
!pkill -f websockify
print("✅ VNC processes stopped")

---

## 📝 Notes

### Connecting to Cuberite
To view the Minecraft world:
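1. Open a TCP tunnel to port 25565, for example with ngrok: `!ngrok tcp 25565` (ngrok must first be installed and authenticated in the Colab session, and the command blocks the cell while the tunnel is open)
2. Connect with Minecraft Java Edition to the address ngrok reports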

### Performance Tips
- Use Colab Pro for better GPUs and longer sessions
- Reduce `num_envs` if running out of memory
- AMP (`amp` in the `train` config) speeds up GPU training; the config above already enables it whenever CUDA is available

### Troubleshooting
- If MineRL fails to install, try: `!pip install minerl --no-deps`
- If GPU runs out of memory, reduce `batch_size` or `hidden_dim`