# Snake DQN Training (Colab)

Use this notebook on Google Colab to train the Snake DQN and download the checkpoint. Steps:
1. Set your git repository URL in the clone cell below (or via the `SNAKE_REPO_URL` environment variable).
2. Run that same cell to clone the repo and install dependencies.
3. Adjust hyperparameters in the args cell if desired.
4. Run training.
5. Download the trained model.



In [1]:
# Clone the repository (set your git URL)
import os

REPO_URL = os.environ.get("SNAKE_REPO_URL", "https://github.com/SyntheticVis-Umut/Snake.git")  # change if using a fork
GITHUB_TOKEN = os.environ.get("SNAKE_GITHUB_TOKEN") or os.environ.get("GITHUB_TOKEN")

# Clone and setup
import shutil
import sys
import subprocess

if not REPO_URL:
    raise ValueError("Set REPO_URL (or env SNAKE_REPO_URL) before running this cell.")

if os.path.exists('/content/Snake'):
    shutil.rmtree('/content/Snake')

clone_url = REPO_URL
masked_url = REPO_URL
if GITHUB_TOKEN and REPO_URL.startswith("https://github.com/"):
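    # Inject token for private repos. clone_url now contains the secret, so never
    # print it directly; print masked_url instead. count=1 limits the
    # substitution to the leading URL scheme.
    clone_url = REPO_URL.replace("https://", f"https://{GITHUB_TOKEN}@", 1)
    masked_url = REPO_URL.replace("https://", "https://<TOKEN>@", 1)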
    print("Using token from env for clone.")

clone_cmd = ["git", "clone", "--depth", "1", clone_url, "/content/Snake"]
print("Clone command (token masked):", " ".join(clone_cmd).replace(clone_url, masked_url))
result = subprocess.run(clone_cmd, capture_output=True, text=True)
if result.returncode != 0:
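    # Mask the token in case git echoes the clone URL in its output
    print("git clone stdout:\n", result.stdout.replace(clone_url, masked_url))
    print("git clone stderr:\n", result.stderr.replace(clone_url, masked_url))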
    sys.exit("git clone failed; check REPO_URL, token (if private), and permissions")

os.chdir('/content/Snake')

# Install deps (Colab runtimes normally ship a CUDA-enabled torch already, so pip usually leaves it in place)
%pip install -q pygame torch numpy tqdm

print('CWD:', os.getcwd())
print('Repository cloned and ready!')


Clone command (token masked): git clone --depth 1 https://github.com/SyntheticVis-Umut/Snake.git /content/Snake
CWD: /content/Snake
Repository cloned and ready!


In [2]:
# Quick import test
from train import train
from types import SimpleNamespace
print('Imports ok')


pygame 2.6.1 (SDL 2.28.4, Python 3.12.12)
Hello from the pygame community. https://www.pygame.org/contribute.html
Imports ok


In [None]:
# Configure hyperparameters
args = SimpleNamespace(
    episodes=20000,
    max_steps=1000,
    buffer_size=100000,
    batch_size=256,
    gamma=0.995,
    lr=3e-4,
    eps_start=0.6,
    eps_end=0.01,
    eps_decay=8000,
    target_update=500,
    warmup=4000,
    grid=(20, 20),
    seed=42,
    save_path="models/dqn_snake_colab.pt",
    resume=None,  # set to a checkpoint path to continue training
    device="cuda",  # force GPU (A100 on Colab); use "auto" to fall back
    grad_clip=1.0,  # gradient clipping for stability; set <=0 to disable
    double_dqn=True,  # use Double DQN targets for better stability
    eval_every=200,  # run greedy eval every N episodes (0 disables)
    eval_episodes=20,  # number of greedy episodes per eval
)
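
To get a feel for how `eps_start`, `eps_end`, and `eps_decay` interact, here is a minimal sketch of an exponential epsilon schedule. This is an assumption for illustration only: `epsilon_at` is a hypothetical helper, and the actual schedule in `train.py` may differ (for example, it could be linear, or count episodes rather than steps).

```python
import math


def epsilon_at(step, eps_start=0.6, eps_end=0.01, eps_decay=8000):
    """Exponentially anneal epsilon from eps_start toward eps_end.

    Assumed form (not confirmed against train.py):
        eps = eps_end + (eps_start - eps_end) * exp(-step / eps_decay)
    """
    return eps_end + (eps_start - eps_end) * math.exp(-step / eps_decay)


# Print epsilon at a few points to see how fast exploration shrinks.
for step in (0, 8000, 40000):
    print(step, round(epsilon_at(step), 3))
```

Under this assumed form, `eps_decay` acts as a time constant: after `eps_decay` steps the gap between the current epsilon and `eps_end` has shrunk to about 37% of its initial size.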


In [4]:
# Train
train(args)
print('Training done, saved to', args.save_path)


Using CUDA device: NVIDIA A100-SXM4-80GB
Episode 1/10000 Reward: -2.00 Epsilon: 0.498 Best: -2.00
Episode 10/10000 Reward: -3.80 Epsilon: 0.469 Best: -2.00
Episode 20/10000 Reward: -11.90 Epsilon: 0.442 Best: -2.00
Episode 30/10000 Reward: -11.70 Epsilon: 0.419 Best: -2.00
Episode 40/10000 Reward: -10.90 Epsilon: 0.399 Best: -2.00
Episode 50/10000 Reward: -12.10 Epsilon: 0.382 Best: -2.00
Episode 60/10000 Reward: -11.40 Epsilon: 0.368 Best: -1.70
Episode 70/10000 Reward: -12.60 Epsilon: 0.353 Best: -1.70
Episode 80/10000 Reward: -11.30 Epsilon: 0.337 Best: -1.70
Episode 90/10000 Reward: -11.80 Epsilon: 0.292 Best: 3.10
Episode 100/10000 Reward: -4.80 Epsilon: 0.254 Best: 4.40
Episode 110/10000 Reward: 30.40 Epsilon: 0.214 Best: 30.40
Episode 120/10000 Reward: -3.70 Epsilon: 0.192 Best: 30.40
Episode 130/10000 Reward: -0.30 Epsilon: 0.164 Best: 30.40
Episode 140/10000 Reward: -3.60 Epsilon: 0.144 Best: 30.40
Episode 150/10000 Reward: 22.50 Epsilon: 0.119 Best: 58.20
Episode 160/10000 Re

KeyboardInterrupt: 
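
The `double_dqn=True` and `gamma` settings from the args cell control how training targets are formed. The sketch below shows the standard Double DQN target in plain Python, where the online network picks the next action and the target network evaluates it. It is illustrative only: `double_dqn_targets` is a hypothetical function, and the implementation in `train.py` (presumably written with torch tensors) may differ in detail.

```python
def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.995):
    """Compute Double DQN targets for a batch of transitions.

    rewards:        list of floats, one per transition
    dones:          list of bools, True if the episode ended on that transition
    q_online_next:  per-transition lists of online-network Q-values for the next state
    q_target_next:  per-transition lists of target-network Q-values for the next state
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        # Action selection uses the online network...
        best_action = max(range(len(q_on)), key=q_on.__getitem__)
        # ...while its value is read from the target network.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


batch_targets = double_dqn_targets(
    rewards=[1.0, -1.0],
    dones=[False, True],
    q_online_next=[[0.2, 0.9], [0.5, 0.1]],
    q_target_next=[[1.0, 2.0], [3.0, 4.0]],
)
print(batch_targets)
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of vanilla DQN, which takes the maximum over the target network's own estimates.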

In [None]:
# Download the trained model (Colab)
from google.colab import files
files.download(args.save_path)
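
If training was interrupted, the checkpoint may not have been written, and the download call would then fail on a missing file. A minimal sketch of a pre-download check follows. `checkpoint_size_bytes` is a hypothetical helper, and whether `train` saves a checkpoint before an interrupt is not something this notebook confirms.

```python
import os
import tempfile


def checkpoint_size_bytes(path):
    """Return the checkpoint size in bytes, raising if it is missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No checkpoint at {path!r}; did training save one?")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"Checkpoint at {path!r} is empty")
    return size


# Demonstration with a throwaway file standing in for a real checkpoint.
with tempfile.NamedTemporaryFile(suffix=".pt", delete=False) as tmp:
    tmp.write(b"\0" * 2048)
print(checkpoint_size_bytes(tmp.name), "bytes")
os.remove(tmp.name)
```

In the notebook, calling `checkpoint_size_bytes(args.save_path)` before `files.download(args.save_path)` would surface a missing file with a clearer message.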

