[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/your-username/nautilus/blob/main/notebooks/colab_cartpole.ipynb)

# Nautilus PPO on CartPole (Google Colab)

This notebook shows a minimal end-to-end run of Nautilus PPO on CartPole using a Colab GPU.
It installs dependencies, clones the repo, and runs the PPO runner with TensorBoard logging.

## 1. Runtime setup
- In Colab, set runtime to GPU (Runtime → Change runtime type → GPU).
- Restart runtime after installing if prompted.

In [None]:
# Check GPU availability
!nvidia-smi

## 2. Install Nautilus
We install directly from the GitHub repo. If you forked it, replace the URL with your fork's.
The install also pulls in PyTorch if a compatible version is not already present; Colab's default image already ships the CUDA drivers.

In [None]:
%%bash
set -euo pipefail
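# Install Nautilus with dev extras (includes pytest, ruff).
# Uses the PEP 508 direct-reference form; recent pip releases deprecate
# extras inside an '#egg=' fragment.
pip install --upgrade pip
pip install 'nautilus[dev] @ git+https://github.com/acb-code/nautilus.git'

# Install wandb for tracking (needed only if you keep --track in the training cell)
pip install wandb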

## 3. Quick sanity import
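Verify that PyTorch and Nautilus both import, and check whether PyTorch can see the GPU.

In [None]:
import torch

import nautilus  # noqa: F401  (raises ImportError if the install step failed)

print("PyTorch:", torch.__version__, "CUDA available:", torch.cuda.is_available())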

## 4. Train PPO on CartPole
This runs the PPO runner for a small number of steps to keep Colab fast.
Adjust `--total-steps` / `--num-envs` to experiment.

In [None]:
%%bash
python -m nautilus.runners.ppo_runner \
  --env-id CartPole-v1 \
  --total-steps 50000 \
  --num-envs 4 \
  --lr 3e-4 \
  --seed 1 \
  --track \
  --wandb-project-name nautilus-colab-demo
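To experiment with the flags above from a Python cell instead of `%%bash`, the command can be assembled programmatically. The sketch below is illustrative only: `build_ppo_command` and `run_seeds` are hypothetical helpers, not part of Nautilus, and they use only the flags shown in the cell above. With `dry_run=True` (the default here) the commands are printed and nothing is launched.

```python
import shlex
import subprocess
import sys


def build_ppo_command(env_id="CartPole-v1", total_steps=50_000, num_envs=4,
                      lr=3e-4, seed=1, track=False, wandb_project=None):
    """Build the argument list for the PPO runner (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "nautilus.runners.ppo_runner",
        "--env-id", env_id,
        "--total-steps", str(total_steps),
        "--num-envs", str(num_envs),
        "--lr", str(lr),
        "--seed", str(seed),
    ]
    if track:
        # Only request WandB tracking when explicitly asked for.
        cmd.append("--track")
        if wandb_project:
            cmd += ["--wandb-project-name", wandb_project]
    return cmd


def run_seeds(seeds, dry_run=True, **kwargs):
    """Print (and optionally launch) one training run per seed."""
    commands = [build_ppo_command(seed=s, **kwargs) for s in seeds]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stops at the first failing run
    return commands


# Dry run: show the three commands without starting any training.
commands = run_seeds([1, 2, 3], total_steps=20_000)
```

Passing `dry_run=False` runs the seeds one after another, which keeps GPU memory use on a single Colab instance predictable.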

## 5. View logs
Launch TensorBoard in Colab. The dashboard renders inline below the cell and reads event files from `runs/`.

In [None]:
%load_ext tensorboard
%tensorboard --logdir runs
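If the dashboard comes up empty, it can help to confirm that event files actually exist under the log directory. The following is a small sketch using only the standard library; `find_event_files` is a hypothetical helper, and it assumes the runner writes standard TensorBoard event files (named `events.out.tfevents.*`) somewhere under `runs/`.

```python
from pathlib import Path


def find_event_files(logdir="runs"):
    """Map each run directory (relative to logdir) to its event file names."""
    root = Path(logdir)
    if not root.is_dir():
        return {}
    found = {}
    for path in sorted(root.rglob("events.out.tfevents.*")):
        run_name = str(path.parent.relative_to(root))
        found.setdefault(run_name, []).append(path.name)
    return found


# Prints nothing if training has not written any logs yet.
for run, files in find_event_files("runs").items():
    print(f"{run}: {len(files)} event file(s)")
```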

## 6. Notes
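- The training cell passes `--track`, which enables WandB. Run `wandb.login()` in a cell before training and paste your API key when prompted, or remove `--track` and `--wandb-project-name` to skip WandB.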
- Increase `--total-steps` for longer training or switch `--env-id` to try other Gymnasium envs.
- The runner writes checkpoints to `checkpoints/` if `save_interval` is reached.
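Because a `%%bash` cell cannot answer an interactive login prompt, a quick pre-flight check can decide whether to pass `--track` at all. This is a rough sketch under stated assumptions: `wandb_can_run_unattended` is a hypothetical helper, and it relies on WandB's documented `WANDB_API_KEY` and `WANDB_MODE` environment variables plus the `api.wandb.ai` entry that `wandb.login()` normally stores in `~/.netrc`. Other credential locations are not checked.

```python
import netrc
import os


def wandb_can_run_unattended(host="api.wandb.ai"):
    """Return True if WandB should start without prompting for a key."""
    # Offline/disabled modes never contact the server, so no key is needed.
    if os.environ.get("WANDB_MODE") in {"offline", "disabled"}:
        return True
    if os.environ.get("WANDB_API_KEY"):
        return True
    try:
        # Look for credentials saved by a previous login.
        return netrc.netrc().authenticators(host) is not None
    except (FileNotFoundError, netrc.NetrcParseError):
        return False


# Only add the tracking flags when no login prompt is expected.
extra_flags = (
    ["--track", "--wandb-project-name", "nautilus-colab-demo"]
    if wandb_can_run_unattended()
    else []
)
print(extra_flags)
```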