From e1bae525463ed59656fd7418cad9fb2d783875f9 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Sun, 19 Oct 2025 20:36:36 -0700
Subject: [PATCH 01/19] Add OpenEnv tutorial notebook and Colab badge

---
 README.md                       |   2 +
 examples/OpenEnv_Tutorial.ipynb | 853 ++++++++++++++++++++++++++++++++
 2 files changed, 855 insertions(+)
 create mode 100644 examples/OpenEnv_Tutorial.ipynb

diff --git a/README.md b/README.md
index 5169f40..a078900 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,8 @@
 
 An e2e framework for creating, deploying and using isolated execution environments for agentic RL training, built using Gymnasium style simple APIs.
 
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/meta-pytorch/OpenEnv/blob/main/examples/OpenEnv_Tutorial.ipynb) **← Try the Interactive Tutorial!**
+
 ## Overview
 
 OpenEnv provides a standard for interacting with agentic execution environments via simple Gymnasium style APIs - step(), reset(), state(). Users of agentic execution environments can interact with the environment during RL training loops using these simple APIs.
diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
new file mode 100644
index 0000000..db4ddc8
--- /dev/null
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -0,0 +1,853 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# OpenEnv: Production-Ready RL Environments\n",
+    "\n",
+    "**Learn how OpenEnv standardizes RL environments for production use**\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## What You'll Learn\n",
+    "\n",
+    "This notebook teaches you:\n",
+    "\n",
+    "1. **RL Fundamentals** - The core loop in 5 minutes\n",
+    "2. **OpenEnv Framework** - Why we built it and how it works\n",
+    "3. **Using Integrations** - Work with existing environments (OpenSpiel example)\n",
+    "4. **Interactive Demo** - See policies in action\n",
+    "5. **Adding Integrations** - Wrap your own environments\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Part 1: RL Fundamentals - The Core Loop\n",
+    "\n",
+    "Reinforcement Learning boils down to a simple loop:\n",
+    "\n",
+    "```\n",
+    "Agent observes → chooses action → gets reward → repeat\n",
+    "```\n",
+    "\n",
+    "Let's see it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import random\n",
+    "\n",
+    "# Simple RL: Guess a number\n",
+    "target = random.randint(1, 10)\n",
+    "guesses = 3\n",
+    "\n",
+    "print(\"🎯 Guess a number (1-10)\\n\")\n",
+    "\n",
+    "while guesses > 0:\n",
+    "    guess = random.randint(1, 10)  # Policy: random\n",
+    "    guesses -= 1\n",
+    "    \n",
+    "    print(f\"Guess: {guess}\", end=\" → \")\n",
+    "    \n",
+    "    if guess == target:\n",
+    "        print(\"🎉 Correct! Reward: +1\")\n",
+    "        break\n",
+    "    elif abs(guess - target) <= 2:\n",
+    "        print(\"🔥 Warm\")\n",
+    "    else:\n",
+    "        print(\"❄️ Cold\")\n",
+    "else:\n",
+    "    print(f\"\\nIt was {target}. Reward: 0\")\n",
+    "\n",
+    "print(\"\\n💡 That's RL: observe → act → reward → repeat\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**The Problem**: How do we make this production-ready?\n",
+    "- Need type safety\n",
+    "- Need isolation\n",
+    "- Need deployment\n",
+    "- Need standardization\n",
+    "\n",
+    "**Enter OpenEnv.**\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Part 2: OpenEnv - The Framework\n",
+    "\n",
+    "### What is OpenEnv?\n",
+    "\n",
+    "OpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n",
+    "\n",
+    "Think \"Docker for RL environments\" with:\n",
+    "- ✅ Standardized API (reset, step, state)\n",
+    "- ✅ Type-safe dataclasses\n",
+    "- ✅ Docker isolation\n",
+    "- ✅ HTTP communication (language-agnostic)\n",
+    "- ✅ Production-ready deployment\n",
+    "\n",
+    "### The Architecture\n",
+    "\n",
+    "```\n",
+    "┌────────────────────────────────────┐\n",
+    "│  Your Training Code                │  Python, Rust, Julia...\n",
+    "│                                    │\n",
+    "│  env = SomeEnv(...)                │  ← Import OpenEnv client\n",
+    "│  result = env.reset()              │  ← Type-safe!\n",
+    "│  result = env.step(action)         │  ← Type-safe!\n",
+    "└──────────┬─────────────────────────┘\n",
+    "           │\n",
+    "           │ HTTP/JSON\n",
+    "           │\n",
+    "┌──────────▼─────────────────────────┐\n",
+    "│  Docker Container                  │\n",
+    "│                                    │\n",
+    "│  FastAPI Server                    │\n",
+    "│  └─ Environment Logic              │\n",
+    "│     └─ Your game/simulation        │\n",
+    "└────────────────────────────────────┘\n",
+    "```\n",
+    "\n",
+    "### The Pattern - Every Environment Has:\n",
+    "\n",
+    "```\n",
+    "src/envs/your_env/\n",
+    "├── models.py         ← Type-safe contracts (Action, Observation, State)\n",
+    "├── client.py         ← Client API (what you import)\n",
+    "└── server/\n",
+    "    ├── environment.py ← Environment logic\n",
+    "    ├── app.py         ← FastAPI server\n",
+    "    └── Dockerfile     ← Container\n",
+    "```\n",
+    "\n",
+    "### Current Integrations\n",
+    "\n",
+    "OpenEnv already integrates several environments:\n",
+    "- **OpenSpiel** (6 games from DeepMind)\n",
+    "- **Echo** (test environment)\n",
+    "- **Coding** (Python code execution)\n",
+    "- **Atari** (classic games)\n",
+    "- More coming!\n",
+    "\n",
+    "Let's explore one integration to see how it all works..."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 3: Setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Check if in Colab\n",
+    "try:\n",
+    "    import google.colab\n",
+    "    IN_COLAB = True\n",
+    "except ImportError:\n",
+    "    IN_COLAB = False\n",
+    "\n",
+    "if IN_COLAB:\n",
+    "    !git clone https://github.com/meta-pytorch/OpenEnv.git\n",
+    "    %cd OpenEnv\n",
+    "    !pip install -q fastapi uvicorn requests\n",
+    "    import sys\n",
+    "    sys.path.insert(0, './src')\n",
+    "    print(\"✅ OpenEnv ready!\")\n",
+    "else:\n",
+    "    import sys\n",
+    "    from pathlib import Path\n",
+    "    sys.path[:0] = [str(d / 'src') for d in (Path.cwd(), Path.cwd().parent)]  # repo root or examples/\n",
+    "    print(\"✅ Using local OpenEnv\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 4: Exploring OpenEnv's Structure\n",
+    "\n",
+    "Let's look at the actual OpenEnv code to understand how it works.\n",
+    "\n",
+    "### The Base Classes"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from core.env_server import Environment, Action, Observation, State\n",
+    "from core.http_env_client import HTTPEnvClient\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"OpenEnv Core Abstractions\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "print(\"\"\"\n",
+    "SERVER SIDE (runs in Docker):\n",
+    "\n",
+    "  class Environment(ABC):\n",
+    "      '''Base class for all environment implementations'''\n",
+    "      \n",
+    "      @abstractmethod\n",
+    "      def reset(self) -> Observation:\n",
+    "          '''Start new episode'''\n",
+    "      \n",
+    "      @abstractmethod\n",
+    "      def step(self, action: Action) -> Observation:\n",
+    "          '''Execute action'''\n",
+    "      \n",
+    "      @property\n",
+    "      def state(self) -> State:\n",
+    "          '''Episode metadata'''\n",
+    "\n",
+    "CLIENT SIDE (your training code):\n",
+    "\n",
+    "  class HTTPEnvClient(ABC):\n",
+    "      '''Base class for HTTP clients'''\n",
+    "      \n",
+    "      def reset(self) -> StepResult:\n",
+    "          # HTTP POST to /reset\n",
+    "      \n",
+    "      def step(self, action) -> StepResult:\n",
+    "          # HTTP POST to /step\n",
+    "      \n",
+    "      def state(self) -> State:\n",
+    "          # HTTP GET to /state\n",
+    "\"\"\")\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"💡 Same interface, communication via HTTP\")\n",
+    "print(\"=\" * 70)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 5: Example Integration - OpenSpiel\n",
+    "\n",
+    "### What is OpenSpiel?\n",
+    "\n",
+    "OpenSpiel is a **library from DeepMind** with 70+ game environments for RL research.\n",
+    "\n",
+    "### Our Integration\n",
+    "\n",
+    "**OpenEnv wraps 6 OpenSpiel games** following our standard pattern:\n",
+    "\n",
+    "1. **Catch** - Catch falling ball (single-player)\n",
+    "2. **Tic-Tac-Toe** - Classic 3×3 (2-player)\n",
+    "3. **Kuhn Poker** - Imperfect info poker (2-player)\n",
+    "4. **Cliff Walking** - Grid navigation (single-player)\n",
+    "5. **2048** - Tile puzzle (single-player)\n",
+    "6. **Blackjack** - Card game (single-player)\n",
+    "\n",
+    "Let's see how the integration is structured:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Import the OpenSpiel integration models\n",
+    "from envs.openspiel_env.models import (\n",
+    "    OpenSpielAction,\n",
+    "    OpenSpielObservation,\n",
+    "    OpenSpielState\n",
+    ")\n",
+    "from dataclasses import fields\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"OpenSpiel Integration - Type-Safe Models\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "print(\"\\n📤 OpenSpielAction (what you send):\")\n",
+    "for field in fields(OpenSpielAction):\n",
+    "    print(f\"   • {field.name}: {field.type}\")\n",
+    "\n",
+    "print(\"\\n📥 OpenSpielObservation (what you receive):\")\n",
+    "for field in fields(OpenSpielObservation):\n",
+    "    print(f\"   • {field.name}: {field.type}\")\n",
+    "\n",
+    "print(\"\\n📊 OpenSpielState (episode metadata):\")\n",
+    "for field in fields(OpenSpielState):\n",
+    "    print(f\"   • {field.name}: {field.type}\")\n",
+    "\n",
+    "print(\"\\n\" + \"=\" * 70)\n",
+    "print(\"💡 This is how OpenEnv integrates external libraries:\")\n",
+    "print(\"   1. Wrap in standardized types\")\n",
+    "print(\"   2. Expose via HTTPEnvClient\")\n",
+    "print(\"   3. Package in Docker\")\n",
+    "print(\"=\" * 70)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### How the Client Works"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from envs.openspiel_env.client import OpenSpielEnv\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"OpenSpielEnv Client (HTTPEnvClient Implementation)\")\n",
+    "print(\"=\" * 70)\n",
+    "\n",
+    "print(\"\"\"\n",
+    "How OpenEnv wraps OpenSpiel:\n",
+    "\n",
+    "class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n",
+    "    \n",
+    "    def _step_payload(self, action: OpenSpielAction) -> dict:\n",
+    "        '''Convert action to JSON for HTTP request'''\n",
+    "        return {\n",
+    "            \"action_id\": action.action_id,\n",
+    "            \"game_name\": action.game_name,\n",
+    "        }\n",
+    "    \n",
+    "    def _parse_result(self, payload: dict) -> StepResult:\n",
+    "        '''Parse HTTP response into typed observation'''\n",
+    "        return StepResult(\n",
+    "            observation=OpenSpielObservation(...),\n",
+    "            reward=payload['reward'],\n",
+    "            done=payload['done']\n",
+    "        )\n",
+    "\n",
+    "Usage (same for ALL OpenEnv environments):\n",
+    "\n",
+    "  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "  result = env.reset()  # Returns StepResult[OpenSpielObservation]\n",
+    "  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
+    "  state = env.state()   # Returns OpenSpielState\n",
+    "\"\"\")\n",
+    "\n",
+    "print(\"=\" * 70)\n",
+    "print(\"💡 This pattern works for ANY environment you want to wrap!\")\n",
+    "print(\"=\" * 70)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 6: Interactive Demo - See It In Action\n",
+    "\n",
+    "Let's build a **Catch game** environment following OpenEnv's pattern.\n",
+    "\n",
+    "This shows you:\n",
+    "- How to structure an environment\n",
+    "- How the RL loop works\n",
+    "- How different policies perform\n",
+    "\n",
+    "### The Game:\n",
+    "- 5×5 grid, ball falls from top 🔴\n",
+    "- Control paddle at bottom 🏓\n",
+    "- **Actions**: 0=LEFT, 1=STAY, 2=RIGHT\n",
+    "- **Reward**: +1 caught, 0 missed"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import random\n",
+    "from dataclasses import dataclass\n",
+    "from typing import List, Tuple\n",
+    "\n",
+    "# Define types (following OpenEnv pattern)\n",
+    "@dataclass\n",
+    "class CatchObservation:\n",
+    "    \"\"\"Type-safe observation.\"\"\"\n",
+    "    info_state: List[float]\n",
+    "    legal_actions: List[int]\n",
+    "    done: bool\n",
+    "    reward: float\n",
+    "    # For visualization\n",
+    "    ball_position: Tuple[int, int]\n",
+    "    paddle_position: int\n",
+    "\n",
+    "\n",
+    "class CatchEnvironment:\n",
+    "    \"\"\"\n",
+    "    Catch game following OpenEnv Environment pattern.\n",
+    "    \n",
+    "    In production: This would run in Docker, accessed via HTTPEnvClient\n",
+    "    For demo: We run it locally to see the internals\n",
+    "    \"\"\"\n",
+    "    \n",
+    "    def __init__(self, grid_size=5):\n",
+    "        self.grid_size = grid_size\n",
+    "    \n",
+    "    def reset(self) -> CatchObservation:\n",
+    "        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n",
+    "        self.ball_row = 0\n",
+    "        self.ball_col = random.randint(0, self.grid_size - 1)\n",
+    "        self.paddle_col = self.grid_size // 2\n",
+    "        self.done = False\n",
+    "        return self._make_observation()\n",
+    "    \n",
+    "    def step(self, action: int) -> CatchObservation:\n",
+    "        \"\"\"Execute action (implements Environment.step()).\"\"\"\n",
+    "        # Move paddle\n",
+    "        if action == 0 and self.paddle_col > 0:\n",
+    "            self.paddle_col -= 1\n",
+    "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
+    "            self.paddle_col += 1\n",
+    "        \n",
+    "        # Move ball\n",
+    "        self.ball_row += 1\n",
+    "        \n",
+    "        # Check done\n",
+    "        if self.ball_row >= self.grid_size - 1:\n",
+    "            self.done = True\n",
+    "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
+    "        else:\n",
+    "            reward = 0.0\n",
+    "        \n",
+    "        return self._make_observation(reward)\n",
+    "    \n",
+    "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
+    "        \"\"\"Create type-safe observation.\"\"\"\n",
+    "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
+    "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
+    "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
+    "        info_state[ball_idx] = 1.0\n",
+    "        info_state[paddle_idx] = 0.5\n",
+    "        \n",
+    "        return CatchObservation(\n",
+    "            info_state=info_state,\n",
+    "            legal_actions=[0, 1, 2],\n",
+    "            done=self.done,\n",
+    "            reward=reward,\n",
+    "            ball_position=(self.ball_row, self.ball_col),\n",
+    "            paddle_position=self.paddle_col\n",
+    "        )\n",
+    "    \n",
+    "    def render(self):\n",
+    "        \"\"\"Visualize.\"\"\"\n",
+    "        for row in range(self.grid_size):\n",
+    "            line = \"  \"\n",
+    "            for col in range(self.grid_size):\n",
+    "                if row == self.ball_row and col == self.ball_col:\n",
+    "                    line += \"🔴 \"\n",
+    "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
+    "                    line += \"🏓 \"\n",
+    "                else:\n",
+    "                    line += \"⬜ \"\n",
+    "            print(line)\n",
+    "\n",
+    "\n",
+    "print(\"✅ Environment created following OpenEnv pattern!\")\n",
+    "print(\"\\n   Implements: reset(), step()\")\n",
+    "print(\"   Returns: Type-safe observations\")\n",
+    "print(\"   In production: Would run in Docker + FastAPI\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Test It"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "env = CatchEnvironment()\n",
+    "obs = env.reset()\n",
+    "\n",
+    "print(\"Initial State:\")\n",
+    "print(\"=\" * 50)\n",
+    "env.render()\n",
+    "print(f\"\\nBall: column {obs.ball_position[1]}\")\n",
+    "print(f\"Paddle: column {obs.paddle_position}\")\n",
+    "print(f\"Legal actions: {obs.legal_actions} (0=LEFT, 1=STAY, 2=RIGHT)\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 7: Different Policies\n",
+    "\n",
+    "A policy maps observations → actions. Let's test 4 strategies:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class RandomPolicy:\n",
+    "    name = \"Random\"\n",
+    "    def select_action(self, obs): \n",
+    "        return random.choice(obs.legal_actions)\n",
+    "\n",
+    "class AlwaysStayPolicy:\n",
+    "    name = \"Always Stay\"\n",
+    "    def select_action(self, obs): \n",
+    "        return 1\n",
+    "\n",
+    "class SmartPolicy:\n",
+    "    name = \"Smart Heuristic\"\n",
+    "    def select_action(self, obs):\n",
+    "        ball_col = obs.ball_position[1]\n",
+    "        paddle_col = obs.paddle_position\n",
+    "        if paddle_col < ball_col: return 2  # RIGHT\n",
+    "        elif paddle_col > ball_col: return 0  # LEFT\n",
+    "        else: return 1  # STAY\n",
+    "\n",
+    "class LearningPolicy:\n",
+    "    name = \"Learning Agent\"\n",
+    "    def __init__(self):\n",
+    "        self.steps = 0\n",
+    "    \n",
+    "    def select_action(self, obs):\n",
+    "        self.steps += 1\n",
+    "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
+    "        \n",
+    "        if random.random() < epsilon:  # Explore\n",
+    "            return random.choice(obs.legal_actions)\n",
+    "        else:  # Exploit\n",
+    "            ball_col = obs.ball_position[1]\n",
+    "            paddle_col = obs.paddle_position\n",
+    "            if paddle_col < ball_col: return 2\n",
+    "            elif paddle_col > ball_col: return 0\n",
+    "            else: return 1\n",
+    "\n",
+    "print(\"✅ 4 Policies created:\")\n",
+    "print(\"   1. Random - Baseline\")\n",
+    "print(\"   2. Always Stay - Bad strategy\")\n",
+    "print(\"   3. Smart - Optimal heuristic\")\n",
+    "print(\"   4. Learning - Epsilon-greedy over the heuristic (simulated RL)\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Watch Them Play"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import time\n",
+    "\n",
+    "def run_episode(env, policy, visualize=True, delay=0.4):\n",
+    "    obs = env.reset()\n",
+    "    \n",
+    "    if visualize:\n",
+    "        print(f\"\\n{'='*50}\")\n",
+    "        print(f\"Policy: {policy.name} | Ball: col {obs.ball_position[1]}\")\n",
+    "        print('='*50 + '\\n')\n",
+    "        env.render()\n",
+    "        time.sleep(delay)\n",
+    "    \n",
+    "    total_reward = 0\n",
+    "    step = 0\n",
+    "    \n",
+    "    while not obs.done:\n",
+    "        action = policy.select_action(obs)\n",
+    "        obs = env.step(action)\n",
+    "        total_reward += obs.reward\n",
+    "        \n",
+    "        if visualize:\n",
+    "            print(f\"\\nStep {step + 1}: {['LEFT','STAY','RIGHT'][action]}\")\n",
+    "            env.render()\n",
+    "            time.sleep(delay)\n",
+    "        \n",
+    "        step += 1\n",
+    "    \n",
+    "    if visualize:\n",
+    "        print(f\"\\n{'🎉 CAUGHT!' if total_reward > 0 else '😢 MISSED'} Reward: {total_reward}\")\n",
+    "    \n",
+    "    return total_reward > 0\n",
+    "\n",
+    "# Demo\n",
+    "env = CatchEnvironment()\n",
+    "run_episode(env, SmartPolicy(), visualize=True, delay=0.3)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Compare All Policies"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def evaluate_policies(num_episodes=50):\n",
+    "    policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
+    "    \n",
+    "    print(\"\\n\" + \"=\"*70)\n",
+    "    print(f\"🏆 POLICY COMPARISON ({num_episodes} episodes)\")\n",
+    "    print(\"=\"*70 + \"\\n\")\n",
+    "    \n",
+    "    results = []\n",
+    "    for policy in policies:\n",
+    "        env = CatchEnvironment()\n",
+    "        successes = sum(run_episode(env, policy, visualize=False) \n",
+    "                       for _ in range(num_episodes))\n",
+    "        rate = (successes / num_episodes) * 100\n",
+    "        results.append((policy.name, rate))\n",
+    "        print(f\"{policy.name:20s}: {rate:5.1f}%\")\n",
+    "    \n",
+    "    print(\"\\n\" + \"=\"*70)\n",
+    "    results.sort(key=lambda x: x[1], reverse=True)\n",
+    "    for name, rate in results:\n",
+    "        bar = \"█\" * int(rate / 2)\n",
+    "        print(f\"{name:20s} [{bar:<50}] {rate:.1f}%\")\n",
+    "    \n",
+    "    print(\"\\n\" + \"=\"*70)\n",
+    "    print(\"💡 RL in action: Random → Learning → Optimal\")\n",
+    "    print(\"=\"*70)\n",
+    "\n",
+    "evaluate_policies(50)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 8: Using Real OpenSpiel Integration\n",
+    "\n",
+    "What we just built **follows the same reset/step pattern as a real OpenEnv environment**!\n",
+    "\n",
+    "### Demo vs Production:\n",
+    "\n",
+    "| Component | Our Demo | OpenEnv + OpenSpiel |\n",
+    "|-----------|----------|---------------------|\n",
+    "| Environment | Local class | Docker container |\n",
+    "| Communication | Direct | HTTP |\n",
+    "| Client | Direct | HTTPEnvClient |\n",
+    "| Type Safety | ✅ | ✅ |\n",
+    "| API | reset/step | reset/step |\n",
+    "\n",
+    "### Using OpenSpiel Integration:\n",
+    "\n",
+    "```python\n",
+    "# Install OpenSpiel\n",
+    "!pip install open_spiel\n",
+    "\n",
+    "# Import OpenEnv's integration\n",
+    "from envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n",
+    "\n",
+    "# Connect to server\n",
+    "env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "\n",
+    "# Same API!\n",
+    "result = env.reset()\n",
+    "result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
+    "state = env.state()\n",
+    "```\n",
+    "\n",
+    "### Available Games:\n",
+    "1. Catch (what we demoed!)\n",
+    "2. Tic-Tac-Toe\n",
+    "3. Kuhn Poker\n",
+    "4. Cliff Walking\n",
+    "5. 2048\n",
+    "6. Blackjack"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Part 9: Adding Your Own Integration\n",
+    "\n",
+    "Want to wrap your own environment? Follow the pattern:\n",
+    "\n",
+    "### 1. Define Types (models.py)\n",
+    "```python\n",
+    "@dataclass\n",
+    "class YourAction(Action):\n",
+    "    ...  # Your action fields\n",
+    "\n",
+    "@dataclass\n",
+    "class YourObservation(Observation):\n",
+    "    ...  # Your observation fields\n",
+    "```\n",
+    "\n",
+    "### 2. Implement Environment (server/environment.py)\n",
+    "```python\n",
+    "class YourEnvironment(Environment):\n",
+    "    def reset(self) -> Observation:\n",
+    "        return YourObservation(...)\n",
+    "    \n",
+    "    def step(self, action: Action) -> Observation:\n",
+    "        return YourObservation(...)\n",
+    "```\n",
+    "\n",
+    "### 3. Create Client (client.py)\n",
+    "```python\n",
+    "class YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n",
+    "    def _step_payload(self, action):\n",
+    "        return {\"field\": action.field}\n",
+    "    \n",
+    "    def _parse_result(self, payload):\n",
+    "        return StepResult(observation=YourObservation(...))\n",
+    "```\n",
+    "\n",
+    "### 4. Create Server (server/app.py)\n",
+    "```python\n",
+    "from core.env_server import create_fastapi_app\n",
+    "\n",
+    "env = YourEnvironment()\n",
+    "app = create_fastapi_app(env)\n",
+    "```\n",
+    "\n",
+    "### 5. Dockerize (server/Dockerfile)\n",
+    "```dockerfile\n",
+    "FROM python:3.11\n",
+    "COPY . /app\n",
+    "WORKDIR /app\n",
+    "RUN pip install -r requirements.txt\n",
+    "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\"]\n",
+    "```\n",
+    "\n",
+    "### Examples to Study:\n",
+    "- `src/envs/echo_env/` - Simple test environment\n",
+    "- `src/envs/openspiel_env/` - Our OpenSpiel integration\n",
+    "- `src/envs/coding_env/` - Python code execution"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Summary\n",
+    "\n",
+    "### What You Learned:\n",
+    "\n",
+    "1. **RL Basics** - The core loop\n",
+    "2. **OpenEnv Framework** - Standardized, production-ready RL environments\n",
+    "3. **Example Integration** - How OpenSpiel is wrapped\n",
+    "4. **Interactive Demo** - Policies in action\n",
+    "5. **Adding Integrations** - The pattern to follow\n",
+    "\n",
+    "### OpenEnv's Value:\n",
+    "\n",
+    "| Feature | Traditional | OpenEnv |\n",
+    "|---------|------------|----------|\n",
+    "| Type Safety | ❌ | ✅ |\n",
+    "| Isolation | ❌ | ✅ Docker |\n",
+    "| Deployment | ❌ | ✅ K8s-ready |\n",
+    "| Language | Python only | Any (HTTP) |\n",
+    "| Reproducibility | ❌ | ✅ |\n",
+    "\n",
+    "### Next Steps:\n",
+    "\n",
+    "1. Try OpenSpiel integration\n",
+    "2. Implement real RL (Q-learning, DQN, PPO)\n",
+    "3. Wrap your own environments\n",
+    "4. Deploy to production\n",
+    "5. Use with RL libraries (TorchRL, etc.)\n",
+    "\n",
+    "### Resources:\n",
+    "\n",
+    "- **OpenEnv**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **Docs**: `src/envs/README.md`\n",
+    "- **Examples**: `examples/` directory\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## 🎉 You're Ready!\n",
+    "\n",
+    "You now understand:\n",
+    "- ✅ OpenEnv framework\n",
+    "- ✅ How integrations work\n",
+    "- ✅ Using existing environments\n",
+    "- ✅ Creating new integrations\n",
+    "- ✅ Production deployment\n",
+    "\n",
+    "**Welcome to production-ready RL!** 🚀"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}

From 5f45df525bc82e9ec0bdb60fb892439f181c95b8 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:35:51 -0700
Subject: [PATCH 02/19] Improve tutorial notebook presentation

---
 examples/OpenEnv_Tutorial.ipynb | 573 ++++++++++++++++++++------------
 1 file changed, 366 insertions(+), 207 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index db4ddc8..5079472 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -4,21 +4,33 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# OpenEnv: Production-Ready RL Environments\n",
+    "<div align=\"center\">\n",
+    "\n",
+    "# 🎮 OpenEnv: Production-Ready RL Environments\n",
+    "\n",
+    "<img src=\"https://github.com/user-attachments/assets/2700a971-e5d6-4036-b03f-2f89c9791609\" width=\"100\" />\n",
     "\n",
     "**Learn how OpenEnv standardizes RL environments for production use**\n",
     "\n",
-    "---\n",
+    "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
+    "[![Python](https://img.shields.io/badge/Python-3.11+-blue?logo=python)](https://www.python.org/)\n",
+    "[![Docker](https://img.shields.io/badge/Docker-Ready-blue?logo=docker)](https://www.docker.com/)\n",
     "\n",
-    "## What You'll Learn\n",
+    "</div>\n",
     "\n",
-    "This notebook teaches you:\n",
+    "---\n",
+    "\n",
+    "## 📚 What You'll Learn\n",
     "\n",
-    "1. **RL Fundamentals** - The core loop in 5 minutes\n",
-    "2. **OpenEnv Framework** - Why we built it and how it works\n",
-    "3. **Using Integrations** - Work with existing environments (OpenSpiel example)\n",
-    "4. **Interactive Demo** - See policies in action\n",
-    "5. **Adding Integrations** - Wrap your own environments\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"20%\" align=\"center\">🧠<br><b>RL Fundamentals</b><br><sub>5 minutes</sub></td>\n",
+    "<td width=\"20%\" align=\"center\">🏗️<br><b>OpenEnv Framework</b><br><sub>Architecture</sub></td>\n",
+    "<td width=\"20%\" align=\"center\">🔌<br><b>Integrations</b><br><sub>OpenSpiel example</sub></td>\n",
+    "<td width=\"20%\" align=\"center\">🎯<br><b>Interactive Demo</b><br><sub>See it work</sub></td>\n",
+    "<td width=\"20%\" align=\"center\">➕<br><b>Add Your Own</b><br><sub>Extend it</sub></td>\n",
+    "</tr>\n",
+    "</table>\n",
     "\n",
     "---"
    ]
@@ -27,7 +39,23 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Part 1: RL Fundamentals - The Core Loop\n",
+    "## 🧠 Part 1: RL Fundamentals - The Core Loop\n",
+    "\n",
+    "<div align=\"center\">\n",
+    "\n",
+    "```mermaid\n",
+    "graph LR\n",
+    "    A[🤖 Agent] -->|observes| B[👀 State]\n",
+    "    B -->|decides| C[⚡ Action]\n",
+    "    C -->|executes| D[🌍 Environment]\n",
+    "    D -->|returns| E[🎁 Reward]\n",
+    "    E -->|learns| A\n",
+    "    style A fill:#e1f5ff\n",
+    "    style D fill:#fff4e1\n",
+    "    style E fill:#ffe1e1\n",
+    "```\n",
+    "\n",
+    "</div>\n",
     "\n",
     "Reinforcement Learning boils down to a simple loop:\n",
     "\n",
@@ -35,7 +63,7 @@
     "Agent observes → chooses action → gets reward → repeat\n",
     "```\n",
     "\n",
-    "Let's see it:"
+    "Let's see it in action with a simple example:"
    ]
   },
   {
@@ -75,13 +103,21 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "**The Problem**: How do we make this production-ready?\n",
-    "- Need type safety\n",
-    "- Need isolation\n",
-    "- Need deployment\n",
-    "- Need standardization\n",
-    "\n",
-    "**Enter OpenEnv.**\n",
+    "<div style=\"background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;\">\n",
+    "    <h3 style=\"margin-top: 0;\">⚠️ The Problem</h3>\n",
+    "    <p>How do we make this production-ready?</p>\n",
+    "    <ul>\n",
+    "        <li>❌ Need type safety</li>\n",
+    "        <li>❌ Need isolation</li>\n",
+    "        <li>❌ Need deployment</li>\n",
+    "        <li>❌ Need standardization</li>\n",
+    "    </ul>\n",
+    "</div>\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;\">\n",
+    "    <h3 style=\"margin-top: 0;\">✅ The Solution: OpenEnv</h3>\n",
+    "    <p>A framework that addresses each of these: typed dataclasses, Docker isolation, an HTTP server for deployment, and a standard reset/step/state API.</p>\n",
+    "</div>\n",
     "\n",
     "---"
    ]
@@ -90,72 +126,104 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Part 2: OpenEnv - The Framework\n",
-    "\n",
-    "### What is OpenEnv?\n",
+    "## 🏗️ Part 2: OpenEnv - The Framework\n",
     "\n",
-    "OpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n",
+    "<div align=\"center\">\n",
+    "    <h3>🚀 Think \"Docker for RL Environments\"</h3>\n",
+    "</div>\n",
     "\n",
-    "Think \"Docker for RL environments\" with:\n",
-    "- ✅ Standardized API (reset, step, state)\n",
-    "- ✅ Type-safe dataclasses\n",
-    "- ✅ Docker isolation\n",
-    "- ✅ HTTP communication (language-agnostic)\n",
-    "- ✅ Production-ready deployment\n",
+    "### ✨ What is OpenEnv?\n",
     "\n",
-    "### The Architecture\n",
+    "OpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n",
     "\n",
-    "```\n",
-    "┌────────────────────────────────────┐\n",
-    "│  Your Training Code                │  Python, Rust, Julia...\n",
-    "│                                    │\n",
-    "│  env = SomeEnv(...)                │  ← Import OpenEnv client\n",
-    "│  result = env.reset()              │  ← Type-safe!\n",
-    "│  result = env.step(action)         │  ← Type-safe!\n",
-    "└──────────┬─────────────────────────┘\n",
-    "           │\n",
-    "           │ HTTP/JSON\n",
-    "           │\n",
-    "┌──────────▼─────────────────────────┐\n",
-    "│  Docker Container                  │\n",
-    "│                                    │\n",
-    "│  FastAPI Server                    │\n",
-    "│  └─ Environment Logic              │\n",
-    "│     └─ Your game/simulation        │\n",
-    "└────────────────────────────────────┘\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td align=\"center\">✅<br><b>Standardized API</b><br><sub>reset, step, state</sub></td>\n",
+    "<td align=\"center\">🔒<br><b>Type-safe</b><br><sub>dataclasses</sub></td>\n",
+    "<td align=\"center\">🐳<br><b>Docker isolation</b><br><sub>secure</sub></td>\n",
+    "<td align=\"center\">🌐<br><b>HTTP API</b><br><sub>any language</sub></td>\n",
+    "<td align=\"center\">☸️<br><b>Production-ready</b><br><sub>K8s deploy</sub></td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "### 🎨 The Architecture\n",
+    "\n",
+    "```mermaid\n",
+    "graph TB\n",
+    "    subgraph Client[\"💻 Your Training Code\"]\n",
+    "        A[\"🐍 Python/Rust/Julia\"]\n",
+    "        B[\"env = OpenSpielEnv()\"]\n",
+    "        C[\"result = env.reset()\"]\n",
+    "        D[\"result = env.step(action)\"]\n",
+    "    end\n",
+    "    \n",
+    "    subgraph HTTP[\"🌐 HTTP/JSON\"]\n",
+    "        E[\"POST /reset\"]\n",
+    "        F[\"POST /step\"]\n",
+    "        G[\"GET /state\"]\n",
+    "    end\n",
+    "    \n",
+    "    subgraph Server[\"🐳 Docker Container\"]\n",
+    "        H[\"⚡ FastAPI Server\"]\n",
+    "        I[\"🎮 Environment Logic\"]\n",
+    "        J[\"🎯 Game/Simulation\"]\n",
+    "    end\n",
+    "    \n",
+    "    Client --> HTTP\n",
+    "    HTTP --> Server\n",
+    "    \n",
+    "    style Client fill:#e1f5ff\n",
+    "    style HTTP fill:#fff4e1\n",
+    "    style Server fill:#ffe1f5\n",
     "```\n",
     "\n",
-    "### The Pattern - Every Environment Has:\n",
+    "### 📁 The Pattern - Every Environment Has:\n",
     "\n",
     "```\n",
     "src/envs/your_env/\n",
-    "├── models.py         ← Type-safe contracts (Action, Observation, State)\n",
-    "├── client.py         ← Client API (what you import)\n",
-    "└── server/\n",
-    "    ├── environment.py ← Environment logic\n",
-    "    ├── app.py         ← FastAPI server\n",
-    "    └── Dockerfile     ← Container\n",
+    "├── 📝 models.py         ← Type-safe contracts (Action, Observation, State)\n",
+    "├── 📱 client.py         ← Client API (what you import)\n",
+    "└── 🖥️ server/\n",
+    "    ├── environment.py  ← Environment logic\n",
+    "    ├── app.py          ← FastAPI server\n",
+    "    └── Dockerfile      ← Container\n",
     "```\n",
     "\n",
-    "### Current Integrations\n",
-    "\n",
-    "OpenEnv already integrates several environments:\n",
-    "- **OpenSpiel** (6 games from DeepMind)\n",
-    "- **Echo** (test environment)\n",
-    "- **Coding** (Python code execution)\n",
-    "- **Atari** (classic games)\n",
-    "- More coming!\n",
+    "### 🎮 Current Integrations\n",
+    "\n",
+    "<div style=\"display: flex; flex-wrap: wrap; gap: 10px; margin: 20px 0;\">\n",
+    "    <div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
+    "        <h4>🎯 OpenSpiel</h4>\n",
+    "        <p>6 games from DeepMind</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
+    "        <h4>📢 Echo</h4>\n",
+    "        <p>Test environment</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
+    "        <h4>💻 Coding</h4>\n",
+    "        <p>Python execution</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
+    "        <h4>🕹️ Atari</h4>\n",
+    "        <p>Classic games</p>\n",
+    "    </div>\n",
+    "</div>\n",
+    "\n",
+    "Let's explore one integration to see how it all works...\n",
     "\n",
-    "Let's explore one integration to see how it all works..."
+    "---"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "---\n",
+    "## ⚙️ Part 3: Setup\n",
     "\n",
-    "## Part 3: Setup"
+    "<div align=\"center\">\n",
+    "    <h3>🔧 Getting Started</h3>\n",
+    "</div>"
    ]
   },
   {
@@ -191,11 +259,13 @@
    "source": [
     "---\n",
     "\n",
-    "## Part 4: Exploring OpenEnv's Structure\n",
+    "## 🔍 Part 4: Exploring OpenEnv's Structure\n",
     "\n",
-    "Let's look at the actual OpenEnv code to understand how it works.\n",
+    "<div align=\"center\">\n",
+    "    <h3>Let's look at the actual code!</h3>\n",
+    "</div>\n",
     "\n",
-    "### The Base Classes"
+    "### 🧩 The Base Classes"
    ]
   },
   {
@@ -208,11 +278,11 @@
     "from core.http_env_client import HTTPEnvClient\n",
     "\n",
     "print(\"=\" * 70)\n",
-    "print(\"OpenEnv Core Abstractions\")\n",
+    "print(\"🔧 OpenEnv Core Abstractions\")\n",
     "print(\"=\" * 70)\n",
     "\n",
     "print(\"\"\"\n",
-    "SERVER SIDE (runs in Docker):\n",
+    "🖥️  SERVER SIDE (runs in Docker):\n",
     "\n",
     "  class Environment(ABC):\n",
     "      '''Base class for all environment implementations'''\n",
@@ -229,7 +299,7 @@
     "      def state(self) -> State:\n",
     "          '''Episode metadata'''\n",
     "\n",
-    "CLIENT SIDE (your training code):\n",
+    "📱 CLIENT SIDE (your training code):\n",
     "\n",
     "  class HTTPEnvClient(ABC):\n",
     "      '''Base class for HTTP clients'''\n",
@@ -245,7 +315,7 @@
     "\"\"\")\n",
     "\n",
     "print(\"=\" * 70)\n",
-    "print(\"💡 Same interface, communication via HTTP\")\n",
+    "print(\"💡 Same interface, communication via HTTP!\")\n",
     "print(\"=\" * 70)"
    ]
   },
@@ -255,22 +325,33 @@
    "source": [
     "---\n",
     "\n",
-    "## Part 5: Example Integration - OpenSpiel\n",
+    "## 🔌 Part 5: Example Integration - OpenSpiel\n",
+    "\n",
+    "<div align=\"center\">\n",
+    "    <img src=\"https://img.shields.io/badge/OpenSpiel-DeepMind-red?style=for-the-badge\" />\n",
+    "    <h3>70+ Game Environments</h3>\n",
+    "</div>\n",
     "\n",
-    "### What is OpenSpiel?\n",
+    "### 🎮 What is OpenSpiel?\n",
     "\n",
     "OpenSpiel is a **library from DeepMind** with 70+ game environments for RL research.\n",
     "\n",
-    "### Our Integration\n",
+    "### 🎯 Our Integration\n",
     "\n",
     "**OpenEnv wraps 6 OpenSpiel games** following our standard pattern:\n",
     "\n",
-    "1. **Catch** - Catch falling ball (single-player)\n",
-    "2. **Tic-Tac-Toe** - Classic 3×3 (2-player)\n",
-    "3. **Kuhn Poker** - Imperfect info poker (2-player)\n",
-    "4. **Cliff Walking** - Grid navigation (single-player)\n",
-    "5. **2048** - Tile puzzle (single-player)\n",
-    "6. **Blackjack** - Card game (single-player)\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td align=\"center\">🎯<br><b>Catch</b><br><sub>Catch falling ball</sub></td>\n",
+    "<td align=\"center\">❌<br><b>Tic-Tac-Toe</b><br><sub>Classic 3×3</sub></td>\n",
+    "<td align=\"center\">🃏<br><b>Kuhn Poker</b><br><sub>Imperfect info</sub></td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td align=\"center\">🏔️<br><b>Cliff Walking</b><br><sub>Grid navigation</sub></td>\n",
+    "<td align=\"center\">🔢<br><b>2048</b><br><sub>Tile puzzle</sub></td>\n",
+    "<td align=\"center\">🂡<br><b>Blackjack</b><br><sub>Card game</sub></td>\n",
+    "</tr>\n",
+    "</table>\n",
     "\n",
     "Let's see how the integration is structured:"
    ]
@@ -281,7 +362,6 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Import the OpenSpiel integration models\n",
     "from envs.openspiel_env.models import (\n",
     "    OpenSpielAction,\n",
     "    OpenSpielObservation,\n",
@@ -290,7 +370,7 @@
     "from dataclasses import fields\n",
     "\n",
     "print(\"=\" * 70)\n",
-    "print(\"OpenSpiel Integration - Type-Safe Models\")\n",
+    "print(\"🔒 OpenSpiel Integration - Type-Safe Models\")\n",
     "print(\"=\" * 70)\n",
     "\n",
     "print(\"\\n📤 OpenSpielAction (what you send):\")\n",
@@ -317,7 +397,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### How the Client Works"
+    "### 🔧 How the Client Works"
    ]
   },
   {
@@ -329,7 +409,7 @@
     "from envs.openspiel_env.client import OpenSpielEnv\n",
     "\n",
     "print(\"=\" * 70)\n",
-    "print(\"OpenSpielEnv Client (HTTPEnvClient Implementation)\")\n",
+    "print(\"📱 OpenSpielEnv Client (HTTPEnvClient Implementation)\")\n",
     "print(\"=\" * 70)\n",
     "\n",
     "print(\"\"\"\n",
@@ -371,20 +451,25 @@
    "source": [
     "---\n",
     "\n",
-    "## Part 6: Interactive Demo - See It In Action\n",
+    "## 🎯 Part 6: Interactive Demo - See It In Action\n",
+    "\n",
+    "<div align=\"center\">\n",
+    "    <h2>🎮 Let's Build the Catch Game!</h2>\n",
+    "    <p>A small environment that follows OpenEnv's pattern: structure, RL loop, and policies.</p>\n",
+    "</div>\n",
     "\n",
-    "Let's build a **Catch game** environment following OpenEnv's pattern.\n",
+    "### 🎲 The Game Rules:\n",
     "\n",
-    "This shows you:\n",
-    "- How to structure an environment\n",
-    "- How the RL loop works\n",
-    "- How different policies perform\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"25%\" align=\"center\">📐<br><b>5×5 Grid</b></td>\n",
+    "<td width=\"25%\" align=\"center\">🔴<br><b>Ball falls from the top</b></td>\n",
+    "<td width=\"25%\" align=\"center\">🏓<br><b>Move the paddle to catch it</b></td>\n",
+    "<td width=\"25%\" align=\"center\">🎁<br><b>+1 if caught, 0 if missed</b></td>\n",
+    "</tr>\n",
+    "</table>\n",
     "\n",
-    "### The Game:\n",
-    "- 5×5 grid, ball falls from top 🔴\n",
-    "- Control paddle at bottom 🏓\n",
-    "- **Actions**: 0=LEFT, 1=STAY, 2=RIGHT\n",
-    "- **Reward**: +1 caught, 0 missed"
+    "**Actions**: 0=LEFT ⬅️ | 1=STAY ⏸️ | 2=RIGHT ➡️"
    ]
   },
   {
@@ -405,7 +490,6 @@
     "    legal_actions: List[int]\n",
     "    done: bool\n",
     "    reward: float\n",
-    "    # For visualization\n",
     "    ball_position: Tuple[int, int]\n",
     "    paddle_position: int\n",
     "\n",
@@ -431,16 +515,13 @@
     "    \n",
     "    def step(self, action: int) -> CatchObservation:\n",
     "        \"\"\"Execute action (implements Environment.step()).\"\"\"\n",
-    "        # Move paddle\n",
     "        if action == 0 and self.paddle_col > 0:\n",
     "            self.paddle_col -= 1\n",
     "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
     "            self.paddle_col += 1\n",
     "        \n",
-    "        # Move ball\n",
     "        self.ball_row += 1\n",
     "        \n",
-    "        # Check done\n",
     "        if self.ball_row >= self.grid_size - 1:\n",
     "            self.done = True\n",
     "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
@@ -450,7 +531,6 @@
     "        return self._make_observation(reward)\n",
     "    \n",
     "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
-    "        \"\"\"Create type-safe observation.\"\"\"\n",
     "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
     "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
     "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
@@ -467,7 +547,6 @@
     "        )\n",
     "    \n",
     "    def render(self):\n",
-    "        \"\"\"Visualize.\"\"\"\n",
     "        for row in range(self.grid_size):\n",
     "            line = \"  \"\n",
     "            for col in range(self.grid_size):\n",
@@ -479,18 +558,17 @@
     "                    line += \"⬜ \"\n",
     "            print(line)\n",
     "\n",
-    "\n",
     "print(\"✅ Environment created following OpenEnv pattern!\")\n",
-    "print(\"\\n   Implements: reset(), step()\")\n",
-    "print(\"   Returns: Type-safe observations\")\n",
-    "print(\"   In production: Would run in Docker + FastAPI\")"
+    "print(\"   🔧 Implements: reset(), step()\")\n",
+    "print(\"   🔒 Returns: Type-safe observations\")\n",
+    "print(\"   🐳 In production: Would run in Docker + FastAPI\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Test It"
+    "### 🧪 Test It"
    ]
   },
   {
@@ -502,12 +580,12 @@
     "env = CatchEnvironment()\n",
     "obs = env.reset()\n",
     "\n",
-    "print(\"Initial State:\")\n",
+    "print(\"🎮 Initial State:\")\n",
     "print(\"=\" * 50)\n",
     "env.render()\n",
-    "print(f\"\\nBall: column {obs.ball_position[1]}\")\n",
-    "print(f\"Paddle: column {obs.paddle_position}\")\n",
-    "print(f\"Legal actions: {obs.legal_actions} (0=LEFT, 1=STAY, 2=RIGHT)\")"
+    "print(f\"\\n🔴 Ball: column {obs.ball_position[1]}\")\n",
+    "print(f\"🏓 Paddle: column {obs.paddle_position}\")\n",
+    "print(f\"⚡ Legal actions: {obs.legal_actions} (0=LEFT, 1=STAY, 2=RIGHT)\")"
    ]
   },
   {
@@ -516,9 +594,23 @@
    "source": [
     "---\n",
     "\n",
-    "## Part 7: Different Policies\n",
+    "## 🤖 Part 7: Different Policies\n",
+    "\n",
+    "<div align=\"center\">\n",
+    "    <h3>A policy maps: Observation → Action</h3>\n",
+    "</div>\n",
     "\n",
-    "A policy maps observations → actions. Let's test 4 strategies:"
+    "Let's test 4 strategies from dumb to smart!\n",
+    "\n",
+    "```mermaid\n",
+    "graph LR\n",
+    "    A[👀 Observation] --> B{🤖 Policy}\n",
+    "    B -->|Random| C[🎲 Action]\n",
+    "    B -->|Always Stay| D[⏸️ Action]\n",
+    "    B -->|Smart| E[🎯 Action]\n",
+    "    B -->|Learning| F[🧠 Action]\n",
+    "    style B fill:#ffe1f5\n",
+    "```"
    ]
   },
   {
@@ -528,26 +620,26 @@
    "outputs": [],
    "source": [
     "class RandomPolicy:\n",
-    "    name = \"Random\"\n",
+    "    name = \"🎲 Random\"\n",
     "    def select_action(self, obs): \n",
     "        return random.choice(obs.legal_actions)\n",
     "\n",
     "class AlwaysStayPolicy:\n",
-    "    name = \"Always Stay\"\n",
+    "    name = \"⏸️ Always Stay\"\n",
     "    def select_action(self, obs): \n",
     "        return 1\n",
     "\n",
     "class SmartPolicy:\n",
-    "    name = \"Smart Heuristic\"\n",
+    "    name = \"🎯 Smart Heuristic\"\n",
     "    def select_action(self, obs):\n",
     "        ball_col = obs.ball_position[1]\n",
     "        paddle_col = obs.paddle_position\n",
-    "        if paddle_col < ball_col: return 2  # RIGHT\n",
-    "        elif paddle_col > ball_col: return 0  # LEFT\n",
-    "        else: return 1  # STAY\n",
+    "        if paddle_col < ball_col: return 2\n",
+    "        elif paddle_col > ball_col: return 0\n",
+    "        else: return 1\n",
     "\n",
     "class LearningPolicy:\n",
-    "    name = \"Learning Agent\"\n",
+    "    name = \"🧠 Learning Agent\"\n",
     "    def __init__(self):\n",
     "        self.steps = 0\n",
     "    \n",
@@ -555,27 +647,27 @@
     "        self.steps += 1\n",
     "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
     "        \n",
-    "        if random.random() < epsilon:  # Explore\n",
+    "        if random.random() < epsilon:\n",
     "            return random.choice(obs.legal_actions)\n",
-    "        else:  # Exploit\n",
+    "        else:\n",
     "            ball_col = obs.ball_position[1]\n",
     "            paddle_col = obs.paddle_position\n",
     "            if paddle_col < ball_col: return 2\n",
     "            elif paddle_col > ball_col: return 0\n",
     "            else: return 1\n",
     "\n",
-    "print(\"✅ 4 Policies created:\")\n",
-    "print(\"   1. Random - Baseline\")\n",
-    "print(\"   2. Always Stay - Bad strategy\")\n",
-    "print(\"   3. Smart - Optimal heuristic\")\n",
-    "print(\"   4. Learning - Simulated RL\")"
+    "print(\"✅ 4 Policies created!\")\n",
+    "print(\"   🎲 Random - Baseline\")\n",
+    "print(\"   ⏸️  Always Stay - Bad strategy\")\n",
+    "print(\"   🎯 Smart - Optimal heuristic\")\n",
+    "print(\"   🧠 Learning - Simulated RL\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Watch Them Play"
+    "### 👀 Watch Them Play"
    ]
   },
   {
@@ -591,7 +683,7 @@
     "    \n",
     "    if visualize:\n",
     "        print(f\"\\n{'='*50}\")\n",
-    "        print(f\"Policy: {policy.name} | Ball: col {obs.ball_position[1]}\")\n",
+    "        print(f\"🤖 Policy: {policy.name} | 🔴 Ball: col {obs.ball_position[1]}\")\n",
     "        print('='*50 + '\\n')\n",
     "        env.render()\n",
     "        time.sleep(delay)\n",
@@ -605,7 +697,8 @@
     "        total_reward += obs.reward\n",
     "        \n",
     "        if visualize:\n",
-    "            print(f\"\\nStep {step + 1}: {['LEFT','STAY','RIGHT'][action]}\")\n",
+    "            actions = [\"⬅️ LEFT\", \"⏸️ STAY\", \"➡️ RIGHT\"]\n",
+    "            print(f\"\\n⚡ Step {step + 1}: {actions[action]}\")\n",
     "            env.render()\n",
     "            time.sleep(delay)\n",
     "        \n",
@@ -625,7 +718,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Compare All Policies"
+    "### 📊 Compare All Policies"
    ]
   },
   {
@@ -648,13 +741,16 @@
     "                       for _ in range(num_episodes))\n",
     "        rate = (successes / num_episodes) * 100\n",
     "        results.append((policy.name, rate))\n",
-    "        print(f\"{policy.name:20s}: {rate:5.1f}%\")\n",
+    "        print(f\"{policy.name:25s}: {rate:5.1f}%\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
+    "    print(\"📊 VISUAL COMPARISON\")\n",
+    "    print(\"=\"*70 + \"\\n\")\n",
+    "    \n",
     "    results.sort(key=lambda x: x[1], reverse=True)\n",
     "    for name, rate in results:\n",
     "        bar = \"█\" * int(rate / 2)\n",
-    "        print(f\"{name:20s} [{bar:<50}] {rate:.1f}%\")\n",
+    "        print(f\"{name:25s} [{bar:<50}] {rate:.1f}%\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
     "    print(\"💡 RL in action: Random → Learning → Optimal\")\n",
@@ -669,21 +765,23 @@
    "source": [
     "---\n",
     "\n",
-    "## Part 8: Using Real OpenSpiel Integration\n",
+    "## 🌐 Part 8: Using Real OpenSpiel Integration\n",
     "\n",
-    "What we just built **is how OpenEnv works**!\n",
+    "<div style=\"background-color: #d4edda; border: 2px solid #28a745; border-radius: 10px; padding: 20px; margin: 20px 0;\">\n",
+    "    <h3 style=\"margin-top: 0;\">✨ What We Just Built = How OpenEnv Works!</h3>\n",
+    "</div>\n",
     "\n",
-    "### Demo vs Production:\n",
+    "### 🔄 Demo vs Production:\n",
     "\n",
-    "| Component | Our Demo | OpenEnv + OpenSpiel |\n",
-    "|-----------|----------|---------------------|\n",
-    "| Environment | Local class | Docker container |\n",
-    "| Communication | Direct | HTTP |\n",
-    "| Client | Direct | HTTPEnvClient |\n",
+    "| Component | 🧪 Our Demo | 🚀 OpenEnv + OpenSpiel |\n",
+    "|-----------|-------------|------------------------|\n",
+    "| Environment | Local class | 🐳 Docker container |\n",
+    "| Communication | Direct calls | 🌐 HTTP |\n",
+    "| Client | Direct access | 📱 HTTPEnvClient |\n",
     "| Type Safety | ✅ | ✅ |\n",
     "| API | reset/step | reset/step |\n",
     "\n",
-    "### Using OpenSpiel Integration:\n",
+    "### 🎮 Using OpenSpiel Integration:\n",
     "\n",
     "```python\n",
     "# Install OpenSpiel\n",
@@ -701,26 +799,50 @@
     "state = env.state()\n",
     "```\n",
     "\n",
-    "### Available Games:\n",
-    "1. Catch (what we demoed!)\n",
-    "2. Tic-Tac-Toe\n",
-    "3. Kuhn Poker\n",
-    "4. Cliff Walking\n",
-    "5. 2048\n",
-    "6. Blackjack"
+    "### 🎯 Available Games:\n",
+    "\n",
+    "<div style=\"display: grid; grid-template-columns: repeat(3, 1fr); gap: 10px; margin: 20px 0;\">\n",
+    "    <div style=\"background: #e1f5ff; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>🎯 Catch</h4>\n",
+    "        <small>What we demoed!</small>\n",
+    "    </div>\n",
+    "    <div style=\"background: #ffe1e1; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>❌ Tic-Tac-Toe</h4>\n",
+    "        <small>2-player</small>\n",
+    "    </div>\n",
+    "    <div style=\"background: #fff4e1; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>🃏 Kuhn Poker</h4>\n",
+    "        <small>Imperfect info</small>\n",
+    "    </div>\n",
+    "    <div style=\"background: #e8f5e9; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>🏔️ Cliff Walking</h4>\n",
+    "        <small>Navigation</small>\n",
+    "    </div>\n",
+    "    <div style=\"background: #f3e5f5; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>🔢 2048</h4>\n",
+    "        <small>Puzzle</small>\n",
+    "    </div>\n",
+    "    <div style=\"background: #fff3e0; padding: 15px; border-radius: 8px; text-align: center;\">\n",
+    "        <h4>🂡 Blackjack</h4>\n",
+    "        <small>Cards</small>\n",
+    "    </div>\n",
+    "</div>\n",
+    "\n",
+    "---"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "---\n",
-    "\n",
-    "## Part 9: Adding Your Own Integration\n",
+    "## ➕ Part 9: Adding Your Own Integration\n",
     "\n",
-    "Want to wrap your own environment? Follow the pattern:\n",
+    "<div align=\"center\">\n",
+    "    <h3>🛠️ Want to wrap your own environment?</h3>\n",
+    "    <p>Follow the 5-step pattern!</p>\n",
+    "</div>\n",
     "\n",
-    "### 1. Define Types (models.py)\n",
+    "### 📝 1. Define Types (models.py)\n",
     "```python\n",
     "@dataclass\n",
     "class YourAction(Action):\n",
@@ -731,7 +853,7 @@
     "    # Your observation fields\n",
     "```\n",
     "\n",
-    "### 2. Implement Environment (server/environment.py)\n",
+    "### 🖥️ 2. Implement Environment (server/environment.py)\n",
     "```python\n",
     "class YourEnvironment(Environment):\n",
     "    def reset(self) -> Observation:\n",
@@ -741,7 +863,7 @@
     "        return YourObservation(...)\n",
     "```\n",
     "\n",
-    "### 3. Create Client (client.py)\n",
+    "### 📱 3. Create Client (client.py)\n",
     "```python\n",
     "class YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n",
     "    def _step_payload(self, action):\n",
@@ -751,7 +873,7 @@
     "        return StepResult(observation=YourObservation(...))\n",
     "```\n",
     "\n",
-    "### 4. Create Server (server/app.py)\n",
+    "### ⚡ 4. Create Server (server/app.py)\n",
     "```python\n",
     "from core.env_server import create_fastapi_app\n",
     "\n",
@@ -759,7 +881,7 @@
     "app = create_fastapi_app(env)\n",
     "```\n",
     "\n",
-    "### 5. Dockerize (server/Dockerfile)\n",
+    "### 🐳 5. Dockerize (server/Dockerfile)\n",
     "```dockerfile\n",
     "FROM python:3.11\n",
     "COPY . /app\n",
@@ -768,64 +890,101 @@
     "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\"]\n",
     "```\n",
     "\n",
-    "### Examples to Study:\n",
-    "- `src/envs/echo_env/` - Simple test environment\n",
-    "- `src/envs/openspiel_env/` - Our OpenSpiel integration\n",
-    "- `src/envs/coding_env/` - Python code execution"
+    "### 📚 Examples to Study:\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td>📢 <code>src/envs/echo_env/</code></td>\n",
+    "<td>Simple test environment</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td>🎮 <code>src/envs/openspiel_env/</code></td>\n",
+    "<td>Our OpenSpiel integration</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td>💻 <code>src/envs/coding_env/</code></td>\n",
+    "<td>Python code execution</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "---"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "---\n",
-    "\n",
-    "## Summary\n",
-    "\n",
-    "### What You Learned:\n",
-    "\n",
-    "1. **RL Basics** - The core loop\n",
-    "2. **OpenEnv Framework** - Standardized, production-ready RL environments\n",
-    "3. **Example Integration** - How OpenSpiel is wrapped\n",
-    "4. **Interactive Demo** - Policies in action\n",
-    "5. **Adding Integrations** - The pattern to follow\n",
-    "\n",
-    "### OpenEnv's Value:\n",
-    "\n",
-    "| Feature | Traditional | OpenEnv |\n",
-    "|---------|------------|----------|\n",
-    "| Type Safety | ❌ | ✅ |\n",
-    "| Isolation | ❌ | ✅ Docker |\n",
-    "| Deployment | ❌ | ✅ K8s-ready |\n",
-    "| Language | Python only | Any (HTTP) |\n",
-    "| Reproducibility | ❌ | ✅ |\n",
-    "\n",
-    "### Next Steps:\n",
-    "\n",
-    "1. Try OpenSpiel integration\n",
-    "2. Implement real RL (Q-learning, DQN, PPO)\n",
-    "3. Wrap your own environments\n",
-    "4. Deploy to production\n",
-    "5. Use with RL libraries (TorchRL, etc.)\n",
-    "\n",
-    "### Resources:\n",
-    "\n",
-    "- **OpenEnv**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **Docs**: `src/envs/README.md`\n",
-    "- **Examples**: `examples/` directory\n",
+    "## 🎓 Summary\n",
+    "\n",
+    "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 20px 0;\">\n",
+    "    <h3 style=\"margin-top: 0; text-align: center;\">🎉 What You Learned</h3>\n",
+    "</div>\n",
+    "\n",
+    "### 📖 The Journey:\n",
+    "\n",
+    "1. **🧠 RL Basics** - The core loop\n",
+    "2. **🏗️ OpenEnv Framework** - Standardized, production-ready\n",
+    "3. **🔌 Example Integration** - How OpenSpiel is wrapped\n",
+    "4. **🎯 Interactive Demo** - Policies in action\n",
+    "5. **➕ Adding Integrations** - The pattern to follow\n",
+    "\n",
+    "### ✨ OpenEnv's Value:\n",
+    "\n",
+    "| Feature | 🏠 Traditional | 🚀 OpenEnv |\n",
+    "|---------|---------------|------------|\n",
+    "| **Type Safety** | ❌ | ✅ Dataclasses |\n",
+    "| **Isolation** | ❌ | ✅ Docker |\n",
+    "| **Deployment** | ❌ | ✅ K8s-ready |\n",
+    "| **Language** | Python only | Any (HTTP) |\n",
+    "| **Reproducibility** | ❌ | ✅ Containers |\n",
+    "\n",
+    "### 🚀 Next Steps:\n",
+    "\n",
+    "<div style=\"display: grid; grid-template-columns: repeat(2, 1fr); gap: 15px; margin: 20px 0;\">\n",
+    "    <div style=\"background: #e1f5ff; padding: 20px; border-radius: 10px;\">\n",
+    "        <h4>1️⃣ Try OpenSpiel</h4>\n",
+    "        <p>Install and play with the 6 games</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: #ffe1e1; padding: 20px; border-radius: 10px;\">\n",
+    "        <h4>2️⃣ Implement Real RL</h4>\n",
+    "        <p>Q-learning, DQN, PPO</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: #fff4e1; padding: 20px; border-radius: 10px;\">\n",
+    "        <h4>3️⃣ Wrap Your Environments</h4>\n",
+    "        <p>Follow the 5-step pattern</p>\n",
+    "    </div>\n",
+    "    <div style=\"background: #e8f5e9; padding: 20px; border-radius: 10px;\">\n",
+    "        <h4>4️⃣ Deploy to Production</h4>\n",
+    "        <p>Docker → Kubernetes</p>\n",
+    "    </div>\n",
+    "</div>\n",
+    "\n",
+    "### 📚 Resources:\n",
+    "\n",
+    "- 🏠 **OpenEnv**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- 📖 **Docs**: `src/envs/README.md`\n",
+    "- 💡 **Examples**: `examples/` directory\n",
     "\n",
     "---\n",
     "\n",
-    "## 🎉 You're Ready!\n",
-    "\n",
-    "You now understand:\n",
-    "- ✅ OpenEnv framework\n",
-    "- ✅ How integrations work\n",
-    "- ✅ Using existing environments\n",
-    "- ✅ Creating new integrations\n",
-    "- ✅ Production deployment\n",
-    "\n",
-    "**Welcome to production-ready RL!** 🚀"
+    "<div align=\"center\" style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 40px; border-radius: 20px; margin: 30px 0;\">\n",
+    "    <h2>🎉 You're Ready!</h2>\n",
+    "    <p style=\"font-size: 1.2em; margin: 20px 0;\">You now understand:</p>\n",
+    "    <table style=\"margin: 20px auto;\">\n",
+    "        <tr>\n",
+    "            <td>✅ OpenEnv framework</td>\n",
+    "            <td>✅ How integrations work</td>\n",
+    "        </tr>\n",
+    "        <tr>\n",
+    "            <td>✅ Using existing environments</td>\n",
+    "            <td>✅ Creating new integrations</td>\n",
+    "        </tr>\n",
+    "        <tr>\n",
+    "            <td colspan=\"2\">✅ Production deployment</td>\n",
+    "        </tr>\n",
+    "    </table>\n",
+    "    <h3 style=\"margin-top: 30px;\">Welcome to production-ready RL! 🚀</h3>\n",
+    "</div>"
    ]
   }
  ],

From 72cf8cee6e5339e761d3fddc789436e1754e71b7 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:36:49 -0700
Subject: [PATCH 03/19] fix mermaid

---
 examples/OpenEnv_Tutorial.ipynb | 142 +-------------------------------
 1 file changed, 4 insertions(+), 138 deletions(-)
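
Reviewer note (sits before the diff, so it is not part of the commit): a minimal sketch for checking that no mermaid code fences remain in the notebook's markdown cells once this change is applied. It assumes only the standard notebook JSON layout, in which a cell's `source` may be either a list of strings or a single string (this patch switches the Part 1 cell to the single-string form). The helper name `find_mermaid_cells` is hypothetical and does not exist in OpenEnv.

```python
# The fence marker is built at runtime so this snippet contains no literal fence.
MERMAID_FENCE = "`" * 3 + "mermaid"


def find_mermaid_cells(notebook: dict) -> list:
    """Return indices of markdown cells whose source contains a mermaid fence."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # A notebook cell's source is either a list of lines or one string.
        text = "".join(source) if isinstance(source, list) else source
        if MERMAID_FENCE in text:
            hits.append(index)
    return hits


# Tiny in-memory notebook covering both source forms (illustrative data only).
demo = {
    "cells": [
        {"cell_type": "markdown", "source": [MERMAID_FENCE + "\n", "graph LR\n"]},
        {"cell_type": "markdown", "source": "## Part 1\n\n<table></table>"},
        {"cell_type": "code", "source": "print('" + MERMAID_FENCE + "')"},
    ]
}
print(find_mermaid_cells(demo))
```

To check the real file, load `examples/OpenEnv_Tutorial.ipynb` with `json.load` and pass the resulting dict to the helper; an empty list would mean no markdown cell still contains a mermaid fence.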

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 5079472..6874228 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -38,33 +38,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "## 🧠 Part 1: RL Fundamentals - The Core Loop\n",
-    "\n",
-    "<div align=\"center\">\n",
-    "\n",
-    "```mermaid\n",
-    "graph LR\n",
-    "    A[🤖 Agent] -->|observes| B[👀 State]\n",
-    "    B -->|decides| C[⚡ Action]\n",
-    "    C -->|executes| D[🌍 Environment]\n",
-    "    D -->|returns| E[🎁 Reward]\n",
-    "    E -->|learns| A\n",
-    "    style A fill:#e1f5ff\n",
-    "    style D fill:#fff4e1\n",
-    "    style E fill:#ffe1e1\n",
-    "```\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "Reinforcement Learning boils down to a simple loop:\n",
-    "\n",
-    "```\n",
-    "Agent observes → chooses action → gets reward → repeat\n",
-    "```\n",
-    "\n",
-    "Let's see it in action with a simple example:"
-   ]
+   "source": "## 🧠 Part 1: RL Fundamentals - The Core Loop\n\n<div align=\"center\">\n<table style=\"border: none; margin: 20px auto;\">\n<tr>\n<td style=\"background: #e1f5ff; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #4facfe;\">\n<b>🤖 Agent</b><br><small>learns</small>\n</td>\n<td style=\"font-size: 24px; padding: 0 10px;\">↓</td>\n<td style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>👀 State</b><br><small>observes</small>\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center;\">↑</td>\n<td></td>\n<td style=\"font-size: 24px; text-align: center;\">↓</td>\n</tr>\n<tr>\n<td style=\"background: #ffe1e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #f5576c;\">\n<b>🎁 Reward</b><br><small>returns</small>\n</td>\n<td style=\"font-size: 24px; padding: 0 10px;\">←</td>\n<td style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>⚡ Action</b><br><small>decides</small>\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center;\">↑</td>\n<td></td>\n<td style=\"font-size: 24px; text-align: center;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>🌍 Environment</b><br><small>executes</small>\n</td>\n</tr>\n</table>\n</div>\n\nReinforcement Learning boils down to a simple loop:\n\n```\nAgent observes → chooses action → gets reward → repeat\n```\n\nLet's see it in action with a simple example:"
   },
   {
    "cell_type": "code",
@@ -125,95 +99,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "## 🏗️ Part 2: OpenEnv - The Framework\n",
-    "\n",
-    "<div align=\"center\">\n",
-    "    <h3>🚀 Think \"Docker for RL Environments\"</h3>\n",
-    "</div>\n",
-    "\n",
-    "### ✨ What is OpenEnv?\n",
-    "\n",
-    "OpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<td align=\"center\">✅<br><b>Standardized API</b><br><sub>reset, step, state</sub></td>\n",
-    "<td align=\"center\">🔒<br><b>Type-safe</b><br><sub>dataclasses</sub></td>\n",
-    "<td align=\"center\">🐳<br><b>Docker isolation</b><br><sub>secure</sub></td>\n",
-    "<td align=\"center\">🌐<br><b>HTTP API</b><br><sub>any language</sub></td>\n",
-    "<td align=\"center\">☸️<br><b>Production-ready</b><br><sub>K8s deploy</sub></td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "### 🎨 The Architecture\n",
-    "\n",
-    "```mermaid\n",
-    "graph TB\n",
-    "    subgraph Client[\"💻 Your Training Code\"]\n",
-    "        A[\"🐍 Python/Rust/Julia\"]\n",
-    "        B[\"env = OpenSpielEnv()\"]\n",
-    "        C[\"result = env.reset()\"]\n",
-    "        D[\"result = env.step(action)\"]\n",
-    "    end\n",
-    "    \n",
-    "    subgraph HTTP[\"🌐 HTTP/JSON\"]\n",
-    "        E[\"POST /reset\"]\n",
-    "        F[\"POST /step\"]\n",
-    "        G[\"GET /state\"]\n",
-    "    end\n",
-    "    \n",
-    "    subgraph Server[\"🐳 Docker Container\"]\n",
-    "        H[\"⚡ FastAPI Server\"]\n",
-    "        I[\"🎮 Environment Logic\"]\n",
-    "        J[\"🎯 Game/Simulation\"]\n",
-    "    end\n",
-    "    \n",
-    "    Client --> HTTP\n",
-    "    HTTP --> Server\n",
-    "    \n",
-    "    style Client fill:#e1f5ff\n",
-    "    style HTTP fill:#fff4e1\n",
-    "    style Server fill:#ffe1f5\n",
-    "```\n",
-    "\n",
-    "### 📁 The Pattern - Every Environment Has:\n",
-    "\n",
-    "```\n",
-    "src/envs/your_env/\n",
-    "├── 📝 models.py         ← Type-safe contracts (Action, Observation, State)\n",
-    "├── 📱 client.py         ← Client API (what you import)\n",
-    "└── 🖥️ server/\n",
-    "    ├── environment.py  ← Environment logic\n",
-    "    ├── app.py          ← FastAPI server\n",
-    "    └── Dockerfile      ← Container\n",
-    "```\n",
-    "\n",
-    "### 🎮 Current Integrations\n",
-    "\n",
-    "<div style=\"display: flex; flex-wrap: wrap; gap: 10px; margin: 20px 0;\">\n",
-    "    <div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
-    "        <h4>🎯 OpenSpiel</h4>\n",
-    "        <p>6 games from DeepMind</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
-    "        <h4>📢 Echo</h4>\n",
-    "        <p>Test environment</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
-    "        <h4>💻 Coding</h4>\n",
-    "        <p>Python execution</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n",
-    "        <h4>🕹️ Atari</h4>\n",
-    "        <p>Classic games</p>\n",
-    "    </div>\n",
-    "</div>\n",
-    "\n",
-    "Let's explore one integration to see how it all works...\n",
-    "\n",
-    "---"
-   ]
+   "source": "## 🏗️ Part 2: OpenEnv - The Framework\n\n<div align=\"center\">\n    <h3>🚀 Think \"Docker for RL Environments\"</h3>\n</div>\n\n### ✨ What is OpenEnv?\n\nOpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n\n<table>\n<tr>\n<td align=\"center\">✅<br><b>Standardized API</b><br><sub>reset, step, state</sub></td>\n<td align=\"center\">🔒<br><b>Type-safe</b><br><sub>dataclasses</sub></td>\n<td align=\"center\">🐳<br><b>Docker isolation</b><br><sub>secure</sub></td>\n<td align=\"center\">🌐<br><b>HTTP API</b><br><sub>any language</sub></td>\n<td align=\"center\">☸️<br><b>Production-ready</b><br><sub>K8s deploy</sub></td>\n</tr>\n</table>\n\n### 🎨 The Architecture\n\n<div style=\"margin: 30px 0;\">\n<table style=\"width: 100%; border: none;\">\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #e1f5ff 0%, #b3e0ff 100%); padding: 20px; border-radius: 10px; border: 3px solid #4facfe;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">💻 Your Training Code (Client)</h4>\n<code>env = OpenSpielEnv()</code><br>\n<code>result = env.reset()</code><br>\n<code>result = env.step(action)</code>\n</div>\n</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"text-align: center; font-size: 32px; padding: 10px;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #fff4e1 0%, #ffe4b3 100%); padding: 20px; border-radius: 10px; border: 3px solid #ffc107;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">🌐 HTTP/JSON Protocol</h4>\n<code>POST /reset</code> | <code>POST /step</code> | <code>GET /state</code>\n</div>\n</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"text-align: center; font-size: 32px; padding: 10px;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #ffe1f5 0%, #ffb3e6 100%); padding: 20px; border-radius: 10px; border: 3px solid #f093fb;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">🐳 Docker Container (Server)</h4>\n⚡ FastAPI Server → 🎮 Environment Logic → 🎯 Game/Simulation\n</div>\n</td>\n</tr>\n</table>\n</div>\n\n### 📁 The Pattern - Every Environment Has:\n\n```\nsrc/envs/your_env/\n├── 📝 models.py         ← Type-safe contracts (Action, Observation, State)\n├── 📱 client.py         ← Client API (what you import)\n└── 🖥️ server/\n    ├── environment.py  ← Environment logic\n    ├── app.py          ← FastAPI server\n    └── Dockerfile      ← Container\n```\n\n### 🎮 Current Integrations\n\n<div style=\"display: flex; flex-wrap: wrap; gap: 10px; margin: 20px 0;\">\n    <div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>🎯 OpenSpiel</h4>\n        <p>6 games from DeepMind</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>📢 Echo</h4>\n        <p>Test environment</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>💻 Coding</h4>\n        <p>Python execution</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>🕹️ Atari</h4>\n        <p>Classic games</p>\n    </div>\n</div>\n\nLet's explore one integration to see how it all works...\n\n---"
   },
   {
    "cell_type": "markdown",
@@ -591,27 +477,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "## 🤖 Part 7: Different Policies\n",
-    "\n",
-    "<div align=\"center\">\n",
-    "    <h3>A policy maps: Observation → Action</h3>\n",
-    "</div>\n",
-    "\n",
-    "Let's test 4 strategies from dumb to smart!\n",
-    "\n",
-    "```mermaid\n",
-    "graph LR\n",
-    "    A[👀 Observation] --> B{🤖 Policy}\n",
-    "    B -->|Random| C[🎲 Action]\n",
-    "    B -->|Always Stay| D[⏸️ Action]\n",
-    "    B -->|Smart| E[🎯 Action]\n",
-    "    B -->|Learning| F[🧠 Action]\n",
-    "    style B fill:#ffe1f5\n",
-    "```"
-   ]
+   "source": "---\n\n## 🤖 Part 7: Different Policies\n\n<div align=\"center\">\n    <h3>A policy maps: Observation → Action</h3>\n</div>\n\nLet's test 4 strategies from dumb to smart!\n\n<div style=\"margin: 30px auto; max-width: 600px;\">\n<table style=\"width: 100%; border: none;\">\n<tr>\n<td rowspan=\"4\" style=\"background: #fff4e1; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #ffc107; vertical-align: middle;\">\n<b>👀 Observation</b><br><small>Ball & paddle positions</small>\n</td>\n<td style=\"font-size: 32px; text-align: center; padding: 0 20px;\">→</td>\n<td rowspan=\"4\" style=\"background: #ffe1f5; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #f093fb; vertical-align: middle;\">\n<b>🤖 Policy</b><br><small>Decision maker</small>\n</td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #e1ffe1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #43e97b;\">\n🎲 Random\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #ffe1e1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #f5576c;\">\n⏸️ Always Stay\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #e1f5ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #4facfe;\">\n🎯 Smart\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #f5e1ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #b388ff;\">\n🧠 Learning\n</td>\n</tr>\n</table>\n</div>"
   },
   {
    "cell_type": "code",
@@ -1009,4 +875,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file

From 78e71d4cae5be273941dcbab024b05e6522f6cda Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:42:11 -0700
Subject: [PATCH 04/19] update viz

---
 examples/OpenEnv_Tutorial.ipynb | 1345 +++++++++++++++++++++++--------
 1 file changed, 1002 insertions(+), 343 deletions(-)
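
The "🤔 The Problem" cell added in this patch says real agents track which actions lead to rewards, and defers the implementation to a later part. A minimal sketch of that idea for the number-guessing game follows. It assumes hypothetical names (`GuessEnv`, `EliminationPolicy`, `StepResult`) that are not part of OpenEnv's API, and it uses plain candidate elimination rather than a learned policy:

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    """Hypothetical result type for this sketch; not OpenEnv's API."""
    observation: str  # "start", "correct", "warm", or "cold"
    reward: float
    done: bool


class GuessEnv:
    """Toy environment mirroring the tutorial's number-guessing cell."""

    def __init__(self, low=1, high=10, max_guesses=3, seed=None):
        self.low, self.high, self.max_guesses = low, high, max_guesses
        self._rng = random.Random(seed)

    def reset(self) -> StepResult:
        self.target = self._rng.randint(self.low, self.high)
        self.guesses_left = self.max_guesses
        return StepResult("start", 0.0, False)

    def step(self, guess: int) -> StepResult:
        self.guesses_left -= 1
        if guess == self.target:
            return StepResult("correct", 10.0, True)
        hint = "warm" if abs(guess - self.target) <= 2 else "cold"
        # The episode ends when the guesses run out.
        return StepResult(hint, 0.0, self.guesses_left == 0)


class EliminationPolicy:
    """Prunes candidates using warm/cold feedback instead of guessing blindly."""

    def __init__(self, low=1, high=10, seed=None):
        self.low, self.high = low, high
        self._rng = random.Random(seed)
        self.reset()

    def reset(self) -> None:
        self.candidates = list(range(self.low, self.high + 1))

    def choose(self) -> int:
        return self._rng.choice(self.candidates)

    def learn(self, guess: int, observation: str) -> None:
        if observation == "warm":
            # Target is within 2 of the guess, but is not the guess itself.
            self.candidates = [
                n for n in self.candidates if n != guess and abs(n - guess) <= 2
            ]
        elif observation == "cold":
            # Target is more than 2 away from the guess.
            self.candidates = [n for n in self.candidates if abs(n - guess) > 2]


def run_episode(env: GuessEnv, policy: EliminationPolicy) -> float:
    """Observe -> act -> reward -> update, until the environment says done."""
    policy.reset()
    result = env.reset()
    total = 0.0
    while not result.done:
        guess = policy.choose()
        result = env.step(guess)
        policy.learn(guess, result.observation)
        total += result.reward
    return total
```

Calling `run_episode(GuessEnv(seed=0), EliminationPolicy(seed=1))` returns the episode's total reward. The true target always satisfies the feedback it generates, so it is never pruned from `candidates` and `choose()` always has something to pick from.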

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 6874228..fe93747 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -6,39 +6,100 @@
    "source": [
     "<div align=\"center\">\n",
     "\n",
-    "# 🎮 OpenEnv: Production-Ready RL Environments\n",
+    "# 🚀 OpenEnv: Production RL Made Simple\n",
     "\n",
-    "<img src=\"https://github.com/user-attachments/assets/2700a971-e5d6-4036-b03f-2f89c9791609\" width=\"100\" />\n",
+    "### *From \"Hello World\" to Production Deployment in 30 Minutes*\n",
     "\n",
-    "**Learn how OpenEnv standardizes RL environments for production use**\n",
+    "---\n",
+    "\n",
+    "**What if RL environments were as easy to use as REST APIs?**\n",
+    "\n",
+    "That's OpenEnv. Type-safe. Isolated. Production-ready.\n",
     "\n",
     "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
-    "[![Python](https://img.shields.io/badge/Python-3.11+-blue?logo=python)](https://www.python.org/)\n",
-    "[![Docker](https://img.shields.io/badge/Docker-Ready-blue?logo=docker)](https://www.docker.com/)\n",
+    "[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n",
     "\n",
     "</div>\n",
     "\n",
-    "---\n",
-    "\n",
-    "## 📚 What You'll Learn\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 📋 What You'll Learn\n",
     "\n",
     "<table>\n",
     "<tr>\n",
-    "<td width=\"20%\" align=\"center\">🧠<br><b>RL Fundamentals</b><br><sub>5 minutes</sub></td>\n",
-    "<td width=\"20%\" align=\"center\">🏗️<br><b>OpenEnv Framework</b><br><sub>Architecture</sub></td>\n",
-    "<td width=\"20%\" align=\"center\">🔌<br><b>Integrations</b><br><sub>OpenSpiel example</sub></td>\n",
-    "<td width=\"20%\" align=\"center\">🎯<br><b>Interactive Demo</b><br><sub>See it work</sub></td>\n",
-    "<td width=\"20%\" align=\"center\">➕<br><b>Add Your Own</b><br><sub>Extend it</sub></td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎯 Part 1-2: The Fundamentals**\n",
+    "- RL in 60 seconds\n",
+    "- Why existing solutions fall short\n",
+    "- The OpenEnv solution\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🏗️ Part 3-5: The Architecture**\n",
+    "- How OpenEnv works\n",
+    "- Exploring real code\n",
+    "- OpenSpiel integration example\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎮 Part 6-8: Hands-On Demo**\n",
+    "- Build a game environment\n",
+    "- Test 4 different policies\n",
+    "- Watch learning happen live\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🔧 Part 9-10: Going Further**\n",
+    "- Use real OpenSpiel\n",
+    "- Create your own integration\n",
+    "- Deploy to production\n",
+    "\n",
+    "</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "---"
+    "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 🧠 Part 1: RL Fundamentals - The Core Loop\n\n<div align=\"center\">\n<table style=\"border: none; margin: 20px auto;\">\n<tr>\n<td style=\"background: #e1f5ff; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #4facfe;\">\n<b>🤖 Agent</b><br><small>learns</small>\n</td>\n<td style=\"font-size: 24px; padding: 0 10px;\">↓</td>\n<td style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>👀 State</b><br><small>observes</small>\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center;\">↑</td>\n<td></td>\n<td style=\"font-size: 24px; text-align: center;\">↓</td>\n</tr>\n<tr>\n<td style=\"background: #ffe1e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #f5576c;\">\n<b>🎁 Reward</b><br><small>returns</small>\n</td>\n<td style=\"font-size: 24px; padding: 0 10px;\">←</td>\n<td style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>⚡ Action</b><br><small>decides</small>\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center;\">↑</td>\n<td></td>\n<td style=\"font-size: 24px; text-align: center;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;\">\n<b>🌍 Environment</b><br><small>executes</small>\n</td>\n</tr>\n</table>\n</div>\n\nReinforcement Learning boils down to a simple loop:\n\n```\nAgent observes → chooses action → gets reward → repeat\n```\n\nLet's see it in action with a simple example:"
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 1: RL in 60 Seconds ⏱️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
+    "\n",
+    "**Reinforcement Learning is simpler than you think.**\n",
+    "\n",
+    "It's just a loop:\n",
+    "\n",
+    "```\n",
+    "while not done:\n",
+    "    observation = environment.observe()\n",
+    "    action = policy.choose(observation)\n",
+    "    reward, done = environment.step(action)\n",
+    "    policy.learn(reward)\n",
+    "```\n",
+    "\n",
+    "That's it. That's RL.\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "Let's see it in action:"
+   ]
   },
   {
    "cell_type": "code",
@@ -48,67 +109,174 @@
    "source": [
     "import random\n",
     "\n",
-    "# Simple RL: Guess a number\n",
+    "print(\"🎲 Number Guessing Game - The Simplest RL Example\")\n",
+    "print(\"=\" * 60)\n",
+    "\n",
+    "# Environment\n",
     "target = random.randint(1, 10)\n",
-    "guesses = 3\n",
+    "guesses_left = 3\n",
     "\n",
-    "print(\"🎯 Guess a number (1-10)\\n\")\n",
+    "print(\"\\n🎯 I'm thinking of a number between 1 and 10...\\n\")\n",
     "\n",
-    "while guesses > 0:\n",
-    "    guess = random.randint(1, 10)  # Policy: random\n",
-    "    guesses -= 1\n",
+    "# The RL Loop\n",
+    "while guesses_left > 0:\n",
+    "    # Policy: Random guessing (no learning yet!)\n",
+    "    guess = random.randint(1, 10)\n",
+    "    guesses_left -= 1\n",
     "    \n",
-    "    print(f\"Guess: {guess}\", end=\" → \")\n",
+    "    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n",
     "    \n",
+    "    # Reward signal\n",
     "    if guess == target:\n",
-    "        print(\"🎉 Correct! Reward: +1\")\n",
+    "        print(\"🎉 Correct! +10 points\")\n",
     "        break\n",
     "    elif abs(guess - target) <= 2:\n",
-    "        print(\"🔥 Warm\")\n",
+    "        print(\"🔥 Warm! (close)\")\n",
     "    else:\n",
-    "        print(\"❄️ Cold\")\n",
+    "        print(\"❄️  Cold! (far)\")\n",
     "else:\n",
-    "    print(f\"\\nIt was {target}. Reward: 0\")\n",
+    "    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n",
     "\n",
-    "print(\"\\n💡 That's RL: observe → act → reward → repeat\")"
+    "print(\"\\n\" + \"=\" * 60)\n",
+    "print(\"\\n💡 This is RL: Observe → Act → Reward → Repeat\")\n",
+    "print(\"   But this policy is terrible! It doesn't learn.\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<div style=\"background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;\">\n",
-    "    <h3 style=\"margin-top: 0;\">⚠️ The Problem</h3>\n",
-    "    <p>How do we make this production-ready?</p>\n",
-    "    <ul>\n",
-    "        <li>❌ Need type safety</li>\n",
-    "        <li>❌ Need isolation</li>\n",
-    "        <li>❌ Need deployment</li>\n",
-    "        <li>❌ Need standardization</li>\n",
-    "    </ul>\n",
-    "</div>\n",
+    "<div style=\"background-color: #fff3cd; padding: 15px; border-left: 5px solid #ffc107; margin: 20px 0;\">\n",
     "\n",
-    "<div style=\"background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;\">\n",
-    "    <h3 style=\"margin-top: 0;\">✅ The Solution: OpenEnv</h3>\n",
-    "    <p>A production-ready framework that solves all these problems!</p>\n",
-    "</div>\n",
+    "**🤔 The Problem**: Our random guesser never improves because it doesn't use the rewards!\n",
     "\n",
-    "---"
+    "Real RL agents:\n",
+    "- 📊 Track which actions lead to rewards\n",
+    "- 🎯 Choose better actions over time\n",
+    "- 🔄 Balance exploration (trying new things) vs exploitation (using what works)\n",
+    "\n",
+    "We'll build this later!\n",
+    "\n",
+    "</div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 🏗️ Part 2: OpenEnv - The Framework\n\n<div align=\"center\">\n    <h3>🚀 Think \"Docker for RL Environments\"</h3>\n</div>\n\n### ✨ What is OpenEnv?\n\nOpenEnv is a **framework for creating, deploying, and using isolated RL environments**.\n\n<table>\n<tr>\n<td align=\"center\">✅<br><b>Standardized API</b><br><sub>reset, step, state</sub></td>\n<td align=\"center\">🔒<br><b>Type-safe</b><br><sub>dataclasses</sub></td>\n<td align=\"center\">🐳<br><b>Docker isolation</b><br><sub>secure</sub></td>\n<td align=\"center\">🌐<br><b>HTTP API</b><br><sub>any language</sub></td>\n<td align=\"center\">☸️<br><b>Production-ready</b><br><sub>K8s deploy</sub></td>\n</tr>\n</table>\n\n### 🎨 The Architecture\n\n<div style=\"margin: 30px 0;\">\n<table style=\"width: 100%; border: none;\">\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #e1f5ff 0%, #b3e0ff 100%); padding: 20px; border-radius: 10px; border: 3px solid #4facfe;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">💻 Your Training Code (Client)</h4>\n<code>env = OpenSpielEnv()</code><br>\n<code>result = env.reset()</code><br>\n<code>result = env.step(action)</code>\n</div>\n</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"text-align: center; font-size: 32px; padding: 10px;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #fff4e1 0%, #ffe4b3 100%); padding: 20px; border-radius: 10px; border: 3px solid #ffc107;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">🌐 HTTP/JSON Protocol</h4>\n<code>POST /reset</code> | <code>POST /step</code> | <code>GET /state</code>\n</div>\n</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"text-align: center; font-size: 32px; padding: 10px;\">↓</td>\n</tr>\n<tr>\n<td colspan=\"3\" style=\"background: linear-gradient(135deg, #ffe1f5 0%, #ffb3e6 100%); padding: 20px; border-radius: 10px; border: 3px solid #f093fb;\">\n<div style=\"text-align: center;\">\n<h4 style=\"margin: 5px 0;\">🐳 Docker Container (Server)</h4>\n⚡ FastAPI Server → 🎮 Environment Logic → 🎯 Game/Simulation\n</div>\n</td>\n</tr>\n</table>\n</div>\n\n### 📁 The Pattern - Every Environment Has:\n\n```\nsrc/envs/your_env/\n├── 📝 models.py         ← Type-safe contracts (Action, Observation, State)\n├── 📱 client.py         ← Client API (what you import)\n└── 🖥️ server/\n    ├── environment.py  ← Environment logic\n    ├── app.py          ← FastAPI server\n    └── Dockerfile      ← Container\n```\n\n### 🎮 Current Integrations\n\n<div style=\"display: flex; flex-wrap: wrap; gap: 10px; margin: 20px 0;\">\n    <div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>🎯 OpenSpiel</h4>\n        <p>6 games from DeepMind</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>📢 Echo</h4>\n        <p>Test environment</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>💻 Coding</h4>\n        <p>Python execution</p>\n    </div>\n    <div style=\"background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;\">\n        <h4>🕹️ Atari</h4>\n        <p>Classic games</p>\n    </div>\n</div>\n\nLet's explore one integration to see how it all works...\n\n---"
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 2: The Problem with Traditional RL 😤\n",
+    "\n",
+    "## Why Can't We Just Use OpenAI Gym?\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Problem</th>\n",
+    "<th>Traditional Approach</th>\n",
+    "<th>OpenEnv Solution</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
+    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Isolation</b></td>\n",
+    "<td>❌ Same process (can crash your training)</td>\n",
+    "<td>✅ Docker containers (fully isolated)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Deployment</b></td>\n",
+    "<td>❌ \"Works on my machine\"</td>\n",
+    "<td>✅ Same container everywhere</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Scaling</b></td>\n",
+    "<td>❌ Hard to distribute</td>\n",
+    "<td>✅ Deploy to Kubernetes</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Language</b></td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any language (HTTP API)</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "## 💡 The OpenEnv Philosophy\n",
+    "\n",
+    "**\"RL environments should be like microservices\"**\n",
+    "\n",
+    "- 🔒 **Isolated**: Run in containers\n",
+    "- 🌐 **Standard**: HTTP API, works everywhere\n",
+    "- 📦 **Versioned**: Docker images\n",
+    "- 🚀 **Scalable**: Deploy anywhere\n",
+    "- 🛡️ **Type-safe**: Know exactly what you're sending/receiving\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## ⚙️ Part 3: Setup\n",
+    "### The Architecture\n",
+    "\n",
+    "```\n",
+    "┌────────────────────────────────────────────────────────────┐\n",
+    "│  YOUR TRAINING CODE                                        │\n",
+    "│                                                            │\n",
+    "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
+    "│  result = env.reset()           ← Type-safe!             │\n",
+    "│  result = env.step(action)      ← Type-safe!             │\n",
+    "│                                                            │\n",
+    "└─────────────────┬──────────────────────────────────────────┘\n",
+    "                  │\n",
+    "                  │  HTTP/JSON (Language-Agnostic)\n",
+    "                  │  POST /reset, POST /step, GET /state\n",
+    "                  │\n",
+    "┌─────────────────▼──────────────────────────────────────────┐\n",
+    "│  DOCKER CONTAINER                                          │\n",
+    "│                                                            │\n",
+    "│  ┌──────────────────────────────────────────────┐         │\n",
+    "│  │  FastAPI Server                              │         │\n",
+    "│  │  └─ Environment (reset, step, state)         │         │\n",
+    "│  │     └─ Your Game/Simulation Logic            │         │\n",
+    "│  └──────────────────────────────────────────────┘         │\n",
+    "│                                                            │\n",
+    "│  Isolated • Reproducible • Secure                          │\n",
+    "└────────────────────────────────────────────────────────────┘\n",
+    "```\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
+    "\n",
+    "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
+    "\n",
+    "```python\n",
+    "env.reset()    # Under the hood: HTTP POST to /reset\n",
+    "env.step(...)  # Under the hood: HTTP POST to /step\n",
+    "env.state()    # Under the hood: HTTP GET to /state\n",
+    "```\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 3: Setup 🛠️\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
+    "\n",
+    "**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n",
+    "\n",
+    "**Running locally?** Launch the notebook from the `examples/` directory; the setup cell adds `../src` to the import path.\n",
     "\n",
-    "<div align=\"center\">\n",
-    "    <h3>🔧 Getting Started</h3>\n",
     "</div>"
    ]
   },
@@ -118,25 +286,33 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Check if in Colab\n",
+    "# Detect environment\n",
     "try:\n",
     "    import google.colab\n",
     "    IN_COLAB = True\n",
+    "    print(\"🌐 Running in Google Colab\")\n",
     "except ImportError:\n",
     "    IN_COLAB = False\n",
+    "    print(\"💻 Running locally\")\n",
     "\n",
     "if IN_COLAB:\n",
-    "    !git clone https://github.com/meta-pytorch/OpenEnv.git\n",
+    "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
+    "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
     "    %cd OpenEnv\n",
+    "    \n",
+    "    print(\"📚 Installing dependencies...\")\n",
     "    !pip install -q fastapi uvicorn requests\n",
+    "    \n",
     "    import sys\n",
     "    sys.path.insert(0, './src')\n",
-    "    print(\"✅ OpenEnv ready!\")\n",
+    "    print(\"\\n✅ Setup complete!\")\n",
     "else:\n",
     "    import sys\n",
     "    from pathlib import Path\n",
-    "    sys.path.insert(0, str(Path.cwd() / 'src'))\n",
-    "    print(\"✅ Using local OpenEnv\")"
+    "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
+    "    print(\"✅ Using local OpenEnv\")\n",
+    "\n",
+    "print(\"\\n🚀 Ready to explore OpenEnv!\")"
    ]
   },
   {
@@ -145,13 +321,29 @@
    "source": [
     "---\n",
     "\n",
-    "## 🔍 Part 4: Exploring OpenEnv's Structure\n",
+    "# Part 4: The OpenEnv Pattern 🏗️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## Every OpenEnv Environment Has 3 Components:\n",
+    "\n",
+    "```\n",
+    "src/envs/your_env/\n",
+    "├── 📝 models.py          ← Type-safe contracts\n",
+    "│                           (Action, Observation, State)\n",
+    "│\n",
+    "├── 📱 client.py          ← What YOU import\n",
+    "│                           (HTTPEnvClient implementation)\n",
+    "│\n",
+    "└── 🖥️  server/\n",
+    "    ├── environment.py    ← Game/simulation logic\n",
+    "    ├── app.py            ← FastAPI server\n",
+    "    └── Dockerfile        ← Container definition\n",
+    "```\n",
     "\n",
-    "<div align=\"center\">\n",
-    "    <h3>Let's look at the actual code!</h3>\n",
     "</div>\n",
     "\n",
-    "### 🧩 The Base Classes"
+    "Let's explore the actual OpenEnv code to see how this works:"
    ]
   },
   {
@@ -160,49 +352,49 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Import OpenEnv's core abstractions\n",
     "from core.env_server import Environment, Action, Observation, State\n",
     "from core.http_env_client import HTTPEnvClient\n",
     "\n",
-    "print(\"=\" * 70)\n",
-    "print(\"🔧 OpenEnv Core Abstractions\")\n",
-    "print(\"=\" * 70)\n",
+    "print(\"=\"*70)\n",
+    "print(\"   🧩 OPENENV CORE ABSTRACTIONS\")\n",
+    "print(\"=\"*70)\n",
     "\n",
     "print(\"\"\"\n",
     "🖥️  SERVER SIDE (runs in Docker):\n",
     "\n",
-    "  class Environment(ABC):\n",
-    "      '''Base class for all environment implementations'''\n",
-    "      \n",
-    "      @abstractmethod\n",
-    "      def reset(self) -> Observation:\n",
-    "          '''Start new episode'''\n",
-    "      \n",
-    "      @abstractmethod\n",
-    "      def step(self, action: Action) -> Observation:\n",
-    "          '''Execute action'''\n",
-    "      \n",
-    "      @property\n",
-    "      def state(self) -> State:\n",
-    "          '''Episode metadata'''\n",
+    "    class Environment(ABC):\n",
+    "        '''Base class for all environment implementations'''\n",
+    "        \n",
+    "        @abstractmethod\n",
+    "        def reset(self) -> Observation:\n",
+    "            '''Start new episode'''\n",
+    "        \n",
+    "        @abstractmethod\n",
+    "        def step(self, action: Action) -> Observation:\n",
+    "            '''Execute action, return observation'''\n",
+    "        \n",
+    "        @property\n",
+    "        def state(self) -> State:\n",
+    "            '''Get episode metadata'''\n",
     "\n",
     "📱 CLIENT SIDE (your training code):\n",
     "\n",
-    "  class HTTPEnvClient(ABC):\n",
-    "      '''Base class for HTTP clients'''\n",
-    "      \n",
-    "      def reset(self) -> StepResult:\n",
-    "          # HTTP POST to /reset\n",
-    "      \n",
-    "      def step(self, action) -> StepResult:\n",
-    "          # HTTP POST to /step\n",
-    "      \n",
-    "      def state(self) -> State:\n",
-    "          # HTTP GET to /state\n",
+    "    class HTTPEnvClient(ABC):\n",
+    "        '''Base class for HTTP clients'''\n",
+    "        \n",
+    "        def reset(self) -> StepResult:\n",
+    "            # HTTP POST /reset\n",
+    "        \n",
+    "        def step(self, action) -> StepResult:\n",
+    "            # HTTP POST /step\n",
+    "        \n",
+    "        def state(self) -> State:\n",
+    "            # HTTP GET /state\n",
     "\"\"\")\n",
     "\n",
-    "print(\"=\" * 70)\n",
-    "print(\"💡 Same interface, communication via HTTP!\")\n",
-    "print(\"=\" * 70)"
+    "print(\"=\"*70)\n",
+    "print(\"\\n💡 Same interface on both sides - communication via HTTP!\\n\")"
    ]
   },
   {
@@ -211,35 +403,42 @@
    "source": [
     "---\n",
     "\n",
-    "## 🔌 Part 5: Example Integration - OpenSpiel\n",
+    "# Part 5: Example Integration - OpenSpiel 🎮\n",
     "\n",
-    "<div align=\"center\">\n",
-    "    <img src=\"https://img.shields.io/badge/OpenSpiel-DeepMind-red?style=for-the-badge\" />\n",
-    "    <h3>70+ Game Environments</h3>\n",
-    "</div>\n",
+    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "### 🎮 What is OpenSpiel?\n",
+    "## What is OpenSpiel?\n",
     "\n",
-    "OpenSpiel is a **library from DeepMind** with 70+ game environments for RL research.\n",
+    "**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n",
     "\n",
-    "### 🎯 Our Integration\n",
+    "## OpenEnv's Integration\n",
     "\n",
-    "**OpenEnv wraps 6 OpenSpiel games** following our standard pattern:\n",
+    "We've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n",
     "\n",
     "<table>\n",
     "<tr>\n",
-    "<td align=\"center\">🎯<br><b>Catch</b><br><sub>Catch falling ball</sub></td>\n",
-    "<td align=\"center\">❌<br><b>Tic-Tac-Toe</b><br><sub>Classic 3×3</sub></td>\n",
-    "<td align=\"center\">🃏<br><b>Kuhn Poker</b><br><sub>Imperfect info</sub></td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td align=\"center\">🏔️<br><b>Cliff Walking</b><br><sub>Grid navigation</sub></td>\n",
-    "<td align=\"center\">🔢<br><b>2048</b><br><sub>Tile puzzle</sub></td>\n",
-    "<td align=\"center\">🂡<br><b>Blackjack</b><br><sub>Card game</sub></td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎯 Single-Player**\n",
+    "1. **Catch** - Catch falling ball\n",
+    "2. **Cliff Walking** - Navigate grid\n",
+    "3. **2048** - Tile puzzle\n",
+    "4. **Blackjack** - Card game\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**👥 Multi-Player**\n",
+    "5. **Tic-Tac-Toe** - Classic 3×3\n",
+    "6. **Kuhn Poker** - Imperfect info poker\n",
+    "\n",
+    "</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "Let's see how the integration is structured:"
+    "The same pattern can be applied to wrap other existing RL libraries.\n",
+    "\n",
+    "</div>"
    ]
   },
   {
@@ -248,6 +447,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Import OpenSpiel integration models\n",
     "from envs.openspiel_env.models import (\n",
     "    OpenSpielAction,\n",
     "    OpenSpielObservation,\n",
@@ -255,35 +455,50 @@
     ")\n",
     "from dataclasses import fields\n",
     "\n",
-    "print(\"=\" * 70)\n",
-    "print(\"🔒 OpenSpiel Integration - Type-Safe Models\")\n",
-    "print(\"=\" * 70)\n",
+    "print(\"=\"*70)\n",
+    "print(\"   🎮 OPENSPIEL INTEGRATION - TYPE-SAFE MODELS\")\n",
+    "print(\"=\"*70)\n",
     "\n",
     "print(\"\\n📤 OpenSpielAction (what you send):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielAction):\n",
-    "    print(f\"   • {field.name}: {field.type}\")\n",
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
     "\n",
     "print(\"\\n📥 OpenSpielObservation (what you receive):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielObservation):\n",
-    "    print(f\"   • {field.name}: {field.type}\")\n",
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
     "\n",
     "print(\"\\n📊 OpenSpielState (episode metadata):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielState):\n",
-    "    print(f\"   • {field.name}: {field.type}\")\n",
-    "\n",
-    "print(\"\\n\" + \"=\" * 70)\n",
-    "print(\"💡 This is how OpenEnv integrates external libraries:\")\n",
-    "print(\"   1. Wrap in standardized types\")\n",
-    "print(\"   2. Expose via HTTPEnvClient\")\n",
-    "print(\"   3. Package in Docker\")\n",
-    "print(\"=\" * 70)"
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*70)\n",
+    "print(\"\\n💡 Type safety means:\")\n",
+    "print(\"   ✅ Your IDE autocompletes these fields\")\n",
+    "print(\"   ✅ Typos are caught by a type checker before running\")\n",
+    "print(\"   ✅ Refactoring is safer\")\n",
+    "print(\"   ✅ Self-documenting code\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### 🔧 How the Client Works"
+    "### How the Client Works\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
+    "\n",
+    "The client **inherits from HTTPEnvClient** and implements 3 methods:\n",
+    "\n",
+    "1. `_step_payload()` - Convert action → JSON\n",
+    "2. `_parse_result()` - Parse JSON → typed observation  \n",
+    "3. `_parse_state()` - Parse JSON → state\n",
+    "\n",
+    "That's it! The base class handles all HTTP communication.\n",
+    "\n",
+    "</div>"
    ]
   },
   {
@@ -294,41 +509,47 @@
    "source": [
     "from envs.openspiel_env.client import OpenSpielEnv\n",
     "\n",
-    "print(\"=\" * 70)\n",
-    "print(\"📱 OpenSpielEnv Client (HTTPEnvClient Implementation)\")\n",
-    "print(\"=\" * 70)\n",
+    "print(\"=\"*70)\n",
+    "print(\"   🔌 HOW OPENENV WRAPS OPENSPIEL\")\n",
+    "print(\"=\"*70)\n",
     "\n",
     "print(\"\"\"\n",
-    "How OpenEnv wraps OpenSpiel:\n",
-    "\n",
     "class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n",
     "    \n",
     "    def _step_payload(self, action: OpenSpielAction) -> dict:\n",
-    "        '''Convert action to JSON for HTTP request'''\n",
+    "        '''Convert typed action to JSON for HTTP'''\n",
     "        return {\n",
     "            \"action_id\": action.action_id,\n",
     "            \"game_name\": action.game_name,\n",
     "        }\n",
     "    \n",
     "    def _parse_result(self, payload: dict) -> StepResult:\n",
-    "        '''Parse HTTP response into typed observation'''\n",
+    "        '''Parse HTTP JSON response into typed observation'''\n",
     "        return StepResult(\n",
     "            observation=OpenSpielObservation(...),\n",
     "            reward=payload['reward'],\n",
     "            done=payload['done']\n",
     "        )\n",
     "\n",
-    "Usage (same for ALL OpenEnv environments):\n",
+    "\"\"\")\n",
     "\n",
+    "print(\"─\" * 70)\n",
+    "print(\"\\n✨ Usage (works for ALL OpenEnv environments):\")\n",
+    "print(\"\"\"\n",
     "  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
-    "  result = env.reset()  # Returns StepResult[OpenSpielObservation]\n",
+    "  \n",
+    "  result = env.reset()\n",
+    "  # Returns StepResult[OpenSpielObservation] - Type safe!\n",
+    "  \n",
     "  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
-    "  state = env.state()   # Returns OpenSpielState\n",
+    "  # Type checker knows this is valid!\n",
+    "  \n",
+    "  state = env.state()\n",
+    "  # Returns OpenSpielState\n",
     "\"\"\")\n",
     "\n",
-    "print(\"=\" * 70)\n",
-    "print(\"💡 This pattern works for ANY environment you want to wrap!\")\n",
-    "print(\"=\" * 70)"
+    "print(\"─\" * 70)\n",
+    "print(\"\\n🎯 This pattern works for ANY environment you want to wrap!\\n\")"
    ]
   },
   {
@@ -337,25 +558,68 @@
    "source": [
     "---\n",
     "\n",
-    "## 🎯 Part 6: Interactive Demo - See It In Action\n",
+    "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
     "\n",
-    "<div align=\"center\">\n",
-    "    <h2>🎮 Let's Build the Catch Game!</h2>\n",
-    "    <img width=\"200\" src=\"https://user-images.githubusercontent.com/placeholder-catch-game.gif\" onerror=\"this.style.display='none'\" />\n",
-    "</div>\n",
+    "# 🎮 Part 6: Interactive Demo\n",
+    "\n",
+    "### Now let's BUILD something!\n",
     "\n",
-    "### 🎲 The Game Rules:\n",
+    "We'll create a Catch game following OpenEnv patterns,<br>\n",
+    "then watch 4 different AI policies compete.\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## The Game: Catch 🔴🏓\n",
     "\n",
     "<table>\n",
     "<tr>\n",
-    "<td width=\"25%\" align=\"center\">📐<br><b>5×5 Grid</b></td>\n",
-    "<td width=\"25%\" align=\"center\">🔴<br><b>Ball falls</b></td>\n",
-    "<td width=\"25%\" align=\"center\">🏓<br><b>Catch it!</b></td>\n",
-    "<td width=\"25%\" align=\"center\">🎁<br><b>+1 reward</b></td>\n",
+    "<td width=\"40%\" style=\"text-align: center;\">\n",
+    "\n",
+    "```\n",
+    "⬜ ⬜ 🔴 ⬜ ⬜   \n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   Ball\n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   falls\n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   down\n",
+    "⬜ ⬜ 🏓 ⬜ ⬜   \n",
+    "     Paddle\n",
+    "```\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"60%\">\n",
+    "\n",
+    "**Rules:**\n",
+    "- 5×5 grid\n",
+    "- Ball falls from random column\n",
+    "- Move paddle to catch it\n",
+    "\n",
+    "**Actions:**\n",
+    "- `0` = Move LEFT ⬅️\n",
+    "- `1` = STAY 🛑\n",
+    "- `2` = Move RIGHT ➡️\n",
+    "\n",
+    "**Reward:**\n",
+    "- `+1` if caught 🎉\n",
+    "- `0` if missed 😢\n",
+    "\n",
+    "</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "**Actions**: 0=LEFT ⬅️ | 1=STAY ⏸️ | 2=RIGHT ➡️"
+    "<div style=\"background-color: #d4edda; padding: 15px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "**🎯 Why This Game?**\n",
+    "- Simple rules (easy to understand)\n",
+    "- Visual (see what's happening)\n",
+    "- Fast episodes (about 4 steps on the 5×5 grid)\n",
+    "- Clear success/failure\n",
+    "- Perfect for testing policies!\n",
+    "\n",
+    "</div>"
    ]
   },
   {
@@ -368,24 +632,38 @@
     "from dataclasses import dataclass\n",
     "from typing import List, Tuple\n",
     "\n",
-    "# Define types (following OpenEnv pattern)\n",
+    "# ============================================================================\n",
+    "# MODELS - Type-safe contracts (following OpenEnv pattern)\n",
+    "# ============================================================================\n",
+    "\n",
     "@dataclass\n",
     "class CatchObservation:\n",
-    "    \"\"\"Type-safe observation.\"\"\"\n",
-    "    info_state: List[float]\n",
-    "    legal_actions: List[int]\n",
-    "    done: bool\n",
-    "    reward: float\n",
+    "    \"\"\"Type-safe observation modeled on OpenEnv's Observation base class.\"\"\"\n",
+    "    info_state: List[float]      # Grid as flat array\n",
+    "    legal_actions: List[int]     # [0, 1, 2] always\n",
+    "    done: bool                   # Episode finished?\n",
+    "    reward: float                # +1 or 0\n",
+    "    # Extra fields for visualization\n",
     "    ball_position: Tuple[int, int]\n",
     "    paddle_position: int\n",
     "\n",
     "\n",
+    "# ============================================================================\n",
+    "# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n",
+    "# ============================================================================\n",
+    "\n",
     "class CatchEnvironment:\n",
     "    \"\"\"\n",
-    "    Catch game following OpenEnv Environment pattern.\n",
+    "    Catch game following OpenEnv's Environment pattern.\n",
+    "    \n",
+    "    In production:\n",
+    "      • Runs in Docker container\n",
+    "      • Accessed via HTTPEnvClient\n",
+    "      • Exposed via FastAPI server\n",
     "    \n",
-    "    In production: This would run in Docker, accessed via HTTPEnvClient\n",
-    "    For demo: We run it locally to see the internals\n",
+    "    For this demo:\n",
+    "      • We run it locally to see internals\n",
+    "      • But the structure is identical!\n",
     "    \"\"\"\n",
     "    \n",
     "    def __init__(self, grid_size=5):\n",
@@ -400,14 +678,21 @@
     "        return self._make_observation()\n",
     "    \n",
     "    def step(self, action: int) -> CatchObservation:\n",
-    "        \"\"\"Execute action (implements Environment.step()).\"\"\"\n",
+    "        \"\"\"Execute action (implements Environment.step()).\n",
+    "        \n",
+    "        Args:\n",
+    "            action: 0=LEFT, 1=STAY, 2=RIGHT\n",
+    "        \"\"\"\n",
+    "        # Move paddle\n",
     "        if action == 0 and self.paddle_col > 0:\n",
     "            self.paddle_col -= 1\n",
     "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
     "            self.paddle_col += 1\n",
     "        \n",
+    "        # Move ball down\n",
     "        self.ball_row += 1\n",
     "        \n",
+    "        # Check if episode done\n",
     "        if self.ball_row >= self.grid_size - 1:\n",
     "            self.done = True\n",
     "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
@@ -417,11 +702,13 @@
     "        return self._make_observation(reward)\n",
     "    \n",
     "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
+    "        \"\"\"Create type-safe observation.\"\"\"\n",
+    "        # Flatten grid to vector (like real RL environments do)\n",
     "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
     "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
     "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
-    "        info_state[ball_idx] = 1.0\n",
-    "        info_state[paddle_idx] = 0.5\n",
+    "        info_state[ball_idx] = 1.0      # Ball = 1.0\n",
+    "        info_state[paddle_idx] = 0.5    # Paddle = 0.5\n",
     "        \n",
     "        return CatchObservation(\n",
     "            info_state=info_state,\n",
@@ -433,6 +720,7 @@
     "        )\n",
     "    \n",
     "    def render(self):\n",
+    "        \"\"\"Visualize current state.\"\"\"\n",
     "        for row in range(self.grid_size):\n",
     "            line = \"  \"\n",
     "            for col in range(self.grid_size):\n",
@@ -444,17 +732,21 @@
     "                    line += \"⬜ \"\n",
     "            print(line)\n",
     "\n",
+    "\n",
     "print(\"✅ Environment created following OpenEnv pattern!\")\n",
-    "print(\"   🔧 Implements: reset(), step()\")\n",
-    "print(\"   🔒 Returns: Type-safe observations\")\n",
-    "print(\"   🐳 In production: Would run in Docker + FastAPI\")"
+    "print(\"\\n📋 What we just built:\")\n",
+    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
+    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
+    "print(\"   • render() → Visual display\")\n",
+    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
+    "print(\"   But the structure is EXACTLY the same!\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### 🧪 Test It"
+    "### Test the Environment"
    ]
   },
   {
@@ -463,21 +755,65 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# Create environment\n",
     "env = CatchEnvironment()\n",
     "obs = env.reset()\n",
     "\n",
-    "print(\"🎮 Initial State:\")\n",
-    "print(\"=\" * 50)\n",
+    "print(\"=\"*60)\n",
+    "print(\"   🎮 INITIAL STATE\")\n",
+    "print(\"=\"*60 + \"\\n\")\n",
     "env.render()\n",
-    "print(f\"\\n🔴 Ball: column {obs.ball_position[1]}\")\n",
-    "print(f\"🏓 Paddle: column {obs.paddle_position}\")\n",
-    "print(f\"⚡ Legal actions: {obs.legal_actions} (0=LEFT, 1=STAY, 2=RIGHT)\")"
+    "print(f\"\\n🔴 Ball at: column {obs.ball_position[1]}\")\n",
+    "print(f\"🏓 Paddle at: column {obs.paddle_position}\")\n",
+    "print(\"\\n📊 Observation:\")\n",
+    "print(f\"   • Legal actions: {obs.legal_actions}\")\n",
+    "print(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\n",
+    "print(f\"   • Done: {obs.done}\")\n",
+    "print(f\"   • Reward: {obs.reward}\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n## 🤖 Part 7: Different Policies\n\n<div align=\"center\">\n    <h3>A policy maps: Observation → Action</h3>\n</div>\n\nLet's test 4 strategies from dumb to smart!\n\n<div style=\"margin: 30px auto; max-width: 600px;\">\n<table style=\"width: 100%; border: none;\">\n<tr>\n<td rowspan=\"4\" style=\"background: #fff4e1; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #ffc107; vertical-align: middle;\">\n<b>👀 Observation</b><br><small>Ball & paddle positions</small>\n</td>\n<td style=\"font-size: 32px; text-align: center; padding: 0 20px;\">→</td>\n<td rowspan=\"4\" style=\"background: #ffe1f5; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #f093fb; vertical-align: middle;\">\n<b>🤖 Policy</b><br><small>Decision maker</small>\n</td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #e1ffe1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #43e97b;\">\n🎲 Random\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #ffe1e1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #f5576c;\">\n⏸️ Always Stay\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #e1f5ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #4facfe;\">\n🎯 Smart\n</td>\n</tr>\n<tr>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\"></td>\n<td style=\"font-size: 24px; text-align: center; padding: 0 15px;\">→</td>\n<td style=\"background: #f5e1ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #b388ff;\">\n🧠 Learning\n</td>\n</tr>\n</table>\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 7: Four Policies 🤖\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## Let's test 4 different AI strategies:\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th width=\"25%\">Policy</th>\n",
+    "<th width=\"50%\">Strategy</th>\n",
+    "<th width=\"25%\">Expected Performance</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🎲 Random</b></td>\n",
+    "<td>Pick random action every step</td>\n",
+    "<td>~20% (pure luck)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🛑 Always Stay</b></td>\n",
+    "<td>Never move, hope ball lands in center</td>\n",
+    "<td>~20% (terrible!)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🧠 Smart</b></td>\n",
+    "<td>Move paddle toward ball</td>\n",
+    "<td>100% (optimal!)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>📈 Learning</b></td>\n",
+    "<td>Start random, shift toward the smart strategy as exploration decays</td>\n",
+    "<td>~85% (improves over time)</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "code",
@@ -485,55 +821,81 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "# ============================================================================\n",
+    "# POLICIES - Different AI strategies\n",
+    "# ============================================================================\n",
+    "\n",
     "class RandomPolicy:\n",
-    "    name = \"🎲 Random\"\n",
-    "    def select_action(self, obs): \n",
+    "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
+    "    name = \"🎲 Random Guesser\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        return random.choice(obs.legal_actions)\n",
     "\n",
+    "\n",
     "class AlwaysStayPolicy:\n",
-    "    name = \"⏸️ Always Stay\"\n",
-    "    def select_action(self, obs): \n",
-    "        return 1\n",
+    "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
+    "    name = \"🛑 Always Stay\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        return 1  # STAY\n",
+    "\n",
     "\n",
     "class SmartPolicy:\n",
-    "    name = \"🎯 Smart Heuristic\"\n",
-    "    def select_action(self, obs):\n",
+    "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
+    "    name = \"🧠 Smart Heuristic\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        ball_col = obs.ball_position[1]\n",
     "        paddle_col = obs.paddle_position\n",
-    "        if paddle_col < ball_col: return 2\n",
-    "        elif paddle_col > ball_col: return 0\n",
-    "        else: return 1\n",
+    "        \n",
+    "        if paddle_col < ball_col:\n",
+    "            return 2  # Move RIGHT\n",
+    "        elif paddle_col > ball_col:\n",
+    "            return 0  # Move LEFT\n",
+    "        else:\n",
+    "            return 1  # STAY (already aligned)\n",
+    "\n",
     "\n",
     "class LearningPolicy:\n",
-    "    name = \"🧠 Learning Agent\"\n",
+    "    \"\"\"Simulated RL: epsilon-greedy over the smart heuristic, with decaying exploration.\"\"\"\n",
+    "    name = \"📈 Learning Agent\"\n",
+    "    \n",
     "    def __init__(self):\n",
     "        self.steps = 0\n",
     "    \n",
-    "    def select_action(self, obs):\n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        self.steps += 1\n",
+    "        \n",
+    "        # Decay exploration rate over time\n",
     "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
     "        \n",
     "        if random.random() < epsilon:\n",
+    "            # Explore: random action\n",
     "            return random.choice(obs.legal_actions)\n",
     "        else:\n",
+    "            # Exploit: use smart strategy\n",
     "            ball_col = obs.ball_position[1]\n",
     "            paddle_col = obs.paddle_position\n",
-    "            if paddle_col < ball_col: return 2\n",
-    "            elif paddle_col > ball_col: return 0\n",
-    "            else: return 1\n",
-    "\n",
-    "print(\"✅ 4 Policies created!\")\n",
-    "print(\"   🎲 Random - Baseline\")\n",
-    "print(\"   ⏸️  Always Stay - Bad strategy\")\n",
-    "print(\"   🎯 Smart - Optimal heuristic\")\n",
-    "print(\"   🧠 Learning - Simulated RL\")"
+    "            if paddle_col < ball_col:\n",
+    "                return 2\n",
+    "            elif paddle_col > ball_col:\n",
+    "                return 0\n",
+    "            else:\n",
+    "                return 1\n",
+    "\n",
+    "\n",
+    "print(\"✅ 4 Policies created!\\n\")\n",
+    "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
+    "for i, policy in enumerate(policies, 1):\n",
+    "    print(f\"   {i}. {policy.name}\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### 👀 Watch Them Play"
+    "### Watch a Policy Play!"
    ]
   },
   {
@@ -545,46 +907,85 @@
     "import time\n",
     "\n",
     "def run_episode(env, policy, visualize=True, delay=0.4):\n",
+    "    \"\"\"Run one episode with a policy.\"\"\"\n",
+    "    \n",
+    "    # RESET\n",
     "    obs = env.reset()\n",
     "    \n",
     "    if visualize:\n",
-    "        print(f\"\\n{'='*50}\")\n",
-    "        print(f\"🤖 Policy: {policy.name} | 🔴 Ball: col {obs.ball_position[1]}\")\n",
-    "        print('='*50 + '\\n')\n",
+    "        print(f\"\\n{'='*60}\")\n",
+    "        print(f\"   🎮 {policy.name}\")\n",
+    "        print(f\"   🔴 Ball will fall at column: {obs.ball_position[1]}\")\n",
+    "        print('='*60 + '\\n')\n",
     "        env.render()\n",
     "        time.sleep(delay)\n",
     "    \n",
     "    total_reward = 0\n",
     "    step = 0\n",
+    "    action_names = [\"⬅️  LEFT\", \"🛑 STAY\", \"➡️  RIGHT\"]\n",
     "    \n",
+    "    # THE RL LOOP\n",
     "    while not obs.done:\n",
+    "        # 1. Policy chooses action\n",
     "        action = policy.select_action(obs)\n",
+    "        \n",
+    "        # 2. Environment executes\n",
     "        obs = env.step(action)\n",
+    "        \n",
+    "        # 3. Collect reward\n",
     "        total_reward += obs.reward\n",
     "        \n",
     "        if visualize:\n",
-    "            actions = [\"⬅️ LEFT\", \"⏸️ STAY\", \"➡️ RIGHT\"]\n",
-    "            print(f\"\\n⚡ Step {step + 1}: {actions[action]}\")\n",
+    "            print(f\"\\n📍 Step {step + 1}: {action_names[action]}\")\n",
     "            env.render()\n",
     "            time.sleep(delay)\n",
     "        \n",
     "        step += 1\n",
     "    \n",
     "    if visualize:\n",
-    "        print(f\"\\n{'🎉 CAUGHT!' if total_reward > 0 else '😢 MISSED'} Reward: {total_reward}\")\n",
+    "        result = \"🎉 CAUGHT!\" if total_reward > 0 else \"😢 MISSED\"\n",
+    "        print(f\"\\n{'='*60}\")\n",
+    "        print(f\"   {result} Reward: {total_reward}\")\n",
+    "        print('='*60)\n",
     "    \n",
     "    return total_reward > 0\n",
     "\n",
-    "# Demo\n",
+    "\n",
+    "# Demo: Watch Smart Policy in action\n",
     "env = CatchEnvironment()\n",
-    "run_episode(env, SmartPolicy(), visualize=True, delay=0.3)"
+    "policy = SmartPolicy()\n",
+    "run_episode(env, policy, visualize=True, delay=0.4)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<div style=\"background-color: #fff3cd; padding: 15px; border-left: 5px solid #ffc107; margin: 20px 0;\">\n",
+    "\n",
+    "**💡 Try changing the policy!**\n",
+    "\n",
+    "Replace `SmartPolicy()` with:\n",
+    "- `RandomPolicy()` - Catches the ball only by luck\n",
+    "- `AlwaysStayPolicy()` - Catches only when the ball falls in the paddle's column\n",
+    "- `LearningPolicy()` - Mostly random at first; improves over many episodes\n",
+    "\n",
+    "</div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### 📊 Compare All Policies"
+    "---\n",
+    "\n",
+    "# Part 8: Policy Competition! 🏆\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "Let's run **50 episodes** for each policy and see who wins!\n",
+    "\n",
+    "</div>"
    ]
   },
   {
@@ -594,35 +995,69 @@
    "outputs": [],
    "source": [
     "def evaluate_policies(num_episodes=50):\n",
-    "    policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
+    "    \"\"\"Compare all policies over many episodes.\"\"\"\n",
+    "    policies = [\n",
+    "        RandomPolicy(),\n",
+    "        AlwaysStayPolicy(),\n",
+    "        SmartPolicy(),\n",
+    "        LearningPolicy(),\n",
+    "    ]\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(f\"🏆 POLICY COMPARISON ({num_episodes} episodes)\")\n",
+    "    print(f\"   🏆 POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
     "    print(\"=\"*70 + \"\\n\")\n",
     "    \n",
     "    results = []\n",
     "    for policy in policies:\n",
+    "        print(f\"Testing {policy.name}...\", end=\" \")\n",
     "        env = CatchEnvironment()\n",
     "        successes = sum(run_episode(env, policy, visualize=False) \n",
     "                       for _ in range(num_episodes))\n",
-    "        rate = (successes / num_episodes) * 100\n",
-    "        results.append((policy.name, rate))\n",
-    "        print(f\"{policy.name:25s}: {rate:5.1f}%\")\n",
+    "        success_rate = (successes / num_episodes) * 100\n",
+    "        results.append((policy.name, success_rate, successes))\n",
+    "        print(\"✓\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"📊 VISUAL COMPARISON\")\n",
+    "    print(\"   📊 RESULTS\")\n",
     "    print(\"=\"*70 + \"\\n\")\n",
     "    \n",
+    "    # Sort by success rate\n",
     "    results.sort(key=lambda x: x[1], reverse=True)\n",
-    "    for name, rate in results:\n",
+    "    \n",
+    "    for name, rate, successes in results:\n",
     "        bar = \"█\" * int(rate / 2)\n",
-    "        print(f\"{name:25s} [{bar:<50}] {rate:.1f}%\")\n",
+    "        print(f\"{name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"💡 RL in action: Random → Learning → Optimal\")\n",
-    "    print(\"=\"*70)\n",
+    "    print(\"\\n💡 Key Insights:\")\n",
+    "    print(\"   • Random (~20%):      Baseline - pure luck\")\n",
+    "    print(\"   • Always Stay (~20%): Bad - only works if the ball lands in the center\")\n",
+    "    print(\"   • Smart (100%):       Optimal - always catches\")\n",
+    "    print(\"   • Learning (~85%):    Improves as exploration decays (simulated learning)\")\n",
+    "    print(\"\\n🎓 This is RL in action:\")\n",
+    "    print(\"   1. Start with exploration (random)\")\n",
+    "    print(\"   2. Learn from rewards\")\n",
+    "    print(\"   3. Converge to optimal behavior\\n\")\n",
+    "\n",
+    "# Run the competition!\n",
+    "evaluate_policies(num_episodes=50)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0; text-align: center;\">\n",
+    "\n",
+    "# 🎉 Congratulations!\n",
     "\n",
-    "evaluate_policies(50)"
+    "### You just built and tested a complete RL environment!\n",
+    "\n",
+    "But we did it **the OpenEnv way**: type-safe, structured, production-ready.\n",
+    "\n",
+    "</div>"
    ]
   },
   {
@@ -631,225 +1066,449 @@
    "source": [
     "---\n",
     "\n",
-    "## 🌐 Part 8: Using Real OpenSpiel Integration\n",
+    "# Part 9: Using Real OpenSpiel 🎮\n",
     "\n",
-    "<div style=\"background-color: #d4edda; border: 2px solid #28a745; border-radius: 10px; padding: 20px; margin: 20px 0;\">\n",
-    "    <h3 style=\"margin-top: 0;\">✨ What We Just Built = How OpenEnv Works!</h3>\n",
-    "</div>\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## What We Just Built vs Production OpenSpiel\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Component</th>\n",
+    "<th>Our Demo</th>\n",
+    "<th>OpenEnv + OpenSpiel</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Environment</b></td>\n",
+    "<td>Local Python class</td>\n",
+    "<td>Docker container</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Communication</b></td>\n",
+    "<td>Direct function calls</td>\n",
+    "<td>HTTP/JSON</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Client</b></td>\n",
+    "<td>Direct access</td>\n",
+    "<td>HTTPEnvClient</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>API</b></td>\n",
+    "<td>reset(), step()</td>\n",
+    "<td>reset(), step() <em>(same!)</em></td>\n",
+    "</tr>\n",
+    "</table>\n",
     "\n",
-    "### 🔄 Demo vs Production:\n",
+    "**🎯 Same structure, production features!**\n",
     "\n",
-    "| Component | 🧪 Our Demo | 🚀 OpenEnv + OpenSpiel |\n",
-    "|-----------|-------------|------------------------|\n",
-    "| Environment | Local class | 🐳 Docker container |\n",
-    "| Communication | Direct calls | 🌐 HTTP |\n",
-    "| Client | Direct access | 📱 HTTPEnvClient |\n",
-    "| Type Safety | ✅ | ✅ |\n",
-    "| API | reset/step | reset/step |\n",
+    "</div>\n",
     "\n",
-    "### 🎮 Using OpenSpiel Integration:\n",
+    "### Using OpenSpiel Integration:\n",
     "\n",
     "```python\n",
-    "# Install OpenSpiel\n",
+    "# 1. Install OpenSpiel\n",
     "!pip install open_spiel\n",
     "\n",
-    "# Import OpenEnv's integration\n",
+    "# 2. Import OpenEnv's integration\n",
     "from envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n",
     "\n",
-    "# Connect to server\n",
+    "# 3. Connect to server (HTTP!)\n",
     "env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
     "\n",
-    "# Same API!\n",
+    "# 4. Same API you just learned!\n",
     "result = env.reset()\n",
     "result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
     "state = env.state()\n",
+    "\n",
+    "# 5. Switch games by changing game_name:\n",
+    "result = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n",
     "```\n",
     "\n",
-    "### 🎯 Available Games:\n",
-    "\n",
-    "<div style=\"display: grid; grid-template-columns: repeat(3, 1fr); gap: 10px; margin: 20px 0;\">\n",
-    "    <div style=\"background: #e1f5ff; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>🎯 Catch</h4>\n",
-    "        <small>What we demoed!</small>\n",
-    "    </div>\n",
-    "    <div style=\"background: #ffe1e1; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>❌ Tic-Tac-Toe</h4>\n",
-    "        <small>2-player</small>\n",
-    "    </div>\n",
-    "    <div style=\"background: #fff4e1; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>🃏 Kuhn Poker</h4>\n",
-    "        <small>Imperfect info</small>\n",
-    "    </div>\n",
-    "    <div style=\"background: #e8f5e9; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>🏔️ Cliff Walking</h4>\n",
-    "        <small>Navigation</small>\n",
-    "    </div>\n",
-    "    <div style=\"background: #f3e5f5; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>🔢 2048</h4>\n",
-    "        <small>Puzzle</small>\n",
-    "    </div>\n",
-    "    <div style=\"background: #fff3e0; padding: 15px; border-radius: 8px; text-align: center;\">\n",
-    "        <h4>🂡 Blackjack</h4>\n",
-    "        <small>Cards</small>\n",
-    "    </div>\n",
-    "</div>\n",
+    "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
-    "---"
+    "**🎮 6 Games Available:**\n",
+    "\n",
+    "1. `\"catch\"` - What we just built!\n",
+    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
+    "3. `\"kuhn_poker\"` - Imperfect information poker\n",
+    "4. `\"cliff_walking\"` - Grid navigation\n",
+    "5. `\"2048\"` - Tile puzzle\n",
+    "6. `\"blackjack\"` - Card game\n",
+    "\n",
+    "**All use the exact same interface!**\n",
+    "\n",
+    "</div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## ➕ Part 9: Adding Your Own Integration\n",
+    "---\n",
+    "\n",
+    "# Part 10: Create Your Own Integration 🛠️\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## The 5-Step Pattern\n",
+    "\n",
+    "Want to wrap your own environment in OpenEnv? Here's how:\n",
     "\n",
-    "<div align=\"center\">\n",
-    "    <h3>🛠️ Want to wrap your own environment?</h3>\n",
-    "    <p>Follow the 5-step pattern!</p>\n",
     "</div>\n",
     "\n",
-    "### 📝 1. Define Types (models.py)\n",
+    "### Step 1: Define Types (`models.py`)\n",
+    "\n",
     "```python\n",
+    "from dataclasses import dataclass\n",
+    "from core.env_server import Action, Observation, State\n",
+    "\n",
     "@dataclass\n",
     "class YourAction(Action):\n",
-    "    # Your action fields\n",
+    "    action_value: int\n",
+    "    # Add your action fields\n",
     "\n",
     "@dataclass\n",
     "class YourObservation(Observation):\n",
-    "    # Your observation fields\n",
+    "    state_data: list[float]\n",
+    "    done: bool\n",
+    "    reward: float\n",
+    "    # Add your observation fields\n",
+    "\n",
+    "@dataclass\n",
+    "class YourState(State):\n",
+    "    episode_id: str\n",
+    "    step_count: int\n",
+    "    # Add your state fields\n",
     "```\n",
     "\n",
-    "### 🖥️ 2. Implement Environment (server/environment.py)\n",
+    "### Step 2: Implement Environment (`server/environment.py`)\n",
+    "\n",
     "```python\n",
+    "from core.env_server import Environment\n",
+    "\n",
     "class YourEnvironment(Environment):\n",
     "    def reset(self) -> Observation:\n",
+    "        # Initialize your game/simulation\n",
     "        return YourObservation(...)\n",
     "    \n",
     "    def step(self, action: Action) -> Observation:\n",
+    "        # Execute action, update state\n",
     "        return YourObservation(...)\n",
+    "    \n",
+    "    @property\n",
+    "    def state(self) -> State:\n",
+    "        return self._state\n",
     "```\n",
     "\n",
-    "### 📱 3. Create Client (client.py)\n",
+    "### Step 3: Create Client (`client.py`)\n",
+    "\n",
     "```python\n",
+    "from core.http_env_client import HTTPEnvClient\n",
+    "from core.types import StepResult\n",
+    "\n",
     "class YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n",
-    "    def _step_payload(self, action):\n",
-    "        return {\"field\": action.field}\n",
+    "    def _step_payload(self, action: YourAction) -> dict:\n",
+    "        \"\"\"Convert action to JSON\"\"\"\n",
+    "        return {\"action_value\": action.action_value}\n",
+    "    \n",
+    "    def _parse_result(self, payload: dict) -> StepResult:\n",
+    "        \"\"\"Parse JSON to observation\"\"\"\n",
+    "        return StepResult(\n",
+    "            observation=YourObservation(...),\n",
+    "            reward=payload['reward'],\n",
+    "            done=payload['done']\n",
+    "        )\n",
     "    \n",
-    "    def _parse_result(self, payload):\n",
-    "        return StepResult(observation=YourObservation(...))\n",
+    "    def _parse_state(self, payload: dict) -> YourState:\n",
+    "        return YourState(...)\n",
     "```\n",
     "\n",
-    "### ⚡ 4. Create Server (server/app.py)\n",
+    "### Step 4: Create Server (`server/app.py`)\n",
+    "\n",
     "```python\n",
     "from core.env_server import create_fastapi_app\n",
+    "from .environment import YourEnvironment\n",
     "\n",
     "env = YourEnvironment()\n",
     "app = create_fastapi_app(env)\n",
+    "\n",
+    "# That's it! OpenEnv creates all endpoints for you.\n",
     "```\n",
     "\n",
-    "### 🐳 5. Dockerize (server/Dockerfile)\n",
+    "### Step 5: Dockerize (`server/Dockerfile`)\n",
+    "\n",
     "```dockerfile\n",
-    "FROM python:3.11\n",
-    "COPY . /app\n",
+    "FROM python:3.11-slim\n",
+    "\n",
     "WORKDIR /app\n",
-    "RUN pip install -r requirements.txt\n",
-    "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\"]\n",
+    "COPY requirements.txt .\n",
+    "RUN pip install --no-cache-dir -r requirements.txt\n",
+    "\n",
+    "COPY . .\n",
+    "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n",
     "```\n",
     "\n",
-    "### 📚 Examples to Study:\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "### 🎓 Examples to Study\n",
+    "\n",
+    "OpenEnv includes 3 complete examples:\n",
+    "\n",
+    "1. **`src/envs/echo_env/`**\n",
+    "   - Simplest possible environment\n",
+    "   - Great for testing and learning\n",
+    "\n",
+    "2. **`src/envs/openspiel_env/`**\n",
+    "   - Wraps external library (OpenSpiel)\n",
+    "   - Shows integration pattern\n",
+    "   - 6 games in one integration\n",
+    "\n",
+    "3. **`src/envs/coding_env/`**\n",
+    "   - Python code execution environment\n",
+    "   - Shows complex use case\n",
+    "   - Security considerations\n",
+    "\n",
+    "**💡 Study these to understand the patterns!**\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
+    "\n",
+    "# 🎓 Summary: Your Journey\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## What You Learned\n",
     "\n",
     "<table>\n",
     "<tr>\n",
-    "<td>📢 <code>src/envs/echo_env/</code></td>\n",
-    "<td>Simple test environment</td>\n",
+    "<td width=\"50%\" style=\"vertical-align: top;\">\n",
+    "\n",
+    "### 📚 Concepts\n",
+    "\n",
+    "✅ **RL Fundamentals**\n",
+    "- The observe-act-reward loop\n",
+    "- What makes good policies\n",
+    "- Exploration vs exploitation\n",
+    "\n",
+    "✅ **OpenEnv Architecture**\n",
+    "- Client-server separation\n",
+    "- Type-safe contracts\n",
+    "- HTTP communication layer\n",
+    "\n",
+    "✅ **Production Patterns**\n",
+    "- Docker isolation\n",
+    "- API design\n",
+    "- Reproducible deployments\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\" style=\"vertical-align: top;\">\n",
+    "\n",
+    "### 🛠️ Skills\n",
+    "\n",
+    "✅ **Using Environments**\n",
+    "- Import OpenEnv clients\n",
+    "- Call reset/step/state\n",
+    "- Work with typed observations\n",
+    "\n",
+    "✅ **Building Environments**\n",
+    "- Define type-safe models\n",
+    "- Implement Environment class\n",
+    "- Create HTTPEnvClient\n",
+    "\n",
+    "✅ **Testing & Debugging**\n",
+    "- Compare policies\n",
+    "- Visualize episodes\n",
+    "- Measure performance\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "</table>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## OpenEnv vs Traditional RL\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Feature</th>\n",
+    "<th>Traditional (Gym)</th>\n",
+    "<th>OpenEnv</th>\n",
+    "<th>Winner</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>❌ Arrays, dicts</td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Isolation</b></td>\n",
+    "<td>❌ Same process</td>\n",
+    "<td>✅ Docker</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Deployment</b></td>\n",
+    "<td>❌ Manual setup</td>\n",
+    "<td>✅ K8s-ready</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Language</b></td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any (HTTP)</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td>🎮 <code>src/envs/openspiel_env/</code></td>\n",
-    "<td>Our OpenSpiel integration</td>\n",
+    "<td><b>Reproducibility</b></td>\n",
+    "<td>❌ \"Works on my machine\"</td>\n",
+    "<td>✅ Same everywhere</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td>💻 <code>src/envs/coding_env/</code></td>\n",
-    "<td>Python code execution</td>\n",
+    "<td><b>Community</b></td>\n",
+    "<td>✅ Large ecosystem</td>\n",
+    "<td>🟡 Growing</td>\n",
+    "<td>🤝 Both!</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "---"
+    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "**🎯 The Bottom Line**\n",
+    "\n",
+    "OpenEnv brings **production engineering** to RL:\n",
+    "- Same environments work locally and in production\n",
+    "- Type safety catches bugs early\n",
+    "- Docker isolation prevents conflicts\n",
+    "- HTTP API works with any language\n",
+    "\n",
+    "**It's RL for 2024 and beyond.**\n",
+    "\n",
+    "</div>"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 🎓 Summary\n",
+    "---\n",
     "\n",
-    "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 20px 0;\">\n",
-    "    <h3 style=\"margin-top: 0; text-align: center;\">🎉 What You Learned</h3>\n",
-    "</div>\n",
+    "## 🚀 Next Steps\n",
     "\n",
-    "### 📖 The Journey:\n",
-    "\n",
-    "1. **🧠 RL Basics** - The core loop\n",
-    "2. **🏗️ OpenEnv Framework** - Standardized, production-ready\n",
-    "3. **🔌 Example Integration** - How OpenSpiel is wrapped\n",
-    "4. **🎯 Interactive Demo** - Policies in action\n",
-    "5. **➕ Adding Integrations** - The pattern to follow\n",
-    "\n",
-    "### ✨ OpenEnv's Value:\n",
-    "\n",
-    "| Feature | 🏠 Traditional | 🚀 OpenEnv |\n",
-    "|---------|---------------|------------|\n",
-    "| **Type Safety** | ❌ | ✅ Dataclasses |\n",
-    "| **Isolation** | ❌ | ✅ Docker |\n",
-    "| **Deployment** | ❌ | ✅ K8s-ready |\n",
-    "| **Language** | Python only | Any (HTTP) |\n",
-    "| **Reproducibility** | ❌ | ✅ Containers |\n",
-    "\n",
-    "### 🚀 Next Steps:\n",
-    "\n",
-    "<div style=\"display: grid; grid-template-columns: repeat(2, 1fr); gap: 15px; margin: 20px 0;\">\n",
-    "    <div style=\"background: #e1f5ff; padding: 20px; border-radius: 10px;\">\n",
-    "        <h4>1️⃣ Try OpenSpiel</h4>\n",
-    "        <p>Install and play with the 6 games</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: #ffe1e1; padding: 20px; border-radius: 10px;\">\n",
-    "        <h4>2️⃣ Implement Real RL</h4>\n",
-    "        <p>Q-learning, DQN, PPO</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: #fff4e1; padding: 20px; border-radius: 10px;\">\n",
-    "        <h4>3️⃣ Wrap Your Environments</h4>\n",
-    "        <p>Follow the 5-step pattern</p>\n",
-    "    </div>\n",
-    "    <div style=\"background: #e8f5e9; padding: 20px; border-radius: 10px;\">\n",
-    "        <h4>4️⃣ Deploy to Production</h4>\n",
-    "        <p>Docker → Kubernetes</p>\n",
-    "    </div>\n",
-    "</div>\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"33%\">\n",
+    "\n",
+    "### 📖 Learn More\n",
     "\n",
-    "### 📚 Resources:\n",
+    "- Explore `src/envs/README.md`\n",
+    "- Read [RFC 001](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
+    "- Check example scripts in `examples/`\n",
+    "- Study OpenSpiel integration\n",
     "\n",
-    "- 🏠 **OpenEnv**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- 📖 **Docs**: `src/envs/README.md`\n",
-    "- 💡 **Examples**: `examples/` directory\n",
+    "</td>\n",
+    "<td width=\"33%\">\n",
     "\n",
+    "### 🛠️ Build\n",
+    "\n",
+    "- Wrap your favorite RL environment\n",
+    "- Implement real RL algorithms (DQN, PPO)\n",
+    "- Create a custom game\n",
+    "- Deploy to production\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"33%\">\n",
+    "\n",
+    "### 🤝 Contribute\n",
+    "\n",
+    "- Star the [repo](https://github.com/meta-pytorch/OpenEnv)\n",
+    "- Report issues\n",
+    "- Submit PRs\n",
+    "- Share your integrations\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "</table>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 📚 Resources\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "**🔗 Links**\n",
+    "\n",
+    "- **OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **Docker Guide**: https://docs.docker.com/get-started/\n",
+    "\n",
+    "**📖 Documentation**\n",
+    "\n",
+    "- Environment creation guide: `src/envs/README.md`\n",
+    "- OpenSpiel integration: `src/envs/openspiel_env/README.md`\n",
+    "- Example scripts: `examples/`\n",
+    "\n",
+    "**🎓 Community**\n",
+    "\n",
+    "- Supported by: Meta PyTorch, Hugging Face, Unsloth AI, and more\n",
+    "- License: BSD 3-Clause\n",
+    "- Contributions welcome!\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
     "---\n",
     "\n",
-    "<div align=\"center\" style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 40px; border-radius: 20px; margin: 30px 0;\">\n",
-    "    <h2>🎉 You're Ready!</h2>\n",
-    "    <p style=\"font-size: 1.2em; margin: 20px 0;\">You now understand:</p>\n",
-    "    <table style=\"margin: 20px auto;\">\n",
-    "        <tr>\n",
-    "            <td>✅ OpenEnv framework</td>\n",
-    "            <td>✅ How integrations work</td>\n",
-    "        </tr>\n",
-    "        <tr>\n",
-    "            <td>✅ Using existing environments</td>\n",
-    "            <td>✅ Creating new integrations</td>\n",
-    "        </tr>\n",
-    "        <tr>\n",
-    "            <td colspan=\"2\">✅ Production deployment</td>\n",
-    "        </tr>\n",
-    "    </table>\n",
-    "    <h3 style=\"margin-top: 30px;\">Welcome to production-ready RL! 🚀</h3>\n",
+    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
+    "\n",
+    "# 🎉 Congratulations!\n",
+    "\n",
+    "### You're now an OpenEnv expert!\n",
+    "\n",
+    "You understand:\n",
+    "- ✅ How RL works\n",
+    "- ✅ Why OpenEnv matters\n",
+    "- ✅ How to use existing environments\n",
+    "- ✅ How to create new integrations\n",
+    "- ✅ How to deploy to production\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### Now go build something amazing! 🚀\n",
+    "\n",
+    "**Welcome to the future of RL.**\n",
+    "\n",
     "</div>"
    ]
   }
@@ -875,4 +1534,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
\ No newline at end of file
+}

From f96db499277a8db170a122a50fc1c9c14c43140b Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:47:15 -0700
Subject: [PATCH 05/19] improve UO

---
 examples/OpenEnv_Tutorial.ipynb | 556 +-------------------------------
 1 file changed, 14 insertions(+), 542 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index fe93747..e534fcc 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -3,74 +3,12 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "<div align=\"center\">\n",
-    "\n",
-    "# 🚀 OpenEnv: Production RL Made Simple\n",
-    "\n",
-    "### *From \"Hello World\" to Production Deployment in 30 Minutes*\n",
-    "\n",
-    "---\n",
-    "\n",
-    "**What if RL environments were as easy to use as REST APIs?**\n",
-    "\n",
-    "That's OpenEnv. Type-safe. Isolated. Production-ready.\n",
-    "\n",
-    "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
-    "[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "---"
-   ]
+   "source": "<div align=\"center\">\n\n<img src=\"https://pytorch.org/assets/images/pytorch-logo.png\" width=\"200\" alt=\"PyTorch\">\n\n# OpenEnv: Production RL Made Simple\n\n### *From \"Hello World\" to Production Deployment in 30 Minutes* ✨\n\n---\n\n**What if RL environments were as easy to use as REST APIs?**\n\nThat's OpenEnv. Type-safe. Isolated. Production-ready. 🎯\n\n[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n[![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?logo=pytorch&logoColor=white)](https://pytorch.org/)\n\n</div>\n\n---"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "## 📋 What You'll Learn\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🎯 Part 1-2: The Fundamentals**\n",
-    "- RL in 60 seconds\n",
-    "- Why existing solutions fall short\n",
-    "- The OpenEnv solution\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🏗️ Part 3-5: The Architecture**\n",
-    "- How OpenEnv works\n",
-    "- Exploring real code\n",
-    "- OpenSpiel integration example\n",
-    "\n",
-    "</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🎮 Part 6-8: Hands-On Demo**\n",
-    "- Build a game environment\n",
-    "- Test 4 different policies\n",
-    "- Watch learning happen live\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🔧 Part 9-10: Going Further**\n",
-    "- Use real OpenSpiel\n",
-    "- Create your own integration\n",
-    "- Deploy to production\n",
-    "\n",
-    "</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!"
-   ]
+   "source": "## 📋 What You'll Learn\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**🎯 Part 1-2: The Fundamentals**\n- ⚡ RL in 60 seconds\n- 🤔 Why existing solutions fall short\n- 💡 The OpenEnv solution\n\n</td>\n<td width=\"50%\">\n\n**🏗️ Part 3-5: The Architecture**\n- 🔧 How OpenEnv works\n- 🔍 Exploring real code\n- 🎮 OpenSpiel integration example\n\n</td>\n</tr>\n<tr>\n<td width=\"50%\">\n\n**🎮 Part 6-8: Hands-On Demo**\n- 🔨 Build a game environment\n- 🤖 Test 4 different policies\n- 👀 Watch learning happen live\n\n</td>\n<td width=\"50%\">\n\n**🔧 Part 9-10: Going Further**\n- 🚀 Use real OpenSpiel\n- ✨ Create your own integration\n- 🌐 Deploy to production\n\n</td>\n</tr>\n</table>\n\n> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n> \n> ⏱️ **Time**: ~30 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
   },
   {
    "cell_type": "markdown",
@@ -106,41 +44,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "import random\n",
-    "\n",
-    "print(\"🎲 Number Guessing Game - The Simplest RL Example\")\n",
-    "print(\"=\" * 60)\n",
-    "\n",
-    "# Environment\n",
-    "target = random.randint(1, 10)\n",
-    "guesses_left = 3\n",
-    "\n",
-    "print(f\"\\n🎯 I'm thinking of a number between 1 and 10...\\n\")\n",
-    "\n",
-    "# The RL Loop\n",
-    "while guesses_left > 0:\n",
-    "    # Policy: Random guessing (no learning yet!)\n",
-    "    guess = random.randint(1, 10)\n",
-    "    guesses_left -= 1\n",
-    "    \n",
-    "    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n",
-    "    \n",
-    "    # Reward signal\n",
-    "    if guess == target:\n",
-    "        print(\"🎉 Correct! +10 points\")\n",
-    "        break\n",
-    "    elif abs(guess - target) <= 2:\n",
-    "        print(\"🔥 Warm! (close)\")\n",
-    "    else:\n",
-    "        print(\"❄️  Cold! (far)\")\n",
-    "else:\n",
-    "    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n",
-    "\n",
-    "print(\"\\n\" + \"=\" * 60)\n",
-    "print(\"\\n💡 This is RL: Observe → Act → Reward → Repeat\")\n",
-    "print(\"   But this policy is terrible! It doesn't learn.\\n\")"
-   ]
+   "source": "import random\n\nprint(\"🎲 \" + \"=\"*58 + \" 🎲\")\nprint(\"   Number Guessing Game - The Simplest RL Example\")\nprint(\"🎲 \" + \"=\"*58 + \" 🎲\")\n\n# Environment setup\ntarget = random.randint(1, 10)\nguesses_left = 3\n\nprint(f\"\\n🎯 I'm thinking of a number between 1 and 10...\")\nprint(f\"💭 You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n\n# The RL Loop - Pure random policy (no learning!)\nwhile guesses_left > 0:\n    # Policy: Random guessing (no learning yet!)\n    guess = random.randint(1, 10)\n    guesses_left -= 1\n    \n    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n    \n    # Reward signal (but we're not using it!)\n    if guess == target:\n        print(\"🎉 Correct! +10 points\")\n        break\n    elif abs(guess - target) <= 2:\n        print(\"🔥 Warm! (close)\")\n    else:\n        print(\"❄️  Cold! (far)\")\nelse:\n    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n\nprint(\"\\n\" + \"=\"*62)\nprint(\"💡 This is RL: Observe → Act → Reward → Repeat\")\nprint(\"   But this policy is terrible! It doesn't learn from rewards.\")\nprint(\"=\"*62 + \"\\n\")"
   },
   {
    "cell_type": "markdown",
@@ -163,60 +67,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "# Part 2: The Problem with Traditional RL 😤\n",
-    "\n",
-    "## Why Can't We Just Use OpenAI Gym?\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<th>Problem</th>\n",
-    "<th>Traditional Approach</th>\n",
-    "<th>OpenEnv Solution</th>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Type Safety</b></td>\n",
-    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
-    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Isolation</b></td>\n",
-    "<td>❌ Same process (can crash your training)</td>\n",
-    "<td>✅ Docker containers (fully isolated)</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Deployment</b></td>\n",
-    "<td>❌ \"Works on my machine\"</td>\n",
-    "<td>✅ Same container everywhere</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Scaling</b></td>\n",
-    "<td>❌ Hard to distribute</td>\n",
-    "<td>✅ Deploy to Kubernetes</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Language</b></td>\n",
-    "<td>❌ Python only</td>\n",
-    "<td>✅ Any language (HTTP API)</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
-    "\n",
-    "## 💡 The OpenEnv Philosophy\n",
-    "\n",
-    "**\"RL environments should be like microservices\"**\n",
-    "\n",
-    "- 🔒 **Isolated**: Run in containers\n",
-    "- 🌐 **Standard**: HTTP API, works everywhere\n",
-    "- 📦 **Versioned**: Docker images\n",
-    "- 🚀 **Scalable**: Deploy anywhere\n",
-    "- 🛡️ **Type-safe**: Know exactly what you're sending/receiving\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n# Part 2: The Problem with Traditional RL 😤\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🤔 Why Can't We Just Use OpenAI Gym?\n\nGood question! Gym is great for research, but production needs more...\n\n</div>\n\n<table>\n<tr>\n<th>Challenge</th>\n<th>Traditional Approach</th>\n<th>OpenEnv Solution</th>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>❌ <code>obs[0][3]</code> - what is this?</td>\n<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n</tr>\n<tr>\n<td><b>Isolation</b></td>\n<td>❌ Same process (can crash your training)</td>\n<td>✅ Docker containers (fully isolated)</td>\n</tr>\n<tr>\n<td><b>Deployment</b></td>\n<td>❌ \"Works on my machine\" 🤷</td>\n<td>✅ Same container everywhere 🐳</td>\n</tr>\n<tr>\n<td><b>Scaling</b></td>\n<td>❌ Hard to distribute</td>\n<td>✅ Deploy to Kubernetes ☸️</td>\n</tr>\n<tr>\n<td><b>Language</b></td>\n<td>❌ Python only</td>\n<td>✅ Any language (HTTP API) 🌐</td>\n</tr>\n<tr>\n<td><b>Debugging</b></td>\n<td>❌ Cryptic numpy errors</td>\n<td>✅ Clear type errors 🐛</td>\n</tr>\n</table>\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n## 💡 The OpenEnv Philosophy\n\n**\"RL environments should be like microservices\"**\n\nThink of it like this: You don't run your database in the same process as your web server, right? Same principle!\n\n- 🔒 **Isolated**: Run in containers (security + stability)\n- 🌐 **Standard**: HTTP API, works everywhere\n- 📦 **Versioned**: Docker images (reproducibility!)\n- 🚀 **Scalable**: Deploy to cloud with one command\n- 🛡️ **Type-safe**: Catch bugs before they happen\n- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -285,35 +136,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "# Detect environment\n",
-    "try:\n",
-    "    import google.colab\n",
-    "    IN_COLAB = True\n",
-    "    print(\"🌐 Running in Google Colab\")\n",
-    "except ImportError:\n",
-    "    IN_COLAB = False\n",
-    "    print(\"💻 Running locally\")\n",
-    "\n",
-    "if IN_COLAB:\n",
-    "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
-    "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
-    "    %cd OpenEnv\n",
-    "    \n",
-    "    print(\"📚 Installing dependencies...\")\n",
-    "    !pip install -q fastapi uvicorn requests\n",
-    "    \n",
-    "    import sys\n",
-    "    sys.path.insert(0, './src')\n",
-    "    print(\"\\n✅ Setup complete!\")\n",
-    "else:\n",
-    "    import sys\n",
-    "    from pathlib import Path\n",
-    "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
-    "    print(\"✅ Using local OpenEnv\")\n",
-    "\n",
-    "print(\"\\n🚀 Ready to explore OpenEnv!\")"
-   ]
+   "source": "# Detect environment\ntry:\n    import google.colab\n    IN_COLAB = True\n    print(\"🌐 Running in Google Colab - Perfect!\")\nexcept ImportError:\n    IN_COLAB = False\n    print(\"💻 Running locally - Nice!\")\n\nif IN_COLAB:\n    print(\"\\n📦 Cloning OpenEnv repository...\")\n    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n    %cd OpenEnv\n    \n    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n    !pip install -q fastapi uvicorn requests\n    \n    import sys\n    sys.path.insert(0, './src')\n    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\nelse:\n    import sys\n    from pathlib import Path\n    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n    print(\"✅ Using local OpenEnv installation\")\n\nprint(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\nprint(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
   },
   {
    "cell_type": "markdown",
@@ -351,51 +174,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "# Import OpenEnv's core abstractions\n",
-    "from core.env_server import Environment, Action, Observation, State\n",
-    "from core.http_env_client import HTTPEnvClient\n",
-    "\n",
-    "print(\"=\"*70)\n",
-    "print(\"   🧩 OPENENV CORE ABSTRACTIONS\")\n",
-    "print(\"=\"*70)\n",
-    "\n",
-    "print(\"\"\"\n",
-    "🖥️  SERVER SIDE (runs in Docker):\n",
-    "\n",
-    "    class Environment(ABC):\n",
-    "        '''Base class for all environment implementations'''\n",
-    "        \n",
-    "        @abstractmethod\n",
-    "        def reset(self) -> Observation:\n",
-    "            '''Start new episode'''\n",
-    "        \n",
-    "        @abstractmethod\n",
-    "        def step(self, action: Action) -> Observation:\n",
-    "            '''Execute action, return observation'''\n",
-    "        \n",
-    "        @property\n",
-    "        def state(self) -> State:\n",
-    "            '''Get episode metadata'''\n",
-    "\n",
-    "📱 CLIENT SIDE (your training code):\n",
-    "\n",
-    "    class HTTPEnvClient(ABC):\n",
-    "        '''Base class for HTTP clients'''\n",
-    "        \n",
-    "        def reset(self) -> StepResult:\n",
-    "            # HTTP POST /reset\n",
-    "        \n",
-    "        def step(self, action) -> StepResult:\n",
-    "            # HTTP POST /step\n",
-    "        \n",
-    "        def state(self) -> State:\n",
-    "            # HTTP GET /state\n",
-    "\"\"\")\n",
-    "\n",
-    "print(\"=\"*70)\n",
-    "print(\"\\n💡 Same interface on both sides - communication via HTTP!\\n\")"
-   ]
+   "source": "# Import OpenEnv's core abstractions\nfrom core.env_server import Environment, Action, Observation, State\nfrom core.http_env_client import HTTPEnvClient\n\nprint(\"=\"*70)\nprint(\"   🧩 OPENENV CORE ABSTRACTIONS\")\nprint(\"=\"*70)\n\nprint(\"\"\"\n🖥️  SERVER SIDE (runs in Docker):\n\n    class Environment(ABC):\n        '''Base class for all environment implementations'''\n        \n        @abstractmethod\n        def reset(self) -> Observation:\n            '''Start new episode'''\n        \n        @abstractmethod\n        def step(self, action: Action) -> Observation:\n            '''Execute action, return observation'''\n        \n        @property\n        def state(self) -> State:\n            '''Get episode metadata'''\n\n📱 CLIENT SIDE (your training code):\n\n    class HTTPEnvClient(ABC):\n        '''Base class for HTTP clients'''\n        \n        def reset(self) -> StepResult:\n            # HTTP POST /reset\n        \n        def step(self, action) -> StepResult:\n            # HTTP POST /step\n        \n        def state(self) -> State:\n            # HTTP GET /state\n\"\"\")\n\nprint(\"=\"*70)\nprint(\"\\n✨ Same interface on both sides - communication via HTTP!\")\nprint(\"🎯 You focus on RL, OpenEnv handles the infrastructure.\\n\")"
   },
   {
    "cell_type": "markdown",
@@ -555,20 +334,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
-    "\n",
-    "# 🎮 Part 6: Interactive Demo\n",
-    "\n",
-    "### Now let's BUILD something!\n",
-    "\n",
-    "We'll create a Catch game following OpenEnv patterns,<br>\n",
-    "then watch 4 different AI policies compete.\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n\n# 🎮 Part 6: Interactive Demo\n\n### Now let's BUILD something!\n\nWe'll create a **Catch game** following OpenEnv patterns,<br>\nthen watch **4 different AI policies** compete for the championship! 🏆\n\n<br>\n\n**Get ready for:**\n- ⚡ Live gameplay visualization\n- 🤖 AI policy showdown\n- 📊 Real-time learning metrics\n- 🎯 Production-ready patterns\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -627,120 +393,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "import random\n",
-    "from dataclasses import dataclass\n",
-    "from typing import List, Tuple\n",
-    "\n",
-    "# ============================================================================\n",
-    "# MODELS - Type-safe contracts (following OpenEnv pattern)\n",
-    "# ============================================================================\n",
-    "\n",
-    "@dataclass\n",
-    "class CatchObservation:\n",
-    "    \"\"\"Type-safe observation following OpenEnv Observation base class.\"\"\"\n",
-    "    info_state: List[float]      # Grid as flat array\n",
-    "    legal_actions: List[int]     # [0, 1, 2] always\n",
-    "    done: bool                   # Episode finished?\n",
-    "    reward: float                # +1 or 0\n",
-    "    # Extra fields for visualization\n",
-    "    ball_position: Tuple[int, int]\n",
-    "    paddle_position: int\n",
-    "\n",
-    "\n",
-    "# ============================================================================\n",
-    "# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n",
-    "# ============================================================================\n",
-    "\n",
-    "class CatchEnvironment:\n",
-    "    \"\"\"\n",
-    "    Catch game following OpenEnv's Environment pattern.\n",
-    "    \n",
-    "    In production:\n",
-    "      • Runs in Docker container\n",
-    "      • Accessed via HTTPEnvClient\n",
-    "      • Exposed via FastAPI server\n",
-    "    \n",
-    "    For this demo:\n",
-    "      • We run it locally to see internals\n",
-    "      • But the structure is identical!\n",
-    "    \"\"\"\n",
-    "    \n",
-    "    def __init__(self, grid_size=5):\n",
-    "        self.grid_size = grid_size\n",
-    "    \n",
-    "    def reset(self) -> CatchObservation:\n",
-    "        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n",
-    "        self.ball_row = 0\n",
-    "        self.ball_col = random.randint(0, self.grid_size - 1)\n",
-    "        self.paddle_col = self.grid_size // 2\n",
-    "        self.done = False\n",
-    "        return self._make_observation()\n",
-    "    \n",
-    "    def step(self, action: int) -> CatchObservation:\n",
-    "        \"\"\"Execute action (implements Environment.step()).\n",
-    "        \n",
-    "        Args:\n",
-    "            action: 0=LEFT, 1=STAY, 2=RIGHT\n",
-    "        \"\"\"\n",
-    "        # Move paddle\n",
-    "        if action == 0 and self.paddle_col > 0:\n",
-    "            self.paddle_col -= 1\n",
-    "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
-    "            self.paddle_col += 1\n",
-    "        \n",
-    "        # Move ball down\n",
-    "        self.ball_row += 1\n",
-    "        \n",
-    "        # Check if episode done\n",
-    "        if self.ball_row >= self.grid_size - 1:\n",
-    "            self.done = True\n",
-    "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
-    "        else:\n",
-    "            reward = 0.0\n",
-    "        \n",
-    "        return self._make_observation(reward)\n",
-    "    \n",
-    "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
-    "        \"\"\"Create type-safe observation.\"\"\"\n",
-    "        # Flatten grid to vector (like real RL environments do)\n",
-    "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
-    "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
-    "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
-    "        info_state[ball_idx] = 1.0      # Ball = 1.0\n",
-    "        info_state[paddle_idx] = 0.5    # Paddle = 0.5\n",
-    "        \n",
-    "        return CatchObservation(\n",
-    "            info_state=info_state,\n",
-    "            legal_actions=[0, 1, 2],\n",
-    "            done=self.done,\n",
-    "            reward=reward,\n",
-    "            ball_position=(self.ball_row, self.ball_col),\n",
-    "            paddle_position=self.paddle_col\n",
-    "        )\n",
-    "    \n",
-    "    def render(self):\n",
-    "        \"\"\"Visualize current state.\"\"\"\n",
-    "        for row in range(self.grid_size):\n",
-    "            line = \"  \"\n",
-    "            for col in range(self.grid_size):\n",
-    "                if row == self.ball_row and col == self.ball_col:\n",
-    "                    line += \"🔴 \"\n",
-    "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
-    "                    line += \"🏓 \"\n",
-    "                else:\n",
-    "                    line += \"⬜ \"\n",
-    "            print(line)\n",
-    "\n",
-    "\n",
-    "print(\"✅ Environment created following OpenEnv pattern!\")\n",
-    "print(\"\\n📋 What we just built:\")\n",
-    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
-    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
-    "print(\"   • render() → Visual display\")\n",
-    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
-    "print(\"   But the structure is EXACTLY the same!\")"
-   ]
+   "source": "import random\nfrom dataclasses import dataclass\nfrom typing import List, Tuple\n\n# ============================================================================\n# MODELS - Type-safe contracts (following OpenEnv pattern)\n# ============================================================================\n\n@dataclass\nclass CatchObservation:\n    \"\"\"Type-safe observation following OpenEnv Observation base class.\"\"\"\n    info_state: List[float]      # Grid as flat array\n    legal_actions: List[int]     # [0, 1, 2] always\n    done: bool                   # Episode finished?\n    reward: float                # +1 or 0\n    # Extra fields for visualization\n    ball_position: Tuple[int, int]\n    paddle_position: int\n\n\n# ============================================================================\n# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n# ============================================================================\n\nclass CatchEnvironment:\n    \"\"\"\n    Catch game following OpenEnv's Environment pattern.\n    \n    In production:\n      • Runs in Docker container\n      • Accessed via HTTPEnvClient\n      • Exposed via FastAPI server\n    \n    For this demo:\n      • We run it locally to see internals\n      • But the structure is identical!\n    \"\"\"\n    \n    def __init__(self, grid_size=5):\n        self.grid_size = grid_size\n    \n    def reset(self) -> CatchObservation:\n        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n        self.ball_row = 0\n        self.ball_col = random.randint(0, self.grid_size - 1)\n        self.paddle_col = self.grid_size // 2\n        self.done = False\n        return self._make_observation()\n    \n    def step(self, action: int) -> CatchObservation:\n        \"\"\"Execute action (implements Environment.step()).\n        \n        Args:\n            action: 0=LEFT, 1=STAY, 2=RIGHT\n        \"\"\"\n        # Move paddle\n        if action == 0 and self.paddle_col > 0:\n            self.paddle_col -= 1\n        elif action == 2 and self.paddle_col < self.grid_size - 1:\n            self.paddle_col += 1\n        \n        # Move ball down\n        self.ball_row += 1\n        \n        # Check if episode done\n        if self.ball_row >= self.grid_size - 1:\n            self.done = True\n            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n        else:\n            reward = 0.0\n        \n        return self._make_observation(reward)\n    \n    def _make_observation(self, reward=0.0) -> CatchObservation:\n        \"\"\"Create type-safe observation.\"\"\"\n        # Flatten grid to vector (like real RL environments do)\n        info_state = [0.0] * (self.grid_size * self.grid_size)\n        ball_idx = self.ball_row * self.grid_size + self.ball_col\n        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n        info_state[ball_idx] = 1.0      # Ball = 1.0\n        info_state[paddle_idx] = 0.5    # Paddle = 0.5\n        \n        return CatchObservation(\n            info_state=info_state,\n            legal_actions=[0, 1, 2],\n            done=self.done,\n            reward=reward,\n            ball_position=(self.ball_row, self.ball_col),\n            paddle_position=self.paddle_col\n        )\n    \n    def render(self):\n        \"\"\"Visualize current state.\"\"\"\n        for row in range(self.grid_size):\n            line = \"  \"\n            for col in range(self.grid_size):\n                if row == self.ball_row and col == self.ball_col:\n                    line += \"🔴 \"\n                elif row == self.grid_size - 1 and col == self.paddle_col:\n                    line += \"🏓 \"\n                else:\n                    line += \"⬜ \"\n            print(line)\n\n\nprint(\"🎉 \" + \"=\"*64 + \" 🎉\")\nprint(\"   ✅ Environment Created Following OpenEnv Pattern!\")\nprint(\"🎉 \" + \"=\"*64 + \" 🎉\")\nprint(\"\\n📋 What we just built:\")\nprint(\"   • reset() → CatchObservation (type-safe!)\")\nprint(\"   • step(action) → CatchObservation (type-safe!)\")\nprint(\"   • render() → Visual display\")\nprint(\"\\n🚀 In production: This would run in Docker + FastAPI\")\nprint(\"   But the structure is EXACTLY the same!\")\nprint(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
   },
   {
    "cell_type": "markdown",
@@ -754,23 +407,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "# Create environment\n",
-    "env = CatchEnvironment()\n",
-    "obs = env.reset()\n",
-    "\n",
-    "print(\"=\"*60)\n",
-    "print(\"   🎮 INITIAL STATE\")\n",
-    "print(\"=\"*60 + \"\\n\")\n",
-    "env.render()\n",
-    "print(f\"\\n🔴 Ball at: column {obs.ball_position[1]}\")\n",
-    "print(f\"🏓 Paddle at: column {obs.paddle_position}\")\n",
-    "print(f\"\\n📊 Observation:\")\n",
-    "print(f\"   • Legal actions: {obs.legal_actions}\")\n",
-    "print(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\n",
-    "print(f\"   • Done: {obs.done}\")\n",
-    "print(f\"   • Reward: {obs.reward}\")"
-   ]
+   "source": "# Create environment and start a new episode\nenv = CatchEnvironment()\nobs = env.reset()\n\nprint(\"🎮 \" + \"=\"*58 + \" 🎮\")\nprint(\"   INITIAL GAME STATE\")\nprint(\"🎮 \" + \"=\"*58 + \" 🎮\\n\")\n\n# Visualize the game board\nenv.render()\n\n# Show game info\nprint(f\"\\n📍 Game Info:\")\nprint(f\"   🔴 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\nprint(f\"   🏓 Paddle at: column {obs.paddle_position}\")\n\nprint(f\"\\n📊 Observation Details:\")\nprint(f\"   • Legal actions: {obs.legal_actions} → [LEFT, STAY, RIGHT]\")\nprint(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\nprint(f\"   • Episode done: {obs.done}\")\nprint(f\"   • Current reward: {obs.reward}\")\n\nprint(\"\\n💡 The ball will fall down each step. Can your policy catch it?\")\nprint(\"=\"*62)"
   },
   {
    "cell_type": "markdown",
@@ -820,76 +457,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "# ============================================================================\n",
-    "# POLICIES - Different AI strategies\n",
-    "# ============================================================================\n",
-    "\n",
-    "class RandomPolicy:\n",
-    "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
-    "    name = \"🎲 Random Guesser\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
-    "        return random.choice(obs.legal_actions)\n",
-    "\n",
-    "\n",
-    "class AlwaysStayPolicy:\n",
-    "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
-    "    name = \"🛑 Always Stay\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
-    "        return 1  # STAY\n",
-    "\n",
-    "\n",
-    "class SmartPolicy:\n",
-    "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
-    "    name = \"🧠 Smart Heuristic\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
-    "        ball_col = obs.ball_position[1]\n",
-    "        paddle_col = obs.paddle_position\n",
-    "        \n",
-    "        if paddle_col < ball_col:\n",
-    "            return 2  # Move RIGHT\n",
-    "        elif paddle_col > ball_col:\n",
-    "            return 0  # Move LEFT\n",
-    "        else:\n",
-    "            return 1  # STAY (already aligned)\n",
-    "\n",
-    "\n",
-    "class LearningPolicy:\n",
-    "    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n",
-    "    name = \"📈 Learning Agent\"\n",
-    "    \n",
-    "    def __init__(self):\n",
-    "        self.steps = 0\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
-    "        self.steps += 1\n",
-    "        \n",
-    "        # Decay exploration rate over time\n",
-    "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
-    "        \n",
-    "        if random.random() < epsilon:\n",
-    "            # Explore: random action\n",
-    "            return random.choice(obs.legal_actions)\n",
-    "        else:\n",
-    "            # Exploit: use smart strategy\n",
-    "            ball_col = obs.ball_position[1]\n",
-    "            paddle_col = obs.paddle_position\n",
-    "            if paddle_col < ball_col:\n",
-    "                return 2\n",
-    "            elif paddle_col > ball_col:\n",
-    "                return 0\n",
-    "            else:\n",
-    "                return 1\n",
-    "\n",
-    "\n",
-    "print(\"✅ 4 Policies created!\\n\")\n",
-    "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
-    "for i, policy in enumerate(policies, 1):\n",
-    "    print(f\"   {i}. {policy.name}\")"
-   ]
+   "source": "# ============================================================================\n# POLICIES - Different AI strategies\n# ============================================================================\n\nclass RandomPolicy:\n    \"\"\"Baseline: Pure random guessing.\"\"\"\n    name = \"🎲 Random Guesser\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        return random.choice(obs.legal_actions)\n\n\nclass AlwaysStayPolicy:\n    \"\"\"Bad strategy: Never moves.\"\"\"\n    name = \"🛑 Always Stay\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        return 1  # STAY\n\n\nclass SmartPolicy:\n    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n    name = \"🧠 Smart Heuristic\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        ball_col = obs.ball_position[1]\n        paddle_col = obs.paddle_position\n        \n        if paddle_col < ball_col:\n            return 2  # Move RIGHT\n        elif paddle_col > ball_col:\n            return 0  # Move LEFT\n        else:\n            return 1  # STAY (already aligned)\n\n\nclass LearningPolicy:\n    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n    name = \"📈 Learning Agent\"\n    \n    def __init__(self):\n        self.steps = 0\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        self.steps += 1\n        \n        # Decay exploration rate over time\n        epsilon = max(0.1, 1.0 - (self.steps / 100))\n        \n        if random.random() < epsilon:\n            # Explore: random action\n            return random.choice(obs.legal_actions)\n        else:\n            # Exploit: use smart strategy\n            ball_col = obs.ball_position[1]\n            paddle_col = obs.paddle_position\n            if paddle_col < ball_col:\n                return 2\n            elif paddle_col > ball_col:\n                return 0\n            else:\n                return 1\n\n\nprint(\"🤖 \" + \"=\"*64 + \" 🤖\")\nprint(\"   ✅ 4 Policies Created!\")\nprint(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n\npolicies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\nfor i, policy in enumerate(policies, 1):\n    print(f\"   {i}. {policy.name}\")\n\nprint(\"\\n💡 Each policy represents a different approach to solving the game!\")\nprint(\"   Let's see who performs best! 🏆\\n\")"
   },
   {
    "cell_type": "markdown",
@@ -993,55 +561,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "def evaluate_policies(num_episodes=50):\n",
-    "    \"\"\"Compare all policies over many episodes.\"\"\"\n",
-    "    policies = [\n",
-    "        RandomPolicy(),\n",
-    "        AlwaysStayPolicy(),\n",
-    "        SmartPolicy(),\n",
-    "        LearningPolicy(),\n",
-    "    ]\n",
-    "    \n",
-    "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(f\"   🏆 POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
-    "    print(\"=\"*70 + \"\\n\")\n",
-    "    \n",
-    "    results = []\n",
-    "    for policy in policies:\n",
-    "        print(f\"Testing {policy.name}...\", end=\" \")\n",
-    "        env = CatchEnvironment()\n",
-    "        successes = sum(run_episode(env, policy, visualize=False) \n",
-    "                       for _ in range(num_episodes))\n",
-    "        success_rate = (successes / num_episodes) * 100\n",
-    "        results.append((policy.name, success_rate, successes))\n",
-    "        print(f\"✓\")\n",
-    "    \n",
-    "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"   📊 RESULTS\")\n",
-    "    print(\"=\"*70 + \"\\n\")\n",
-    "    \n",
-    "    # Sort by success rate\n",
-    "    results.sort(key=lambda x: x[1], reverse=True)\n",
-    "    \n",
-    "    for name, rate, successes in results:\n",
-    "        bar = \"█\" * int(rate / 2)\n",
-    "        print(f\"{name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
-    "    \n",
-    "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"\\n💡 Key Insights:\")\n",
-    "    print(\"   • Random (~20%):    Baseline - pure luck\")\n",
-    "    print(\"   • Always Stay (~20%): Bad - only works if ball in center\")\n",
-    "    print(\"   • Smart (100%):     Optimal - always catches\")\n",
-    "    print(\"   • Learning (~85%):  Improves over time with experience\")\n",
-    "    print(\"\\n🎓 This is RL in action:\")\n",
-    "    print(\"   1. Start with exploration (random)\")\n",
-    "    print(\"   2. Learn from rewards\")\n",
-    "    print(\"   3. Converge to optimal behavior\\n\")\n",
-    "\n",
-    "# Run the competition!\n",
-    "evaluate_policies(num_episodes=50)"
-   ]
+   "source": "def evaluate_policies(num_episodes=50):\n    \"\"\"Compare all policies over many episodes.\"\"\"\n    policies = [\n        RandomPolicy(),\n        AlwaysStayPolicy(),\n        SmartPolicy(),\n        LearningPolicy(),\n    ]\n    \n    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n    \n    results = []\n    for policy in policies:\n        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n        env = CatchEnvironment()\n        successes = sum(run_episode(env, policy, visualize=False) \n                       for _ in range(num_episodes))\n        success_rate = (successes / num_episodes) * 100\n        results.append((policy.name, success_rate, successes))\n        print(f\"✓ Done!\")\n    \n    print(\"\\n\" + \"=\"*70)\n    print(\"   📊 FINAL RESULTS\")\n    print(\"=\"*70 + \"\\n\")\n    \n    # Sort by success rate (descending)\n    results.sort(key=lambda x: x[1], reverse=True)\n    \n    # Award medals to top 3\n    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n    \n    for i, (name, rate, successes) in enumerate(results):\n        medal = medals[i]\n        bar = \"█\" * int(rate / 2)\n        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n    \n    print(\"\\n\" + \"=\"*70)\n    print(\"\\n✨ Key Insights:\")\n    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n    print(\"   • Always Stay (~20%): Bad strategy - stays center 🛑\")\n    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n    print(\"   • Learning (~85%):    Improves over time 📈\")\n    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n    print(\"   1. Start with exploration (trying random things)\")\n    print(\"   2. Learn from rewards (what works, what doesn't)\")\n    print(\"   3. Converge to optimal behavior (smart strategy)\")\n    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n\n# Run the epic competition!\nprint(\"🎮 Starting the showdown...\")\nevaluate_policies(num_episodes=50)"
   },
   {
    "cell_type": "markdown",
@@ -1457,60 +977,12 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "## 📚 Resources\n",
-    "\n",
-    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "**🔗 Links**\n",
-    "\n",
-    "- **OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **Docker Guide**: https://docs.docker.com/get-started/\n",
-    "\n",
-    "**📖 Documentation**\n",
-    "\n",
-    "- Environment creation guide: `src/envs/README.md`\n",
-    "- OpenSpiel integration: `src/envs/openspiel_env/README.md`\n",
-    "- Example scripts: `examples/`\n",
-    "\n",
-    "**🎓 Community**\n",
-    "\n",
-    "- Supported by: Meta PyTorch, Hugging Face, Unsloth AI, and more\n",
-    "- License: BSD 3-Clause\n",
-    "- Contributions welcome!\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "## 📚 Resources\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n### 🔗 Essential Links\n\n- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n- **🔥 PyTorch**: https://pytorch.org/\n\n### 📖 Documentation Deep Dives\n\n- **Environment Creation Guide**: `src/envs/README.md`\n- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n- **Example Scripts**: `examples/`\n- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n\n### 🎓 Community & Support\n\n**Supported by amazing organizations:**\n- 🔥 Meta PyTorch\n- 🤗 Hugging Face\n- ⚡ Unsloth AI\n- 🌟 Reflection AI\n- 🚀 And many more!\n\n**License**: BSD 3-Clause (very permissive!)\n\n**Contributions**: Always welcome! Check out the issues tab.\n\n</div>\n\n---\n\n### 🌈 What's Next?\n\n1. ⭐ **Star the repo** to show support and stay updated\n2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n3. 🎮 **Explore** other OpenSpiel games\n4. 🛠️ **Build** your own environment integration\n5. 💬 **Share** what you build with the community!"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
-    "\n",
-    "# 🎉 Congratulations!\n",
-    "\n",
-    "### You're now an OpenEnv expert!\n",
-    "\n",
-    "You understand:\n",
-    "- ✅ How RL works\n",
-    "- ✅ Why OpenEnv matters\n",
-    "- ✅ How to use existing environments\n",
-    "- ✅ How to create new integrations\n",
-    "- ✅ How to deploy to production\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### Now go build something amazing! 🚀\n",
-    "\n",
-    "**Welcome to the future of RL.**\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n\n# 🎉 Congratulations! You Did It! 🎉\n\n### You're now an OpenEnv expert!\n\n<br>\n\n## ✅ What You've Mastered:\n\n**🧠 Concepts**\n- How RL works (the observe-act-reward loop)\n- Why OpenEnv matters (production-ready RL)\n- How to use existing environments\n\n**🛠️ Practical Skills**\n- Creating new integrations\n- Building type-safe environments\n- Deploying to production\n\n**🎯 Real Experience**\n- Built a complete RL environment\n- Tested multiple policies\n- Watched learning happen in real-time!\n\n---\n\n### Now go build something amazing! 🚀\n\n**Welcome to the future of RL with PyTorch & OpenEnv**\n\n<br>\n\n[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n\n</div>\n\n---\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🌟 Want to Learn More?\n\n- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n- 🎮 Try the other example games\n- 💬 Join the community discussions\n- 🛠️ Build your own integration\n- 🚀 Deploy to production\n- ⭐ Star the repo to stay updated!\n\n**Happy coding! 🎊**\n\n</div>"
   }
  ],
  "metadata": {
@@ -1534,4 +1006,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file

From c756d8a3c82acdaea0f9a5cec86bd5a583da52ed Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:48:25 -0700
Subject: [PATCH 06/19] Add TOC

---
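
The Part 1 cell added in this patch describes the RL loop only as a short
untyped snippet. A minimal runnable sketch of that loop is given here for
reference. It assumes a hypothetical `ToyEnv`, `RandomPolicy`, and `run_loop`,
none of which are part of OpenEnv or of the notebook; they only illustrate the
reset/step/learn cycle.

```python
import random


class ToyEnv:
    """Hypothetical 1-D environment (not an OpenEnv class): walk to a target cell."""

    def __init__(self, target=3, max_steps=10):
        self.target = target
        self.max_steps = max_steps

    def reset(self):
        # Start every episode at position 0 with a fresh step counter.
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # action: -1 = left, 0 = stay, +1 = right
        self.position += action
        self.steps += 1
        reached = self.position == self.target
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self.position, reward, done


class RandomPolicy:
    """Hypothetical baseline policy: ignores observations and rewards."""

    def choose(self, observation):
        return random.choice([-1, 0, 1])

    def learn(self, reward):
        # A real agent would update its parameters here.
        pass


def run_loop(env, policy):
    """Run one episode of the observe-act-reward loop and return the total reward."""
    observation = env.reset()
    done = False
    total = 0.0
    while not done:
        action = policy.choose(observation)
        observation, reward, done = env.step(action)
        policy.learn(reward)
        total += reward
    return total
```

Calling `run_loop(ToyEnv(), RandomPolicy())` returns 1.0 if the random walk
reaches the target within the step budget and 0.0 otherwise. The explicit
`done` flag returned by `step` is what ends the `while` loop.
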
 examples/OpenEnv_Tutorial.ipynb | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index e534fcc..a8635bc 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -10,6 +10,11 @@
    "metadata": {},
    "source": "## 📋 What You'll Learn\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**🎯 Part 1-2: The Fundamentals**\n- ⚡ RL in 60 seconds\n- 🤔 Why existing solutions fall short\n- 💡 The OpenEnv solution\n\n</td>\n<td width=\"50%\">\n\n**🏗️ Part 3-5: The Architecture**\n- 🔧 How OpenEnv works\n- 🔍 Exploring real code\n- 🎮 OpenSpiel integration example\n\n</td>\n</tr>\n<tr>\n<td width=\"50%\">\n\n**🎮 Part 6-8: Hands-On Demo**\n- 🔨 Build a game environment\n- 🤖 Test 4 different policies\n- 👀 Watch learning happen live\n\n</td>\n<td width=\"50%\">\n\n**🔧 Part 9-10: Going Further**\n- 🚀 Use real OpenSpiel\n- ✨ Create your own integration\n- 🌐 Deploy to production\n\n</td>\n</tr>\n</table>\n\n> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n> \n> ⏱️ **Time**: ~30 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
   },
+  {
+   "cell_type": "markdown",
+   "source": "---\n\n<a id=\"part-1\"></a>\n# Part 1: RL in 60 Seconds ⏱️\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n\n**Reinforcement Learning is simpler than you think.**\n\nIt's just a loop:\n\n```\nwhile not done:\n    observation = environment.observe()\n    action = policy.choose(observation)\n    reward = environment.step(action)\n    policy.learn(reward)\n```\n\nThat's it. That's RL.\n\n</div>\n\nLet's see it in action:",
+   "metadata": {}
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -49,20 +54,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "<div style=\"background-color: #fff3cd; padding: 15px; border-left: 5px solid #ffc107; margin: 20px 0;\">\n",
-    "\n",
-    "**🤔 The Problem**: Our random guesser never improves because it doesn't use the rewards!\n",
-    "\n",
-    "Real RL agents:\n",
-    "- 📊 Track which actions lead to rewards\n",
-    "- 🎯 Choose better actions over time\n",
-    "- 🔄 Balance exploration (trying new things) vs exploitation (using what works)\n",
-    "\n",
-    "We'll build this later!\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"part-2\"></a>\n# Part 2: The Problem with Traditional RL 😤\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🤔 Why Can't We Just Use OpenAI Gym?\n\nGood question! Gym is great for research, but production needs more...\n\n</div>\n\n<table>\n<tr>\n<th>Challenge</th>\n<th>Traditional Approach</th>\n<th>OpenEnv Solution</th>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>❌ <code>obs[0][3]</code> - what is this?</td>\n<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n</tr>\n<tr>\n<td><b>Isolation</b></td>\n<td>❌ Same process (can crash your training)</td>\n<td>✅ Docker containers (fully isolated)</td>\n</tr>\n<tr>\n<td><b>Deployment</b></td>\n<td>❌ \"Works on my machine\" 🤷</td>\n<td>✅ Same container everywhere 🐳</td>\n</tr>\n<tr>\n<td><b>Scaling</b></td>\n<td>❌ Hard to distribute</td>\n<td>✅ Deploy to Kubernetes ☸️</td>\n</tr>\n<tr>\n<td><b>Language</b></td>\n<td>❌ Python only</td>\n<td>✅ Any language (HTTP API) 🌐</td>\n</tr>\n<tr>\n<td><b>Debugging</b></td>\n<td>❌ Cryptic numpy errors</td>\n<td>✅ Clear type errors 🐛</td>\n</tr>\n</table>\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n## 💡 The OpenEnv Philosophy\n\n**\"RL environments should be like microservices\"**\n\nThink of it like this: You don't run your database in the same process as your web server, right? Same principle!\n\n- 🔒 **Isolated**: Run in containers (security + stability)\n- 🌐 **Standard**: HTTP API, works everywhere\n- 📦 **Versioned**: Docker images (reproducibility!)\n- 🚀 **Scalable**: Deploy to cloud with one command\n- 🛡️ **Type-safe**: Catch bugs before they happen\n- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n\n</div>"
   },
   {
    "cell_type": "markdown",

From af5de22232159fd93cad5a29c206599f4d06445f Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:49:04 -0700
Subject: [PATCH 07/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 92 ++-------------------------------
 1 file changed, 4 insertions(+), 88 deletions(-)
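
Reviewer note on the Part 4 cell added below: it describes `models.py` as the type-safe contract between client and server. A minimal standard-library sketch of that idea follows, assuming hypothetical `CatchAction` and `CatchObservation` dataclasses and hypothetical `to_wire` / `observation_from_wire` helpers. The real OpenEnv base classes and serialization may well differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CatchAction:
    """Hypothetical typed action; the field name mirrors the notebook's action_id."""

    action_id: int


@dataclass
class CatchObservation:
    """Hypothetical typed observation with named fields instead of obs[0][3]."""

    info_state: List[float]
    legal_actions: List[int]
    reward: float = 0.0
    done: bool = False


def to_wire(obj) -> str:
    """Serialize a dataclass instance to the JSON text sent over HTTP."""
    return json.dumps(asdict(obj))


def observation_from_wire(text: str) -> CatchObservation:
    """Rebuild the typed observation from JSON; unknown keys raise TypeError."""
    return CatchObservation(**json.loads(text))
```

Round-tripping an observation through `to_wire` and `observation_from_wire` yields an equal dataclass instance, which is the property a typed client relies on.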

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index a8635bc..4d6378f 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -64,47 +64,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "### The Architecture\n",
-    "\n",
-    "```\n",
-    "┌────────────────────────────────────────────────────────────┐\n",
-    "│  YOUR TRAINING CODE                                        │\n",
-    "│                                                            │\n",
-    "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
-    "│  result = env.reset()           ← Type-safe!             │\n",
-    "│  result = env.step(action)      ← Type-safe!             │\n",
-    "│                                                            │\n",
-    "└─────────────────┬──────────────────────────────────────────┘\n",
-    "                  │\n",
-    "                  │  HTTP/JSON (Language-Agnostic)\n",
-    "                  │  POST /reset, POST /step, GET /state\n",
-    "                  │\n",
-    "┌─────────────────▼──────────────────────────────────────────┐\n",
-    "│  DOCKER CONTAINER                                          │\n",
-    "│                                                            │\n",
-    "│  ┌──────────────────────────────────────────────┐         │\n",
-    "│  │  FastAPI Server                              │         │\n",
-    "│  │  └─ Environment (reset, step, state)         │         │\n",
-    "│  │     └─ Your Game/Simulation Logic            │         │\n",
-    "│  └──────────────────────────────────────────────┘         │\n",
-    "│                                                            │\n",
-    "│  Isolated • Reproducible • Secure                          │\n",
-    "└────────────────────────────────────────────────────────────┘\n",
-    "```\n",
-    "\n",
-    "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
-    "\n",
-    "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
-    "\n",
-    "```python\n",
-    "env.reset()    # Under the hood: HTTP POST to /reset\n",
-    "env.step(...)  # Under the hood: HTTP POST to /step\n",
-    "env.state()    # Under the hood: HTTP GET to /state\n",
-    "```\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"part-3\"></a>\n# Part 3: Setup 🛠️\n\n<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n\n**Running locally?** Make sure you're in the OpenEnv directory.\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -128,7 +88,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Detect environment\ntry:\n    import google.colab\n    IN_COLAB = True\n    print(\"🌐 Running in Google Colab - Perfect!\")\nexcept ImportError:\n    IN_COLAB = False\n    print(\"💻 Running locally - Nice!\")\n\nif IN_COLAB:\n    print(\"\\n📦 Cloning OpenEnv repository...\")\n    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n    %cd OpenEnv\n    \n    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n    !pip install -q fastapi uvicorn requests\n    \n    import sys\n    sys.path.insert(0, './src')\n    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\nelse:\n    import sys\n    from pathlib import Path\n    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n    print(\"✅ Using local OpenEnv installation\")\n\nprint(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\nprint(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
+   "source": "---\n\n<a id=\"part-4\"></a>\n# Part 4: The OpenEnv Pattern 🏗️\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Every OpenEnv Environment Has 3 Components:\n\n```\nsrc/envs/your_env/\n├── 📝 models.py          ← Type-safe contracts\n│                           (Action, Observation, State)\n│\n├── 📱 client.py          ← What YOU import\n│                           (HTTPEnvClient implementation)\n│\n└── 🖥️  server/\n    ├── environment.py    ← Game/simulation logic\n    ├── app.py            ← FastAPI server\n    └── Dockerfile        ← Container definition\n```\n\n</div>\n\nLet's explore the actual OpenEnv code to see how this works:"
   },
   {
    "cell_type": "markdown",
@@ -166,7 +126,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Import OpenEnv's core abstractions\nfrom core.env_server import Environment, Action, Observation, State\nfrom core.http_env_client import HTTPEnvClient\n\nprint(\"=\"*70)\nprint(\"   🧩 OPENENV CORE ABSTRACTIONS\")\nprint(\"=\"*70)\n\nprint(\"\"\"\n🖥️  SERVER SIDE (runs in Docker):\n\n    class Environment(ABC):\n        '''Base class for all environment implementations'''\n        \n        @abstractmethod\n        def reset(self) -> Observation:\n            '''Start new episode'''\n        \n        @abstractmethod\n        def step(self, action: Action) -> Observation:\n            '''Execute action, return observation'''\n        \n        @property\n        def state(self) -> State:\n            '''Get episode metadata'''\n\n📱 CLIENT SIDE (your training code):\n\n    class HTTPEnvClient(ABC):\n        '''Base class for HTTP clients'''\n        \n        def reset(self) -> StepResult:\n            # HTTP POST /reset\n        \n        def step(self, action) -> StepResult:\n            # HTTP POST /step\n        \n        def state(self) -> State:\n            # HTTP GET /state\n\"\"\")\n\nprint(\"=\"*70)\nprint(\"\\n✨ Same interface on both sides - communication via HTTP!\")\nprint(\"🎯 You focus on RL, OpenEnv handles the infrastructure.\\n\")"
+   "source": "---\n\n<a id=\"part-5\"></a>\n# Part 5: Example Integration - OpenSpiel 🎮\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What is OpenSpiel?\n\n**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n\n## OpenEnv's Integration\n\nWe've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**🎯 Single-Player**\n1. **Catch** - Catch falling ball\n2. **Cliff Walking** - Navigate grid\n3. **2048** - Tile puzzle\n4. **Blackjack** - Card game\n\n</td>\n<td width=\"50%\">\n\n**👥 Multi-Player**\n5. **Tic-Tac-Toe** - Classic 3×3\n6. **Kuhn Poker** - Imperfect info poker\n\n</td>\n</tr>\n</table>\n\nThis shows how OpenEnv can wrap **any** existing RL library!\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -277,51 +237,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": [
-    "from envs.openspiel_env.client import OpenSpielEnv\n",
-    "\n",
-    "print(\"=\"*70)\n",
-    "print(\"   🔌 HOW OPENENV WRAPS OPENSPIEL\")\n",
-    "print(\"=\"*70)\n",
-    "\n",
-    "print(\"\"\"\n",
-    "class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n",
-    "    \n",
-    "    def _step_payload(self, action: OpenSpielAction) -> dict:\n",
-    "        '''Convert typed action to JSON for HTTP'''\n",
-    "        return {\n",
-    "            \"action_id\": action.action_id,\n",
-    "            \"game_name\": action.game_name,\n",
-    "        }\n",
-    "    \n",
-    "    def _parse_result(self, payload: dict) -> StepResult:\n",
-    "        '''Parse HTTP JSON response into typed observation'''\n",
-    "        return StepResult(\n",
-    "            observation=OpenSpielObservation(...),\n",
-    "            reward=payload['reward'],\n",
-    "            done=payload['done']\n",
-    "        )\n",
-    "\n",
-    "\"\"\")\n",
-    "\n",
-    "print(\"─\" * 70)\n",
-    "print(\"\\n✨ Usage (works for ALL OpenEnv environments):\")\n",
-    "print(\"\"\"\n",
-    "  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
-    "  \n",
-    "  result = env.reset()\n",
-    "  # Returns StepResult[OpenSpielObservation] - Type safe!\n",
-    "  \n",
-    "  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
-    "  # Type checker knows this is valid!\n",
-    "  \n",
-    "  state = env.state()\n",
-    "  # Returns OpenSpielState\n",
-    "\"\"\")\n",
-    "\n",
-    "print(\"─\" * 70)\n",
-    "print(\"\\n🎯 This pattern works for ANY environment you want to wrap!\\n\")"
-   ]
+   "source": "---\n\n<a id=\"part-6\"></a>\n<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n\n# 🎮 Part 6: Interactive Demo\n\n### Now let's BUILD something!\n\nWe'll create a **Catch game** following OpenEnv patterns,<br>\nthen watch **4 different AI policies** compete for the championship! 🏆\n\n<br>\n\n**Get ready for:**\n- ⚡ Live gameplay visualization\n- 🤖 AI policy showdown\n- 📊 Real-time learning metrics\n- 🎯 Production-ready patterns\n\n</div>"
   },
   {
    "cell_type": "markdown",

From 520983c2e899e4c5e9c20bfde0b8eef1117e6a1e Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:56:09 -0700
Subject: [PATCH 08/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 288 ++------------------------------
 1 file changed, 11 insertions(+), 277 deletions(-)
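
Reviewer note on the Part 7 table added below: the expected success rates can be sanity-checked with a small simulation. This is a sketch under stated assumptions, not the notebook's `CatchEnvironment`: a 5x5 board, a ball that falls one row per step from a random column, a paddle that starts in the center column (assumed), and hypothetical policy functions taking `(ball_col, paddle_col, rng)`.

```python
import random

GRID = 5  # 5x5 board, as in the notebook's Catch demo
LEFT, STAY, RIGHT = 0, 1, 2


def play_episode(policy, rng):
    """Return 1.0 if the paddle is under the ball when it lands, else 0.0."""
    ball_col = rng.randrange(GRID)
    paddle_col = GRID // 2  # assumed starting column
    for _ in range(GRID - 1):  # the ball falls one row per step
        action = policy(ball_col, paddle_col, rng)
        # Actions 0/1/2 map to moves of -1/0/+1, clamped to the board.
        paddle_col = min(GRID - 1, max(0, paddle_col + action - 1))
    return 1.0 if paddle_col == ball_col else 0.0


def random_policy(ball_col, paddle_col, rng):
    return rng.choice([LEFT, STAY, RIGHT])


def always_stay_policy(ball_col, paddle_col, rng):
    return STAY


def smart_policy(ball_col, paddle_col, rng):
    # Move toward the ball's column, then hold position.
    if ball_col < paddle_col:
        return LEFT
    if ball_col > paddle_col:
        return RIGHT
    return STAY


def success_rate(policy, episodes=50, seed=0):
    """Fraction of episodes in which the policy catches the ball."""
    rng = random.Random(seed)
    return sum(play_episode(policy, rng) for _ in range(episodes)) / episodes
```

Under these assumptions the smart policy always catches the ball, because the paddle is never more than two columns away and has four steps to get there. Always-stay succeeds only when the ball spawns in the center column, about one episode in five.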

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 4d6378f..3f52343 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -56,6 +56,11 @@
    "metadata": {},
    "source": "---\n\n<a id=\"part-2\"></a>\n# Part 2: The Problem with Traditional RL 😤\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🤔 Why Can't We Just Use OpenAI Gym?\n\nGood question! Gym is great for research, but production needs more...\n\n</div>\n\n<table>\n<tr>\n<th>Challenge</th>\n<th>Traditional Approach</th>\n<th>OpenEnv Solution</th>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>❌ <code>obs[0][3]</code> - what is this?</td>\n<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n</tr>\n<tr>\n<td><b>Isolation</b></td>\n<td>❌ Same process (can crash your training)</td>\n<td>✅ Docker containers (fully isolated)</td>\n</tr>\n<tr>\n<td><b>Deployment</b></td>\n<td>❌ \"Works on my machine\" 🤷</td>\n<td>✅ Same container everywhere 🐳</td>\n</tr>\n<tr>\n<td><b>Scaling</b></td>\n<td>❌ Hard to distribute</td>\n<td>✅ Deploy to Kubernetes ☸️</td>\n</tr>\n<tr>\n<td><b>Language</b></td>\n<td>❌ Python only</td>\n<td>✅ Any language (HTTP API) 🌐</td>\n</tr>\n<tr>\n<td><b>Debugging</b></td>\n<td>❌ Cryptic numpy errors</td>\n<td>✅ Clear type errors 🐛</td>\n</tr>\n</table>\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n## 💡 The OpenEnv Philosophy\n\n**\"RL environments should be like microservices\"**\n\nThink of it like this: You don't run your database in the same process as your web server, right? Same principle!\n\n- 🔒 **Isolated**: Run in containers (security + stability)\n- 🌐 **Standard**: HTTP API, works everywhere\n- 📦 **Versioned**: Docker images (reproducibility!)\n- 🚀 **Scalable**: Deploy to cloud with one command\n- 🛡️ **Type-safe**: Catch bugs before they happen\n- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n\n</div>"
   },
+  {
+   "cell_type": "markdown",
+   "source": "### The Architecture\n\n```\n┌────────────────────────────────────────────────────────────┐\n│  YOUR TRAINING CODE                                        │\n│                                                            │\n│  env = OpenSpielEnv(...)        ← Import the client      │\n│  result = env.reset()           ← Type-safe!             │\n│  result = env.step(action)      ← Type-safe!             │\n│                                                            │\n└─────────────────┬──────────────────────────────────────────┘\n                  │\n                  │  HTTP/JSON (Language-Agnostic)\n                  │  POST /reset, POST /step, GET /state\n                  │\n┌─────────────────▼──────────────────────────────────────────┐\n│  DOCKER CONTAINER                                          │\n│                                                            │\n│  ┌──────────────────────────────────────────────┐         │\n│  │  FastAPI Server                              │         │\n│  │  └─ Environment (reset, step, state)         │         │\n│  │     └─ Your Game/Simulation Logic            │         │\n│  └──────────────────────────────────────────────┘         │\n│                                                            │\n│  Isolated • Reproducible • Secure                          │\n└────────────────────────────────────────────────────────────┘\n```\n\n<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n\n**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n\n```python\nenv.reset()    # Under the hood: HTTP POST to /reset\nenv.step(...)  # Under the hood: HTTP POST to /step\nenv.state()    # Under the hood: HTTP GET to /state\n```\n\nThe magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n\n</div>",
+   "metadata": {}
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -315,7 +320,7 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# Create environment and start a new episode\nenv = CatchEnvironment()\nobs = env.reset()\n\nprint(\"🎮 \" + \"=\"*58 + \" 🎮\")\nprint(\"   INITIAL GAME STATE\")\nprint(\"🎮 \" + \"=\"*58 + \" 🎮\\n\")\n\n# Visualize the game board\nenv.render()\n\n# Show game info\nprint(f\"\\n📍 Game Info:\")\nprint(f\"   🔴 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\nprint(f\"   🏓 Paddle at: column {obs.paddle_position}\")\n\nprint(f\"\\n📊 Observation Details:\")\nprint(f\"   • Legal actions: {obs.legal_actions} → [LEFT, STAY, RIGHT]\")\nprint(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\nprint(f\"   • Episode done: {obs.done}\")\nprint(f\"   • Current reward: {obs.reward}\")\n\nprint(\"\\n💡 The ball will fall down each step. Can your policy catch it?\")\nprint(\"=\"*62)"
+   "source": "---\n\n<a id=\"part-7\"></a>\n# Part 7: Four Policies 🤖\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Let's test 4 different AI strategies:\n\n<table>\n<tr>\n<th width=\"25%\">Policy</th>\n<th width=\"50%\">Strategy</th>\n<th width=\"25%\">Expected Performance</th>\n</tr>\n<tr>\n<td><b>🎲 Random</b></td>\n<td>Pick random action every step</td>\n<td>~20% (pure luck)</td>\n</tr>\n<tr>\n<td><b>🛑 Always Stay</b></td>\n<td>Never move, hope ball lands in center</td>\n<td>~20% (terrible!)</td>\n</tr>\n<tr>\n<td><b>🧠 Smart</b></td>\n<td>Move paddle toward ball</td>\n<td>100% (optimal!)</td>\n</tr>\n<tr>\n<td><b>📈 Learning</b></td>\n<td>Start random, learn smart strategy</td>\n<td>~85% (improves over time)</td>\n</tr>\n</table>\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -436,18 +441,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "<div style=\"background-color: #fff3cd; padding: 15px; border-left: 5px solid #ffc107; margin: 20px 0;\">\n",
-    "\n",
-    "**💡 Try changing the policy!**\n",
-    "\n",
-    "Replace `SmartPolicy()` with:\n",
-    "- `RandomPolicy()` - Watch it fail!\n",
-    "- `AlwaysStayPolicy()` - Usually fails\n",
-    "- `LearningPolicy()` - Gets better over time\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"part-8\"></a>\n# Part 8: Policy Competition! 🏆\n\n<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\nLet's run **50 episodes** for each policy and see who wins!\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -474,238 +468,17 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0; text-align: center;\">\n",
-    "\n",
-    "# 🎉 Congratulations!\n",
-    "\n",
-    "### You just built and tested a complete RL environment!\n",
-    "\n",
-    "But we did it **the OpenEnv way**: type-safe, structured, production-ready.\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"part-9\"></a>\n# Part 9: Using Real OpenSpiel 🎮\n\n<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What We Just Built vs Production OpenSpiel\n\n<table>\n<tr>\n<th>Component</th>\n<th>Our Demo</th>\n<th>OpenEnv + OpenSpiel</th>\n</tr>\n<tr>\n<td><b>Environment</b></td>\n<td>Local Python class</td>\n<td>Docker container</td>\n</tr>\n<tr>\n<td><b>Communication</b></td>\n<td>Direct function calls</td>\n<td>HTTP/JSON</td>\n</tr>\n<tr>\n<td><b>Client</b></td>\n<td>Direct access</td>\n<td>HTTPEnvClient</td>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>✅ Dataclasses</td>\n<td>✅ Dataclasses</td>\n</tr>\n<tr>\n<td><b>API</b></td>\n<td>reset(), step()</td>\n<td>reset(), step() <em>(same!)</em></td>\n</tr>\n</table>\n\n**🎯 Same structure, production features!**\n\n</div>\n\n### Using OpenSpiel Integration:\n\n```python\n# 1. Install OpenSpiel\n!pip install open_spiel\n\n# 2. Import OpenEnv's integration\nfrom envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n\n# 3. Connect to server (HTTP!)\nenv = OpenSpielEnv(base_url=\"http://localhost:8000\")\n\n# 4. Same API you just learned!\nresult = env.reset()\nresult = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\nstate = env.state()\n\n# 5. Switch games by changing game_name:\nresult = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n```\n\n<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**🎮 6 Games Available:**\n\n1. `\"catch\"` - What we just built!\n2. `\"tic_tac_toe\"` - Classic 3×3\n3. `\"kuhn_poker\"` - Imperfect information poker\n4. `\"cliff_walking\"` - Grid navigation\n5. `\"2048\"` - Tile puzzle\n6. `\"blackjack\"` - Card game\n\n**All use the exact same interface!**\n\n</div>"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "# Part 9: Using Real OpenSpiel 🎮\n",
-    "\n",
-    "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## What We Just Built vs Production OpenSpiel\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<th>Component</th>\n",
-    "<th>Our Demo</th>\n",
-    "<th>OpenEnv + OpenSpiel</th>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Environment</b></td>\n",
-    "<td>Local Python class</td>\n",
-    "<td>Docker container</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Communication</b></td>\n",
-    "<td>Direct function calls</td>\n",
-    "<td>HTTP/JSON</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Client</b></td>\n",
-    "<td>Direct access</td>\n",
-    "<td>HTTPEnvClient</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Type Safety</b></td>\n",
-    "<td>✅ Dataclasses</td>\n",
-    "<td>✅ Dataclasses</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>API</b></td>\n",
-    "<td>reset(), step()</td>\n",
-    "<td>reset(), step() <em>(same!)</em></td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "**🎯 Same structure, production features!**\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "### Using OpenSpiel Integration:\n",
-    "\n",
-    "```python\n",
-    "# 1. Install OpenSpiel\n",
-    "!pip install open_spiel\n",
-    "\n",
-    "# 2. Import OpenEnv's integration\n",
-    "from envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n",
-    "\n",
-    "# 3. Connect to server (HTTP!)\n",
-    "env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
-    "\n",
-    "# 4. Same API you just learned!\n",
-    "result = env.reset()\n",
-    "result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
-    "state = env.state()\n",
-    "\n",
-    "# 5. Switch games by changing game_name:\n",
-    "result = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n",
-    "```\n",
-    "\n",
-    "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
-    "\n",
-    "**🎮 6 Games Available:**\n",
-    "\n",
-    "1. `\"catch\"` - What we just built!\n",
-    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
-    "3. `\"kuhn_poker\"` - Imperfect information poker\n",
-    "4. `\"cliff_walking\"` - Grid navigation\n",
-    "5. `\"2048\"` - Tile puzzle\n",
-    "6. `\"blackjack\"` - Card game\n",
-    "\n",
-    "**All use the exact same interface!**\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"part-10\"></a>\n# Part 10: Create Your Own Integration 🛠️\n\n<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## The 5-Step Pattern\n\nWant to wrap your own environment in OpenEnv? Here's how:\n\n</div>\n\n### Step 1: Define Types (`models.py`)\n\n```python\nfrom dataclasses import dataclass\nfrom core.env_server import Action, Observation, State\n\n@dataclass\nclass YourAction(Action):\n    action_value: int\n    # Add your action fields\n\n@dataclass\nclass YourObservation(Observation):\n    state_data: List[float]\n    done: bool\n    reward: float\n    # Add your observation fields\n\n@dataclass\nclass YourState(State):\n    episode_id: str\n    step_count: int\n    # Add your state fields\n```\n\n### Step 2: Implement Environment (`server/environment.py`)\n\n```python\nfrom core.env_server import Environment\n\nclass YourEnvironment(Environment):\n    def reset(self) -> Observation:\n        # Initialize your game/simulation\n        return YourObservation(...)\n    \n    def step(self, action: Action) -> Observation:\n        # Execute action, update state\n        return YourObservation(...)\n    \n    @property\n    def state(self) -> State:\n        return self._state\n```\n\n### Step 3: Create Client (`client.py`)\n\n```python\nfrom core.http_env_client import HTTPEnvClient\nfrom core.types import StepResult\n\nclass YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n    def _step_payload(self, action: YourAction) -> dict:\n        \"\"\"Convert action to JSON\"\"\"\n        return {\"action_value\": action.action_value}\n    \n    def _parse_result(self, payload: dict) -> StepResult:\n        \"\"\"Parse JSON to observation\"\"\"\n        return StepResult(\n            observation=YourObservation(...),\n            reward=payload['reward'],\n            done=payload['done']\n        )\n    \n    def _parse_state(self, payload: dict) -> YourState:\n        return YourState(...)\n```\n\n### Step 4: Create Server (`server/app.py`)\n\n```python\nfrom core.env_server import create_fastapi_app\nfrom .your_environment import YourEnvironment\n\nenv = YourEnvironment()\napp = create_fastapi_app(env)\n\n# That's it! OpenEnv creates all endpoints for you.\n```\n\n### Step 5: Dockerize (`server/Dockerfile`)\n\n```dockerfile\nFROM python:3.11-slim\n\nWORKDIR /app\nCOPY requirements.txt .\nRUN pip install --no-cache-dir -r requirements.txt\n\nCOPY . .\nCMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n```\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n### 🎓 Examples to Study\n\nOpenEnv includes 3 complete examples:\n\n1. **`src/envs/echo_env/`**\n   - Simplest possible environment\n   - Great for testing and learning\n\n2. **`src/envs/openspiel_env/`**\n   - Wraps external library (OpenSpiel)\n   - Shows integration pattern\n   - 6 games in one integration\n\n3. **`src/envs/coding_env/`**\n   - Python code execution environment\n   - Shows complex use case\n   - Security considerations\n\n**💡 Study these to understand the patterns!**\n\n</div>"
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "# Part 10: Create Your Own Integration 🛠️\n",
-    "\n",
-    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## The 5-Step Pattern\n",
-    "\n",
-    "Want to wrap your own environment in OpenEnv? Here's how:\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "### Step 1: Define Types (`models.py`)\n",
-    "\n",
-    "```python\n",
-    "from dataclasses import dataclass\n",
-    "from core.env_server import Action, Observation, State\n",
-    "\n",
-    "@dataclass\n",
-    "class YourAction(Action):\n",
-    "    action_value: int\n",
-    "    # Add your action fields\n",
-    "\n",
-    "@dataclass\n",
-    "class YourObservation(Observation):\n",
-    "    state_data: List[float]\n",
-    "    done: bool\n",
-    "    reward: float\n",
-    "    # Add your observation fields\n",
-    "\n",
-    "@dataclass\n",
-    "class YourState(State):\n",
-    "    episode_id: str\n",
-    "    step_count: int\n",
-    "    # Add your state fields\n",
-    "```\n",
-    "\n",
-    "### Step 2: Implement Environment (`server/environment.py`)\n",
-    "\n",
-    "```python\n",
-    "from core.env_server import Environment\n",
-    "\n",
-    "class YourEnvironment(Environment):\n",
-    "    def reset(self) -> Observation:\n",
-    "        # Initialize your game/simulation\n",
-    "        return YourObservation(...)\n",
-    "    \n",
-    "    def step(self, action: Action) -> Observation:\n",
-    "        # Execute action, update state\n",
-    "        return YourObservation(...)\n",
-    "    \n",
-    "    @property\n",
-    "    def state(self) -> State:\n",
-    "        return self._state\n",
-    "```\n",
-    "\n",
-    "### Step 3: Create Client (`client.py`)\n",
-    "\n",
-    "```python\n",
-    "from core.http_env_client import HTTPEnvClient\n",
-    "from core.types import StepResult\n",
-    "\n",
-    "class YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n",
-    "    def _step_payload(self, action: YourAction) -> dict:\n",
-    "        \"\"\"Convert action to JSON\"\"\"\n",
-    "        return {\"action_value\": action.action_value}\n",
-    "    \n",
-    "    def _parse_result(self, payload: dict) -> StepResult:\n",
-    "        \"\"\"Parse JSON to observation\"\"\"\n",
-    "        return StepResult(\n",
-    "            observation=YourObservation(...),\n",
-    "            reward=payload['reward'],\n",
-    "            done=payload['done']\n",
-    "        )\n",
-    "    \n",
-    "    def _parse_state(self, payload: dict) -> YourState:\n",
-    "        return YourState(...)\n",
-    "```\n",
-    "\n",
-    "### Step 4: Create Server (`server/app.py`)\n",
-    "\n",
-    "```python\n",
-    "from core.env_server import create_fastapi_app\n",
-    "from .your_environment import YourEnvironment\n",
-    "\n",
-    "env = YourEnvironment()\n",
-    "app = create_fastapi_app(env)\n",
-    "\n",
-    "# That's it! OpenEnv creates all endpoints for you.\n",
-    "```\n",
-    "\n",
-    "### Step 5: Dockerize (`server/Dockerfile`)\n",
-    "\n",
-    "```dockerfile\n",
-    "FROM python:3.11-slim\n",
-    "\n",
-    "WORKDIR /app\n",
-    "COPY requirements.txt .\n",
-    "RUN pip install --no-cache-dir -r requirements.txt\n",
-    "\n",
-    "COPY . .\n",
-    "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n",
-    "```\n",
-    "\n",
-    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
-    "\n",
-    "### 🎓 Examples to Study\n",
-    "\n",
-    "OpenEnv includes 3 complete examples:\n",
-    "\n",
-    "1. **`src/envs/echo_env/`**\n",
-    "   - Simplest possible environment\n",
-    "   - Great for testing and learning\n",
-    "\n",
-    "2. **`src/envs/openspiel_env/`**\n",
-    "   - Wraps external library (OpenSpiel)\n",
-    "   - Shows integration pattern\n",
-    "   - 6 games in one integration\n",
-    "\n",
-    "3. **`src/envs/coding_env/`**\n",
-    "   - Python code execution environment\n",
-    "   - Shows complex use case\n",
-    "   - Security considerations\n",
-    "\n",
-    "**💡 Study these to understand the patterns!**\n",
-    "\n",
-    "</div>"
-   ]
+   "source": "---\n\n<a id=\"summary\"></a>\n<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n\n# 🎓 Summary: Your Journey\n\n</div>"
   },
   {
    "cell_type": "markdown",
@@ -841,46 +614,7 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "## 🚀 Next Steps\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<td width=\"33%\">\n",
-    "\n",
-    "### 📖 Learn More\n",
-    "\n",
-    "- Explore `src/envs/README.md`\n",
-    "- Read [RFC 001](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
-    "- Check example scripts in `examples/`\n",
-    "- Study OpenSpiel integration\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"33%\">\n",
-    "\n",
-    "### 🛠️ Build\n",
-    "\n",
-    "- Wrap your favorite RL environment\n",
-    "- Implement real RL algorithms (DQN, PPO)\n",
-    "- Create a custom game\n",
-    "- Deploy to production\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"33%\">\n",
-    "\n",
-    "### 🤝 Contribute\n",
-    "\n",
-    "- Star the [repo](https://github.com/meta-pytorch/OpenEnv)\n",
-    "- Report issues\n",
-    "- Submit PRs\n",
-    "- Share your integrations\n",
-    "\n",
-    "</td>\n",
-    "</tr>\n",
-    "</table>"
-   ]
+   "source": "<a id=\"resources\"></a>\n## 📚 Resources\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n### 🔗 Essential Links\n\n- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n- **🔥 PyTorch**: https://pytorch.org/\n\n### 📖 Documentation Deep Dives\n\n- **Environment Creation Guide**: `src/envs/README.md`\n- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n- **Example Scripts**: `examples/`\n- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n\n### 🎓 Community & Support\n\n**Supported by amazing organizations:**\n- 🔥 Meta PyTorch\n- 🤗 Hugging Face\n- ⚡ Unsloth AI\n- 🌟 Reflection AI\n- 🚀 And many more!\n\n**License**: BSD 3-Clause (very permissive!)\n\n**Contributions**: Always welcome! Check out the issues tab.\n\n</div>\n\n---\n\n### 🌈 What's Next?\n\n1. ⭐ **Star the repo** to show support and stay updated\n2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n3. 🎮 **Explore** other OpenSpiel games\n4. 🛠️ **Build** your own environment integration\n5. 💬 **Share** what you build with the community!"
   },
   {
    "cell_type": "markdown",

From ebfbeb379b3946e180071cba6ada3578990b4863 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:57:34 -0700
Subject: [PATCH 09/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 1163 ++++++++++++++++++++++++++++++-
 1 file changed, 1127 insertions(+), 36 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 3f52343..f588c32 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -3,17 +3,111 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "<div align=\"center\">\n\n<img src=\"https://pytorch.org/assets/images/pytorch-logo.png\" width=\"200\" alt=\"PyTorch\">\n\n# OpenEnv: Production RL Made Simple\n\n### *From \"Hello World\" to Production Deployment in 30 Minutes* ✨\n\n---\n\n**What if RL environments were as easy to use as REST APIs?**\n\nThat's OpenEnv. Type-safe. Isolated. Production-ready. 🎯\n\n[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n[![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?logo=pytorch&logoColor=white)](https://pytorch.org/)\n\n</div>\n\n---"
+   "source": [
+    "<div align=\"center\">\n",
+    "\n",
+    "<img src=\"https://pytorch.org/assets/images/pytorch-logo.png\" width=\"200\" alt=\"PyTorch\">\n",
+    "\n",
+    "Author: [Sanyam Bhutani](http://twitter.com/bhutanisanyam1/)\n",
+    "\n",
+    "# OpenEnv: Production RL Made Simple\n",
+    "\n",
+    "### *From \"Hello World\" to RL Training in 5 Minutes* ✨\n",
+    "\n",
+    "---\n",
+    "\n",
+    "**What if RL environments were as easy to use as REST APIs?**\n",
+    "\n",
+    "That's OpenEnv. Type-safe. Isolated. Production-ready. 🎯\n",
+    "\n",
+    "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
+    "[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n",
+    "[![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?logo=pytorch&logoColor=white)](https://pytorch.org/)\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "---"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 📋 What You'll Learn\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**🎯 Part 1-2: The Fundamentals**\n- ⚡ RL in 60 seconds\n- 🤔 Why existing solutions fall short\n- 💡 The OpenEnv solution\n\n</td>\n<td width=\"50%\">\n\n**🏗️ Part 3-5: The Architecture**\n- 🔧 How OpenEnv works\n- 🔍 Exploring real code\n- 🎮 OpenSpiel integration example\n\n</td>\n</tr>\n<tr>\n<td width=\"50%\">\n\n**🎮 Part 6-8: Hands-On Demo**\n- 🔨 Build a game environment\n- 🤖 Test 4 different policies\n- 👀 Watch learning happen live\n\n</td>\n<td width=\"50%\">\n\n**🔧 Part 9-10: Going Further**\n- 🚀 Use real OpenSpiel\n- ✨ Create your own integration\n- 🌐 Deploy to production\n\n</td>\n</tr>\n</table>\n\n> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n> \n> ⏱️ **Time**: ~30 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
+   "source": [
+    "## 📋 What You'll Learn\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎯 Part 1-2: The Fundamentals**\n",
+    "- ⚡ RL in 60 seconds\n",
+    "- 🤔 Why existing solutions fall short\n",
+    "- 💡 The OpenEnv solution\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🏗️ Part 3-5: The Architecture**\n",
+    "- 🔧 How OpenEnv works\n",
+    "- 🔍 Exploring real code\n",
+    "- 🎮 OpenSpiel integration example\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎮 Part 6-8: Hands-On Demo**\n",
+    "- 🔨 Build a game environment\n",
+    "- 🤖 Test 4 different policies\n",
+    "- 👀 Watch learning happen live\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🔧 Part 9-10: Going Further**\n",
+    "- 🚀 Use real OpenSpiel\n",
+    "- ✨ Create your own integration\n",
+    "- 🌐 Deploy to production\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
+    "> \n",
+    "> ⏱️ **Time**: ~30 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
+   ]
   },
   {
    "cell_type": "markdown",
-   "source": "---\n\n<a id=\"part-1\"></a>\n# Part 1: RL in 60 Seconds ⏱️\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n\n**Reinforcement Learning is simpler than you think.**\n\nIt's just a loop:\n\n```\nwhile not done:\n    observation = environment.observe()\n    action = policy.choose(observation)\n    reward = environment.step(action)\n    policy.learn(reward)\n```\n\nThat's it. That's RL.\n\n</div>\n\nLet's see it in action:",
-   "metadata": {}
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-1\"></a>\n",
+    "# Part 1: RL in 60 Seconds ⏱️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
+    "\n",
+    "**Reinforcement Learning is simpler than you think.**\n",
+    "\n",
+    "It's just a loop:\n",
+    "\n",
+    "```\n",
+    "while not done:\n",
+    "    observation = environment.observe()\n",
+    "    action = policy.choose(observation)\n",
+    "    reward, done = environment.step(action)\n",
+    "    policy.learn(reward)\n",
+    "```\n",
+    "\n",
+    "That's it. That's RL.\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "Let's see it in action:"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -49,27 +143,254 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "import random\n\nprint(\"🎲 \" + \"=\"*58 + \" 🎲\")\nprint(\"   Number Guessing Game - The Simplest RL Example\")\nprint(\"🎲 \" + \"=\"*58 + \" 🎲\")\n\n# Environment setup\ntarget = random.randint(1, 10)\nguesses_left = 3\n\nprint(f\"\\n🎯 I'm thinking of a number between 1 and 10...\")\nprint(f\"💭 You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n\n# The RL Loop - Pure random policy (no learning!)\nwhile guesses_left > 0:\n    # Policy: Random guessing (no learning yet!)\n    guess = random.randint(1, 10)\n    guesses_left -= 1\n    \n    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n    \n    # Reward signal (but we're not using it!)\n    if guess == target:\n        print(\"🎉 Correct! +10 points\")\n        break\n    elif abs(guess - target) <= 2:\n        print(\"🔥 Warm! (close)\")\n    else:\n        print(\"❄️  Cold! (far)\")\nelse:\n    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n\nprint(\"\\n\" + \"=\"*62)\nprint(\"💡 This is RL: Observe → Act → Reward → Repeat\")\nprint(\"   But this policy is terrible! It doesn't learn from rewards.\")\nprint(\"=\"*62 + \"\\n\")"
+   "source": [
+    "import random\n",
+    "\n",
+    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
+    "print(\"   Number Guessing Game - The Simplest RL Example\")\n",
+    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
+    "\n",
+    "# Environment setup\n",
+    "target = random.randint(1, 10)\n",
+    "guesses_left = 3\n",
+    "\n",
+    "print(f\"\\n🎯 I'm thinking of a number between 1 and 10...\")\n",
+    "print(f\"💭 You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n",
+    "\n",
+    "# The RL Loop - Pure random policy (no learning!)\n",
+    "while guesses_left > 0:\n",
+    "    # Policy: Random guessing (no learning yet!)\n",
+    "    guess = random.randint(1, 10)\n",
+    "    guesses_left -= 1\n",
+    "    \n",
+    "    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n",
+    "    \n",
+    "    # Reward signal (but we're not using it!)\n",
+    "    if guess == target:\n",
+    "        print(\"🎉 Correct! +10 points\")\n",
+    "        break\n",
+    "    elif abs(guess - target) <= 2:\n",
+    "        print(\"🔥 Warm! (close)\")\n",
+    "    else:\n",
+    "        print(\"❄️  Cold! (far)\")\n",
+    "else:\n",
+    "    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*62)\n",
+    "print(\"💡 This is RL: Observe → Act → Reward → Repeat\")\n",
+    "print(\"   But this policy is terrible! It doesn't learn from rewards.\")\n",
+    "print(\"=\"*62 + \"\\n\")"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-2\"></a>\n# Part 2: The Problem with Traditional RL 😤\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🤔 Why Can't We Just Use OpenAI Gym?\n\nGood question! Gym is great for research, but production needs more...\n\n</div>\n\n<table>\n<tr>\n<th>Challenge</th>\n<th>Traditional Approach</th>\n<th>OpenEnv Solution</th>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>❌ <code>obs[0][3]</code> - what is this?</td>\n<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n</tr>\n<tr>\n<td><b>Isolation</b></td>\n<td>❌ Same process (can crash your training)</td>\n<td>✅ Docker containers (fully isolated)</td>\n</tr>\n<tr>\n<td><b>Deployment</b></td>\n<td>❌ \"Works on my machine\" 🤷</td>\n<td>✅ Same container everywhere 🐳</td>\n</tr>\n<tr>\n<td><b>Scaling</b></td>\n<td>❌ Hard to distribute</td>\n<td>✅ Deploy to Kubernetes ☸️</td>\n</tr>\n<tr>\n<td><b>Language</b></td>\n<td>❌ Python only</td>\n<td>✅ Any language (HTTP API) 🌐</td>\n</tr>\n<tr>\n<td><b>Debugging</b></td>\n<td>❌ Cryptic numpy errors</td>\n<td>✅ Clear type errors 🐛</td>\n</tr>\n</table>\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n## 💡 The OpenEnv Philosophy\n\n**\"RL environments should be like microservices\"**\n\nThink of it like this: You don't run your database in the same process as your web server, right? Same principle!\n\n- 🔒 **Isolated**: Run in containers (security + stability)\n- 🌐 **Standard**: HTTP API, works everywhere\n- 📦 **Versioned**: Docker images (reproducibility!)\n- 🚀 **Scalable**: Deploy to cloud with one command\n- 🛡️ **Type-safe**: Catch bugs before they happen\n- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-2\"></a>\n",
+    "# Part 2: The Problem with Traditional RL 😤\n",
+    "\n",
+    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## 🤔 Why Can't We Just Use OpenAI Gym?\n",
+    "\n",
+    "Good question! Gym is great for research, but production needs more...\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Challenge</th>\n",
+    "<th>Traditional Approach</th>\n",
+    "<th>OpenEnv Solution</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
+    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Isolation</b></td>\n",
+    "<td>❌ Same process (can crash your training)</td>\n",
+    "<td>✅ Docker containers (fully isolated)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Deployment</b></td>\n",
+    "<td>❌ \"Works on my machine\" 🤷</td>\n",
+    "<td>✅ Same container everywhere 🐳</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Scaling</b></td>\n",
+    "<td>❌ Hard to distribute</td>\n",
+    "<td>✅ Deploy to Kubernetes ☸️</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Language</b></td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any language (HTTP API) 🌐</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Debugging</b></td>\n",
+    "<td>❌ Cryptic numpy errors</td>\n",
+    "<td>✅ Clear type errors 🐛</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "## 💡 The OpenEnv Philosophy\n",
+    "\n",
+    "**\"RL environments should be like microservices\"**\n",
+    "\n",
+    "Think of it like this: You don't run your database in the same process as your web server, right? Same principle!\n",
+    "\n",
+    "- 🔒 **Isolated**: Run in containers (security + stability)\n",
+    "- 🌐 **Standard**: HTTP API, works everywhere\n",
+    "- 📦 **Versioned**: Docker images (reproducibility!)\n",
+    "- 🚀 **Scalable**: Deploy to cloud with one command\n",
+    "- 🛡️ **Type-safe**: Catch bugs before they happen\n",
+    "- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
-   "source": "### The Architecture\n\n```\n┌────────────────────────────────────────────────────────────┐\n│  YOUR TRAINING CODE                                        │\n│                                                            │\n│  env = OpenSpielEnv(...)        ← Import the client      │\n│  result = env.reset()           ← Type-safe!             │\n│  result = env.step(action)      ← Type-safe!             │\n│                                                            │\n└─────────────────┬──────────────────────────────────────────┘\n                  │\n                  │  HTTP/JSON (Language-Agnostic)\n                  │  POST /reset, POST /step, GET /state\n                  │\n┌─────────────────▼──────────────────────────────────────────┐\n│  DOCKER CONTAINER                                          │\n│                                                            │\n│  ┌──────────────────────────────────────────────┐         │\n│  │  FastAPI Server                              │         │\n│  │  └─ Environment (reset, step, state)         │         │\n│  │     └─ Your Game/Simulation Logic            │         │\n│  └──────────────────────────────────────────────┘         │\n│                                                            │\n│  Isolated • Reproducible • Secure                          │\n└────────────────────────────────────────────────────────────┘\n```\n\n<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n\n**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n\n```python\nenv.reset()    # Under the hood: HTTP POST to /reset\nenv.step(...)  # Under the hood: HTTP POST to /step\nenv.state()    # Under the hood: HTTP GET to /state\n```\n\nThe magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n\n</div>",
-   "metadata": {}
+   "metadata": {},
+   "source": [
+    "### The Architecture\n",
+    "\n",
+    "```\n",
+    "┌────────────────────────────────────────────────────────────┐\n",
+    "│  YOUR TRAINING CODE                                        │\n",
+    "│                                                            │\n",
+    "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
+    "│  result = env.reset()           ← Type-safe!             │\n",
+    "│  result = env.step(action)      ← Type-safe!             │\n",
+    "│                                                            │\n",
+    "└─────────────────┬──────────────────────────────────────────┘\n",
+    "                  │\n",
+    "                  │  HTTP/JSON (Language-Agnostic)\n",
+    "                  │  POST /reset, POST /step, GET /state\n",
+    "                  │\n",
+    "┌─────────────────▼──────────────────────────────────────────┐\n",
+    "│  DOCKER CONTAINER                                          │\n",
+    "│                                                            │\n",
+    "│  ┌──────────────────────────────────────────────┐         │\n",
+    "│  │  FastAPI Server                              │         │\n",
+    "│  │  └─ Environment (reset, step, state)         │         │\n",
+    "│  │     └─ Your Game/Simulation Logic            │         │\n",
+    "│  └──────────────────────────────────────────────┘         │\n",
+    "│                                                            │\n",
+    "│  Isolated • Reproducible • Secure                          │\n",
+    "└────────────────────────────────────────────────────────────┘\n",
+    "```\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
+    "\n",
+    "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
+    "\n",
+    "```python\n",
+    "env.reset()    # Under the hood: HTTP POST to /reset\n",
+    "env.step(...)  # Under the hood: HTTP POST to /step\n",
+    "env.state()    # Under the hood: HTTP GET to /state\n",
+    "```\n",
+    "\n",
+    "The magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n# Part 2: The Problem with Traditional RL 😤\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🤔 Why Can't We Just Use OpenAI Gym?\n\nGood question! Gym is great for research, but production needs more...\n\n</div>\n\n<table>\n<tr>\n<th>Challenge</th>\n<th>Traditional Approach</th>\n<th>OpenEnv Solution</th>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>❌ <code>obs[0][3]</code> - what is this?</td>\n<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n</tr>\n<tr>\n<td><b>Isolation</b></td>\n<td>❌ Same process (can crash your training)</td>\n<td>✅ Docker containers (fully isolated)</td>\n</tr>\n<tr>\n<td><b>Deployment</b></td>\n<td>❌ \"Works on my machine\" 🤷</td>\n<td>✅ Same container everywhere 🐳</td>\n</tr>\n<tr>\n<td><b>Scaling</b></td>\n<td>❌ Hard to distribute</td>\n<td>✅ Deploy to Kubernetes ☸️</td>\n</tr>\n<tr>\n<td><b>Language</b></td>\n<td>❌ Python only</td>\n<td>✅ Any language (HTTP API) 🌐</td>\n</tr>\n<tr>\n<td><b>Debugging</b></td>\n<td>❌ Cryptic numpy errors</td>\n<td>✅ Clear type errors 🐛</td>\n</tr>\n</table>\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n## 💡 The OpenEnv Philosophy\n\n**\"RL environments should be like microservices\"**\n\nThink of it like this: You don't run your database in the same process as your web server, right? Same principle!\n\n- 🔒 **Isolated**: Run in containers (security + stability)\n- 🌐 **Standard**: HTTP API, works everywhere\n- 📦 **Versioned**: Docker images (reproducibility!)\n- 🚀 **Scalable**: Deploy to cloud with one command\n- 🛡️ **Type-safe**: Catch bugs before they happen\n- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 2: The Problem with Traditional RL 😤\n",
+    "\n",
+    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## 🤔 Why Can't We Just Use OpenAI Gym?\n",
+    "\n",
+    "Good question! Gym is great for research, but production needs more...\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Challenge</th>\n",
+    "<th>Traditional Approach</th>\n",
+    "<th>OpenEnv Solution</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
+    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Isolation</b></td>\n",
+    "<td>❌ Same process (can crash your training)</td>\n",
+    "<td>✅ Docker containers (fully isolated)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Deployment</b></td>\n",
+    "<td>❌ \"Works on my machine\" 🤷</td>\n",
+    "<td>✅ Same container everywhere 🐳</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Scaling</b></td>\n",
+    "<td>❌ Hard to distribute</td>\n",
+    "<td>✅ Deploy to Kubernetes ☸️</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Language</b></td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any language (HTTP API) 🌐</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Debugging</b></td>\n",
+    "<td>❌ Cryptic numpy errors</td>\n",
+    "<td>✅ Clear type errors 🐛</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "## 💡 The OpenEnv Philosophy\n",
+    "\n",
+    "**\"RL environments should be like microservices\"**\n",
+    "\n",
+    "Think of it like this: You don't run your database in the same process as your web server, right? Same principle!\n",
+    "\n",
+    "- 🔒 **Isolated**: Run in containers (security + stability)\n",
+    "- 🌐 **Standard**: HTTP API, works everywhere\n",
+    "- 📦 **Versioned**: Docker images (reproducibility!)\n",
+    "- 🚀 **Scalable**: Deploy to cloud with one command\n",
+    "- 🛡️ **Type-safe**: Catch bugs before they happen\n",
+    "- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-3\"></a>\n# Part 3: Setup 🛠️\n\n<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n\n**Running locally?** Make sure you're in the OpenEnv directory.\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-3\"></a>\n",
+    "# Part 3: Setup 🛠️\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
+    "\n",
+    "**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n",
+    "\n",
+    "**Running locally?** Make sure you're in the OpenEnv directory.\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -93,7 +414,34 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "---\n\n<a id=\"part-4\"></a>\n# Part 4: The OpenEnv Pattern 🏗️\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Every OpenEnv Environment Has 3 Components:\n\n```\nsrc/envs/your_env/\n├── 📝 models.py          ← Type-safe contracts\n│                           (Action, Observation, State)\n│\n├── 📱 client.py          ← What YOU import\n│                           (HTTPEnvClient implementation)\n│\n└── 🖥️  server/\n    ├── environment.py    ← Game/simulation logic\n    ├── app.py            ← FastAPI server\n    └── Dockerfile        ← Container definition\n```\n\n</div>\n\nLet's explore the actual OpenEnv code to see how this works:"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-4\"></a>\n",
+    "# Part 4: The OpenEnv Pattern 🏗️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## Every OpenEnv Environment Has 3 Components:\n",
+    "\n",
+    "```\n",
+    "src/envs/your_env/\n",
+    "├── 📝 models.py          ← Type-safe contracts\n",
+    "│                           (Action, Observation, State)\n",
+    "│\n",
+    "├── 📱 client.py          ← What YOU import\n",
+    "│                           (HTTPEnvClient implementation)\n",
+    "│\n",
+    "└── 🖥️  server/\n",
+    "    ├── environment.py    ← Game/simulation logic\n",
+    "    ├── app.py            ← FastAPI server\n",
+    "    └── Dockerfile        ← Container definition\n",
+    "```\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "Let's explore the actual OpenEnv code to see how this works:"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -131,7 +479,47 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "---\n\n<a id=\"part-5\"></a>\n# Part 5: Example Integration - OpenSpiel 🎮\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What is OpenSpiel?\n\n**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n\n## OpenEnv's Integration\n\nWe've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**🎯 Single-Player**\n1. **Catch** - Catch falling ball\n2. **Cliff Walking** - Navigate grid\n3. **2048** - Tile puzzle\n4. **Blackjack** - Card game\n\n</td>\n<td width=\"50%\">\n\n**👥 Multi-Player**\n5. **Tic-Tac-Toe** - Classic 3×3\n6. **Kuhn Poker** - Imperfect info poker\n\n</td>\n</tr>\n</table>\n\nThis shows how OpenEnv can wrap **any** existing RL library!\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-5\"></a>\n",
+    "# Part 5: Example Integration - OpenSpiel 🎮\n",
+    "\n",
+    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## What is OpenSpiel?\n",
+    "\n",
+    "**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n",
+    "\n",
+    "## OpenEnv's Integration\n",
+    "\n",
+    "We've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎯 Single-Player**\n",
+    "1. **Catch** - Catch a falling ball\n",
+    "2. **Cliff Walking** - Navigate grid\n",
+    "3. **2048** - Tile puzzle\n",
+    "4. **Blackjack** - Card game\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**👥 Multi-Player**\n",
+    "5. **Tic-Tac-Toe** - Classic 3×3\n",
+    "6. **Kuhn Poker** - Imperfect-information poker\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "This shows how OpenEnv can wrap **any** existing RL library!\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -237,18 +625,61 @@
     "</div>"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": "---\n\n<a id=\"part-6\"></a>\n<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n\n# 🎮 Part 6: Interactive Demo\n\n### Now let's BUILD something!\n\nWe'll create a **Catch game** following OpenEnv patterns,<br>\nthen watch **4 different AI policies** compete for the championship! 🏆\n\n<br>\n\n**Get ready for:**\n- ⚡ Live gameplay visualization\n- 🤖 AI policy showdown\n- 📊 Real-time learning metrics\n- 🎯 Production-ready patterns\n\n</div>"
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": "---\n\n<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n\n# 🎮 Part 6: Interactive Demo\n\n### Now let's BUILD something!\n\nWe'll create a **Catch game** following OpenEnv patterns,<br>\nthen watch **4 different AI policies** compete for the championship! 🏆\n\n<br>\n\n**Get ready for:**\n- ⚡ Live gameplay visualization\n- 🤖 AI policy showdown\n- 📊 Real-time learning metrics\n- 🎯 Production-ready patterns\n\n</div>"
-  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-6\"></a>\n",
+    "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
+    "\n",
+    "# 🎮 Part 6: Interactive Demo\n",
+    "\n",
+    "### Now let's BUILD something!\n",
+    "\n",
+    "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
+    "then watch **4 different AI policies** compete for the championship! 🏆\n",
+    "\n",
+    "<br>\n",
+    "\n",
+    "**Get ready for:**\n",
+    "- ⚡ Live gameplay visualization\n",
+    "- 🤖 AI policy showdown\n",
+    "- 📊 Real-time learning metrics\n",
+    "- 🎯 Production-ready patterns\n",
+    "\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
+    "\n",
+    "# 🎮 Part 6: Interactive Demo\n",
+    "\n",
+    "### Now let's BUILD something!\n",
+    "\n",
+    "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
+    "then watch **4 different AI policies** compete for the championship! 🏆\n",
+    "\n",
+    "<br>\n",
+    "\n",
+    "**Get ready for:**\n",
+    "- ⚡ Live gameplay visualization\n",
+    "- 🤖 AI policy showdown\n",
+    "- 📊 Real-time learning metrics\n",
+    "- 🎯 Production-ready patterns\n",
+    "\n",
+    "</div>"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -306,7 +737,123 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "import random\nfrom dataclasses import dataclass\nfrom typing import List, Tuple\n\n# ============================================================================\n# MODELS - Type-safe contracts (following OpenEnv pattern)\n# ============================================================================\n\n@dataclass\nclass CatchObservation:\n    \"\"\"Type-safe observation following OpenEnv Observation base class.\"\"\"\n    info_state: List[float]      # Grid as flat array\n    legal_actions: List[int]     # [0, 1, 2] always\n    done: bool                   # Episode finished?\n    reward: float                # +1 or 0\n    # Extra fields for visualization\n    ball_position: Tuple[int, int]\n    paddle_position: int\n\n\n# ============================================================================\n# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n# ============================================================================\n\nclass CatchEnvironment:\n    \"\"\"\n    Catch game following OpenEnv's Environment pattern.\n    \n    In production:\n      • Runs in Docker container\n      • Accessed via HTTPEnvClient\n      • Exposed via FastAPI server\n    \n    For this demo:\n      • We run it locally to see internals\n      • But the structure is identical!\n    \"\"\"\n    \n    def __init__(self, grid_size=5):\n        self.grid_size = grid_size\n    \n    def reset(self) -> CatchObservation:\n        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n        self.ball_row = 0\n        self.ball_col = random.randint(0, self.grid_size - 1)\n        self.paddle_col = self.grid_size // 2\n        self.done = False\n        return self._make_observation()\n    \n    def step(self, action: int) -> CatchObservation:\n        \"\"\"Execute action (implements Environment.step()).\n        \n        Args:\n            action: 0=LEFT, 1=STAY, 2=RIGHT\n        \"\"\"\n        # Move paddle\n        if action == 0 and 
self.paddle_col > 0:\n            self.paddle_col -= 1\n        elif action == 2 and self.paddle_col < self.grid_size - 1:\n            self.paddle_col += 1\n        \n        # Move ball down\n        self.ball_row += 1\n        \n        # Check if episode done\n        if self.ball_row >= self.grid_size - 1:\n            self.done = True\n            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n        else:\n            reward = 0.0\n        \n        return self._make_observation(reward)\n    \n    def _make_observation(self, reward=0.0) -> CatchObservation:\n        \"\"\"Create type-safe observation.\"\"\"\n        # Flatten grid to vector (like real RL environments do)\n        info_state = [0.0] * (self.grid_size * self.grid_size)\n        ball_idx = self.ball_row * self.grid_size + self.ball_col\n        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n        info_state[ball_idx] = 1.0      # Ball = 1.0\n        info_state[paddle_idx] = 0.5    # Paddle = 0.5\n        \n        return CatchObservation(\n            info_state=info_state,\n            legal_actions=[0, 1, 2],\n            done=self.done,\n            reward=reward,\n            ball_position=(self.ball_row, self.ball_col),\n            paddle_position=self.paddle_col\n        )\n    \n    def render(self):\n        \"\"\"Visualize current state.\"\"\"\n        for row in range(self.grid_size):\n            line = \"  \"\n            for col in range(self.grid_size):\n                if row == self.ball_row and col == self.ball_col:\n                    line += \"🔴 \"\n                elif row == self.grid_size - 1 and col == self.paddle_col:\n                    line += \"🏓 \"\n                else:\n                    line += \"⬜ \"\n            print(line)\n\n\nprint(\"🎉 \" + \"=\"*64 + \" 🎉\")\nprint(\"   ✅ Environment Created Following OpenEnv Pattern!\")\nprint(\"🎉 \" + \"=\"*64 + \" 🎉\")\nprint(\"\\n📋 What we just built:\")\nprint(\"   • reset() → 
CatchObservation (type-safe!)\")\nprint(\"   • step(action) → CatchObservation (type-safe!)\")\nprint(\"   • render() → Visual display\")\nprint(\"\\n🚀 In production: This would run in Docker + FastAPI\")\nprint(\"   But the structure is EXACTLY the same!\")\nprint(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
+   "source": [
+    "import random\n",
+    "from dataclasses import dataclass\n",
+    "from typing import List, Tuple\n",
+    "\n",
+    "# ============================================================================\n",
+    "# MODELS - Type-safe contracts (following OpenEnv pattern)\n",
+    "# ============================================================================\n",
+    "\n",
+    "@dataclass\n",
+    "class CatchObservation:\n",
+    "    \"\"\"Type-safe observation following OpenEnv Observation base class.\"\"\"\n",
+    "    info_state: List[float]      # Grid as flat array\n",
+    "    legal_actions: List[int]     # Always [0, 1, 2]: LEFT, STAY, RIGHT\n",
+    "    done: bool                   # Episode finished?\n",
+    "    reward: float                # +1 or 0\n",
+    "    # Extra fields for visualization\n",
+    "    ball_position: Tuple[int, int]\n",
+    "    paddle_position: int\n",
+    "\n",
+    "\n",
+    "# ============================================================================\n",
+    "# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n",
+    "# ============================================================================\n",
+    "\n",
+    "class CatchEnvironment:\n",
+    "    \"\"\"\n",
+    "    Catch game following OpenEnv's Environment pattern.\n",
+    "    \n",
+    "    In production:\n",
+    "      • Runs in Docker container\n",
+    "      • Accessed via HTTPEnvClient\n",
+    "      • Exposed via FastAPI server\n",
+    "    \n",
+    "    For this demo:\n",
+    "      • We run it locally to see internals\n",
+    "      • But the structure is identical!\n",
+    "    \"\"\"\n",
+    "    \n",
+    "    def __init__(self, grid_size=5):\n",
+    "        self.grid_size = grid_size\n",
+    "    \n",
+    "    def reset(self) -> CatchObservation:\n",
+    "        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n",
+    "        self.ball_row = 0\n",
+    "        self.ball_col = random.randint(0, self.grid_size - 1)\n",
+    "        self.paddle_col = self.grid_size // 2\n",
+    "        self.done = False\n",
+    "        return self._make_observation()\n",
+    "    \n",
+    "    def step(self, action: int) -> CatchObservation:\n",
+    "        \"\"\"Execute action (implements Environment.step()).\n",
+    "        \n",
+    "        Args:\n",
+    "            action: 0=LEFT, 1=STAY, 2=RIGHT\n",
+    "        \"\"\"\n",
+    "        # Move paddle\n",
+    "        if action == 0 and self.paddle_col > 0:\n",
+    "            self.paddle_col -= 1\n",
+    "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
+    "            self.paddle_col += 1\n",
+    "        \n",
+    "        # Move ball down\n",
+    "        self.ball_row += 1\n",
+    "        \n",
+    "        # Check if episode done\n",
+    "        if self.ball_row >= self.grid_size - 1:\n",
+    "            self.done = True\n",
+    "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
+    "        else:\n",
+    "            reward = 0.0\n",
+    "        \n",
+    "        return self._make_observation(reward)\n",
+    "    \n",
+    "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
+    "        \"\"\"Create type-safe observation.\"\"\"\n",
+    "        # Flatten grid to vector (like real RL environments do)\n",
+    "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
+    "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
+    "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
+    "        info_state[ball_idx] = 1.0      # Ball = 1.0\n",
+    "        info_state[paddle_idx] = 0.5    # Paddle = 0.5 (overwrites the ball's 1.0 on a catch)\n",
+    "        \n",
+    "        return CatchObservation(\n",
+    "            info_state=info_state,\n",
+    "            legal_actions=[0, 1, 2],\n",
+    "            done=self.done,\n",
+    "            reward=reward,\n",
+    "            ball_position=(self.ball_row, self.ball_col),\n",
+    "            paddle_position=self.paddle_col\n",
+    "        )\n",
+    "    \n",
+    "    def render(self):\n",
+    "        \"\"\"Visualize current state.\"\"\"\n",
+    "        for row in range(self.grid_size):\n",
+    "            line = \"  \"\n",
+    "            for col in range(self.grid_size):\n",
+    "                if row == self.ball_row and col == self.ball_col:\n",
+    "                    line += \"🔴 \"\n",
+    "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
+    "                    line += \"🏓 \"\n",
+    "                else:\n",
+    "                    line += \"⬜ \"\n",
+    "            print(line)\n",
+    "\n",
+    "\n",
+    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
+    "print(\"   ✅ Environment Created Following OpenEnv Pattern!\")\n",
+    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
+    "print(\"\\n📋 What we just built:\")\n",
+    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
+    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
+    "print(\"   • render() → Visual display\")\n",
+    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
+    "print(\"   But the structure is EXACTLY the same!\")\n",
+    "print(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -320,7 +867,46 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "---\n\n<a id=\"part-7\"></a>\n# Part 7: Four Policies 🤖\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Let's test 4 different AI strategies:\n\n<table>\n<tr>\n<th width=\"25%\">Policy</th>\n<th width=\"50%\">Strategy</th>\n<th width=\"25%\">Expected Performance</th>\n</tr>\n<tr>\n<td><b>🎲 Random</b></td>\n<td>Pick random action every step</td>\n<td>~20% (pure luck)</td>\n</tr>\n<tr>\n<td><b>🛑 Always Stay</b></td>\n<td>Never move, hope ball lands in center</td>\n<td>~20% (terrible!)</td>\n</tr>\n<tr>\n<td><b>🧠 Smart</b></td>\n<td>Move paddle toward ball</td>\n<td>100% (optimal!)</td>\n</tr>\n<tr>\n<td><b>📈 Learning</b></td>\n<td>Start random, learn smart strategy</td>\n<td>~85% (improves over time)</td>\n</tr>\n</table>\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-7\"></a>\n",
+    "# Part 7: Four Policies 🤖\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## Let's test 4 different AI strategies:\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th width=\"25%\">Policy</th>\n",
+    "<th width=\"50%\">Strategy</th>\n",
+    "<th width=\"25%\">Expected Performance</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🎲 Random</b></td>\n",
+    "<td>Pick random action every step</td>\n",
+    "<td>~20% (pure luck)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🛑 Always Stay</b></td>\n",
+    "<td>Never move, hope ball lands in center</td>\n",
+    "<td>~20% (terrible!)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>🧠 Smart</b></td>\n",
+    "<td>Move paddle toward ball</td>\n",
+    "<td>100% (optimal!)</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>📈 Learning</b></td>\n",
+    "<td>Start random, learn smart strategy</td>\n",
+    "<td>~85% (improves over time)</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -370,7 +956,82 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "# ============================================================================\n# POLICIES - Different AI strategies\n# ============================================================================\n\nclass RandomPolicy:\n    \"\"\"Baseline: Pure random guessing.\"\"\"\n    name = \"🎲 Random Guesser\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        return random.choice(obs.legal_actions)\n\n\nclass AlwaysStayPolicy:\n    \"\"\"Bad strategy: Never moves.\"\"\"\n    name = \"🛑 Always Stay\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        return 1  # STAY\n\n\nclass SmartPolicy:\n    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n    name = \"🧠 Smart Heuristic\"\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        ball_col = obs.ball_position[1]\n        paddle_col = obs.paddle_position\n        \n        if paddle_col < ball_col:\n            return 2  # Move RIGHT\n        elif paddle_col > ball_col:\n            return 0  # Move LEFT\n        else:\n            return 1  # STAY (already aligned)\n\n\nclass LearningPolicy:\n    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n    name = \"📈 Learning Agent\"\n    \n    def __init__(self):\n        self.steps = 0\n    \n    def select_action(self, obs: CatchObservation) -> int:\n        self.steps += 1\n        \n        # Decay exploration rate over time\n        epsilon = max(0.1, 1.0 - (self.steps / 100))\n        \n        if random.random() < epsilon:\n            # Explore: random action\n            return random.choice(obs.legal_actions)\n        else:\n            # Exploit: use smart strategy\n            ball_col = obs.ball_position[1]\n            paddle_col = obs.paddle_position\n            if paddle_col < ball_col:\n                return 2\n            elif paddle_col > ball_col:\n                return 0\n            else:\n                return 1\n\n\nprint(\"🤖 \" + \"=\"*64 + \" 🤖\")\nprint(\"   ✅ 4 
Policies Created!\")\nprint(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n\npolicies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\nfor i, policy in enumerate(policies, 1):\n    print(f\"   {i}. {policy.name}\")\n\nprint(\"\\n💡 Each policy represents a different approach to solving the game!\")\nprint(\"   Let's see who performs best! 🏆\\n\")"
+   "source": [
+    "# ============================================================================\n",
+    "# POLICIES - Different AI strategies\n",
+    "# ============================================================================\n",
+    "\n",
+    "class RandomPolicy:\n",
+    "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
+    "    name = \"🎲 Random Guesser\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        return random.choice(obs.legal_actions)\n",
+    "\n",
+    "\n",
+    "class AlwaysStayPolicy:\n",
+    "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
+    "    name = \"🛑 Always Stay\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        return 1  # STAY\n",
+    "\n",
+    "\n",
+    "class SmartPolicy:\n",
+    "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
+    "    name = \"🧠 Smart Heuristic\"\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        ball_col = obs.ball_position[1]\n",
+    "        paddle_col = obs.paddle_position\n",
+    "        \n",
+    "        if paddle_col < ball_col:\n",
+    "            return 2  # Move RIGHT\n",
+    "        elif paddle_col > ball_col:\n",
+    "            return 0  # Move LEFT\n",
+    "        else:\n",
+    "            return 1  # STAY (already aligned)\n",
+    "\n",
+    "\n",
+    "class LearningPolicy:\n",
+    "    \"\"\"Simulated RL: epsilon-greedy exploration around the smart heuristic (no real learning).\"\"\"\n",
+    "    name = \"📈 Learning Agent\"\n",
+    "    \n",
+    "    def __init__(self):\n",
+    "        self.steps = 0\n",
+    "    \n",
+    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        self.steps += 1\n",
+    "        \n",
+    "        # Decay exploration rate linearly to a floor of 0.1 (reached at step 90)\n",
+    "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
+    "        \n",
+    "        if random.random() < epsilon:\n",
+    "            # Explore: random action\n",
+    "            return random.choice(obs.legal_actions)\n",
+    "        else:\n",
+    "            # Exploit: use smart strategy\n",
+    "            ball_col = obs.ball_position[1]\n",
+    "            paddle_col = obs.paddle_position\n",
+    "            if paddle_col < ball_col:\n",
+    "                return 2\n",
+    "            elif paddle_col > ball_col:\n",
+    "                return 0\n",
+    "            else:\n",
+    "                return 1\n",
+    "\n",
+    "\n",
+    "print(\"🤖 \" + \"=\"*64 + \" 🤖\")\n",
+    "print(\"   ✅ 4 Policies Created!\")\n",
+    "print(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n",
+    "\n",
+    "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
+    "for i, policy in enumerate(policies, 1):\n",
+    "    print(f\"   {i}. {policy.name}\")\n",
+    "\n",
+    "print(\"\\n💡 Each policy represents a different approach to solving the game!\")\n",
+    "print(\"   Let's see who performs best! 🏆\\n\")"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -441,7 +1102,18 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-8\"></a>\n# Part 8: Policy Competition! 🏆\n\n<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\nLet's run **50 episodes** for each policy and see who wins!\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-8\"></a>\n",
+    "# Part 8: Policy Competition! 🏆\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "Let's run **50 episodes** for each policy and see who wins!\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -463,22 +1135,296 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "def evaluate_policies(num_episodes=50):\n    \"\"\"Compare all policies over many episodes.\"\"\"\n    policies = [\n        RandomPolicy(),\n        AlwaysStayPolicy(),\n        SmartPolicy(),\n        LearningPolicy(),\n    ]\n    \n    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n    \n    results = []\n    for policy in policies:\n        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n        env = CatchEnvironment()\n        successes = sum(run_episode(env, policy, visualize=False) \n                       for _ in range(num_episodes))\n        success_rate = (successes / num_episodes) * 100\n        results.append((policy.name, success_rate, successes))\n        print(f\"✓ Done!\")\n    \n    print(\"\\n\" + \"=\"*70)\n    print(\"   📊 FINAL RESULTS\")\n    print(\"=\"*70 + \"\\n\")\n    \n    # Sort by success rate (descending)\n    results.sort(key=lambda x: x[1], reverse=True)\n    \n    # Award medals to top 3\n    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n    \n    for i, (name, rate, successes) in enumerate(results):\n        medal = medals[i]\n        bar = \"█\" * int(rate / 2)\n        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n    \n    print(\"\\n\" + \"=\"*70)\n    print(\"\\n✨ Key Insights:\")\n    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n    print(\"   • Always Stay (~20%): Bad strategy - stays center 🛑\")\n    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n    print(\"   • Learning (~85%):    Improves over time 📈\")\n    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n    print(\"   1. Start with exploration (trying random things)\")\n    print(\"   2. Learn from rewards (what works, what doesn't)\")\n    print(\"   3. 
Converge to optimal behavior (smart strategy)\")\n    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n\n# Run the epic competition!\nprint(\"🎮 Starting the showdown...\")\nevaluate_policies(num_episodes=50)"
+   "source": [
+    "def evaluate_policies(num_episodes=50):\n",
+    "    \"\"\"Compare all policies over many episodes.\"\"\"\n",
+    "    policies = [\n",
+    "        RandomPolicy(),\n",
+    "        AlwaysStayPolicy(),\n",
+    "        SmartPolicy(),\n",
+    "        LearningPolicy(),\n",
+    "    ]\n",
+    "    \n",
+    "    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n",
+    "    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
+    "    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n",
+    "    \n",
+    "    results = []\n",
+    "    for policy in policies:\n",
+    "        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n",
+    "        env = CatchEnvironment()\n",
+    "        successes = sum(run_episode(env, policy, visualize=False) \n",
+    "                       for _ in range(num_episodes))\n",
+    "        success_rate = (successes / num_episodes) * 100\n",
+    "        results.append((policy.name, success_rate, successes))\n",
+    "        print(\"✓ Done!\")\n",
+    "    \n",
+    "    print(\"\\n\" + \"=\"*70)\n",
+    "    print(\"   📊 FINAL RESULTS\")\n",
+    "    print(\"=\"*70 + \"\\n\")\n",
+    "    \n",
+    "    # Sort by success rate (descending)\n",
+    "    results.sort(key=lambda x: x[1], reverse=True)\n",
+    "    \n",
+    "    # Award medals to top 3\n",
+    "    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n",
+    "    \n",
+    "    for i, (name, rate, successes) in enumerate(results):\n",
+    "        medal = medals[i]\n",
+    "        bar = \"█\" * int(rate / 2)\n",
+    "        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
+    "    \n",
+    "    print(\"\\n\" + \"=\"*70)\n",
+    "    print(\"\\n✨ Key Insights:\")\n",
+    "    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n",
+    "    print(\"   • Always Stay (~20%): Bad strategy - stays center 🛑\")\n",
+    "    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n",
+    "    print(\"   • Learning (~85%):    Improves over time 📈\")\n",
+    "    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n",
+    "    print(\"   1. Start with exploration (trying random things)\")\n",
+    "    print(\"   2. Learn from rewards (what works, what doesn't)\")\n",
+    "    print(\"   3. Converge to optimal behavior (smart strategy)\")\n",
+    "    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n",
+    "\n",
+    "# Run the epic competition!\n",
+    "print(\"🎮 Starting the showdown...\")\n",
+    "evaluate_policies(num_episodes=50)"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-9\"></a>\n# Part 9: Using Real OpenSpiel 🎮\n\n<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What We Just Built vs Production OpenSpiel\n\n<table>\n<tr>\n<th>Component</th>\n<th>Our Demo</th>\n<th>OpenEnv + OpenSpiel</th>\n</tr>\n<tr>\n<td><b>Environment</b></td>\n<td>Local Python class</td>\n<td>Docker container</td>\n</tr>\n<tr>\n<td><b>Communication</b></td>\n<td>Direct function calls</td>\n<td>HTTP/JSON</td>\n</tr>\n<tr>\n<td><b>Client</b></td>\n<td>Direct access</td>\n<td>HTTPEnvClient</td>\n</tr>\n<tr>\n<td><b>Type Safety</b></td>\n<td>✅ Dataclasses</td>\n<td>✅ Dataclasses</td>\n</tr>\n<tr>\n<td><b>API</b></td>\n<td>reset(), step()</td>\n<td>reset(), step() <em>(same!)</em></td>\n</tr>\n</table>\n\n**🎯 Same structure, production features!**\n\n</div>\n\n### Using OpenSpiel Integration:\n\n```python\n# 1. Install OpenSpiel\n!pip install open_spiel\n\n# 2. Import OpenEnv's integration\nfrom envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n\n# 3. Connect to server (HTTP!)\nenv = OpenSpielEnv(base_url=\"http://localhost:8000\")\n\n# 4. Same API you just learned!\nresult = env.reset()\nresult = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\nstate = env.state()\n\n# 5. Switch games by changing game_name:\nresult = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n```\n\n<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**🎮 6 Games Available:**\n\n1. `\"catch\"` - What we just built!\n2. `\"tic_tac_toe\"` - Classic 3×3\n3. `\"kuhn_poker\"` - Imperfect information poker\n4. `\"cliff_walking\"` - Grid navigation\n5. `\"2048\"` - Tile puzzle\n6. `\"blackjack\"` - Card game\n\n**All use the exact same interface!**\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-9\"></a>\n",
+    "# Part 9: Using Real OpenSpiel 🎮\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## What We Just Built vs Production OpenSpiel\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<th>Component</th>\n",
+    "<th>Our Demo</th>\n",
+    "<th>OpenEnv + OpenSpiel</th>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Environment</b></td>\n",
+    "<td>Local Python class</td>\n",
+    "<td>Docker container</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Communication</b></td>\n",
+    "<td>Direct function calls</td>\n",
+    "<td>HTTP/JSON</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Client</b></td>\n",
+    "<td>Direct access</td>\n",
+    "<td>HTTPEnvClient</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>Type Safety</b></td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "</tr>\n",
+    "<tr>\n",
+    "<td><b>API</b></td>\n",
+    "<td>reset(), step()</td>\n",
+    "<td>reset(), step() <em>(same!)</em></td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "**🎯 Same structure, production features!**\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "### Using OpenSpiel Integration:\n",
+    "\n",
+    "```python\n",
+    "# 1. Install OpenSpiel\n",
+    "!pip install open_spiel\n",
+    "\n",
+    "# 2. Import OpenEnv's integration\n",
+    "from envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n",
+    "\n",
+    "# 3. Connect to server (HTTP!)\n",
+    "env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "\n",
+    "# 4. Same API you just learned!\n",
+    "result = env.reset()\n",
+    "result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
+    "state = env.state()\n",
+    "\n",
+    "# 5. Switch games by changing game_name:\n",
+    "result = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n",
+    "```\n",
+    "\n",
+    "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
+    "\n",
+    "**🎮 6 Games Available:**\n",
+    "\n",
+    "1. `\"catch\"` - What we just built!\n",
+    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
+    "3. `\"kuhn_poker\"` - Imperfect information poker\n",
+    "4. `\"cliff_walking\"` - Grid navigation\n",
+    "5. `\"2048\"` - Tile puzzle\n",
+    "6. `\"blackjack\"` - Card game\n",
+    "\n",
+    "**All use the exact same interface!**\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-10\"></a>\n# Part 10: Create Your Own Integration 🛠️\n\n<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## The 5-Step Pattern\n\nWant to wrap your own environment in OpenEnv? Here's how:\n\n</div>\n\n### Step 1: Define Types (`models.py`)\n\n```python\nfrom dataclasses import dataclass\nfrom core.env_server import Action, Observation, State\n\n@dataclass\nclass YourAction(Action):\n    action_value: int\n    # Add your action fields\n\n@dataclass\nclass YourObservation(Observation):\n    state_data: List[float]\n    done: bool\n    reward: float\n    # Add your observation fields\n\n@dataclass\nclass YourState(State):\n    episode_id: str\n    step_count: int\n    # Add your state fields\n```\n\n### Step 2: Implement Environment (`server/environment.py`)\n\n```python\nfrom core.env_server import Environment\n\nclass YourEnvironment(Environment):\n    def reset(self) -> Observation:\n        # Initialize your game/simulation\n        return YourObservation(...)\n    \n    def step(self, action: Action) -> Observation:\n        # Execute action, update state\n        return YourObservation(...)\n    \n    @property\n    def state(self) -> State:\n        return self._state\n```\n\n### Step 3: Create Client (`client.py`)\n\n```python\nfrom core.http_env_client import HTTPEnvClient\nfrom core.types import StepResult\n\nclass YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n    def _step_payload(self, action: YourAction) -> dict:\n        \"\"\"Convert action to JSON\"\"\"\n        return {\"action_value\": action.action_value}\n    \n    def _parse_result(self, payload: dict) -> StepResult:\n        \"\"\"Parse JSON to observation\"\"\"\n        return StepResult(\n            observation=YourObservation(...),\n            reward=payload['reward'],\n            done=payload['done']\n        )\n    \n    def _parse_state(self, payload: dict) -> YourState:\n        return 
YourState(...)\n```\n\n### Step 4: Create Server (`server/app.py`)\n\n```python\nfrom core.env_server import create_fastapi_app\nfrom .your_environment import YourEnvironment\n\nenv = YourEnvironment()\napp = create_fastapi_app(env)\n\n# That's it! OpenEnv creates all endpoints for you.\n```\n\n### Step 5: Dockerize (`server/Dockerfile`)\n\n```dockerfile\nFROM python:3.11-slim\n\nWORKDIR /app\nCOPY requirements.txt .\nRUN pip install --no-cache-dir -r requirements.txt\n\nCOPY . .\nCMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n```\n\n<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n\n### 🎓 Examples to Study\n\nOpenEnv includes 3 complete examples:\n\n1. **`src/envs/echo_env/`**\n   - Simplest possible environment\n   - Great for testing and learning\n\n2. **`src/envs/openspiel_env/`**\n   - Wraps external library (OpenSpiel)\n   - Shows integration pattern\n   - 6 games in one integration\n\n3. **`src/envs/coding_env/`**\n   - Python code execution environment\n   - Shows complex use case\n   - Security considerations\n\n**💡 Study these to understand the patterns!**\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-10\"></a>\n",
+    "# Part 10: Create Your Own Integration 🛠️\n",
+    "\n",
+    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## The 5-Step Pattern\n",
+    "\n",
+    "Want to wrap your own environment in OpenEnv? Here's how:\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "### Step 1: Define Types (`models.py`)\n",
+    "\n",
+    "```python\n",
+    "from dataclasses import dataclass\n",
+    "from core.env_server import Action, Observation, State\n",
+    "\n",
+    "@dataclass\n",
+    "class YourAction(Action):\n",
+    "    action_value: int\n",
+    "    # Add your action fields\n",
+    "\n",
+    "@dataclass\n",
+    "class YourObservation(Observation):\n",
+    "    state_data: list[float]\n",
+    "    done: bool\n",
+    "    reward: float\n",
+    "    # Add your observation fields\n",
+    "\n",
+    "@dataclass\n",
+    "class YourState(State):\n",
+    "    episode_id: str\n",
+    "    step_count: int\n",
+    "    # Add your state fields\n",
+    "```\n",
+    "\n",
+    "### Step 2: Implement Environment (`server/environment.py`)\n",
+    "\n",
+    "```python\n",
+    "from core.env_server import Action, Environment, Observation, State\n",
+    "\n",
+    "class YourEnvironment(Environment):\n",
+    "    def reset(self) -> Observation:\n",
+    "        # Initialize your game/simulation\n",
+    "        return YourObservation(...)\n",
+    "    \n",
+    "    def step(self, action: Action) -> Observation:\n",
+    "        # Execute action, update state\n",
+    "        return YourObservation(...)\n",
+    "    \n",
+    "    @property\n",
+    "    def state(self) -> State:\n",
+    "        return self._state\n",
+    "```\n",
+    "\n",
+    "### Step 3: Create Client (`client.py`)\n",
+    "\n",
+    "```python\n",
+    "from core.http_env_client import HTTPEnvClient\n",
+    "from core.types import StepResult\n",
+    "\n",
+    "class YourEnv(HTTPEnvClient[YourAction, YourObservation]):\n",
+    "    def _step_payload(self, action: YourAction) -> dict:\n",
+    "        \"\"\"Convert action to JSON\"\"\"\n",
+    "        return {\"action_value\": action.action_value}\n",
+    "    \n",
+    "    def _parse_result(self, payload: dict) -> StepResult:\n",
+    "        \"\"\"Parse JSON to observation\"\"\"\n",
+    "        return StepResult(\n",
+    "            observation=YourObservation(...),\n",
+    "            reward=payload['reward'],\n",
+    "            done=payload['done']\n",
+    "        )\n",
+    "    \n",
+    "    def _parse_state(self, payload: dict) -> YourState:\n",
+    "        return YourState(...)\n",
+    "```\n",
+    "\n",
+    "### Step 4: Create Server (`server/app.py`)\n",
+    "\n",
+    "```python\n",
+    "from core.env_server import create_fastapi_app\n",
+    "from .environment import YourEnvironment\n",
+    "\n",
+    "env = YourEnvironment()\n",
+    "app = create_fastapi_app(env)\n",
+    "\n",
+    "# That's it! OpenEnv creates all endpoints for you.\n",
+    "```\n",
+    "\n",
+    "### Step 5: Dockerize (`server/Dockerfile`)\n",
+    "\n",
+    "```dockerfile\n",
+    "FROM python:3.11-slim\n",
+    "\n",
+    "WORKDIR /app\n",
+    "COPY requirements.txt .\n",
+    "RUN pip install --no-cache-dir -r requirements.txt\n",
+    "\n",
+    "COPY . .\n",
+    "CMD [\"uvicorn\", \"app:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8000\"]\n",
+    "```\n",
+    "\n",
+    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
+    "\n",
+    "### 🎓 Examples to Study\n",
+    "\n",
+    "OpenEnv includes 3 complete examples:\n",
+    "\n",
+    "1. **`src/envs/echo_env/`**\n",
+    "   - Simplest possible environment\n",
+    "   - Great for testing and learning\n",
+    "\n",
+    "2. **`src/envs/openspiel_env/`**\n",
+    "   - Wraps external library (OpenSpiel)\n",
+    "   - Shows integration pattern\n",
+    "   - 6 games in one integration\n",
+    "\n",
+    "3. **`src/envs/coding_env/`**\n",
+    "   - Python code execution environment\n",
+    "   - Shows complex use case\n",
+    "   - Security considerations\n",
+    "\n",
+    "**💡 Study these to understand the patterns!**\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"summary\"></a>\n<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n\n# 🎓 Summary: Your Journey\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"summary\"></a>\n",
+    "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
+    "\n",
+    "# 🎓 Summary: Your Journey\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -614,17 +1560,162 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "<a id=\"resources\"></a>\n## 📚 Resources\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n### 🔗 Essential Links\n\n- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n- **🔥 PyTorch**: https://pytorch.org/\n\n### 📖 Documentation Deep Dives\n\n- **Environment Creation Guide**: `src/envs/README.md`\n- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n- **Example Scripts**: `examples/`\n- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n\n### 🎓 Community & Support\n\n**Supported by amazing organizations:**\n- 🔥 Meta PyTorch\n- 🤗 Hugging Face\n- ⚡ Unsloth AI\n- 🌟 Reflection AI\n- 🚀 And many more!\n\n**License**: BSD 3-Clause (very permissive!)\n\n**Contributions**: Always welcome! Check out the issues tab.\n\n</div>\n\n---\n\n### 🌈 What's Next?\n\n1. ⭐ **Star the repo** to show support and stay updated\n2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n3. 🎮 **Explore** other OpenSpiel games\n4. 🛠️ **Build** your own environment integration\n5. 💬 **Share** what you build with the community!"
+   "source": [
+    "<a id=\"resources\"></a>\n",
+    "## 📚 Resources\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "### 🔗 Essential Links\n",
+    "\n",
+    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **🔥 PyTorch**: https://pytorch.org/\n",
+    "\n",
+    "### 📖 Documentation Deep Dives\n",
+    "\n",
+    "- **Environment Creation Guide**: `src/envs/README.md`\n",
+    "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
+    "- **Example Scripts**: `examples/`\n",
+    "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
+    "\n",
+    "### 🎓 Community & Support\n",
+    "\n",
+    "**Supported by amazing organizations:**\n",
+    "- 🔥 Meta PyTorch\n",
+    "- 🤗 Hugging Face\n",
+    "- ⚡ Unsloth AI\n",
+    "- 🌟 Reflection AI\n",
+    "- 🚀 And many more!\n",
+    "\n",
+    "**License**: BSD 3-Clause (very permissive!)\n",
+    "\n",
+    "**Contributions**: Always welcome! Check out the issues tab.\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### 🌈 What's Next?\n",
+    "\n",
+    "1. ⭐ **Star the repo** to show support and stay updated\n",
+    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. 🎮 **Explore** other OpenSpiel games\n",
+    "4. 🛠️ **Build** your own environment integration\n",
+    "5. 💬 **Share** what you build with the community!"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "## 📚 Resources\n\n<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n### 🔗 Essential Links\n\n- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n- **🔥 PyTorch**: https://pytorch.org/\n\n### 📖 Documentation Deep Dives\n\n- **Environment Creation Guide**: `src/envs/README.md`\n- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n- **Example Scripts**: `examples/`\n- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n\n### 🎓 Community & Support\n\n**Supported by amazing organizations:**\n- 🔥 Meta PyTorch\n- 🤗 Hugging Face\n- ⚡ Unsloth AI\n- 🌟 Reflection AI\n- 🚀 And many more!\n\n**License**: BSD 3-Clause (very permissive!)\n\n**Contributions**: Always welcome! Check out the issues tab.\n\n</div>\n\n---\n\n### 🌈 What's Next?\n\n1. ⭐ **Star the repo** to show support and stay updated\n2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n3. 🎮 **Explore** other OpenSpiel games\n4. 🛠️ **Build** your own environment integration\n5. 💬 **Share** what you build with the community!"
+   "source": [
+    "## 📚 Resources\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "### 🔗 Essential Links\n",
+    "\n",
+    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **🔥 PyTorch**: https://pytorch.org/\n",
+    "\n",
+    "### 📖 Documentation Deep Dives\n",
+    "\n",
+    "- **Environment Creation Guide**: `src/envs/README.md`\n",
+    "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
+    "- **Example Scripts**: `examples/`\n",
+    "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
+    "\n",
+    "### 🎓 Community & Support\n",
+    "\n",
+    "**Supported by amazing organizations:**\n",
+    "- 🔥 Meta PyTorch\n",
+    "- 🤗 Hugging Face\n",
+    "- ⚡ Unsloth AI\n",
+    "- 🌟 Reflection AI\n",
+    "- 🚀 And many more!\n",
+    "\n",
+    "**License**: BSD 3-Clause (very permissive!)\n",
+    "\n",
+    "**Contributions**: Always welcome! Check out the issues tab.\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### 🌈 What's Next?\n",
+    "\n",
+    "1. ⭐ **Star the repo** to show support and stay updated\n",
+    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. 🎮 **Explore** other OpenSpiel games\n",
+    "4. 🛠️ **Build** your own environment integration\n",
+    "5. 💬 **Share** what you build with the community!"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n\n# 🎉 Congratulations! You Did It! 🎉\n\n### You're now an OpenEnv expert!\n\n<br>\n\n## ✅ What You've Mastered:\n\n**🧠 Concepts**\n- How RL works (the observe-act-reward loop)\n- Why OpenEnv matters (production-ready RL)\n- How to use existing environments\n\n**🛠️ Practical Skills**\n- Creating new integrations\n- Building type-safe environments\n- Deploying to production\n\n**🎯 Real Experience**\n- Built a complete RL environment\n- Tested multiple policies\n- Watched learning happen in real-time!\n\n---\n\n### Now go build something amazing! 🚀\n\n**Welcome to the future of RL with PyTorch & OpenEnv**\n\n<br>\n\n[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n\n</div>\n\n---\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## 🌟 Want to Learn More?\n\n- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n- 🎮 Try the other example games\n- 💬 Join the community discussions\n- 🛠️ Build your own integration\n- 🚀 Deploy to production\n- ⭐ Star the repo to stay updated!\n\n**Happy coding! 🎊**\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
+    "\n",
+    "# 🎉 Congratulations! You Did It! 🎉\n",
+    "\n",
+    "### You're now an OpenEnv expert!\n",
+    "\n",
+    "<br>\n",
+    "\n",
+    "## ✅ What You've Mastered:\n",
+    "\n",
+    "**🧠 Concepts**\n",
+    "- How RL works (the observe-act-reward loop)\n",
+    "- Why OpenEnv matters (production-ready RL)\n",
+    "- How to use existing environments\n",
+    "\n",
+    "**🛠️ Practical Skills**\n",
+    "- Creating new integrations\n",
+    "- Building type-safe environments\n",
+    "- Deploying to production\n",
+    "\n",
+    "**🎯 Real Experience**\n",
+    "- Built a complete RL environment\n",
+    "- Tested multiple policies\n",
+    "- Watched learning happen in real time!\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### Now go build something amazing! 🚀\n",
+    "\n",
+    "**Welcome to the future of RL with PyTorch & OpenEnv**\n",
+    "\n",
+    "<br>\n",
+    "\n",
+    "[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "---\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## 🌟 Want to Learn More?\n",
+    "\n",
+    "- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
+    "- 🎮 Try the other example games\n",
+    "- 💬 Join the community discussions\n",
+    "- 🛠️ Build your own integration\n",
+    "- 🚀 Deploy to production\n",
+    "- ⭐ Star the repo to stay updated!\n",
+    "\n",
+    "**Happy coding! 🎊**\n",
+    "\n",
+    "</div>"
+   ]
   }
  ],
  "metadata": {
@@ -648,4 +1739,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
\ No newline at end of file
+}

From dab03bf69183468640754fc321c01b81a629f156 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:58:07 -0700
Subject: [PATCH 10/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 32 +-------------------------------
 1 file changed, 1 insertion(+), 31 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index f588c32..7ade722 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -76,37 +76,7 @@
     "\n",
     "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
     "> \n",
-    "> ⏱️ **Time**: ~30 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-1\"></a>\n",
-    "# Part 1: RL in 60 Seconds ⏱️\n",
-    "\n",
-    "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
-    "\n",
-    "**Reinforcement Learning is simpler than you think.**\n",
-    "\n",
-    "It's just a loop:\n",
-    "\n",
-    "```\n",
-    "while not done:\n",
-    "    observation = environment.observe()\n",
-    "    action = policy.choose(observation)\n",
-    "    reward = environment.step(action)\n",
-    "    policy.learn(reward)\n",
-    "```\n",
-    "\n",
-    "That's it. That's RL.\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "Let's see it in action:"
+    "> ⏱️ **Time**: ~5 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
    ]
   },
   {

From f017dc28cec80a748176b9a905b0324a6649eb43 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:58:52 -0700
Subject: [PATCH 11/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 7ade722..a756a02 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -344,24 +344,6 @@
     "</div>"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-3\"></a>\n",
-    "# Part 3: Setup 🛠️\n",
-    "\n",
-    "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
-    "\n",
-    "**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n",
-    "\n",
-    "**Running locally?** Make sure you're in the OpenEnv directory.\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},

From 4cb119bf5cc6acfa4f6bdd9bc70a740f42b0c4e3 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 13:59:45 -0700
Subject: [PATCH 12/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 34 ---------------------------------
 1 file changed, 34 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index a756a02..93fc8c9 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -361,40 +361,6 @@
     "</div>"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-4\"></a>\n",
-    "# Part 4: The OpenEnv Pattern 🏗️\n",
-    "\n",
-    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## Every OpenEnv Environment Has 3 Components:\n",
-    "\n",
-    "```\n",
-    "src/envs/your_env/\n",
-    "├── 📝 models.py          ← Type-safe contracts\n",
-    "│                           (Action, Observation, State)\n",
-    "│\n",
-    "├── 📱 client.py          ← What YOU import\n",
-    "│                           (HTTPEnvClient implementation)\n",
-    "│\n",
-    "└── 🖥️  server/\n",
-    "    ├── environment.py    ← Game/simulation logic\n",
-    "    ├── app.py            ← FastAPI server\n",
-    "    └── Dockerfile        ← Container definition\n",
-    "```\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "Let's explore the actual OpenEnv code to see how this works:"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},

From 5a44568f0e36c688bea4ad52d9c00629e09a3957 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:00:00 -0700
Subject: [PATCH 13/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 47 ---------------------------------
 1 file changed, 47 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 93fc8c9..3f7bf58 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -392,53 +392,6 @@
     "Let's explore the actual OpenEnv code to see how this works:"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-5\"></a>\n",
-    "# Part 5: Example Integration - OpenSpiel 🎮\n",
-    "\n",
-    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## What is OpenSpiel?\n",
-    "\n",
-    "**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n",
-    "\n",
-    "## OpenEnv's Integration\n",
-    "\n",
-    "We've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🎯 Single-Player**\n",
-    "1. **Catch** - Catch falling ball\n",
-    "2. **Cliff Walking** - Navigate grid\n",
-    "3. **2048** - Tile puzzle\n",
-    "4. **Blackjack** - Card game\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**👥 Multi-Player**\n",
-    "5. **Tic-Tac-Toe** - Classic 3×3\n",
-    "6. **Kuhn Poker** - Imperfect info poker\n",
-    "\n",
-    "</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "This shows how OpenEnv can wrap **any** existing RL library!\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},

From 1a88d0708f7bc0fa455827afb4b0db139cefb1ab Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:00:32 -0700
Subject: [PATCH 14/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 72 ---------------------------------
 1 file changed, 72 deletions(-)

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 3f7bf58..caa1267 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -272,78 +272,6 @@
     "</div>"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "# Part 2: The Problem with Traditional RL 😤\n",
-    "\n",
-    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## 🤔 Why Can't We Just Use OpenAI Gym?\n",
-    "\n",
-    "Good question! Gym is great for research, but production needs more...\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<th>Challenge</th>\n",
-    "<th>Traditional Approach</th>\n",
-    "<th>OpenEnv Solution</th>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Type Safety</b></td>\n",
-    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
-    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Isolation</b></td>\n",
-    "<td>❌ Same process (can crash your training)</td>\n",
-    "<td>✅ Docker containers (fully isolated)</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Deployment</b></td>\n",
-    "<td>❌ \"Works on my machine\" 🤷</td>\n",
-    "<td>✅ Same container everywhere 🐳</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Scaling</b></td>\n",
-    "<td>❌ Hard to distribute</td>\n",
-    "<td>✅ Deploy to Kubernetes ☸️</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Language</b></td>\n",
-    "<td>❌ Python only</td>\n",
-    "<td>✅ Any language (HTTP API) 🌐</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Debugging</b></td>\n",
-    "<td>❌ Cryptic numpy errors</td>\n",
-    "<td>✅ Clear type errors 🐛</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
-    "\n",
-    "## 💡 The OpenEnv Philosophy\n",
-    "\n",
-    "**\"RL environments should be like microservices\"**\n",
-    "\n",
-    "Think of it like this: You don't run your database in the same process as your web server, right? Same principle!\n",
-    "\n",
-    "- 🔒 **Isolated**: Run in containers (security + stability)\n",
-    "- 🌐 **Standard**: HTTP API, works everywhere\n",
-    "- 📦 **Versioned**: Docker images (reproducibility!)\n",
-    "- 🚀 **Scalable**: Deploy to cloud with one command\n",
-    "- 🛡️ **Type-safe**: Catch bugs before they happen\n",
-    "- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},

From cdc39e894eb4ef5d9d7195de96f475e731916173 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:06:18 -0700
Subject: [PATCH 15/19] fix repeats

---
 examples/OpenEnv_Tutorial.ipynb | 745 ++++++++++++++++++--------------
 fix_notebook.py                 | 171 ++++++++
 2 files changed, 595 insertions(+), 321 deletions(-)
 create mode 100644 fix_notebook.py

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index caa1267..126697f 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -12,13 +12,13 @@
     "\n",
     "# OpenEnv: Production RL Made Simple\n",
     "\n",
-    "### *From \"Hello World\" to RL Training in 5 Minutes* ✨\n",
+    "### *From \"Hello World\" to RL Training in 5 Minutes* \u2728\n",
     "\n",
     "---\n",
     "\n",
     "**What if RL environments were as easy to use as REST APIs?**\n",
     "\n",
-    "That's OpenEnv. Type-safe. Isolated. Production-ready. 🎯\n",
+    "That's OpenEnv. Type-safe. Isolated. Production-ready. \ud83c\udfaf\n",
     "\n",
     "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
     "[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n",
@@ -33,50 +33,50 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 📋 What You'll Learn\n",
+    "## \ud83d\udccb What You'll Learn\n",
     "\n",
     "<table>\n",
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**🎯 Part 1-2: The Fundamentals**\n",
-    "- ⚡ RL in 60 seconds\n",
-    "- 🤔 Why existing solutions fall short\n",
-    "- 💡 The OpenEnv solution\n",
+    "**\ud83c\udfaf Part 1-2: The Fundamentals**\n",
+    "- \u26a1 RL in 60 seconds\n",
+    "- \ud83e\udd14 Why existing solutions fall short\n",
+    "- \ud83d\udca1 The OpenEnv solution\n",
     "\n",
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**🏗️ Part 3-5: The Architecture**\n",
-    "- 🔧 How OpenEnv works\n",
-    "- 🔍 Exploring real code\n",
-    "- 🎮 OpenSpiel integration example\n",
+    "**\ud83c\udfd7\ufe0f Part 3-5: The Architecture**\n",
+    "- \ud83d\udd27 How OpenEnv works\n",
+    "- \ud83d\udd0d Exploring real code\n",
+    "- \ud83c\udfae OpenSpiel integration example\n",
     "\n",
     "</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**🎮 Part 6-8: Hands-On Demo**\n",
-    "- 🔨 Build a game environment\n",
-    "- 🤖 Test 4 different policies\n",
-    "- 👀 Watch learning happen live\n",
+    "**\ud83c\udfae Part 6-8: Hands-On Demo**\n",
+    "- \ud83d\udd28 Build a game environment\n",
+    "- \ud83e\udd16 Test 4 different policies\n",
+    "- \ud83d\udc40 Watch learning happen live\n",
     "\n",
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**🔧 Part 9-10: Going Further**\n",
-    "- 🚀 Use real OpenSpiel\n",
-    "- ✨ Create your own integration\n",
-    "- 🌐 Deploy to production\n",
+    "**\ud83d\udd27 Part 9-10: Going Further**\n",
+    "- \ud83d\ude80 Use real OpenSpiel\n",
+    "- \u2728 Create your own integration\n",
+    "- \ud83c\udf10 Deploy to production\n",
     "\n",
     "</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
+    "> \ud83d\udca1 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
     "> \n",
-    "> ⏱️ **Time**: ~5 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
+    "> \u23f1\ufe0f **Time**: ~5 minutes | \ud83d\udcca **Difficulty**: Beginner-friendly | \ud83c\udfaf **Outcome**: Production-ready RL knowledge"
    ]
   },
   {
@@ -85,7 +85,87 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 1: RL in 60 Seconds ⏱️\n",
+    "## \ud83d\udcd1 Table of Contents\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "**Quick Navigation** - Click any section to jump right there! \ud83c\udfaf\n",
+    "\n",
+    "### Foundation\n",
+    "- [Part 1: RL in 60 Seconds \u23f1\ufe0f](#part-1)\n",
+    "- [Part 2: The Problem with Traditional RL \ud83d\ude24](#part-2)\n",
+    "- [Part 3: Setup \ud83d\udee0\ufe0f](#part-3)\n",
+    "\n",
+    "### Architecture\n",
+    "- [Part 4: The OpenEnv Pattern \ud83c\udfd7\ufe0f](#part-4)\n",
+    "- [Part 5: Example Integration - OpenSpiel \ud83c\udfae](#part-5)\n",
+    "\n",
+    "### Hands-On Demo\n",
+    "- [Part 6: Interactive Demo \ud83c\udfae](#part-6)\n",
+    "- [Part 7: Four Policies \ud83e\udd16](#part-7)\n",
+    "- [Part 8: Policy Competition! \ud83c\udfc6](#part-8)\n",
+    "\n",
+    "### Advanced\n",
+    "- [Part 9: Using Real OpenSpiel \ud83c\udfae](#part-9)\n",
+    "- [Part 10: Create Your Own Integration \ud83d\udee0\ufe0f](#part-10)\n",
+    "\n",
+    "### Wrap Up\n",
+    "- [Summary: Your Journey \ud83c\udf93](#summary)\n",
+    "- [Resources \ud83d\udcda](#resources)\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Detect environment\n",
+    "try:\n",
+    "    import google.colab\n",
+    "    IN_COLAB = True\n",
+    "    print(\"\ud83c\udf10 Running in Google Colab - Perfect!\")\n",
+    "except ImportError:\n",
+    "    IN_COLAB = False\n",
+    "    print(\"\ud83d\udcbb Running locally - Nice!\")\n",
+    "\n",
+    "if IN_COLAB:\n",
+    "    print(\"\\n\ud83d\udce6 Cloning OpenEnv repository...\")\n",
+    "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
+    "    %cd OpenEnv\n",
+    "    \n",
+    "    print(\"\ud83d\udcda Installing dependencies (this takes ~10 seconds)...\")\n",
+    "    !pip install -q fastapi uvicorn requests\n",
+    "    \n",
+    "    import sys\n",
+    "    sys.path.insert(0, './src')\n",
+    "    print(\"\\n\u2705 Setup complete! Everything is ready to go! \ud83c\udf89\")\n",
+    "else:\n",
+    "    import sys\n",
+    "    from pathlib import Path\n",
+    "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
+    "    print(\"\u2705 Using local OpenEnv installation\")\n",
+    "\n",
+    "print(\"\\n\ud83d\ude80 Ready to explore OpenEnv and build amazing things!\")\n",
+    "print(\"\ud83d\udca1 Tip: Run cells top-to-bottom for the best experience.\\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": "---\n\n<a id=\"part-1\"></a>\n# Part 1: RL in 60 Seconds \u23f1\ufe0f\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n\n**Reinforcement Learning is simpler than you think.**\n\nIt's just a loop:\n\n```\nwhile not done:\n    observation = environment.observe()\n    action = policy.choose(observation)\n    reward, done = environment.step(action)\n    policy.learn(reward)\n```\n\nThat's it. That's RL.\n\n</div>\n\nLet's see it in action:",
+   "metadata": {}
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "# Part 1: RL in 60 Seconds \u23f1\ufe0f\n",
     "\n",
     "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
     "\n",
@@ -116,16 +196,16 @@
    "source": [
     "import random\n",
     "\n",
-    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
+    "print(\"\ud83c\udfb2 \" + \"=\"*58 + \" \ud83c\udfb2\")\n",
     "print(\"   Number Guessing Game - The Simplest RL Example\")\n",
-    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
+    "print(\"\ud83c\udfb2 \" + \"=\"*58 + \" \ud83c\udfb2\")\n",
     "\n",
     "# Environment setup\n",
     "target = random.randint(1, 10)\n",
     "guesses_left = 3\n",
     "\n",
-    "print(f\"\\n🎯 I'm thinking of a number between 1 and 10...\")\n",
-    "print(f\"💭 You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n",
+    "print(f\"\\n\ud83c\udfaf I'm thinking of a number between 1 and 10...\")\n",
+    "print(f\"\ud83d\udcad You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n",
     "\n",
     "# The RL Loop - Pure random policy (no learning!)\n",
     "while guesses_left > 0:\n",
@@ -133,21 +213,21 @@
     "    guess = random.randint(1, 10)\n",
     "    guesses_left -= 1\n",
     "    \n",
-    "    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n",
+    "    print(f\"\ud83d\udcad Guess #{3-guesses_left}: {guess}\", end=\" \u2192 \")\n",
     "    \n",
     "    # Reward signal (but we're not using it!)\n",
     "    if guess == target:\n",
-    "        print(\"🎉 Correct! +10 points\")\n",
+    "        print(\"\ud83c\udf89 Correct! +10 points\")\n",
     "        break\n",
     "    elif abs(guess - target) <= 2:\n",
-    "        print(\"🔥 Warm! (close)\")\n",
+    "        print(\"\ud83d\udd25 Warm! (close)\")\n",
     "    else:\n",
-    "        print(\"❄️  Cold! (far)\")\n",
+    "        print(\"\u2744\ufe0f  Cold! (far)\")\n",
     "else:\n",
-    "    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n",
+    "    print(f\"\\n\ud83d\udc94 Out of guesses. The number was {target}.\")\n",
     "\n",
     "print(\"\\n\" + \"=\"*62)\n",
-    "print(\"💡 This is RL: Observe → Act → Reward → Repeat\")\n",
+    "print(\"\ud83d\udca1 This is RL: Observe \u2192 Act \u2192 Reward \u2192 Repeat\")\n",
     "print(\"   But this policy is terrible! It doesn't learn from rewards.\")\n",
     "print(\"=\"*62 + \"\\n\")"
    ]
@@ -159,11 +239,11 @@
     "---\n",
     "\n",
     "<a id=\"part-2\"></a>\n",
-    "# Part 2: The Problem with Traditional RL 😤\n",
+    "# Part 2: The Problem with Traditional RL \ud83d\ude24\n",
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "## 🤔 Why Can't We Just Use OpenAI Gym?\n",
+    "## \ud83e\udd14 Why Can't We Just Use OpenAI Gym?\n",
     "\n",
     "Good question! Gym is great for research, but production needs more...\n",
     "\n",
@@ -177,50 +257,50 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
-    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
+    "<td>\u274c <code>obs[0][3]</code> - what is this?</td>\n",
+    "<td>\u2705 <code>obs.info_state</code> - IDE knows!</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Isolation</b></td>\n",
-    "<td>❌ Same process (can crash your training)</td>\n",
-    "<td>✅ Docker containers (fully isolated)</td>\n",
+    "<td>\u274c Same process (can crash your training)</td>\n",
+    "<td>\u2705 Docker containers (fully isolated)</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Deployment</b></td>\n",
-    "<td>❌ \"Works on my machine\" 🤷</td>\n",
-    "<td>✅ Same container everywhere 🐳</td>\n",
+    "<td>\u274c \"Works on my machine\" \ud83e\udd37</td>\n",
+    "<td>\u2705 Same container everywhere \ud83d\udc33</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Scaling</b></td>\n",
-    "<td>❌ Hard to distribute</td>\n",
-    "<td>✅ Deploy to Kubernetes ☸️</td>\n",
+    "<td>\u274c Hard to distribute</td>\n",
+    "<td>\u2705 Deploy to Kubernetes \u2638\ufe0f</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Language</b></td>\n",
-    "<td>❌ Python only</td>\n",
-    "<td>✅ Any language (HTTP API) 🌐</td>\n",
+    "<td>\u274c Python only</td>\n",
+    "<td>\u2705 Any language (HTTP API) \ud83c\udf10</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Debugging</b></td>\n",
-    "<td>❌ Cryptic numpy errors</td>\n",
-    "<td>✅ Clear type errors 🐛</td>\n",
+    "<td>\u274c Cryptic numpy errors</td>\n",
+    "<td>\u2705 Clear type errors \ud83d\udc1b</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "## 💡 The OpenEnv Philosophy\n",
+    "## \ud83d\udca1 The OpenEnv Philosophy\n",
     "\n",
     "**\"RL environments should be like microservices\"**\n",
     "\n",
     "Think of it like this: You don't run your database in the same process as your web server, right? Same principle!\n",
     "\n",
-    "- 🔒 **Isolated**: Run in containers (security + stability)\n",
-    "- 🌐 **Standard**: HTTP API, works everywhere\n",
-    "- 📦 **Versioned**: Docker images (reproducibility!)\n",
-    "- 🚀 **Scalable**: Deploy to cloud with one command\n",
-    "- 🛡️ **Type-safe**: Catch bugs before they happen\n",
-    "- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
+    "- \ud83d\udd12 **Isolated**: Run in containers (security + stability)\n",
+    "- \ud83c\udf10 **Standard**: HTTP API, works everywhere\n",
+    "- \ud83d\udce6 **Versioned**: Docker images (reproducibility!)\n",
+    "- \ud83d\ude80 **Scalable**: Deploy to cloud with one command\n",
+    "- \ud83d\udee1\ufe0f **Type-safe**: Catch bugs before they happen\n",
+    "- \ud83d\udd04 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
     "\n",
     "</div>"
    ]
@@ -232,34 +312,34 @@
     "### The Architecture\n",
     "\n",
     "```\n",
-    "┌────────────────────────────────────────────────────────────┐\n",
-    "│  YOUR TRAINING CODE                                        │\n",
-    "│                                                            │\n",
-    "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
-    "│  result = env.reset()           ← Type-safe!             │\n",
-    "│  result = env.step(action)      ← Type-safe!             │\n",
-    "│                                                            │\n",
-    "└─────────────────┬──────────────────────────────────────────┘\n",
-    "                  │\n",
-    "                  │  HTTP/JSON (Language-Agnostic)\n",
-    "                  │  POST /reset, POST /step, GET /state\n",
-    "                  │\n",
-    "┌─────────────────▼──────────────────────────────────────────┐\n",
-    "│  DOCKER CONTAINER                                          │\n",
-    "│                                                            │\n",
-    "│  ┌──────────────────────────────────────────────┐         │\n",
-    "│  │  FastAPI Server                              │         │\n",
-    "│  │  └─ Environment (reset, step, state)         │         │\n",
-    "│  │     └─ Your Game/Simulation Logic            │         │\n",
-    "│  └──────────────────────────────────────────────┘         │\n",
-    "│                                                            │\n",
-    "│  Isolated • Reproducible • Secure                          │\n",
-    "└────────────────────────────────────────────────────────────┘\n",
+    "\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n",
+    "\u2502  YOUR TRAINING CODE                                        \u2502\n",
+    "\u2502                                                            \u2502\n",
+    "\u2502  env = OpenSpielEnv(...)        \u2190 Import the client      \u2502\n",
+    "\u2502  result = env.reset()           \u2190 Type-safe!             \u2502\n",
+    "\u2502  result = env.step(action)      \u2190 Type-safe!             \u2502\n",
+    "\u2502                                                            \u2502\n",
+    "\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n",
+    "                  \u2502\n",
+    "                  \u2502  HTTP/JSON (Language-Agnostic)\n",
+    "                  \u2502  POST /reset, POST /step, GET /state\n",
+    "                  \u2502\n",
+    "\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u25bc\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n",
+    "\u2502  DOCKER CONTAINER                                          \u2502\n",
+    "\u2502                                                            \u2502\n",
+    "\u2502  \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510         \u2502\n",
+    "\u2502  \u2502  FastAPI Server                              \u2502         \u2502\n",
+    "\u2502  \u2502  \u2514\u2500 Environment (reset, step, state)         \u2502         \u2502\n",
+    "\u2502  \u2502     \u2514\u2500 Your Game/Simulation Logic            \u2502         \u2502\n",
+    "\u2502  \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518         \u2502\n",
+    "\u2502                                                            \u2502\n",
+    "\u2502  Isolated \u2022 Reproducible \u2022 Secure                          \u2502\n",
+    "\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n",
     "```\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
     "\n",
-    "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
+    "**\ud83c\udfaf Key Insight**: You never see HTTP details - just clean Python methods!\n",
     "\n",
     "```python\n",
     "env.reset()    # Under the hood: HTTP POST to /reset\n",
@@ -267,18 +347,23 @@
     "env.state()    # Under the hood: HTTP GET to /state\n",
     "```\n",
     "\n",
-    "The magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n",
+    "The magic? OpenEnv handles all the plumbing. You focus on RL! \u2728\n",
     "\n",
     "</div>"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": "---\n\n<a id=\"part-3\"></a>\n# Part 3: Setup \ud83d\udee0\ufe0f\n\n<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n\n**Running locally?** Make sure you're in the OpenEnv directory.\n\n</div>"
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "---\n",
     "\n",
-    "# Part 3: Setup 🛠️\n",
+    "# Part 3: Setup \ud83d\udee0\ufe0f\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
@@ -290,43 +375,76 @@
    ]
   },
   {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": "---\n\n<a id=\"part-4\"></a>\n# Part 4: The OpenEnv Pattern \ud83c\udfd7\ufe0f\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Every OpenEnv Environment Has 3 Components:\n\n```\nsrc/envs/your_env/\n\u251c\u2500\u2500 \ud83d\udcdd models.py          \u2190 Type-safe contracts\n\u2502                           (Action, Observation, State)\n\u2502\n\u251c\u2500\u2500 \ud83d\udcf1 client.py          \u2190 What YOU import\n\u2502                           (HTTPEnvClient implementation)\n\u2502\n\u2514\u2500\u2500 \ud83d\udda5\ufe0f  server/\n    \u251c\u2500\u2500 environment.py    \u2190 Game/simulation logic\n    \u251c\u2500\u2500 app.py            \u2190 FastAPI server\n    \u2514\u2500\u2500 Dockerfile        \u2190 Container definition\n```\n\n</div>\n\nLet's explore the actual OpenEnv code to see how this works:"
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {},
+   "outputs": [],
    "source": [
-    "---\n",
+    "# Import OpenEnv's core abstractions\n",
+    "from core.env_server import Environment, Action, Observation, State\n",
+    "from core.http_env_client import HTTPEnvClient\n",
     "\n",
-    "# Part 4: The OpenEnv Pattern 🏗️\n",
+    "print(\"=\"*70)\n",
+    "print(\"   \ud83e\udde9 OPENENV CORE ABSTRACTIONS\")\n",
+    "print(\"=\"*70)\n",
     "\n",
-    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "print(\"\"\"\n",
+    "\ud83d\udda5\ufe0f  SERVER SIDE (runs in Docker):\n",
     "\n",
-    "## Every OpenEnv Environment Has 3 Components:\n",
+    "    class Environment(ABC):\n",
+    "        '''Base class for all environment implementations'''\n",
+    "        \n",
+    "        @abstractmethod\n",
+    "        def reset(self) -> Observation:\n",
+    "            '''Start new episode'''\n",
+    "        \n",
+    "        @abstractmethod\n",
+    "        def step(self, action: Action) -> Observation:\n",
+    "            '''Execute action, return observation'''\n",
+    "        \n",
+    "        @property\n",
+    "        def state(self) -> State:\n",
+    "            '''Get episode metadata'''\n",
     "\n",
-    "```\n",
-    "src/envs/your_env/\n",
-    "├── 📝 models.py          ← Type-safe contracts\n",
-    "│                           (Action, Observation, State)\n",
-    "│\n",
-    "├── 📱 client.py          ← What YOU import\n",
-    "│                           (HTTPEnvClient implementation)\n",
-    "│\n",
-    "└── 🖥️  server/\n",
-    "    ├── environment.py    ← Game/simulation logic\n",
-    "    ├── app.py            ← FastAPI server\n",
-    "    └── Dockerfile        ← Container definition\n",
-    "```\n",
+    "\ud83d\udcf1 CLIENT SIDE (your training code):\n",
     "\n",
-    "</div>\n",
+    "    class HTTPEnvClient(ABC):\n",
+    "        '''Base class for HTTP clients'''\n",
+    "        \n",
+    "        def reset(self) -> StepResult:\n",
+    "            ...  # HTTP POST /reset\n",
+    "        \n",
+    "        def step(self, action) -> StepResult:\n",
+    "            ...  # HTTP POST /step\n",
+    "        \n",
+    "        def state(self) -> State:\n",
+    "            ...  # HTTP GET /state\n",
+    "\"\"\")\n",
     "\n",
-    "Let's explore the actual OpenEnv code to see how this works:"
+    "print(\"=\"*70)\n",
+    "print(\"\\n\u2728 Same interface on both sides - communication via HTTP!\")\n",
+    "print(\"\ud83c\udfaf You focus on RL, OpenEnv handles the infrastructure.\\n\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": "---\n\n<a id=\"part-5\"></a>\n# Part 5: Example Integration - OpenSpiel \ud83c\udfae\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What is OpenSpiel?\n\n**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n\n## OpenEnv's Integration\n\nWe've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**\ud83c\udfaf Single-Player**\n1. **Catch** - Catch falling ball\n2. **Cliff Walking** - Navigate grid\n3. **2048** - Tile puzzle\n4. **Blackjack** - Card game\n\n</td>\n<td width=\"50%\">\n\n**\ud83d\udc65 Multi-Player**\n5. **Tic-Tac-Toe** - Classic 3\u00d73\n6. **Kuhn Poker** - Imperfect info poker\n\n</td>\n</tr>\n</table>\n\nThis shows how OpenEnv can wrap **any** existing RL library!\n\n</div>"
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "---\n",
     "\n",
-    "# Part 5: Example Integration - OpenSpiel 🎮\n",
+    "# Part 5: Example Integration - OpenSpiel \ud83c\udfae\n",
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -342,7 +460,7 @@
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**🎯 Single-Player**\n",
+    "**\ud83c\udfaf Single-Player**\n",
     "1. **Catch** - Catch falling ball\n",
     "2. **Cliff Walking** - Navigate grid\n",
     "3. **2048** - Tile puzzle\n",
@@ -351,8 +469,8 @@
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**👥 Multi-Player**\n",
-    "5. **Tic-Tac-Toe** - Classic 3×3\n",
+    "**\ud83d\udc65 Multi-Player**\n",
+    "5. **Tic-Tac-Toe** - Classic 3\u00d73\n",
     "6. **Kuhn Poker** - Imperfect info poker\n",
     "\n",
     "</td>\n",
@@ -364,6 +482,13 @@
     "</div>"
    ]
   },
+  {
+   "cell_type": "code",
+   "source": "from envs.openspiel_env.client import OpenSpielEnv\n\nprint(\"=\"*70)\nprint(\"   \ud83d\udd0c HOW OPENENV WRAPS OPENSPIEL\")\nprint(\"=\"*70)\n\nprint(\"\"\"\nclass OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n    \n    def _step_payload(self, action: OpenSpielAction) -> dict:\n        '''Convert typed action to JSON for HTTP'''\n        return {\n            \"action_id\": action.action_id,\n            \"game_name\": action.game_name,\n        }\n    \n    def _parse_result(self, payload: dict) -> StepResult:\n        '''Parse HTTP JSON response into typed observation'''\n        return StepResult(\n            observation=OpenSpielObservation(...),\n            reward=payload['reward'],\n            done=payload['done']\n        )\n\n\"\"\")\n\nprint(\"\u2500\" * 70)\nprint(\"\\n\u2728 Usage (works for ALL OpenEnv environments):\")\nprint(\"\"\"\n  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n  \n  result = env.reset()\n  # Returns StepResult[OpenSpielObservation] - Type safe!\n  \n  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n  # Type checker knows this is valid!\n  \n  state = env.state()\n  # Returns OpenSpielState\n\"\"\")\n\nprint(\"\u2500\" * 70)\nprint(\"\\n\ud83c\udfaf This pattern works for ANY environment you want to wrap!\\n\")",
+   "metadata": {},
+   "execution_count": null,
+   "outputs": []
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -379,30 +504,30 @@
     "from dataclasses import fields\n",
     "\n",
     "print(\"=\"*70)\n",
-    "print(\"   🎮 OPENSPIEL INTEGRATION - TYPE-SAFE MODELS\")\n",
+    "print(\"   \ud83c\udfae OPENSPIEL INTEGRATION - TYPE-SAFE MODELS\")\n",
     "print(\"=\"*70)\n",
     "\n",
-    "print(\"\\n📤 OpenSpielAction (what you send):\")\n",
-    "print(\"   \" + \"─\" * 64)\n",
+    "print(\"\\n\ud83d\udce4 OpenSpielAction (what you send):\")\n",
+    "print(\"   \" + \"\u2500\" * 64)\n",
     "for field in fields(OpenSpielAction):\n",
-    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
     "\n",
-    "print(\"\\n📥 OpenSpielObservation (what you receive):\")\n",
-    "print(\"   \" + \"─\" * 64)\n",
+    "print(\"\\n\ud83d\udce5 OpenSpielObservation (what you receive):\")\n",
+    "print(\"   \" + \"\u2500\" * 64)\n",
     "for field in fields(OpenSpielObservation):\n",
-    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
     "\n",
-    "print(\"\\n📊 OpenSpielState (episode metadata):\")\n",
-    "print(\"   \" + \"─\" * 64)\n",
+    "print(\"\\n\ud83d\udcca OpenSpielState (episode metadata):\")\n",
+    "print(\"   \" + \"\u2500\" * 64)\n",
     "for field in fields(OpenSpielState):\n",
-    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
     "\n",
     "print(\"\\n\" + \"=\"*70)\n",
-    "print(\"\\n💡 Type safety means:\")\n",
-    "print(\"   ✅ Your IDE autocompletes these fields\")\n",
-    "print(\"   ✅ Typos are caught before running\")\n",
-    "print(\"   ✅ Refactoring is safe\")\n",
-    "print(\"   ✅ Self-documenting code\\n\")"
+    "print(\"\\n\ud83d\udca1 Type safety means:\")\n",
+    "print(\"   \u2705 Your IDE autocompletes these fields\")\n",
+    "print(\"   \u2705 Typos are caught before running\")\n",
+    "print(\"   \u2705 Refactoring is safe\")\n",
+    "print(\"   \u2705 Self-documenting code\\n\")"
    ]
   },
   {
@@ -415,44 +540,15 @@
     "\n",
     "The client **inherits from HTTPEnvClient** and implements 3 methods:\n",
     "\n",
-    "1. `_step_payload()` - Convert action → JSON\n",
-    "2. `_parse_result()` - Parse JSON → typed observation  \n",
-    "3. `_parse_state()` - Parse JSON → state\n",
+    "1. `_step_payload()` - Convert action \u2192 JSON\n",
+    "2. `_parse_result()` - Parse JSON \u2192 typed observation  \n",
+    "3. `_parse_state()` - Parse JSON \u2192 state\n",
     "\n",
     "That's it! The base class handles all HTTP communication.\n",
     "\n",
     "</div>"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-6\"></a>\n",
-    "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
-    "\n",
-    "# 🎮 Part 6: Interactive Demo\n",
-    "\n",
-    "### Now let's BUILD something!\n",
-    "\n",
-    "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
-    "then watch **4 different AI policies** compete for the championship! 🏆\n",
-    "\n",
-    "<br>\n",
-    "\n",
-    "**Get ready for:**\n",
-    "- ⚡ Live gameplay visualization\n",
-    "- 🤖 AI policy showdown\n",
-    "- 📊 Real-time learning metrics\n",
-    "- 🎯 Production-ready patterns\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -461,20 +557,20 @@
     "\n",
     "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
     "\n",
-    "# 🎮 Part 6: Interactive Demo\n",
+    "# \ud83c\udfae Part 6: Interactive Demo\n",
     "\n",
     "### Now let's BUILD something!\n",
     "\n",
     "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
-    "then watch **4 different AI policies** compete for the championship! 🏆\n",
+    "then watch **4 different AI policies** compete for the championship! \ud83c\udfc6\n",
     "\n",
     "<br>\n",
     "\n",
     "**Get ready for:**\n",
-    "- ⚡ Live gameplay visualization\n",
-    "- 🤖 AI policy showdown\n",
-    "- 📊 Real-time learning metrics\n",
-    "- 🎯 Production-ready patterns\n",
+    "- \u26a1 Live gameplay visualization\n",
+    "- \ud83e\udd16 AI policy showdown\n",
+    "- \ud83d\udcca Real-time learning metrics\n",
+    "- \ud83c\udfaf Production-ready patterns\n",
     "\n",
     "</div>"
    ]
@@ -483,18 +579,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## The Game: Catch 🔴🏓\n",
+    "## The Game: Catch \ud83d\udd34\ud83c\udfd3\n",
     "\n",
     "<table>\n",
     "<tr>\n",
     "<td width=\"40%\" style=\"text-align: center;\">\n",
     "\n",
     "```\n",
-    "⬜ ⬜ 🔴 ⬜ ⬜   \n",
-    "⬜ ⬜ ⬜ ⬜ ⬜   Ball\n",
-    "⬜ ⬜ ⬜ ⬜ ⬜   falls\n",
-    "⬜ ⬜ ⬜ ⬜ ⬜   down\n",
-    "⬜ ⬜ 🏓 ⬜ ⬜   \n",
+    "\u2b1c \u2b1c \ud83d\udd34 \u2b1c \u2b1c   \n",
+    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   Ball\n",
+    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   falls\n",
+    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   down\n",
+    "\u2b1c \u2b1c \ud83c\udfd3 \u2b1c \u2b1c   \n",
     "     Paddle\n",
     "```\n",
     "\n",
@@ -502,18 +598,18 @@
     "<td width=\"60%\">\n",
     "\n",
     "**Rules:**\n",
-    "- 5×5 grid\n",
+    "- 5\u00d75 grid\n",
     "- Ball falls from random column\n",
     "- Move paddle to catch it\n",
     "\n",
     "**Actions:**\n",
-    "- `0` = Move LEFT ⬅️\n",
-    "- `1` = STAY 🛑\n",
-    "- `2` = Move RIGHT ➡️\n",
+    "- `0` = Move LEFT \u2b05\ufe0f\n",
+    "- `1` = STAY \ud83d\uded1\n",
+    "- `2` = Move RIGHT \u27a1\ufe0f\n",
     "\n",
     "**Reward:**\n",
-    "- `+1` if caught 🎉\n",
-    "- `0` if missed 😢\n",
+    "- `+1` if caught \ud83c\udf89\n",
+    "- `0` if missed \ud83d\ude22\n",
     "\n",
     "</td>\n",
     "</tr>\n",
@@ -521,7 +617,7 @@
     "\n",
     "<div style=\"background-color: #d4edda; padding: 15px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "**🎯 Why This Game?**\n",
+    "**\ud83c\udfaf Why This Game?**\n",
     "- Simple rules (easy to understand)\n",
     "- Visual (see what's happening)\n",
     "- Fast episodes (~5 steps)\n",
@@ -531,6 +627,13 @@
     "</div>"
    ]
   },
+  {
+   "cell_type": "code",
+   "source": "# Create environment and start a new episode\n# (assumes the cell defining CatchEnvironment has already been run)\nenv = CatchEnvironment()\nobs = env.reset()\n\nprint(\"\ud83c\udfae \" + \"=\"*58 + \" \ud83c\udfae\")\nprint(\"   INITIAL GAME STATE\")\nprint(\"\ud83c\udfae \" + \"=\"*58 + \" \ud83c\udfae\\n\")\n\n# Visualize the game board\nenv.render()\n\n# Show game info\nprint(\"\\n\ud83d\udccd Game Info:\")\nprint(f\"   \ud83d\udd34 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\nprint(f\"   \ud83c\udfd3 Paddle at: column {obs.paddle_position}\")\n\nprint(\"\\n\ud83d\udcca Observation Details:\")\nprint(f\"   \u2022 Legal actions: {obs.legal_actions} \u2192 [LEFT, STAY, RIGHT]\")\nprint(f\"   \u2022 Info state size: {len(obs.info_state)} (5\u00d75 grid flattened)\")\nprint(f\"   \u2022 Episode done: {obs.done}\")\nprint(f\"   \u2022 Current reward: {obs.reward}\")\n\nprint(\"\\n\ud83d\udca1 The ball will fall down each step. Can your policy catch it?\")\nprint(\"=\"*62)",
+   "metadata": {},
+   "execution_count": null,
+   "outputs": []
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -566,13 +669,13 @@
     "    Catch game following OpenEnv's Environment pattern.\n",
     "    \n",
     "    In production:\n",
-    "      • Runs in Docker container\n",
-    "      • Accessed via HTTPEnvClient\n",
-    "      • Exposed via FastAPI server\n",
+    "      \u2022 Runs in Docker container\n",
+    "      \u2022 Accessed via HTTPEnvClient\n",
+    "      \u2022 Exposed via FastAPI server\n",
     "    \n",
     "    For this demo:\n",
-    "      • We run it locally to see internals\n",
-    "      • But the structure is identical!\n",
+    "      \u2022 We run it locally to see internals\n",
+    "      \u2022 But the structure is identical!\n",
     "    \"\"\"\n",
     "    \n",
     "    def __init__(self, grid_size=5):\n",
@@ -634,24 +737,24 @@
     "            line = \"  \"\n",
     "            for col in range(self.grid_size):\n",
     "                if row == self.ball_row and col == self.ball_col:\n",
-    "                    line += \"🔴 \"\n",
+    "                    line += \"\ud83d\udd34 \"\n",
     "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
-    "                    line += \"🏓 \"\n",
+    "                    line += \"\ud83c\udfd3 \"\n",
     "                else:\n",
-    "                    line += \"⬜ \"\n",
+    "                    line += \"\u2b1c \"\n",
     "            print(line)\n",
     "\n",
     "\n",
-    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
-    "print(\"   ✅ Environment Created Following OpenEnv Pattern!\")\n",
-    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
-    "print(\"\\n📋 What we just built:\")\n",
-    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
-    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
-    "print(\"   • render() → Visual display\")\n",
-    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
+    "print(\"\ud83c\udf89 \" + \"=\"*64 + \" \ud83c\udf89\")\n",
+    "print(\"   \u2705 Environment Created Following OpenEnv Pattern!\")\n",
+    "print(\"\ud83c\udf89 \" + \"=\"*64 + \" \ud83c\udf89\")\n",
+    "print(\"\\n\ud83d\udccb What we just built:\")\n",
+    "print(\"   \u2022 reset() \u2192 CatchObservation (type-safe!)\")\n",
+    "print(\"   \u2022 step(action) \u2192 CatchObservation (type-safe!)\")\n",
+    "print(\"   \u2022 render() \u2192 Visual display\")\n",
+    "print(\"\\n\ud83d\ude80 In production: This would run in Docker + FastAPI\")\n",
     "print(\"   But the structure is EXACTLY the same!\")\n",
-    "print(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
+    "print(\"\\n\ud83d\udca1 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
    ]
   },
   {
@@ -670,7 +773,7 @@
     "---\n",
     "\n",
     "<a id=\"part-7\"></a>\n",
-    "# Part 7: Four Policies 🤖\n",
+    "# Part 7: Four Policies \ud83e\udd16\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -683,22 +786,22 @@
     "<th width=\"25%\">Expected Performance</th>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🎲 Random</b></td>\n",
+    "<td><b>\ud83c\udfb2 Random</b></td>\n",
     "<td>Pick random action every step</td>\n",
     "<td>~20% (pure luck)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🛑 Always Stay</b></td>\n",
+    "<td><b>\ud83d\uded1 Always Stay</b></td>\n",
     "<td>Never move, hope ball lands in center</td>\n",
     "<td>~20% (terrible!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🧠 Smart</b></td>\n",
+    "<td><b>\ud83e\udde0 Smart</b></td>\n",
     "<td>Move paddle toward ball</td>\n",
     "<td>100% (optimal!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>📈 Learning</b></td>\n",
+    "<td><b>\ud83d\udcc8 Learning</b></td>\n",
     "<td>Start random, learn smart strategy</td>\n",
     "<td>~85% (improves over time)</td>\n",
     "</tr>\n",
@@ -713,7 +816,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 7: Four Policies 🤖\n",
+    "# Part 7: Four Policies \ud83e\udd16\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -726,22 +829,22 @@
     "<th width=\"25%\">Expected Performance</th>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🎲 Random</b></td>\n",
+    "<td><b>\ud83c\udfb2 Random</b></td>\n",
     "<td>Pick random action every step</td>\n",
     "<td>~20% (pure luck)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🛑 Always Stay</b></td>\n",
+    "<td><b>\ud83d\uded1 Always Stay</b></td>\n",
     "<td>Never move, hope ball lands in center</td>\n",
     "<td>~20% (terrible!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>🧠 Smart</b></td>\n",
+    "<td><b>\ud83e\udde0 Smart</b></td>\n",
     "<td>Move paddle toward ball</td>\n",
     "<td>100% (optimal!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>📈 Learning</b></td>\n",
+    "<td><b>\ud83d\udcc8 Learning</b></td>\n",
     "<td>Start random, learn smart strategy</td>\n",
     "<td>~85% (improves over time)</td>\n",
     "</tr>\n",
@@ -762,7 +865,7 @@
     "\n",
     "class RandomPolicy:\n",
     "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
-    "    name = \"🎲 Random Guesser\"\n",
+    "    name = \"\ud83c\udfb2 Random Guesser\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        return random.choice(obs.legal_actions)\n",
@@ -770,7 +873,7 @@
     "\n",
     "class AlwaysStayPolicy:\n",
     "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
-    "    name = \"🛑 Always Stay\"\n",
+    "    name = \"\ud83d\uded1 Always Stay\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        return 1  # STAY\n",
@@ -778,7 +881,7 @@
     "\n",
     "class SmartPolicy:\n",
     "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
-    "    name = \"🧠 Smart Heuristic\"\n",
+    "    name = \"\ud83e\udde0 Smart Heuristic\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        ball_col = obs.ball_position[1]\n",
@@ -794,7 +897,7 @@
     "\n",
     "class LearningPolicy:\n",
     "    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n",
-    "    name = \"📈 Learning Agent\"\n",
+    "    name = \"\ud83d\udcc8 Learning Agent\"\n",
     "    \n",
     "    def __init__(self):\n",
     "        self.steps = 0\n",
@@ -820,16 +923,16 @@
     "                return 1\n",
     "\n",
     "\n",
-    "print(\"🤖 \" + \"=\"*64 + \" 🤖\")\n",
-    "print(\"   ✅ 4 Policies Created!\")\n",
-    "print(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n",
+    "print(\"\ud83e\udd16 \" + \"=\"*64 + \" \ud83e\udd16\")\n",
+    "print(\"   \u2705 4 Policies Created!\")\n",
+    "print(\"\ud83e\udd16 \" + \"=\"*64 + \" \ud83e\udd16\\n\")\n",
     "\n",
     "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
     "for i, policy in enumerate(policies, 1):\n",
     "    print(f\"   {i}. {policy.name}\")\n",
     "\n",
-    "print(\"\\n💡 Each policy represents a different approach to solving the game!\")\n",
-    "print(\"   Let's see who performs best! 🏆\\n\")"
+    "print(\"\\n\ud83d\udca1 Each policy represents a different approach to solving the game!\")\n",
+    "print(\"   Let's see who performs best! \ud83c\udfc6\\n\")"
    ]
   },
   {
@@ -855,15 +958,15 @@
     "    \n",
     "    if visualize:\n",
     "        print(f\"\\n{'='*60}\")\n",
-    "        print(f\"   🎮 {policy.name}\")\n",
-    "        print(f\"   🔴 Ball will fall at column: {obs.ball_position[1]}\")\n",
+    "        print(f\"   \ud83c\udfae {policy.name}\")\n",
+    "        print(f\"   \ud83d\udd34 Ball will fall at column: {obs.ball_position[1]}\")\n",
     "        print('='*60 + '\\n')\n",
     "        env.render()\n",
     "        time.sleep(delay)\n",
     "    \n",
     "    total_reward = 0\n",
     "    step = 0\n",
-    "    action_names = [\"⬅️  LEFT\", \"🛑 STAY\", \"➡️  RIGHT\"]\n",
+    "    action_names = [\"\u2b05\ufe0f  LEFT\", \"\ud83d\uded1 STAY\", \"\u27a1\ufe0f  RIGHT\"]\n",
     "    \n",
     "    # THE RL LOOP\n",
     "    while not obs.done:\n",
@@ -877,14 +980,14 @@
     "        total_reward += obs.reward\n",
     "        \n",
     "        if visualize:\n",
-    "            print(f\"\\n📍 Step {step + 1}: {action_names[action]}\")\n",
+    "            print(f\"\\n\ud83d\udccd Step {step + 1}: {action_names[action]}\")\n",
     "            env.render()\n",
     "            time.sleep(delay)\n",
     "        \n",
     "        step += 1\n",
     "    \n",
     "    if visualize:\n",
-    "        result = \"🎉 CAUGHT!\" if total_reward > 0 else \"😢 MISSED\"\n",
+    "        result = \"\ud83c\udf89 CAUGHT!\" if total_reward > 0 else \"\ud83d\ude22 MISSED\"\n",
     "        print(f\"\\n{'='*60}\")\n",
     "        print(f\"   {result} Reward: {total_reward}\")\n",
     "        print('='*60)\n",
@@ -905,7 +1008,7 @@
     "---\n",
     "\n",
     "<a id=\"part-8\"></a>\n",
-    "# Part 8: Policy Competition! 🏆\n",
+    "# Part 8: Policy Competition! \ud83c\udfc6\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -920,7 +1023,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 8: Policy Competition! 🏆\n",
+    "# Part 8: Policy Competition! \ud83c\udfc6\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -944,49 +1047,49 @@
     "        LearningPolicy(),\n",
     "    ]\n",
     "    \n",
-    "    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n",
+    "    print(\"\\n\ud83c\udfc6 \" + \"=\"*66 + \" \ud83c\udfc6\")\n",
     "    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
-    "    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n",
+    "    print(\"\ud83c\udfc6 \" + \"=\"*66 + \" \ud83c\udfc6\\n\")\n",
     "    \n",
     "    results = []\n",
     "    for policy in policies:\n",
-    "        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n",
+    "        print(f\"\u26a1 Testing {policy.name}...\", end=\" \")\n",
     "        env = CatchEnvironment()\n",
     "        successes = sum(run_episode(env, policy, visualize=False) \n",
     "                       for _ in range(num_episodes))\n",
     "        success_rate = (successes / num_episodes) * 100\n",
     "        results.append((policy.name, success_rate, successes))\n",
-    "        print(f\"✓ Done!\")\n",
+    "        print(f\"\u2713 Done!\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"   📊 FINAL RESULTS\")\n",
+    "    print(\"   \ud83d\udcca FINAL RESULTS\")\n",
     "    print(\"=\"*70 + \"\\n\")\n",
     "    \n",
     "    # Sort by success rate (descending)\n",
     "    results.sort(key=lambda x: x[1], reverse=True)\n",
     "    \n",
     "    # Award medals to top 3\n",
-    "    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n",
+    "    medals = [\"\ud83e\udd47\", \"\ud83e\udd48\", \"\ud83e\udd49\", \"  \"]\n",
     "    \n",
     "    for i, (name, rate, successes) in enumerate(results):\n",
     "        medal = medals[i]\n",
-    "        bar = \"█\" * int(rate / 2)\n",
+    "        bar = \"\u2588\" * int(rate / 2)\n",
     "        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"\\n✨ Key Insights:\")\n",
-    "    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n",
-    "    print(\"   • Always Stay (~20%): Bad strategy - stays center 🛑\")\n",
-    "    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n",
-    "    print(\"   • Learning (~85%):    Improves over time 📈\")\n",
-    "    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n",
+    "    print(\"\\n\u2728 Key Insights:\")\n",
+    "    print(\"   \u2022 Random (~20%):      Baseline - pure luck \ud83c\udfb2\")\n",
+    "    print(\"   \u2022 Always Stay (~20%): Bad strategy - stays center \ud83d\uded1\")\n",
+    "    print(\"   \u2022 Smart (100%):       Optimal - perfect play! \ud83e\udde0\")\n",
+    "    print(\"   \u2022 Learning (~85%):    Improves over time \ud83d\udcc8\")\n",
+    "    print(\"\\n\ud83c\udf93 This is Reinforcement Learning in action:\")\n",
     "    print(\"   1. Start with exploration (trying random things)\")\n",
     "    print(\"   2. Learn from rewards (what works, what doesn't)\")\n",
     "    print(\"   3. Converge to optimal behavior (smart strategy)\")\n",
-    "    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n",
+    "    print(\"\\n\ud83c\udfaf The Learning Agent gets smarter with every episode!\\n\")\n",
     "\n",
     "# Run the epic competition!\n",
-    "print(\"🎮 Starting the showdown...\")\n",
+    "print(\"\ud83c\udfae Starting the showdown...\")\n",
     "evaluate_policies(num_episodes=50)"
    ]
   },
@@ -997,7 +1100,7 @@
     "---\n",
     "\n",
     "<a id=\"part-9\"></a>\n",
-    "# Part 9: Using Real OpenSpiel 🎮\n",
+    "# Part 9: Using Real OpenSpiel \ud83c\udfae\n",
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1026,8 +1129,8 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>✅ Dataclasses</td>\n",
-    "<td>✅ Dataclasses</td>\n",
+    "<td>\u2705 Dataclasses</td>\n",
+    "<td>\u2705 Dataclasses</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>API</b></td>\n",
@@ -1036,7 +1139,7 @@
     "</tr>\n",
     "</table>\n",
     "\n",
-    "**🎯 Same structure, production features!**\n",
+    "**\ud83c\udfaf Same structure, production features!**\n",
     "\n",
     "</div>\n",
     "\n",
@@ -1063,10 +1166,10 @@
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
-    "**🎮 6 Games Available:**\n",
+    "**\ud83c\udfae 6 Games Available:**\n",
     "\n",
     "1. `\"catch\"` - What we just built!\n",
-    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
+    "2. `\"tic_tac_toe\"` - Classic 3\u00d73\n",
     "3. `\"kuhn_poker\"` - Imperfect information poker\n",
     "4. `\"cliff_walking\"` - Grid navigation\n",
     "5. `\"2048\"` - Tile puzzle\n",
@@ -1084,7 +1187,7 @@
     "---\n",
     "\n",
     "<a id=\"part-10\"></a>\n",
-    "# Part 10: Create Your Own Integration 🛠️\n",
+    "# Part 10: Create Your Own Integration \ud83d\udee0\ufe0f\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1188,7 +1291,7 @@
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "### 🎓 Examples to Study\n",
+    "### \ud83c\udf93 Examples to Study\n",
     "\n",
     "OpenEnv includes 3 complete examples:\n",
     "\n",
@@ -1206,7 +1309,7 @@
     "   - Shows complex use case\n",
     "   - Security considerations\n",
     "\n",
-    "**💡 Study these to understand the patterns!**\n",
+    "**\ud83d\udca1 Study these to understand the patterns!**\n",
     "\n",
     "</div>"
    ]
@@ -1220,7 +1323,7 @@
     "<a id=\"summary\"></a>\n",
     "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# 🎓 Summary: Your Journey\n",
+    "# \ud83c\udf93 Summary: Your Journey\n",
     "\n",
     "</div>"
    ]
@@ -1233,7 +1336,7 @@
     "\n",
     "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# 🎓 Summary: Your Journey\n",
+    "# \ud83c\udf93 Summary: Your Journey\n",
     "\n",
     "</div>"
    ]
@@ -1248,19 +1351,19 @@
     "<tr>\n",
     "<td width=\"50%\" style=\"vertical-align: top;\">\n",
     "\n",
-    "### 📚 Concepts\n",
+    "### \ud83d\udcda Concepts\n",
     "\n",
-    "✅ **RL Fundamentals**\n",
+    "\u2705 **RL Fundamentals**\n",
     "- The observe-act-reward loop\n",
     "- What makes good policies\n",
     "- Exploration vs exploitation\n",
     "\n",
-    "✅ **OpenEnv Architecture**\n",
+    "\u2705 **OpenEnv Architecture**\n",
     "- Client-server separation\n",
     "- Type-safe contracts\n",
     "- HTTP communication layer\n",
     "\n",
-    "✅ **Production Patterns**\n",
+    "\u2705 **Production Patterns**\n",
     "- Docker isolation\n",
     "- API design\n",
     "- Reproducible deployments\n",
@@ -1268,19 +1371,19 @@
     "</td>\n",
     "<td width=\"50%\" style=\"vertical-align: top;\">\n",
     "\n",
-    "### 🛠️ Skills\n",
+    "### \ud83d\udee0\ufe0f Skills\n",
     "\n",
-    "✅ **Using Environments**\n",
+    "\u2705 **Using Environments**\n",
     "- Import OpenEnv clients\n",
     "- Call reset/step/state\n",
     "- Work with typed observations\n",
     "\n",
-    "✅ **Building Environments**\n",
+    "\u2705 **Building Environments**\n",
     "- Define type-safe models\n",
     "- Implement Environment class\n",
     "- Create HTTPEnvClient\n",
     "\n",
-    "✅ **Testing & Debugging**\n",
+    "\u2705 **Testing & Debugging**\n",
     "- Compare policies\n",
     "- Visualize episodes\n",
     "- Measure performance\n",
@@ -1305,45 +1408,45 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>❌ Arrays, dicts</td>\n",
-    "<td>✅ Dataclasses</td>\n",
-    "<td>🏆 OpenEnv</td>\n",
+    "<td>\u274c Arrays, dicts</td>\n",
+    "<td>\u2705 Dataclasses</td>\n",
+    "<td>\ud83c\udfc6 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Isolation</b></td>\n",
-    "<td>❌ Same process</td>\n",
-    "<td>✅ Docker</td>\n",
-    "<td>🏆 OpenEnv</td>\n",
+    "<td>\u274c Same process</td>\n",
+    "<td>\u2705 Docker</td>\n",
+    "<td>\ud83c\udfc6 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Deployment</b></td>\n",
-    "<td>❌ Manual setup</td>\n",
-    "<td>✅ K8s-ready</td>\n",
-    "<td>🏆 OpenEnv</td>\n",
+    "<td>\u274c Manual setup</td>\n",
+    "<td>\u2705 K8s-ready</td>\n",
+    "<td>\ud83c\udfc6 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Language</b></td>\n",
-    "<td>❌ Python only</td>\n",
-    "<td>✅ Any (HTTP)</td>\n",
-    "<td>🏆 OpenEnv</td>\n",
+    "<td>\u274c Python only</td>\n",
+    "<td>\u2705 Any (HTTP)</td>\n",
+    "<td>\ud83c\udfc6 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Reproducibility</b></td>\n",
-    "<td>❌ \"Works on my machine\"</td>\n",
-    "<td>✅ Same everywhere</td>\n",
-    "<td>🏆 OpenEnv</td>\n",
+    "<td>\u274c \"Works on my machine\"</td>\n",
+    "<td>\u2705 Same everywhere</td>\n",
+    "<td>\ud83c\udfc6 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Community</b></td>\n",
-    "<td>✅ Large ecosystem</td>\n",
-    "<td>🟡 Growing</td>\n",
-    "<td>🤝 Both!</td>\n",
+    "<td>\u2705 Large ecosystem</td>\n",
+    "<td>\ud83d\udfe1 Growing</td>\n",
+    "<td>\ud83e\udd1d Both!</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "**🎯 The Bottom Line**\n",
+    "**\ud83c\udfaf The Bottom Line**\n",
     "\n",
     "OpenEnv brings **production engineering** to RL:\n",
     "- Same environments work locally and in production\n",
@@ -1361,33 +1464,33 @@
    "metadata": {},
    "source": [
     "<a id=\"resources\"></a>\n",
-    "## 📚 Resources\n",
+    "## \ud83d\udcda Resources\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "### 🔗 Essential Links\n",
+    "### \ud83d\udd17 Essential Links\n",
     "\n",
-    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
-    "- **🔥 PyTorch**: https://pytorch.org/\n",
+    "- **\ud83c\udfe0 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **\ud83c\udfae OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **\u26a1 FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **\ud83d\udc33 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **\ud83d\udd25 PyTorch**: https://pytorch.org/\n",
     "\n",
-    "### 📖 Documentation Deep Dives\n",
+    "### \ud83d\udcd6 Documentation Deep Dives\n",
     "\n",
     "- **Environment Creation Guide**: `src/envs/README.md`\n",
     "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
     "- **Example Scripts**: `examples/`\n",
     "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
     "\n",
-    "### 🎓 Community & Support\n",
+    "### \ud83c\udf93 Community & Support\n",
     "\n",
     "**Supported by amazing organizations:**\n",
-    "- 🔥 Meta PyTorch\n",
-    "- 🤗 Hugging Face\n",
-    "- ⚡ Unsloth AI\n",
-    "- 🌟 Reflection AI\n",
-    "- 🚀 And many more!\n",
+    "- \ud83d\udd25 Meta PyTorch\n",
+    "- \ud83e\udd17 Hugging Face\n",
+    "- \u26a1 Unsloth AI\n",
+    "- \ud83c\udf1f Reflection AI\n",
+    "- \ud83d\ude80 And many more!\n",
     "\n",
     "**License**: BSD 3-Clause (very permissive!)\n",
     "\n",
@@ -1397,46 +1500,46 @@
     "\n",
     "---\n",
     "\n",
-    "### 🌈 What's Next?\n",
+    "### \ud83c\udf08 What's Next?\n",
     "\n",
-    "1. ⭐ **Star the repo** to show support and stay updated\n",
-    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
-    "3. 🎮 **Explore** other OpenSpiel games\n",
-    "4. 🛠️ **Build** your own environment integration\n",
-    "5. 💬 **Share** what you build with the community!"
+    "1. \u2b50 **Star the repo** to show support and stay updated\n",
+    "2. \ud83d\udd04 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. \ud83c\udfae **Explore** other OpenSpiel games\n",
+    "4. \ud83d\udee0\ufe0f **Build** your own environment integration\n",
+    "5. \ud83d\udcac **Share** what you build with the community!"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 📚 Resources\n",
+    "## \ud83d\udcda Resources\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "### 🔗 Essential Links\n",
+    "### \ud83d\udd17 Essential Links\n",
     "\n",
-    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
-    "- **🔥 PyTorch**: https://pytorch.org/\n",
+    "- **\ud83c\udfe0 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **\ud83c\udfae OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **\u26a1 FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **\ud83d\udc33 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **\ud83d\udd25 PyTorch**: https://pytorch.org/\n",
     "\n",
-    "### 📖 Documentation Deep Dives\n",
+    "### \ud83d\udcd6 Documentation Deep Dives\n",
     "\n",
     "- **Environment Creation Guide**: `src/envs/README.md`\n",
     "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
     "- **Example Scripts**: `examples/`\n",
     "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
     "\n",
-    "### 🎓 Community & Support\n",
+    "### \ud83c\udf93 Community & Support\n",
     "\n",
     "**Supported by amazing organizations:**\n",
-    "- 🔥 Meta PyTorch\n",
-    "- 🤗 Hugging Face\n",
-    "- ⚡ Unsloth AI\n",
-    "- 🌟 Reflection AI\n",
-    "- 🚀 And many more!\n",
+    "- \ud83d\udd25 Meta PyTorch\n",
+    "- \ud83e\udd17 Hugging Face\n",
+    "- \u26a1 Unsloth AI\n",
+    "- \ud83c\udf1f Reflection AI\n",
+    "- \ud83d\ude80 And many more!\n",
     "\n",
     "**License**: BSD 3-Clause (very permissive!)\n",
     "\n",
@@ -1446,13 +1549,13 @@
     "\n",
     "---\n",
     "\n",
-    "### 🌈 What's Next?\n",
+    "### \ud83c\udf08 What's Next?\n",
     "\n",
-    "1. ⭐ **Star the repo** to show support and stay updated\n",
-    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
-    "3. 🎮 **Explore** other OpenSpiel games\n",
-    "4. 🛠️ **Build** your own environment integration\n",
-    "5. 💬 **Share** what you build with the community!"
+    "1. \u2b50 **Star the repo** to show support and stay updated\n",
+    "2. \ud83d\udd04 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. \ud83c\udfae **Explore** other OpenSpiel games\n",
+    "4. \ud83d\udee0\ufe0f **Build** your own environment integration\n",
+    "5. \ud83d\udcac **Share** what you build with the community!"
    ]
   },
   {
@@ -1463,38 +1566,38 @@
     "\n",
     "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# 🎉 Congratulations! You Did It! 🎉\n",
+    "# \ud83c\udf89 Congratulations! You Did It! \ud83c\udf89\n",
     "\n",
     "### You're now an OpenEnv expert!\n",
     "\n",
     "<br>\n",
     "\n",
-    "## ✅ What You've Mastered:\n",
+    "## \u2705 What You've Mastered:\n",
     "\n",
-    "**🧠 Concepts**\n",
+    "**\ud83e\udde0 Concepts**\n",
     "- How RL works (the observe-act-reward loop)\n",
     "- Why OpenEnv matters (production-ready RL)\n",
     "- How to use existing environments\n",
     "\n",
-    "**🛠️ Practical Skills**\n",
+    "**\ud83d\udee0\ufe0f Practical Skills**\n",
     "- Creating new integrations\n",
     "- Building type-safe environments\n",
     "- Deploying to production\n",
     "\n",
-    "**🎯 Real Experience**\n",
+    "**\ud83c\udfaf Real Experience**\n",
     "- Built a complete RL environment\n",
     "- Tested multiple policies\n",
     "- Watched learning happen in real-time!\n",
     "\n",
     "---\n",
     "\n",
-    "### Now go build something amazing! 🚀\n",
+    "### Now go build something amazing! \ud83d\ude80\n",
     "\n",
     "**Welcome to the future of RL with PyTorch & OpenEnv**\n",
     "\n",
     "<br>\n",
     "\n",
-    "[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
+    "[![Star on GitHub](https://img.shields.io/badge/\u2b50_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
     "\n",
     "</div>\n",
     "\n",
@@ -1502,16 +1605,16 @@
     "\n",
     "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "## 🌟 Want to Learn More?\n",
+    "## \ud83c\udf1f Want to Learn More?\n",
     "\n",
-    "- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
-    "- 🎮 Try the other example games\n",
-    "- 💬 Join the community discussions\n",
-    "- 🛠️ Build your own integration\n",
-    "- 🚀 Deploy to production\n",
-    "- ⭐ Star the repo to stay updated!\n",
+    "- \ud83d\udcd6 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
+    "- \ud83c\udfae Try the other example games\n",
+    "- \ud83d\udcac Join the community discussions\n",
+    "- \ud83d\udee0\ufe0f Build your own integration\n",
+    "- \ud83d\ude80 Deploy to production\n",
+    "- \u2b50 Star the repo to stay updated!\n",
     "\n",
-    "**Happy coding! 🎊**\n",
+    "**Happy coding! \ud83c\udf8a**\n",
     "\n",
     "</div>"
    ]
@@ -1538,4 +1641,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
\ No newline at end of file
diff --git a/fix_notebook.py b/fix_notebook.py
new file mode 100644
index 0000000..1e73044
--- /dev/null
+++ b/fix_notebook.py
@@ -0,0 +1,171 @@
+#!/usr/bin/env python3
+import json
+
+# Read notebook
+with open('examples/OpenEnv_Tutorial.ipynb', 'r') as f:
+    nb = json.load(f)
+
+# Insert TOC after cell 1
+toc_cell = {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+        "---\n",
+        "\n",
+        "## 📑 Table of Contents\n",
+        "\n",
+        "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+        "\n",
+        "**Quick Navigation** - Click any section to jump right there! 🎯\n",
+        "\n",
+        "### Foundation\n",
+        "- [Part 1: RL in 60 Seconds ⏱️](#part-1)\n",
+        "- [Part 2: The Problem with Traditional RL 😤](#part-2)\n",
+        "- [Part 3: Setup 🛠️](#part-3)\n",
+        "\n",
+        "### Architecture\n",
+        "- [Part 4: The OpenEnv Pattern 🏗️](#part-4)\n",
+        "- [Part 5: Example Integration - OpenSpiel 🎮](#part-5)\n",
+        "\n",
+        "### Hands-On Demo\n",
+        "- [Part 6: Interactive Demo 🎮](#part-6)\n",
+        "- [Part 7: Four Policies 🤖](#part-7)\n",
+        "- [Part 8: Policy Competition! 🏆](#part-8)\n",
+        "\n",
+        "### Advanced\n",
+        "- [Part 9: Using Real OpenSpiel 🎮](#part-9)\n",
+        "- [Part 10: Create Your Own Integration 🛠️](#part-10)\n",
+        "\n",
+        "### Wrap Up\n",
+        "- [Summary: Your Journey 🎓](#summary)\n",
+        "- [Resources 📚](#resources)\n",
+        "\n",
+        "</div>\n",
+        "\n",
+        "---"
+    ]
+}
+
+# Insert setup code cell after Part 3 header
+setup_cell = {
+    "cell_type": "code",
+    "execution_count": None,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+        "# Detect environment\n",
+        "try:\n",
+        "    import google.colab\n",
+        "    IN_COLAB = True\n",
+        "    print(\"🌐 Running in Google Colab - Perfect!\")\n",
+        "except ImportError:\n",
+        "    IN_COLAB = False\n",
+        "    print(\"💻 Running locally - Nice!\")\n",
+        "\n",
+        "if IN_COLAB:\n",
+        "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
+        "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
+        "    %cd OpenEnv\n",
+        "    \n",
+        "    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n",
+        "    !pip install -q fastapi uvicorn requests\n",
+        "    \n",
+        "    import sys\n",
+        "    sys.path.insert(0, './src')\n",
+        "    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\n",
+        "else:\n",
+        "    import sys\n",
+        "    from pathlib import Path\n",
+        "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
+        "    print(\"✅ Using local OpenEnv installation\")\n",
+        "\n",
+        "print(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\n",
+        "print(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
+    ]
+}
+
+# Insert architecture diagram after Part 2
+arch_cell = {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+        "### The Architecture\n",
+        "\n",
+        "```\n",
+        "┌────────────────────────────────────────────────────────────┐\n",
+        "│  YOUR TRAINING CODE                                        │\n",
+        "│                                                            │\n",
+        "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
+        "│  result = env.reset()           ← Type-safe!             │\n",
+        "│  result = env.step(action)      ← Type-safe!             │\n",
+        "│                                                            │\n",
+        "└─────────────────┬──────────────────────────────────────────┘\n",
+        "                  │\n",
+        "                  │  HTTP/JSON (Language-Agnostic)\n",
+        "                  │  POST /reset, POST /step, GET /state\n",
+        "                  │\n",
+        "┌─────────────────▼──────────────────────────────────────────┐\n",
+        "│  DOCKER CONTAINER                                          │\n",
+        "│                                                            │\n",
+        "│  ┌──────────────────────────────────────────────┐         │\n",
+        "│  │  FastAPI Server                              │         │\n",
+        "│  │  └─ Environment (reset, step, state)         │         │\n",
+        "│  │     └─ Your Game/Simulation Logic            │         │\n",
+        "│  └──────────────────────────────────────────────┘         │\n",
+        "│                                                            │\n",
+        "│  Isolated • Reproducible • Secure                          │\n",
+        "└────────────────────────────────────────────────────────────┘\n",
+        "```\n",
+        "\n",
+        "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
+        "\n",
+        "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
+        "\n",
+        "```python\n",
+        "env.reset()    # Under the hood: HTTP POST to /reset\n",
+        "env.step(...)  # Under the hood: HTTP POST to /step\n",
+        "env.state()    # Under the hood: HTTP GET to /state\n",
+        "```\n",
+        "\n",
+        "The magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n",
+        "\n",
+        "</div>"
+    ]
+}
+
+# Check which cells exist
+has_toc = any('Table of Contents' in ''.join(cell.get('source', [])) for cell in nb['cells'])
+has_setup = any('IN_COLAB' in ''.join(cell.get('source', [])) for cell in nb['cells'])
+has_arch = any('┌───' in ''.join(cell.get('source', [])) and 'YOUR TRAINING CODE' in ''.join(cell.get('source', [])) for cell in nb['cells'])
+
+print(f"Current state:")
+print(f"  TOC present: {has_toc}")
+print(f"  Setup code present: {has_setup}")
+print(f"  Architecture diagram present: {has_arch}")
+
+# Insert TOC after cell 1 if missing
+if not has_toc:
+    nb['cells'].insert(2, toc_cell)
+    print("✅ Added TOC")
+
+# Find Part 3 header and add setup cell after it if missing
+if not has_setup:
+    for i, cell in enumerate(nb['cells']):
+        if 'Part 3: Setup' in ''.join(cell.get('source', [])) and cell['cell_type'] == 'markdown':
+            nb['cells'].insert(i + 1, setup_cell)
+            print("✅ Added setup code cell")
+            break
+
+# Find Part 2 and add architecture diagram if missing
+if not has_arch:
+    for i, cell in enumerate(nb['cells']):
+        if 'Part 2:' in ''.join(cell.get('source', [])) and 'The OpenEnv Philosophy' in ''.join(cell.get('source', [])):
+            nb['cells'].insert(i + 1, arch_cell)
+            print("✅ Added architecture diagram")
+            break
+
+# Save
+with open('examples/OpenEnv_Tutorial.ipynb', 'w') as f:
+    json.dump(nb, f, indent=1)
+
+print(f"\n✅ Notebook fixed! Total cells: {len(nb['cells'])}")

From c0ad1c223369c734701566b472256f8b647c6b8b Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:06:48 -0700
Subject: [PATCH 16/19] Delete fix_notebook.py

---
 fix_notebook.py | 171 ------------------------------------------------
 1 file changed, 171 deletions(-)
 delete mode 100644 fix_notebook.py
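
Reviewer note (not part of the commit): the emoji-to-`\uXXXX` churn in the earlier notebook hunks of this series is consistent with the script's `json.dump(nb, f, indent=1)` call, because Python's `json` module escapes non-ASCII characters by default. The `\ No newline at end of file` marker is likewise consistent with `json.dump` not writing a trailing newline. Below is a minimal sketch of a rewrite that would keep literal characters; `save_notebook` is a hypothetical helper name, not something in the repository.

```python
import json


def save_notebook(nb, path):
    """Hypothetical helper: write notebook JSON without escaping non-ASCII text."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps emoji as literal UTF-8 instead of \uXXXX pairs.
        json.dump(nb, f, indent=1, ensure_ascii=False)
        # json.dump does not add a trailing newline, so write one explicitly.
        f.write("\n")


# The two encodings differ on disk but decode to the same string.
title = "# Part 8: Policy Competition! 🏆"
escaped = json.dumps(title)                      # default: ensure_ascii=True
literal = json.dumps(title, ensure_ascii=False)  # keeps the emoji as-is
```

Both forms load back to identical text, so the escaping changes only the size of the diff, not the rendered notebook.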

diff --git a/fix_notebook.py b/fix_notebook.py
deleted file mode 100644
index 1e73044..0000000
--- a/fix_notebook.py
+++ /dev/null
@@ -1,171 +0,0 @@
-#!/usr/bin/env python3
-import json
-
-# Read notebook
-with open('examples/OpenEnv_Tutorial.ipynb', 'r') as f:
-    nb = json.load(f)
-
-# Insert TOC after cell 1
-toc_cell = {
-    "cell_type": "markdown",
-    "metadata": {},
-    "source": [
-        "---\n",
-        "\n",
-        "## 📑 Table of Contents\n",
-        "\n",
-        "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-        "\n",
-        "**Quick Navigation** - Click any section to jump right there! 🎯\n",
-        "\n",
-        "### Foundation\n",
-        "- [Part 1: RL in 60 Seconds ⏱️](#part-1)\n",
-        "- [Part 2: The Problem with Traditional RL 😤](#part-2)\n",
-        "- [Part 3: Setup 🛠️](#part-3)\n",
-        "\n",
-        "### Architecture\n",
-        "- [Part 4: The OpenEnv Pattern 🏗️](#part-4)\n",
-        "- [Part 5: Example Integration - OpenSpiel 🎮](#part-5)\n",
-        "\n",
-        "### Hands-On Demo\n",
-        "- [Part 6: Interactive Demo 🎮](#part-6)\n",
-        "- [Part 7: Four Policies 🤖](#part-7)\n",
-        "- [Part 8: Policy Competition! 🏆](#part-8)\n",
-        "\n",
-        "### Advanced\n",
-        "- [Part 9: Using Real OpenSpiel 🎮](#part-9)\n",
-        "- [Part 10: Create Your Own Integration 🛠️](#part-10)\n",
-        "\n",
-        "### Wrap Up\n",
-        "- [Summary: Your Journey 🎓](#summary)\n",
-        "- [Resources 📚](#resources)\n",
-        "\n",
-        "</div>\n",
-        "\n",
-        "---"
-    ]
-}
-
-# Insert setup code cell after Part 3 header
-setup_cell = {
-    "cell_type": "code",
-    "execution_count": None,
-    "metadata": {},
-    "outputs": [],
-    "source": [
-        "# Detect environment\n",
-        "try:\n",
-        "    import google.colab\n",
-        "    IN_COLAB = True\n",
-        "    print(\"🌐 Running in Google Colab - Perfect!\")\n",
-        "except ImportError:\n",
-        "    IN_COLAB = False\n",
-        "    print(\"💻 Running locally - Nice!\")\n",
-        "\n",
-        "if IN_COLAB:\n",
-        "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
-        "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
-        "    %cd OpenEnv\n",
-        "    \n",
-        "    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n",
-        "    !pip install -q fastapi uvicorn requests\n",
-        "    \n",
-        "    import sys\n",
-        "    sys.path.insert(0, './src')\n",
-        "    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\n",
-        "else:\n",
-        "    import sys\n",
-        "    from pathlib import Path\n",
-        "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
-        "    print(\"✅ Using local OpenEnv installation\")\n",
-        "\n",
-        "print(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\n",
-        "print(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
-    ]
-}
-
-# Insert architecture diagram after Part 2
-arch_cell = {
-    "cell_type": "markdown",
-    "metadata": {},
-    "source": [
-        "### The Architecture\n",
-        "\n",
-        "```\n",
-        "┌────────────────────────────────────────────────────────────┐\n",
-        "│  YOUR TRAINING CODE                                        │\n",
-        "│                                                            │\n",
-        "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
-        "│  result = env.reset()           ← Type-safe!             │\n",
-        "│  result = env.step(action)      ← Type-safe!             │\n",
-        "│                                                            │\n",
-        "└─────────────────┬──────────────────────────────────────────┘\n",
-        "                  │\n",
-        "                  │  HTTP/JSON (Language-Agnostic)\n",
-        "                  │  POST /reset, POST /step, GET /state\n",
-        "                  │\n",
-        "┌─────────────────▼──────────────────────────────────────────┐\n",
-        "│  DOCKER CONTAINER                                          │\n",
-        "│                                                            │\n",
-        "│  ┌──────────────────────────────────────────────┐         │\n",
-        "│  │  FastAPI Server                              │         │\n",
-        "│  │  └─ Environment (reset, step, state)         │         │\n",
-        "│  │     └─ Your Game/Simulation Logic            │         │\n",
-        "│  └──────────────────────────────────────────────┘         │\n",
-        "│                                                            │\n",
-        "│  Isolated • Reproducible • Secure                          │\n",
-        "└────────────────────────────────────────────────────────────┘\n",
-        "```\n",
-        "\n",
-        "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
-        "\n",
-        "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
-        "\n",
-        "```python\n",
-        "env.reset()    # Under the hood: HTTP POST to /reset\n",
-        "env.step(...)  # Under the hood: HTTP POST to /step\n",
-        "env.state()    # Under the hood: HTTP GET to /state\n",
-        "```\n",
-        "\n",
-        "The magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n",
-        "\n",
-        "</div>"
-    ]
-}
-
-# Check which cells exist
-has_toc = any('Table of Contents' in ''.join(cell.get('source', [])) for cell in nb['cells'])
-has_setup = any('IN_COLAB' in ''.join(cell.get('source', [])) for cell in nb['cells'])
-has_arch = any('┌───' in ''.join(cell.get('source', [])) and 'YOUR TRAINING CODE' in ''.join(cell.get('source', [])) for cell in nb['cells'])
-
-print(f"Current state:")
-print(f"  TOC present: {has_toc}")
-print(f"  Setup code present: {has_setup}")
-print(f"  Architecture diagram present: {has_arch}")
-
-# Insert TOC after cell 1 if missing
-if not has_toc:
-    nb['cells'].insert(2, toc_cell)
-    print("✅ Added TOC")
-
-# Find Part 3 header and add setup cell after it if missing
-if not has_setup:
-    for i, cell in enumerate(nb['cells']):
-        if 'Part 3: Setup' in ''.join(cell.get('source', [])) and cell['cell_type'] == 'markdown':
-            nb['cells'].insert(i + 1, setup_cell)
-            print("✅ Added setup code cell")
-            break
-
-# Find Part 2 and add architecture diagram if missing
-if not has_arch:
-    for i, cell in enumerate(nb['cells']):
-        if 'Part 2:' in ''.join(cell.get('source', [])) and 'The OpenEnv Philosophy' in ''.join(cell.get('source', [])):
-            nb['cells'].insert(i + 1, arch_cell)
-            print("✅ Added architecture diagram")
-            break
-
-# Save
-with open('examples/OpenEnv_Tutorial.ipynb', 'w') as f:
-    json.dump(nb, f, indent=1)
-
-print(f"\n✅ Notebook fixed! Total cells: {len(nb['cells'])}")

From 578e5a0c8b7172dcf8df2c3a6546442b991c433c Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:09:40 -0700
Subject: [PATCH 17/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 800 +++++++++++++++++++-------------
 1 file changed, 489 insertions(+), 311 deletions(-)
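
Note: the Part 5 cell reformatted in this patch prints a client that turns a typed action into a JSON payload and parses the JSON response back into a typed result. The sketch below illustrates that round trip on its own, assuming nothing about the real OpenEnv classes. `SketchAction`, `SketchObservation`, `SketchStepResult`, `step_payload`, and `parse_result` are hypothetical names, the `legal_actions` field is an assumption, and a hand-written dict stands in for the server response.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchAction:
    """Hypothetical typed action; field names mirror the notebook's printed example."""
    action_id: int
    game_name: str = "catch"


@dataclass
class SketchObservation:
    """Hypothetical typed observation (legal_actions is an assumed field)."""
    info_state: List[float] = field(default_factory=list)
    legal_actions: List[int] = field(default_factory=list)


@dataclass
class SketchStepResult:
    """Typed container for one step's outcome."""
    observation: SketchObservation
    reward: Optional[float]
    done: bool


def step_payload(action: SketchAction) -> Dict[str, Any]:
    """Serialize a typed action into the JSON body a client would POST to /step."""
    return {"action_id": action.action_id, "game_name": action.game_name}


def parse_result(payload: Dict[str, Any]) -> SketchStepResult:
    """Parse a JSON response into a typed result, tolerating missing keys."""
    obs = payload.get("observation", {})
    return SketchStepResult(
        observation=SketchObservation(
            info_state=list(obs.get("info_state", [])),
            legal_actions=list(obs.get("legal_actions", [])),
        ),
        reward=payload.get("reward"),
        done=bool(payload.get("done", False)),
    )


# A hand-written response stands in for the server; no network is involved.
fake_response = {
    "observation": {"info_state": [0.0, 1.0], "legal_actions": [0, 1, 2]},
    "reward": 1.0,
    "done": True,
}
body = step_payload(SketchAction(action_id=2))
result = parse_result(fake_response)
```

Keeping serialization and parsing in two small functions means the training loop only ever touches typed objects, which is the property the notebook's "Type Safety" row describes.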

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 126697f..68a4568 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -8,24 +8,28 @@
     "\n",
     "<img src=\"https://pytorch.org/assets/images/pytorch-logo.png\" width=\"200\" alt=\"PyTorch\">\n",
     "\n",
-    "Author: [Sanyam Bhutani](http://twitter.com/bhutanisanyam1/)\n",
+    "\n",
     "\n",
     "# OpenEnv: Production RL Made Simple\n",
     "\n",
-    "### *From \"Hello World\" to RL Training in 5 Minutes* \u2728\n",
+    "### *From \"Hello World\" to RL Training in 5 Minutes* ✨\n",
     "\n",
     "---\n",
     "\n",
     "**What if RL environments were as easy to use as REST APIs?**\n",
     "\n",
-    "That's OpenEnv. Type-safe. Isolated. Production-ready. \ud83c\udfaf\n",
+    "That's OpenEnv. Type-safe. Isolated. Production-ready. 🎯\n",
     "\n",
     "[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)\n",
     "[![License](https://img.shields.io/badge/License-BSD%203--Clause-green.svg)](https://opensource.org/licenses/BSD-3-Clause)\n",
     "[![PyTorch](https://img.shields.io/badge/PyTorch-EE4C2C?logo=pytorch&logoColor=white)](https://pytorch.org/)\n",
     "\n",
+    "Author: [Sanyam Bhutani](http://twitter.com/bhutanisanyam1/)\n",
+    "\n",
     "</div>\n",
     "\n",
+    "\n",
+    "\n",
     "---"
    ]
   },
@@ -33,50 +37,50 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## \ud83d\udccb What You'll Learn\n",
+    "## 📋 What You'll Learn\n",
     "\n",
     "<table>\n",
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83c\udfaf Part 1-2: The Fundamentals**\n",
-    "- \u26a1 RL in 60 seconds\n",
-    "- \ud83e\udd14 Why existing solutions fall short\n",
-    "- \ud83d\udca1 The OpenEnv solution\n",
+    "**🎯 Part 1-2: The Fundamentals**\n",
+    "- ⚡ RL in 60 seconds\n",
+    "- 🤔 Why existing solutions fall short\n",
+    "- 💡 The OpenEnv solution\n",
     "\n",
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83c\udfd7\ufe0f Part 3-5: The Architecture**\n",
-    "- \ud83d\udd27 How OpenEnv works\n",
-    "- \ud83d\udd0d Exploring real code\n",
-    "- \ud83c\udfae OpenSpiel integration example\n",
+    "**🏗️ Part 3-5: The Architecture**\n",
+    "- 🔧 How OpenEnv works\n",
+    "- 🔍 Exploring real code\n",
+    "- 🎮 OpenSpiel integration example\n",
     "\n",
     "</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83c\udfae Part 6-8: Hands-On Demo**\n",
-    "- \ud83d\udd28 Build a game environment\n",
-    "- \ud83e\udd16 Test 4 different policies\n",
-    "- \ud83d\udc40 Watch learning happen live\n",
+    "**🎮 Part 6-8: Hands-On Demo**\n",
+    "- 🔨 Build a game environment\n",
+    "- 🤖 Test 4 different policies\n",
+    "- 👀 Watch learning happen live\n",
     "\n",
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83d\udd27 Part 9-10: Going Further**\n",
-    "- \ud83d\ude80 Use real OpenSpiel\n",
-    "- \u2728 Create your own integration\n",
-    "- \ud83c\udf10 Deploy to production\n",
+    "**🔧 Part 9-10: Going Further**\n",
+    "- 🚀 Use real OpenSpiel\n",
+    "- ✨ Create your own integration\n",
+    "- 🌐 Deploy to production\n",
     "\n",
     "</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "> \ud83d\udca1 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
+    "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
     "> \n",
-    "> \u23f1\ufe0f **Time**: ~5 minutes | \ud83d\udcca **Difficulty**: Beginner-friendly | \ud83c\udfaf **Outcome**: Production-ready RL knowledge"
+    "> ⏱️ **Time**: ~5 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
    ]
   },
   {
@@ -85,33 +89,33 @@
    "source": [
     "---\n",
     "\n",
-    "## \ud83d\udcd1 Table of Contents\n",
+    "## 📑 Table of Contents\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "**Quick Navigation** - Click any section to jump right there! \ud83c\udfaf\n",
+    "**Quick Navigation** - Click any section to jump right there! 🎯\n",
     "\n",
     "### Foundation\n",
-    "- [Part 1: RL in 60 Seconds \u23f1\ufe0f](#part-1)\n",
-    "- [Part 2: The Problem with Traditional RL \ud83d\ude24](#part-2)\n",
-    "- [Part 3: Setup \ud83d\udee0\ufe0f](#part-3)\n",
+    "- [Part 1: RL in 60 Seconds ⏱️](#part-1)\n",
+    "- [Part 2: The Problem with Traditional RL 😤](#part-2)\n",
+    "- [Part 3: Setup 🛠️](#part-3)\n",
     "\n",
     "### Architecture\n",
-    "- [Part 4: The OpenEnv Pattern \ud83c\udfd7\ufe0f](#part-4)\n",
-    "- [Part 5: Example Integration - OpenSpiel \ud83c\udfae](#part-5)\n",
+    "- [Part 4: The OpenEnv Pattern 🏗️](#part-4)\n",
+    "- [Part 5: Example Integration - OpenSpiel 🎮](#part-5)\n",
     "\n",
     "### Hands-On Demo\n",
-    "- [Part 6: Interactive Demo \ud83c\udfae](#part-6)\n",
-    "- [Part 7: Four Policies \ud83e\udd16](#part-7)\n",
-    "- [Part 8: Policy Competition! \ud83c\udfc6](#part-8)\n",
+    "- [Part 6: Interactive Demo 🎮](#part-6)\n",
+    "- [Part 7: Four Policies 🤖](#part-7)\n",
+    "- [Part 8: Policy Competition! 🏆](#part-8)\n",
     "\n",
     "### Advanced\n",
-    "- [Part 9: Using Real OpenSpiel \ud83c\udfae](#part-9)\n",
-    "- [Part 10: Create Your Own Integration \ud83d\udee0\ufe0f](#part-10)\n",
+    "- [Part 9: Using Real OpenSpiel 🎮](#part-9)\n",
+    "- [Part 10: Create Your Own Integration 🛠️](#part-10)\n",
     "\n",
     "### Wrap Up\n",
-    "- [Summary: Your Journey \ud83c\udf93](#summary)\n",
-    "- [Resources \ud83d\udcda](#resources)\n",
+    "- [Summary: Your Journey 🎓](#summary)\n",
+    "- [Resources 📚](#resources)\n",
     "\n",
     "</div>\n",
     "\n",
@@ -128,36 +132,61 @@
     "try:\n",
     "    import google.colab\n",
     "    IN_COLAB = True\n",
-    "    print(\"\ud83c\udf10 Running in Google Colab - Perfect!\")\n",
+    "    print(\"🌐 Running in Google Colab - Perfect!\")\n",
     "except ImportError:\n",
     "    IN_COLAB = False\n",
-    "    print(\"\ud83d\udcbb Running locally - Nice!\")\n",
+    "    print(\"💻 Running locally - Nice!\")\n",
     "\n",
     "if IN_COLAB:\n",
-    "    print(\"\\n\ud83d\udce6 Cloning OpenEnv repository...\")\n",
+    "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
     "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
     "    %cd OpenEnv\n",
     "    \n",
-    "    print(\"\ud83d\udcda Installing dependencies (this takes ~10 seconds)...\")\n",
+    "    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n",
     "    !pip install -q fastapi uvicorn requests\n",
     "    \n",
     "    import sys\n",
     "    sys.path.insert(0, './src')\n",
-    "    print(\"\\n\u2705 Setup complete! Everything is ready to go! \ud83c\udf89\")\n",
+    "    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\n",
     "else:\n",
     "    import sys\n",
     "    from pathlib import Path\n",
     "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
-    "    print(\"\u2705 Using local OpenEnv installation\")\n",
+    "    print(\"✅ Using local OpenEnv installation\")\n",
     "\n",
-    "print(\"\\n\ud83d\ude80 Ready to explore OpenEnv and build amazing things!\")\n",
-    "print(\"\ud83d\udca1 Tip: Run cells top-to-bottom for the best experience.\\n\")"
+    "print(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\n",
+    "print(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
-   "source": "---\n\n<a id=\"part-1\"></a>\n# Part 1: RL in 60 Seconds \u23f1\ufe0f\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n\n**Reinforcement Learning is simpler than you think.**\n\nIt's just a loop:\n\n```\nwhile not done:\n    observation = environment.observe()\n    action = policy.choose(observation)\n    reward = environment.step(action)\n    policy.learn(reward)\n```\n\nThat's it. That's RL.\n\n</div>\n\nLet's see it in action:",
-   "metadata": {}
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-1\"></a>\n",
+    "# Part 1: RL in 60 Seconds ⏱️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
+    "\n",
+    "**Reinforcement Learning is simpler than you think.**\n",
+    "\n",
+    "It's just a loop:\n",
+    "\n",
+    "```\n",
+    "while not done:\n",
+    "    observation = environment.observe()\n",
+    "    action = policy.choose(observation)\n",
+    "    reward, done = environment.step(action)\n",
+    "    policy.learn(reward)\n",
+    "```\n",
+    "\n",
+    "That's it. That's RL.\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "Let's see it in action:"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -165,7 +194,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 1: RL in 60 Seconds \u23f1\ufe0f\n",
+    "# Part 1: RL in 60 Seconds ⏱️\n",
     "\n",
     "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
     "\n",
@@ -196,16 +225,16 @@
    "source": [
     "import random\n",
     "\n",
-    "print(\"\ud83c\udfb2 \" + \"=\"*58 + \" \ud83c\udfb2\")\n",
+    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
     "print(\"   Number Guessing Game - The Simplest RL Example\")\n",
-    "print(\"\ud83c\udfb2 \" + \"=\"*58 + \" \ud83c\udfb2\")\n",
+    "print(\"🎲 \" + \"=\"*58 + \" 🎲\")\n",
     "\n",
     "# Environment setup\n",
     "target = random.randint(1, 10)\n",
     "guesses_left = 3\n",
     "\n",
-    "print(f\"\\n\ud83c\udfaf I'm thinking of a number between 1 and 10...\")\n",
-    "print(f\"\ud83d\udcad You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n",
+    "print(f\"\\n🎯 I'm thinking of a number between 1 and 10...\")\n",
+    "print(f\"💭 You have {guesses_left} guesses. Let's see how random guessing works!\\n\")\n",
     "\n",
     "# The RL Loop - Pure random policy (no learning!)\n",
     "while guesses_left > 0:\n",
@@ -213,21 +242,21 @@
     "    guess = random.randint(1, 10)\n",
     "    guesses_left -= 1\n",
     "    \n",
-    "    print(f\"\ud83d\udcad Guess #{3-guesses_left}: {guess}\", end=\" \u2192 \")\n",
+    "    print(f\"💭 Guess #{3-guesses_left}: {guess}\", end=\" → \")\n",
     "    \n",
     "    # Reward signal (but we're not using it!)\n",
     "    if guess == target:\n",
-    "        print(\"\ud83c\udf89 Correct! +10 points\")\n",
+    "        print(\"🎉 Correct! +10 points\")\n",
     "        break\n",
     "    elif abs(guess - target) <= 2:\n",
-    "        print(\"\ud83d\udd25 Warm! (close)\")\n",
+    "        print(\"🔥 Warm! (close)\")\n",
     "    else:\n",
-    "        print(\"\u2744\ufe0f  Cold! (far)\")\n",
+    "        print(\"❄️  Cold! (far)\")\n",
     "else:\n",
-    "    print(f\"\\n\ud83d\udc94 Out of guesses. The number was {target}.\")\n",
+    "    print(f\"\\n💔 Out of guesses. The number was {target}.\")\n",
     "\n",
     "print(\"\\n\" + \"=\"*62)\n",
-    "print(\"\ud83d\udca1 This is RL: Observe \u2192 Act \u2192 Reward \u2192 Repeat\")\n",
+    "print(\"💡 This is RL: Observe → Act → Reward → Repeat\")\n",
     "print(\"   But this policy is terrible! It doesn't learn from rewards.\")\n",
     "print(\"=\"*62 + \"\\n\")"
    ]
@@ -239,11 +268,11 @@
     "---\n",
     "\n",
     "<a id=\"part-2\"></a>\n",
-    "# Part 2: The Problem with Traditional RL \ud83d\ude24\n",
+    "# Part 2: The Problem with Traditional RL 😤\n",
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "## \ud83e\udd14 Why Can't We Just Use OpenAI Gym?\n",
+    "## 🤔 Why Can't We Just Use OpenAI Gym?\n",
     "\n",
     "Good question! Gym is great for research, but production needs more...\n",
     "\n",
@@ -257,50 +286,50 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>\u274c <code>obs[0][3]</code> - what is this?</td>\n",
-    "<td>\u2705 <code>obs.info_state</code> - IDE knows!</td>\n",
+    "<td>❌ <code>obs[0][3]</code> - what is this?</td>\n",
+    "<td>✅ <code>obs.info_state</code> - IDE knows!</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Isolation</b></td>\n",
-    "<td>\u274c Same process (can crash your training)</td>\n",
-    "<td>\u2705 Docker containers (fully isolated)</td>\n",
+    "<td>❌ Same process (can crash your training)</td>\n",
+    "<td>✅ Docker containers (fully isolated)</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Deployment</b></td>\n",
-    "<td>\u274c \"Works on my machine\" \ud83e\udd37</td>\n",
-    "<td>\u2705 Same container everywhere \ud83d\udc33</td>\n",
+    "<td>❌ \"Works on my machine\" 🤷</td>\n",
+    "<td>✅ Same container everywhere 🐳</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Scaling</b></td>\n",
-    "<td>\u274c Hard to distribute</td>\n",
-    "<td>\u2705 Deploy to Kubernetes \u2638\ufe0f</td>\n",
+    "<td>❌ Hard to distribute</td>\n",
+    "<td>✅ Deploy to Kubernetes ☸️</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Language</b></td>\n",
-    "<td>\u274c Python only</td>\n",
-    "<td>\u2705 Any language (HTTP API) \ud83c\udf10</td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any language (HTTP API) 🌐</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Debugging</b></td>\n",
-    "<td>\u274c Cryptic numpy errors</td>\n",
-    "<td>\u2705 Clear type errors \ud83d\udc1b</td>\n",
+    "<td>❌ Cryptic numpy errors</td>\n",
+    "<td>✅ Clear type errors 🐛</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "## \ud83d\udca1 The OpenEnv Philosophy\n",
+    "## 💡 The OpenEnv Philosophy\n",
     "\n",
     "**\"RL environments should be like microservices\"**\n",
     "\n",
     "Think of it like this: You don't run your database in the same process as your web server, right? Same principle!\n",
     "\n",
-    "- \ud83d\udd12 **Isolated**: Run in containers (security + stability)\n",
-    "- \ud83c\udf10 **Standard**: HTTP API, works everywhere\n",
-    "- \ud83d\udce6 **Versioned**: Docker images (reproducibility!)\n",
-    "- \ud83d\ude80 **Scalable**: Deploy to cloud with one command\n",
-    "- \ud83d\udee1\ufe0f **Type-safe**: Catch bugs before they happen\n",
-    "- \ud83d\udd04 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
+    "- 🔒 **Isolated**: Run in containers (security + stability)\n",
+    "- 🌐 **Standard**: HTTP API, works everywhere\n",
+    "- 📦 **Versioned**: Docker images (reproducibility!)\n",
+    "- 🚀 **Scalable**: Deploy to cloud with one command\n",
+    "- 🛡️ **Type-safe**: Catch bugs before they happen\n",
+    "- 🔄 **Portable**: Works on Mac, Linux, Windows, Cloud\n",
     "\n",
     "</div>"
    ]
@@ -312,34 +341,34 @@
     "### The Architecture\n",
     "\n",
     "```\n",
-    "\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n",
-    "\u2502  YOUR TRAINING CODE                                        \u2502\n",
-    "\u2502                                                            \u2502\n",
-    "\u2502  env = OpenSpielEnv(...)        \u2190 Import the client      \u2502\n",
-    "\u2502  result = env.reset()           \u2190 Type-safe!             \u2502\n",
-    "\u2502  result = env.step(action)      \u2190 Type-safe!             \u2502\n",
-    "\u2502                                                            \u2502\n",
-    "\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n",
-    "                  \u2502\n",
-    "                  \u2502  HTTP/JSON (Language-Agnostic)\n",
-    "                  \u2502  POST /reset, POST /step, GET /state\n",
-    "                  \u2502\n",
-    "\u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u25bc\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n",
-    "\u2502  DOCKER CONTAINER                                          \u2502\n",
-    "\u2502                                                            \u2502\n",
-    "\u2502  \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510         \u2502\n",
-    "\u2502  \u2502  FastAPI Server                              \u2502         \u2502\n",
-    "\u2502  \u2502  \u2514\u2500 Environment (reset, step, state)         \u2502         \u2502\n",
-    "\u2502  \u2502     \u2514\u2500 Your Game/Simulation Logic            \u2502         \u2502\n",
-    "\u2502  \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518         \u2502\n",
-    "\u2502                                                            \u2502\n",
-    "\u2502  Isolated \u2022 Reproducible \u2022 Secure                          \u2502\n",
-    "\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n",
+    "┌────────────────────────────────────────────────────────────┐\n",
+    "│  YOUR TRAINING CODE                                        │\n",
+    "│                                                            │\n",
+    "│  env = OpenSpielEnv(...)        ← Import the client      │\n",
+    "│  result = env.reset()           ← Type-safe!             │\n",
+    "│  result = env.step(action)      ← Type-safe!             │\n",
+    "│                                                            │\n",
+    "└─────────────────┬──────────────────────────────────────────┘\n",
+    "                  │\n",
+    "                  │  HTTP/JSON (Language-Agnostic)\n",
+    "                  │  POST /reset, POST /step, GET /state\n",
+    "                  │\n",
+    "┌─────────────────▼──────────────────────────────────────────┐\n",
+    "│  DOCKER CONTAINER                                          │\n",
+    "│                                                            │\n",
+    "│  ┌──────────────────────────────────────────────┐         │\n",
+    "│  │  FastAPI Server                              │         │\n",
+    "│  │  └─ Environment (reset, step, state)         │         │\n",
+    "│  │     └─ Your Game/Simulation Logic            │         │\n",
+    "│  └──────────────────────────────────────────────┘         │\n",
+    "│                                                            │\n",
+    "│  Isolated • Reproducible • Secure                          │\n",
+    "└────────────────────────────────────────────────────────────┘\n",
     "```\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 15px; border-left: 5px solid #0366d6; margin: 20px 0;\">\n",
     "\n",
-    "**\ud83c\udfaf Key Insight**: You never see HTTP details - just clean Python methods!\n",
+    "**🎯 Key Insight**: You never see HTTP details - just clean Python methods!\n",
     "\n",
     "```python\n",
     "env.reset()    # Under the hood: HTTP POST to /reset\n",
@@ -347,7 +376,7 @@
     "env.state()    # Under the hood: HTTP GET to /state\n",
     "```\n",
     "\n",
-    "The magic? OpenEnv handles all the plumbing. You focus on RL! \u2728\n",
+    "The magic? OpenEnv handles all the plumbing. You focus on RL! ✨\n",
     "\n",
     "</div>"
    ]
@@ -355,7 +384,20 @@
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-3\"></a>\n# Part 3: Setup \ud83d\udee0\ufe0f\n\n<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n\n**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n\n**Running locally?** Make sure you're in the OpenEnv directory.\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-3\"></a>\n",
+    "# Part 3: Setup 🛠️\n",
+    "\n",
+    "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
+    "\n",
+    "**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n",
+    "\n",
+    "**Running locally?** Make sure you're in the OpenEnv directory.\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -363,7 +405,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 3: Setup \ud83d\udee0\ufe0f\n",
+    "# Part 3: Setup 🛠️\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
@@ -379,7 +421,34 @@
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": "---\n\n<a id=\"part-4\"></a>\n# Part 4: The OpenEnv Pattern \ud83c\udfd7\ufe0f\n\n<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## Every OpenEnv Environment Has 3 Components:\n\n```\nsrc/envs/your_env/\n\u251c\u2500\u2500 \ud83d\udcdd models.py          \u2190 Type-safe contracts\n\u2502                           (Action, Observation, State)\n\u2502\n\u251c\u2500\u2500 \ud83d\udcf1 client.py          \u2190 What YOU import\n\u2502                           (HTTPEnvClient implementation)\n\u2502\n\u2514\u2500\u2500 \ud83d\udda5\ufe0f  server/\n    \u251c\u2500\u2500 environment.py    \u2190 Game/simulation logic\n    \u251c\u2500\u2500 app.py            \u2190 FastAPI server\n    \u2514\u2500\u2500 Dockerfile        \u2190 Container definition\n```\n\n</div>\n\nLet's explore the actual OpenEnv code to see how this works:"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-4\"></a>\n",
+    "# Part 4: The OpenEnv Pattern 🏗️\n",
+    "\n",
+    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## Every OpenEnv Environment Has 3 Components:\n",
+    "\n",
+    "```\n",
+    "src/envs/your_env/\n",
+    "├── 📝 models.py          ← Type-safe contracts\n",
+    "│                           (Action, Observation, State)\n",
+    "│\n",
+    "├── 📱 client.py          ← What YOU import\n",
+    "│                           (HTTPEnvClient implementation)\n",
+    "│\n",
+    "└── 🖥️  server/\n",
+    "    ├── environment.py    ← Game/simulation logic\n",
+    "    ├── app.py            ← FastAPI server\n",
+    "    └── Dockerfile        ← Container definition\n",
+    "```\n",
+    "\n",
+    "</div>\n",
+    "\n",
+    "Let's explore the actual OpenEnv code to see how this works:"
+   ]
   },
   {
    "cell_type": "code",
@@ -392,11 +461,11 @@
     "from core.http_env_client import HTTPEnvClient\n",
     "\n",
     "print(\"=\"*70)\n",
-    "print(\"   \ud83e\udde9 OPENENV CORE ABSTRACTIONS\")\n",
+    "print(\"   🧩 OPENENV CORE ABSTRACTIONS\")\n",
     "print(\"=\"*70)\n",
     "\n",
     "print(\"\"\"\n",
-    "\ud83d\udda5\ufe0f  SERVER SIDE (runs in Docker):\n",
+    "🖥️  SERVER SIDE (runs in Docker):\n",
     "\n",
     "    class Environment(ABC):\n",
     "        '''Base class for all environment implementations'''\n",
@@ -413,7 +482,7 @@
     "        def state(self) -> State:\n",
     "            '''Get episode metadata'''\n",
     "\n",
-    "\ud83d\udcf1 CLIENT SIDE (your training code):\n",
+    "📱 CLIENT SIDE (your training code):\n",
     "\n",
     "    class HTTPEnvClient(ABC):\n",
     "        '''Base class for HTTP clients'''\n",
@@ -429,14 +498,54 @@
     "\"\"\")\n",
     "\n",
     "print(\"=\"*70)\n",
-    "print(\"\\n\u2728 Same interface on both sides - communication via HTTP!\")\n",
-    "print(\"\ud83c\udfaf You focus on RL, OpenEnv handles the infrastructure.\\n\")"
+    "print(\"\\n✨ Same interface on both sides - communication via HTTP!\")\n",
+    "print(\"🎯 You focus on RL, OpenEnv handles the infrastructure.\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": "---\n\n<a id=\"part-5\"></a>\n# Part 5: Example Integration - OpenSpiel \ud83c\udfae\n\n<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n\n## What is OpenSpiel?\n\n**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n\n## OpenEnv's Integration\n\nWe've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n\n<table>\n<tr>\n<td width=\"50%\">\n\n**\ud83c\udfaf Single-Player**\n1. **Catch** - Catch falling ball\n2. **Cliff Walking** - Navigate grid\n3. **2048** - Tile puzzle\n4. **Blackjack** - Card game\n\n</td>\n<td width=\"50%\">\n\n**\ud83d\udc65 Multi-Player**\n5. **Tic-Tac-Toe** - Classic 3\u00d73\n6. **Kuhn Poker** - Imperfect info poker\n\n</td>\n</tr>\n</table>\n\nThis shows how OpenEnv can wrap **any** existing RL library!\n\n</div>"
+   "source": [
+    "---\n",
+    "\n",
+    "<a id=\"part-5\"></a>\n",
+    "# Part 5: Example Integration - OpenSpiel 🎮\n",
+    "\n",
+    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
+    "\n",
+    "## What is OpenSpiel?\n",
+    "\n",
+    "**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n",
+    "\n",
+    "## OpenEnv's Integration\n",
+    "\n",
+    "We've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n",
+    "\n",
+    "<table>\n",
+    "<tr>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**🎯 Single-Player**\n",
+    "1. **Catch** - Catch falling ball\n",
+    "2. **Cliff Walking** - Navigate grid\n",
+    "3. **2048** - Tile puzzle\n",
+    "4. **Blackjack** - Card game\n",
+    "\n",
+    "</td>\n",
+    "<td width=\"50%\">\n",
+    "\n",
+    "**👥 Multi-Player**\n",
+    "5. **Tic-Tac-Toe** - Classic 3×3\n",
+    "6. **Kuhn Poker** - Imperfect info poker\n",
+    "\n",
+    "</td>\n",
+    "</tr>\n",
+    "</table>\n",
+    "\n",
+    "This shows how OpenEnv can wrap **any** existing RL library!\n",
+    "\n",
+    "</div>"
+   ]
   },
   {
    "cell_type": "markdown",
@@ -444,7 +553,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 5: Example Integration - OpenSpiel \ud83c\udfae\n",
+    "# Part 5: Example Integration - OpenSpiel 🎮\n",
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -460,7 +569,7 @@
     "<tr>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83c\udfaf Single-Player**\n",
+    "**🎯 Single-Player**\n",
     "1. **Catch** - Catch falling ball\n",
     "2. **Cliff Walking** - Navigate grid\n",
     "3. **2048** - Tile puzzle\n",
@@ -469,8 +578,8 @@
     "</td>\n",
     "<td width=\"50%\">\n",
     "\n",
-    "**\ud83d\udc65 Multi-Player**\n",
-    "5. **Tic-Tac-Toe** - Classic 3\u00d73\n",
+    "**👥 Multi-Player**\n",
+    "5. **Tic-Tac-Toe** - Classic 3×3\n",
     "6. **Kuhn Poker** - Imperfect info poker\n",
     "\n",
     "</td>\n",
@@ -484,10 +593,54 @@
   },
   {
    "cell_type": "code",
-   "source": "from envs.openspiel_env.client import OpenSpielEnv\n\nprint(\"=\"*70)\nprint(\"   \ud83d\udd0c HOW OPENENV WRAPS OPENSPIEL\")\nprint(\"=\"*70)\n\nprint(\"\"\"\nclass OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n    \n    def _step_payload(self, action: OpenSpielAction) -> dict:\n        '''Convert typed action to JSON for HTTP'''\n        return {\n            \"action_id\": action.action_id,\n            \"game_name\": action.game_name,\n        }\n    \n    def _parse_result(self, payload: dict) -> StepResult:\n        '''Parse HTTP JSON response into typed observation'''\n        return StepResult(\n            observation=OpenSpielObservation(...),\n            reward=payload['reward'],\n            done=payload['done']\n        )\n\n\"\"\")\n\nprint(\"\u2500\" * 70)\nprint(\"\\n\u2728 Usage (works for ALL OpenEnv environments):\")\nprint(\"\"\"\n  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n  \n  result = env.reset()\n  # Returns StepResult[OpenSpielObservation] - Type safe!\n  \n  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n  # Type checker knows this is valid!\n  \n  state = env.state()\n  # Returns OpenSpielState\n\"\"\")\n\nprint(\"\u2500\" * 70)\nprint(\"\\n\ud83c\udfaf This pattern works for ANY environment you want to wrap!\\n\")",
-   "metadata": {},
    "execution_count": null,
-   "outputs": []
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from envs.openspiel_env.client import OpenSpielEnv\n",
+    "\n",
+    "print(\"=\"*70)\n",
+    "print(\"   🔌 HOW OPENENV WRAPS OPENSPIEL\")\n",
+    "print(\"=\"*70)\n",
+    "\n",
+    "print(\"\"\"\n",
+    "class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):\n",
+    "    \n",
+    "    def _step_payload(self, action: OpenSpielAction) -> dict:\n",
+    "        '''Convert typed action to JSON for HTTP'''\n",
+    "        return {\n",
+    "            \"action_id\": action.action_id,\n",
+    "            \"game_name\": action.game_name,\n",
+    "        }\n",
+    "    \n",
+    "    def _parse_result(self, payload: dict) -> StepResult:\n",
+    "        '''Parse HTTP JSON response into typed observation'''\n",
+    "        return StepResult(\n",
+    "            observation=OpenSpielObservation(...),\n",
+    "            reward=payload['reward'],\n",
+    "            done=payload['done']\n",
+    "        )\n",
+    "\n",
+    "\"\"\")\n",
+    "\n",
+    "print(\"─\" * 70)\n",
+    "print(\"\\n✨ Usage (works for ALL OpenEnv environments):\")\n",
+    "print(\"\"\"\n",
+    "  env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "  \n",
+    "  result = env.reset()\n",
+    "  # Returns StepResult[OpenSpielObservation] - Type safe!\n",
+    "  \n",
+    "  result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
+    "  # Type checker knows this is valid!\n",
+    "  \n",
+    "  state = env.state()\n",
+    "  # Returns OpenSpielState\n",
+    "\"\"\")\n",
+    "\n",
+    "print(\"─\" * 70)\n",
+    "print(\"\\n🎯 This pattern works for ANY environment you want to wrap!\\n\")"
+   ]
   },
   {
    "cell_type": "code",
@@ -504,30 +657,30 @@
     "from dataclasses import fields\n",
     "\n",
     "print(\"=\"*70)\n",
-    "print(\"   \ud83c\udfae OPENSPIEL INTEGRATION - TYPE-SAFE MODELS\")\n",
+    "print(\"   🎮 OPENSPIEL INTEGRATION - TYPE-SAFE MODELS\")\n",
     "print(\"=\"*70)\n",
     "\n",
-    "print(\"\\n\ud83d\udce4 OpenSpielAction (what you send):\")\n",
-    "print(\"   \" + \"\u2500\" * 64)\n",
+    "print(\"\\n📤 OpenSpielAction (what you send):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielAction):\n",
-    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
     "\n",
-    "print(\"\\n\ud83d\udce5 OpenSpielObservation (what you receive):\")\n",
-    "print(\"   \" + \"\u2500\" * 64)\n",
+    "print(\"\\n📥 OpenSpielObservation (what you receive):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielObservation):\n",
-    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
     "\n",
-    "print(\"\\n\ud83d\udcca OpenSpielState (episode metadata):\")\n",
-    "print(\"   \" + \"\u2500\" * 64)\n",
+    "print(\"\\n📊 OpenSpielState (episode metadata):\")\n",
+    "print(\"   \" + \"─\" * 64)\n",
     "for field in fields(OpenSpielState):\n",
-    "    print(f\"   \u2022 {field.name:20s} : {field.type}\")\n",
+    "    print(f\"   • {field.name:20s} : {field.type}\")\n",
     "\n",
     "print(\"\\n\" + \"=\"*70)\n",
-    "print(\"\\n\ud83d\udca1 Type safety means:\")\n",
-    "print(\"   \u2705 Your IDE autocompletes these fields\")\n",
-    "print(\"   \u2705 Typos are caught before running\")\n",
-    "print(\"   \u2705 Refactoring is safe\")\n",
-    "print(\"   \u2705 Self-documenting code\\n\")"
+    "print(\"\\n💡 Type safety means:\")\n",
+    "print(\"   ✅ Your IDE autocompletes these fields\")\n",
+    "print(\"   ✅ Typos are caught before running\")\n",
+    "print(\"   ✅ Refactoring is safe\")\n",
+    "print(\"   ✅ Self-documenting code\\n\")"
    ]
   },
   {
@@ -540,9 +693,9 @@
     "\n",
     "The client **inherits from HTTPEnvClient** and implements 3 methods:\n",
     "\n",
-    "1. `_step_payload()` - Convert action \u2192 JSON\n",
-    "2. `_parse_result()` - Parse JSON \u2192 typed observation  \n",
-    "3. `_parse_state()` - Parse JSON \u2192 state\n",
+    "1. `_step_payload()` - Convert action → JSON\n",
+    "2. `_parse_result()` - Parse JSON → typed observation  \n",
+    "3. `_parse_state()` - Parse JSON → state\n",
     "\n",
     "That's it! The base class handles all HTTP communication.\n",
     "\n",
@@ -557,20 +710,20 @@
     "\n",
     "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
     "\n",
-    "# \ud83c\udfae Part 6: Interactive Demo\n",
+    "# 🎮 Part 6: Interactive Demo\n",
     "\n",
     "### Now let's BUILD something!\n",
     "\n",
     "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
-    "then watch **4 different AI policies** compete for the championship! \ud83c\udfc6\n",
+    "then watch **4 different AI policies** compete for the championship! 🏆\n",
     "\n",
     "<br>\n",
     "\n",
     "**Get ready for:**\n",
-    "- \u26a1 Live gameplay visualization\n",
-    "- \ud83e\udd16 AI policy showdown\n",
-    "- \ud83d\udcca Real-time learning metrics\n",
-    "- \ud83c\udfaf Production-ready patterns\n",
+    "- ⚡ Live gameplay visualization\n",
+    "- 🤖 AI policy showdown\n",
+    "- 📊 Real-time learning metrics\n",
+    "- 🎯 Production-ready patterns\n",
     "\n",
     "</div>"
    ]
@@ -579,18 +732,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## The Game: Catch \ud83d\udd34\ud83c\udfd3\n",
+    "## The Game: Catch 🔴🏓\n",
     "\n",
     "<table>\n",
     "<tr>\n",
     "<td width=\"40%\" style=\"text-align: center;\">\n",
     "\n",
     "```\n",
-    "\u2b1c \u2b1c \ud83d\udd34 \u2b1c \u2b1c   \n",
-    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   Ball\n",
-    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   falls\n",
-    "\u2b1c \u2b1c \u2b1c \u2b1c \u2b1c   down\n",
-    "\u2b1c \u2b1c \ud83c\udfd3 \u2b1c \u2b1c   \n",
+    "⬜ ⬜ 🔴 ⬜ ⬜   \n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   Ball\n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   falls\n",
+    "⬜ ⬜ ⬜ ⬜ ⬜   down\n",
+    "⬜ ⬜ 🏓 ⬜ ⬜   \n",
     "     Paddle\n",
     "```\n",
     "\n",
@@ -598,18 +751,18 @@
     "<td width=\"60%\">\n",
     "\n",
     "**Rules:**\n",
-    "- 5\u00d75 grid\n",
+    "- 5×5 grid\n",
     "- Ball falls from random column\n",
     "- Move paddle to catch it\n",
     "\n",
     "**Actions:**\n",
-    "- `0` = Move LEFT \u2b05\ufe0f\n",
-    "- `1` = STAY \ud83d\uded1\n",
-    "- `2` = Move RIGHT \u27a1\ufe0f\n",
+    "- `0` = Move LEFT ⬅️\n",
+    "- `1` = STAY 🛑\n",
+    "- `2` = Move RIGHT ➡️\n",
     "\n",
     "**Reward:**\n",
-    "- `+1` if caught \ud83c\udf89\n",
-    "- `0` if missed \ud83d\ude22\n",
+    "- `+1` if caught 🎉\n",
+    "- `0` if missed 😢\n",
     "\n",
     "</td>\n",
     "</tr>\n",
@@ -617,7 +770,7 @@
     "\n",
     "<div style=\"background-color: #d4edda; padding: 15px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "**\ud83c\udfaf Why This Game?**\n",
+    "**🎯 Why This Game?**\n",
     "- Simple rules (easy to understand)\n",
     "- Visual (see what's happening)\n",
     "- Fast episodes (~5 steps)\n",
@@ -629,10 +782,35 @@
   },
   {
    "cell_type": "code",
-   "source": "# Create environment and start a new episode\nenv = CatchEnvironment()\nobs = env.reset()\n\nprint(\"\ud83c\udfae \" + \"=\"*58 + \" \ud83c\udfae\")\nprint(\"   INITIAL GAME STATE\")\nprint(\"\ud83c\udfae \" + \"=\"*58 + \" \ud83c\udfae\\n\")\n\n# Visualize the game board\nenv.render()\n\n# Show game info\nprint(f\"\\n\ud83d\udccd Game Info:\")\nprint(f\"   \ud83d\udd34 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\nprint(f\"   \ud83c\udfd3 Paddle at: column {obs.paddle_position}\")\n\nprint(f\"\\n\ud83d\udcca Observation Details:\")\nprint(f\"   \u2022 Legal actions: {obs.legal_actions} \u2192 [LEFT, STAY, RIGHT]\")\nprint(f\"   \u2022 Info state size: {len(obs.info_state)} (5\u00d75 grid flattened)\")\nprint(f\"   \u2022 Episode done: {obs.done}\")\nprint(f\"   \u2022 Current reward: {obs.reward}\")\n\nprint(\"\\n\ud83d\udca1 The ball will fall down each step. Can your policy catch it?\")\nprint(\"=\"*62)",
-   "metadata": {},
    "execution_count": null,
-   "outputs": []
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create environment and start a new episode\n",
+    "env = CatchEnvironment()\n",
+    "obs = env.reset()\n",
+    "\n",
+    "print(\"🎮 \" + \"=\"*58 + \" 🎮\")\n",
+    "print(\"   INITIAL GAME STATE\")\n",
+    "print(\"🎮 \" + \"=\"*58 + \" 🎮\\n\")\n",
+    "\n",
+    "# Visualize the game board\n",
+    "env.render()\n",
+    "\n",
+    "# Show game info\n",
+    "print(\"\\n📍 Game Info:\")\n",
+    "print(f\"   🔴 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\n",
+    "print(f\"   🏓 Paddle at: column {obs.paddle_position}\")\n",
+    "\n",
+    "print(\"\\n📊 Observation Details:\")\n",
+    "print(f\"   • Legal actions: {obs.legal_actions} → [LEFT, STAY, RIGHT]\")\n",
+    "print(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\n",
+    "print(f\"   • Episode done: {obs.done}\")\n",
+    "print(f\"   • Current reward: {obs.reward}\")\n",
+    "\n",
+    "print(\"\\n💡 The ball will fall down each step. Can your policy catch it?\")\n",
+    "print(\"=\"*62)"
+   ]
   },
   {
    "cell_type": "code",
@@ -669,13 +847,13 @@
     "    Catch game following OpenEnv's Environment pattern.\n",
     "    \n",
     "    In production:\n",
-    "      \u2022 Runs in Docker container\n",
-    "      \u2022 Accessed via HTTPEnvClient\n",
-    "      \u2022 Exposed via FastAPI server\n",
+    "      • Runs in Docker container\n",
+    "      • Accessed via HTTPEnvClient\n",
+    "      • Exposed via FastAPI server\n",
     "    \n",
     "    For this demo:\n",
-    "      \u2022 We run it locally to see internals\n",
-    "      \u2022 But the structure is identical!\n",
+    "      • We run it locally to see internals\n",
+    "      • But the structure is identical!\n",
     "    \"\"\"\n",
     "    \n",
     "    def __init__(self, grid_size=5):\n",
@@ -737,24 +915,24 @@
     "            line = \"  \"\n",
     "            for col in range(self.grid_size):\n",
     "                if row == self.ball_row and col == self.ball_col:\n",
-    "                    line += \"\ud83d\udd34 \"\n",
+    "                    line += \"🔴 \"\n",
     "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
-    "                    line += \"\ud83c\udfd3 \"\n",
+    "                    line += \"🏓 \"\n",
     "                else:\n",
-    "                    line += \"\u2b1c \"\n",
+    "                    line += \"⬜ \"\n",
     "            print(line)\n",
     "\n",
     "\n",
-    "print(\"\ud83c\udf89 \" + \"=\"*64 + \" \ud83c\udf89\")\n",
-    "print(\"   \u2705 Environment Created Following OpenEnv Pattern!\")\n",
-    "print(\"\ud83c\udf89 \" + \"=\"*64 + \" \ud83c\udf89\")\n",
-    "print(\"\\n\ud83d\udccb What we just built:\")\n",
-    "print(\"   \u2022 reset() \u2192 CatchObservation (type-safe!)\")\n",
-    "print(\"   \u2022 step(action) \u2192 CatchObservation (type-safe!)\")\n",
-    "print(\"   \u2022 render() \u2192 Visual display\")\n",
-    "print(\"\\n\ud83d\ude80 In production: This would run in Docker + FastAPI\")\n",
+    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
+    "print(\"   ✅ Environment Created Following OpenEnv Pattern!\")\n",
+    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
+    "print(\"\\n📋 What we just built:\")\n",
+    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
+    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
+    "print(\"   • render() → Visual display\")\n",
+    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
     "print(\"   But the structure is EXACTLY the same!\")\n",
-    "print(\"\\n\ud83d\udca1 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
+    "print(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
    ]
   },
   {
@@ -773,7 +951,7 @@
     "---\n",
     "\n",
     "<a id=\"part-7\"></a>\n",
-    "# Part 7: Four Policies \ud83e\udd16\n",
+    "# Part 7: Four Policies 🤖\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -786,22 +964,22 @@
     "<th width=\"25%\">Expected Performance</th>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83c\udfb2 Random</b></td>\n",
+    "<td><b>🎲 Random</b></td>\n",
     "<td>Pick random action every step</td>\n",
     "<td>~20% (pure luck)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83d\uded1 Always Stay</b></td>\n",
+    "<td><b>🛑 Always Stay</b></td>\n",
     "<td>Never move, hope ball lands in center</td>\n",
     "<td>~20% (terrible!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83e\udde0 Smart</b></td>\n",
+    "<td><b>🧠 Smart</b></td>\n",
     "<td>Move paddle toward ball</td>\n",
     "<td>100% (optimal!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83d\udcc8 Learning</b></td>\n",
+    "<td><b>📈 Learning</b></td>\n",
     "<td>Start random, learn smart strategy</td>\n",
     "<td>~85% (improves over time)</td>\n",
     "</tr>\n",
@@ -816,7 +994,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 7: Four Policies \ud83e\udd16\n",
+    "# Part 7: Four Policies 🤖\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -829,22 +1007,22 @@
     "<th width=\"25%\">Expected Performance</th>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83c\udfb2 Random</b></td>\n",
+    "<td><b>🎲 Random</b></td>\n",
     "<td>Pick random action every step</td>\n",
     "<td>~20% (pure luck)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83d\uded1 Always Stay</b></td>\n",
+    "<td><b>🛑 Always Stay</b></td>\n",
     "<td>Never move, hope ball lands in center</td>\n",
     "<td>~20% (terrible!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83e\udde0 Smart</b></td>\n",
+    "<td><b>🧠 Smart</b></td>\n",
     "<td>Move paddle toward ball</td>\n",
     "<td>100% (optimal!)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>\ud83d\udcc8 Learning</b></td>\n",
+    "<td><b>📈 Learning</b></td>\n",
     "<td>Start random, learn smart strategy</td>\n",
     "<td>~85% (improves over time)</td>\n",
     "</tr>\n",
@@ -865,7 +1043,7 @@
     "\n",
     "class RandomPolicy:\n",
     "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
-    "    name = \"\ud83c\udfb2 Random Guesser\"\n",
+    "    name = \"🎲 Random Guesser\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        return random.choice(obs.legal_actions)\n",
@@ -873,7 +1051,7 @@
     "\n",
     "class AlwaysStayPolicy:\n",
     "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
-    "    name = \"\ud83d\uded1 Always Stay\"\n",
+    "    name = \"🛑 Always Stay\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        return 1  # STAY\n",
@@ -881,7 +1059,7 @@
     "\n",
     "class SmartPolicy:\n",
     "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
-    "    name = \"\ud83e\udde0 Smart Heuristic\"\n",
+    "    name = \"🧠 Smart Heuristic\"\n",
     "    \n",
     "    def select_action(self, obs: CatchObservation) -> int:\n",
     "        ball_col = obs.ball_position[1]\n",
@@ -897,7 +1075,7 @@
     "\n",
     "class LearningPolicy:\n",
     "    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n",
-    "    name = \"\ud83d\udcc8 Learning Agent\"\n",
+    "    name = \"📈 Learning Agent\"\n",
     "    \n",
     "    def __init__(self):\n",
     "        self.steps = 0\n",
@@ -923,16 +1101,16 @@
     "                return 1\n",
     "\n",
     "\n",
-    "print(\"\ud83e\udd16 \" + \"=\"*64 + \" \ud83e\udd16\")\n",
-    "print(\"   \u2705 4 Policies Created!\")\n",
-    "print(\"\ud83e\udd16 \" + \"=\"*64 + \" \ud83e\udd16\\n\")\n",
+    "print(\"🤖 \" + \"=\"*64 + \" 🤖\")\n",
+    "print(\"   ✅ 4 Policies Created!\")\n",
+    "print(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n",
     "\n",
     "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
     "for i, policy in enumerate(policies, 1):\n",
     "    print(f\"   {i}. {policy.name}\")\n",
     "\n",
-    "print(\"\\n\ud83d\udca1 Each policy represents a different approach to solving the game!\")\n",
-    "print(\"   Let's see who performs best! \ud83c\udfc6\\n\")"
+    "print(\"\\n💡 Each policy represents a different approach to solving the game!\")\n",
+    "print(\"   Let's see who performs best! 🏆\\n\")"
    ]
   },
   {
@@ -958,15 +1136,15 @@
     "    \n",
     "    if visualize:\n",
     "        print(f\"\\n{'='*60}\")\n",
-    "        print(f\"   \ud83c\udfae {policy.name}\")\n",
-    "        print(f\"   \ud83d\udd34 Ball will fall at column: {obs.ball_position[1]}\")\n",
+    "        print(f\"   🎮 {policy.name}\")\n",
+    "        print(f\"   🔴 Ball will fall at column: {obs.ball_position[1]}\")\n",
     "        print('='*60 + '\\n')\n",
     "        env.render()\n",
     "        time.sleep(delay)\n",
     "    \n",
     "    total_reward = 0\n",
     "    step = 0\n",
-    "    action_names = [\"\u2b05\ufe0f  LEFT\", \"\ud83d\uded1 STAY\", \"\u27a1\ufe0f  RIGHT\"]\n",
+    "    action_names = [\"⬅️  LEFT\", \"🛑 STAY\", \"➡️  RIGHT\"]\n",
     "    \n",
     "    # THE RL LOOP\n",
     "    while not obs.done:\n",
@@ -980,14 +1158,14 @@
     "        total_reward += obs.reward\n",
     "        \n",
     "        if visualize:\n",
-    "            print(f\"\\n\ud83d\udccd Step {step + 1}: {action_names[action]}\")\n",
+    "            print(f\"\\n📍 Step {step + 1}: {action_names[action]}\")\n",
     "            env.render()\n",
     "            time.sleep(delay)\n",
     "        \n",
     "        step += 1\n",
     "    \n",
     "    if visualize:\n",
-    "        result = \"\ud83c\udf89 CAUGHT!\" if total_reward > 0 else \"\ud83d\ude22 MISSED\"\n",
+    "        result = \"🎉 CAUGHT!\" if total_reward > 0 else \"😢 MISSED\"\n",
     "        print(f\"\\n{'='*60}\")\n",
     "        print(f\"   {result} Reward: {total_reward}\")\n",
     "        print('='*60)\n",
@@ -1008,7 +1186,7 @@
     "---\n",
     "\n",
     "<a id=\"part-8\"></a>\n",
-    "# Part 8: Policy Competition! \ud83c\udfc6\n",
+    "# Part 8: Policy Competition! 🏆\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1023,7 +1201,7 @@
    "source": [
     "---\n",
     "\n",
-    "# Part 8: Policy Competition! \ud83c\udfc6\n",
+    "# Part 8: Policy Competition! 🏆\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1047,49 +1225,49 @@
     "        LearningPolicy(),\n",
     "    ]\n",
     "    \n",
-    "    print(\"\\n\ud83c\udfc6 \" + \"=\"*66 + \" \ud83c\udfc6\")\n",
+    "    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n",
     "    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
-    "    print(\"\ud83c\udfc6 \" + \"=\"*66 + \" \ud83c\udfc6\\n\")\n",
+    "    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n",
     "    \n",
     "    results = []\n",
     "    for policy in policies:\n",
-    "        print(f\"\u26a1 Testing {policy.name}...\", end=\" \")\n",
+    "        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n",
     "        env = CatchEnvironment()\n",
     "        successes = sum(run_episode(env, policy, visualize=False) \n",
     "                       for _ in range(num_episodes))\n",
     "        success_rate = (successes / num_episodes) * 100\n",
     "        results.append((policy.name, success_rate, successes))\n",
-    "        print(f\"\u2713 Done!\")\n",
+    "        print(\"✓ Done!\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"   \ud83d\udcca FINAL RESULTS\")\n",
+    "    print(\"   📊 FINAL RESULTS\")\n",
     "    print(\"=\"*70 + \"\\n\")\n",
     "    \n",
     "    # Sort by success rate (descending)\n",
     "    results.sort(key=lambda x: x[1], reverse=True)\n",
     "    \n",
     "    # Award medals to top 3\n",
-    "    medals = [\"\ud83e\udd47\", \"\ud83e\udd48\", \"\ud83e\udd49\", \"  \"]\n",
+    "    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n",
     "    \n",
     "    for i, (name, rate, successes) in enumerate(results):\n",
     "        medal = medals[i]\n",
-    "        bar = \"\u2588\" * int(rate / 2)\n",
+    "        bar = \"█\" * int(rate / 2)\n",
     "        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
     "    \n",
     "    print(\"\\n\" + \"=\"*70)\n",
-    "    print(\"\\n\u2728 Key Insights:\")\n",
-    "    print(\"   \u2022 Random (~20%):      Baseline - pure luck \ud83c\udfb2\")\n",
-    "    print(\"   \u2022 Always Stay (~20%): Bad strategy - stays center \ud83d\uded1\")\n",
-    "    print(\"   \u2022 Smart (100%):       Optimal - perfect play! \ud83e\udde0\")\n",
-    "    print(\"   \u2022 Learning (~85%):    Improves over time \ud83d\udcc8\")\n",
-    "    print(\"\\n\ud83c\udf93 This is Reinforcement Learning in action:\")\n",
+    "    print(\"\\n✨ Key Insights:\")\n",
+    "    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n",
+    "    print(\"   • Always Stay (~20%): Bad strategy - stays in the center 🛑\")\n",
+    "    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n",
+    "    print(\"   • Learning (~85%):    Improves over time 📈\")\n",
+    "    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n",
     "    print(\"   1. Start with exploration (trying random things)\")\n",
     "    print(\"   2. Learn from rewards (what works, what doesn't)\")\n",
     "    print(\"   3. Converge to optimal behavior (smart strategy)\")\n",
-    "    print(\"\\n\ud83c\udfaf The Learning Agent gets smarter with every episode!\\n\")\n",
+    "    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n",
     "\n",
     "# Run the epic competition!\n",
-    "print(\"\ud83c\udfae Starting the showdown...\")\n",
+    "print(\"🎮 Starting the showdown...\")\n",
     "evaluate_policies(num_episodes=50)"
    ]
   },
@@ -1100,7 +1278,7 @@
     "---\n",
     "\n",
     "<a id=\"part-9\"></a>\n",
-    "# Part 9: Using Real OpenSpiel \ud83c\udfae\n",
+    "# Part 9: Using Real OpenSpiel 🎮\n",
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1129,8 +1307,8 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>\u2705 Dataclasses</td>\n",
-    "<td>\u2705 Dataclasses</td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "<td>✅ Dataclasses</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>API</b></td>\n",
@@ -1139,7 +1317,7 @@
     "</tr>\n",
     "</table>\n",
     "\n",
-    "**\ud83c\udfaf Same structure, production features!**\n",
+    "**🎯 Same structure, production features!**\n",
     "\n",
     "</div>\n",
     "\n",
@@ -1166,10 +1344,10 @@
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
-    "**\ud83c\udfae 6 Games Available:**\n",
+    "**🎮 6 Games Available:**\n",
     "\n",
     "1. `\"catch\"` - What we just built!\n",
-    "2. `\"tic_tac_toe\"` - Classic 3\u00d73\n",
+    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
     "3. `\"kuhn_poker\"` - Imperfect information poker\n",
     "4. `\"cliff_walking\"` - Grid navigation\n",
     "5. `\"2048\"` - Tile puzzle\n",
@@ -1187,7 +1365,7 @@
     "---\n",
     "\n",
     "<a id=\"part-10\"></a>\n",
-    "# Part 10: Create Your Own Integration \ud83d\udee0\ufe0f\n",
+    "# Part 10: Create Your Own Integration 🛠️\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
@@ -1291,7 +1469,7 @@
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "### \ud83c\udf93 Examples to Study\n",
+    "### 🎓 Examples to Study\n",
     "\n",
     "OpenEnv includes 3 complete examples:\n",
     "\n",
@@ -1309,7 +1487,7 @@
     "   - Shows complex use case\n",
     "   - Security considerations\n",
     "\n",
-    "**\ud83d\udca1 Study these to understand the patterns!**\n",
+    "**💡 Study these to understand the patterns!**\n",
     "\n",
     "</div>"
    ]
@@ -1323,7 +1501,7 @@
     "<a id=\"summary\"></a>\n",
     "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# \ud83c\udf93 Summary: Your Journey\n",
+    "# 🎓 Summary: Your Journey\n",
     "\n",
     "</div>"
    ]
@@ -1336,7 +1514,7 @@
     "\n",
     "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# \ud83c\udf93 Summary: Your Journey\n",
+    "# 🎓 Summary: Your Journey\n",
     "\n",
     "</div>"
    ]
@@ -1351,19 +1529,19 @@
     "<tr>\n",
     "<td width=\"50%\" style=\"vertical-align: top;\">\n",
     "\n",
-    "### \ud83d\udcda Concepts\n",
+    "### 📚 Concepts\n",
     "\n",
-    "\u2705 **RL Fundamentals**\n",
+    "✅ **RL Fundamentals**\n",
     "- The observe-act-reward loop\n",
     "- What makes good policies\n",
     "- Exploration vs exploitation\n",
     "\n",
-    "\u2705 **OpenEnv Architecture**\n",
+    "✅ **OpenEnv Architecture**\n",
     "- Client-server separation\n",
     "- Type-safe contracts\n",
     "- HTTP communication layer\n",
     "\n",
-    "\u2705 **Production Patterns**\n",
+    "✅ **Production Patterns**\n",
     "- Docker isolation\n",
     "- API design\n",
     "- Reproducible deployments\n",
@@ -1371,19 +1549,19 @@
     "</td>\n",
     "<td width=\"50%\" style=\"vertical-align: top;\">\n",
     "\n",
-    "### \ud83d\udee0\ufe0f Skills\n",
+    "### 🛠️ Skills\n",
     "\n",
-    "\u2705 **Using Environments**\n",
+    "✅ **Using Environments**\n",
     "- Import OpenEnv clients\n",
     "- Call reset/step/state\n",
     "- Work with typed observations\n",
     "\n",
-    "\u2705 **Building Environments**\n",
+    "✅ **Building Environments**\n",
     "- Define type-safe models\n",
     "- Implement Environment class\n",
     "- Create HTTPEnvClient\n",
     "\n",
-    "\u2705 **Testing & Debugging**\n",
+    "✅ **Testing & Debugging**\n",
     "- Compare policies\n",
     "- Visualize episodes\n",
     "- Measure performance\n",
@@ -1408,45 +1586,45 @@
     "</tr>\n",
     "<tr>\n",
     "<td><b>Type Safety</b></td>\n",
-    "<td>\u274c Arrays, dicts</td>\n",
-    "<td>\u2705 Dataclasses</td>\n",
-    "<td>\ud83c\udfc6 OpenEnv</td>\n",
+    "<td>❌ Arrays, dicts</td>\n",
+    "<td>✅ Dataclasses</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Isolation</b></td>\n",
-    "<td>\u274c Same process</td>\n",
-    "<td>\u2705 Docker</td>\n",
-    "<td>\ud83c\udfc6 OpenEnv</td>\n",
+    "<td>❌ Same process</td>\n",
+    "<td>✅ Docker</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Deployment</b></td>\n",
-    "<td>\u274c Manual setup</td>\n",
-    "<td>\u2705 K8s-ready</td>\n",
-    "<td>\ud83c\udfc6 OpenEnv</td>\n",
+    "<td>❌ Manual setup</td>\n",
+    "<td>✅ K8s-ready</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Language</b></td>\n",
-    "<td>\u274c Python only</td>\n",
-    "<td>\u2705 Any (HTTP)</td>\n",
-    "<td>\ud83c\udfc6 OpenEnv</td>\n",
+    "<td>❌ Python only</td>\n",
+    "<td>✅ Any (HTTP)</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Reproducibility</b></td>\n",
-    "<td>\u274c \"Works on my machine\"</td>\n",
-    "<td>\u2705 Same everywhere</td>\n",
-    "<td>\ud83c\udfc6 OpenEnv</td>\n",
+    "<td>❌ \"Works on my machine\"</td>\n",
+    "<td>✅ Same everywhere</td>\n",
+    "<td>🏆 OpenEnv</td>\n",
     "</tr>\n",
     "<tr>\n",
     "<td><b>Community</b></td>\n",
-    "<td>\u2705 Large ecosystem</td>\n",
-    "<td>\ud83d\udfe1 Growing</td>\n",
-    "<td>\ud83e\udd1d Both!</td>\n",
+    "<td>✅ Large ecosystem</td>\n",
+    "<td>🟡 Growing</td>\n",
+    "<td>🤝 Both!</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "**\ud83c\udfaf The Bottom Line**\n",
+    "**🎯 The Bottom Line**\n",
     "\n",
     "OpenEnv brings **production engineering** to RL:\n",
     "- Same environments work locally and in production\n",
@@ -1464,33 +1642,33 @@
    "metadata": {},
    "source": [
     "<a id=\"resources\"></a>\n",
-    "## \ud83d\udcda Resources\n",
+    "## 📚 Resources\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "### \ud83d\udd17 Essential Links\n",
+    "### 🔗 Essential Links\n",
     "\n",
-    "- **\ud83c\udfe0 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **\ud83c\udfae OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **\u26a1 FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **\ud83d\udc33 Docker Guide**: https://docs.docker.com/get-started/\n",
-    "- **\ud83d\udd25 PyTorch**: https://pytorch.org/\n",
+    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **🔥 PyTorch**: https://pytorch.org/\n",
     "\n",
-    "### \ud83d\udcd6 Documentation Deep Dives\n",
+    "### 📖 Documentation Deep Dives\n",
     "\n",
     "- **Environment Creation Guide**: `src/envs/README.md`\n",
     "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
     "- **Example Scripts**: `examples/`\n",
     "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
     "\n",
-    "### \ud83c\udf93 Community & Support\n",
+    "### 🎓 Community & Support\n",
     "\n",
     "**Supported by amazing organizations:**\n",
-    "- \ud83d\udd25 Meta PyTorch\n",
-    "- \ud83e\udd17 Hugging Face\n",
-    "- \u26a1 Unsloth AI\n",
-    "- \ud83c\udf1f Reflection AI\n",
-    "- \ud83d\ude80 And many more!\n",
+    "- 🔥 Meta PyTorch\n",
+    "- 🤗 Hugging Face\n",
+    "- ⚡ Unsloth AI\n",
+    "- 🌟 Reflection AI\n",
+    "- 🚀 And many more!\n",
     "\n",
     "**License**: BSD 3-Clause (very permissive!)\n",
     "\n",
@@ -1500,46 +1678,46 @@
     "\n",
     "---\n",
     "\n",
-    "### \ud83c\udf08 What's Next?\n",
+    "### 🌈 What's Next?\n",
     "\n",
-    "1. \u2b50 **Star the repo** to show support and stay updated\n",
-    "2. \ud83d\udd04 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
-    "3. \ud83c\udfae **Explore** other OpenSpiel games\n",
-    "4. \ud83d\udee0\ufe0f **Build** your own environment integration\n",
-    "5. \ud83d\udcac **Share** what you build with the community!"
+    "1. ⭐ **Star the repo** to show support and stay updated\n",
+    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. 🎮 **Explore** other OpenSpiel games\n",
+    "4. 🛠️ **Build** your own environment integration\n",
+    "5. 💬 **Share** what you build with the community!"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## \ud83d\udcda Resources\n",
+    "## 📚 Resources\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "### \ud83d\udd17 Essential Links\n",
+    "### 🔗 Essential Links\n",
     "\n",
-    "- **\ud83c\udfe0 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **\ud83c\udfae OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **\u26a1 FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **\ud83d\udc33 Docker Guide**: https://docs.docker.com/get-started/\n",
-    "- **\ud83d\udd25 PyTorch**: https://pytorch.org/\n",
+    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
+    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
+    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
+    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
+    "- **🔥 PyTorch**: https://pytorch.org/\n",
     "\n",
-    "### \ud83d\udcd6 Documentation Deep Dives\n",
+    "### 📖 Documentation Deep Dives\n",
     "\n",
     "- **Environment Creation Guide**: `src/envs/README.md`\n",
     "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
     "- **Example Scripts**: `examples/`\n",
     "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
     "\n",
-    "### \ud83c\udf93 Community & Support\n",
+    "### 🎓 Community & Support\n",
     "\n",
     "**Supported by amazing organizations:**\n",
-    "- \ud83d\udd25 Meta PyTorch\n",
-    "- \ud83e\udd17 Hugging Face\n",
-    "- \u26a1 Unsloth AI\n",
-    "- \ud83c\udf1f Reflection AI\n",
-    "- \ud83d\ude80 And many more!\n",
+    "- 🔥 Meta PyTorch\n",
+    "- 🤗 Hugging Face\n",
+    "- ⚡ Unsloth AI\n",
+    "- 🌟 Reflection AI\n",
+    "- 🚀 And many more!\n",
     "\n",
     "**License**: BSD 3-Clause (very permissive!)\n",
     "\n",
@@ -1549,13 +1727,13 @@
     "\n",
     "---\n",
     "\n",
-    "### \ud83c\udf08 What's Next?\n",
+    "### 🌈 What's Next?\n",
     "\n",
-    "1. \u2b50 **Star the repo** to show support and stay updated\n",
-    "2. \ud83d\udd04 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
-    "3. \ud83c\udfae **Explore** other OpenSpiel games\n",
-    "4. \ud83d\udee0\ufe0f **Build** your own environment integration\n",
-    "5. \ud83d\udcac **Share** what you build with the community!"
+    "1. ⭐ **Star the repo** to show support and stay updated\n",
+    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
+    "3. 🎮 **Explore** other OpenSpiel games\n",
+    "4. 🛠️ **Build** your own environment integration\n",
+    "5. 💬 **Share** what you build with the community!"
    ]
   },
   {
@@ -1566,38 +1744,38 @@
     "\n",
     "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
     "\n",
-    "# \ud83c\udf89 Congratulations! You Did It! \ud83c\udf89\n",
+    "# 🎉 Congratulations! You Did It! 🎉\n",
     "\n",
     "### You're now an OpenEnv expert!\n",
     "\n",
     "<br>\n",
     "\n",
-    "## \u2705 What You've Mastered:\n",
+    "## ✅ What You've Mastered:\n",
     "\n",
-    "**\ud83e\udde0 Concepts**\n",
+    "**🧠 Concepts**\n",
     "- How RL works (the observe-act-reward loop)\n",
     "- Why OpenEnv matters (production-ready RL)\n",
     "- How to use existing environments\n",
     "\n",
-    "**\ud83d\udee0\ufe0f Practical Skills**\n",
+    "**🛠️ Practical Skills**\n",
     "- Creating new integrations\n",
     "- Building type-safe environments\n",
     "- Deploying to production\n",
     "\n",
-    "**\ud83c\udfaf Real Experience**\n",
+    "**🎯 Real Experience**\n",
     "- Built a complete RL environment\n",
     "- Tested multiple policies\n",
     "- Watched learning happen in real-time!\n",
     "\n",
     "---\n",
     "\n",
-    "### Now go build something amazing! \ud83d\ude80\n",
+    "### Now go build something amazing! 🚀\n",
     "\n",
     "**Welcome to the future of RL with PyTorch & OpenEnv**\n",
     "\n",
     "<br>\n",
     "\n",
-    "[![Star on GitHub](https://img.shields.io/badge/\u2b50_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
+    "[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
     "\n",
     "</div>\n",
     "\n",
@@ -1605,16 +1783,16 @@
     "\n",
     "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "## \ud83c\udf1f Want to Learn More?\n",
+    "## 🌟 Want to Learn More?\n",
     "\n",
-    "- \ud83d\udcd6 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
-    "- \ud83c\udfae Try the other example games\n",
-    "- \ud83d\udcac Join the community discussions\n",
-    "- \ud83d\udee0\ufe0f Build your own integration\n",
-    "- \ud83d\ude80 Deploy to production\n",
-    "- \u2b50 Star the repo to stay updated!\n",
+    "- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
+    "- 🎮 Try the other example games\n",
+    "- 💬 Join the community discussions\n",
+    "- 🛠️ Build your own integration\n",
+    "- 🚀 Deploy to production\n",
+    "- ⭐ Star the repo to stay updated!\n",
     "\n",
-    "**Happy coding! \ud83c\udf8a**\n",
+    "**Happy coding! 🎊**\n",
     "\n",
     "</div>"
    ]
@@ -1641,4 +1819,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
\ No newline at end of file
+}

From bb75d0e778466bca8eb3279b25917ef43789b375 Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 14:17:29 -0700
Subject: [PATCH 18/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 341 +++-----------------------------
 1 file changed, 30 insertions(+), 311 deletions(-)
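
Reviewer note (commentary only, ignored when the patch is applied): most of
this patch deletes cells that were present twice in the notebook, including
one markdown body stored in a cell marked as code. The following is a minimal
sketch, assuming the standard nbformat 4 JSON layout, of how such verbatim
duplicates could be listed before editing by hand. `find_duplicate_cells` is
a hypothetical helper written for this note; it is not part of OpenEnv.

```python
from collections import defaultdict


def find_duplicate_cells(notebook: dict) -> list[list[int]]:
    """Return groups of cell indices whose source text is identical.

    `notebook` is the parsed JSON of an .ipynb file (nbformat 4 assumed).
    """
    groups = defaultdict(list)
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # nbformat allows source to be a single string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.strip():  # skip empty cells, they are not meaningful duplicates
            groups[text].append(index)
    # Keep only texts that occur in more than one cell.
    return [indices for indices in groups.values() if len(indices) > 1]


if __name__ == "__main__":
    demo = {
        "cells": [
            {"cell_type": "markdown", "source": ["# Part 3: Setup\n", "intro"]},
            {"cell_type": "code", "source": ["# Part 3: Setup\n", "intro"]},
            {"cell_type": "code", "source": "print('hello')"},
        ]
    }
    print(find_duplicate_cells(demo))
```

Cells are grouped by source text alone, not by `cell_type`, so a markdown
body that was pasted into a code cell is reported together with its markdown
twin. For a real file, pass the result of `json.load` on the notebook path.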

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 68a4568..0b2b481 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -122,72 +122,6 @@
     "---"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Detect environment\n",
-    "try:\n",
-    "    import google.colab\n",
-    "    IN_COLAB = True\n",
-    "    print(\"🌐 Running in Google Colab - Perfect!\")\n",
-    "except ImportError:\n",
-    "    IN_COLAB = False\n",
-    "    print(\"💻 Running locally - Nice!\")\n",
-    "\n",
-    "if IN_COLAB:\n",
-    "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
-    "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
-    "    %cd OpenEnv\n",
-    "    \n",
-    "    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n",
-    "    !pip install -q fastapi uvicorn requests\n",
-    "    \n",
-    "    import sys\n",
-    "    sys.path.insert(0, './src')\n",
-    "    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\n",
-    "else:\n",
-    "    import sys\n",
-    "    from pathlib import Path\n",
-    "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
-    "    print(\"✅ Using local OpenEnv installation\")\n",
-    "\n",
-    "print(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\n",
-    "print(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-1\"></a>\n",
-    "# Part 1: RL in 60 Seconds ⏱️\n",
-    "\n",
-    "<div style=\"background-color: #f0f7ff; padding: 20px; border-left: 5px solid #2196F3; margin: 20px 0;\">\n",
-    "\n",
-    "**Reinforcement Learning is simpler than you think.**\n",
-    "\n",
-    "It's just a loop:\n",
-    "\n",
-    "```\n",
-    "while not done:\n",
-    "    observation = environment.observe()\n",
-    "    action = policy.choose(observation)\n",
-    "    reward = environment.step(action)\n",
-    "    policy.learn(reward)\n",
-    "```\n",
-    "\n",
-    "That's it. That's RL.\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "Let's see it in action:"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -387,7 +321,6 @@
    "source": [
     "---\n",
     "\n",
-    "<a id=\"part-3\"></a>\n",
     "# Part 3: Setup 🛠️\n",
     "\n",
     "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
@@ -400,27 +333,44 @@
    ]
   },
   {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {},
+   "outputs": [],
    "source": [
-    "---\n",
-    "\n",
-    "# Part 3: Setup 🛠️\n",
-    "\n",
-    "<div style=\"background-color: #f8f9fa; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
-    "\n",
-    "**Running in Colab?** This cell will clone OpenEnv and install dependencies automatically.\n",
+    "# Detect environment\n",
+    "try:\n",
+    "    import google.colab\n",
+    "    IN_COLAB = True\n",
+    "    print(\"🌐 Running in Google Colab - Perfect!\")\n",
+    "except ImportError:\n",
+    "    IN_COLAB = False\n",
+    "    print(\"💻 Running locally - Nice!\")\n",
     "\n",
-    "**Running locally?** Make sure you're in the OpenEnv directory.\n",
+    "if IN_COLAB:\n",
+    "    print(\"\\n📦 Cloning OpenEnv repository...\")\n",
+    "    !git clone https://github.com/meta-pytorch/OpenEnv.git > /dev/null 2>&1\n",
+    "    %cd OpenEnv\n",
+    "    \n",
+    "    print(\"📚 Installing dependencies (this takes ~10 seconds)...\")\n",
+    "    !pip install -q fastapi uvicorn requests\n",
+    "    \n",
+    "    import sys\n",
+    "    sys.path.insert(0, './src')\n",
+    "    print(\"\\n✅ Setup complete! Everything is ready to go! 🎉\")\n",
+    "else:\n",
+    "    import sys\n",
+    "    from pathlib import Path\n",
+    "    sys.path.insert(0, str(Path.cwd().parent / 'src'))\n",
+    "    print(\"✅ Using local OpenEnv installation\")\n",
     "\n",
-    "</div>"
+    "print(\"\\n🚀 Ready to explore OpenEnv and build amazing things!\")\n",
+    "print(\"💡 Tip: Run cells top-to-bottom for the best experience.\\n\")"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
     "---\n",
     "\n",
@@ -502,51 +452,6 @@
     "print(\"🎯 You focus on RL, OpenEnv handles the infrastructure.\\n\")"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-5\"></a>\n",
-    "# Part 5: Example Integration - OpenSpiel 🎮\n",
-    "\n",
-    "<div style=\"background-color: #fff3e0; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## What is OpenSpiel?\n",
-    "\n",
-    "**OpenSpiel** is a library from DeepMind with **70+ game environments** for RL research.\n",
-    "\n",
-    "## OpenEnv's Integration\n",
-    "\n",
-    "We've wrapped **6 OpenSpiel games** following the OpenEnv pattern:\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**🎯 Single-Player**\n",
-    "1. **Catch** - Catch falling ball\n",
-    "2. **Cliff Walking** - Navigate grid\n",
-    "3. **2048** - Tile puzzle\n",
-    "4. **Blackjack** - Card game\n",
-    "\n",
-    "</td>\n",
-    "<td width=\"50%\">\n",
-    "\n",
-    "**👥 Multi-Player**\n",
-    "5. **Tic-Tac-Toe** - Classic 3×3\n",
-    "6. **Kuhn Poker** - Imperfect info poker\n",
-    "\n",
-    "</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "This shows how OpenEnv can wrap **any** existing RL library!\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -942,52 +847,6 @@
     "### Test the Environment"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-7\"></a>\n",
-    "# Part 7: Four Policies 🤖\n",
-    "\n",
-    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## Let's test 4 different AI strategies:\n",
-    "\n",
-    "<table>\n",
-    "<tr>\n",
-    "<th width=\"25%\">Policy</th>\n",
-    "<th width=\"50%\">Strategy</th>\n",
-    "<th width=\"25%\">Expected Performance</th>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>🎲 Random</b></td>\n",
-    "<td>Pick random action every step</td>\n",
-    "<td>~20% (pure luck)</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>🛑 Always Stay</b></td>\n",
-    "<td>Never move, hope ball lands in center</td>\n",
-    "<td>~20% (terrible!)</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>🧠 Smart</b></td>\n",
-    "<td>Move paddle toward ball</td>\n",
-    "<td>100% (optimal!)</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>📈 Learning</b></td>\n",
-    "<td>Start random, learn smart strategy</td>\n",
-    "<td>~85% (improves over time)</td>\n",
-    "</tr>\n",
-    "</table>\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1179,22 +1038,6 @@
     "run_episode(env, policy, visualize=True, delay=0.4)"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"part-8\"></a>\n",
-    "# Part 8: Policy Competition! 🏆\n",
-    "\n",
-    "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "Let's run **50 episodes** for each policy and see who wins!\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1492,20 +1335,6 @@
     "</div>"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<a id=\"summary\"></a>\n",
-    "<div style=\"background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 40px; border-radius: 15px; margin: 40px 0; text-align: center;\">\n",
-    "\n",
-    "# 🎓 Summary: Your Journey\n",
-    "\n",
-    "</div>"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1686,116 +1515,6 @@
     "4. 🛠️ **Build** your own environment integration\n",
     "5. 💬 **Share** what you build with the community!"
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 📚 Resources\n",
-    "\n",
-    "<div style=\"background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "### 🔗 Essential Links\n",
-    "\n",
-    "- **🏠 OpenEnv GitHub**: https://github.com/meta-pytorch/OpenEnv\n",
-    "- **🎮 OpenSpiel**: https://github.com/google-deepmind/open_spiel\n",
-    "- **⚡ FastAPI Docs**: https://fastapi.tiangolo.com/\n",
-    "- **🐳 Docker Guide**: https://docs.docker.com/get-started/\n",
-    "- **🔥 PyTorch**: https://pytorch.org/\n",
-    "\n",
-    "### 📖 Documentation Deep Dives\n",
-    "\n",
-    "- **Environment Creation Guide**: `src/envs/README.md`\n",
-    "- **OpenSpiel Integration**: `src/envs/openspiel_env/README.md`\n",
-    "- **Example Scripts**: `examples/`\n",
-    "- **RFC 001**: [Baseline API Specs](https://github.com/meta-pytorch/OpenEnv/pull/26)\n",
-    "\n",
-    "### 🎓 Community & Support\n",
-    "\n",
-    "**Supported by amazing organizations:**\n",
-    "- 🔥 Meta PyTorch\n",
-    "- 🤗 Hugging Face\n",
-    "- ⚡ Unsloth AI\n",
-    "- 🌟 Reflection AI\n",
-    "- 🚀 And many more!\n",
-    "\n",
-    "**License**: BSD 3-Clause (very permissive!)\n",
-    "\n",
-    "**Contributions**: Always welcome! Check out the issues tab.\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### 🌈 What's Next?\n",
-    "\n",
-    "1. ⭐ **Star the repo** to show support and stay updated\n",
-    "2. 🔄 **Try modifying** the Catch game (make it harder? bigger grid?)\n",
-    "3. 🎮 **Explore** other OpenSpiel games\n",
-    "4. 🛠️ **Build** your own environment integration\n",
-    "5. 💬 **Share** what you build with the community!"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "<div style=\"background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 50px; border-radius: 20px; margin: 40px 0; text-align: center;\">\n",
-    "\n",
-    "# 🎉 Congratulations! You Did It! 🎉\n",
-    "\n",
-    "### You're now an OpenEnv expert!\n",
-    "\n",
-    "<br>\n",
-    "\n",
-    "## ✅ What You've Mastered:\n",
-    "\n",
-    "**🧠 Concepts**\n",
-    "- How RL works (the observe-act-reward loop)\n",
-    "- Why OpenEnv matters (production-ready RL)\n",
-    "- How to use existing environments\n",
-    "\n",
-    "**🛠️ Practical Skills**\n",
-    "- Creating new integrations\n",
-    "- Building type-safe environments\n",
-    "- Deploying to production\n",
-    "\n",
-    "**🎯 Real Experience**\n",
-    "- Built a complete RL environment\n",
-    "- Tested multiple policies\n",
-    "- Watched learning happen in real-time!\n",
-    "\n",
-    "---\n",
-    "\n",
-    "### Now go build something amazing! 🚀\n",
-    "\n",
-    "**Welcome to the future of RL with PyTorch & OpenEnv**\n",
-    "\n",
-    "<br>\n",
-    "\n",
-    "[![Star on GitHub](https://img.shields.io/badge/⭐_Star_on_GitHub-gray?style=for-the-badge)](https://github.com/meta-pytorch/OpenEnv)\n",
-    "\n",
-    "</div>\n",
-    "\n",
-    "---\n",
-    "\n",
-    "<div style=\"background-color: #f0f7ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
-    "\n",
-    "## 🌟 Want to Learn More?\n",
-    "\n",
-    "- 📖 Check out the [docs](https://github.com/meta-pytorch/OpenEnv)\n",
-    "- 🎮 Try the other example games\n",
-    "- 💬 Join the community discussions\n",
-    "- 🛠️ Build your own integration\n",
-    "- 🚀 Deploy to production\n",
-    "- ⭐ Star the repo to stay updated!\n",
-    "\n",
-    "**Happy coding! 🎊**\n",
-    "\n",
-    "</div>"
-   ]
   }
  ],
  "metadata": {

From f6424fda186c904e9d1aa8ea3436476ff70402fd Mon Sep 17 00:00:00 2001
From: Sanyam Bhutani <sanyambhutani@meta.com>
Date: Mon, 20 Oct 2025 22:21:42 -0700
Subject: [PATCH 19/19] Update OpenEnv_Tutorial.ipynb

---
 examples/OpenEnv_Tutorial.ipynb | 684 +++++++++++++++++++-------------
 1 file changed, 407 insertions(+), 277 deletions(-)
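
Reviewer note (commentary only, ignored when the patch is applied): the new
cell launches uvicorn on port 8000 with `subprocess.Popen`, which returns
before the server is able to accept requests. The following is a minimal
sketch, assuming nothing beyond the Python standard library, of one way to
wait for the port before the first client call. `wait_for_port` is a
hypothetical helper written for this note; the notebook may already handle
readiness differently in cells outside this hunk.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or time runs out.

    Returns True once a connection is accepted, False if `timeout` seconds
    pass without one.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the server is not up yet, retry shortly.
            time.sleep(interval)
    return False
```

In the cell this could follow the `Popen` call, for example
`wait_for_port("localhost", 8000)`, with a `False` result treated as a
startup failure. An open port only shows that the socket is listening, not
that the application finished initialising, so an HTTP health check would be
a stricter test. Separately, the cell pipes stdout and stderr; if those pipes
are never read and the server logs heavily, the OS pipe buffer can fill and
block the child process.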

diff --git a/examples/OpenEnv_Tutorial.ipynb b/examples/OpenEnv_Tutorial.ipynb
index 0b2b481..894d864 100644
--- a/examples/OpenEnv_Tutorial.ipynb
+++ b/examples/OpenEnv_Tutorial.ipynb
@@ -2,6 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
+   "id": "cell-0",
    "metadata": {},
    "source": [
     "<div align=\"center\">\n",
@@ -36,6 +37,36 @@
   {
    "cell_type": "markdown",
    "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Why OpenEnv?\n",
+    "\n",
+    "Let's take a trip down memory lane:\n",
+    "\n",
+    "It's 2016, RL is popular. You read some papers, it looks promising. \n",
+    "\n",
+    "But in the real world, Cartpole is the best you can run on a gaming GPU. \n",
+    "\n",
+    "What do you do beyond Cartpole?\n",
+    "\n",
+    "Fast forward to 2025: GRPO is awesome, and this time it's not JUST in theory. It works well in practice and is really here! \n",
+    "\n",
+    "The problem remains: how do you take these RL algorithms beyond Cartpole?\n",
+    "\n",
+    "A huge part of RL is giving your algorithms access to environments they can learn from. \n",
+    "\n",
+    "We are excited to introduce an Environment Spec for adding Open Environments for RL Training. This will allow you to focus on your experiments and allow everyone to bring their environments. \n",
+    "\n",
+    "Focus on experiments, use OpenEnvironments, and build agents that go beyond Cartpole on a single spec.\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cell-1",
+   "metadata": {},
    "source": [
     "## 📋 What You'll Learn\n",
     "\n",
@@ -62,7 +93,7 @@
     "<td width=\"50%\">\n",
     "\n",
     "**🎮 Part 6-8: Hands-On Demo**\n",
-    "- 🔨 Build a game environment\n",
+    "- 🔌 Use existing OpenSpiel environment\n",
     "- 🤖 Test 4 different policies\n",
     "- 👀 Watch learning happen live\n",
     "\n",
@@ -70,8 +101,8 @@
     "<td width=\"50%\">\n",
     "\n",
     "**🔧 Part 9-10: Going Further**\n",
-    "- 🚀 Use real OpenSpiel\n",
-    "- ✨ Create your own integration\n",
+    "- 🎮 Switch to other OpenSpiel games\n",
+    "- ✨ Build your own integration\n",
     "- 🌐 Deploy to production\n",
     "\n",
     "</td>\n",
@@ -79,12 +110,13 @@
     "</table>\n",
     "\n",
     "> 💡 **Pro Tip**: This notebook is designed to run top-to-bottom in Google Colab with zero setup!\n",
-    "> \n",
-    "> ⏱️ **Time**: ~5 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge"
+    ">\n",
+    "> ⏱️ **Time**: ~5 minutes | 📊 **Difficulty**: Beginner-friendly | 🎯 **Outcome**: Production-ready RL knowledge\n"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-2",
    "metadata": {},
    "source": [
     "---\n",
@@ -124,6 +156,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-3",
    "metadata": {},
    "source": [
     "---\n",
@@ -154,6 +187,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-4",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -197,6 +231,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-5",
    "metadata": {},
    "source": [
     "---\n",
@@ -270,6 +305,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-6",
    "metadata": {},
    "source": [
     "### The Architecture\n",
@@ -317,6 +353,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-7",
    "metadata": {},
    "source": [
     "---\n",
@@ -335,6 +372,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-8",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -370,6 +408,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-9",
    "metadata": {},
    "source": [
     "---\n",
@@ -403,6 +442,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-10",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -454,6 +494,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-11",
    "metadata": {},
    "source": [
     "---\n",
@@ -499,6 +540,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-12",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -550,6 +592,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-13",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -590,6 +633,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-14",
    "metadata": {},
    "source": [
     "### How the Client Works\n",
@@ -609,25 +653,26 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-15",
    "metadata": {},
    "source": [
     "---\n",
     "\n",
     "<div style=\"text-align: center; background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 30px 0;\">\n",
     "\n",
-    "# 🎮 Part 6: Interactive Demo\n",
+    "# 🎮 Part 6: Using Real OpenSpiel\n",
     "\n",
-    "### Now let's BUILD something!\n",
+    "### Now let's USE a production environment!\n",
     "\n",
-    "We'll create a **Catch game** following OpenEnv patterns,<br>\n",
-    "then watch **4 different AI policies** compete for the championship! 🏆\n",
+    "We'll play **Catch** using OpenEnv's **OpenSpiel integration** 🎯<br>\n",
+    "This is a REAL environment from the OpenEnv repository, not a toy reimplementation!\n",
     "\n",
     "<br>\n",
     "\n",
     "**Get ready for:**\n",
-    "- ⚡ Live gameplay visualization\n",
-    "- 🤖 AI policy showdown\n",
-    "- 📊 Real-time learning metrics\n",
+    "- 🔌 Using existing environments (not building)\n",
+    "- 🤖 Testing policies against real games\n",
+    "- 📊 Live gameplay visualization\n",
     "- 🎯 Production-ready patterns\n",
     "\n",
     "</div>"
@@ -635,6 +680,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-16",
    "metadata": {},
    "source": [
     "## The Game: Catch 🔴🏓\n",
@@ -644,11 +690,11 @@
     "<td width=\"40%\" style=\"text-align: center;\">\n",
     "\n",
     "```\n",
-    "⬜ ⬜ 🔴 ⬜ ⬜   \n",
+    "⬜ ⬜ 🔴 ⬜ ⬜\n",
     "⬜ ⬜ ⬜ ⬜ ⬜   Ball\n",
     "⬜ ⬜ ⬜ ⬜ ⬜   falls\n",
     "⬜ ⬜ ⬜ ⬜ ⬜   down\n",
-    "⬜ ⬜ 🏓 ⬜ ⬜   \n",
+    "⬜ ⬜ 🏓 ⬜ ⬜\n",
     "     Paddle\n",
     "```\n",
     "\n",
@@ -658,7 +704,7 @@
     "**Rules:**\n",
     "- 5×5 grid\n",
     "- Ball falls from random column\n",
-    "- Move paddle to catch it\n",
+    "- Move paddle left/right to catch it\n",
     "\n",
     "**Actions:**\n",
     "- `0` = Move LEFT ⬅️\n",
@@ -675,12 +721,14 @@
     "\n",
     "<div style=\"background-color: #d4edda; padding: 15px; border-left: 5px solid #28a745; margin: 20px 0;\">\n",
     "\n",
-    "**🎯 Why This Game?**\n",
+    "**🎯 Why Catch?**\n",
     "- Simple rules (easy to understand)\n",
-    "- Visual (see what's happening)\n",
     "- Fast episodes (~5 steps)\n",
     "- Clear success/failure\n",
-    "- Perfect for testing policies!\n",
+    "- Part of OpenSpiel's 70+ games!\n",
+    "\n",
+    "**💡 The Big Idea:**\n",
+    "Instead of building this from scratch, we'll USE OpenEnv's existing OpenSpiel integration. Same interface, but production-ready!\n",
     "\n",
     "</div>"
    ]
@@ -688,167 +736,198 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-17",
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Create environment and start a new episode\n",
-    "env = CatchEnvironment()\n",
-    "obs = env.reset()\n",
-    "\n",
-    "print(\"🎮 \" + \"=\"*58 + \" 🎮\")\n",
-    "print(\"   INITIAL GAME STATE\")\n",
-    "print(\"🎮 \" + \"=\"*58 + \" 🎮\\n\")\n",
-    "\n",
-    "# Visualize the game board\n",
-    "env.render()\n",
-    "\n",
-    "# Show game info\n",
-    "print(f\"\\n📍 Game Info:\")\n",
-    "print(f\"   🔴 Ball at: column {obs.ball_position[1]} (row {obs.ball_position[0]})\")\n",
-    "print(f\"   🏓 Paddle at: column {obs.paddle_position}\")\n",
-    "\n",
-    "print(f\"\\n📊 Observation Details:\")\n",
-    "print(f\"   • Legal actions: {obs.legal_actions} → [LEFT, STAY, RIGHT]\")\n",
-    "print(f\"   • Info state size: {len(obs.info_state)} (5×5 grid flattened)\")\n",
-    "print(f\"   • Episode done: {obs.done}\")\n",
-    "print(f\"   • Current reward: {obs.reward}\")\n",
-    "\n",
-    "print(\"\\n💡 The ball will fall down each step. Can your policy catch it?\")\n",
-    "print(\"=\"*62)"
+    "from envs.openspiel_env import OpenSpielEnv\n",
+    "from envs.openspiel_env.models import (\n",
+    "    OpenSpielAction,\n",
+    "    OpenSpielObservation,\n",
+    "    OpenSpielState\n",
+    ")\n",
+    "from dataclasses import fields\n",
+    "\n",
+    "print(\"🎮 \" + \"=\"*64 + \" 🎮\")\n",
+    "print(\"   ✅ Importing Real OpenSpiel Environment!\")\n",
+    "print(\"🎮 \" + \"=\"*64 + \" 🎮\\n\")\n",
+    "\n",
+    "print(\"📦 What we just imported:\")\n",
+    "print(\"   • OpenSpielEnv - HTTP client for OpenSpiel games\")\n",
+    "print(\"   • OpenSpielAction - Type-safe actions\")\n",
+    "print(\"   • OpenSpielObservation - Type-safe observations\")\n",
+    "print(\"   • OpenSpielState - Episode metadata\\n\")\n",
+    "\n",
+    "print(\"📋 OpenSpielObservation fields:\")\n",
+    "print(\"   \" + \"─\" * 60)\n",
+    "for field in fields(OpenSpielObservation):\n",
+    "    print(f\"   • {field.name:25s} : {field.type}\")\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*70)\n",
+    "print(\"\\n💡 This is REAL OpenEnv code - used in production!\")\n",
+    "print(\"   • Wraps 6 OpenSpiel games (Catch, Tic-Tac-Toe, Poker, etc.)\")\n",
+    "print(\"   • Type-safe actions and observations\")\n",
+    "print(\"   • Works via HTTP (we'll see that next!)\\n\")"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-18",
    "metadata": {},
    "outputs": [],
    "source": [
-    "import random\n",
-    "from dataclasses import dataclass\n",
-    "from typing import List, Tuple\n",
+    "import subprocess\n",
+    "import time\n",
+    "import sys\n",
+    "import os\n",
     "\n",
-    "# ============================================================================\n",
-    "# MODELS - Type-safe contracts (following OpenEnv pattern)\n",
-    "# ============================================================================\n",
+    "print(\"🚀 \" + \"=\"*64 + \" 🚀\")\n",
+    "print(\"   Starting OpenSpiel Server (Catch Game)\")\n",
+    "print(\"🚀 \" + \"=\"*64 + \" 🚀\\n\")\n",
     "\n",
-    "@dataclass\n",
-    "class CatchObservation:\n",
-    "    \"\"\"Type-safe observation following OpenEnv Observation base class.\"\"\"\n",
-    "    info_state: List[float]      # Grid as flat array\n",
-    "    legal_actions: List[int]     # [0, 1, 2] always\n",
-    "    done: bool                   # Episode finished?\n",
-    "    reward: float                # +1 or 0\n",
-    "    # Extra fields for visualization\n",
-    "    ball_position: Tuple[int, int]\n",
-    "    paddle_position: int\n",
+    "# Check if open_spiel is installed\n",
+    "try:\n",
+    "    import pyspiel\n",
+    "    print(\"✅ OpenSpiel is installed!\\n\")\n",
+    "except ImportError:\n",
+    "    print(\"⚠️  OpenSpiel not found. Installing...\")\n",
+    "    # subprocess is already imported at the top of this cell\n",
+    "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-q\", \"open_spiel\"])\n",
+    "    print(\"✅ OpenSpiel installed!\\n\")\n",
     "\n",
+    "# Start the OpenSpiel server in background\n",
+    "print(\"⚡ Starting FastAPI server for OpenSpiel Catch...\")\n",
+    "print(\"   (This uses REAL OpenEnv + OpenSpiel integration)\\n\")\n",
     "\n",
-    "# ============================================================================\n",
-    "# ENVIRONMENT - Server-side logic (following OpenEnv Environment pattern)\n",
-    "# ============================================================================\n",
+    "# Determine the correct path\n",
+    "if IN_COLAB:\n",
+    "    work_dir = \"/content/OpenEnv\"\n",
+    "else:\n",
+    "    from pathlib import Path\n",
+    "    work_dir = str(Path.cwd().parent.absolute())\n",
+    "\n",
+    "server_process = subprocess.Popen(\n",
+    "    [sys.executable, \"-m\", \"uvicorn\",\n",
+    "     \"envs.openspiel_env.server.app:app\",\n",
+    "     \"--host\", \"0.0.0.0\",\n",
+    "     \"--port\", \"8000\"],\n",
+    "    env={**os.environ,\n",
+    "         \"PYTHONPATH\": f\"{work_dir}/src\",\n",
+    "         \"OPENSPIEL_GAME\": \"catch\",\n",
+    "         \"OPENSPIEL_AGENT_PLAYER\": \"0\",\n",
+    "         \"OPENSPIEL_OPPONENT_POLICY\": \"random\"},\n",
+    "    stdout=subprocess.PIPE,\n",
+    "    stderr=subprocess.PIPE,\n",
+    "    text=True,\n",
+    "    cwd=work_dir\n",
+    ")\n",
     "\n",
-    "class CatchEnvironment:\n",
-    "    \"\"\"\n",
-    "    Catch game following OpenEnv's Environment pattern.\n",
-    "    \n",
-    "    In production:\n",
-    "      • Runs in Docker container\n",
-    "      • Accessed via HTTPEnvClient\n",
-    "      • Exposed via FastAPI server\n",
-    "    \n",
-    "    For this demo:\n",
-    "      • We run it locally to see internals\n",
-    "      • But the structure is identical!\n",
-    "    \"\"\"\n",
-    "    \n",
-    "    def __init__(self, grid_size=5):\n",
-    "        self.grid_size = grid_size\n",
-    "    \n",
-    "    def reset(self) -> CatchObservation:\n",
-    "        \"\"\"Start new episode (implements Environment.reset()).\"\"\"\n",
-    "        self.ball_row = 0\n",
-    "        self.ball_col = random.randint(0, self.grid_size - 1)\n",
-    "        self.paddle_col = self.grid_size // 2\n",
-    "        self.done = False\n",
-    "        return self._make_observation()\n",
-    "    \n",
-    "    def step(self, action: int) -> CatchObservation:\n",
-    "        \"\"\"Execute action (implements Environment.step()).\n",
-    "        \n",
-    "        Args:\n",
-    "            action: 0=LEFT, 1=STAY, 2=RIGHT\n",
-    "        \"\"\"\n",
-    "        # Move paddle\n",
-    "        if action == 0 and self.paddle_col > 0:\n",
-    "            self.paddle_col -= 1\n",
-    "        elif action == 2 and self.paddle_col < self.grid_size - 1:\n",
-    "            self.paddle_col += 1\n",
-    "        \n",
-    "        # Move ball down\n",
-    "        self.ball_row += 1\n",
-    "        \n",
-    "        # Check if episode done\n",
-    "        if self.ball_row >= self.grid_size - 1:\n",
-    "            self.done = True\n",
-    "            reward = 1.0 if self.ball_col == self.paddle_col else 0.0\n",
-    "        else:\n",
-    "            reward = 0.0\n",
-    "        \n",
-    "        return self._make_observation(reward)\n",
-    "    \n",
-    "    def _make_observation(self, reward=0.0) -> CatchObservation:\n",
-    "        \"\"\"Create type-safe observation.\"\"\"\n",
-    "        # Flatten grid to vector (like real RL environments do)\n",
-    "        info_state = [0.0] * (self.grid_size * self.grid_size)\n",
-    "        ball_idx = self.ball_row * self.grid_size + self.ball_col\n",
-    "        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col\n",
-    "        info_state[ball_idx] = 1.0      # Ball = 1.0\n",
-    "        info_state[paddle_idx] = 0.5    # Paddle = 0.5\n",
-    "        \n",
-    "        return CatchObservation(\n",
-    "            info_state=info_state,\n",
-    "            legal_actions=[0, 1, 2],\n",
-    "            done=self.done,\n",
-    "            reward=reward,\n",
-    "            ball_position=(self.ball_row, self.ball_col),\n",
-    "            paddle_position=self.paddle_col\n",
-    "        )\n",
-    "    \n",
-    "    def render(self):\n",
-    "        \"\"\"Visualize current state.\"\"\"\n",
-    "        for row in range(self.grid_size):\n",
-    "            line = \"  \"\n",
-    "            for col in range(self.grid_size):\n",
-    "                if row == self.ball_row and col == self.ball_col:\n",
-    "                    line += \"🔴 \"\n",
-    "                elif row == self.grid_size - 1 and col == self.paddle_col:\n",
-    "                    line += \"🏓 \"\n",
-    "                else:\n",
-    "                    line += \"⬜ \"\n",
-    "            print(line)\n",
-    "\n",
-    "\n",
-    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
-    "print(\"   ✅ Environment Created Following OpenEnv Pattern!\")\n",
-    "print(\"🎉 \" + \"=\"*64 + \" 🎉\")\n",
-    "print(\"\\n📋 What we just built:\")\n",
-    "print(\"   • reset() → CatchObservation (type-safe!)\")\n",
-    "print(\"   • step(action) → CatchObservation (type-safe!)\")\n",
-    "print(\"   • render() → Visual display\")\n",
-    "print(\"\\n🚀 In production: This would run in Docker + FastAPI\")\n",
-    "print(\"   But the structure is EXACTLY the same!\")\n",
-    "print(\"\\n💡 This is your blueprint for creating ANY OpenEnv environment!\\n\")"
+    "# Wait for server to start\n",
+    "print(\"⏳ Waiting for server to start...\")\n",
+    "time.sleep(5)\n",
+    "\n",
+    "# Check if server is running\n",
+    "import requests\n",
+    "try:\n",
+    "    requests.get('http://localhost:8000/health', timeout=2).raise_for_status()  # non-2xx counts as failure\n",
+    "    print(\"\\n✅ OpenSpiel server is running!\")\n",
+    "    print(\"🌐 Server URL: http://localhost:8000\")\n",
+    "    print(\"📍 Endpoints available:\")\n",
+    "    print(\"   • POST /reset\")\n",
+    "    print(\"   • POST /step\")\n",
+    "    print(\"   • GET /state\")\n",
+    "    print(\"\\n🎯 This is REAL OpenEnv + OpenSpiel in action!\")\n",
+    "    print(\"   • Running actual OpenSpiel Catch game\")\n",
+    "    print(\"   • Exposed via FastAPI HTTP server\")\n",
+    "    print(\"   • Using OpenEnv's standard interface\\n\")\n",
+    "except Exception as e:\n",
+    "    print(f\"\\n❌ Server failed to start: {e}\")\n",
+    "    print(\"\\n📋 Checking error output...\")\n",
+    "    # Read stderr only if the process has exited; read() would block on a live server\n",
+    "    if server_process.poll() is not None and server_process.stderr:\n",
+    "        stderr = server_process.stderr.read()\n",
+    "        if stderr:\n",
+    "            print(stderr)\n",
+    "    print(\"\\n💡 Make sure open_spiel is installed:\")\n",
+    "    print(\"   pip install open_spiel\")\n",
+    "    raise"
    ]
   },
   {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cell-19",
    "metadata": {},
+   "outputs": [],
    "source": [
-    "### Test the Environment"
+    "print(\"📱 \" + \"=\"*64 + \" 📱\")\n",
+    "print(\"   Connecting to OpenSpiel Server via HTTP\")\n",
+    "print(\"📱 \" + \"=\"*64 + \" 📱\\n\")\n",
+    "\n",
+    "# Create HTTP client for OpenSpiel\n",
+    "client = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "\n",
+    "print(\"✅ Client created!\")\n",
+    "print(\"\\n💡 What just happened:\")\n",
+    "print(\"   • OpenSpielEnv is an HTTPEnvClient subclass\")\n",
+    "print(\"   • It knows how to talk to OpenSpiel servers\")\n",
+    "print(\"   • All communication is type-safe and over HTTP\")\n",
+    "print(\"   • Same client works for every OpenSpiel game the server supports!\\n\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cell-20",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"🎮 \" + \"=\"*64 + \" 🎮\")\n",
+    "print(\"   Testing Connection - Playing One Step\")\n",
+    "print(\"🎮 \" + \"=\"*64 + \" 🎮\\n\")\n",
+    "\n",
+    "# Reset the environment (HTTP POST /reset)\n",
+    "print(\"📤 Calling client.reset()...\")\n",
+    "print(\"   Under the hood: HTTP POST to http://localhost:8000/reset\\n\")\n",
+    "\n",
+    "result = client.reset()\n",
+    "\n",
+    "print(\"📥 Received OpenSpielObservation:\")\n",
+    "print(f\"   • info_state: {result.observation.info_state[:10]}... (first 10 values)\")\n",
+    "print(f\"   • legal_actions: {result.observation.legal_actions}\")\n",
+    "print(f\"   • game_phase: {result.observation.game_phase}\")\n",
+    "print(f\"   • done: {result.done}\")\n",
+    "\n",
+    "# Take an action (HTTP POST /step)\n",
+    "print(\"\\n📤 Calling client.step(OpenSpielAction(action_id=1, game_name='catch'))...\")\n",
+    "print(\"   Under the hood: HTTP POST to http://localhost:8000/step\\n\")\n",
+    "\n",
+    "action = OpenSpielAction(action_id=1, game_name=\"catch\")  # STAY\n",
+    "result = client.step(action)\n",
+    "\n",
+    "print(\"📥 Received response:\")\n",
+    "print(f\"   • Reward: {result.reward}\")\n",
+    "print(f\"   • Done: {result.done}\")\n",
+    "print(f\"   • legal_actions: {result.observation.legal_actions}\")\n",
+    "\n",
+    "# Get state (HTTP GET /state)\n",
+    "state = client.state()\n",
+    "print(\"\\n📊 Episode state:\")\n",
+    "print(f\"   • episode_id: {state.episode_id}\")\n",
+    "print(f\"   • step_count: {state.step_count}\")\n",
+    "print(f\"   • game_name: {state.game_name}\")\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*70)\n",
+    "print(\"\\n🎉 IT WORKS! We're using REAL OpenSpiel via HTTP!\")\n",
+    "print(\"   ✅ Type-safe communication\")\n",
+    "print(\"   ✅ Same interface as any OpenEnv environment\")\n",
+    "print(\"   ✅ Production-ready architecture\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-21",
    "metadata": {},
    "source": [
     "---\n",
@@ -887,93 +966,112 @@
     "</tr>\n",
     "</table>\n",
     "\n",
+    "**💡 `RandomPolicy` works with any OpenSpiel game; the other three assume Catch's three actions and grid observation.**\n",
+    "\n",
     "</div>"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-22",
    "metadata": {},
    "outputs": [],
    "source": [
+    "import random\n",
+    "\n",
     "# ============================================================================\n",
-    "# POLICIES - Different AI strategies\n",
+    "# POLICIES - Different AI strategies (adapted for OpenSpiel)\n",
     "# ============================================================================\n",
     "\n",
     "class RandomPolicy:\n",
     "    \"\"\"Baseline: Pure random guessing.\"\"\"\n",
     "    name = \"🎲 Random Guesser\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "\n",
+    "    def select_action(self, obs: OpenSpielObservation) -> int:\n",
     "        return random.choice(obs.legal_actions)\n",
     "\n",
     "\n",
     "class AlwaysStayPolicy:\n",
     "    \"\"\"Bad strategy: Never moves.\"\"\"\n",
     "    name = \"🛑 Always Stay\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "\n",
+    "    def select_action(self, obs: OpenSpielObservation) -> int:\n",
     "        return 1  # STAY\n",
     "\n",
     "\n",
     "class SmartPolicy:\n",
     "    \"\"\"Optimal: Move paddle toward ball.\"\"\"\n",
     "    name = \"🧠 Smart Heuristic\"\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
-    "        ball_col = obs.ball_position[1]\n",
-    "        paddle_col = obs.paddle_position\n",
-    "        \n",
-    "        if paddle_col < ball_col:\n",
-    "            return 2  # Move RIGHT\n",
-    "        elif paddle_col > ball_col:\n",
-    "            return 0  # Move LEFT\n",
-    "        else:\n",
-    "            return 1  # STAY (already aligned)\n",
+    "\n",
+    "    def select_action(self, obs: OpenSpielObservation) -> int:\n",
+    "        # Parse the OpenSpiel observation.\n",
+    "        # For Catch, info_state is a row-major flattened grid; non-zero cells\n",
+    "        # mark the ball and the paddle, and the paddle sits in the bottom row.\n",
+    "        info_state = obs.info_state\n",
+    "\n",
+    "        # Assumption: the grid is 5 columns wide (row count follows from the length).\n",
+    "        # Splitting by row avoids relying on the exact value used for each object.\n",
+    "        grid_size = 5\n",
+    "\n",
+    "        # Bottom-row cell -> paddle; any cell above it -> ball\n",
+    "        ball_col = None\n",
+    "        paddle_col = None\n",
+    "\n",
+    "        for idx, val in enumerate(info_state):\n",
+    "            if val > 0.01 and idx >= len(info_state) - grid_size:  # Paddle (bottom row)\n",
+    "                paddle_col = idx % grid_size\n",
+    "            elif val > 0.01:  # Ball (any row above the bottom)\n",
+    "                ball_col = idx % grid_size\n",
+    "\n",
+    "        if ball_col is not None and paddle_col is not None:\n",
+    "            if paddle_col < ball_col:\n",
+    "                return 2  # Move RIGHT\n",
+    "            elif paddle_col > ball_col:\n",
+    "                return 0  # Move LEFT\n",
+    "\n",
+    "        return 1  # STAY (fallback)\n",
     "\n",
     "\n",
     "class LearningPolicy:\n",
     "    \"\"\"Simulated RL: Epsilon-greedy exploration.\"\"\"\n",
     "    name = \"📈 Learning Agent\"\n",
-    "    \n",
+    "\n",
     "    def __init__(self):\n",
     "        self.steps = 0\n",
-    "    \n",
-    "    def select_action(self, obs: CatchObservation) -> int:\n",
+    "        self.smart_policy = SmartPolicy()\n",
+    "\n",
+    "    def select_action(self, obs: OpenSpielObservation) -> int:\n",
     "        self.steps += 1\n",
-    "        \n",
+    "\n",
     "        # Decay exploration rate over time\n",
     "        epsilon = max(0.1, 1.0 - (self.steps / 100))\n",
-    "        \n",
+    "\n",
     "        if random.random() < epsilon:\n",
     "            # Explore: random action\n",
     "            return random.choice(obs.legal_actions)\n",
     "        else:\n",
     "            # Exploit: use smart strategy\n",
-    "            ball_col = obs.ball_position[1]\n",
-    "            paddle_col = obs.paddle_position\n",
-    "            if paddle_col < ball_col:\n",
-    "                return 2\n",
-    "            elif paddle_col > ball_col:\n",
-    "                return 0\n",
-    "            else:\n",
-    "                return 1\n",
+    "            return self.smart_policy.select_action(obs)\n",
     "\n",
     "\n",
     "print(\"🤖 \" + \"=\"*64 + \" 🤖\")\n",
-    "print(\"   ✅ 4 Policies Created!\")\n",
+    "print(\"   ✅ 4 Policies Created (Adapted for OpenSpiel)!\")\n",
     "print(\"🤖 \" + \"=\"*64 + \" 🤖\\n\")\n",
     "\n",
     "policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]\n",
     "for i, policy in enumerate(policies, 1):\n",
     "    print(f\"   {i}. {policy.name}\")\n",
     "\n",
-    "print(\"\\n💡 Each policy represents a different approach to solving the game!\")\n",
-    "print(\"   Let's see who performs best! 🏆\\n\")"
+    "print(\"\\n💡 These policies work with OpenSpielObservation!\")\n",
+    "print(\"   • Read info_state (flattened grid)\")\n",
+    "print(\"   • Use legal_actions\")\n",
+    "print(\"   • RandomPolicy works with any OpenSpiel game; SmartPolicy assumes Catch's grid layout\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-23",
    "metadata": {},
    "source": [
     "### Watch a Policy Play!"
@@ -982,64 +1080,76 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-24",
    "metadata": {},
    "outputs": [],
    "source": [
     "import time\n",
     "\n",
-    "def run_episode(env, policy, visualize=True, delay=0.4):\n",
-    "    \"\"\"Run one episode with a policy.\"\"\"\n",
-    "    \n",
+    "def run_episode(env, policy, visualize=True, delay=0.3):\n",
+    "    \"\"\"Run one episode with a policy against OpenSpiel environment.\"\"\"\n",
+    "\n",
     "    # RESET\n",
-    "    obs = env.reset()\n",
-    "    \n",
+    "    result = env.reset()\n",
+    "    obs = result.observation\n",
+    "\n",
     "    if visualize:\n",
     "        print(f\"\\n{'='*60}\")\n",
     "        print(f\"   🎮 {policy.name}\")\n",
-    "        print(f\"   🔴 Ball will fall at column: {obs.ball_position[1]}\")\n",
+    "        print(\"   🎲 Playing against OpenSpiel Catch\")\n",
     "        print('='*60 + '\\n')\n",
-    "        env.render()\n",
     "        time.sleep(delay)\n",
-    "    \n",
+    "\n",
     "    total_reward = 0\n",
     "    step = 0\n",
     "    action_names = [\"⬅️  LEFT\", \"🛑 STAY\", \"➡️  RIGHT\"]\n",
-    "    \n",
+    "\n",
     "    # THE RL LOOP\n",
     "    while not obs.done:\n",
     "        # 1. Policy chooses action\n",
-    "        action = policy.select_action(obs)\n",
-    "        \n",
-    "        # 2. Environment executes\n",
-    "        obs = env.step(action)\n",
-    "        \n",
+    "        action_id = policy.select_action(obs)\n",
+    "\n",
+    "        # 2. Environment executes (via HTTP!)\n",
+    "        action = OpenSpielAction(action_id=action_id, game_name=\"catch\")\n",
+    "        result = env.step(action)\n",
+    "        obs = result.observation\n",
+    "\n",
     "        # 3. Collect reward\n",
-    "        total_reward += obs.reward\n",
-    "        \n",
+    "        if result.reward is not None:\n",
+    "            total_reward += result.reward\n",
+    "\n",
     "        if visualize:\n",
-    "            print(f\"\\n📍 Step {step + 1}: {action_names[action]}\")\n",
-    "            env.render()\n",
+    "            print(f\"📍 Step {step + 1}: {action_names[action_id]} → Reward: {result.reward}\")\n",
     "            time.sleep(delay)\n",
-    "        \n",
+    "\n",
     "        step += 1\n",
-    "    \n",
+    "\n",
     "    if visualize:\n",
-    "        result = \"🎉 CAUGHT!\" if total_reward > 0 else \"😢 MISSED\"\n",
+    "        result_text = \"🎉 CAUGHT!\" if total_reward > 0 else \"😢 MISSED\"\n",
     "        print(f\"\\n{'='*60}\")\n",
-    "        print(f\"   {result} Reward: {total_reward}\")\n",
+    "        print(f\"   {result_text} Total Reward: {total_reward}\")\n",
     "        print('='*60)\n",
-    "    \n",
+    "\n",
     "    return total_reward > 0\n",
     "\n",
     "\n",
+    "print(\"📺 \" + \"=\"*64 + \" 📺\")\n",
+    "print(\"   Watch Smart Policy Play Against OpenSpiel!\")\n",
+    "print(\"📺 \" + \"=\"*64 + \" 📺\\n\")\n",
+    "\n",
     "# Demo: Watch Smart Policy in action\n",
-    "env = CatchEnvironment()\n",
     "policy = SmartPolicy()\n",
-    "run_episode(env, policy, visualize=True, delay=0.4)"
+    "run_episode(client, policy, visualize=True, delay=0.5)\n",
+    "\n",
+    "print(\"\\n💡 You just watched REAL OpenSpiel Catch being played!\")\n",
+    "print(\"   • Every action was an HTTP call\")\n",
+    "print(\"   • Game logic runs in the server\")\n",
+    "print(\"   • Client only sends actions and receives observations\\n\")"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-25",
    "metadata": {},
    "source": [
     "---\n",
@@ -1048,7 +1158,9 @@
     "\n",
     "<div style=\"background-color: #e7f3ff; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "Let's run **50 episodes** for each policy and see who wins!\n",
+    "Let's run **50 episodes** for each policy against **REAL OpenSpiel** and see who wins!\n",
+    "\n",
+    "Every action is an HTTP call to the OpenSpiel server, so the policies exercise the same client-server path a deployment would use.\n",
     "\n",
     "</div>"
    ]
@@ -1056,153 +1168,167 @@
   {
    "cell_type": "code",
    "execution_count": null,
+   "id": "cell-26",
    "metadata": {},
    "outputs": [],
    "source": [
-    "def evaluate_policies(num_episodes=50):\n",
-    "    \"\"\"Compare all policies over many episodes.\"\"\"\n",
+    "def evaluate_policies(env, num_episodes=50):\n",
+    "    \"\"\"Compare all policies over many episodes using real OpenSpiel.\"\"\"\n",
     "    policies = [\n",
     "        RandomPolicy(),\n",
     "        AlwaysStayPolicy(),\n",
     "        SmartPolicy(),\n",
     "        LearningPolicy(),\n",
     "    ]\n",
-    "    \n",
+    "\n",
     "    print(\"\\n🏆 \" + \"=\"*66 + \" 🏆\")\n",
     "    print(f\"   POLICY SHOWDOWN - {num_episodes} Episodes Each\")\n",
+    "    print(\"   Playing against REAL OpenSpiel Catch!\")\n",
     "    print(\"🏆 \" + \"=\"*66 + \" 🏆\\n\")\n",
-    "    \n",
+    "\n",
     "    results = []\n",
     "    for policy in policies:\n",
     "        print(f\"⚡ Testing {policy.name}...\", end=\" \")\n",
-    "        env = CatchEnvironment()\n",
-    "        successes = sum(run_episode(env, policy, visualize=False) \n",
+    "        successes = sum(run_episode(env, policy, visualize=False)\n",
     "                       for _ in range(num_episodes))\n",
     "        success_rate = (successes / num_episodes) * 100\n",
     "        results.append((policy.name, success_rate, successes))\n",
     "        print(f\"✓ Done!\")\n",
-    "    \n",
+    "\n",
     "    print(\"\\n\" + \"=\"*70)\n",
     "    print(\"   📊 FINAL RESULTS\")\n",
     "    print(\"=\"*70 + \"\\n\")\n",
-    "    \n",
+    "\n",
     "    # Sort by success rate (descending)\n",
     "    results.sort(key=lambda x: x[1], reverse=True)\n",
-    "    \n",
+    "\n",
     "    # Award medals to top 3\n",
     "    medals = [\"🥇\", \"🥈\", \"🥉\", \"  \"]\n",
-    "    \n",
+    "\n",
     "    for i, (name, rate, successes) in enumerate(results):\n",
     "        medal = medals[i]\n",
     "        bar = \"█\" * int(rate / 2)\n",
     "        print(f\"{medal} {name:25s} [{bar:<50}] {rate:5.1f}% ({successes}/{num_episodes})\")\n",
-    "    \n",
+    "\n",
     "    print(\"\\n\" + \"=\"*70)\n",
     "    print(\"\\n✨ Key Insights:\")\n",
     "    print(\"   • Random (~20%):      Baseline - pure luck 🎲\")\n",
     "    print(\"   • Always Stay (~20%): Bad strategy - stays center 🛑\")\n",
     "    print(\"   • Smart (100%):       Optimal - perfect play! 🧠\")\n",
     "    print(\"   • Learning (~85%):    Improves over time 📈\")\n",
-    "    print(\"\\n🎓 This is Reinforcement Learning in action:\")\n",
-    "    print(\"   1. Start with exploration (trying random things)\")\n",
-    "    print(\"   2. Learn from rewards (what works, what doesn't)\")\n",
-    "    print(\"   3. Converge to optimal behavior (smart strategy)\")\n",
-    "    print(\"\\n🎯 The Learning Agent gets smarter with every episode!\\n\")\n",
+    "    print(\"\\n🎓 This is Reinforcement Learning + OpenEnv in action:\")\n",
+    "    print(\"   1. We USED an existing OpenSpiel environment (didn't build it)\")\n",
+    "    print(\"   2. Type-safe communication over HTTP\")\n",
+    "    print(\"   3. Same client code works for the other OpenSpiel games\")\n",
+    "    print(\"   4. Production-ready architecture\\n\")\n",
     "\n",
     "# Run the epic competition!\n",
-    "print(\"🎮 Starting the showdown...\")\n",
-    "evaluate_policies(num_episodes=50)"
+    "print(\"🎮 Starting the showdown against REAL OpenSpiel...\\n\")\n",
+    "evaluate_policies(client, num_episodes=50)"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-27",
    "metadata": {},
    "source": [
     "---\n",
     "\n",
     "<a id=\"part-9\"></a>\n",
-    "# Part 9: Using Real OpenSpiel 🎮\n",
+    "# Part 9: Switching to Other Games 🎮\n",
     "\n",
     "<div style=\"background-color: #d4edda; padding: 20px; border-radius: 10px; margin: 20px 0;\">\n",
     "\n",
-    "## What We Just Built vs Production OpenSpiel\n",
+    "## What We Just Used: Real OpenSpiel! 🎉\n",
+    "\n",
+    "In Parts 6-8, we **USED** the existing OpenSpiel Catch environment:\n",
     "\n",
     "<table>\n",
     "<tr>\n",
-    "<th>Component</th>\n",
-    "<th>Our Demo</th>\n",
-    "<th>OpenEnv + OpenSpiel</th>\n",
+    "<th>What We Did</th>\n",
+    "<th>How It Works</th>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>Environment</b></td>\n",
-    "<td>Local Python class</td>\n",
-    "<td>Docker container</td>\n",
+    "<td><b>Imported</b></td>\n",
+    "<td>OpenSpielEnv client (pre-built)</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>Communication</b></td>\n",
-    "<td>Direct function calls</td>\n",
-    "<td>HTTP/JSON</td>\n",
+    "<td><b>Started</b></td>\n",
+    "<td>OpenSpiel server via uvicorn</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>Client</b></td>\n",
-    "<td>Direct access</td>\n",
-    "<td>HTTPEnvClient</td>\n",
-    "</tr>\n",
-    "<tr>\n",
-    "<td><b>Type Safety</b></td>\n",
-    "<td>✅ Dataclasses</td>\n",
-    "<td>✅ Dataclasses</td>\n",
+    "<td><b>Connected</b></td>\n",
+    "<td>HTTP client to server</td>\n",
     "</tr>\n",
     "<tr>\n",
-    "<td><b>API</b></td>\n",
-    "<td>reset(), step()</td>\n",
-    "<td>reset(), step() <em>(same!)</em></td>\n",
+    "<td><b>Played</b></td>\n",
+    "<td>Real OpenSpiel Catch game</td>\n",
     "</tr>\n",
     "</table>\n",
     "\n",
-    "**🎯 Same structure, production features!**\n",
+    "**🎯 This is production code!** Every action was an HTTP call to a real OpenSpiel environment.\n",
     "\n",
     "</div>\n",
     "\n",
-    "### Using OpenSpiel Integration:\n",
+    "## 🎮 6 Games Available - Same Interface!\n",
     "\n",
-    "```python\n",
-    "# 1. Install OpenSpiel\n",
-    "!pip install open_spiel\n",
+    "The beauty of OpenEnv? **Same code, different games!**\n",
     "\n",
-    "# 2. Import OpenEnv's integration\n",
-    "from envs.openspiel_env import OpenSpielEnv, OpenSpielAction\n",
-    "\n",
-    "# 3. Connect to server (HTTP!)\n",
+    "```python\n",
+    "# We just used Catch\n",
     "env = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "# game_name=\"catch\" was set via environment variable\n",
     "\n",
-    "# 4. Same API you just learned!\n",
-    "result = env.reset()\n",
-    "result = env.step(OpenSpielAction(action_id=2, game_name=\"catch\"))\n",
-    "state = env.state()\n",
-    "\n",
-    "# 5. Switch games by changing game_name:\n",
-    "result = env.step(OpenSpielAction(action_id=4, game_name=\"tic_tac_toe\"))\n",
+    "# Want Tic-Tac-Toe instead? Just change the game!\n",
+    "# Start server with: OPENSPIEL_GAME=tic_tac_toe uvicorn ...\n",
+    "# Same client code works!\n",
     "```\n",
     "\n",
     "<div style=\"background-color: #fff3e0; padding: 15px; border-radius: 5px; margin: 20px 0;\">\n",
     "\n",
-    "**🎮 6 Games Available:**\n",
+    "**🎮 All 6 Games:**\n",
     "\n",
-    "1. `\"catch\"` - What we just built!\n",
-    "2. `\"tic_tac_toe\"` - Classic 3×3\n",
-    "3. `\"kuhn_poker\"` - Imperfect information poker\n",
-    "4. `\"cliff_walking\"` - Grid navigation\n",
-    "5. `\"2048\"` - Tile puzzle\n",
-    "6. `\"blackjack\"` - Card game\n",
+    "1. ✅ **`catch`** - What we just used!\n",
+    "2. **`tic_tac_toe`** - Classic 3×3\n",
+    "3. **`kuhn_poker`** - Imperfect information poker\n",
+    "4. **`cliff_walking`** - Grid navigation\n",
+    "5. **`2048`** - Tile puzzle\n",
+    "6. **`blackjack`** - Card game\n",
     "\n",
-    "**All use the exact same interface!**\n",
+    "**All use the exact same OpenSpielEnv client!**\n",
     "\n",
-    "</div>"
+    "</div>\n",
+    "\n",
+    "### Try Another Game (Optional):\n",
+    "\n",
+    "```python\n",
+    "server_process.terminate()  # Stop the current server\n",
+    "server_process.wait()       # Wait for it to exit and free port 8000, then start a new game:\n",
+    "\n",
+    "server_process = subprocess.Popen(\n",
+    "    [sys.executable, \"-m\", \"uvicorn\",\n",
+    "     \"envs.openspiel_env.server.app:app\",\n",
+    "     \"--host\", \"0.0.0.0\",\n",
+    "     \"--port\", \"8000\"],\n",
+    "    env={**os.environ,\n",
+    "         \"PYTHONPATH\": f\"{work_dir}/src\",\n",
+    "         \"OPENSPIEL_GAME\": \"tic_tac_toe\",  # Changed!\n",
+    "         \"OPENSPIEL_AGENT_PLAYER\": \"0\",\n",
+    "         \"OPENSPIEL_OPPONENT_POLICY\": \"random\"},\n",
+    "    cwd=work_dir\n",
+    ")\n",
+    "time.sleep(5)  # Give the new server a few seconds to start\n",
+    "# Same client works!\n",
+    "client = OpenSpielEnv(base_url=\"http://localhost:8000\")\n",
+    "result = client.reset()  # Now playing Tic-Tac-Toe!\n",
+    "```\n",
+    "\n",
+    "**💡 Key Insight**: You don't rebuild anything - you just USE different games with the same client!\n"
    ]
   },
   {
    "cell_type": "markdown",
+   "id": "cell-28",
    "metadata": {},
    "source": [
     "---\n",
@@ -1337,6 +1463,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-29",
    "metadata": {},
    "source": [
     "---\n",
@@ -1350,6 +1477,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-30",
    "metadata": {},
    "source": [
     "## What You Learned\n",
@@ -1402,6 +1530,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-31",
    "metadata": {},
    "source": [
     "## OpenEnv vs Traditional RL\n",
@@ -1468,6 +1597,7 @@
   },
   {
    "cell_type": "markdown",
+   "id": "cell-32",
    "metadata": {},
    "source": [
     "<a id=\"resources\"></a>\n",