<div align="center">

# 🎮 OpenEnv: Production-Ready RL Environments

<img src="https://github.com/user-attachments/assets/2700a971-e5d6-4036-b03f-2f89c9791609" width="100" />

**Learn how OpenEnv standardizes RL environments for production use**

[![GitHub](https://img.shields.io/badge/GitHub-meta--pytorch%2FOpenEnv-blue?logo=github)](https://github.com/meta-pytorch/OpenEnv)
[![Python](https://img.shields.io/badge/Python-3.11+-blue?logo=python)](https://www.python.org/)
[![Docker](https://img.shields.io/badge/Docker-Ready-blue?logo=docker)](https://www.docker.com/)

</div>

---

## 📚 What You'll Learn

<table>
<tr>
<td width="20%" align="center">🧠<br><b>RL Fundamentals</b><br><sub>5 minutes</sub></td>
<td width="20%" align="center">🏗️<br><b>OpenEnv Framework</b><br><sub>Architecture</sub></td>
<td width="20%" align="center">🔌<br><b>Integrations</b><br><sub>OpenSpiel example</sub></td>
<td width="20%" align="center">🎯<br><b>Interactive Demo</b><br><sub>See it work</sub></td>
<td width="20%" align="center">➕<br><b>Add Your Own</b><br><sub>Extend it</sub></td>
</tr>
</table>

---

## 🧠 Part 1: RL Fundamentals - The Core Loop

<div align="center">
<table style="border: none; margin: 20px auto;">
<tr>
<td style="background: #e1f5ff; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #4facfe;">
<b>🤖 Agent</b><br><small>learns</small>
</td>
<td style="font-size: 24px; padding: 0 10px;">↓</td>
<td style="background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;">
<b>👀 State</b><br><small>observes</small>
</td>
</tr>
<tr>
<td style="font-size: 24px; text-align: center;">↑</td>
<td></td>
<td style="font-size: 24px; text-align: center;">↓</td>
</tr>
<tr>
<td style="background: #ffe1e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #f5576c;">
<b>🎁 Reward</b><br><small>returns</small>
</td>
<td style="font-size: 24px; padding: 0 10px;">←</td>
<td style="background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;">
<b>⚡ Action</b><br><small>decides</small>
</td>
</tr>
<tr>
<td style="font-size: 24px; text-align: center;">↑</td>
<td></td>
<td style="font-size: 24px; text-align: center;">↓</td>
</tr>
<tr>
<td colspan="3" style="background: #fff4e1; padding: 15px; border-radius: 10px; text-align: center; font-size: 18px; border: 2px solid #ffc107;">
<b>🌍 Environment</b><br><small>executes</small>
</td>
</tr>
</table>
</div>

Reinforcement Learning boils down to a simple loop:

```
Agent observes → chooses action → gets reward → repeat
```

Let's see it in action with a simple example:

In [None]:
import random

# Simple RL: Guess a number
target = random.randint(1, 10)
guesses = 3

print("🎯 Guess a number (1-10)\n")

while guesses > 0:
    guess = random.randint(1, 10)  # Policy: random
    guesses -= 1
    
    print(f"Guess: {guess}", end=" → ")
    
    if guess == target:
        print("🎉 Correct! Reward: +1")
        break
    elif abs(guess - target) <= 2:
        print("🔥 Warm")
    else:
        print("❄️ Cold")
else:
    print(f"\nIt was {target}. Reward: 0")

print("\n💡 That's RL: observe → act → reward → repeat")

<div style="background-color: #fff3cd; border-left: 4px solid #ffc107; padding: 15px; margin: 20px 0;">
    <h3 style="margin-top: 0;">⚠️ The Problem</h3>
    <p>How do we make this production-ready?</p>
    <ul>
        <li>❌ Need type safety</li>
        <li>❌ Need isolation</li>
        <li>❌ Need deployment</li>
        <li>❌ Need standardization</li>
    </ul>
</div>

<div style="background-color: #d4edda; border-left: 4px solid #28a745; padding: 15px; margin: 20px 0;">
    <h3 style="margin-top: 0;">✅ The Solution: OpenEnv</h3>
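    <p>A framework that addresses each of these: typed dataclass models, Docker isolation, deployment behind an HTTP API, and a standard <code>reset</code>/<code>step</code>/<code>state</code> interface.</p>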
</div>

---

## 🏗️ Part 2: OpenEnv - The Framework

<div align="center">
    <h3>🚀 Think "Docker for RL Environments"</h3>
</div>

### ✨ What is OpenEnv?

OpenEnv is a **framework for creating, deploying, and using isolated RL environments**.

<table>
<tr>
<td align="center">✅<br><b>Standardized API</b><br><sub>reset, step, state</sub></td>
<td align="center">🔒<br><b>Type-safe</b><br><sub>dataclasses</sub></td>
<td align="center">🐳<br><b>Docker isolation</b><br><sub>secure</sub></td>
<td align="center">🌐<br><b>HTTP API</b><br><sub>any language</sub></td>
<td align="center">☸️<br><b>Production-ready</b><br><sub>K8s deploy</sub></td>
</tr>
</table>

### 🎨 The Architecture

<div style="margin: 30px 0;">
<table style="width: 100%; border: none;">
<tr>
<td colspan="3" style="background: linear-gradient(135deg, #e1f5ff 0%, #b3e0ff 100%); padding: 20px; border-radius: 10px; border: 3px solid #4facfe;">
<div style="text-align: center;">
<h4 style="margin: 5px 0;">💻 Your Training Code (Client)</h4>
<code>env = OpenSpielEnv()</code><br>
<code>result = env.reset()</code><br>
<code>result = env.step(action)</code>
</div>
</td>
</tr>
<tr>
<td colspan="3" style="text-align: center; font-size: 32px; padding: 10px;">↓</td>
</tr>
<tr>
<td colspan="3" style="background: linear-gradient(135deg, #fff4e1 0%, #ffe4b3 100%); padding: 20px; border-radius: 10px; border: 3px solid #ffc107;">
<div style="text-align: center;">
<h4 style="margin: 5px 0;">🌐 HTTP/JSON Protocol</h4>
<code>POST /reset</code> | <code>POST /step</code> | <code>GET /state</code>
</div>
</td>
</tr>
<tr>
<td colspan="3" style="text-align: center; font-size: 32px; padding: 10px;">↓</td>
</tr>
<tr>
<td colspan="3" style="background: linear-gradient(135deg, #ffe1f5 0%, #ffb3e6 100%); padding: 20px; border-radius: 10px; border: 3px solid #f093fb;">
<div style="text-align: center;">
<h4 style="margin: 5px 0;">🐳 Docker Container (Server)</h4>
⚡ FastAPI Server → 🎮 Environment Logic → 🎯 Game/Simulation
</div>
</td>
</tr>
</table>
</div>
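
To make the three layers concrete, here is a minimal, self-contained sketch of the same request flow using only the Python standard library. The endpoint paths (`POST /reset`, `POST /step`, `GET /state`) come from the diagram; everything else is an assumption for illustration: the toy counter "environment", the JSON field names (`observation`, `reward`, `done`, `step_count`), and the helper `call`. A real OpenEnv server uses FastAPI and its own payload schema.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CounterHandler(BaseHTTPRequestHandler):
    """Toy server side: the 'environment' is just a step counter (hypothetical)."""

    steps = 0  # shared episode state; good enough for a single-client demo

    def _send_json(self, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/reset":
            CounterHandler.steps = 0
            self._send_json({"observation": {"steps": 0}, "reward": 0.0, "done": False})
        elif self.path == "/step":
            CounterHandler.steps += 1
            done = CounterHandler.steps >= 3  # episode ends after three steps
            self._send_json({
                "observation": {"steps": CounterHandler.steps, "echo": payload.get("action")},
                "reward": 1.0 if done else 0.0,
                "done": done,
            })
        else:
            self.send_error(404)

    def do_GET(self):
        if self.path == "/state":
            self._send_json({"step_count": CounterHandler.steps})
        else:
            self.send_error(404)

    def log_message(self, *args):  # silence per-request logging
        pass


# Client side: plain JSON over HTTP, bypassing any system proxy for localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call(base_url: str, method: str, path: str, body=None) -> dict:
    data = json.dumps(body or {}).encode("utf-8") if method == "POST" else None
    req = urllib.request.Request(base_url + path, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with opener.open(req) as resp:
        return json.loads(resp.read())


server = HTTPServer(("127.0.0.1", 0), CounterHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f"http://127.0.0.1:{server.server_address[1]}"

first = call(base_url, "POST", "/reset")
last = first
while not last["done"]:
    last = call(base_url, "POST", "/step", {"action": 1})
state = call(base_url, "GET", "/state")

server.shutdown()
server.server_close()
print(first, last, state)
```

Because the contract is only HTTP plus JSON, the client half of this sketch could be rewritten in any language without touching the server.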

### 📁 The Pattern - Every Environment Has:

```
src/envs/your_env/
├── 📝 models.py         ← Type-safe contracts (Action, Observation, State)
├── 📱 client.py         ← Client API (what you import)
└── 🖥️ server/
    ├── environment.py  ← Environment logic
    ├── app.py          ← FastAPI server
    └── Dockerfile      ← Container
```

### 🎮 Current Integrations

<div style="display: flex; flex-wrap: wrap; gap: 10px; margin: 20px 0;">
    <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;">
        <h4>🎯 OpenSpiel</h4>
        <p>6 games from DeepMind</p>
    </div>
    <div style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;">
        <h4>📢 Echo</h4>
        <p>Test environment</p>
    </div>
    <div style="background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;">
        <h4>💻 Coding</h4>
        <p>Python execution</p>
    </div>
    <div style="background: linear-gradient(135deg, #43e97b 0%, #38f9d7 100%); color: white; padding: 15px; border-radius: 10px; flex: 1; min-width: 200px;">
        <h4>🕹️ Atari</h4>
        <p>Classic games</p>
    </div>
</div>

Let's explore one integration to see how it all works...

---

## ⚙️ Part 3: Setup

<div align="center">
    <h3>🔧 Getting Started</h3>
</div>

In [None]:
# Check if in Colab
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    !git clone https://github.com/meta-pytorch/OpenEnv.git
    %cd OpenEnv
    !pip install -q fastapi uvicorn requests
    import sys
    sys.path.insert(0, './src')
    print("✅ OpenEnv ready!")
else:
    import sys
    from pathlib import Path
    sys.path.insert(0, str(Path.cwd() / 'src'))
    print("✅ Using local OpenEnv")

---

## 🔍 Part 4: Exploring OpenEnv's Structure

<div align="center">
    <h3>Let's look at the actual code!</h3>
</div>

### 🧩 The Base Classes

In [None]:
from core.env_server import Environment, Action, Observation, State
from core.http_env_client import HTTPEnvClient

print("=" * 70)
print("🔧 OpenEnv Core Abstractions")
print("=" * 70)

print("""
🖥️  SERVER SIDE (runs in Docker):

  class Environment(ABC):
      '''Base class for all environment implementations'''
      
      @abstractmethod
      def reset(self) -> Observation:
          '''Start new episode'''
      
      @abstractmethod
      def step(self, action: Action) -> Observation:
          '''Execute action'''
      
      @property
      def state(self) -> State:
          '''Episode metadata'''

📱 CLIENT SIDE (your training code):

  class HTTPEnvClient(ABC):
      '''Base class for HTTP clients'''
      
      def reset(self) -> StepResult:
          # HTTP POST to /reset
      
      def step(self, action) -> StepResult:
          # HTTP POST to /step
      
      def state(self) -> State:
          # HTTP GET to /state
""")

print("=" * 70)
print("💡 Same interface, communication via HTTP!")
print("=" * 70)
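
As a rough illustration of the server-side contract, the sketch below implements a tiny environment against stand-in base classes. The `Base*` classes are simplified stand-ins written for this cell so it runs without OpenEnv installed; their fields (`done`, `reward`, `step_count`) are assumptions, and the countdown environment itself is hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Stand-ins for the core types (assumed shapes, not the real OpenEnv classes).
@dataclass
class BaseAction:
    pass


@dataclass
class BaseObservation:
    done: bool = False
    reward: float = 0.0


@dataclass
class BaseState:
    step_count: int = 0


class BaseEnvironment(ABC):
    @abstractmethod
    def reset(self) -> BaseObservation: ...

    @abstractmethod
    def step(self, action: BaseAction) -> BaseObservation: ...

    @property
    @abstractmethod
    def state(self) -> BaseState: ...


# Hypothetical example: count down to zero, reward +1 on reaching it.
@dataclass
class CountdownAction(BaseAction):
    decrement: int = 1


@dataclass
class CountdownObservation(BaseObservation):
    remaining: int = 0


class CountdownEnvironment(BaseEnvironment):
    def __init__(self, start: int = 3):
        self._start = start
        self._remaining = start
        self._state = BaseState()

    def reset(self) -> CountdownObservation:
        self._remaining = self._start
        self._state = BaseState(step_count=0)
        return CountdownObservation(remaining=self._remaining)

    def step(self, action: CountdownAction) -> CountdownObservation:
        self._remaining = max(0, self._remaining - action.decrement)
        self._state.step_count += 1
        done = self._remaining == 0
        return CountdownObservation(done=done, reward=1.0 if done else 0.0,
                                    remaining=self._remaining)

    @property
    def state(self) -> BaseState:
        return self._state


env = CountdownEnvironment(start=3)
obs = env.reset()
while not obs.done:
    obs = env.step(CountdownAction(decrement=2))
print(obs, env.state)
```

The split mirrors the abstractions printed above: `reset()` and `step()` return observations, while `state` carries episode metadata that is not part of what the agent observes.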

---

## 🔌 Part 5: Example Integration - OpenSpiel

<div align="center">
    <img src="https://img.shields.io/badge/OpenSpiel-DeepMind-red?style=for-the-badge" />
    <h3>70+ Game Environments</h3>
</div>

### 🎮 What is OpenSpiel?

OpenSpiel is a **library from DeepMind** with 70+ game environments for RL research.

### 🎯 Our Integration

**OpenEnv wraps 6 OpenSpiel games** following our standard pattern:

<table>
<tr>
<td align="center">🎯<br><b>Catch</b><br><sub>Catch falling ball</sub></td>
<td align="center">❌<br><b>Tic-Tac-Toe</b><br><sub>Classic 3×3</sub></td>
<td align="center">🃏<br><b>Kuhn Poker</b><br><sub>Imperfect info</sub></td>
</tr>
<tr>
<td align="center">🏔️<br><b>Cliff Walking</b><br><sub>Grid navigation</sub></td>
<td align="center">🔢<br><b>2048</b><br><sub>Tile puzzle</sub></td>
<td align="center">🂡<br><b>Blackjack</b><br><sub>Card game</sub></td>
</tr>
</table>

Let's see how the integration is structured:

In [None]:
from envs.openspiel_env.models import (
    OpenSpielAction,
    OpenSpielObservation,
    OpenSpielState
)
from dataclasses import fields

print("=" * 70)
print("🔒 OpenSpiel Integration - Type-Safe Models")
print("=" * 70)

print("\n📤 OpenSpielAction (what you send):")
for field in fields(OpenSpielAction):
    print(f"   • {field.name}: {field.type}")

print("\n📥 OpenSpielObservation (what you receive):")
for field in fields(OpenSpielObservation):
    print(f"   • {field.name}: {field.type}")

print("\n📊 OpenSpielState (episode metadata):")
for field in fields(OpenSpielState):
    print(f"   • {field.name}: {field.type}")

print("\n" + "=" * 70)
print("💡 This is how OpenEnv integrates external libraries:")
print("   1. Wrap in standardized types")
print("   2. Expose via HTTPEnvClient")
print("   3. Package in Docker")
print("=" * 70)

### 🔧 How the Client Works

In [None]:
from envs.openspiel_env.client import OpenSpielEnv

print("=" * 70)
print("📱 OpenSpielEnv Client (HTTPEnvClient Implementation)")
print("=" * 70)

print("""
How OpenEnv wraps OpenSpiel:

class OpenSpielEnv(HTTPEnvClient[OpenSpielAction, OpenSpielObservation]):
    
    def _step_payload(self, action: OpenSpielAction) -> dict:
        '''Convert action to JSON for HTTP request'''
        return {
            "action_id": action.action_id,
            "game_name": action.game_name,
        }
    
    def _parse_result(self, payload: dict) -> StepResult:
        '''Parse HTTP response into typed observation'''
        return StepResult(
            observation=OpenSpielObservation(...),
            reward=payload['reward'],
            done=payload['done']
        )

Usage (same for ALL OpenEnv environments):

  env = OpenSpielEnv(base_url="http://localhost:8000")
  result = env.reset()  # Returns StepResult[OpenSpielObservation]
  result = env.step(OpenSpielAction(action_id=2, game_name="catch"))
  state = env.state()   # Returns OpenSpielState
""")

print("=" * 70)
print("💡 This pattern works for ANY environment you want to wrap!")
print("=" * 70)
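
The two hooks can be exercised without a server by simulating the wire with `json`. This is a sketch under assumptions: the `Sketch*` dataclasses are simplified stand-ins for the real models, and the response layout (an `observation` object next to `reward` and `done`) is assumed for illustration rather than taken from the OpenEnv source.

```python
import json
from dataclasses import dataclass
from typing import Generic, List, Optional, TypeVar

ObsT = TypeVar("ObsT")


# Stand-in types (hypothetical, simplified).
@dataclass
class SketchAction:
    action_id: int
    game_name: str = "catch"


@dataclass
class SketchObservation:
    info_state: List[float]
    legal_actions: List[int]


@dataclass
class SketchStepResult(Generic[ObsT]):
    observation: ObsT
    reward: Optional[float] = None
    done: bool = False


def step_payload(action: SketchAction) -> dict:
    """Typed action -> plain dict that can be sent as JSON."""
    return {"action_id": action.action_id, "game_name": action.game_name}


def parse_result(payload: dict) -> SketchStepResult[SketchObservation]:
    """Plain dict from the response -> typed result."""
    obs = payload["observation"]
    return SketchStepResult(
        observation=SketchObservation(
            info_state=list(obs["info_state"]),
            legal_actions=list(obs["legal_actions"]),
        ),
        reward=payload.get("reward"),
        done=bool(payload.get("done", False)),
    )


# Simulate one round trip: request body out, response body back in.
request_body = json.dumps(step_payload(SketchAction(action_id=2)))
response_body = json.dumps({
    "observation": {"info_state": [0.0, 1.0], "legal_actions": [0, 1, 2]},
    "reward": 0.0,
    "done": False,
})
result = parse_result(json.loads(response_body))
print(request_body)
print(result)
```

Keeping serialization in these two small functions means the training code only ever touches typed objects, never raw dictionaries.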

---

## 🎯 Part 6: Interactive Demo - See It In Action

<div align="center">
    <h2>🎮 Let's Build the Catch Game!</h2>
</div>

### 🎲 The Game Rules:

<table>
<tr>
<td width="25%" align="center">📐<br><b>5×5 Grid</b></td>
<td width="25%" align="center">🔴<br><b>Ball falls</b></td>
<td width="25%" align="center">🏓<br><b>Catch it!</b></td>
<td width="25%" align="center">🎁<br><b>+1 reward</b></td>
</tr>
</table>

**Actions**: 0=LEFT ⬅️ | 1=STAY ⏸️ | 2=RIGHT ➡️

In [None]:
import random
from dataclasses import dataclass
from typing import List, Tuple

# Define types (following OpenEnv pattern)
@dataclass
class CatchObservation:
    """Type-safe observation."""
    info_state: List[float]
    legal_actions: List[int]
    done: bool
    reward: float
    ball_position: Tuple[int, int]
    paddle_position: int


class CatchEnvironment:
    """
    Catch game following OpenEnv Environment pattern.
    
    In production: This would run in Docker, accessed via HTTPEnvClient
    For demo: We run it locally to see the internals
    """
    
    def __init__(self, grid_size=5):
        self.grid_size = grid_size
        # Initialize ball/paddle state so render() and step() work before an explicit reset()
        self.reset()
    
    def reset(self) -> CatchObservation:
        """Start new episode (implements Environment.reset())."""
        self.ball_row = 0
        self.ball_col = random.randint(0, self.grid_size - 1)
        self.paddle_col = self.grid_size // 2
        self.done = False
        return self._make_observation()
    
    def step(self, action: int) -> CatchObservation:
        """Execute action (implements Environment.step())."""
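        # Move the paddle, clamped to the grid (action 1 = STAY leaves it in place)
        if action == 0 and self.paddle_col > 0:
            self.paddle_col -= 1
        elif action == 2 and self.paddle_col < self.grid_size - 1:
            self.paddle_col += 1
        
        # The ball falls one row per step
        self.ball_row += 1
        
        # The episode ends when the ball reaches the paddle's row (the bottom row)
        if self.ball_row >= self.grid_size - 1:
            self.done = True
            reward = 1.0 if self.ball_col == self.paddle_col else 0.0
        else:
            reward = 0.0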
        
        return self._make_observation(reward)
    
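    def _make_observation(self, reward=0.0) -> CatchObservation:
        # Flattened grid (row-major): 0.5 marks the paddle, 1.0 marks the ball
        info_state = [0.0] * (self.grid_size * self.grid_size)
        ball_idx = self.ball_row * self.grid_size + self.ball_col
        paddle_idx = (self.grid_size - 1) * self.grid_size + self.paddle_col
        # Write the paddle first so the ball stays visible when both share a cell,
        # matching render(), which draws the ball on top
        info_state[paddle_idx] = 0.5
        info_state[ball_idx] = 1.0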
        
        return CatchObservation(
            info_state=info_state,
            legal_actions=[0, 1, 2],
            done=self.done,
            reward=reward,
            ball_position=(self.ball_row, self.ball_col),
            paddle_position=self.paddle_col
        )
    
    def render(self):
        for row in range(self.grid_size):
            line = "  "
            for col in range(self.grid_size):
                if row == self.ball_row and col == self.ball_col:
                    line += "🔴 "
                elif row == self.grid_size - 1 and col == self.paddle_col:
                    line += "🏓 "
                else:
                    line += "⬜ "
            print(line)

print("✅ Environment created following OpenEnv pattern!")
print("   🔧 Implements: reset(), step()")
print("   🔒 Returns: Type-safe observations")
print("   🐳 In production: Would run in Docker + FastAPI")

### 🧪 Test It

In [None]:
env = CatchEnvironment()
obs = env.reset()

print("🎮 Initial State:")
print("=" * 50)
env.render()
print(f"\n🔴 Ball: column {obs.ball_position[1]}")
print(f"🏓 Paddle: column {obs.paddle_position}")
print(f"⚡ Legal actions: {obs.legal_actions} (0=LEFT, 1=STAY, 2=RIGHT)")

---

## 🤖 Part 7: Different Policies

<div align="center">
    <h3>A policy maps: Observation → Action</h3>
</div>

Let's test 4 strategies, from naive to smart!

<div style="margin: 30px auto; max-width: 600px;">
<table style="width: 100%; border: none;">
<tr>
<td rowspan="4" style="background: #fff4e1; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #ffc107; vertical-align: middle;">
<b>👀 Observation</b><br><small>Ball & paddle positions</small>
</td>
<td style="font-size: 32px; text-align: center; padding: 0 20px;">→</td>
<td rowspan="4" style="background: #ffe1f5; padding: 20px; border-radius: 10px; text-align: center; font-size: 18px; border: 3px solid #f093fb; vertical-align: middle;">
<b>🤖 Policy</b><br><small>Decision maker</small>
</td>
<td style="font-size: 24px; text-align: center; padding: 0 15px;">→</td>
<td style="background: #e1ffe1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #43e97b;">
🎲 Random
</td>
</tr>
<tr>
<td style="font-size: 24px; text-align: center; padding: 0 15px;"></td>
<td style="font-size: 24px; text-align: center; padding: 0 15px;">→</td>
<td style="background: #ffe1e1; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #f5576c;">
⏸️ Always Stay
</td>
</tr>
<tr>
<td style="font-size: 24px; text-align: center; padding: 0 15px;"></td>
<td style="font-size: 24px; text-align: center; padding: 0 15px;">→</td>
<td style="background: #e1f5ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #4facfe;">
🎯 Smart
</td>
</tr>
<tr>
<td style="font-size: 24px; text-align: center; padding: 0 15px;"></td>
<td style="font-size: 24px; text-align: center; padding: 0 15px;">→</td>
<td style="background: #f5e1ff; padding: 10px; border-radius: 8px; text-align: center; border: 2px solid #b388ff;">
🧠 Learning
</td>
</tr>
</table>
</div>

In [None]:
class RandomPolicy:
    name = "🎲 Random"
    def select_action(self, obs): 
        return random.choice(obs.legal_actions)

class AlwaysStayPolicy:
    name = "⏸️ Always Stay"
    def select_action(self, obs): 
        return 1

class SmartPolicy:
    name = "🎯 Smart Heuristic"
    def select_action(self, obs):
        ball_col = obs.ball_position[1]
        paddle_col = obs.paddle_position
        if paddle_col < ball_col: return 2
        elif paddle_col > ball_col: return 0
        else: return 1

class LearningPolicy:
    name = "🧠 Learning Agent"
    def __init__(self):
        self.steps = 0
    
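    def select_action(self, obs):
        # Simulated learning: the exploration rate decays from 0.99 to 0.1
        # over the first 90 calls; nothing is actually learned from rewards.
        self.steps += 1
        epsilon = max(0.1, 1.0 - (self.steps / 100))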
        
        if random.random() < epsilon:
            return random.choice(obs.legal_actions)
        else:
            ball_col = obs.ball_position[1]
            paddle_col = obs.paddle_position
            if paddle_col < ball_col: return 2
            elif paddle_col > ball_col: return 0
            else: return 1

print("✅ 4 Policies created!")
print("   🎲 Random - Baseline")
print("   ⏸️  Always Stay - Bad strategy")
print("   🎯 Smart - Optimal heuristic")
print("   🧠 Learning - Simulated RL (decaying random exploration around the heuristic)")

### 👀 Watch Them Play

In [None]:
import time

def run_episode(env, policy, visualize=True, delay=0.4):
    obs = env.reset()
    
    if visualize:
        print(f"\n{'='*50}")
        print(f"🤖 Policy: {policy.name} | 🔴 Ball: col {obs.ball_position[1]}")
        print('='*50 + '\n')
        env.render()
        time.sleep(delay)
    
    total_reward = 0
    step = 0
    
    while not obs.done:
        action = policy.select_action(obs)
        obs = env.step(action)
        total_reward += obs.reward
        
        if visualize:
            actions = ["⬅️ LEFT", "⏸️ STAY", "➡️ RIGHT"]
            print(f"\n⚡ Step {step + 1}: {actions[action]}")
            env.render()
            time.sleep(delay)
        
        step += 1
    
    if visualize:
        print(f"\n{'🎉 CAUGHT!' if total_reward > 0 else '😢 MISSED'} Reward: {total_reward}")
    
    return total_reward > 0

# Demo
env = CatchEnvironment()
run_episode(env, SmartPolicy(), visualize=True, delay=0.3)

### 📊 Compare All Policies

In [None]:
def evaluate_policies(num_episodes=50):
    policies = [RandomPolicy(), AlwaysStayPolicy(), SmartPolicy(), LearningPolicy()]
    
    print("\n" + "="*70)
    print(f"🏆 POLICY COMPARISON ({num_episodes} episodes)")
    print("="*70 + "\n")
    
    results = []
    for policy in policies:
        env = CatchEnvironment()
        successes = sum(run_episode(env, policy, visualize=False) 
                       for _ in range(num_episodes))
        rate = (successes / num_episodes) * 100
        results.append((policy.name, rate))
        print(f"{policy.name:25s}: {rate:5.1f}%")
    
    print("\n" + "="*70)
    print("📊 VISUAL COMPARISON")
    print("="*70 + "\n")
    
    results.sort(key=lambda x: x[1], reverse=True)
    for name, rate in results:
        bar = "█" * int(rate / 2)
        print(f"{name:25s} [{bar:<50}] {rate:.1f}%")
    
    print("\n" + "="*70)
    print("💡 RL in action: Random → Learning → Optimal")
    print("="*70)

evaluate_policies(50)
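
The 🧠 Learning Agent above only simulates learning by decaying its exploration rate. For contrast, here is a minimal sketch of tabular Q-learning that learns from rewards alone. It re-implements the Catch rules as plain functions so the cell is self-contained; the function names, the state encoding `(ball_row, ball_col, paddle_col)`, and the hyperparameters are illustrative assumptions, not part of OpenEnv.

```python
import random
from collections import defaultdict

GRID = 5
ACTIONS = [0, 1, 2]  # 0=LEFT, 1=STAY, 2=RIGHT


def reset(rng):
    """Start state: ball in a random column of the top row, paddle centered."""
    return (0, rng.randrange(GRID), GRID // 2)


def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    ball_row, ball_col, paddle_col = state
    if action == 0:
        paddle_col = max(0, paddle_col - 1)
    elif action == 2:
        paddle_col = min(GRID - 1, paddle_col + 1)
    ball_row += 1
    done = ball_row >= GRID - 1
    reward = 1.0 if done and ball_col == paddle_col else 0.0
    return (ball_row, ball_col, paddle_col), reward, done


def greedy(q, state, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0, 0.0])  # Q-values per state, one per action
    for _ in range(episodes):
        state, done = reset(rng), False
        while not done:
            # Epsilon-greedy: explore with probability epsilon, otherwise exploit
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def evaluate(q, seed=1):
    """Run the greedy policy once per starting ball column; return the catch rate."""
    rng = random.Random(seed)
    caught = 0
    for ball_col in range(GRID):
        state, done, reward = (0, ball_col, GRID // 2), False, 0.0
        while not done:
            state, reward, done = step(state, greedy(q, state, rng))
        caught += int(reward > 0)
    return caught / GRID


q_table = train()
catch_rate = evaluate(q_table)
print(f"Greedy catch rate after training: {catch_rate:.0%}")
```

Unlike the heuristic policies, this agent is never told where the ball is relative to the paddle in any privileged way; it discovers which moves pay off by propagating the final reward backwards through the table.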

---

## 🌐 Part 8: Using Real OpenSpiel Integration

<div style="background-color: #d4edda; border: 2px solid #28a745; border-radius: 10px; padding: 20px; margin: 20px 0;">
    <h3 style="margin-top: 0;">✨ What We Just Built = How OpenEnv Works!</h3>
</div>

### 🔄 Demo vs Production:

| Component | 🧪 Our Demo | 🚀 OpenEnv + OpenSpiel |
|-----------|-------------|------------------------|
| Environment | Local class | 🐳 Docker container |
| Communication | Direct calls | 🌐 HTTP |
| Client | Direct access | 📱 HTTPEnvClient |
| Type Safety | ✅ | ✅ |
| API | reset/step | reset/step |

### 🎮 Using OpenSpiel Integration:

```python
# Install OpenSpiel (the leading "!" runs a shell command from a notebook cell)
!pip install open_spiel

# Import OpenEnv's integration
from envs.openspiel_env import OpenSpielEnv, OpenSpielAction

# Connect to server
env = OpenSpielEnv(base_url="http://localhost:8000")

# Same API!
result = env.reset()
result = env.step(OpenSpielAction(action_id=2, game_name="catch"))
state = env.state()
```
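
Because every client exposes the same `reset()`/`step()` calls, an episode loop can be written once and reused. The sketch below is hedged: `play_episode` and `make_action` are hypothetical helpers, the `Stub*` classes exist only so the cell runs without a server, and reading `legal_actions` from `result.observation` is an assumption modeled on the demo's `CatchObservation`.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


def play_episode(env, make_action: Callable, max_steps: int = 100, seed: int = 0) -> float:
    """Run one episode with uniformly random legal actions; return the total reward."""
    rng = random.Random(seed)
    result = env.reset()
    total = 0.0
    for _ in range(max_steps):
        if result.done:
            break
        action_id = rng.choice(result.observation.legal_actions)
        result = env.step(make_action(action_id))
        total += result.reward or 0.0  # tolerate a missing (None) reward
    return total


# --- Offline stub standing in for a real client (hypothetical) ---
@dataclass
class StubObservation:
    legal_actions: List[int] = field(default_factory=lambda: [0, 1, 2])


@dataclass
class StubResult:
    observation: StubObservation
    reward: float = 0.0
    done: bool = False


class StubEnv:
    """Ends after 4 steps and pays +1 on the last one."""

    def reset(self):
        self.t = 0
        return StubResult(StubObservation())

    def step(self, action):
        self.t += 1
        done = self.t >= 4
        return StubResult(StubObservation(), reward=1.0 if done else 0.0, done=done)


total = play_episode(StubEnv(), make_action=lambda action_id: {"action_id": action_id})
print(f"Total reward: {total}")
```

With a server running, the same function could presumably be pointed at the real client by passing `env` from the snippet above and `make_action=lambda a: OpenSpielAction(action_id=a, game_name="catch")`.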

### 🎯 Available Games:

<div style="display: grid; grid-template-columns: repeat(3, 1fr); gap: 10px; margin: 20px 0;">
    <div style="background: #e1f5ff; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>🎯 Catch</h4>
        <small>What we demoed!</small>
    </div>
    <div style="background: #ffe1e1; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>❌ Tic-Tac-Toe</h4>
        <small>2-player</small>
    </div>
    <div style="background: #fff4e1; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>🃏 Kuhn Poker</h4>
        <small>Imperfect info</small>
    </div>
    <div style="background: #e8f5e9; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>🏔️ Cliff Walking</h4>
        <small>Navigation</small>
    </div>
    <div style="background: #f3e5f5; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>🔢 2048</h4>
        <small>Puzzle</small>
    </div>
    <div style="background: #fff3e0; padding: 15px; border-radius: 8px; text-align: center;">
        <h4>🂡 Blackjack</h4>
        <small>Cards</small>
    </div>
</div>

---

## ➕ Part 9: Adding Your Own Integration

<div align="center">
    <h3>🛠️ Want to wrap your own environment?</h3>
    <p>Follow the 5-step pattern!</p>
</div>

### 📝 1. Define Types (models.py)
```python
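from dataclasses import dataclass
from core.env_server import Action, Observation

@dataclass
class YourAction(Action):
    # Your action fields, for example:
    move: int = 0  # placeholder field

@dataclass
class YourObservation(Observation):
    # Your observation fields, for example:
    board: str = ""  # placeholder field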
```

### 🖥️ 2. Implement Environment (server/environment.py)
```python
class YourEnvironment(Environment):
    def reset(self) -> Observation:
        return YourObservation(...)
    
    def step(self, action: Action) -> Observation:
        return YourObservation(...)
```

### 📱 3. Create Client (client.py)
```python
class YourEnv(HTTPEnvClient[YourAction, YourObservation]):
    def _step_payload(self, action):
        return {"field": action.field}
    
    def _parse_result(self, payload):
        return StepResult(
            observation=YourObservation(...),
            reward=payload['reward'],
            done=payload['done'],
        )
```

### ⚡ 4. Create Server (server/app.py)
```python
from core.env_server import create_fastapi_app

env = YourEnvironment()
app = create_fastapi_app(env)
```

### 🐳 5. Dockerize (server/Dockerfile)
```dockerfile
FROM python:3.11
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

### 📚 Examples to Study:

<table>
<tr>
<td>📢 <code>src/envs/echo_env/</code></td>
<td>Simple test environment</td>
</tr>
<tr>
<td>🎮 <code>src/envs/openspiel_env/</code></td>
<td>Our OpenSpiel integration</td>
</tr>
<tr>
<td>💻 <code>src/envs/coding_env/</code></td>
<td>Python code execution</td>
</tr>
</table>

---

## 🎓 Summary

<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; margin: 20px 0;">
    <h3 style="margin-top: 0; text-align: center;">🎉 What You Learned</h3>
</div>

### 📖 The Journey:

1. **🧠 RL Basics** - The core loop
2. **🏗️ OpenEnv Framework** - Standardized, production-ready
3. **🔌 Example Integration** - How OpenSpiel is wrapped
4. **🎯 Interactive Demo** - Policies in action
5. **➕ Adding Integrations** - The pattern to follow

### ✨ OpenEnv's Value:

| Feature | 🏠 Traditional | 🚀 OpenEnv |
|---------|---------------|------------|
| **Type Safety** | ❌ | ✅ Dataclasses |
| **Isolation** | ❌ | ✅ Docker |
| **Deployment** | ❌ | ✅ K8s-ready |
| **Language** | Python only | Any (HTTP) |
| **Reproducibility** | ❌ | ✅ Containers |

### 🚀 Next Steps:

<div style="display: grid; grid-template-columns: repeat(2, 1fr); gap: 15px; margin: 20px 0;">
    <div style="background: #e1f5ff; padding: 20px; border-radius: 10px;">
        <h4>1️⃣ Try OpenSpiel</h4>
        <p>Install and play with the 6 games</p>
    </div>
    <div style="background: #ffe1e1; padding: 20px; border-radius: 10px;">
        <h4>2️⃣ Implement Real RL</h4>
        <p>Q-learning, DQN, PPO</p>
    </div>
    <div style="background: #fff4e1; padding: 20px; border-radius: 10px;">
        <h4>3️⃣ Wrap Your Environments</h4>
        <p>Follow the 5-step pattern</p>
    </div>
    <div style="background: #e8f5e9; padding: 20px; border-radius: 10px;">
        <h4>4️⃣ Deploy to Production</h4>
        <p>Docker → Kubernetes</p>
    </div>
</div>

### 📚 Resources:

- 🏠 **OpenEnv**: https://github.com/meta-pytorch/OpenEnv
- 📖 **Docs**: `src/envs/README.md`
- 💡 **Examples**: `examples/` directory

---

<div align="center" style="background: linear-gradient(135deg, #f093fb 0%, #f5576c 100%); color: white; padding: 40px; border-radius: 20px; margin: 30px 0;">
    <h2>🎉 You're Ready!</h2>
    <p style="font-size: 1.2em; margin: 20px 0;">You now understand:</p>
    <table style="margin: 20px auto;">
        <tr>
            <td>✅ OpenEnv framework</td>
            <td>✅ How integrations work</td>
        </tr>
        <tr>
            <td>✅ Using existing environments</td>
            <td>✅ Creating new integrations</td>
        </tr>
        <tr>
            <td colspan="2">✅ Production deployment</td>
        </tr>
    </table>
    <h3 style="margin-top: 30px;">Welcome to production-ready RL! 🚀</h3>
</div>