

We explore the following environments from the `gymnasium` library:

1. **CartPole-v1**
2. **LunarLander-v3**
3. **BipedalWalker-v3**


In [1]:
!apt-get install -y xvfb ffmpeg > /dev/null 2>&1
!pip install gymnasium pyvirtualdisplay imageio moviepy > /dev/null

In [2]:
from pyvirtualdisplay import Display
virtual_display = Display(visible=0, size=(1400, 900))
virtual_display.start()

<pyvirtualdisplay.display.Display at 0x794da9758250>

In [3]:
import gymnasium as gym
import imageio
from IPython.display import Image, display

In [4]:
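def make_env_gif(env_name, gif_path="env.gif", steps=300):
    """Roll out a random policy and save the rendered frames as a GIF."""
    # Local import keeps PIL's Image from shadowing IPython.display.Image.
    from PIL import Image

    env = gym.make(env_name, render_mode="rgb_array")
    obs, _ = env.reset()
    frames = []

    for _ in range(steps):
        # Capture the current frame before acting.
        frame = env.render()
        frames.append(Image.fromarray(frame))
        # Random policy: sample uniformly from the action space.
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        # Start a new episode whether the old one failed or hit its time limit.
        if terminated or truncated:
            obs, _ = env.reset()

    env.close()
    # Assumes a Colab-style /content working directory.
    gif_full_path = f"/content/{gif_path}"
    # duration is milliseconds per frame (50 ms = 20 fps); loop=0 repeats forever.
    frames[0].save(gif_full_path, save_all=True, append_images=frames[1:], duration=50, loop=0)
    return gif_full_path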
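
The rollout loop in `make_env_gif` relies on Gymnasium's five-value `step` return, where `terminated` marks a failure or goal state and `truncated` marks a time limit. As a minimal sketch of that control flow without a renderer, the snippet below uses a hypothetical stand-in class, `CountdownEnv` (not part of Gymnasium), that only mimics the `reset`/`step` signatures, and counts how many episodes a fixed step budget spans.

```python
class CountdownEnv:
    """Hypothetical stub that mimics Gymnasium's reset/step signatures."""

    def __init__(self, episode_length=4):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                         # this stub never fails
        truncated = self.t >= self.episode_length  # time limit reached
        return self.t, 1.0, terminated, truncated, {}


def count_episodes(env, steps):
    """Run `steps` environment steps, resetting whenever an episode ends."""
    finished = 0
    obs, _ = env.reset()
    for _ in range(steps):
        obs, reward, terminated, truncated, info = env.step(action=0)
        if terminated or truncated:
            finished += 1
            obs, _ = env.reset()
    return finished


print(count_episodes(CountdownEnv(episode_length=4), steps=10))
```

Because the step budget is fixed, a GIF of a short-lived environment will typically contain several episodes back to back, with a visible jump at each reset.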

In [5]:
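import base64
from IPython.display import HTML

def display_gif(path, width=500):
    # Embed the GIF as base64 so it renders regardless of how the notebook serves files.
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return HTML(f"""
    <div style="text-align:center;">
        <p><b>🎮 RL Environment Simulation</b></p>
        <img src="data:image/gif;base64,{data}" width="{width}" style="border:2px solid #555;" />
    </div>
    """)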


### 🧩 1. CartPole-v1

| Property                        | Description |
|-------------------------------|-------------|
| **Name of Environment**        | `CartPole-v1` |
| **State Space**                | Continuous, 4-dimensional vector: cart position, cart velocity, pole angle, pole angular velocity |
| **Action Space**               | Discrete(2): {0 → Push cart left, 1 → Push cart right} |
| **Transition Dynamics**        | Physics-based: pole angle evolves using Newtonian mechanics as actions apply force to the cart |
| **Reward Function**            | +1 for every time step until the episode ends |
| **Episode Termination**        | Terminates if the pole angle exceeds ±12° or the cart position exceeds ±2.4 units; truncates after 500 steps |
| **Rendering & Visualization**  | `rgb_array` mode shows cart, track, and pole in motion (animated easily) |
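
To make the transition dynamics and termination rule in the table concrete, here is a minimal pure-Python sketch of one Euler integration step for a cart-pole. The constants (gravity 9.8, cart mass 1.0, pole mass 0.1, half pole length 0.5, force 10.0, time step 0.02 s) are assumptions chosen to mirror the commonly cited CartPole defaults, and the function names are illustrative rather than Gymnasium's API.

```python
import math

# Assumed physical constants (see the lead-in above).
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_LENGTH = 0.5                  # half the pole length
FORCE_MAG = 10.0
TAU = 0.02                         # seconds between state updates
THETA_LIMIT = 12 * math.pi / 180   # 12 degrees in radians
X_LIMIT = 2.4


def cartpole_step(state, action):
    """Advance (x, x_dot, theta, theta_dot) by one explicit Euler step."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    temp = (force + MASS_POLE * HALF_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - MASS_POLE * HALF_LENGTH * theta_acc * cos_t / TOTAL_MASS

    return (
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    )


def is_terminated(state):
    """True once the cart or the pole leaves the allowed range."""
    x, _, theta, _ = state
    return abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT


def steps_until_termination(action, max_steps=500):
    """Hold one action from rest; return the step at which the episode ends."""
    state = (0.0, 0.0, 0.0, 0.0)
    for t in range(1, max_steps + 1):
        state = cartpole_step(state, action)
        if is_terminated(state):
            return t
    return max_steps  # time limit reached without termination


print(steps_until_termination(action=1))
```

Holding a single action tips the pole past the angle limit long before the 500-step limit, which illustrates why the agent has to alternate pushes to keep the episode alive.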

---


In [6]:
gif_path = make_env_gif("CartPole-v1", "cartpole.gif", steps=300)
display_gif(gif_path)

In [7]:
!pip install swig
!pip install "gymnasium[box2d]"


### 🚀 2. LunarLander-v3

| Property                        | Description |
|-------------------------------|-------------|
| **Name of Environment**        | `LunarLander-v3` |
| **State Space**                | Continuous, 8-dimensional: x/y position, x/y velocity, angle, angular velocity, and two leg-contact flags |
| **Action Space**               | Discrete(4): {0 → Do nothing, 1 → Fire left orientation engine, 2 → Fire main engine, 3 → Fire right orientation engine} |
| **Transition Dynamics**        | Simulates gravity and thrust physics in a 2D lunar landing scenario |
| **Reward Function**            | Shaped by distance to the landing pad, speed, tilt, leg contact, and fuel usage, with a bonus for landing safely and a penalty for crashing |
| **Episode Termination**        | If the lander crashes, leaves the viewport, or comes to rest |
| **Rendering & Visualization**  | `rgb_array` frames show the lander, terrain, landing pad, and engine exhaust particles |
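
As a small sketch of how the 8-dimensional observation and the four discrete actions might be given names in code, the snippet below unpacks an observation into a `NamedTuple`. The field order (x, y, x velocity, y velocity, angle, angular velocity, left leg contact, right leg contact) is an assumption based on the Gymnasium documentation; `LanderObs`, `LanderAction`, and `decode_obs` are illustrative names, not library API, and the sample values are made up.

```python
from enum import IntEnum
from typing import NamedTuple, Sequence


class LanderAction(IntEnum):
    """Names for the four discrete actions listed in the table."""
    NOOP = 0
    FIRE_LEFT = 1
    FIRE_MAIN = 2
    FIRE_RIGHT = 3


class LanderObs(NamedTuple):
    """Assumed layout of the 8-dimensional observation vector."""
    x: float
    y: float
    vx: float
    vy: float
    angle: float
    angular_velocity: float
    left_leg_contact: bool
    right_leg_contact: bool


def decode_obs(obs: Sequence[float]) -> LanderObs:
    """Convert a raw 8-value observation into named fields."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    *continuous, left, right = obs
    return LanderObs(*(float(v) for v in continuous), bool(left), bool(right))


sample = [0.0, 1.4, 0.1, -0.5, 0.02, 0.0, 0.0, 1.0]  # made-up values
decoded = decode_obs(sample)
print(decoded.y, decoded.right_leg_contact, LanderAction(2).name)
```

Named fields make reward-related quantities such as altitude, tilt, and leg contact easier to inspect than raw indices.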

---

In [10]:
gif_path = make_env_gif("LunarLander-v3", "lunar.gif", steps=300)
display_gif(gif_path)

### 🏃‍♂️ 3. BipedalWalker-v3

| Property                        | Description |
|-------------------------------|-------------|
| **Name of Environment**        | `BipedalWalker-v3` |
| **State Space**                | Continuous, 24-dimensional vector: hull angle and angular velocity, horizontal and vertical speed, joint angles and angular speeds, leg contact flags, and 10 lidar rangefinder readings |
| **Action Space**               | Box(-1, 1, (4,)): continuous motor commands for the hip and knee joints of both legs |
| **Transition Dynamics**        | Physics-based walker with complex terrain and gait control; forces applied to joints affect motion and balance |
| **Reward Function**            | Earned for forward movement; -100 for falling, plus a small cost for applying motor torque |
| **Episode Termination**        | When the walker falls (hull touches the ground), moves out of bounds, or reaches the far end of the terrain |
| **Rendering & Visualization**  | `rgb_array` frames show a 2D two-legged robot walking over randomly generated terrain |
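
Unlike the discrete actions of the earlier environments, a BipedalWalker action is a vector of four real numbers, each bounded to [-1, 1]. The sketch below, using only the standard library, shows one way a controller's raw output might be clamped into that range before being passed to `env.step`; `clip_action` and the joint ordering in `JOINTS` are assumptions for illustration.

```python
from typing import List, Sequence

JOINTS = ("hip_1", "knee_1", "hip_2", "knee_2")  # assumed ordering


def clip_action(raw: Sequence[float], low: float = -1.0, high: float = 1.0) -> List[float]:
    """Clamp each of the four joint commands into [low, high]."""
    if len(raw) != len(JOINTS):
        raise ValueError(f"expected {len(JOINTS)} values, got {len(raw)}")
    return [min(high, max(low, float(v))) for v in raw]


raw_output = [0.3, -1.7, 2.5, 0.0]  # e.g. an unbounded policy output
action = clip_action(raw_output)
print(dict(zip(JOINTS, action)))
```

Clamping keeps every command inside the bounds of the action space, so out-of-range policy outputs saturate instead of producing invalid actions.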

---

In [12]:
gif_path = make_env_gif("BipedalWalker-v3", "bipedal.gif", steps=300)
display_gif(gif_path)


## 🧠 Summary of Findings & Real-World Applications

| Environment         | Core Challenge                        | Real-World Analogy |
|---------------------|----------------------------------------|---------------------|
| **CartPole**        | Balance a pole using feedback          | Self-balancing robots, inverted pendulum systems |
| **BipedalWalker**   | Learn complex locomotion               | Humanoid robots, prosthetic leg optimization |
| **LunarLander**     | Controlled descent with limited thrust | Spacecraft landing (e.g., SpaceX), drone navigation |

---

### 🔍 Insights:
- These environments simulate **key control and balance problems** found in robotics, physics, and aerospace.
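- The reward functions encode **efficiency, balance, and decision-making under uncertainty**: fuel and motor costs penalize waste, while survival and progress terms reward stability.
- Visualization makes RL tangible: even under the random policy used here, the GIFs show what an agent must **learn to control** in order to walk, balance, and land.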

---

## 🔍 Real-World Application Highlights

- 🧪 **CartPole-v1** → Teaching control theory & robotics balancing
- 🚀 **LunarLander-v3** → Simulates space landings & precision control
- 🦿 **BipedalWalker-v3** → Models locomotion problems studied in biomechanics & biped robotics

> "From space exploration to humanoid robotics, these simulations are not just games – they're reflections of real-world engineering challenges powered by AI."
