# Customizing OpenAI Gym Environments and Implementing Reinforcement Learning Agents with Stable Baselines

### Theme: Car Racing

- Constança
- Daniela
- Inês Amorim, 202108108

---

## Imports

In [None]:
%pip install -r requirements.txt

In [None]:
%pip freeze

In [6]:
import math
from typing import Optional, Union
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from gymnasium.envs.box2d.car_dynamics import Car
from gymnasium.error import DependencyNotInstalled, InvalidAction
from gymnasium.utils import EzPickle
import pygame
from pygame import gfxdraw
import time
import matplotlib.pyplot as plt
from pyvirtualdisplay import Display
from stable_baselines3 import A2C

---

The CarRacing-v3 environment from Gymnasium (previously Gym) is part of the Box2D environments, and it offers an interesting challenge for training reinforcement learning agents. It's a top-down racing simulation where the track is randomly generated at the start of each episode. The environment offers both continuous and discrete action spaces, making it adaptable to different types of reinforcement learning algorithms.

- **Action Space:**

   - **Continuous:** Three actions: steering, gas, and braking. Steering ranges from -1 (full left) to +1 (full right); gas and brake each range from 0 to 1.
   - **Discrete:** Five possible actions: do nothing, steer left, steer right, gas, and brake.

- **Observation Space:**

    - The environment provides a 96x96 RGB image of the car and the track, which serves as the state input for the agent.

- **Rewards:**

    - The agent receives a -0.1 penalty for every frame, encouraging efficiency.
    - It earns +1000/N for every track tile visited, where N is the total number of tiles on the track. Visiting every tile therefore yields a total reward of $1000 - 0.1 \times \text{frames}$, where "frames" is the number of frames taken to complete the lap. If only some tiles are visited, the positive part of the reward scales with the fraction of tiles visited.

- **Episode Termination:**

    - The episode ends when all track tiles have been visited. It also ends if the car leaves the playfield (that is, drives far off the track), which incurs a significant penalty (-100 reward).
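The reward rules listed above can be sketched as a small calculation. This is only an illustration of the arithmetic, not code from the environment itself: the function name `episode_return` and the 300-tile track are assumptions made for the example, and the -100 out-of-bounds penalty is deliberately left out.

```python
def episode_return(tiles_visited: int, total_tiles: int, frames: int) -> float:
    """Return accumulated under the reward rules described above.

    Hypothetical helper: assumes the -100 out-of-bounds penalty was not incurred.
    """
    if total_tiles <= 0:
        raise ValueError("total_tiles must be positive")
    if not 0 <= tiles_visited <= total_tiles:
        raise ValueError("tiles_visited must be between 0 and total_tiles")
    tile_reward = 1000.0 / total_tiles  # reward for each newly visited tile
    frame_penalty = 0.1                 # cost paid on every frame
    return tile_reward * tiles_visited - frame_penalty * frames


# A full lap on an assumed 300-tile track, finished in 732 frames
full_lap = episode_return(tiles_visited=300, total_tiles=300, frames=732)
# Only half of the tiles visited in the same number of frames
half_lap = episode_return(tiles_visited=150, total_tiles=300, frames=732)
print(round(full_lap, 1), round(half_lap, 1))
```

The comparison shows why slow or incomplete laps score lower: the per-frame penalty keeps accumulating while the tile reward is capped at 1000.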

In [3]:
# Use continuous=False to get the discrete action space instead
env = gym.make("CarRacing-v3", continuous=True, render_mode="human")


In [None]:
# Check the available render modes
print(env.metadata["render_modes"])

['human', 'rgb_array', 'state_pixels']


- Sanity check: confirm that the environment resets without errors

In [7]:
# Reset the environment and render the first frame
obs, info = env.reset()

# Close the environment
env.close()

print("Environment initialized successfully!")


In [52]:
print("Action space:", env.action_space)

Action space: Box([-1.  0.  0.], 1.0, (3,), float32)
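The printed `Box` has lower bounds `[-1, 0, 0]` and an upper bound of 1 for every component, which matches steering in [-1, 1] and gas and brake in [0, 1]. As a minimal sketch, a helper (here called `make_action`, a hypothetical name) could clamp hand-written values into those bounds before they are sent to the environment. The bounds are copied from the output above rather than read from `env`, so the example stays self-contained.

```python
from typing import List

# Bounds copied from the printed action space: Box([-1. 0. 0.], 1.0, (3,), float32)
LOW = (-1.0, 0.0, 0.0)   # lower bounds for steering, gas, brake
HIGH = (1.0, 1.0, 1.0)   # the upper bound is 1.0 for every component


def make_action(steer: float, gas: float, brake: float) -> List[float]:
    """Clamp each component into its [low, high] range (hypothetical helper)."""
    raw = (steer, gas, brake)
    return [min(max(value, lo), hi) for value, lo, hi in zip(raw, LOW, HIGH)]


# Out-of-range steering and brake values are pulled back to the nearest bound
print(make_action(-2.0, 0.5, -0.3))
```

Before passing the result to `env.step`, it would typically be converted with `np.array(action, dtype=np.float32)` so that it matches the dtype of the action space.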


In [4]:
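# Re-create the environment (the previous instance was closed above), then reset it
env = gym.make("CarRacing-v3", continuous=True, render_mode="human")
obs, info = env.reset()

# Play for up to 100 steps with random actions
for _ in range(100):
    action = env.action_space.sample()  # Random action
    obs, reward, terminated, truncated, info = env.step(action)

    # Stop early if the episode ends (terminated) or is cut off (truncated)
    if terminated or truncated:
        break

env.close()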
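The rollout loop above can also be wrapped in a reusable function that tracks the accumulated reward. The following is a sketch under stated assumptions: `run_episode` and its `policy` argument are hypothetical names introduced here, and the function relies only on the `reset`/`step` signatures already used in this notebook.

```python
def run_episode(env, policy, max_steps=100):
    """Run one episode and return (total_reward, steps_taken).

    `env` is any object following the Gymnasium API used above: reset() returns
    (obs, info) and step() returns (obs, reward, terminated, truncated, info).
    `policy` is a hypothetical callable that maps an observation to an action.
    """
    obs, info = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # Stop as soon as the episode ends or is cut off
        if terminated or truncated:
            break
    return total_reward, steps
```

With an open environment, `run_episode(env, lambda obs: env.action_space.sample())` would reproduce the random rollout above while also reporting the return, which gives a rough baseline to compare a trained agent against.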