### CDS NYU
### DS-GA 3001 | Reinforcement Learning
### Python and Gym Setup
### January 23, 2025


# Installing OpenAI Gym (Gymnasium) in Python

<br>

---

## Professor
Jeremy Curuksu, PhD -- jeremy.cur@nyu.edu

## Section Leaders
Akshitha Kumbam – ak11071@nyu.edu

Kushagra Khatwani – kk5395@nyu.edu


## Goal of Today's Lab 

In this intro lab, the goal is simply to set up a Python environment with the Gym library and some of Gym's dependencies for building RL solutions, such as the Arcade Learning Environment, TensorFlow, Keras-RL, etc.

We will build a first "RL environment" in Gym just to make sure Gym is installed successfully. Next week, we will dive deeper into Gym environments to understand the key components involved when working with Gym. For today, let's just focus on installing the libraries needed for this course.

You have two options to execute the code provided in the lab material: either locally on your laptop, or in the cloud on Google Colab. The instructional team will ensure all notebooks provided in the labs run on Google Colab. We will also provide some assistance (during lab sessions and office hours) to set up your local environment on your laptop, but we can't guarantee to solve every issue you may encounter when installing libraries and running the code locally.

Running code locally has its advantages, such as interactive gameplay (i.e., interacting with RL environments and playing RL games manually using your keyboard), which we will introduce as an optional exercise in some of the labs and which is only possible with a local setup. So we recommend you try to set up your system locally first, and use Google Colab only if you run into persistent issues locally. Debugging dependency issues is an important skill to have as a data scientist and ML practitioner, so please do try to install all libraries on your laptop!

## Resources

* Gym: https://www.gymlibrary.dev/ and its wiki https://github.com/openai/gym/wiki
* The original paper from OpenAI when Gym was released in 2016: https://arxiv.org/pdf/1606.01540.pdf
* In late 2022, Gym was moved to a new platform called Gymnasium, which is now **the only maintained version of Gym**: https://gymnasium.farama.org/

<br>

---

# 1. Install libraries to set up RL environments in Gym

#### At a minimum, you need to create a virtual environment with Python and OpenAI Gym installed:

`conda create --name py39 python=3.9`

`conda activate py39`

`pip install gym`

`pip install gymnasium`

To add the virtual env as a kernel in Jupyter Notebook (with the environment still activated):

`pip install ipykernel`

`python -m ipykernel install --user --name=py39`


#### Other libraries will soon be needed as the course progresses:


* Install **extended Gym packages** (e.g., Atari games, etc): `pip install 'gym[all]'` or `conda install -c conda-forge gym-all`, and `pip install 'gym[atari,accept-rom-license]'`


* Install the **Arcade Learning Environment**: `pip install ale-py`


* Install **box2d**: 
`pip install 'gym[box2d]'` or `conda install -c conda-forge gym-box2d`


* Install **pygame**: `pip install pygame` 


* Install **tensorflow**:
`pip install tensorflow` or `conda install -c conda-forge tensorflow`


* Install **keras-rl2**:
`pip install keras-rl2`

There will be a few other packages that need to be installed during the semester, and some of them may be specific to your particular laptop setup and operating system. Feel free to seek guidance from Akshitha or Kushagra during office hours.
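Once the installs above have run, a quick way to see what actually landed in the active environment is to query the package metadata. The snippet below is a minimal sketch using only the standard library; the helper name `report_versions` is hypothetical, and the package list simply mirrors the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip commands listed above.
PACKAGES = ["gym", "gymnasium", "ale-py", "pygame", "tensorflow", "keras-rl2"]


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in report_versions(PACKAGES).items():
        print(f"{name:12s} {ver if ver is not None else 'NOT INSTALLED'}")
```

Run it from the `py39` kernel: any line showing `NOT INSTALLED` points to a package that was installed into a different environment or not at all.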
 

In [8]:
!pip install 'gym[atari,accept-rom-license]'

Collecting ale-py~=0.8.0 (from gym[accept-rom-license,atari])
  Downloading ale_py-0.8.1-cp39-cp39-macosx_11_0_arm64.whl.metadata (8.1 kB)
Collecting autorom~=0.4.2 (from autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gym[accept-rom-license,atari])
  Downloading AutoROM-0.4.2-py3-none-any.whl.metadata (2.8 kB)
Collecting importlib-resources (from ale-py~=0.8.0->gym[accept-rom-license,atari])
  Downloading importlib_resources-6.5.2-py3-none-any.whl.metadata (3.9 kB)
Collecting click (from autorom~=0.4.2->autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gym[accept-rom-license,atari])
  Downloading click-8.1.8-py3-none-any.whl.metadata (2.3 kB)
Collecting requests (from autorom~=0.4.2->autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gym[accept-rom-license,atari])
  Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting tqdm (from autorom~=0.4.2->autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gy

# 2. Build an Atari RL simulation environment in Gym

In this section, we will build a first Reinforcement Learning environment in Gym just for you to confirm the key libraries have been successfully installed on your laptop. 

Next week, we will dive deeper into Gym environments to understand the key components involved when working with Gym. 

For today, let's just confirm Gym has been installed successfully.

Gym is a toolkit for developing and comparing reinforcement learning (RL) algorithms. It offers pre-built, baseline RL environments within which a developer can build and test RL algorithms. 

**At the most fundamental level, using the Gym library means 1) selecting an environment, and 2) interacting with it:**

1. **Gym offers many different environments to select, from classic control use cases (Pendulum, Cart-Pole, Blackjack, etc) to video games (Atari) and simulated robotics (MuJoCo)**. These use cases were selected by OpenAI in 2016 to represent problems that are tractable using existing (21st century) AI technologies, yet complex enough to showcase the need for human-like intelligence.


2. **Gym offers Python functions to interact with the created environment**. The most important ones are:
    * `reset()`: Resets the environment to its initial state (i.e., it restarts the game)
    * `step(action)`: Steps forward by performing an action on the environment, and returns the resulting state, the reward for taking that action, flags indicating whether the game is over, and some metadata
    
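The `reset` function returns two values in Gym 0.26 and later (and in Gymnasium): a starting state/observation and an ``info`` dictionary. Earlier versions of Gym returned only the observation.

The `step` function returns five values, which we will call the ``next_state``, ``reward``, ``done``, ``truncated`` and ``info`` variables.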

-  ``next_state``: This is the observation that the agent will receive
   after taking the action.
-  ``reward``: This is the reward that the agent will receive after
   taking the action.
-  ``done``: This is a boolean variable that indicates whether or
   not the episode has terminated (e.g., the game is over).
-  ``truncated``: This is a boolean variable that indicates whether the episode was cut short, either because the time limit was reached or because the agent went physically out of bounds for the environment.
-  ``info``: This is a dictionary that might contain additional
   information about the environment.

In the Atari environments the ``info`` dictionary has a ``ale.lives`` key that tells us how many lives the
agent has left. If the agent has 0 lives, then the episode is over.


### Here are the most basic Python commands to implement a Gym environment:

A concise doc for the Atari Breakout video game available in Gym can be found here: https://gym.openai.com/envs/Breakout-v0/

Note these basic commands are identical for all environments in Gym.


**WARNING: Graphical rendering often crashes the Python kernel after completion => If this happens don't worry about it, just restart your kernel** (click on `Kernel`, then `Restart`)

In [1]:
import gym
print(gym.envs.registry.keys())


dict_keys(['CartPole-v0', 'CartPole-v1', 'MountainCar-v0', 'MountainCarContinuous-v0', 'Pendulum-v1', 'Acrobot-v1', 'LunarLander-v2', 'LunarLanderContinuous-v2', 'BipedalWalker-v3', 'BipedalWalkerHardcore-v3', 'CarRacing-v2', 'Blackjack-v1', 'FrozenLake-v1', 'FrozenLake8x8-v1', 'CliffWalking-v0', 'Taxi-v3', 'Reacher-v2', 'Reacher-v4', 'Pusher-v2', 'Pusher-v4', 'InvertedPendulum-v2', 'InvertedPendulum-v4', 'InvertedDoublePendulum-v2', 'InvertedDoublePendulum-v4', 'HalfCheetah-v2', 'HalfCheetah-v3', 'HalfCheetah-v4', 'Hopper-v2', 'Hopper-v3', 'Hopper-v4', 'Swimmer-v2', 'Swimmer-v3', 'Swimmer-v4', 'Walker2d-v2', 'Walker2d-v3', 'Walker2d-v4', 'Ant-v2', 'Ant-v3', 'Ant-v4', 'Humanoid-v2', 'Humanoid-v3', 'Humanoid-v4', 'HumanoidStandup-v2', 'HumanoidStandup-v4'])


In [1]:
import gym
env = gym.make("Breakout-v0", render_mode="human") # Exact name/version of environments can be found in Gym's doc
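observation, info = env.reset()  # reset() returns (observation, info) in Gym >= 0.26
for _ in range(500):
    action = env.action_space.sample()  # this is where an actor (RL agent) would be inserted
    observation, reward, done, truncated, info = env.step(action)
    if done or truncated:  # restart once the episode ends for either reason
        observation, info = env.reset()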
env.close() 

  logger.warn(
A.L.E: Arcade Learning Environment (version 0.8.1+53f58b7)
[Powered by Stella]


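AttributeError: module 'numpy' has no attribute 'bool8'

**Note:** This error appears when Gym 0.26 runs against NumPy 2.0 or later, which removed the `np.bool8` alias that Gym still references. A common workaround is to pin NumPy in the virtual environment with `pip install "numpy<2"`, or to switch to Gymnasium, the maintained version of Gym.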

**Warning:** The `render_mode="human"` argument renders the environment graphically in a separate window. It is optional, and not always recommended. Avoid it when *training* an agent, because rendering slows training down considerably. It is useful when looking at an environment for the first time, or once training is complete, to visualize how the agent behaves in the environment. In Jupyter Notebook, Gym's graphical rendering works well but is likely to crash (or freeze) your kernel once the simulation is complete. So as a habit, be ready to click on `Interrupt` and `Restart` in the Kernel tab of Jupyter Notebook after you run a simulation rendered graphically.
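For runs where no window is needed, the same random-action loop can be written without `render_mode`. The sketch below is one possible arrangement, not the lab's reference code: the helper `run_random_steps` is a hypothetical name, and the demo assumes Gymnasium is installed and uses `CartPole-v1` (rather than Breakout) because it needs no Atari ROMs.

```python
def run_random_steps(env, n_steps=500):
    """Take n_steps random actions in env; return the list of episode returns."""
    episode_returns = []
    current_return = 0.0
    observation, info = env.reset()
    for _ in range(n_steps):
        action = env.action_space.sample()  # an RL agent would pick the action here
        observation, reward, terminated, truncated, info = env.step(action)
        current_return += reward
        if terminated or truncated:
            # Episode finished: record its return and start a new one.
            episode_returns.append(current_return)
            current_return = 0.0
            observation, info = env.reset()
    env.close()
    return episode_returns


if __name__ == "__main__":
    try:
        import gymnasium as gym  # assumes `pip install gymnasium`
    except ImportError:
        print("gymnasium is not installed; skipping the demo")
    else:
        env = gym.make("CartPole-v1")  # no render_mode, so no window is opened
        returns = run_random_steps(env, n_steps=500)
        print(f"finished {len(returns)} episodes")
```

Because the helper only relies on `reset`, `step`, `action_space.sample` and `close`, it should work unchanged with any environment that follows the five-value `step` convention.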

# Thank you everyone, and welcome again to DS-GA 3001 : Reinforcement Learning!