# Observation and Action Spaces in Gym

In this tutorial, we will learn about observation and action spaces in Gym, a popular Python library for reinforcement learning. We will cover the following topics:

1. Understanding observation and action spaces
2. Exploring Gym environments
3. Working with different types of observation and action spaces
4. Implementing a simple agent
5. Evaluating the agent's performance

## Understanding Observation and Action Spaces

In reinforcement learning, an agent interacts with an environment to learn how to perform actions that maximize its cumulative reward. The environment provides the agent with observations, which represent the current state of the environment, and the agent takes actions based on these observations.

**Observation space**: The set of all possible observations that an agent can encounter in an environment.

**Action space**: The set of all possible actions that an agent can take in an environment.

Gym provides a standardized interface for working with different environments and their respective observation and action spaces.

## Exploring Gym Environments

First, let's install Gym and import it.

In [None]:
!pip install gym
import gym

Next, we will create an environment and explore its observation and action spaces. For this tutorial, we will use the 'MountainCar-v0' environment.

In [None]:
# Create the environment
env = gym.make('MountainCar-v0')

# Print observation and action spaces
print('Observation space:', env.observation_space)
print('Action space:', env.action_space)
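
Printing a space shows only a summary. As a minimal sketch, assuming `gym` is installed and `MountainCar-v0` is registered as in the cell above, the following reads a few attributes that `Box` and `Discrete` spaces expose. The variable names `obs_space` and `act_space` are introduced here for illustration.

```python
import gym

env = gym.make('MountainCar-v0')

obs_space = env.observation_space
act_space = env.action_space

# Box attributes: shape, dtype, and per-dimension bounds
print('Observation shape:', obs_space.shape)
print('Observation dtype:', obs_space.dtype)
print('Lower bounds:', obs_space.low)
print('Upper bounds:', obs_space.high)

# Discrete attribute: the number of available actions
print('Number of actions:', act_space.n)

env.close()
```

The bounds tell you the range each observation component can take, which is useful when you later need to normalize or discretize observations.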

## Working with Different Types of Observation and Action Spaces

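Gym provides several types of spaces (others include `MultiDiscrete`, `MultiBinary`, `Tuple`, and `Dict`). The two you will encounter most often are:

- `Discrete(n)`: A discrete space with `n` possible values, represented by the integers `0` to `n - 1`.
- `Box(low, high, shape)`: A continuous space representing a multidimensional array of the given shape, where each element has a lower and an upper bound.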
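
The two space types above can also be constructed by hand. The sketch below assumes `gym` and `numpy` are installed; the bounds are illustrative values chosen for this example, not those of any particular environment.

```python
import numpy as np
from gym import spaces

# A discrete space with three elements: the integers 0, 1 and 2
action_space = spaces.Discrete(3)

# A 2-dimensional continuous space with per-dimension bounds
# (illustrative bounds, not taken from a real environment)
observation_space = spaces.Box(
    low=np.array([-1.0, -0.5], dtype=np.float32),
    high=np.array([1.0, 0.5], dtype=np.float32),
    dtype=np.float32,
)

print('Number of actions:', action_space.n)
print('Observation shape:', observation_space.shape)

# contains() checks whether a value is a valid member of the space
inside = np.array([0.2, 0.1], dtype=np.float32)
outside = np.array([2.0, 0.1], dtype=np.float32)
print(observation_space.contains(inside))   # within both bounds
print(observation_space.contains(outside))  # first component exceeds the upper bound
print(action_space.contains(1))
print(action_space.contains(3))             # valid actions are 0, 1, 2
```

Checking membership with `contains()` is a cheap way to catch out-of-range actions or malformed observations before they reach the environment.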

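In the 'MountainCar-v0' environment, the observation space is a continuous `Box` with two components (the car's position and velocity), while the action space is `Discrete(3)`. Let's sample a random action and a random observation. Note that `observation_space.sample()` draws a point from within the bounds of the space; it is not a state produced by the environment itself.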

In [None]:
# Sampling random actions and observations

# Sample a random action
action = env.action_space.sample()

# Sample a random observation
observation = env.observation_space.sample()

print('Sampled action:', action)
print('Sampled observation:', observation)
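
The sampled values are just numbers, so it can help to translate them into something readable. In 'MountainCar-v0', the actions 0, 1, and 2 correspond to pushing left, not pushing, and pushing right, and an observation holds the car's position and velocity. The helper `describe` and the dictionary `ACTION_MEANINGS` below are hypothetical names introduced for this sketch; they are not part of Gym.

```python
# Meaning of each discrete action in MountainCar-v0
ACTION_MEANINGS = {0: 'push left', 1: 'no push', 2: 'push right'}


def describe(action, observation):
    """Return a readable summary of an action and observation.

    Hypothetical helper for illustration; not part of the Gym API.
    """
    position, velocity = observation
    return (f'action={ACTION_MEANINGS[int(action)]}, '
            f'position={float(position):.3f}, velocity={float(velocity):.3f}')


# Example with hand-picked values
print(describe(2, [-0.5, 0.01]))  # action=push right, position=-0.500, velocity=0.010
```

In the notebook, you could call `describe(action, observation)` on the values sampled in the cell above.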

## Implementing a Simple Agent

Now, let's implement a simple agent that takes random actions in the 'MountainCar-v0' environment. We will run the agent for 10 episodes and print the total reward obtained in each episode.

In [None]:
# Implementing the random agent

num_episodes = 10

# Loop over episodes
for episode in range(num_episodes):
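    # Gym >= 0.26: reset() returns (observation, info)
    observation, info = env.reset()
    total_reward = 0

    while True:
        # Take a random action
        action = env.action_space.sample()

        # Perform the action; step() returns separate terminated and truncated flags
        observation, reward, terminated, truncated, info = env.step(action)

        # Update the total reward
        total_reward += reward

        # Stop when the goal is reached or the time limit is hit
        if terminated or truncated:
            break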

    print(f'Episode {episode + 1}: Total reward = {total_reward}')
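
The return values of `reset()` and `step()` changed in Gym 0.26: `reset()` now returns `(observation, info)`, and `step()` returns five values with separate `terminated` and `truncated` flags instead of a single `done`. The sketch below wraps the episode loop in a function that accepts either form. The name `run_episode` and the `policy` callable are assumptions made for this example, and the tuple check on the result of `reset()` is a heuristic that would misfire for environments whose observations are themselves tuples.

```python
def run_episode(env, policy):
    """Run one episode and return the total reward.

    `policy` is any callable mapping an observation to an action.
    Hypothetical helper for illustration; not part of the Gym API.
    """
    result = env.reset()
    # Newer Gym returns (observation, info); older Gym returns the observation alone
    observation = result[0] if isinstance(result, tuple) else result

    total_reward = 0.0
    while True:
        step_result = env.step(policy(observation))
        if len(step_result) == 5:
            # Newer API: separate terminated / truncated flags
            observation, reward, terminated, truncated, _ = step_result
            done = terminated or truncated
        else:
            # Older API: a single done flag
            observation, reward, done, _ = step_result
        total_reward += reward
        if done:
            return total_reward
```

With the environment from this tutorial, a random agent would be run as `run_episode(env, lambda obs: env.action_space.sample())`.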

## Evaluating the Agent's Performance

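In this section, we will evaluate the performance of our random agent by calculating the average reward over 10 episodes. 'MountainCar-v0' gives a reward of -1 for every time step and ends an episode after 200 steps if the car has not reached the goal, so a random agent, which almost never reaches the goal, will typically score close to -200.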

In [None]:
# Evaluating the random agent

num_episodes = 10
total_rewards = []

# Loop over episodes
for episode in range(num_episodes):
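    # Gym >= 0.26: reset() returns (observation, info)
    observation, info = env.reset()
    total_reward = 0

    while True:
        # Take a random action
        action = env.action_space.sample()

        # Perform the action; step() returns separate terminated and truncated flags
        observation, reward, terminated, truncated, info = env.step(action)

        # Update the total reward
        total_reward += reward

        # Stop when the goal is reached or the time limit is hit
        if terminated or truncated:
            break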

    total_rewards.append(total_reward)

# Calculate average reward
avg_reward = sum(total_rewards) / num_episodes

# Print results
print(f'Average reward over {num_episodes} episodes: {avg_reward}')
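
An average alone hides how much the episodes vary. As a small sketch using only the standard library, the function below reports the mean, standard deviation, minimum, and maximum of a list of episode rewards. The name `summarize` is hypothetical, and the numbers in `example_rewards` are made up for illustration; they are not results from an actual run.

```python
import statistics


def summarize(rewards):
    """Return the mean, standard deviation, min and max of episode rewards."""
    return {
        'mean': statistics.mean(rewards),
        # The sample standard deviation needs at least two values
        'std': statistics.stdev(rewards) if len(rewards) > 1 else 0.0,
        'min': min(rewards),
        'max': max(rewards),
    }


# Illustrative numbers only, not results from an actual run
example_rewards = [-200.0, -200.0, -187.0, -200.0]
summary = summarize(example_rewards)
print(summary)
```

In the notebook, you would pass the `total_rewards` list collected above instead of `example_rewards`.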

---
**Conclusion:**
- We learned about observation and action spaces in Gym and how to work with different types of spaces.
- We explored the 'MountainCar-v0' environment and implemented a simple random agent.
- We evaluated the agent's performance by calculating the average reward over multiple episodes.