## Reinforcement Learning Project 1: The Mountain Car Challenge 

*[Coding along with [Reinforcement Learning for Algorithmic Trading with Python](https://www.udemy.com/course/reinforcement-learning-for-algorithmic-trading-with-python/), a Udemy online course by Alexander Hagmann]*

### One random Episode - with human rendering

> <span style="color:blue">__Prompt:__ I have installed Anaconda and all required libraries to work with OpenAI's gymnasium Reinforcement Learning environments. Also I have installed Microsoft Visual Studio to visualize environments such as Mountain Car. Please provide a Python code example that creates and runs a random episode in the 'MountainCar-v0' environment using the gymnasium library, with human rendering enabled and printing the total reward and number of steps at the end. Please note that gym is now gymnasium.</span>

Great! Here’s a Python code example that creates and runs a random episode in the **'MountainCar-v0'** environment using the **gymnasium** library, with **human rendering** enabled and printing the **total reward** and **number of steps** at the end:

```python
import gymnasium as gym

# Create the environment
env = gym.make("MountainCar-v0", render_mode="human")

# Initialize the environment
observation, info = env.reset()

total_reward = 0
num_steps = 0
done = False

# Run a random episode
while not done:
    # Take a random action
    action = env.action_space.sample()
    
    # Perform the action in the environment
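    observation, reward, terminated, truncated, info = env.step(action)
    # End the episode on success (terminated) or on the time limit (truncated)
    done = terminated or truncated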
    
    # Accumulate the reward and count the steps
    total_reward += reward
    num_steps += 1

# Print the results
print(f"Total Reward: {total_reward}")
print(f"Total Steps: {num_steps}")

# Close the environment
env.close()
```

### Explanation of the Code:
1. **`gym.make("MountainCar-v0", render_mode="human")`**:
   - Initializes the **MountainCar-v0** environment with rendering enabled so you can visually observe the car’s movement.

2. **`env.reset()`**:
   - Resets the environment to its initial state and returns the initial observation and environment info.

3. **`env.action_space.sample()`**:
   - Randomly samples an action from the action space (which, for **MountainCar**, can be one of three discrete actions: push left, no push, or push right).

4. **`env.step(action)`**:
   - Applies the sampled action and returns five values: the next observation, the reward, a `terminated` flag (the car reached the flag), a `truncated` flag (the 200-step time limit of **MountainCar-v0** was hit), and an info dictionary. An episode is over as soon as either flag is `True`, so the loop should stop on either one.

5. **`total_reward` and `num_steps`**:
   - These accumulate the reward over the episode and count the number of steps, respectively.

6. **`env.close()`**:
   - Closes the environment to stop the rendering window after the episode finishes.
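
The two end-of-episode flags from step 4 are easy to mix up, so here is a minimal sketch of a loop that stops on either one. `run_random_episode`, `StubEnv`, and `StubActionSpace` are hypothetical names introduced for illustration and are not part of gymnasium. `StubEnv` is a stand-in that never reaches a goal and truncates after a fixed number of steps, so the snippet runs without a render window.

```python
import random


class StubActionSpace:
    """Hypothetical stand-in for a discrete action space with 3 actions."""

    def sample(self):
        return random.randrange(3)


class StubEnv:
    """Hypothetical stand-in exposing the Gymnasium reset/step interface.

    It never reaches a goal and truncates after `max_steps` steps,
    mimicking a time limit. It is NOT part of gymnasium.
    """

    def __init__(self, max_steps=200):
        self.max_steps = max_steps
        self.steps = 0
        self.action_space = StubActionSpace()

    def reset(self):
        self.steps = 0
        return 0.0, {}  # (observation, info)

    def step(self, action):
        self.steps += 1
        terminated = False                         # goal is never reached here
        truncated = self.steps >= self.max_steps   # time limit hit
        return 0.0, -1.0, terminated, truncated, {}


def run_random_episode(env):
    """Run one random episode; stop on termination OR truncation."""
    observation, info = env.reset()
    total_reward, num_steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        num_steps += 1
    # `terminated` tells the caller whether the goal was actually reached
    return total_reward, num_steps, terminated


total_reward, num_steps, reached_goal = run_random_episode(StubEnv(max_steps=200))
print(f"Total Reward: {total_reward}, Total Steps: {num_steps}, Goal: {reached_goal}")
```

With the real environment, the same helper should work by passing `gym.make("MountainCar-v0", render_mode="human")` instead of the stub and calling `env.close()` afterwards.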

### Libraries and Setup:
- **gymnasium**: The maintained successor to OpenAI's Gym library for RL environments, now developed by the Farama Foundation.
- **render_mode='human'**: Opens a window and draws every step so that a person can watch the episode.
  
This code will run a random agent in the **MountainCar** environment, render the environment visually, and display the total reward and number of steps after the episode ends.

Let me know if you encounter any issues or need further customizations!

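```python
import gymnasium as gym

# Create the environment
env = gym.make("MountainCar-v0", render_mode="human")

# Initialize the environment
observation, info = env.reset()

total_reward = 0
num_steps = 0
done = False

# Run a random episode
while not done:
    # Take a random action
    action = env.action_space.sample()
    
    # Perform the action in the environment.
    # As run here, `done` receives only the termination flag (flag reached);
    # `truncated` (the 200-step time limit) is ignored, so the episode keeps going.
    observation, reward, done, truncated, info = env.step(action)
    
    # Accumulate the reward and count the steps
    total_reward += reward
    num_steps += 1

# Print the results
print(f"Total Reward: {total_reward}")
print(f"Total Steps: {num_steps}")

# Close the environment
env.close()
```

```
Total Reward: -8085.0
Total Steps: 8085
```

The episode lasted 8085 steps instead of stopping at the 200-step limit because this loop ends only when the car reaches the flag. With a reward of -1 per step, the total reward is the negative of the step count.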


### Multiple random Episodes - with human rendering

> <span style="color:blue">__Prompt:__ Now, I want to run multiple random episodes with human rendering. Please adjust the code accordingly with as few changes as possible.</span>

### Performance Evaluation / Success Measurement

> <span style="color:blue">__Prompt:__ Now, let's focus on Performance Measurement and Evaluation. I want to measure the performance / success of the random agent and print the performance at the end of the random run. Please make an appropriate proposal with various metrics.</span>
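
One possible set of metrics for such an evaluation, as a minimal sketch: success rate, mean and best total reward, mean episode length, and mean steps to the goal over successful episodes only. It assumes each episode has been recorded as a `(total_reward, num_steps, reached_goal)` tuple. The function name `summarize_episodes`, the dictionary keys, and the sample values are all illustrative assumptions, not the course's code or measured results.

```python
from statistics import mean


def summarize_episodes(results):
    """Summarise a list of (total_reward, num_steps, reached_goal) tuples."""
    if not results:
        raise ValueError("results must contain at least one episode")
    rewards = [reward for reward, _, _ in results]
    steps = [n for _, n, _ in results]
    goal_steps = [n for _, n, reached in results if reached]
    return {
        "episodes": len(results),
        "success_rate": len(goal_steps) / len(results),
        "mean_reward": mean(rewards),
        "best_reward": max(rewards),
        "mean_steps": mean(steps),
        # Average length of successful episodes only (None if there were none)
        "mean_steps_to_goal": mean(goal_steps) if goal_steps else None,
    }


# Made-up sample input for illustration only, NOT measured results
sample_results = [
    (-200.0, 200, False),
    (-200.0, 200, False),
    (-150.0, 150, True),
    (-200.0, 200, False),
]

summary = summarize_episodes(sample_results)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Because MountainCar gives a reward of -1 per step, mean reward and mean steps carry the same information here; the success rate and the steps-to-goal figure are the more telling numbers for a random agent.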

### RGB Rendering

> <span style="color:blue">__Prompt:__ Human rendering is really slow. Please remove human rendering to run many episodes in no time. Print every 100th episode only.</span>

### Training an RL Agent with Q-Tables

> <span style="color:blue">__Prompt:__ I have the following code to run multiple random episodes of OpenAI's MountainCar challenge: ---insert code---</span>
>
> <span style="color:blue">Now, I want to create and train a Reinforcement Learning Agent that can reach the target with as few steps as possible and with a high success rate. Please build on the existing code and train the Agent. </span>
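
A minimal sketch of how tabular Q-learning could be applied to this task, under stated assumptions. MountainCar observations are continuous (position in [-1.2, 0.6], velocity in [-0.07, 0.07]), so they are first mapped onto a grid of bins before they can index a Q-table. The bin count, learning rate `alpha`, discount factor `gamma`, and exploration rate `epsilon` are illustrative choices, and all function names are hypothetical. `train` expects any environment with the Gymnasium `reset`/`step` interface.

```python
import random

# Observation bounds of MountainCar-v0: (position, velocity)
LOW = (-1.2, -0.07)
HIGH = (0.6, 0.07)
N_BINS = 20      # assumption: grid resolution per dimension
N_ACTIONS = 3    # push left, no push, push right


def discretize(observation):
    """Map a continuous (position, velocity) pair to a pair of bin indices."""
    indices = []
    for value, low, high in zip(observation, LOW, HIGH):
        ratio = (value - low) / (high - low)
        index = int(ratio * N_BINS)
        indices.append(min(max(index, 0), N_BINS - 1))  # clamp to the grid
    return tuple(indices)


def make_q_table():
    """Q[position_bin][velocity_bin][action], initialised to zero."""
    return [[[0.0] * N_ACTIONS for _ in range(N_BINS)] for _ in range(N_BINS)]


def choose_action(q_table, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    values = q_table[state[0]][state[1]]
    return values.index(max(values))


def q_update(q_table, state, action, reward, next_state, terminated,
             alpha=0.1, gamma=0.99):
    """One tabular Q-learning update for a single transition."""
    # Do not bootstrap from a terminal state (the goal)
    best_next = 0.0 if terminated else max(q_table[next_state[0]][next_state[1]])
    td_target = reward + gamma * best_next
    current = q_table[state[0]][state[1]][action]
    q_table[state[0]][state[1]][action] = current + alpha * (td_target - current)


def train(env, episodes, epsilon=0.1):
    """Train on a Gymnasium-style env and return the learned Q-table."""
    q_table = make_q_table()
    for _ in range(episodes):
        observation, info = env.reset()
        state = discretize(observation)
        terminated = truncated = False
        while not (terminated or truncated):
            action = choose_action(q_table, state, epsilon)
            observation, reward, terminated, truncated, info = env.step(action)
            next_state = discretize(observation)
            q_update(q_table, state, action, reward, next_state, terminated)
            state = next_state
    return q_table
```

A call such as `q_table = train(gym.make("MountainCar-v0"), episodes=5000)` would plug this into the real environment; the episode count is a placeholder, and the hyperparameters would likely need tuning before the agent reaches the flag reliably.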

> <span style="color:blue">__Prompt:__ XXX</span>

> <span style="color:blue">__Prompt:__ XXX</span>

> <span style="color:blue">__Prompt:__ XXX</span>