### Mountain Car Continuous Environment Control Problem
#### Solved by Using Reinforcement Learning with Deep Deterministic Policy Gradient

References:
- https://gymnasium.farama.org/environments/classic_control/mountain_car_continuous/
- https://github.com/Bduz/intro_pytorch/tree/main/intro_rl/ddpg
- https://spinningup.openai.com/en/latest/algorithms/ddpg.html

**Description:**
The Mountain Car MDP is a deterministic MDP in which a car starts at a random position near the bottom of a sinusoidal valley. The only available actions are accelerations applied to the car in either direction. The goal is to accelerate the car strategically so that it reaches the goal state on top of the right hill. Gymnasium offers two versions of the mountain car domain: one with discrete actions and one with continuous actions. This notebook uses the continuous-action version.

This MDP first appeared in Andrew Moore’s PhD thesis (1990).
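
To make the description above concrete, here is a minimal pure-Python sketch of the transition dynamics as they are documented for Gymnasium's `MountainCarContinuous-v0`. The function names `reset`, `step`, and `rollout` are illustrative only and are not part of Gym; the constants are taken from the environment documentation and should be checked against the installed version.

```python
import math
import random

# Constants as documented for MountainCarContinuous-v0 (assumed, not read from Gym)
MIN_POSITION, MAX_POSITION = -1.2, 0.6
MAX_SPEED = 0.07
GOAL_POSITION = 0.45
POWER = 0.0015


def reset(rng):
    """Start at a random position in [-0.6, -0.4] with zero velocity."""
    return (rng.uniform(-0.6, -0.4), 0.0)


def step(state, action):
    """Advance one time step and return (next_state, reward, terminated)."""
    position, velocity = state
    force = min(max(action, -1.0), 1.0)  # the applied force is clipped to [-1, 1]
    velocity += force * POWER - 0.0025 * math.cos(3 * position)
    velocity = min(max(velocity, -MAX_SPEED), MAX_SPEED)
    position += velocity
    position = min(max(position, MIN_POSITION), MAX_POSITION)
    if position == MIN_POSITION and velocity < 0:
        velocity = 0.0  # inelastic collision with the left wall
    terminated = position >= GOAL_POSITION
    # Small penalty for using force, plus a bonus of 100 on reaching the goal
    reward = -0.1 * action ** 2 + (100.0 if terminated else 0.0)
    return (position, velocity), reward, terminated


def rollout(policy, seed=0, max_steps=999):
    """Run one episode; return (steps_to_goal or None, total_reward)."""
    rng = random.Random(seed)
    state = reset(rng)
    total = 0.0
    for t in range(1, max_steps + 1):
        state, reward, terminated = step(state, policy(state))
        total += reward
        if terminated:
            return t, total
    return None, total


# Hand-written baseline: always push in the direction the car is already moving
steps, total = rollout(lambda s: 1.0 if s[1] >= 0 else -1.0)
print("steps to goal:", steps, "total reward:", total)
```

The engine is too weak to drive straight up the right hill from the valley floor, so a policy has to build momentum by swinging back and forth, which is what the hand-written baseline does. The agent trained below is expected to discover a similar strategy while also keeping the action penalty low.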

### 1 - Importing Necessary Libraries

In [None]:
# Cloning my repo (for google colab)
!git clone

In [None]:
# Adding the path to the sys (for google colab)
import sys
sys.path.insert(0, "/content/Deep_Learning/DRL_Deep_Reinforcement_Learning/DRL_DQN/DDPG_Mountain_Car_Continuous/")

In [16]:
# Importing Necessary Libraries
import gym
print("OpenAI Gym Version", gym.__version__)
import numpy as np
print("Numpy Version", np.__version__)
import random
import torch
from collections import deque
import matplotlib.pyplot as plt
from DDPG_Agent_model import DDPG_Agent

OpenAI Gym Version 0.26.2
Numpy Version 1.24.0


### 2 - Initialize the Environment and the Agent

In [21]:
# Initializing the environment
env = gym.make('MountainCarContinuous-v0')
print('State shape: ', env.observation_space.shape)
print('Number of actions: ', env.action_space.shape)
print('Maximum time step: ', env.spec.max_episode_steps) # Public equivalent of env._max_episode_steps

# Initializing the agent
agent = DDPG_Agent(state_size=2, action_size=1, random_seed=2)
print("Agent's State Size", agent.state_size)
print("Agent's Action Size", agent.action_size)
print("Agent's Random Seed", agent.seed)


State shape:  (2,)
Number of actions:  (1,)
Maximum time step:  999
Agent's State Size 2
Agent's Action Size 1
Agent's Random Seed None


### 3 - Train the Agent with DDPG

In [5]:
def ddpg(n_episodes=1000, max_t=400, print_every=100):
    scores_deque = deque(maxlen=print_every) # last 100 scores
    scores = [] # list containing scores from each episode

    # For each episode
    for i_episode in range(1, n_episodes+1):
        state, _ = env.reset() # Reset the environment (Gym >= 0.26 returns (observation, info))
        agent.reset() # Reset the agent
        score = 0 # Initialize the score
        
        # For each time step
        for t in range(max_t):
            action = agent.act(state)
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated # Gym >= 0.26 splits `done` into two flags
            agent.step(state, action, reward, next_state, done)
            state = next_state # Roll over the state to next time step
            score += reward # Update the score
            if done:
                break
        scores_deque.append(score) # Save most recent score to the deque
        scores.append(score) # Save most recent score to the list
        print('\rEpisode {}\tAverage Score: {:.2f}'.format(i_episode, np.mean(scores_deque)), end="")
        if i_episode % print_every == 0:
            print('\rEpisode {}\tAverage Score: {:.2f}'.format(i_episode, np.mean(scores_deque)))
        if np.mean(scores_deque)>=90.0:
            print('\nEnvironment solved in {:d} episodes!\tAverage Score: {:.2f}'.format(i_episode-print_every, np.mean(scores_deque)))
            # torch.save(agent.qnetwork_local.state_dict(), 'checkpoint.pth')
            break
    return scores

scores = ddpg()

# Plotting the scores
fig = plt.figure()
ax = fig.add_subplot(111)
plt.plot(np.arange(1,len(scores)+1), scores)
plt.ylabel('Score')
plt.xlabel('Episode #')
plt.show()


    

AttributeError: 'DDPG_Agent' object has no attribute 'noise'
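
The error above indicates that `DDPG_Agent` never defines a `noise` attribute, which is presumably accessed when `agent.reset()` or `agent.act()` runs. DDPG implementations commonly use an Ornstein-Uhlenbeck process for exploration noise. The following is a minimal pure-Python sketch of such a process; the class name `OUNoise`, its parameters, and their default values are assumptions and may not match what `DDPG_Agent_model` actually expects.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, size, seed=None, mu=0.0, theta=0.15, sigma=0.2):
        self.mu = [mu] * size             # long-run mean of each noise dimension
        self.theta = theta                # strength of the pull back towards mu
        self.sigma = sigma                # scale of the random perturbation
        self.rng = random.Random(seed)    # private generator for reproducibility
        self.reset()

    def reset(self):
        """Reset the internal state to the mean (call at the start of an episode)."""
        self.state = list(self.mu)

    def sample(self):
        """Update the internal state and return it as a noise sample."""
        self.state = [
            x + self.theta * (m - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x, m in zip(self.state, self.mu)
        ]
        return list(self.state)


# One noise dimension, matching action_size=1 in this notebook
noise = OUNoise(size=1, seed=2)
print([round(noise.sample()[0], 3) for _ in range(5)])
```

Under these assumptions, the agent's constructor would need a line such as `self.noise = OUNoise(action_size, random_seed)`, with `agent.reset()` calling `self.noise.reset()` and `agent.act()` adding `self.noise.sample()` to the actor's output before clipping it to the action range `[-1, 1]`.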