# Solving the mountain car problem with PSO

In this notebook, particle swarm optimization (PSO) is used to tune a PID controller that solves the [mountain car problem](https://en.wikipedia.org/wiki/Mountain_car_problem).

First, import the dependencies and define the maximum number of steps to take in each simulation.

In [None]:
import gym
import numpy as np
import pso
from simple_pid import PID
from env.mountain_car_env import Continuous_MountainCarEnv

MAX_STEPS = 2000

## Cost function

The key part of implementing PSO is defining a cost function. Each particle tries to minimize whatever signal we choose to set as the cost. The obvious choice is to minimize the error, so we could set the cost as the distance to the goal. In practice this had problems converging, so we ended up setting the cost as the force needed to get to the goal.
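
To make the two candidate costs concrete, here is a minimal sketch that scores a recorded trajectory both ways. The helper names `distance_cost` and `effort_cost`, the sample values, and the goal position of 0.45 are illustrative assumptions, not part of the notebook or the environment. The 1/100 scaling follows the comment in the notebook's cost function.

```python
def distance_cost(positions, goal):
    """Sum of how far short of the goal the car is at each step (0 once past it)."""
    return sum(abs(min(p - goal, 0.0)) for p in positions)


def effort_cost(forces):
    """Sum of squared forces, scaled by 1/100."""
    return sum(f * f for f in forces) / 100.0


positions = [-0.5, 0.0, 0.45]  # hypothetical car positions over three steps
forces = [1.0, -1.0, 0.5]      # hypothetical forces applied at those steps
goal = 0.45                    # assumed goal position

d_cost = distance_cost(positions, goal)
e_cost = effort_cost(forces)
print(d_cost, e_cost)
```

The distance cost only rewards ending up near the goal, while the effort cost penalizes every push along the way.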

First, we need to extract the PID parameters from the particle. Then the simulation can start, with the PID selecting each action from this cost signal (the reward returned by the previous step).

In [None]:
def mountain_sim(x):
    # Get pid params from particle
    k_p = x[0]
    k_i = x[1]
    k_d = x[2]

    # Create the pid
    pid = PID(k_p, k_i, k_d, setpoint=1)
    
    # Initialize sim
    obs = env.reset()
    done = False
    reward = 0
    errTotal = 0 # Cumulative error is our cost 

    for i in range(MAX_STEPS):
        #err = abs(min(obs[0] - GOAL_POS, 0)) # absolute value of: Position - goal
        action = pid(reward)
        obs, reward, done, info = env.step(action)
        # Get cost/err
        errTotal += reward # Reward corresponds to the square of force used divided by 100

        if done:
            break

    return errTotal
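
To show what the three gains extracted from the particle do, here is a minimal sketch of a textbook discrete PID update. The class `TinyPID` and its fixed time step `dt` are assumptions made for illustration. It is not the `simple_pid` implementation, which differs in details such as timing and how the derivative term is computed.

```python
class TinyPID:
    """Textbook PID with a fixed time step (illustrative only)."""

    def __init__(self, k_p, k_i, k_d, setpoint=0.0, dt=1.0):
        self.k_p, self.k_i, self.k_d = k_p, k_i, k_d
        self.setpoint = setpoint
        self.dt = dt
        self._integral = 0.0
        self._prev_error = None

    def __call__(self, measurement):
        error = self.setpoint - measurement
        # Integral term accumulates the error over time
        self._integral += error * self.dt
        # Derivative term is zero on the first call (no previous error yet)
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return (self.k_p * error
                + self.k_i * self._integral
                + self.k_d * derivative)


tiny = TinyPID(2.0, 0.5, 1.0, setpoint=1.0)
first = tiny(0.0)   # error = 1.0, integral = 1.0, derivative = 0.0
second = tiny(0.5)  # error = 0.5, integral = 1.5, derivative = -0.5
print(first, second)
```

Each particle in the swarm is simply one candidate triple `(k_p, k_i, k_d)` for a controller of this kind.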

## Set PSO parameters

To use PSO, the number of variables of each particle and the min/max values they can take must be defined. For a detailed explanation of each parameter, visit [yarpiz](https://yarpiz.com/50/ypea102-particle-swarm-optimization).

In [None]:
problem = {
    'CostFunction': mountain_sim,
    'nVar': 3,      # K_p, K_i, K_d
    'VarMin': -5,   # Min value of each parameter of the PID
    'VarMax': 5,    # Max value of each parameter of the PID
}
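
For intuition about how a swarm uses such a problem definition, here is a minimal PSO sketch in pure Python. The function `tiny_pso`, its default coefficients (`w`, `c1`, `c2`), and the toy `sphere` cost are assumptions made for illustration. It reuses the same dictionary keys as above, but it is not the `pso` module imported in this notebook, whose internals may differ.

```python
import random


def tiny_pso(problem, max_iter=200, pop_size=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize problem['CostFunction'] inside [VarMin, VarMax] (illustrative)."""
    rng = random.Random(seed)
    cost = problem['CostFunction']
    n, lo, hi = problem['nVar'], problem['VarMin'], problem['VarMax']

    # Random initial positions inside the bounds, zero initial velocities
    pos = [[rng.uniform(lo, hi) for _ in range(n)] for _ in range(pop_size)]
    vel = [[0.0] * n for _ in range(pop_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(pop_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(max_iter):
        for i in range(pop_size):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clip to the allowed range
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(x):
    """Toy cost with its minimum (0) at the origin."""
    return sum(v * v for v in x)


toy_problem = {'CostFunction': sphere, 'nVar': 3, 'VarMin': -5, 'VarMax': 5}
best_x, best_cost = tiny_pso(toy_problem)
print(best_x, best_cost)
```

In the notebook the toy cost is replaced by a full simulation, so every cost evaluation is far more expensive than in this sketch.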

## Run PSO

Now that the problem is well defined, we can run PSO and wait for a solution to our PID.
You can run this section several times until you get a solution.

First, start the simulation of the mountain car problem.

In [None]:
env = Continuous_MountainCarEnv()
GOAL_POS = env.goal_position # Get goal position from env

Set some PSO hyper-parameters and start the optimization. This may take some time depending on MAX_STEPS, PopSize and MaxIter.

In [None]:
# Running PSO
pso.tic()
print('Running PSO ...')
gbest, pop = pso.PSO(problem, MaxIter=200, PopSize=25)
env.close()
pso.toc()
print('Global Best:')
print(gbest)
print()
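
The run time of the optimization above is dominated by environment steps. The sketch below gives a rough upper bound on how many are taken, assuming one cost evaluation per particle per iteration and ignoring any evaluations done while initializing the swarm. Both are assumptions about the `pso` module, not confirmed here.

```python
MAX_STEPS = 2000  # steps per simulation, as defined at the top of the notebook
MAX_ITER = 200    # value passed as MaxIter
POP_SIZE = 25     # value passed as PopSize

# Worst case: every simulation runs for the full MAX_STEPS
worst_case_steps = MAX_ITER * POP_SIZE * MAX_STEPS
print(worst_case_steps)
```

Simulations that reach the goal stop early, so the actual count is usually lower. Halving any one of the three values halves the bound.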

## Enjoy the best agent

Now that the optimization is over, we take the best agent from PSO and try it.

In [None]:
# Enjoy best
k_p = gbest["position"][0]
k_i = gbest["position"][1]
k_d = gbest["position"][2]

# If the optimization did not find a solution, try this PID

#k_p = -5 
#k_i = 0.20339942 
#k_d = -5 


pid = PID(k_p, k_i, k_d, setpoint=1)

# Initialize sim
env = Continuous_MountainCarEnv()
obs = env.reset()
done = False

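for i in range(MAX_STEPS):
    env.render()
    # Position error relative to the goal, clamped at 0 once the goal is passed.
    # Note: mountain_sim feeds the PID with the reward instead.
    err = min(obs[0] - GOAL_POS, 0)
    action = pid(err)
    obs, reward, done, info = env.step(action)
    if done:  # Stop once the episode has finished
        break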

env.close()