<a href="https://colab.research.google.com/github/Ptisni/Cartpole-course/blob/main/Cartpole_OC.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **3 Ways to Control a Cartpole**
### Session 2 - Optimal Control
By Peter Tisnikar \\
February 2022 (Version 1)  

---  
  

**Welcome back**! This week, we introduced the concept of **Optimal Control**. In this notebook, we will use the state-space equations derived in the lecture to compute an optimal controller for the cartpole in OpenAI Gym.

#### Refresher (or introduction, if you just joined us) to Jupyter notebooks (which you can skip if you have used them before and know how they work):  

Jupyter notebooks are interactive documents built from **cells**, which contain either text (like this one!) or snippets of code. You can run each cell individually by pressing the run button in its upper left corner, or you can select **Run All** (Ctrl/Cmd + F9) from the **Runtime** menu above.  

In this document, the text cells will guide you through building your own implementation of an optimal controller, along with minimal theory and reminders from the lectures. They also contain the instructions for the code cells directly below them and tell you whether you need to modify those cells. Some of the code cells are helper functions (e.g. to visualise the environment or install the packages), so you will need to run them and **not change** them.  

To start, run the two cells below to install the environment we will use ([OpenAI Gym](https://gym.openai.com/envs/CartPole-v1/)) and to enable rendering of the simulation in Google Colab. Once that is done, proceed to the next text cell to start building the optimal controller.

*Code in cells 1 and 2 reused from [here](https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_01_ai_gym.ipynb)*. 


In [None]:
# DO NOT MODIFY THIS CELL!
#--------------------------------------------------------------
!apt-get update > /dev/null 2>&1
!pip install gym pyvirtualdisplay > /dev/null 2>&1
!apt-get install -y xvfb python-opengl ffmpeg > /dev/null 2>&1
!apt-get install cmake > /dev/null 2>&1
!pip install --upgrade setuptools 2>&1
!pip install ez_setup > /dev/null 2>&1
!pip install gym[all] > /dev/null 2>&1

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting setuptools
  Downloading setuptools-62.3.2-py3-none-any.whl (1.2 MB)
     |████████████████████████████████| 1.2 MB 25.0 MB/s 
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 57.4.0
    Uninstalling setuptools-57.4.0:
      Successfully uninstalled setuptools-57.4.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
Successfully installed setuptools-62.3.2


In [None]:
# DO NOT MODIFY THIS CELL!
#--------------------------------------------------------------
import gym
from gym.wrappers import Monitor
import glob
import io
import base64
from IPython.display import HTML
from pyvirtualdisplay import Display
from IPython import display as ipythondisplay
import numpy as np
from scipy import linalg
import time

display = Display(visible=0, size=(1400, 900))
display.start()

"""
Utility functions to enable video recording of gym environment 
and displaying it.
To enable video, just do "env = wrap_env(env)"
"""

def show_video():
  mp4list = glob.glob('video/*.mp4')
  if len(mp4list) > 0:
    mp4 = mp4list[0]
    video = io.open(mp4, 'r+b').read()
    encoded = base64.b64encode(video)
    ipythondisplay.display(HTML(data='''<video alt="test" autoplay 
                loop controls style="height: 400px;">
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii'))))
  else: 
    print("Could not find video")
    

def wrap_env(env):
  env = Monitor(env, './video', force=True)
  return env

# Building an Optimal Controller
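
As a compact reminder of what the next code cell computes, the linearised model and the LQR problem can be written as below. This is a sketch read off the code rather than a re-derivation of the lecture material. It assumes the state vector follows the order of the Gym observation (cart position, cart velocity, pole angle, pole angular velocity) and that $u$ is the horizontal force on the cart.

```latex
\begin{gather*}
\dot{\mathbf{x}} = A\mathbf{x} + Bu, \qquad
\mathbf{x} = \begin{bmatrix} x & \dot{x} & \theta & \dot{\theta} \end{bmatrix}^{\top} \\[4pt]
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & g/a & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & g/a & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 0 \\ b \\ 0 \\ -1/a \end{bmatrix}, \qquad
a = l\left(\tfrac{4}{3} - \tfrac{m}{m+M}\right), \quad b = \tfrac{1}{m+M} \\[4pt]
J = \int_{0}^{\infty} \left( \mathbf{x}^{\top} Q\, \mathbf{x} + u^{\top} R\, u \right) dt, \qquad Q = 10\, I_4, \quad R = 1 \\[4pt]
A^{\top} P + P A - P B R^{-1} B^{\top} P + Q = 0, \qquad
K = R^{-1} B^{\top} P, \qquad u = -K\mathbf{x}
\end{gather*}
```

The third line is the cost being minimised. The last line is the continuous-time algebraic Riccati equation that `linalg.solve_continuous_are` solves for $P$, followed by the resulting state-feedback gain and control law.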

In [None]:
env = gym.make("CartPole-v1")
env = wrap_env(env)

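# Physical parameters of the simulated cartpole
m = env.masspole  # mass of the pole
M = env.masscart  # mass of the cart
l = env.length    # half the pole length (Gym convention)
g = env.gravity   # gravitational acceleration

# Shorthand terms of the linearised model
a = l*(4/3 - m/(m+M))
b = 1/(m+M)

# State matrix for the state [x, x_dot, theta, theta_dot]
A = np.array([[0, 1, 0, 0],
             [0, 0, g/a, 0],
             [0, 0, 0, 1],
             [0, 0, g/a, 0]])

# Input matrix: effect of the cart force on each state derivative
B = np.array([[0], [b], [0], [-1/a]])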

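# LQR weights: Q penalises state error, R penalises control effort
Q = 10 * np.eye(4, dtype=int)

R = np.eye(1, dtype=int)

# Solve the continuous-time algebraic Riccati equation for P
P = linalg.solve_continuous_are(A, B, Q, R)

# Optimal state-feedback gain K = R^-1 B^T P
K = np.dot(np.linalg.inv(R), np.dot(B.T, P))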

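observation = env.reset()
action = 0  # Gym action: 0 pushes the cart left, 1 pushes it right

for step in range(50000): # This is the simulation loop
  env.render()

  # Keep the initial action for the first four steps
  if step < 4:
    action = 0

  observation, reward, done, info = env.step(action)

  # Continuous LQR command u = -K x for the new observation
  command = -np.dot(K, observation).item()

  # CartPole only accepts a left/right push, so keep just the sign
  action = 1 if command > 0 else 0
  if done:
    break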

env.close()
show_video()
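
If you want to check the controller without running the simulation, one option is to compare the poles of the linearised model with and without feedback. The sketch below is not part of the original exercise. It hard-codes the CartPole default parameters (an assumption, so that it runs without Gym) and rebuilds the same `A`, `B`, `Q` and `R` as the cell above.

```python
import numpy as np
from scipy import linalg

# Assumed CartPole defaults, hard-coded so this check runs without Gym.
m, M, l, g = 0.1, 1.0, 0.5, 9.8

# Same shorthand terms and matrices as in the controller cell.
a = l * (4 / 3 - m / (m + M))
b = 1 / (m + M)

A = np.array([[0, 1, 0, 0],
              [0, 0, g / a, 0],
              [0, 0, 0, 1],
              [0, 0, g / a, 0]], dtype=float)
B = np.array([[0], [b], [0], [-1 / a]], dtype=float)
Q = 10 * np.eye(4)
R = np.eye(1)

# Solve the Riccati equation and form the feedback gain K = R^-1 B^T P.
P = linalg.solve_continuous_are(A, B, Q, R)
K = linalg.solve(R, B.T @ P)

# Poles of the uncontrolled model and of the closed loop x_dot = (A - B K) x.
open_loop_poles = np.linalg.eigvals(A)
closed_loop_poles = np.linalg.eigvals(A - B @ K)

print("open-loop poles:  ", np.round(open_loop_poles, 3))
print("closed-loop poles:", np.round(closed_loop_poles, 3))
```

A pole with a positive real part means the upright pole falls over on its own. With the LQR feedback applied, all closed-loop poles should have negative real parts. Keep in mind that this only describes the linearised, continuous-force model: the Gym environment accepts only a fixed push to the left or right, so the simulated behaviour will differ from this idealised result.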