# "TF-Agents-CartPole"

> Reinforcement Learning (RL) to control the balancing of a pole on a moving cart
- toc: true
- branch: master
- badges: false
- comments: true
- hide: false
- search_exclude: true
- image: images/graphical_representation_of_rl.png
- categories: [Control, RL, TensorFlow, TFAgents, Python]
- show_tags: true

In [2]:
#hide
# from google.colab import drive
# drive.mount('/content/gdrive', force_remount=True)
# root_dir = "/content/gdrive/My Drive/"
# base_dir = root_dir + 'RL/TF-Agents/blog_posts/TF-Agents-CartPole/'
# # base_dir = ""

Mounted at /content/gdrive


## 1. Introduction

The cart-pole problem is often regarded as the "Hello World" of Reinforcement Learning (RL). It was described by [Barto et al. (1983)](http://www.derongliu.org/adp/adp-cdrom/Barto1983.pdf). The physics of the system is as follows:

* All motion happens in a vertical plane
* A hinged pole is attached to a cart
* The cart slides horizontally on a track in an effort to balance the pole vertically
* The system has four state variables:
  * $x$: displacement of the cart
  * $\theta$: angle of the pole from the vertical
  * $\dot{x}$: velocity of the cart
  * $\dot{\theta}$: angular velocity of the pole
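
As a minimal sketch, the four state variables can be held in a small named tuple. The field order below follows the observation layout of Gym's CartPole environments (position, velocity, angle, angular velocity). The `CartPoleState` name, the use of radians, and the sample values are illustrative assumptions, not part of TF-Agents.

```python
import math
from typing import NamedTuple


class CartPoleState(NamedTuple):
    """Hypothetical container for the four cart-pole state variables."""
    x: float          # displacement of the cart
    x_dot: float      # velocity of the cart
    theta: float      # angle of the pole from the vertical (radians, assumed)
    theta_dot: float  # angular velocity of the pole


# Sample values chosen only for illustration.
state = CartPoleState(x=0.0, x_dot=0.1, theta=math.radians(2.0), theta_dot=-0.05)
print(state)
```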


## 2. Purpose

The purpose of this blog post is to construct and train an entity, which we will call a *controller*, that manages the horizontal motion of the cart so that the pole remains as close to vertical as possible. The controlled entity is, of course, the *cart and pole* system.
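
To make the idea of a controller concrete, here is a hand-written baseline. It is not the learned controller this post builds, only a sketch of the input-to-action mapping a controller provides. It assumes the action encoding of Gym's CartPole environments (0 pushes the cart left, 1 pushes it right) and that a positive angle means the pole leans to the right. The name `naive_controller` is hypothetical.

```python
PUSH_LEFT, PUSH_RIGHT = 0, 1  # assumed action encoding (Gym CartPole convention)


def naive_controller(theta: float) -> int:
    """Push the cart toward the side the pole is leaning to."""
    return PUSH_RIGHT if theta > 0.0 else PUSH_LEFT


# Try the rule on a few pole angles (radians, assumed).
for angle in (-0.05, 0.0, 0.05):
    print(angle, naive_controller(angle))
```

A rule like this looks only at the angle and ignores the velocities, which is one reason to learn a controller from experience instead.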

## 3. TF-Agents Setup

We will use the TensorFlow TF-Agents framework. In addition, this notebook needs to run in Google Colab.

In [3]:
!sudo apt-get install -y xvfb ffmpeg
!pip install 'imageio==2.4.0'
!pip install pyvirtualdisplay
!pip install tf-agents

Reading package lists... Done
Building dependency tree       
Reading state information... Done
ffmpeg is already the newest version (7:3.4.8-0ubuntu0.2).
xvfb is already the newest version (2:1.19.6-1ubuntu4.8).
0 upgraded, 0 newly installed, 0 to remove and 16 not upgraded.


In [4]:
from __future__ import absolute_import, division, print_function
import base64
import imageio
import IPython
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import pyvirtualdisplay
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.drivers import dynamic_step_driver
from tf_agents.environments import suite_gym
from tf_agents.environments import tf_py_environment
from tf_agents.eval import metric_utils
from tf_agents.metrics import tf_metrics
from tf_agents.networks import q_network
from tf_agents.policies import random_tf_policy
from tf_agents.replay_buffers import tf_uniform_replay_buffer
from tf_agents.trajectories import trajectory
from tf_agents.utils import common

In [5]:
tf.version.VERSION

'2.4.0'

The following is needed for rendering a virtual display:

In [6]:
tf.compat.v1.enable_v2_behavior()
display = pyvirtualdisplay.Display(visible=0, size=(1400, 900)).start()



In [8]:
base_dir

'/content/gdrive/My Drive/RL/TF-Agents/blog_posts/TF-Agents-CartPole/'

## 4. Hyperparameters

Here we specify all the hyperparameters for the problem:

In [9]:
NUM_ITERATIONS = 20000

INITIAL_COLLECT_STEPS = 100
COLLECT_STEPS_PER_ITERATION = 1
REPLAY_BUFFER_MAX_LENGTH = 100000

BATCH_SIZE = 64
LEARNING_RATE = 1e-3
LOG_INTERVAL = 200

NUM_EVAL_EPISODES = 10
EVAL_INTERVAL = 1000
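
A quick arithmetic check shows what these values imply. It assumes a training loop that collects `COLLECT_STEPS_PER_ITERATION` environment steps per iteration, stores one buffer entry per collected step, and logs and evaluates every `LOG_INTERVAL` and `EVAL_INTERVAL` iterations respectively. The loop itself is not shown in this section, so treat this as a sketch under those assumptions. The constants are repeated so the snippet runs on its own; the derived variable names are illustrative.

```python
NUM_ITERATIONS = 20000
INITIAL_COLLECT_STEPS = 100
COLLECT_STEPS_PER_ITERATION = 1
REPLAY_BUFFER_MAX_LENGTH = 100000
LOG_INTERVAL = 200
EVAL_INTERVAL = 1000

# Steps gathered before training plus steps gathered during training.
total_collected = INITIAL_COLLECT_STEPS + NUM_ITERATIONS * COLLECT_STEPS_PER_ITERATION

# How often the assumed loop would log and evaluate during training.
num_log_events = NUM_ITERATIONS // LOG_INTERVAL
num_evaluations = NUM_ITERATIONS // EVAL_INTERVAL

# Whether the collected experience would exceed the buffer capacity.
buffer_overflows = total_collected > REPLAY_BUFFER_MAX_LENGTH

print(total_collected)   # 20100
print(num_log_events)    # 100
print(num_evaluations)   # 20
print(buffer_overflows)  # False
```

Under these assumptions the replay buffer never fills, so no collected experience would be discarded during training.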

## 5. Graphical Representation of the Problem

We will work with a graphical representation of the cart-and-pole problem rather than describe it in words alone. The graphic also includes some TF-Agents specifics. Here is the representation:

![Figure 1 Graphical Representation](./graphical_representation_of_rl.png)

In [10]:
#hide
# import Image
# im = PIL.Image.open(base_dir+'graphical_representation_of_rl.png')