# Rendering OpenAI Gym Envs on Binder and Google Colab 
> How to create a virtual display so that Gym environments can render on headless remote servers.

- branch: 2020-04-16-remote-rendering-gym-envs
- badges: true
- image: images/
- comments: true
- author: David R. Pugh
- categories: [binder, google-colab]

Getting OpenAI Gym environments that support rendering to render properly on remote servers such as Google Colab and Binder turned out to be more challenging than I expected. In this post I lay out my solution in the hope that it saves others the time and effort of working it out independently (that said, I did learn quite a bit by figuring it all out on my own).

# Google Colab Preamble

## Install X11 system dependencies

Install necessary [X11](https://en.wikipedia.org/wiki/X_Window_System) dependencies, in particular [Xvfb](https://www.x.org/releases/X11R7.7/doc/man/man1/Xvfb.1.xhtml), which is an X server that can run on machines with no display hardware and no physical input devices. 

In [None]:
!apt-get install -y xvfb x11-utils
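
One way to confirm that the install worked is to look for the executables on the `PATH` from Python. The sketch below assumes that the `xvfb` package provides a binary named `Xvfb` and that `x11-utils` provides `xdpyinfo`; the helper name `missing_executables` is hypothetical and not part of any library.

```python
import shutil
from typing import Iterable, List


def missing_executables(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed binary names: Xvfb (from xvfb) and xdpyinfo (from x11-utils).
missing = missing_executables(["Xvfb", "xdpyinfo"])
print("missing:", missing if missing else "none")
```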

## Install additional Python dependencies

Now that you have installed Xvfb, you need to install a Python wrapper, 
[`pyvirtualdisplay`](https://github.com/ponty/PyVirtualDisplay), in order to interact with Xvfb 
virtual displays from within Python. Next you need to install the Python bindings for 
[OpenGL](https://www.opengl.org/): [PyOpenGL](http://pyopengl.sourceforge.net/) and 
[PyOpenGL-accelerate](https://pypi.org/project/PyOpenGL-accelerate/). The former are the actual 
Python bindings; the latter is an optional set of C (Cython) extensions that accelerate 
common operations at slow points in PyOpenGL 3.x.

In [None]:
!pip install pyvirtualdisplay==0.2.* PyOpenGL==3.1.* PyOpenGL-accelerate==3.1.*
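
A quick sanity check after the install is to ask Python whether it can locate each package without actually importing it. This is a minimal sketch: the helper `check_importable` is hypothetical, and the import names `OpenGL` and `OpenGL_accelerate` are the names I would expect PyOpenGL and PyOpenGL-accelerate to install under.

```python
import importlib.util
from typing import Dict, Iterable


def check_importable(module_names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


# Assumed import names for the three packages installed above.
status = check_importable(["pyvirtualdisplay", "OpenGL", "OpenGL_accelerate"])
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'missing'}")
```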

## Install OpenAI Gym

In [None]:
!pip install gym[box2d]==0.17.* 

## Create a virtual display in the background

Next you need to create a virtual display in the background which the Gym Envs can connect to for rendering purposes. You can check that there is no display at present by confirming that the value of the [`DISPLAY`](https://askubuntu.com/questions/432255/what-is-the-display-environment-variable) environment variable has not yet been set. 

In [None]:
!echo $DISPLAY

The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the `size` of the virtual buffer as you like but you must set `visible=False` when working with Xvfb. 

**This code only needs to be run once per session to start the display.**

In [None]:
import pyvirtualdisplay


_display = pyvirtualdisplay.Display(visible=False,  # use False with Xvfb
                                    size=(1400, 900))
_ = _display.start()

After running the cell above you can echo out the value of the `DISPLAY` environment variable again to confirm that you now have a display running.

In [None]:
!echo $DISPLAY
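
The same check can be done from Python rather than the shell, which is handy if you want later cells to fail early when no display is running. The helper name `current_display` is hypothetical; the sketch only assumes that the display is advertised through the `DISPLAY` environment variable of the notebook process.

```python
import os
from typing import Mapping, Optional


def current_display(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the value of DISPLAY, or None when it is unset or empty."""
    return environ.get("DISPLAY") or None


# None here means no display has been started yet in this session.
print(current_display())
```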

# Binder Preamble

## No additional installation required!

Unlike Google Colab, with Binder you can bake all the required dependencies (including the X11 system dependencies!) into the Docker image on which the Binder instance is based using Binder config files. These config files can live either in the root directory of your Git repo or in a `binder` sub-directory, as is the case here. If you are interested in learning more about Binder, then check out the documentation for [BinderHub](https://binderhub.readthedocs.io/en/latest/), which is the underlying technology behind the Binder project.

In [6]:
# config file for system dependencies
!cat ../binder/apt.txt

#!/bin/bash

apt-get install -y xvfb x11-utils


In [8]:
# config file describing the conda environment
!cat ../binder/environment.yml

name: null

channels:
  - conda-forge
  - defaults

dependencies:
  - gym-box2d=0.17
  - jupyterlab=2.0
  - matplotlib=3.2
  - pip=20.0
  - python=3.7
  - pyvirtualdisplay=0.2


In [9]:
# config file containing python deps not available via conda channels
!cat ../binder/requirements.txt

PyOpenGL==3.1.*
PyOpenGL-accelerate==3.1.*
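
To see at a glance which of these config files a repo provides, something like the following could be used. It is only a sketch: `find_config_files` is a hypothetical helper, it looks only for the three file names shown in this post, and it assumes that a `binder` sub-directory, when present, is used instead of the repo root.

```python
from pathlib import Path
from typing import List

# File names shown in this post; Binder recognises other config files as well.
CONFIG_FILES = ("apt.txt", "environment.yml", "requirements.txt")


def find_config_files(repo_root: Path) -> List[Path]:
    """Return the config files found in `binder/` if it exists, else in the root."""
    binder_dir = repo_root / "binder"
    search_dir = binder_dir if binder_dir.is_dir() else repo_root
    return [search_dir / name for name in CONFIG_FILES
            if (search_dir / name).is_file()]


# The notebook in this post lives one level below the repo root.
for path in find_config_files(Path("..")):
    print(path)
```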


## Create a virtual display in the background

Next you need to create a virtual display in the background which the Gym Envs can connect to for rendering purposes. You can check that there is no display at present by confirming that the value of the [`DISPLAY`](https://askubuntu.com/questions/432255/what-is-the-display-environment-variable) environment variable has not yet been set.

In [None]:
!echo $DISPLAY

The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the `size` of the virtual buffer as you like but you must set `visible=False` when working with Xvfb. 

**This code only needs to be run once per session to start the display.**

In [None]:
import pyvirtualdisplay


_display = pyvirtualdisplay.Display(visible=False,  # use False with Xvfb
                                    size=(1400, 900))
_ = _display.start()

After running the cell above you can echo out the value of the `DISPLAY` environment variable again to confirm that you now have a display running.

In [1]:
!echo $DISPLAY

/private/tmp/com.apple.launchd.CPYx1vyZ97/org.macosforge.xquartz:0


# Demo

Just to prove that the above setup works as advertised I will run a short simulation.

First I will define some useful type aliases for `State` (modeled as a NumPy array) and `Agent` (modeled as a function that maps a `State` to an `Action`).

In [3]:
import typing

import numpy as np


# represent states as arrays and actions as ints
State = np.ndarray
Action = int

# agent is just a function! 
Agent = typing.Callable[[State], Action]

In [None]:
import gym
import matplotlib.pyplot as plt
from IPython import display


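def simulate(agent: Agent, env: gym.Env) -> None:
    """Run one episode, redrawing the rendered frame in the notebook at each step."""
    state = env.reset()
    # draw the first frame once; later frames reuse the same image object
    img = plt.imshow(env.render(mode='rgb_array'))
    plt.axis('off')
    done = False
    while not done:
        action = agent(state)
        img.set_data(env.render(mode='rgb_array'))
        # show the updated figure, then clear it just before the next one is drawn
        display.display(plt.gcf())
        display.clear_output(wait=True)
        state, _, done, _ = env.step(action)
    env.close()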

In [None]:
def uniform_random_policy(state: State,
                          number_actions: int,
                          random_state: np.random.RandomState) -> Action:
    return random_state.randint(number_actions)


def make_agent(number_actions: int,
               random_state: typing.Optional[np.random.RandomState] = None) -> Agent:
    # fall back to a freshly seeded generator when none is supplied
    _random_state = np.random.RandomState() if random_state is None else random_state
    return lambda state: uniform_random_policy(state, number_actions, _random_state)

Choose an environment (note that not all of the OpenAI Gym environments support rendering).

In [None]:
lunar_lander_v2 = gym.make('LunarLander-v2')
_ = lunar_lander_v2.seed(42)

In [None]:
random_agent = make_agent(lunar_lander_v2.action_space.n, random_state=None)
simulate(random_agent, lunar_lander_v2)
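
The `simulate` function redraws the figure on every step and runs until the episode ends. If you would rather cap the episode length and keep the frames for later use, a variant along the following lines might help. It is a sketch, not part of the original demo: `collect_frames` and its `max_steps` parameter are hypothetical, and it assumes the same Gym 0.17 style `reset`, `step`, and `render(mode='rgb_array')` calls used above.

```python
from typing import Any, Callable, List


def collect_frames(agent: Callable[[Any], int],
                   env: Any,
                   max_steps: int = 200) -> List[Any]:
    """Run one episode for at most `max_steps` steps and return the rendered frames."""
    frames = []
    state = env.reset()
    for _ in range(max_steps):
        # capture the frame for the current state before acting on it
        frames.append(env.render(mode="rgb_array"))
        state, _reward, done, _info = env.step(agent(state))
        if done:
            break
    env.close()
    return frames
```

For example, `frames = collect_frames(random_agent, lunar_lander_v2, max_steps=100)` would return a list of at most 100 frames, each of which can be shown with `plt.imshow`.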