# Homework 0

This assignment is meant to guide you through installing the software components you will be using for the remainder of this course. It begins with a brief overview of each component and the role it plays in simulating and training mixed-autonomy network scenarios. It then directs you to the installation instructions you will use to get everything set up on your local machine. Finally, the script at the end of this assignment tests that the installation was successful. If you see vehicles traversing a network shaped like a figure eight after running it, then you are all set to begin exploring the world of multi-agent reinforcement learning in the context of mixed-autonomy traffic!

## 1. Software components

In this course, we will mostly rely on four software packages: TensorFlow, OpenAI Gym, SUMO, and Flow.

TensorFlow is an open-source machine learning framework developed by Google Brain. It is capable of automatically differentiating computational models/graphs (e.g. deep fully connected networks). The tools offered by this package greatly simplify the process of designing and implementing deep learning and deep RL algorithms, as we will see in homework 2.
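
To make the idea of automatic differentiation concrete, the sketch below implements forward-mode differentiation with dual numbers in plain Python. It is only an illustration: the `Dual` class and the `derivative` helper are hypothetical names, not part of TensorFlow, and TensorFlow itself differentiates a computational graph in reverse mode rather than using this approach.

```python
class Dual:
    """A number of the form value + deriv * eps, where eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def _lift(self, other):
        # Treat plain numbers as constants (derivative zero).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def derivative(f, x):
    """Return df/dx at x by evaluating f on a dual number seeded with deriv=1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return 3 * x * x + 2 * x + 1


slope = derivative(f, 2.0)  # analytically, f'(x) = 6x + 2, so f'(2) = 14
print(slope)
```

The function `f` is written as ordinary Python arithmetic, and its derivative is obtained without deriving it by hand. This is the convenience that an automatic differentiation framework provides at a much larger scale.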

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. It introduces a standardized implementation of MDP tasks called “environments”, which allows researchers to collaborate by testing their algorithms on similar benchmarks. Commonly used benchmarks include Atari games and Multi-Joint dynamics with Contact (MuJoCo) tasks, examples of which can be seen in the figure below.

<img src="img/gym_envs.png" width="900">
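
The environment abstraction can be sketched without installing anything. The toy class below mimics the classic Gym interface, in which `reset()` returns an initial observation and `step(action)` returns an `(observation, reward, done, info)` tuple. `CountdownEnv`, `run_episode`, and the dynamics are made up for illustration and do not subclass `gym.Env`; newer Gym releases have also changed these return signatures.

```python
class CountdownEnv:
    """Toy MDP: the state is an integer counter that the agent drives to 0."""

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self):
        # Begin a new episode and return the initial observation.
        self.state = self.start
        return self.state

    def step(self, action):
        # Action 1 decrements the counter; any other action leaves it unchanged.
        if action == 1:
            self.state -= 1
        done = self.state == 0
        # A penalty of -1 per step until the counter reaches 0.
        reward = 0.0 if done else -1.0
        return self.state, reward, done, {}


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the sum of rewards."""
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
        if done:
            break
    return total


episode_return = run_episode(CountdownEnv(start=3), policy=lambda obs: 1)
print(episode_return)
```

Because `run_episode` only relies on `reset` and `step`, the same loop works for any environment that follows this interface, which is what makes shared benchmarks possible.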

SUMO (Simulation of Urban Mobility) is an open-source tool for simulating traffic at the level of individual vehicles/agents. Through SUMO, users can design or import custom networks such as the ones seen in the figures below, and control certain aspects of the simulation (such as the accelerations of individual vehicles, or the state of a traffic light) via terminal or GUI commands.

<img src="img/sumo.png" width="700">

Finally, tying all the above components together, we introduce Flow. Flow is a Python library that interfaces the reinforcement learning (RL) libraries [RLlib](https://ray.readthedocs.io/en/latest/rllib.html) and [rllab](https://rllab.readthedocs.io/en/latest/) with SUMO. It enables the systematic creation of a variety of traffic-oriented RL tasks for the purpose of generating control strategies for autonomous vehicles, traffic lights, etc. These environments are compatible with OpenAI Gym in order to promote integration with the majority of training algorithms currently being developed by the RL community. For details on the architecture and on training autonomous vehicles to maximize system-level velocity, please refer to: 

C. Wu, A. Kreidieh, K. Parvate, E. Vinitsky, A. Bayen, "Flow: Architecture and Benchmarking for Reinforcement Learning in Traffic Control," CoRR, vol. abs/1710.05465, 2017. [Download](https://arxiv.org/abs/1710.05465).


## 2. Installing the software components

### 2.1 Using Windows? (If not, continue to 2.2)

Not all the software packages described above work natively on Windows. If you are using Windows 10, we recommend installing the Windows Subsystem for Linux (WSL) on your device. To do so:

- Go to the Windows Store and download “Ubuntu 18.04”
- Download the Xming X Server for Windows: https://sourceforge.net/projects/xming/
- Run WSL from the start menu by typing “Ubuntu 18.04”
    - The first time you open an Ubuntu terminal, type: `echo "export DISPLAY=:0" >> ~/.bashrc && source ~/.bashrc`
    - For the graphical user interface to work properly, make sure Xming is also running whenever you open a new terminal
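
If a GUI window later fails to open, one thing worth checking is whether the `DISPLAY` variable is visible to Python. The helper below is a hypothetical convenience function, not part of any of the packages above; it only inspects the environment and cannot tell whether Xming is actually running.

```python
import os


def display_status(environ=None):
    """Describe whether the DISPLAY variable is set in the given environment."""
    environ = os.environ if environ is None else environ
    display = environ.get("DISPLAY", "")
    if not display:
        return "DISPLAY is not set; GUI windows will likely fail to open"
    return "DISPLAY is set to " + display


print(display_status())
```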

If you are using an earlier version of Windows, your only other option is to install a virtual machine (e.g. [VirtualBox](https://www.virtualbox.org/wiki/Downloads)), set up an [Ubuntu](https://www.ubuntu.com/download/desktop) virtual environment, and install everything you need onto it. If you are in this situation and need some help setting up a virtual environment, please talk to one of the GSIs.

### 2.2 Installation instructions

You are now ready to install all the software components mentioned in section 1. To do so, follow the setup instructions located at: https://github.com/flow-project/flow/blob/master/docs/source/flow_setup.rst. Some complications may emerge as you try to install certain packages. If so, please feel free to ask one of the GSIs for help.
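
Before moving on, it can be useful to confirm that the main pieces are visible from Python. The sketch below uses only the standard library. The module names `tensorflow`, `gym`, and `flow` and the executable names `sumo` and `sumo-gui` are assumed from the packages described in section 1 and may differ on your system; `check_installation` is a hypothetical helper, not part of Flow.

```python
import importlib.util
import shutil


def check_installation(modules, executables):
    """Map each module or executable name to True if it can be located."""
    found = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        found[name] = importlib.util.find_spec(name) is not None
    for name in executables:
        # shutil.which returns None when the program is not on the PATH.
        found[name] = shutil.which(name) is not None
    return found


report = check_installation(
    modules=["tensorflow", "gym", "flow"],
    executables=["sumo", "sumo-gui"],
)
for name, ok in report.items():
    print(name, "found" if ok else "MISSING")
```

A `MISSING` entry only means the name could not be located from the current Python environment, so make sure the environment created during setup is active before running the check.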

## 3. Testing your installation

You are finally ready to verify that your installation was successful. To do so, run the cell below. A window will open showing a road network shaped like a figure eight. Click on the <img style="display:inline;" src="img/play_button.png"> Play button, and vehicles will appear on the network and begin moving in a single direction. Once the simulation is complete, a few statistics describing the cumulative return and average speed of vehicles in the network will appear below the cell.

```python
from flow.controllers import IDMController, StaticLaneChanger, ContinuousRouter
from flow.core.experiment import SumoExperiment
from flow.core.params import SumoParams, EnvParams, NetParams
from flow.core.vehicles import Vehicles
from flow.envs.loop.loop_accel import AccelEnv, ADDITIONAL_ENV_PARAMS
from flow.scenarios.figure_eight import Figure8Scenario, \
    ADDITIONAL_NET_PARAMS

# Run SUMO with its graphical interface so the network is displayed.
sumo_params = SumoParams(sumo_binary="sumo-gui")

# Add 14 vehicles whose accelerations follow the Intelligent Driver Model.
vehicles = Vehicles()
vehicles.add(veh_id="idm",
             acceleration_controller=(IDMController, {}),
             lane_change_controller=(StaticLaneChanger, {}),
             routing_controller=(ContinuousRouter, {}),
             speed_mode="no_collide",
             initial_speed=0,
             num_vehicles=14)

# Use the default environment parameters.
env_params = EnvParams(additional_params=ADDITIONAL_ENV_PARAMS)

# Use the default figure-eight network parameters.
additional_net_params = ADDITIONAL_NET_PARAMS.copy()
net_params = NetParams(no_internal_links=False,
                       additional_params=additional_net_params)

# Build the figure-eight network populated with the vehicles above.
scenario = Figure8Scenario(name="figure8",
                           vehicles=vehicles,
                           net_params=net_params)

# Wrap the scenario in a Gym-compatible environment.
env = AccelEnv(env_params, sumo_params, scenario)

exp = SumoExperiment(env, scenario)

# Run 1 rollout of 1500 simulation steps.
info_dict = exp.run(1, 1500)
```
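
For reference, the two statistics mentioned above can be understood through the following sketch. The per-step rewards and speeds are made-up numbers, not output from Flow, and since the exact contents of `info_dict` are not described in this assignment, the sketch deliberately does not read from it. The helper names and the choice to average over all vehicles and time steps are assumptions for illustration.

```python
def cumulative_return(rewards):
    """Sum of the rewards collected over one rollout."""
    return sum(rewards)


def average_speed(speeds_per_step):
    """Mean speed over all vehicles and all time steps."""
    all_speeds = [v for step in speeds_per_step for v in step]
    return sum(all_speeds) / len(all_speeds)


# Illustrative values only: three time steps, two vehicles.
rewards = [1.0, 2.0, 3.0]
speeds = [[0.0, 0.0], [1.0, 3.0], [2.0, 6.0]]

print(cumulative_return(rewards))  # 6.0
print(average_speed(speeds))       # 2.0
```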