# Tutorial 05: Visualizing Experiment Results

This tutorial explains how to visualize the results of Flow experiments and how to replay them.

**Note: This tutorial is only relevant if you use SUMO as a simulator. We currently do not support policy replay or data collection when using Aimsun. The only exception is reward plotting, which is independent of whether you used SUMO or Aimsun during training.**

## 1. What is visualization

The visualization of simulation results breaks down into three main components:

- **reward plotting**: Visualization of the reward function is an essential step in evaluating the effectiveness and training progress of RL agents.

- **policy replay**: Flow includes tools for visualizing trained policies using SUMO's GUI. This enables more granular analysis of policies beyond their accrued reward, which in turn allows users to tweak actions, observations and rewards in order to produce some desired behavior. The visualizers also generate plots of observations and a plot of the reward function over the course of the rollout.

- **data collection and analysis**: Any Flow experiment can output its simulation data to a CSV file, `emission.csv`, containing the contents of SUMO's built-in `emission.xml` files. This file contains various data such as the speed, position, time, fuel consumption and many other metrics for every vehicle in the network and at each time step of the simulation. Once you have generated the `emission.csv` file, you can open it and read the data it contains using Python's [csv library](https://docs.python.org/3/library/csv.html) (or using Excel).

Visualization is different depending on which reinforcement learning library you are using, if any. Accordingly, the rest of this tutorial explains how to plot rewards, replay policies and collect data when using either no RL library, RLlib or rllab. 

**Contents:**

[How to visualize using SUMO without training](#2.1---Using-SUMO-without-training)

[How to visualize using SUMO with RLlib](#2.2---Using-SUMO-with-RLlib)

[How to visualize using SUMO with rllab](#2.3---Using-SUMO-with-rllab)

## 2. How to visualize

### 2.1 - Using SUMO without training

_In this case, since there is no training, there is no reward to plot and no policy to replay._

#### Data collection and analysis

SUMO-only experiments can generate emission CSV files seamlessly:

First, you have to tell SUMO to generate the `emission.xml` files. You can do that by specifying `emission_path` in the simulation parameters (class `SumoParams`), which is the path where the emission files will be generated. For instance:

```python
from flow.core.params import SumoParams

sumo_params = SumoParams(sim_step=0.1, render=True, emission_path='data')
```

Then, you have to tell Flow to convert these XML emission files into CSV files. To do that, pass in `convert_to_csv=True` to the `run` method of your experiment object. For instance:

```python
# `exp` is the experiment object created earlier (not shown here).
exp.run(1, 1500, convert_to_csv=True)
```

When running experiments, Flow will now automatically create CSV files next to the SUMO-generated XML files.
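Once the CSV file exists, it can be summarized with Python's standard library. The following is a minimal sketch and not part of Flow: the helper `mean_speed_per_vehicle` is hypothetical, and the column names `id` and `speed` are assumptions about the CSV header, so check the first line of your own `emission.csv` before relying on them.

```python
import csv
import io
from collections import defaultdict


def mean_speed_per_vehicle(csv_file, id_col="id", speed_col="speed"):
    """Return {vehicle id: mean speed} from an open emission CSV file.

    The column names are assumptions; adjust them to match your file's header.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row[id_col]] += float(row[speed_col])
        counts[row[id_col]] += 1
    return {veh: totals[veh] / counts[veh] for veh in totals}


# Tiny in-memory stand-in for an emission file (made-up values).
sample = io.StringIO(
    "time,id,speed\n"
    "0.0,veh_0,1.0\n"
    "0.1,veh_0,3.0\n"
    "0.0,veh_1,4.0\n"
)
means = mean_speed_per_vehicle(sample)
print(means)
```

With a real file, open it with `open(path, newline="")` and pass the file object to the helper; the path depends on the `emission_path` you configured.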

### 2.2 - Using SUMO with RLlib 

#### Reward plotting

RLlib supports reward visualization over the course of training using the `tensorboard` command. It takes one command-line parameter, `--logdir`, which is an RLlib result directory. By default, this directory is located within an experiment directory inside your `~/ray_results` directory.

An example call would look like:

`tensorboard --logdir ~/ray_results/experiment_dir/result/directory`

You can also run `tensorboard --logdir ~/ray_results` if you want to select more than just one experiment.

If you do not wish to use `tensorboard`, another way is to use our `flow/visualize/plot_ray_results.py` tool. It takes as arguments:

- the path to the `progress.csv` file located inside your experiment results directory (`~/ray_results/...`),
- the name(s) of the column(s) you wish to plot (reward or other things).

An example call would look like:

`python flow/visualize/plot_ray_results.py ~/ray_results/experiment_dir/result/progress.csv training/return-average training/return-min`

If you do not know what the names of the columns are, run the command without specifying any column:

`python flow/visualize/plot_ray_results.py ~/ray_results/experiment_dir/result/progress.csv`

and the list of all available columns will be displayed to you.
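If you would rather inspect `progress.csv` yourself, the same two steps (listing the columns, then extracting one) can be sketched with the standard library. This is only an illustration under stated assumptions, not how `plot_ray_results.py` is implemented: the helpers `list_columns` and `read_column` and the column name `reward_mean` are hypothetical.

```python
import csv
import io


def list_columns(csv_file):
    """Return the column names from the header row of an open CSV file."""
    return next(csv.reader(csv_file))


def read_column(csv_file, column):
    """Return one column as a list of floats, skipping empty cells."""
    values = []
    for row in csv.DictReader(csv_file):
        if row[column] != "":
            values.append(float(row[column]))
    return values


# In-memory stand-in for a progress.csv file (made-up values and column names).
text = "iteration,reward_mean\n1,-10.5\n2,-7.25\n3,\n"
columns = list_columns(io.StringIO(text))
rewards = read_column(io.StringIO(text), "reward_mean")
print(columns, rewards)
```

The resulting list can then be passed to any plotting library of your choice.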

#### Policy replay

The tool to replay a policy trained using RLlib is located at `flow/visualize/visualizer_rllib.py`. It takes two arguments: first, the path to the experiment results (by default located within `~/ray_results`), and second, the number of the checkpoint you wish to visualize (which corresponds to the folder `checkpoint_<number>` inside the experiment results directory).

An example call would look like this:

`python flow/visualize/visualizer_rllib.py ~/ray_results/experiment_dir/result/directory 1`

There are other optional parameters which you can learn about by running `visualizer_rllib.py --help`. 
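To find out which checkpoint numbers are available before calling the visualizer, you can scan the result directory for `checkpoint_<number>` folders. The sketch below is not part of Flow; the helper name `available_checkpoints` is hypothetical, and it assumes only the folder naming described above.

```python
import re
import tempfile
from pathlib import Path


def available_checkpoints(result_dir):
    """Return the sorted checkpoint numbers found in `result_dir`."""
    pattern = re.compile(r"checkpoint_(\d+)")
    numbers = []
    for entry in Path(result_dir).iterdir():
        match = pattern.fullmatch(entry.name)
        if entry.is_dir() and match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


# Demonstration on a throwaway directory that mimics a result directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("checkpoint_1", "checkpoint_50", "checkpoint_200"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "progress.csv").touch()  # non-checkpoint entries are ignored
    found = available_checkpoints(tmp)

print(found)
```

Any of the returned numbers can be used as the second argument of `visualizer_rllib.py`.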

#### Data collection and analysis

Simulation data can be generated the same way as it is done [without training](#2.1---Using-SUMO-without-training).

If you need to generate simulation data after training, you can run a policy replay as described above and add the `--gen_emission` flag.

An example call would look like:

`python flow/visualize/visualizer_rllib.py ~/ray_results/experiment_dir/result/directory 1 --gen_emission`

### 2.3 - Using SUMO with rllab

_Note: due to rllab not being maintained anymore, we have decreased our support of rllab within Flow, and thus strongly encourage you to use RLlib instead._

#### Reward plotting

rllab includes a tool to plot the _average cumulative reward per rollout_ against _iteration number_ to show training progress. This "reward plot" can be generated for just one experiment or many. The tool to be called is rllab's `frontend.py`, which is inside the `rllab-multiagent/rllab/viskit/` directory.

The `frontend.py` script requires only one command-line input: the path to the result directory that a user wants to visualize. The result directory should contain both a `progress.csv` and a `params.json` file; the pickle files containing per-iteration results are not necessary. An example call to `frontend.py` is below. Click on the link to http://localhost:5000 to view the reward over time.

`python rllab/viskit/frontend.py /path/to/result/directory`

#### Policy replay

The tool to replay a policy trained using rllab is located at `flow/visualize/visualizer_rllab.py`. It requires one command-line input and has three additional optional arguments. The required input is the path to the pickle file `result.pkl` to be visualized (this is usually within an rllab result directory).

The optional inputs are:

- `--num_rollouts`: the number of rollouts to be visualized. The default value is 100. This argument takes integer input.
- `--plotname`: the name of the plot generated by the visualizer. The default value is `traffic_plot`. This argument takes string input.
- `--gen_emission`: Specifies whether to generate an emission file from the simulation. This argument is a flag and takes no input.

An example call would look like:

`python flow/visualize/visualizer_rllab.py /path/to/result.pkl --num_rollouts 1 --plotname plot_test --gen_emission`
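To make the argument types concrete, here is a hypothetical `argparse` declaration that mirrors the arguments described above. It is an illustration only and is not the source of `visualizer_rllab.py`; the positional name `file` and the parser itself are assumptions, while the option names and defaults are taken from the list above.

```python
import argparse

# Hypothetical parser mirroring the documented arguments; it is NOT the
# actual code of visualizer_rllab.py.
parser = argparse.ArgumentParser()
parser.add_argument("file", type=str, help="path to the result.pkl file")
parser.add_argument("--num_rollouts", type=int, default=100)
parser.add_argument("--plotname", type=str, default="traffic_plot")
parser.add_argument("--gen_emission", action="store_true")

# Same arguments as the example call above.
args = parser.parse_args(
    ["/path/to/result.pkl", "--num_rollouts", "1",
     "--plotname", "plot_test", "--gen_emission"]
)
# Only the required input: the optional arguments fall back to their defaults.
defaults = parser.parse_args(["/path/to/result.pkl"])
print(args, defaults)
```

Because `--gen_emission` is a flag, its presence alone switches emission generation on; it takes no value.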

#### Data collection and analysis

Simulation data can be generated the same way as it is done [without training](#2.1---Using-SUMO-without-training).

If you need to generate simulation data after training, you can run a policy replay as described above and add the `--gen_emission` flag.

An example call would look like:

`python flow/visualize/visualizer_rllab.py path/to/result.pkl --gen_emission`