# Tutorial 04: Visualizing Experiment Results

This tutorial describes how to visualize and replay the results of Flow experiments trained using RL. Visualizing results breaks down into two main components:

- reward plotting

- policy replay

Note that this tutorial covers visualization with SUMO only, and not with other simulators such as Aimsun.

<hr>

## Visualization with RLlib

### Plotting Reward

RLlib supports visualizing the reward over the course of training using `tensorboard`. `tensorboard` takes one command-line argument, `--logdir`, which should point to an RLlib result directory (usually located within an experiment directory inside your `ray_results` directory). An example call is shown below.

In [7]:
# General form: ! tensorboard --logdir=ray_results/experiment_dir/result/directory
! tensorboard --logdir=~/ray_results/training_example

TensorBoard 1.14.0 at http://nick-ThinkPad:6006/ (Press CTRL+C to quit)
^C
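If you are unsure which directory to pass to `--logdir`, a small helper can list the candidates. The sketch below is not part of Flow or RLlib; the function name `find_result_dirs` is hypothetical, and it assumes that every result directory contains a `progress.csv` file, as described for the experiment results directory in this tutorial.

```python
import os


def find_result_dirs(root):
    """Return the sorted directories under `root` that contain a progress.csv.

    Assumption: a directory holding a progress.csv file is a result
    directory that can be passed to `tensorboard --logdir`.
    """
    root = os.path.expanduser(root)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "progress.csv" in filenames:
            found.append(dirpath)
    return sorted(found)
```

Calling `find_result_dirs("~/ray_results")` returns a list of paths; any of them, or a parent directory that groups several runs, can then be given to `tensorboard`.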


If you do not wish to use `tensorboard`, you can instead use the `flow/visualize/plot_ray_results.py` script. It takes as arguments the path to the `progress.csv` file located inside your experiment results directory, followed by the name(s) of the column(s) to plot. If you do not know the column names, omit them and a list of all available columns will be displayed.

Example usage:

In [8]:
! python ../../flow/visualize/plot_ray_results.py /ray_results/experiment_dir/progress.csv training/return-average training/return-min
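To inspect `progress.csv` without the script, the same behavior (list the columns when none are requested, otherwise return the requested ones) can be approximated with the standard library. This is a minimal sketch under stated assumptions, not the actual implementation of `plot_ray_results.py`; the function name `read_columns` is hypothetical, and the selected columns are assumed to hold numeric values.

```python
import csv


def read_columns(csv_path, columns=()):
    """Return {column: [float, ...]} for the requested columns.

    With no columns requested, return the list of available column names.
    Empty cells are stored as NaN (an assumption made for this sketch).
    """
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        available = list(reader.fieldnames or [])
        if not columns:
            return available
        missing = [c for c in columns if c not in available]
        if missing:
            raise KeyError("unknown column(s): " + ", ".join(missing))
        data = {c: [] for c in columns}
        for row in reader:
            for c in columns:
                value = row[c]
                data[c].append(float(value) if value else float("nan"))
    return data
```

The returned lists can be passed to any plotting library, with one curve per column against the row index (one row per training iteration).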


### Replaying a Trained Policy

The tool for replaying a policy trained with RLlib is located at `flow/visualize/visualizer_rllib.py`. It takes two positional arguments: first the path to the experiment results directory, and second the number of the checkpoint you wish to visualize.

There are other optional parameters which you can learn about by running `visualizer_rllib.py --help`. 

In [4]:
! python ../../flow/visualize/visualizer_rllib.py /ray_results/experiment_dir/result/directory 1

Instructions for updating:
non-resource variables are not supported in the long term
2019-10-08 10:36:54,181	INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-10-08_10-36-54_181442_12857/logs.
2019-10-08 10:36:54,294	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:48977 to respond...
2019-10-08 10:36:54,402	INFO services.py:409 -- Waiting for redis server at 127.0.0.1:35635 to respond...
2019-10-08 10:36:54,406	INFO services.py:809 -- Starting Redis shard with 3.28 GB max memory.
2019-10-08 10:36:54,443	INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-10-08_10-36-54_181442_12857/logs.
2019-10-08 10:36:54,445	INFO services.py:1475 -- Starting the Plasma object store with 4.92 GB memory using /dev/shm.
NOTE: With render mode sumo_gui, an extra instance of the SUMO GUI will display before the GUI for visualizing the result. Click the green Play arrow to continue.


2019-10-08 10:36:54,527	ERRO

2019-10-08 10:36:57,933	INFO dynamic_tf_policy.py:324 -- Initializing loss function with dummy input:

{ 'action_prob': <tf.Tensor 'default_policy/action_prob:0' shape=(?,) dtype=float32>,
  'actions': <tf.Tensor 'default_policy/actions:0' shape=(?, 1) dtype=float32>,
  'advantages': <tf.Tensor 'default_policy/advantages:0' shape=(?,) dtype=float32>,
  'behaviour_logits': <tf.Tensor 'default_policy/behaviour_logits:0' shape=(?, 2) dtype=float32>,
  'dones': <tf.Tensor 'default_policy/dones:0' shape=(?,) dtype=bool>,
  'new_obs': <tf.Tensor 'default_policy/new_obs:0' shape=(?, 3) dtype=float32>,
  'obs': <tf.Tensor 'default_policy/observation:0' shape=(?, 3) dtype=float32>,
  'prev_actions': <tf.Tensor 'default_policy/action:0' shape=(?, 1) dtype=float32>,
  'prev_rewards': <tf.Tensor 'default_policy/prev_reward:0' shape=(?,) dtype=float32>,
  'rewards': <tf.Tensor 'default_policy/rewards:0' shape=(?,) dtype=float32>,
  'value_targets': <tf.Tensor 'default_policy/value_targets:0' shape=

Loading configuration... done.
Success.
Loading configuration... done.


Loading configuration... done.
Success.
Loading configuration... done.

-----------------------
ring length: 227
v_max: 3.570869368805752
-----------------------
Loading configuration... done.
Success.
Loading configuration... done.
Loading configuration... done.
Success.
Loading configuration... done.
Traceback (most recent call last):
  File "../flow/visualize/visualizer_rllib.py", line 389, in <module>
    visualizer_rllib(args)
  File "../flow/visualize/visualizer_rllib.py", line 199, in visualizer_rllib
    state = env.reset()
  File "/home/nick/Programming/flow/flow/envs/ring/wave_attenuation.py", line 210, in reset
    observation = super().reset()
  File "/home/nick/Programming/flow/flow/envs/base.py", line 523, in reset
    raise FatalFlowError(msg=msg)
flow.utils.exceptions.FatalFlowError: 
Not enough vehicles have spawned! Bad start?
Missing vehicles / initial state:
- human_10: ('human', 'right', 0, 46.5971820306477, 0)
- human_11: ('human', 'right', 0, 56.57045154267076, 0
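Since the second argument is a checkpoint number, it helps to know which checkpoints exist before launching the visualizer. The sketch below assumes that checkpoints are stored as subdirectories named `checkpoint_<N>` inside the result directory; that naming is an assumption about your RLlib version and should be verified against your own results. The function name `list_checkpoints` is hypothetical.

```python
import os
import re


def list_checkpoints(result_dir):
    """Return the sorted checkpoint numbers found in `result_dir`.

    Assumption: checkpoints are subdirectories named `checkpoint_<N>`.
    """
    result_dir = os.path.expanduser(result_dir)
    pattern = re.compile(r"^checkpoint_(\d+)$")
    numbers = []
    for name in os.listdir(result_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(result_dir, name)):
            numbers.append(int(match.group(1)))
    return sorted(numbers)
```

Any number in the returned list can be passed as the second argument to `visualizer_rllib.py`.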

<hr>

## Data Collection and Analysis
Any Flow experiment can output its results to a CSV file containing the contents of SUMO's built-in `emission.xml` files, specifying speed, position, time, fuel consumption, and many other metrics for all vehicles in a network over time. 

This section describes how to generate those `emission.csv` files when replaying and analyzing a trained policy.

### RLlib

In [9]:
# --emission_to_csv does the same as above
! python ../../flow/visualize/visualizer_rllib.py results/sample_checkpoint 1 --gen_emission

python: can't open file '../../flow/visualize/visualizer_rllib.py': [Errno 2] No such file or directory


The `emission.csv` file can then be found in `test_time_rollout/` and used from there.
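As one illustration of using the file, the sketch below computes the mean speed over all vehicles at each time step. It assumes that the CSV has a header row with columns named `time` and `speed`; these names are assumptions, so check the header of your own `emission.csv` and pass different names if needed. The function name `mean_speed_by_time` is hypothetical.

```python
import csv
from collections import defaultdict


def mean_speed_by_time(emission_csv, time_col="time", speed_col="speed"):
    """Return {time: mean speed over all vehicles present at that time}.

    The column names are assumptions; adjust them to match the CSV header.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    with open(emission_csv, newline="") as f:
        for row in csv.DictReader(f):
            t = float(row[time_col])
            totals[t] += float(row[speed_col])
            counts[t] += 1
    return {t: totals[t] / counts[t] for t in sorted(totals)}
```

The result maps each simulation time to an average speed, which can be plotted to see whether traffic speeds up or slows down over the course of the rollout.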

### SUMO
SUMO-only experiments can generate emission CSV files as well, controlled by an argument to the `experiment.run` method. `run` takes the arguments `(num_runs, num_steps, rl_actions=None, convert_to_csv=False)`. To generate an `emission.csv` file, pass `convert_to_csv=True` in the Python file that runs your SUMO experiment.
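The call described above can be sketched as follows. The helper `run_and_export` is hypothetical, and `exp` is assumed to be an experiment object you have already constructed whose `run` method has the signature quoted above; how that object is built depends on your Flow version and is not shown here.

```python
def run_and_export(exp, num_runs, num_steps):
    """Run `exp` and request conversion of the emission output to CSV.

    Assumption: `exp.run` accepts
    (num_runs, num_steps, rl_actions=None, convert_to_csv=False).
    """
    return exp.run(num_runs, num_steps, convert_to_csv=True)
```

Leaving `rl_actions` at its default of `None` matches the SUMO-only case, where no RL policy supplies actions.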