# Tutorial 0: Flow

This tutorial is an introduction to Flow: it gives you a better understanding of how to use it and walks you through the structure of our code. Whether you want to use Flow seriously or contribute to the project, it is important that you first understand the flow of our codebase.

Note that this is not an installation tutorial. Flow's installation instructions are available [here](http://flow.readthedocs.io/en/latest/flow_setup.html). 

This tutorial is organized as follows:

* **[Section 1](#1.-High-level-flow)** gives a high-level explanation of how to use Flow.
* **[Section 2](#2.-Low-level-content)** provides a description for each folder within the `flow` directory.
* **[Section 3](#3.-Examples)** discusses Flow's example code and how to use it.

You may want to skip Sections 2 and 3 if you are just getting started with Flow, but you should definitely come back to them later. Section 1, however, is highly recommended: it makes sure you understand the basics of Flow. Section 3 covers our example code, which is a good starting point once you are done with the tutorials. Finally, read through Section 2 if you wish to understand more precisely how Flow works, so that you have a better grasp of how the examples are built and can start working on your own custom ones.

**How to get help:** If you have any general or technical question related to Flow, in this tutorial or the following ones, have a look at [Stack Overflow](https://stackoverflow.com/questions/tagged/flow-project) to see whether it has already been answered; otherwise, post it using the tag `flow-project`. We will be happy to help you!

*Something missing? This tutorial, though non-exhaustive, tries to be quite specific. As a result, there could be new files that are not accounted for here, or old files that we forgot to remove. Don't hesitate to let us know if you find something missing or out of date by posting a [GitHub issue](https://github.com/flow-project/flow/issues/new?labels=feature+request&template=feature.md).*

## 1. High-level flow

Flow acts as a bridge between a traffic simulator (e.g. Sumo, Aimsun...) and a reinforcement learning library (e.g. RLlib, rllab...). It provides you with an interface that lets you train agents on a custom road network without having to worry about integration with the traffic simulator and the learning library. Flow creates this connection automatically, and also provides you with some tools to analyze the trained policies.

In order to get started and train your own agent on your own road network, all you will need is: 

- <u>**a scenario**</u>: this is basically the term we use to talk about a road network. A scenario is a class that contains information about the road network on which your agents will be trained. It describes the roads (position, size, number of lanes, speed limit...), the connections between the roads (junctions, intersections...) and possibly other information (traffic lights...).

- <u>**an environment**</u>: this is an RL environment _(**not to be confused** with the physical environment that we call scenario)_. It is a class that allows you to control how the agent will be trained. You should define and can then tune: a **state space** (what the agent can see, for instance the position and speed of all vehicles in the network, or the difference of speed with the vehicle in front of the agent), an **action space** (what the agent can do, for instance a list of accelerations to apply to all RL vehicles, or a list of new states for all traffic lights), and a **reward function** (what the agent will try to maximize, for instance the mean speed of all vehicles in the network).

_Flow already contains about a dozen predefined [scenarios](#2.7---scenarios) and their corresponding [environments](#2.4---envs) that you can reuse or use as models to build your own scenarios and environments._
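To make the three ingredients of an environment more concrete, here is a minimal, self-contained sketch in plain Python. It does not use Flow: the function names merely echo the ideas of a state space, an action space and a reward function, and the parameters `max_accel` and `dt` as well as all numerical values are assumptions chosen for illustration.

```python
def get_state(speeds):
    """State space: what the agent observes (here, every vehicle's speed)."""
    return list(speeds)


def apply_rl_actions(speeds, accels, max_accel=1.0, dt=0.1):
    """Action space: one acceleration per vehicle, clipped to +/- max_accel."""
    new_speeds = []
    for speed, accel in zip(speeds, accels):
        accel = max(-max_accel, min(max_accel, accel))
        # Speeds are not allowed to become negative.
        new_speeds.append(max(0.0, speed + accel * dt))
    return new_speeds


def compute_reward(speeds):
    """Reward function: mean speed of all vehicles in the network."""
    return sum(speeds) / len(speeds)


# Two vehicles; the second requested acceleration exceeds the bound
# and is therefore clipped to max_accel before being applied.
speeds = [10.0, 12.0]
speeds = apply_rl_actions(speeds, accels=[0.5, 5.0])
observation = get_state(speeds)
reward = compute_reward(speeds)
print(observation, reward)
```

In a real Flow environment these pieces are methods of a class and the vehicle dynamics are handled by the traffic simulator; the sketch only shows how the three notions relate to each other.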

Once you have defined these two classes, the last step is to set up the parameters of the simulation, for instance:

- number of iterations (for the RL algorithm), number of rollouts (simulations) in each iteration, horizon (number of steps in each rollout)

- vehicles to add in the network (human vehicles, RL vehicles, vehicle inflows...), color sequences for traffic lights (if there are any)

- Flow parameters: name of the simulation, scenario and environment classes to be used, simulator to use (Sumo, Aimsun...)

- RL algorithm to use (PPO, TRPO...), parameters of the algorithm (discount rate, neural network parameters...), number of CPUs/GPUs to use, checkpoint frequency, whether or not to restart from an old checkpoint (to resume training or do transfer learning)

- whether or not to render the simulation (if you don't see anything happening in the simulator when you run the experiment, it is because rendering has been disabled, which makes training significantly faster)
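As a rough illustration of how these parameters fit together, the sketch below groups them in a plain Python dictionary and derives a few quantities from them. The key names and all values are hypothetical placeholders, not Flow's actual parameter names; the example files show the real way to set them.

```python
# Hypothetical parameter names and arbitrary values, for illustration only.
experiment = {
    "exp_tag": "my_ring_experiment",  # name of the simulation
    "simulator": "sumo",              # traffic simulator to use
    "render": False,                  # rendering disabled: faster training
    "n_iterations": 200,              # iterations of the RL algorithm
    "n_rollouts": 20,                 # rollouts (simulations) per iteration
    "horizon": 1500,                  # steps per rollout
    "algorithm": "PPO",               # RL algorithm
    "discount_rate": 0.999,           # parameter of the algorithm
    "n_cpus": 4,                      # CPUs used for training
    "checkpoint_freq": 20,            # iterations between two checkpoints
}

# Each iteration simulates n_rollouts rollouts of horizon steps each.
steps_per_iteration = experiment["n_rollouts"] * experiment["horizon"]
total_steps = experiment["n_iterations"] * steps_per_iteration
n_checkpoints = experiment["n_iterations"] // experiment["checkpoint_freq"]

print(steps_per_iteration, total_steps, n_checkpoints)
```

Computing the total number of simulation steps this way gives a quick idea of how expensive an experiment will be before you launch it.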

It is also possible to run a simulation without doing any training; in this case, you won't need any environment or RL algorithm.

Flow has a lot of [examples](#3.-Examples) setting up all these simulation parameters, so you can simply take one of them and modify it according to your needs. Once this is done, you can execute the Python file where the experiment is set up, and the training will begin.

During the training or after it has ended, you can use Flow's visualization tools to visualize the data saved in the checkpoint files generated during the training. You can see how well your agent is doing by running a new simulation in the simulator that uses the trained policy (this time, the simulation will be rendered). You can also plot the reward or return functions, time-space diagrams, capacity diagrams, etc.

In the next section, we will explain in more detail how Flow's codebase is organized, so that you can apply what you learned in this section in practice. It will also help you understand what is happening under the hood, in case you wish to contribute to improving the code!

## 2. Low-level content

Within the `flow` directory, you will find the following folders:

TODO reorder subsections in function of dependencies (for a top-to-bottom read)

- [`benchmarks`](#2.1---benchmarks): (?) several custom scenarios and configurations on which the performance of different RL algorithms can be evaluated and compared.

- [`controllers`](#2.2---controllers): (?) implementation of controllers for vehicles, such as IDM (Intelligent Driver Model), Follower-Stopper, etc.

- [`core`](#2.3---core): (TODO) where the magic happens

- [`envs`](#2.4---envs): a collection of different environments for single-agent RL.

- [`multiagent_envs`](#2.5---multiagent_envs): a collection of different environments for multi-agent RL.

- [`renderer`](#2.6---renderer): (?) 

- [`scenarios`](#2.7---scenarios): a collection of different scenarios.

- [`utils`](#2.8---utils): (TODO) Aimsun folder + some other things

- [`visualize`](#2.9---visualize): a collection of scripts to visualize data from a trained policy.

The content of each of these folders is detailed in the following subsections. Don't hesitate to go and read the code files directly if you wish to get even more details. We try to keep everything commented and understandable. However, if something remains unclear even after reading all the tutorials, you can ask us on [Stack Overflow](https://stackoverflow.com/questions/tagged/flow-project) using the tag `flow-project` (make sure your question hasn't already been asked!).

## 2.1 - `benchmarks`

TODO

-- btw, website linked in https://github.com/flow-project/flow/tree/master/flow/benchmarks#reporting-optimal-scores doesn't seem functional?

## 2.2 - `controllers`

TODO

## 2.3 - `core`

TODO

## 2.4 - `envs`

The `envs` folder contains the default environments for single-agent reinforcement learning within Flow. You can find out how an environment works and how you can create your own in [tutorial 9](https://github.com/flow-project/flow/blob/master/tutorials/tutorial09_environments.ipynb). The environments available by default in Flow are as follows:

- **`Env` (`base_env.py`):** this is the base class for environments. It is an abstract class, so you must inherit from it in order to define a usable environment (if you did not understand that, you can read more about [classes in Python](https://docs.python.org/3/tutorial/classes.html)). Its constructor initializes and starts the simulation. The `step` method makes action and routing decisions for all non-RL and RL vehicles; in particular, this is where the `get_state`, `compute_reward` and `apply_rl_actions` methods are called. There is also a `reset` method that is called each time the current simulation must be reset.

- **`TestEnv` (`test.py`)**: this is a test environment in which no learning is performed. It inherits from `Env` and defines the necessary methods with default values so that it can be instantiated. You can use it when an environment is required but you don't actually want to make use of it.

- **`AccelEnv` (`loop/loop_accel.py`)**: this is a simple environment that will try to maximize the mean speed of the vehicles in the network. It assumes that the vehicles are in a loop, i.e. that there is a constant number of vehicles throughout the whole simulation. 

- **`WaveAttenuationEnv` (`loop/wave_attenuation.py`)**: this is an environment to train RL vehicles to attenuate the formation of stop-and-go waves on a ring scenario. 

- **`LaneChangeAccelEnv` (`loop/lane_changing.py`)**: this environment is similar to `AccelEnv`, but it also allows vehicles to change lanes. It assumes that there is a constant number of vehicles throughout the simulation.

- **`TrafficLightGridEnv` (`green_wave_env.py`)**: this is an environment to train RL traffic lights to regulate traffic flow on a grid scenario. The idea of "green wave" is that the traffic lights should become green gradually, following the wave of vehicles, instead of all turning green or red at the same time.

- **`WaveAttenuationMergePOEnv` (`merge.py`)**: this is an environment to train RL vehicles to reduce congestion in a merge scenario. The RL vehicles are only present on the main road, not on the ramp.

- **`BottleneckEnv` (`bottleneck_env.py`)**: this is an environment that helps define all the characteristics of a bottleneck scenario. It acts as an abstract class, but can still be used for tests when RL is not needed.

- **`BottleNeckAccelEnv` (`bottleneck_env.py`)**: this is an environment to train RL vehicles to effectively pass through a bottleneck.

- **`DesiredVelocityEnv` (`bottleneck_env.py`)**: this environment is similar to `BottleNeckAccelEnv`, but it also makes vehicles try to reach a certain speed.

- **`BayBridgeEnv` (`bay_bridge/base.py`)**: this is an environment to train RL vehicles to improve traffic on the Bay Bridge scenario.

On top of that, some environments also have a partially observable version, which is defined in the same file as the fully observable version. The class names are `LaneChangeAccelPOEnv`, `WaveAttenuationPOEnv` and `PO_TrafficLightGridEnv`. For instance, `WaveAttenuationPOEnv` is similar to `WaveAttenuationEnv`, but the RL vehicle can only get information about the vehicle in front of it, instead of vehicles in the whole network.
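The difference between full and partial observability can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions (vehicles on a ring described by a position and a speed, vehicle lengths ignored); the function names and the exact content of the observations are hypothetical and do not reproduce what the Flow classes above return.

```python
def full_observation(positions, speeds):
    """Fully observable: position and speed of every vehicle in the network."""
    return [value for pair in zip(positions, speeds) for value in pair]


def partial_observation(positions, speeds, rl_index, ring_length):
    """Partially observable: only the RL vehicle and the vehicle in front."""

    def gap_to(other):
        # Distance travelled forward along the ring to reach `other`.
        return (positions[other] - positions[rl_index]) % ring_length

    others = [i for i in range(len(positions)) if i != rl_index]
    leader = min(others, key=gap_to)  # closest vehicle ahead on the ring
    return [speeds[rl_index], speeds[leader], gap_to(leader)]


# Three vehicles on a 200 m ring; vehicle 2 is the RL vehicle.
positions = [0.0, 50.0, 120.0]
speeds = [5.0, 4.0, 6.0]

full = full_observation(positions, speeds)
partial = partial_observation(positions, speeds, rl_index=2, ring_length=200.0)
print(len(full), partial)
```

The fully observable state grows with the number of vehicles, whereas the partial one keeps a fixed size, which is one reason partially observable versions can be convenient to train on.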

There is also an `__init__.py` file in which you need to register your environment if you create a new one. 

Most of these environments define an `ADDITIONAL_ENV_PARAMS` dictionary that allows you to tune the environment. You can read more about it in [tutorial 9](https://github.com/flow-project/flow/blob/master/tutorials/tutorial09_environments.ipynb). You can also find out more about the state spaces, action spaces and reward functions defined in the different environments by reading the docstring of each class, which should give you ample explanations.
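The relationship between an abstract base environment and a concrete one can be sketched as follows. This is not Flow's `Env` class: `SketchEnv` and `MeanSpeedEnv` are hypothetical stand-ins that only reuse the method names mentioned above, and the order of the calls inside `step` as well as the toy dynamics are assumptions made for the sake of the example.

```python
from abc import ABC, abstractmethod


class SketchEnv(ABC):
    """Hypothetical stand-in for an abstract base environment."""

    def __init__(self, initial_speeds):
        self.initial_speeds = list(initial_speeds)
        self.speeds = list(initial_speeds)

    @abstractmethod
    def get_state(self):
        """Return what the agent observes."""

    @abstractmethod
    def apply_rl_actions(self, rl_actions):
        """Apply the actions chosen by the agent."""

    @abstractmethod
    def compute_reward(self):
        """Return the reward for the current step."""

    def step(self, rl_actions):
        # Assumed order: act first, then observe and score the new state.
        self.apply_rl_actions(rl_actions)
        return self.get_state(), self.compute_reward()

    def reset(self):
        # Put the simulation back in its initial configuration.
        self.speeds = list(self.initial_speeds)
        return self.get_state()


class MeanSpeedEnv(SketchEnv):
    """Concrete environment: defines the three abstract methods."""

    def get_state(self):
        return list(self.speeds)

    def apply_rl_actions(self, rl_actions):
        # Toy dynamics: each action is a direct speed change, floored at zero.
        self.speeds = [max(0.0, v + a) for v, a in zip(self.speeds, rl_actions)]

    def compute_reward(self):
        return sum(self.speeds) / len(self.speeds)


env = MeanSpeedEnv([10.0, 20.0])
state, reward = env.step([1.0, -2.0])
print(state, reward)
print(env.reset())
```

Trying to instantiate `SketchEnv` directly raises a `TypeError`, which is what "abstract" means in practice: only subclasses that define every abstract method can be used.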

## 2.5 - `multiagent_envs`

TODO

## 2.6 - `renderer`

TODO

## 2.7 - `scenarios`

TODO

## 2.8 - `utils`

TODO

## 2.9 - `visualize`

TODO

## 3. Examples

TODO