# Tutorial 0: Flow

This tutorial gives you a high-level introduction to what Flow is and how it works. Whether you want to use Flow seriously or contribute to the project, it helps to understand the basics of how the Flow code is organized and what it does. The core concepts introduced here are used throughout Flow, so we highly recommend reading this tutorial before you dive into the next ones.

**How to get help:** If you have any general or technical question related to Flow, whether while following the tutorials or while building your own code, have a look on [Stack Overflow](https://stackoverflow.com/questions/tagged/flow-project) to see whether it has already been answered. If it has not, post it using the tag `flow-project`. We will be happy to help you!

## 1. High-level overview of Flow

<img src="img/flow_venn_diagram.png" alt="Flow Venn Diagram" width="50%"/>

Flow acts as a bridge between traffic simulators (e.g. SUMO, Aimsun, ...) and reinforcement learning (RL) libraries (e.g. RLlib, OpenAI, ...). It provides you with an interface that lets you train RL agents on a custom road network without having to worry about integration with the traffic simulator and the RL library: Flow creates this connection automatically. Flow also provides you with tools to analyze the trained policies.

### Running Flow without training

All you need to run Flow is a network.

- <u>**a network**</u>: this is the term we use for a road network. A network is a class that contains information about the road network on which your simulation runs (and on which your agents will later be trained). It describes the roads (position, size, number of lanes, speed limit, ...), the connections between the roads (junctions, intersections, ...), and possibly other information (traffic lights, ...).
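
To make the idea of a network class more concrete, here is a minimal sketch in plain Python. It is only an illustration under our own assumptions: `Edge`, `ToyNetwork` and `make_ring` are hypothetical names and do not belong to Flow, whose real network classes are covered in the next tutorials.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """One road segment (hypothetical structure, not Flow's API)."""
    name: str
    start_node: str
    end_node: str
    length_m: float
    num_lanes: int
    speed_limit_mps: float


@dataclass
class ToyNetwork:
    """A road network: junction names plus the edges that connect them."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def total_length(self):
        """Sum of the lengths of all edges, in meters."""
        return sum(edge.length_m for edge in self.edges)

    def outgoing(self, node):
        """Names of the edges that start at the given junction."""
        return [edge.name for edge in self.edges if edge.start_node == node]


def make_ring(length_m=200.0, num_lanes=1, speed_limit_mps=30.0):
    """Build a ring road made of four equal edges joined at four junctions."""
    nodes = ["bottom", "right", "top", "left"]
    edges = []
    for i, start in enumerate(nodes):
        end = nodes[(i + 1) % len(nodes)]  # the last junction links back to the first
        edges.append(Edge(f"{start}_to_{end}", start, end,
                          length_m / len(nodes), num_lanes, speed_limit_mps))
    return ToyNetwork(nodes, edges)
```

Calling `make_ring(200.0)` gives a network with four edges of 50 m each. The point is only that a network bundles roads, their attributes and their connections in one object.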

Once you have defined this class, the next step is to set up the parameters of the simulation. These include, among others, the name of the simulation, the network to use, the simulator to use _(SUMO, Aimsun, ...)_, and the vehicles and/or traffic lights to add to the network.

You can then run a simulation on this network without doing any training. In this case, you won't need any "RL environment" (explained in the next section). The next tutorials will show you how to do just that, and will then walk you through the process of creating your own networks and RL environments, setting up a simulation to train your own agents, and finally visualizing the results.
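
As a rough mental model of a simulation without training, the sketch below moves a few vehicles around a ring road using a fixed rule, with no agent involved. It is a toy stand-in written for this tutorial: `run_toy_simulation` and its parameters are hypothetical, and a real run delegates this work to the traffic simulator.

```python
def run_toy_simulation(num_vehicles=5, ring_length=100.0, speed=5.0,
                       step_size=0.1, num_steps=100):
    """Advance vehicles around a ring at a fixed speed and return their positions.

    Hypothetical stand-in for a simulator run, not Flow's API.
    """
    # Space the vehicles evenly around the ring.
    positions = [i * ring_length / num_vehicles for i in range(num_vehicles)]
    for _ in range(num_steps):
        # No agent is consulted: every vehicle follows the same fixed rule.
        positions = [(p + speed * step_size) % ring_length for p in positions]
    return positions
```

For example, `run_toy_simulation(num_vehicles=2, num_steps=10)` moves each of the two vehicles 5 m along the ring.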

### Running Flow with training

In order to get started and train your own agent on your own road network, you will need: 

- <u>**a network**</u>: explained above.

- <u>**an environment**</u>: this is the RL environment _(**not to be confused** with the physical environment, which we refer to as the **network**)_. It is a class that allows you to control how the RL agent will be trained. To create an environment, you will need to specify:

    - a **state space** that describes the states of the system that are available to observe. For example, for a vehicle, a state space could be the positions and velocities of all nearby vehicles, as well as its own speed. 
    - an **action space** describing how the agent can act in the environment. For example, a standard action for a vehicle would be an acceleration, whereas a standard action for a traffic light would be to switch the traffic light color. 
    - a **reward function** describing what the agent should try to maximize. Common rewards include maximizing the speed of the traffic system, the average flow of the traffic system, or the negative of the fuel emissions (a negative is used here to denote the penalty, so that total fuel emissions are minimized). 
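
The three ingredients above can be sketched as a small class. This is a toy example under our own assumptions, not one of Flow's environments: the class name, the constants and the reward (staying close to a target speed) are all hypothetical and only show where the state, the action and the reward fit.

```python
class ToyCarFollowingEnv:
    """Hypothetical single-vehicle environment (not a Flow class)."""

    MAX_ACCEL = 1.0       # action space: one acceleration in [-1, 1] m/s^2
    TARGET_SPEED = 10.0   # speed the reward encourages, in m/s
    STEP_SIZE = 0.1       # simulated seconds per step

    def __init__(self):
        self.reset()

    def reset(self):
        """Put the vehicle back in its initial situation and return the state."""
        self.speed = 0.0          # own speed, m/s
        self.leader_gap = 50.0    # distance to the vehicle ahead, m
        self.leader_speed = 8.0   # speed of the vehicle ahead, m/s
        return self.get_state()

    def get_state(self):
        """State: own speed plus the gap to and speed of the leader."""
        return [self.speed, self.leader_gap, self.leader_speed]

    def compute_reward(self):
        """Reward: negative distance to the target speed (0 is the best value)."""
        return -abs(self.TARGET_SPEED - self.speed)

    def step(self, accel):
        """Apply one action, advance the toy dynamics, return (state, reward)."""
        # Clip the action so that it stays inside the action space.
        accel = max(-self.MAX_ACCEL, min(self.MAX_ACCEL, accel))
        self.speed = max(0.0, self.speed + accel * self.STEP_SIZE)
        self.leader_gap += (self.leader_speed - self.speed) * self.STEP_SIZE
        return self.get_state(), self.compute_reward()
```

An RL library would call `reset` once per episode and `step` repeatedly, using the returned reward to improve the policy that picks `accel`.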
    
Once you have defined these two classes, the last step is to set up the parameters of the simulation. These include, among others, the name of the simulation, the network and environment to use, the simulator to use _(SUMO, Aimsun, ...)_, the RL algorithm to use _(PPO, TRPO, ...)_ and its parameters _(number of iterations, number of CPUs/GPUs to use, discount rate, ...)_, the vehicles and/or traffic lights to add to the network, and whether to render the simulation _(not rendering makes training much faster)_.
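
To give a feel for what such a set of parameters might look like, here is a sketch that stores them in a plain dictionary. All key names and values are assumptions made for illustration; the actual parameter objects are defined in the configurations under `examples/exp_configs` and are introduced in the next tutorials.

```python
# Parameters that this toy setup expects (hypothetical names, not Flow's).
REQUIRED_KEYS = ("exp_tag", "network", "env_name", "simulator",
                 "algorithm", "vehicles", "render")

TRAINING_CONFIG = {
    "exp_tag": "ring_experiment",        # name of the simulation
    "network": "ToyRingNetwork",         # network to use
    "env_name": "ToyCarFollowingEnv",    # RL environment to use
    "simulator": "sumo",                 # which traffic simulator to use
    "algorithm": {                       # RL algorithm and its parameters
        "name": "PPO",
        "num_iterations": 200,
        "num_cpus": 2,
        "discount_rate": 0.999,
    },
    "vehicles": {"human": 21, "rl": 1},  # vehicles to add to the network
    "render": False,                     # not rendering makes training faster
}


def missing_parameters(config):
    """Return the required keys that the given configuration does not define."""
    return [key for key in REQUIRED_KEYS if key not in config]
```

Here `missing_parameters(TRAINING_CONFIG)` returns an empty list, while a configuration that lacks, say, the `"simulator"` entry would have that key reported.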

### Tools

During training or after it has ended, you can use Flow's visualization tools to inspect the data saved in the checkpoint files generated during training. You can see how well your agent is doing by running a new simulation in the simulator that uses the trained policy (this time, the simulation will be rendered). You can also plot the reward or return functions, time-space diagrams, capacity diagrams, etc.
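
In RL, the return of an episode is the sum of the rewards collected over that episode, optionally discounted. The sketch below shows that computation on a list of per-step rewards. It is a generic illustration with a hypothetical function name, and Flow's own visualization scripts may compute and plot these quantities differently.

```python
def episode_return(rewards, discount=1.0):
    """Return sum_t discount**t * rewards[t] for one episode."""
    total = 0.0
    # Work backwards so that each step adds its reward to the discounted
    # value of everything that follows it.
    for reward in reversed(rewards):
        total = reward + discount * total
    return total
```

With `discount=1.0` this is the plain sum of rewards. With a discount below 1, later rewards count for less.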

To ease the process of getting started, Flow comes with over a dozen pre-built networks and RL environments. Flow also has many examples that set up simulations, with or without training, using these networks and environments in various ways. You can use them as a starting point and modify them according to your needs, or use them as templates to create your own code.

In the next section, we will give an overview of how Flow's codebase is organized, so that you can have some reference points when you go through the tutorials.

## 2. Codebase structure

The `flow` codebase directory is structured as follows:

```python
flow
├── docs  # miscellaneous documents, don't worry about them
├── examples  # a lot of example code using Flow -- this is where you want to head once you're done with the tutorials and want to start writing real code
│   └── exp_configs  # configurations of all the examples
│       ├── non_rl  # configurations of examples with simulations (e.g. either SUMO or Aimsun) without any reinforcement learning
│       └── rl
│           ├── singleagent  # configurations of examples that train single-agent RL controllers
│           └── multiagent  # configurations of examples that train multi-agent RL controllers
├── flow
│   ├── benchmarks  # several custom networks and configurations on which you can evaluate and compare different RL algorithms
│   ├── controllers  # implementations of controllers for the vehicles (IDM, Follower-Stopper...)
│   ├── core  # the core logic of the code -- where the magic happens
│   │   └── kernel
│   │       ├── network  # logic for the network
│   │       ├── simulation  # where the simulation is created and managed
│   │       ├── traffic_light  # logic for the traffic lights
│   │       └── vehicle  # logic for the vehicles
│   ├── envs  # environments (where states, actions and rewards are handled)
│   │   └── multiagent  # multi-agent environments
│   ├── renderer  # pyglet renderer
│   ├── networks  # networks (i.e. road networks)
│   ├── utils  # the files that don't fit anywhere else
│   └── visualize  # scripts to replay policies, analyse reward functions etc.
├── scripts  # mostly installation scripts
├── tests  # unit tests
└── tutorials  # <-- you are here
```

Don't hesitate to go and read the code files directly! We try to keep everything documented and understandable. However, if something remains unclear even after reading all the tutorials and going through the examples, you can ask us on [Stack Overflow](https://stackoverflow.com/questions/tagged/flow-project) using the tag `flow-project` (first make sure your question hasn't already been asked!).
