
Improve documentation (#47)
* optimize idm policy docs

* update docs on observation

* update readme

* introduce links in readme

* update images

* center

* bold

* text align to center

* text align to center

* finish
PENG Zhenghao committed Sep 3, 2021
1 parent df390f5 commit f622a92
Showing 6 changed files with 55 additions and 28 deletions.
23 changes: 16 additions & 7 deletions README.md
@@ -6,17 +6,26 @@

# MetaDrive: Composing Diverse Driving Scenarios for Generalizable RL

<br>

**Though the development of MetaDrive is settled for the current stage, we are still working on the documentation and other cleanup. We expect to finish everything by September 1st.**
<div style="text-align: center; width:100%; margin: 0 auto; display: inline-block">
<strong>
[
<a href="https://decisionforce.github.io/metadrive/">Website</a>
|
<a href="https://metadrive-simulator.readthedocs.io">Documentation</a>
|
<a href="https://github.com/decisionforce/metadrive">Github Repo</a>
]
</strong>
</div>

<br>

Welcome to MetaDrive! MetaDrive is a driving simulator with many key features, including:
Welcome to MetaDrive! MetaDrive is a driving simulator with many key features:

- **Lightweight**: Extremely easy to download, install and run on almost all platforms.
- **Realistic**: Accurate physics simulation and multiple sensory inputs.
- **Efficient**: Up to 300 simulation steps per second and easy to parallelize.
- **Compositionality**: Supports generating infinite scenes and configuring various traffic, vehicle, and environmental settings.
- **Lightweight**: Extremely easy to download, install and run on almost all platforms. Up to 300 simulation steps per second and easy to parallelize.
- **Realistic**: Accurate physics simulation and multiple sensory inputs including Lidar, sensory data, a top-down semantic map and first-person-view images.
- **Compositional**: Supports generating infinite scenes and configuring various traffic, vehicle, and environmental settings.


## 🛠 Quick Start
8 changes: 4 additions & 4 deletions documentation/source/config_system.rst
@@ -118,7 +118,7 @@ Visualization & Rendering Config

The configs in this part specify the settings related to visualization. :code:`use_render` is the most useful one.

- :code:`use_render` (bool = False): whether to pop up a window on your screen
- :code:`use_render` (bool = False): whether to pop up a window on your screen. This is independent of the vision-based observation.
- :code:`disable_model_compression` (bool = True): Model compression reduces the memory consumption when using the Panda3D window to visualize. Disabling model compression greatly improves the launch speed but might cause crashes on low-memory machines.
- :code:`cull_scene` (bool = True): set this to True when you want to access camera images.
- :code:`use_chase_camera_follow_lane` (bool = False): whether to force the third-person-view camera to follow the heading of the current lane
@@ -143,13 +143,13 @@ We list the vehicle config here. Observation Space will be adjusted by these config.
- :code:`side_detector` (dict): This Lidar only scans the side of the road but not vehicles. The config dict has identical keys as :code:`lidar` except :code:`num_others`.
- :code:`lane_line_detector` (dict): This Lidar only scans the side of the current lane but neither vehicles nor road boundaries. The config dict has identical keys as :code:`lidar` except :code:`num_others`.
- :code:`show_lidar` (bool = False): whether to show the end of each Lidar laser in the scene
- :code:`rgb_camera` (tuple): (camera resolution width (int), camera resolution height (int)). We use (84, 84) as the default size so that the RGB observation is compatible with the CNNs used in Atari. Please refer to :ref:`use_native_rendering` for more information about using images as observation.
- :code:`increment_steering` (bool = False): for keyboard control. When set to True, the steering angle and acceleration are determined by the key-pressing time
- :code:`vehicle_model` (str = "default"): which type of vehicle to use as the ego vehicle (s, m, l, xl, default)
- :code:`enable_reverse` (bool = False): If True and vehicle speed < 0, a brake action (e.g. acceleration = -1) will be parsed as reverse. This is used in the Multi-agent Parking Lot environment.
- :code:`extra_action_dim` (int = 0): If you want to input more control signal than the default [steering, throttle/brake] in your customized environment, change the default value 0 to the extra number of dimensions.
- :code:`random_color` (bool = False): whether to randomize the color of ego vehicles. This is useful in multi-agent environments.
- :code:`image_source` (str = "rgb_camera"): select from ["rgb_camera", "depth_camera"]. When using image observation, it decides where the image is collected.
- :code:`image_source` (str = "rgb_camera"): select from ["rgb_camera", "depth_camera"]. When using image observation, it decides where the image is collected. See :ref:`use_native_rendering` for more information.
- :code:`rgb_camera` (tuple = (84, 84)): (camera resolution width (int), camera resolution height (int)). We use (84, 84) as the default size so that the RGB observation is compatible with the CNNs used in Atari. Please refer to :ref:`use_native_rendering` for more information about using images as observation.
- :code:`spawn_lane_index` (tuple): which lane to spawn this vehicle in. Defaults to a lane in the first block of the map
- :code:`spawn_longitude/lateral` (float = 5.0, 0.0): The spawn point will be calculated by *spawn_longitude* and *spawn_lateral*
- :code:`destination_node` (str = None): the destination road node name. This is used in real dataset replay map.
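Pulling a few of these entries together, a hypothetical :code:`vehicle_config` override might look like the following sketch (the key names follow the list above; the concrete values are illustrative assumptions, not recommended settings):

```python
# Sketch of a vehicle_config override using the keys documented above.
# The values are illustrative assumptions, not tuned recommendations.
vehicle_config = {
    "show_lidar": False,           # hide the Lidar laser endpoints in the scene
    "vehicle_model": "s",          # one of: s, m, l, xl, default
    "enable_reverse": True,        # brake at negative speed is parsed as reverse
    "random_color": True,          # randomize ego color, useful in multi-agent envs
    "image_source": "rgb_camera",  # or "depth_camera"
    "rgb_camera": (84, 84),        # (width, height); Atari-compatible default
}

# The override is nested under the env config:
config = {"vehicle_config": vehicle_config}
```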
@@ -164,7 +164,7 @@ Other Observation Config

The vehicle config decides many of the observation configs.

- :code:`offscreen_render` (bool = False): If you want to use camera data, please set this to True.
- :code:`offscreen_render` (bool = False): If you want to use vision-based observation, please set this to True. See :ref:`use_native_rendering` for more information.
- :code:`rgb_clip` (bool = True): if True, rescale the pixel values from \[0, 255\] to \[0.0, 1.0\]
- :code:`headless_machine_render` (bool = False): set this to True only when training on a headless machine and using RGB images
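The :code:`rgb_clip` normalization described above amounts to a simple rescaling. A minimal sketch with NumPy, assuming a uint8 camera frame:

```python
import numpy as np

def clip_rgb(image: np.ndarray) -> np.ndarray:
    """Rescale a uint8 image from [0, 255] to float32 in [0.0, 1.0]."""
    return image.astype(np.float32) / 255.0

frame = np.array([[0, 128, 255]], dtype=np.uint8)
normalized = clip_rgb(frame)
# All values now lie within [0.0, 1.0].
```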

7 changes: 3 additions & 4 deletions documentation/source/index.rst
@@ -16,10 +16,9 @@ Welcome to the MetaDrive documentation!
MetaDrive is an efficient and compositional driving simulator for the reinforcement learning community!
The key features of MetaDrive include:

- **Lightweight**: Extremely easy to download, install and run on almost all platforms.
- **Realistic**: Accurate physics simulation and multiple sensory inputs including RGB camera, Lidar and sensory data.
- **Efficient**: Up to 300 simulation steps per second.
- **Open-ended**: Supports generating infinite scenes and configuring various traffic, vehicle, and environmental settings.
- **Lightweight**: Extremely easy to download, install and run on almost all platforms. Up to 300 simulation steps per second and easy to parallelize.
- **Realistic**: Accurate physics simulation and multiple sensory inputs including Lidar, sensory data, a top-down semantic map and first-person-view images.
- **Compositional**: Supports generating infinite scenes and configuring various traffic, vehicle, and environmental settings.

This documentation brings you information on installation, usage and more of MetaDrive!

31 changes: 21 additions & 10 deletions documentation/source/observation.rst
@@ -43,7 +43,7 @@ The above information is concatenated into a state vector by the `LidarStateObse

.. _use_pygame_rendering:

Pygame Top-down Semantic Maps
Top-down Semantic Maps
********************************


@@ -84,27 +84,38 @@ The above figure shows the semantic meaning of each channel.
Use First-view Images in Training
##################################

MetaDrive supports visuomotor tasks by turning on the rendering during training.

.. image:: figs/rgb_obs.png
:width: 600
:width: 350
:align: center

.. image:: figs/depth_obs.png
:width: 600
.. image:: figs/depth_obs.jpg
:width: 350
:align: center

Special configs are needed to activate camera observation.

1. In env config, **offscreen_render** needs to be **True** to tell MetaDrive to retrieve images from the camera
2. In vehicle_config (under env config), set **image_source** to **rgb_camera** or **depth_camera** to get sensory data
3. The image size will be determined by the camera parameters. For example, **rgb_camera=(200, 88)** means that the image is 200 x 88
MetaDrive supports visuomotor tasks by turning on the rendering during training.
The above figure shows the images captured by the RGB camera (left) and the depth camera (right).
In this section, we discuss how to utilize such observations on a **headless** machine, such as a computing node in a
cluster or another remote server.
Before using this feature in your project, please make sure that offscreen rendering works on your
machine. The setup tutorial is at :ref:`install_headless`.

Now we can set up the vision-based observation in MetaDrive:

* Step 1. Set :code:`config["offscreen_render"] = True` to tell MetaDrive to maintain an image buffer in memory even when no pop-up window exists.
* Step 2. Set :code:`config["vehicle_config"]["image_source"]` to :code:`"rgb_camera"` or :code:`"depth_camera"` according to your needs.
* Step 3. The image size (width and height) is determined by the camera parameters. The default setting is (84, 84), following the image size in Atari. You can customize the size by configuring :code:`config["vehicle_config"]["rgb_camera"]`. For example, :code:`config["vehicle_config"]["rgb_camera"] = (200, 88)` means that the image is 200 pixels wide and 88 pixels high.
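The three steps above can be sketched as a plain config dict. Building the dict needs no simulator; passing it to :code:`MetaDriveEnv` is shown as a comment since it requires a working MetaDrive installation:

```python
# Steps 1-3 above expressed as a config dict.
config = {
    "offscreen_render": True,          # Step 1: keep an image buffer in memory
    "vehicle_config": {
        "image_source": "rgb_camera",  # Step 2: or "depth_camera"
        "rgb_camera": (200, 88),       # Step 3: 200 px wide, 88 px high
    },
}

# from metadrive import MetaDriveEnv
# env = MetaDriveEnv(config)
# obs = env.reset()  # obs then contains the camera image
```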

There is a demo script using camera output via::
There is a demo script using the RGB camera as observation::

python -m metadrive.examples.drive_in_single_agent_env --observation rgb_camera

The script should print a message:

.. code-block:: text

    The observation is a dict with numpy arrays as values: {'image': (84, 84, 3), 'state': (21,)}
The image rendering consumes memory on the first GPU of your machine (if any). Please be careful when using this feature.
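A training loop typically splits this dict into an image branch and a state branch. A minimal sketch with a stand-in observation (in practice the dict comes from :code:`env.reset()` or :code:`env.step()`):

```python
import numpy as np

# Stand-in for the observation printed above; a real one comes from the env.
obs = {
    "image": np.zeros((84, 84, 3), dtype=np.float32),  # camera frame
    "state": np.zeros((21,), dtype=np.float32),        # ego-state vector
}

image_branch = obs["image"]  # e.g. feed to a CNN encoder
state_branch = obs["state"]  # e.g. feed to an MLP encoder
shapes = {k: v.shape for k, v in obs.items()}
# shapes == {'image': (84, 84, 3), 'state': (21,)}
```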

12 changes: 10 additions & 2 deletions metadrive/examples/drive_in_single_agent_env.py
@@ -5,10 +5,12 @@
Note: This script requires rendering. Please follow the installation instructions to set up a proper
environment that allows popping up a window.
"""
import argparse
import random

import numpy as np

from metadrive import MetaDriveEnv
import argparse

if __name__ == "__main__":
config = dict(
@@ -29,7 +31,13 @@
if args.observation == "rgb_camera":
config.update(dict(offscreen_render=True))
env = MetaDriveEnv(config)
env.reset()
o = env.reset()
if args.observation == "rgb_camera":
assert isinstance(o, dict)
print("The observation is a dict with numpy arrays as values: ", {k: v.shape for k, v in o.items()})
else:
assert isinstance(o, np.ndarray)
print("The observation is a numpy array with shape: ", o.shape)
for i in range(1, 1000000000):
o, r, d, info = env.step([0, 0])
env.render(text={
2 changes: 1 addition & 1 deletion metadrive/policy/idm_policy.py
@@ -287,7 +287,7 @@ def lane_change_policy(self, all_objects):
next_lanes = self.control_object.navigation.next_ref_lanes
lane_num_diff = len(current_lanes) - len(next_lanes) if next_lanes is not None else 0

# must perform lane change due to routing lane num change
# We have to perform lane changing because the number of lanes in the next road is less than in the current road
if lane_num_diff > 0:
# lane num decreasing happened in left road or right road
if current_lanes[0].is_previous_lane_of(next_lanes[0]):
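The lane-count check in the hunk above can be illustrated standalone; the sketch below uses plain lists as hypothetical stand-ins for MetaDrive's lane objects:

```python
def must_change_lane(current_lanes, next_lanes):
    """Return True when the next road has fewer lanes than the current road,
    so the ego vehicle must merge before entering it."""
    lane_num_diff = len(current_lanes) - len(next_lanes) if next_lanes is not None else 0
    return lane_num_diff > 0

# Three lanes narrowing to two forces a lane change; no next road means no change.
assert must_change_lane(["l0", "l1", "l2"], ["l0", "l1"]) is True
assert must_change_lane(["l0", "l1"], None) is False
```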
