Update doc for sensors (#593)
* switch to linux

* redo interface

* fix

* fix obs bug

* fix bug

* update ipynb

* fix bug

* optimize

* test read the doc

* fix bug

* github test

* fix test

* test create buffer

* restore
QuanyiLi committed Jan 7, 2024
1 parent 53c093c commit ca2a869
Showing 15 changed files with 307 additions and 160 deletions.
2 changes: 1 addition & 1 deletion documentation/README.md
@@ -50,7 +50,7 @@ The second way can not refer to subtitle.
### Executing some files when building the doc

The doc is set to disable executing all `.ipynb` files.
- For enabling executing certain files, set in the file metadata with:
+ To enable executing certain files and generating their results when compiling the doc, set the file metadata with:

```python
{
    ...
```
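For reference, a sketch of what this truncated metadata block contains, based on the `mystnb` setting this same commit adds to sensors.ipynb below; this is not the verbatim README continuation:

```python
{
    "mystnb": {
        "execution_mode": "force"
    }
}
```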
2 changes: 1 addition & 1 deletion documentation/source/index.rst
@@ -54,11 +54,11 @@ Please feel free to contact us if you have any suggestions or ideas!
:caption: Concepts and Customization

system_design.ipynb
+ sensors.ipynb
top_down_render.ipynb
panda_render.ipynb
vehicle.ipynb
navigation.ipynb
- sensors.ipynb
description.ipynb
record_replay.ipynb
development.rst
57 changes: 38 additions & 19 deletions documentation/source/obs.ipynb
@@ -229,7 +229,9 @@
"id": "43e05810",
"metadata": {},
"source": [
"Rendering images and buffering the image observations consume both the GPU and CPU memory of your machine. Please be careful when using this. If you feel the visual data collection is slow, why not try our advanced offscreen render: <a href=\"install.html#install-metadrive-with-advanced-offscreen-rendering\">Install MetaDrive with advanced offscreen rendering</a>. After verifying your installation, set `config[\"image_on_cuda\"] = True` to get **10x** faster rollout efficiency! It will keep the rendered image on the GPU memory all the time for training, so please ensure your GPU has enough memory to store them. "
"Rendering images and buffering the image observations consume both the GPU and CPU memory of your machine. Please be careful when using this. If you feel the visual data collection is slow, why not try our advanced offscreen render: <a href=\"install.html#install-metadrive-with-advanced-offscreen-rendering\">Install MetaDrive with advanced offscreen rendering</a>. After verifying your installation, set `config[\"image_on_cuda\"] = True` to get **10x** faster rollout efficiency! It will keep the rendered image on the GPU memory all the time for training, so please ensure your GPU has enough memory to store them. \n",
"\n",
"More details of how to use sensors is at [Sensors](sensors.ipynb)."
]
},
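A minimal sketch of the `image_on_cuda` setup described above. Only `image_observation` and `image_on_cuda` come from the paragraph itself; the `rgb_camera` sensor tuple and the `image_source` key reuse patterns that appear elsewhere in this commit, so treat the config as illustrative rather than a verified recipe:

```python
# Sketch: GPU-resident image observations. Assumes the advanced offscreen
# rendering installation linked above has been verified first.
from metadrive.envs.metadrive_env import MetaDriveEnv
from metadrive.component.sensors.rgb_camera import RGBCamera

env = MetaDriveEnv(dict(
    image_observation=True,
    image_on_cuda=True,  # keep rendered frames in GPU memory (~10x faster rollout)
    sensors=dict(rgb_camera=(RGBCamera, 84, 84)),
    vehicle_config=dict(image_source="rgb_camera"),
))
env.reset()
# Image observations now stay on the GPU during rollout; copy to host memory
# only when a numpy array is actually needed.
env.close()
```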
{
@@ -374,7 +376,7 @@
" return gym.spaces.Dict(os)\n",
"\n",
" def observe(self, vehicle):\n",
" os={o: getattr(self, o).observe(vehicle) for o in [\"rgb\", \"state\", \"depth\", \"semantic\"]}\n",
" os={o: getattr(self, o).observe() for o in [\"rgb\", \"state\", \"depth\", \"semantic\"]}\n",
" return os"
]
},
@@ -503,12 +505,14 @@
"## Customization-MultiView\n",
"Usually, you don't need to create a certain type of image sensor multiple times and maintain many instances in the game engine. Instead, you can create just one sensor but mount it to different positions and poses to collect multiple rendering results. In this way, the rendering can be more efficient.\n",
"\n",
"In the following example, we want 4 RGB cameras to monitor the traffic density of 4 entries of an intersection. **In practice, we don't create any new RGB cameras but use the main RGB camera. In every frame, we move it to 4 target positions to collect data.** The result is the same as capturing images with 4 different RGB cameras, which harms the simulation's performance."
"In the following example, we want 4 RGB cameras to monitor the traffic density of 4 entries of an intersection. **In practice, we don't create any new RGB cameras but use the main RGB camera. In every frame, we move it to 4 target positions to collect data.** The result is the same as capturing images with 4 different RGB cameras, which harms the simulation's performance.\n",
"\n",
"More sensor examples can be found in [Sensors](sensors.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 12,
"id": "995d5314-92a7-4e68-8bb8-05f1bd8ab718",
"metadata": {
"editable": true,
@@ -535,7 +539,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 13,
"id": "7f20c293-77f9-451c-a552-882def3d6257",
"metadata": {
"editable": true,
@@ -564,49 +568,64 @@
"\n",
" @property\n",
" def observation_space(self):\n",
" os={\"entry_{}\".format(idx): self.rgb.observation_space for idx in range(4)}\n",
" os = {\"entry_{}\".format(idx): self.rgb.observation_space for idx in range(4)}\n",
" os[\"top_down\"] = self.rgb.observation_space\n",
" return gym.spaces.Dict(os)\n",
"\n",
" def observe(self, vehicle):\n",
" ret = {}\n",
" # The first rendered image is the top-down view\n",
" ret[\"top_down\"] = self.rgb.observe()\n",
" # The camera can be borrowed to render new images with new poses\n",
" for idx in range(4):\n",
" ret[\"entry_{}\".format(idx)]= self.rgb.observe(self.engine.origin, position=[70, 8.75, 8], hpr=[idx*90, -15, 0], refresh=True)\n",
" ret[\"top_down\".format(idx)]= self.rgb.observe(self.engine.origin, position=[70, 8.75, 50], hpr=[0, -89.99, 0], refresh=True)\n",
" ret[\"entry_{}\".format(idx)] = self.rgb.observe(self.engine.origin,\n",
" position=[70, 8.75, 8],\n",
" hpr=[idx * 90, -15, 0])\n",
" return ret\n",
"\n",
"\n",
"env_cfg = dict(agent_observation=MyObservation,\n",
" image_observation=True, \n",
" image_observation=True,\n",
" window_size=sensor_size,\n",
" map=\"X\",\n",
" show_terrain= not os.getenv('TEST_DOC'),\n",
" show_terrain=not os.getenv('TEST_DOC'),\n",
" traffic_density=0.2,\n",
" show_interface=False,\n",
" show_fps=False,\n",
" traffic_mode=\"respawn\",\n",
" log_level=50, # no log message\n",
" log_level=50, # no log message\n",
" vehicle_config=dict(image_source=\"main_camera\"))\n",
"\n",
"\n",
"def reset_sensors(self):\n",
" \"\"\"\n",
" Put the main camera to the center of the intersection at the start of each episode\n",
" \"\"\"\n",
" self.main_camera.stop_track()\n",
" self.main_camera.set_bird_view_pos([70, 8.75])\n",
" self.main_camera.top_down_camera_height = 50\n",
"\n",
"\n",
"MetaDriveEnv.reset_sensors = reset_sensors\n",
"\n",
"frames = []\n",
"env=MetaDriveEnv(env_cfg)\n",
"env = MetaDriveEnv(env_cfg)\n",
"try:\n",
" env.reset()\n",
" print(\"Observation shape: \\n\", env.observation_space)\n",
" for step in range(1 if os.getenv('TEST_DOC') else 500):\n",
" o, r, d, _, _ = env.step([0, -1]) # simulation\n",
" \n",
" o, r, d, _, _ = env.step([0, -1]) # simulation\n",
"\n",
" # visualize image observation\n",
" o_1 = o[\"entry_0\"][..., -1]\n",
" o_2 = o[\"entry_1\"][..., -1]\n",
" o_3 = o[\"entry_2\"][..., -1]\n",
" o_4 = o[\"entry_3\"][..., -1]\n",
" o_5 = o[\"top_down\"][..., -1]\n",
" ret = cv2.hconcat([o_1, o_2, o_3, o_4, o_5])*255\n",
" ret=ret.astype(np.uint8)\n",
" frames.append(ret[::2, ::2,::-1])\n",
" generate_gif(frames if os.getenv('TEST_DOC') else frames[-100:]) # only show -100 frames\n",
" ret = cv2.hconcat([o_1, o_2, o_3, o_4, o_5]) * 255\n",
" ret = ret.astype(np.uint8)\n",
" frames.append(ret[::2, ::2, ::-1])\n",
" generate_gif(frames if os.getenv('TEST_DOC') else frames[-100:]) # only show -100 frames\n",
"finally:\n",
" env.close()"
]
@@ -663,7 +682,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
"version": "3.7.13"
},
"mystnb": {
"execution_mode": "force"
161 changes: 142 additions & 19 deletions documentation/source/sensors.ipynb
@@ -5,25 +5,145 @@
"id": "0a5575ba",
"metadata": {},
"source": [
" # Sensors\n",
" \n",
" It should be a dict like this:\n",
" # sensors={\n",
" # \"rgb_camera\": (RGBCamera, arg_1, arg_2, ..., arg_n),\n",
" # ...\n",
" # \"your_sensor_name\": (sensor_class, arg_1, arg_2, ..., arg_n)\n",
" # }\n",
" # Example:\n",
" # sensors = dict(\n",
" # lidar=(Lidar,),\n",
" # side_detector=(SideDetector,),\n",
" # lane_line_detector=(LaneLineDetector,)\n",
" # rgb_camera=(RGBCamera, 84, 84),\n",
" # mini_map=(MiniMap, 84, 84, 250),\n",
" # depth_camera=(DepthCamera, 84, 84),\n",
" # )\n",
" # These sensors will be constructed automatically and can be accessed in engine.get_sensor(\"sensor_name\")\n",
" # NOTE: main_camera will be added automatically if you are using offscreen/onscreen mode"
"# Sensors\n",
"\n",
"Sensors are important for collecting information about surroundings.\n",
"By default, all environments provide 3 basic sensors:\n",
"\n",
"- Lidar\n",
"- SideDetector\n",
"- LaneLineDetector\n",
"\n",
"which are used for detecting moving objects, sidewalks/solid lines, and broken/solid lines respectively.\n",
"As these sensors are built based on ray test and don't need graphics support, they can be used in all modes.\n",
"Also, you don't need to recreate them again, as they are not binded with any objects until `perceive()` is called and the target object is specified. After collecting results, those ray-based sensors are detached and ready for next use.\n",
"\n",
"You can access them at anywhere through the `engine.get_sensor(sensor_id)`:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d07cb731-8a81-4fbe-827e-1ca2d4b150e8",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "from metadrive.envs.base_env import BaseEnv\n",
+ "\n",
+ "env = BaseEnv(dict(log_level=50))\n",
+ "env.reset()\n",
+ "\n",
+ "lidar = env.engine.get_sensor(\"lidar\")\n",
+ "side_lidar = env.engine.get_sensor(\"side_detector\")\n",
+ "lane_line_lidar = env.engine.get_sensor(\"lane_line_detector\")\n",
+ "print(\"Available sensors are:\", env.engine.sensors.keys())\n",
+ "\n",
+ "env.close()"
+ ]
+ },
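A hedged sketch of borrowing one of these shared ray-based sensors directly. The exact `perceive()` signature is an assumption (check the sensor classes in your installed MetaDrive source), and `env.agent` as the ego-vehicle accessor is likewise assumed:

```python
from metadrive.envs.metadrive_env import MetaDriveEnv

env = MetaDriveEnv(dict(log_level=50))
env.reset()

lidar = env.engine.get_sensor("lidar")
# The sensor binds to the target object only for the duration of this call,
# then detaches and is ready for the next use.
# NOTE: this argument list is an assumption, not the verified API.
cloud = lidar.perceive(env.agent,
                       physics_world=env.engine.physics_world.dynamic_world,
                       num_lasers=80,
                       distance=50)
env.close()
```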
+ {
+ "cell_type": "markdown",
+ "id": "d33b3e7a-223a-4a89-bb9d-040f3c233adc",
+ "metadata": {},
+ "source": [
+ "## Add New Sensor\n",
+ "To add new sensors, you should request them via `env_config`.\n",
+ "If a sensor is defined as follows:\n",
+ "```python\n",
+ "class MySensor(BaseSensor):\n",
+ "\n",
+ " def __init__(self, args_1, args_2, engine):\n",
+ " ...\n",
+ "```\n",
+ "Then we can create it by:\n",
+ "```python\n",
+ "env_cfg = dict(sensors=dict(new_sensor=(MySensor, args_1, args_2)))\n",
+ "env = MetaDriveEnv(env_cfg)\n",
+ "```\n",
+ "The following example shows how to create an RGBCamera whose buffer size (width, height) is set by the sensor arguments.\n",
+ "**Note: for creating cameras or any sensors requiring rendering, please turn on `image_observation`**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "425d66f9-118f-4b91-a343-ef1385281ba8",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "from metadrive.envs.base_env import BaseEnv\n",
+ "from metadrive.component.sensors.rgb_camera import RGBCamera\n",
+ "import cv2\n",
+ "import os\n",
+ "size = (256, 128) if not os.getenv('TEST_DOC') else (16, 16) # for github CI\n",
+ "\n",
+ "env_cfg = dict(log_level=50, # suppress log\n",
+ " image_observation=True,\n",
+ " show_terrain=not os.getenv('TEST_DOC'),\n",
+ " sensors=dict(rgb=[RGBCamera, *size]))\n",
+ "\n",
+ "\n",
+ "env = BaseEnv(env_cfg)\n",
+ "env.reset()\n",
+ "print(\"Available sensors are:\", env.engine.sensors.keys())\n",
+ "cam = env.engine.get_sensor(\"rgb\")\n",
+ "img = cam.get_rgb_array_cpu()\n",
+ "cv2.imwrite(\"img.png\", img)\n",
+ "\n",
+ "env.close()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1dd842e2-c89f-4715-af17-4a2d00f7bdd9",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "from IPython.display import Image\n",
+ "Image(open(\"img.png\", \"rb\").read())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a15f6912-500b-433e-a69a-74660028b3d6",
+ "metadata": {},
+ "source": [
+ "The log message shows that not only is the `rgb` sensor created, but a `main_camera` is provided automatically as well. It is also an RGB camera, rendering to the pop-up window, and can serve as a sensor in its own right. More details are available at\n",
+ "<a href=\"sensors.html#main-camera\">Main Camera</a>."
+ ]
+ },
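Since the automatically created `main_camera` behaves like any registered sensor, here is a minimal sketch of using it as the image source, mirroring the `vehicle_config=dict(image_source="main_camera")` pattern from the obs.ipynb changes above:

```python
from metadrive.envs.base_env import BaseEnv

env = BaseEnv(dict(log_level=50,
                   image_observation=True,
                   vehicle_config=dict(image_source="main_camera")))
env.reset()
print("Available sensors are:", env.engine.sensors.keys())  # includes main_camera
cam = env.engine.get_sensor("main_camera")  # the automatically provided camera
env.close()
```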
+ {
+ "cell_type": "markdown",
+ "id": "c5311f1d-dc65-4f8e-840d-a46698571252",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "## Physics-based Sensors"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "8caacfa8-0d10-4dc2-8a6c-494dc7524b0d",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "## Graphics-based Sensors\n",
+ "\n",
+ "### Main Camera\n",
+ "\n",
+ "### RGB Camera\n",
+ "\n",
+ "### Depth Camera\n",
+ "\n",
+ "### Semantic Camera"
+ ]
}
],
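The camera subsections above are left as stubs in this hunk. As a placeholder sketch, the cameras they name can be requested through the same `sensors` dict shown in the old comment block; the `DepthCamera` and `SemanticCamera` import paths below are assumptions made by analogy with `rgb_camera`:

```python
from metadrive.envs.base_env import BaseEnv
from metadrive.component.sensors.rgb_camera import RGBCamera
# Assumed module paths, mirroring rgb_camera's location:
from metadrive.component.sensors.depth_camera import DepthCamera
from metadrive.component.sensors.semantic_camera import SemanticCamera

env = BaseEnv(dict(log_level=50,
                   image_observation=True,  # required for rendering sensors
                   sensors=dict(rgb=(RGBCamera, 84, 84),
                                depth=(DepthCamera, 84, 84),
                                semantic=(SemanticCamera, 84, 84))))
env.reset()
print("Available sensors are:", env.engine.sensors.keys())
env.close()
```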
@@ -44,6 +164,9 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.13"
},
"mystnb": {
"execution_mode": "force"
}
},
"nbformat": 4,