Workshop 9 ‐ Robot Vision

Preparations

  • Have the simulation and the LIMO robot ready for comparison. If you work with the simulator, remember to pull the latest changes to the Docker image or, if you work with your own installation of the software, pull the latest changes from the repo. If you work with the real robot, pull the latest changes from the repo after you ssh into the robot and just before you start the container.

Task 1: Tools

  1. RQt tools are very convenient for inspecting image topics. First, install the rqt_image_view package by issuing sudo apt-get install ros-humble-rqt-image-view. Whilst the cameras on the real/simulated robot are running, issue the following command to visualise the colour image (check the image topic name, as it differs slightly between the real and simulated robot): ros2 run rqt_image_view rqt_image_view --ros-args -r image:=/limo/depth_camera_link/image_raw. You can also skip the image topic argument and select an image topic from the list available through the GUI.

  2. The basic tool for viewing and saving images is image_view, but it has some limitations when it comes to accepting topics with different QoS settings. We can still use it for the image stream originating from the simulated robot: ros2 run image_view image_view --ros-args -r image:=/limo/depth_camera_link/image_raw. For topics that image_view rejects because of their QoS settings, see the Python sketch after this list.

  3. The rosbag2 package allows for convenient recording and replaying of different types of ROS topics, including images. To record a rosbag file, issue ros2 bag record [-o bagfilename] <topics> and then, to replay the recorded file, ros2 bag play <filename>. Some of the sensor topics on the real robot might require overriding QoS policies, which is covered in the following article.

  4. Some scenarios might involve streaming directly from a camera or a video file (e.g. for testing, training and annotation without the robot). The most straightforward way is to use the image_publisher package. The node can be run as ros2 run image_publisher image_publisher_node <input>, where <input> can be either a video file name (e.g. test.mp4) or a camera device (e.g. /dev/video0).
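
If image_view or rqt refuse a topic because of its QoS settings, a small Python node can subscribe with the sensor-data (best-effort) QoS profile and save the frames to disk instead. The following is a minimal sketch only; the topic name is the simulated one and should be remapped or edited for the real robot.

    # save_images.py -- minimal sketch: subscribe to an image topic with the
    # sensor-data QoS profile (best effort) and save each frame to disk.
    # The topic name below is an assumption; remap or edit it for your robot.
    import cv2
    import rclpy
    from rclpy.node import Node
    from rclpy.qos import qos_profile_sensor_data
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge


    class ImageSaver(Node):
        def __init__(self):
            super().__init__('image_saver')
            self.bridge = CvBridge()
            self.count = 0
            self.create_subscription(
                Image, '/limo/depth_camera_link/image_raw',
                self.image_callback, qos_profile_sensor_data)

        def image_callback(self, msg):
            # convert the ROS image into an OpenCV BGR array and write it out
            frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
            cv2.imwrite(f'frame_{self.count:05d}.png', frame)
            self.count += 1


    def main():
        rclpy.init()
        rclpy.spin(ImageSaver())


    if __name__ == '__main__':
        main()
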

Task 2: Colour detection

  • Run the simulator and insert a coloured object (e.g. a construction cone) from the object library in front of the robot. You might need to adjust the size of the cone so that it fits entirely into the image.
  • Create a ROS package opencv_test, which should depend on cv_bridge and rclpy (see the practical example of how to build a package with a node from scratch).
  • Taking inspiration from the following example opencv_test.py, develop a Python node which subscribes to the camera of your LIMO robot and masks out a specific colour in the image (e.g. orange for the construction cone). For masking, the OpenCV function inRange can be quite handy, as also used in this code.
  • Calculate the size and location of the segmented object (e.g. by either finding min/max pixel coordinates in the segmented image or by using connectedComponentsWithStats).
  • Print out the size and centre location of the largest object in the terminal window (a minimal sketch covering these steps follows this list).
  • Try it out on LIMO with a real coloured object!
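
The sketch below outlines one possible structure for the colour-detection node. The image topic is the simulated one and the HSV thresholds for orange are illustrative assumptions; treat both as starting points to tune rather than as the reference solution.

    # colour_detector.py -- minimal sketch of the colour-detection node.
    # Topic name and HSV thresholds are illustrative assumptions; tune them
    # for your robot and your object's colour.
    import cv2
    import numpy as np
    import rclpy
    from rclpy.node import Node
    from rclpy.qos import qos_profile_sensor_data
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge


    class ColourDetector(Node):
        def __init__(self):
            super().__init__('colour_detector')
            self.bridge = CvBridge()
            self.create_subscription(
                Image, '/limo/depth_camera_link/image_raw',
                self.image_callback, qos_profile_sensor_data)

        def image_callback(self, msg):
            image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
            # threshold in HSV space; these bounds roughly cover orange hues
            hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
            mask = cv2.inRange(hsv, (5, 150, 100), (25, 255, 255))
            # label connected regions and pick the largest (label 0 is background)
            num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
            if num_labels < 2:
                self.get_logger().info('no object detected')
                return
            largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
            area = stats[largest, cv2.CC_STAT_AREA]
            cx, cy = centroids[largest]
            self.get_logger().info(f'size: {area} px, centre: ({cx:.1f}, {cy:.1f})')


    def main():
        rclpy.init()
        rclpy.spin(ColourDetector())


    if __name__ == '__main__':
        main()
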

Task 3: Object detection

  • Use the simulation scenario with the construction cone as above.
  • Install the find_object_2d package: sudo apt-get install ros-humble-find-object-2d. Then run the application as ros2 run find_object_2d find_object_2d --ros-args -r image:=<image_topic> to subscribe to the colour image topic. "Train" a cone detector (simply mark a rectangular area of the image to indicate the object's bounding box) and save the model (i.e. a simple image) into a directory. Inspect the objects topic (see find_object_2d) and see if you can find the size and location of the object; a parsing sketch is provided after this list.
  • Compare the detection robustness against the colour detector from the previous task.
  • Try running the node in a non-interactive mode with your saved model: ros2 run find_object_2d find_object_2d --ros-args -r image:=/limo/depth_camera_link/image_raw -p objects_path:=[path_to_your_objects] -p gui:=false.
  • Try it out on LIMO with a real object!
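
As a starting point for reading the detections, the sketch below parses the objects topic assuming the layout documented by find_object_2d: 12 floats per detection (id, width and height of the trained image, followed by a 3x3 homography in QTransform order). It projects the centre of the trained image into the camera image. Verify the layout and the homography convention against your own output; the transpose used in the code is an assumption and may need removing if the reported centre looks wrong.

    # objects_listener.py -- minimal sketch that parses the /objects topic
    # published by find_object_2d, assuming the documented layout of 12 floats
    # per detection: [id, width, height, h11, h12, h13, h21, h22, h23, h31,
    # h32, h33], with the homography in QTransform order.
    import numpy as np
    import rclpy
    from rclpy.node import Node
    from std_msgs.msg import Float32MultiArray


    class ObjectsListener(Node):
        def __init__(self):
            super().__init__('objects_listener')
            self.create_subscription(Float32MultiArray, '/objects', self.callback, 10)

        def callback(self, msg):
            data = list(msg.data)
            for i in range(0, len(data), 12):
                obj_id, width, height = data[i], data[i + 1], data[i + 2]
                h = np.array(data[i + 3:i + 12]).reshape(3, 3)
                # project the centre of the trained image into the camera image;
                # QTransform maps row vectors, hence the transpose relative to the
                # usual OpenCV convention (try h without .T if results look wrong)
                centre = h.T @ np.array([width / 2.0, height / 2.0, 1.0])
                cx, cy = centre[0] / centre[2], centre[1] / centre[2]
                self.get_logger().info(
                    f'object {int(obj_id)}: {width:.0f}x{height:.0f} px model, '
                    f'centre at ({cx:.1f}, {cy:.1f})')


    def main():
        rclpy.init()
        rclpy.spin(ObjectsListener())


    if __name__ == '__main__':
        main()
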

Task 4: Vision-based control

  • Implement a simple vision-based controller so that the robot always aims at the centre of the detected object.
  • Use one of the above detectors to provide the centre location of the object.
  • Calculate the difference between the image centre and the object's location (it is sufficient to do that for the horizontal axis only).
  • Publish a suitable velocity command (cmd_vel) based on the following different controllers:
    • Bang-bang controller: turn in place in the opposite direction to the sign of the calculated difference with a fixed velocity. Try out different speeds and observe the behaviour of the robot.
    • Bang-bang controller with hysteresis: for less reactive behaviour, introduce a "dead zone" where the robot only corrects its orientation if the difference exceeds a certain threshold. Again, try different combinations of threshold and speed values.
    • Proportional controller: adjust the speed of the robot based on the magnitude of the calculated difference. This requires a scaling parameter (i.e. gain) to map the pixel difference into speed values. Try out different values of the scaling parameter. (A minimal sketch of all three strategies follows this list.)
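
A minimal sketch of the three strategies, with illustrative gains, thresholds and speeds, could look as follows. The error is assumed to be the horizontal pixel offset of the object centre from the image centre, supplied by whichever detector you use.

    # vision_controller.py -- minimal sketch of the three control strategies.
    # The error is the horizontal pixel offset of the object centre from the
    # image centre (object_x - image_centre_x). Gains, thresholds and speeds
    # are illustrative only.
    import rclpy
    from rclpy.node import Node
    from geometry_msgs.msg import Twist


    def bang_bang(error, speed=0.3):
        # turn in place at a fixed speed, against the sign of the error
        return -speed if error > 0 else speed


    def bang_bang_hysteresis(error, speed=0.3, dead_zone=20.0):
        # only react once the error leaves the dead zone (in pixels)
        if abs(error) < dead_zone:
            return 0.0
        return -speed if error > 0 else speed


    def proportional(error, gain=0.002):
        # scale the pixel error into an angular velocity (rad/s)
        return -gain * error


    class VisionController(Node):
        def __init__(self):
            super().__init__('vision_controller')
            self.cmd_pub = self.create_publisher(Twist, 'cmd_vel', 10)

        def turn_towards(self, error):
            # call this from your detector's callback with the current pixel error
            twist = Twist()
            twist.angular.z = proportional(error)  # swap in any controller above
            self.cmd_pub.publish(twist)


    def main():
        rclpy.init()
        node = VisionController()
        # example: object detected 50 px to the right of the image centre
        node.turn_towards(50.0)
        rclpy.spin(node)


    if __name__ == '__main__':
        main()

In practice, turn_towards would be called from the image callback of the detector node from Task 2 or 3 rather than from main.
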

Task 5: CNN-based object detection [optional]

Try out the above tasks but with a CNN-based object detector instead.

First, read about YOLO, and even the original paper.

  • Clone the humble branch of the darknet_ros fork from our repository into your workspace: git clone -b humble --recursive https://github.com/LCAS/darknet_ros.git
  • Edit darknet_ros/config/ros.yaml to use the correct image topic, e.g.:
      darknet_ros:
         ros__parameters:
            subscribers:
               camera_reading:
                  topic: /limo/depth_camera_link/image_raw
    
  • Build the package: colcon build --symlink-install
  • Source your workspace and run as ros2 launch darknet_ros darknet_ros.launch. Check the darknet_ros repo for details on how the object information is being published.
  • Adapt the detector into your vision-based controller (see the sketch below for reading the published bounding boxes).
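
As a starting point for the adaptation, the sketch below subscribes to the detections published by darknet_ros. The topic name and the field names (e.g. class_id) are assumptions based on the darknet_ros_msgs definitions; check them with ros2 interface show darknet_ros_msgs/msg/BoundingBoxes in your own build.

    # yolo_listener.py -- minimal sketch for feeding darknet_ros detections
    # into the vision-based controller. Topic name and message fields follow
    # the darknet_ros_msgs definitions shipped with the package; field names
    # may differ between forks, so check the message definition in your build.
    import rclpy
    from rclpy.node import Node
    from darknet_ros_msgs.msg import BoundingBoxes


    class YoloListener(Node):
        def __init__(self):
            super().__init__('yolo_listener')
            self.create_subscription(
                BoundingBoxes, '/darknet_ros/bounding_boxes', self.callback, 10)

        def callback(self, msg):
            for box in msg.bounding_boxes:
                # centre of the detected bounding box in image coordinates
                cx = (box.xmin + box.xmax) / 2.0
                cy = (box.ymin + box.ymax) / 2.0
                self.get_logger().info(
                    f'{box.class_id} ({box.probability:.2f}) at ({cx:.0f}, {cy:.0f})')


    def main():
        rclpy.init()
        rclpy.spin(YoloListener())


    if __name__ == '__main__':
        main()
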