# Mastering with ROS: Turtlebot3

<img src="img/robotis_logo.png" width="400" />

<img src="img/object_rec_unit.png" width="600" />

<img src="img/robotignite_logo_text.png" width="400"/>

## Unit 5: Perception and Object Recognition

<p style="background:green;color:white;">SUMMARY</p>

Estimated time of completion: <b>1.5 h</b><br><br>
This Unit will show you how to use Perception and Object Recognition to get the position of graspable objects.

<p style="background:green;color:white;">END OF SUMMARY</p>

In the previous Chapter, you learned how to perform blob tracking using the robot's RGB camera. However, that blob tracking gave you the position of the object in 2D image space, not in 3D. Although it is possible to estimate the 3D position from the RGB image alone (for example, by using the area of the blob), this requires some extra work and is not very accurate. So, the common practice for getting the 3D position of an object in an environment is to use the depth (PointCloud) data of the sensor. And that is what you are going to do in this Unit!
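To see why depth data makes this easier, here is a minimal sketch of how a pixel plus a depth reading can be turned into a 3D point with the standard pinhole camera model. The intrinsics (`FX`, `FY`, `CX`, `CY`) and the pixel values are made-up numbers for illustration only; they are not read from the simulated camera.

```python
def pixel_to_point(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with a depth in metres into the camera optical frame.

    Returns (x, y, z) with x pointing right, y pointing down and z pointing forward.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 camera (assumption, not the robot's real values).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# A blob centred at pixel (420, 240) that the depth image reports as 1.0 m away.
point = pixel_to_point(420, 240, 1.0, FX, FY, CX, CY)
print(point)
```

With only the RGB image, the `depth_m` value would have to be guessed from the apparent size of the blob, which is where the inaccuracy comes from.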

One of the most useful perception skills is being able to recognise objects. This allows you to create robots that can grasp objects and understand the world around them a little bit better.
<br><br>
There are two main skills to master here:<br>

* **Recognise flat surfaces**: This allows the robot to detect places where objects usually are. For instance, tables or shelves. It's the first step to take when searching for objects.

* **Recognise objects**: Once you know where to look, you have to be able to recognise different objects in the scene and localise where they are placed in the environment.

For this Unit, we are going to focus on Object Recognition, since the Turtlebot3 is a small robot and is unlikely to be detecting tables.

Also, as you may have seen, we have modified the simulation in order to make it easier to recognise objects in the environment.

* The ball to grasp now has a black and white texture, which makes it easier for the camera to detect.

<img src="img/ball_with_texture.png" width="400" />

* Also, we introduce you to your new robotic friend: the Turtlebot3 Waffle with the OpenManipulator!

So... with the proper introductions made, let's start working!

## Let's get some pictures!

The first step will be to take some pictures of the object we want to grasp, in order to detect some key points that define the object. With these key points, we will be able to detect the object later by comparing the pictures taken with what the camera is currently seeing.

For this purpose we will use the **find_object_2d** package. So, in order to see how to do this, just follow the next exercise!

<p style="background:#EE9023;color:white;">Exercise 5.1</p>

a) The first step is to create your own object recognition package:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #1</p>
</th>
</tr>
</table>

In [None]:
roscd;cd ../src

In [None]:
catkin_create_pkg my_object_recognition_pkg rospy object_recognition_core

b) Inside your object recognition package, create a new launch file called **start_find_object_2d.launch**. Copy the following code inside it:

In [None]:
<?xml version="1.0" encoding="UTF-8"?>

<launch> 
    <arg name="camera_rgb_topic" default="/camera/rgb/image_raw" />
    <node name="find_object_2d_node" pkg="find_object_2d" type="find_object_2d" output="screen">
        <remap from="image" to="$(arg camera_rgb_topic)"/>
    </node>
    
</launch> 

As you can see, you just need to set the RGB camera image source and the system will be ready to go. In this case, it's **/camera/rgb/image_raw**.

c) Launch this file and go to the Graphical Tools tab. You should see something similar to this:

<img src="img/photos1.png" width="600" />

After a few seconds, you will be able to see the scene.

<img src="img/rec1.png" width="600" />

d) Now, it's time to get some pics of the object we want to grasp! In order to do this, select the **Edit -> Add object from scene** option.

<img src="img/photos3.png" width="400" />

You can also add previously taken images directly, but bear in mind that there are some peculiarities. In this object recogniser, the images appear mirrored when compared with the images from the cameras, so you should be careful with that.

e) In the **Add Object** screen, you just have to follow the steps in order to select the section of the image that you consider to be the object.

Click on the **Take picture** button.

<img src="img/rec2.png" width="600" />

Select the desired section of the image. In this case, it's the ball. Try to make the selection a little bit bigger than the ball itself.

<img src="img/rec3.png" width="600" />

Finally, click the **End** button.

Great! Once done, the object should be detected in the scene. The system compares the images received from the camera with the saved ones and looks for matches. If enough key points match, it considers that it has found the desired object.

<img src="img/rec4.png" width="600" />

f) So, the last step will be to save all of the objects added. There are 2 main ways of doing this:<br>

* Saving the objects as images: **File -> Save Objects**. This will save all of the images taken in a folder.
* Saving the whole session: **File -> Save Session**. This will save a binary file with all of the images and settings. This is the most compact way of doing it, although you won't have access to the images of the objects. Which one you choose depends on your needs.

For now, let's just save the whole session. Inside your package, create a new folder named **saved_pictures**, and save the session inside this folder. You can name it **ball_session**.

<img src="img/rec5.png" width="600" />

<p style="background:#EE9023;color:white;">End of Exercise 5.1</p>
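The "enough matching points" idea from the exercise can be illustrated with a toy sketch. This is NOT the algorithm that find_object_2d runs internally; it is a simplified illustration that assumes binary descriptors stored as integers, compares them with the Hamming distance, and uses hypothetical thresholds (`max_distance`, `min_matches`) and made-up descriptor values.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary descriptors."""
    return bin(a ^ b).count("1")


def count_matches(object_descriptors, scene_descriptors, max_distance=2):
    """Count object descriptors whose nearest scene descriptor is close enough."""
    if not scene_descriptors:
        return 0
    matches = 0
    for obj_desc in object_descriptors:
        best = min(hamming(obj_desc, scene_desc) for scene_desc in scene_descriptors)
        if best <= max_distance:
            matches += 1
    return matches


def is_detected(object_descriptors, scene_descriptors, min_matches=3):
    """The object counts as detected when enough of its key points are matched."""
    return count_matches(object_descriptors, scene_descriptors) >= min_matches


# Made-up 8-bit descriptors of the saved ball picture and of two camera frames.
ball = [0b10110010, 0b01101100, 0b11110000, 0b00011101]
scene_with_ball = [0b10110011, 0b01101100, 0b11010000, 0b00000000, 0b11111111]
scene_without_ball = [0b00000000, 0b11111111]

print(is_detected(ball, scene_with_ball))
print(is_detected(ball, scene_without_ball))
```

This also shows why a textured ball helps: a uniform surface produces few distinctive key points, so there is little to match against.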

So, now that your session is stored, you will want to start every object recognition session with all of that stored data already loaded. In order to do so, just follow the next exercise!

<p style="background:#EE9023;color:white;">Exercise 5.2</p>

a) Create a new launch file inside your package named **start_find_object_3d_session.launch**, and copy the following content into it.

In [None]:
<launch>
		
	<node name="find_object_3d" pkg="find_object_2d" type="find_object_2d" output="screen">
		<param name="gui" value="true" type="bool"/>
		<param name="settings_path" value="~/.ros/find_object_2d.ini" type="str"/>
		<param name="subscribe_depth" value="true" type="bool"/>
		<param name="session_path" value="$(find my_object_recognition_pkg)/saved_pictures/ball_session.bin" type="str"/>
		<param name="objects_path" value="" type="str"/>
		<param name="object_prefix" value="object" type="str"/>
		
		<remap from="rgb/image_rect_color" to="/camera/rgb/image_raw"/>
		<remap from="depth_registered/image_raw" to="/camera/depth/image_raw"/>
		<remap from="depth_registered/camera_info" to="/camera/depth/camera_info"/>
	</node>
	
</launch>

b) Launch the file. You should then see the TF of the detected object being published. If you have multiple images of the same object, you will get multiple object frames. It's up to you to filter them.

<img src="img/tfs2.png" width="500" />

<img src="img/tfs3.png" width="500" />

<img src="img/tfs1.png" width="500" />

c) You can also see the object detected by executing the following command in another terminal while the previous launch file is running:

<table style="float:left;background: #407EAF">
<tr>
<th>
<p class="transparent">Execute in WebShell #2</p>
</th>
</tr>
</table>

In [None]:
rosrun find_object_2d print_objects_detected

<img src="img/object_detected.png" width="1000" />

<p style="background:#EE9023;color:white;">End of Exercise 5.2</p>
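If you would rather process the detections yourself than read the output of `print_objects_detected`, the sketch below parses the flat array that find_object_2d publishes on its `objects` topic. It assumes the layout described in the package documentation: 12 floats per object (`id, width, height, h11 ... h33`), with the homography in Qt `QTransform` order so that the translation sits in `h31` and `h32`. Please check this against your installed version. The `sample` data is made up.

```python
FIELDS_PER_OBJECT = 12  # id, width, height, then the 9 homography terms


def parse_objects(data):
    """Split the flat array into one dict per detected object."""
    if len(data) % FIELDS_PER_OBJECT != 0:
        raise ValueError("unexpected array length: %d" % len(data))
    objects = []
    for i in range(0, len(data), FIELDS_PER_OBJECT):
        objects.append({
            "id": int(data[i]),
            "width": data[i + 1],
            "height": data[i + 2],
            "homography": list(data[i + 3:i + 12]),
        })
    return objects


def map_point(h, x, y):
    """Map a point of the saved object picture into the scene image.

    Assumes QTransform ordering: h = [h11, h12, h13, h21, h22, h23, h31, h32, h33].
    """
    w = h[2] * x + h[5] * y + h[8]
    return ((h[0] * x + h[3] * y + h[6]) / w,
            (h[1] * x + h[4] * y + h[7]) / w)


def object_centre(obj):
    """Pixel position of the centre of the object in the scene image."""
    return map_point(obj["homography"], obj["width"] / 2.0, obj["height"] / 2.0)


# Made-up message: object 8, 100x80 px picture, shifted by (200, 150) px in the scene.
sample = [8.0, 100.0, 80.0,
          1.0, 0.0, 0.0,
          0.0, 1.0, 0.0,
          200.0, 150.0, 1.0]
detected = parse_objects(sample)
print(detected[0]["id"], object_centre(detected[0]))
```

In a real node, `data` would be the `data` field of the message received in a subscriber callback.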

Awesome! So now we are able to get the TF of the object we want to grasp. But... how can we get the position of that object? That, in fact, is the data we really need if we want to grasp it.

Well, as you have seen in the previous exercise, we now have the TF from the detected object to the **camera_link** being published. And, obviously, we also have the TF from this camera_link frame to the **base_footprint** frame, which represents the base of the robot. So... with this TF data being published, we can already know the position of that object relative to the base of the robot!
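TF does this chaining for you, but it may help to see what happens underneath. The following is a minimal sketch that composes two homogeneous transforms in plain Python. The numeric offsets are hypothetical, not measured on the simulated robot, and rotations are limited to yaw to keep the example short.

```python
import math


def make_transform(x, y, z, yaw=0.0):
    """4x4 homogeneous transform: rotation of `yaw` radians about Z plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a * b of two 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(t):
    """Extract the (x, y, z) translation of a 4x4 transform."""
    return (t[0][3], t[1][3], t[2][3])


# Hypothetical values (assumptions for illustration only).
base_to_camera = make_transform(0.06, 0.0, 0.10)       # camera_link seen from base_footprint
camera_to_object = make_transform(0.50, -0.05, -0.08)  # object seen from camera_link

# Chaining both gives the object seen from base_footprint.
base_to_object = compose(base_to_camera, camera_to_object)
print(translation(base_to_object))
```

Because the camera in this sketch is not rotated with respect to the base, the result is simply the sum of both offsets. With a rotated camera the matrix product takes care of the rotation as well.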

So, in order to get the position of the object, you just need to check the value of its TF relative to the **base_footprint** frame. You can check that by using the following command:

In [None]:
rosrun tf tf_echo base_footprint <object_frame>

So, if the frame of your object is named **object_1**, the command would be:

In [None]:
rosrun tf tf_echo base_footprint object_1

After a few seconds, you will get something like this:

<img src="img/object_tf.png" width="600" />
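If you want to read this position from your own code instead of from the terminal, a `tf` listener can do the same lookup. The following is a minimal sketch, assuming the ROS 1 Python `tf` API and the frame names used in the command above. The node name `object_position_listener` and the helper `distance_and_bearing` are made up for this example, and you should replace `object_1` with the frame name of your own object.

```python
import math


def distance_and_bearing(x, y):
    """Planar distance (m) and bearing (rad, positive to the left) of a point in the base frame."""
    return math.hypot(x, y), math.atan2(y, x)


def main(base_frame="base_footprint", object_frame="object_1"):
    # Imported here so the helper above can be used without a ROS installation.
    import rospy
    import tf

    rospy.init_node("object_position_listener")  # hypothetical node name
    listener = tf.TransformListener()
    rate = rospy.Rate(1.0)
    try:
        while not rospy.is_shutdown():
            try:
                # Pose of object_frame expressed in base_frame, latest available.
                trans, _rot = listener.lookupTransform(base_frame, object_frame, rospy.Time(0))
            except (tf.LookupException, tf.ConnectivityException, tf.ExtrapolationException):
                rospy.logwarn("No transform from %s to %s yet", base_frame, object_frame)
            else:
                dist, bearing = distance_and_bearing(trans[0], trans[1])
                rospy.loginfo("%s: x=%.3f y=%.3f z=%.3f, distance=%.3f m, bearing=%.1f deg",
                              object_frame, trans[0], trans[1], trans[2],
                              dist, math.degrees(bearing))
            rate.sleep()
    except rospy.ROSInterruptException:
        pass


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("rospy/tf not found; run this from a sourced ROS workspace")
```

The distance and bearing are just one convenient way of summarising the translation; for grasping you will most likely use the full `x`, `y`, `z` values.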