# Camera

Each drone is equipped with a single Arducam 5-megapixel camera that points
down towards the ground. The camera is used to measure motion in the planar
directions: the `x`, `y`, and yaw velocities of the drone are estimated from
optical flow vectors extracted from the camera images. This is computationally
cheap because these vectors are already calculated by the Raspberry Pi's image
processor for H.264 video encoding. We also use the camera to estimate the
relative position of the drone by estimating the rigid transformation between
two images.


# Part 2: Affine Transformations

## Background Information

In order to estimate the Duckiedrone's position (a 2-dimensional column vector $v = [x \; y]^T$) using the camera, you will need to use affine transformations. An affine transformation $f: \mathbb{R}^n \to \mathbb{R}^m$ is any transformation of the form $v \to Av + b$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. The affine transformations we are interested in are *rotation*, *scale*, and *translation* in two dimensions. So, the affine transformations we will look at will map vectors in $\mathbb{R}^2$ to other vectors in $\mathbb{R}^2$.

Let's first look at rotation. We can rotate a column vector $v \in \mathbb{R}^2$ about the origin by the angle $\theta$ by premultiplying it by the following matrix:

$$
\begin{bmatrix}
  \cos \theta & -\sin \theta \\
  \sin \theta & \cos \theta \\
\end{bmatrix}
$$

Let's look at an example. Below we have the vector $[1, 2]^T$. To rotate the vector by $\theta=\frac{2\pi}{3}$, we pre-multiply the vector by the rotation matrix:

$$
  \begin{bmatrix}
  \cos \frac{2\pi}{3} & -\sin \frac{2\pi}{3} \\
  \sin \frac{2\pi}{3} & \cos \frac{2\pi}{3} \\
    \end{bmatrix}
    \begin{bmatrix}
    1 \\
    2 \\
    \end{bmatrix}
    = \begin{bmatrix}
    -2.232 \\
    -0.134 \\
    \end{bmatrix}
$$

A graphical representation of the transformation is shown below. The vector $[1, 2]^T$ is rotated $\frac{2\pi}{3}$ about the origin to get the vector $[-2.232, -0.134]^T$.


```{figure} ../../_images/sensors/rotation.png

Rotating one point about the origin
```
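
The rotation above can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `rotate` is made up for this illustration and is not part of the project code.

```python
import math


def rotate(v, theta):
    """Rotate the 2D vector v = (x, y) about the origin by theta radians."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    # Same arithmetic as premultiplying by [[c, -s], [s, c]].
    return (c * x - s * y, s * x + c * y)


x, y = rotate((1.0, 2.0), 2 * math.pi / 3)
print(round(x, 3), round(y, 3))  # -2.232 -0.134
```

On the drone you would more likely work with NumPy arrays, since that is what OpenCV's Python bindings return, but the arithmetic is the same.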

Next, let's look at how scale is represented. We can scale a vector $v \in \mathbb{R}^2$ by a scale factor $s$ by pre-multiplying it by the following matrix: 

$$
  \begin{bmatrix}
  s & 0 \\
  0 & s \\
    \end{bmatrix}
$$

We can scale a single point $[1, 2]^T$ by a factor of `.5` as shown below:

$$
  \begin{bmatrix}
  .5 & 0 \\
  0 & .5 \\
    \end{bmatrix}
    \begin{bmatrix}
    1 \\
    2 \\
    \end{bmatrix}
    = \begin{bmatrix}
    .5 \\
    1 \\
    \end{bmatrix}
$$

```{figure} ../../_images/sensors/scale1.png

Scaling one point
```

When discussing scaling, it is helpful to consider multiple vectors, rather than a single vector. Let's look at all the points on a rectangle and multiply each of them by the scale matrix individually to see the effect of scaling by a factor of `.5`:

```{figure} ../../_images/sensors/scale2.png

Scaling multiple points
```
Now we can see that the rectangle was scaled by a factor of `.5`.
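
As a small sketch of the same idea in code, the snippet below scales the four corners of a rectangle. The corner coordinates are chosen here for illustration and are not necessarily those used in the figure.

```python
def scale(v, s):
    """Scale the 2D vector v = (x, y) by the factor s."""
    # Same arithmetic as premultiplying by [[s, 0], [0, s]].
    return (s * v[0], s * v[1])


# Hypothetical rectangle corners, listed counter-clockwise.
rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
scaled = [scale(p, 0.5) for p in rectangle]
print(scaled)  # [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
```

Every corner moves halfway towards the origin, so both side lengths are halved.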

What about translation? Remember that an affine transformation is of the form $v \to Av + b$. You may have noticed that rotation and scale are represented by only a matrix $A$, with the vector $b$ effectively equal to 0. We could represent translation by simply adding a vector $b = [dx \; dy]^T$ to our vector $v$. However, it would be convenient if we could represent all of our transformations as matrices, and then obtain a single transformation matrix that scales, rotates, and translates a vector all at once. We could not achieve such a representation if we represent translation by adding a vector.

So how do we represent translation (moving $dx$ in the $x$ direction and $dy$ in the $y$ direction) with a matrix? First, we append a 1 to the end of $v$ to get $v' = [x, y, 1]^T$. Then, we premultiply $v'$ by the following matrix:

$$
  \begin{bmatrix}
  1 & 0 & dx\\
  0 & 1 & dy\\
  0 & 0 & 1\\
    \end{bmatrix}
$$

Even though we are representing our $x$ and $y$ positions with a 3-dimensional vector, we are only ever interested in the first two elements, which represent our $x$ and $y$ positions. The third element of $v'$ is *always* equal to 1. Notice how pre-multiplying $v'$ by this matrix adds $dx$ to $x$ and $dy$ to $y$.

$$
  \begin{bmatrix}
  1 & 0 & dx\\
  0 & 1 & dy\\
  0 & 0 & 1\\
    \end{bmatrix}
     \begin{bmatrix}
    x \\
    y \\
    1 \\
    \end{bmatrix}
    = 
    \begin{bmatrix}
    x + dx \\
    y + dy\\
    1 \\
    \end{bmatrix}
$$

So this matrix is exactly what we want!

As a final note, we need to modify our scale and rotation matrices slightly in order to use them with $v'$ rather than $v$. A summary of the relevant affine transforms is below with these changes to the scale and rotation matrices.

$$
\boxed{
    \text{Rotation:}
  \begin{bmatrix}
  \cos \theta & -\sin \theta & 0 \\
  \sin \theta & \cos \theta & 0 \\
  0 & 0 & 1 \\
    \end{bmatrix}
    \quad \quad
    \text{Scale:}
  \begin{bmatrix}
  s & 0 & 0 \\
  0 & s & 0 \\
  0 & 0 & 1 \\
    \end{bmatrix}
        \quad \quad
    \text{Translation:}
  \begin{bmatrix}
  1 & 0 & dx \\
  0 & 1 & dy \\
  0 & 0 & 1 \\
    \end{bmatrix}
}
$$
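
To make the summary concrete, here is a minimal sketch that builds the three $3 \times 3$ matrices as nested Python lists and applies two of them to the vector $[1, 2, 1]^T$. The helper names (`rotation`, `scaling`, `translation`, `apply`) are invented for this example.

```python
import math


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def scaling(s):
    return [[s, 0, 0], [0, s, 0], [0, 0, 1]]


def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]


def apply(M, v):
    """Premultiply the 3-element vector v by the 3x3 matrix M."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


v = [1.0, 2.0, 1.0]  # the point (1, 2) with a 1 appended
print(apply(translation(3.0, -1.0), v))  # [4.0, 1.0, 1.0]
print(apply(scaling(0.5), v))            # [0.5, 1.0, 1.0]
```

In both cases the third element stays equal to 1, as expected.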

## Estimating Position on the Duckiedrone

Now that we know how rotation, scale, and translation are represented as matrices, let's look at how you will be using these matrices in the sensors project. 

To estimate your drone's position, you will be using a function from OpenCV called <code>estimateRigidTransform</code>. This function takes in two images $I_1$ and $I_2$ and a boolean $B$. The function returns a matrix estimating the affine transform that would turn the first image into the second image. The boolean $B$ indicates whether you want to estimate the effect of shearing on the image, which is another affine transform. We don't want this, so we set $B$ to <code>False</code>.

<code>estimateRigidTransform</code> returns a matrix in the form of:

$$  E = 
  \begin{bmatrix}
  s\cos \theta & -s\sin \theta & dx \\
  s\sin \theta & s\cos \theta & dy \\
  \end{bmatrix}
$$

This matrix should look familiar, but it is slightly different from the matrices we have seen in this section. Let $R$, $S$, and $T$ be the rotation, scale, and translation matrices from the above summary box. Then, $E$ is the same as $TRS$ with the bottom row removed. You can think of $E$ as a matrix that first scales a vector $u = [x, y, 1]^T$ by a factor of $s$, then rotates it by $\theta$, then translates it by $dx$ in the $x$ direction and $dy$ in the $y$ direction, and then removes the 1 appended to the end of the vector to output $u' = [x', y']^T$.
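
The relationship between $E$ and $TRS$ can be verified numerically. The sketch below uses arbitrary example values for $s$, $\theta$, $dx$, and $dy$ (they are assumptions for illustration, not values from the drone) and a small hand-written `matmul` helper.

```python
import math


def matmul(A, B):
    """Multiply two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# Arbitrary example parameters.
s, theta, dx, dy = 2.0, math.pi / 2, 3.0, -1.0
c, sn = math.cos(theta), math.sin(theta)

T = [[1, 0, dx], [0, 1, dy], [0, 0, 1]]
R = [[c, -sn, 0], [sn, c, 0], [0, 0, 1]]
S = [[s, 0, 0], [0, s, 0], [0, 0, 1]]

# Scale first, then rotate, then translate; drop the bottom row to get E.
E = matmul(T, matmul(R, S))[:2]
expected = [[s * c, -s * sn, dx], [s * sn, s * c, dy]]
assert all(math.isclose(E[i][j], expected[i][j], abs_tol=1e-12)
           for i in range(2) for j in range(3))

# (1, 0) scaled by 2 is (2, 0); rotated by 90 degrees is (0, 2); translated is (3, 1).
u = [1.0, 0.0, 1.0]
u_prime = [sum(E[i][j] * u[j] for j in range(3)) for i in range(2)]
print([round(val, 6) for val in u_prime])  # [3.0, 1.0]
```

Swapping the order of the factors (for example $RTS$) generally gives a different matrix, which is why the order scale, rotate, translate matters.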

Wow that was a lot of reading! Now on to the questions...

## Questions
1. Your Duckiedrone is flying over a highly textured planar surface. The Duckiedrone's current $x$ position is $x_0$, its current $y$ position is $y_0$, and its current yaw is $\phi_0$. Using the Raspberry Pi Camera, you take a picture of the highly textured planar surface with the Duckiedrone in this state. You move the Duckiedrone to a different state ($x_1$ is your $x$ position, $y_1$ is your $y$ position, and $\phi_1$ is your yaw) and then take a picture of the highly textured planar surface using the Raspberry Pi Camera. You give these pictures to <code>estimateRigidTransform</code> and it returns a matrix $E$ in the form shown above. 
    
    Write expressions for $x_1$, $y_1$, and $\phi_1$. Your answers should be in terms of $x_0$, $y_0$, $\phi_0$, and the elements of $E$. Assume that the Duckiedrone is initially located at the origin and aligned with the axes of the global coordinate system.

  * Hint 1: Your solution does not have to involve matrix multiplication or other matrix operations. Feel free to pick out specific elements of the matrix using normal 0-indexing, i.e. $E[0][2]$.
  * Hint 2: Use the function arctan2 in some way to compute the yaw.


# Part 4: Estimating Velocity by Summing Optical Flow Vectors

We want to estimate our $x$ and $y$ velocity using the Duckiedrone's camera. Thankfully, optical flow from the Raspberry Pi Camera is calculated on board the Raspberry Pi itself. All we have to do is process the optical flow vectors that have already been calculated for us!

To calculate the $x$ velocity, we have to sum the $x$ components of all the optical flow vectors and multiply the sum by some normalization constant. We calculate the $y$ velocity in the same way. Let $c$ be the normalization constant that allows us to convert the sum of components of optical flow vectors into a velocity.

How do we calculate $c$? Well, it must have something to do with the current height of the drone. Things that are far away move more slowly across your field of view. If a drone is at a height of `.6` and a feature passes through its camera's field of view in 1 second, then that drone is moving faster than another drone at a height of `.1` whose camera also passes over the same feature in 1 second. If we let $a$ be the altitude of the drone, then the drone's normalization constant must be $c = ab$, where $b$ is some number that accounts for the conversion of optical flow vectors multiplied by an altitude to a velocity. You do not have to worry about calculating $b$ (the *flow coefficient*), as it is taken care of for you.

In summary, to calculate the $x$ velocity, we have to sum the $x$ components of the optical flow vectors and then multiply the sum by $ab$. The $y$ velocity is calculated in the exact same way.
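
Written out as a formula, with $(f_{x,i}, f_{y,i})$ denoting the $i$-th of $N$ optical flow vectors (this notation is introduced here for illustration and is not taken from the project code), the summary above reads:

$$
\dot{x} = a\,b \sum_{i=1}^{N} f_{x,i}, \qquad \dot{y} = a\,b \sum_{i=1}^{N} f_{y,i}
$$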

## Questions
1.  The Pi calculates that the optical flow vectors are [5, 4], [1, 2], and [3, 2]. The flow vectors are in the form [$x$-component, $y$-component]. What are your $x$ and $y$ velocities $\dot{x}$ and $\dot{y}$? Your solution will be in terms of $a$, the altitude, and $b$, the flow coefficient.


## Handin

Use [this link](https://classroom.github.com/a/QKoUdfRa) to access the assignment on GitHub classroom. Commit the
files to hand in, as you did in the Introduction assignment.

Your handin should contain the following files:

- `solutions.tex` 
- `solutions.pdf`

(sensors-velocity)=
# Velocity Estimation via Optical Flow

In this part of the project you will create a class that interfaces with the Arducam to extract planar velocities from optical flow vectors.

## Code Structure
To interface with the camera, you will be using the `raspicam_node` library. This library publishes both images and optical flow vectors to ROS topics. You will estimate velocity using the flow vectors, and estimate small changes in position by extracting features from pairs of frames. In the sensors project repo, we've included a script called `student_optical_flow.py`, which you will edit so that it publishes the estimated velocity from the flow vectors.

Similarly, a second script, `student_rigid_transform.py`, is one you will edit so that it subscribes to the image topic and publishes position estimates.

## Analyze and Publish the Sensor Data
On your drones, the chip on the Raspberry Pi dedicated to video processing from the camera calculates motion vectors ([optical flow](https://en.wikipedia.org/wiki/Optical_flow)) automatically for H.264 video encoding. [Click here to learn more](https://www.raspberrypi.org/blog/vectors-from-coarse-motion-estimation/). You will be analyzing these motion vectors in order to estimate the velocity of your drone.

**Exercises**

You will now implement your velocity estimation using optical flow by completing all of the `TODO`'s in `student_optical_flow.py`. There are two methods you will be implementing.

The first method is `setup`, which will be called to initialize the instance variables.

  1. Create a ROS publisher to publish the velocity values.

The perspicacious roboticist may have noticed that the magnitude of the velocity in global coordinates depends on the height of the drone. Add a subscriber to the topic `/pidrone/state` to your `AnalyzeFlow` class and save the z position value to an instance variable in the callback. Use this variable to scale the velocity measurements by the height of the drone (the distance between the camera and what it is perceiving).

  2. Create a ROS subscriber to obtain the altitude (z-position) of the drone for scaling the motion vectors.

The second method is `motion_cb`, which is called every time that the camera gets a set of flow vectors, and is used to analyze the flow vectors to estimate the x and y velocities of your drone.

  1. Estimate the velocities, using the `TODO`'s as a guide.

  2. Publish the velocities.
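
The overall flow of data through the two callbacks might look roughly like the sketch below. It deliberately leaves out all ROS wiring: the class name `FlowVelocityEstimator`, the plain-number altitude argument, and the list-of-tuples flow format are assumptions made for illustration, and the real message types, sign conventions, and flow coefficient come from the project's starter code.

```python
class FlowVelocityEstimator:
    """Illustrative only: turns flow vectors into planar velocities."""

    def __init__(self, flow_coeff):
        self.flow_coeff = flow_coeff  # b, the flow coefficient (given to you)
        self.altitude = 0.0           # a, updated from the state callback

    def altitude_cb(self, z):
        # In the real node this value would come from a ROS state message.
        self.altitude = z

    def motion_cb(self, flow_vectors):
        # Sum each component over all flow vectors.
        sum_x = sum(v[0] for v in flow_vectors)
        sum_y = sum(v[1] for v in flow_vectors)
        # Normalization constant c = a * b.
        c = self.altitude * self.flow_coeff
        return c * sum_x, c * sum_y


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
estimator = FlowVelocityEstimator(flow_coeff=0.5)
estimator.altitude_cb(2.0)
print(estimator.motion_cb([(1, 2), (3, -1)]))  # (4.0, 1.0)
```

Keeping the arithmetic in a plain method like this makes it possible to test the estimate on a laptop before running it on the drone.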

## Check your Measurements
You'll want to make sure that the values you're publishing make sense. To do this, you'll be echoing the values that you're publishing and empirically verifying that they are reasonable.

**Exercises**

Verify your velocity measurements

1. Start up your drone and launch a screen
2. Navigate to \`4 and quit the node that is running
3. Run `rosrun project-sensors-yourGithubName student_optical_flow.py`
4. Enter `rostopic echo /pidrone/picamera/twist`
5. Move the drone by hand to the left and right and forward and backward to verify that the measurements make sense

## Checkoff
1. Verify that the velocity values are reasonable (roughly in the range of -1m/s to 1m/s) and have the correct sign (positive when the drone is moving to the right or up, and negative to the left or down).



(sensors-position)=
# Position Estimation via OpenCV's estimateRigidTransform

In this part of the project you will create a class that interfaces with the picamera to extract planar positions of the drone relative to the first image taken using OpenCV's estimateRigidTransform function.

## Ensure images are being passed into the analyzer
Before attempting to analyze the images, we should first check that the images are being properly passed into the analyze method.

**Exercises**

1. Open `student_rigid_transform_node.py` and print the `data` argument in the method `image_callback`.  Verify you are receiving images from the camera. 

## Analyze and Publish the Sensor Data
To estimate our position we will make use of OpenCV’s [<i>estimateAffinePartial2D</i>](https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga27865b1d26bac9ce91efaee83e94d4dd) function. This returns an affine transformation between two images if they have enough in common to be matched; otherwise, it returns `None`.

**Exercises**

Complete the TODOs in `image_callback`, which is called every time that the camera gets an image, and is used to analyze two images to estimate the x and y translations of your drone.

  2. Save the first image and then compare subsequent images to it using `cv2.estimateAffinePartial2D`, which operates on matched feature points from the two images. (Unlike the older `estimateRigidTransform`, this function has no `fullAffine` argument; it always restricts the estimate to rotation, uniform scale, and translation.)
  3. If you print the output from `estimateAffinePartial2D`, you’ll see a 2x3 matrix when the camera sees what it saw in the first frame, and a `None` when it fails to match. This 2x3 matrix is an affine transform which maps pixel coordinates in the first image to pixel coordinates in the second image. 
  4. Implement the method `translation_and_yaw`, which takes an affine transform and returns the x and y translations of the camera and the yaw.
  5. As with velocity measurements, the magnitude of this translation in global coordinates depends on the height of the drone. Add a subscriber to the topic `/pidrone/state` and save the value to `self.altitude` in the callback. Use this variable to compensate for the height of the camera in your method from step 4, which interprets your affine transform.

## Account for the case in which the first frame is not found
Simply matching against the first frame is not quite sufficient for estimating position because as soon as the drone stops seeing the first frame it will be lost. Fortunately, we have a fairly simple fix for this: compare the current frame with the previous frame to get the displacement, and add the displacement to the position the drone was in at the previous frame. The framerate is high enough and the drone moves slowly enough that we will almost never fail to match on the previous frame.

**Exercises**

Modify your `RigidTransformNode` class to add the functionality described above.

1. Store the previous frame. When `estimateAffinePartial2D` fails to match on the first frame, run `estimateAffinePartial2D` on the previous frame and the current frame.
2. When you fail to match on the first frame, add the displacement to the position in the previous frame. You should use `self.x_position_from_state` and `self.y_position_from_state` (the position taken from the `/pidrone/state` topic) as the previous coordinates.
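
The fallback logic can be sketched independently of OpenCV and ROS. In the snippet below, the function name `update_position` and its arguments are hypothetical: each match result is either an `(x, y)` tuple in meters or `None` when matching failed, and `previous_xy` stands in for the position taken from the state topic.

```python
def update_position(first_frame_xy, prev_frame_dxy, previous_xy):
    """Return a new (x, y) estimate.

    first_frame_xy: position relative to the first frame, or None if no match.
    prev_frame_dxy: displacement since the previous frame, or None if no match.
    previous_xy:    position estimate at the previous frame.
    """
    if first_frame_xy is not None:
        # The first frame is visible: use the absolute measurement.
        return first_frame_xy
    if prev_frame_dxy is not None:
        # Otherwise integrate the displacement onto the previous position.
        return (previous_xy[0] + prev_frame_dxy[0],
                previous_xy[1] + prev_frame_dxy[1])
    # Neither match succeeded: keep the previous estimate.
    return previous_xy


print(update_position((0.5, 0.25), None, (0.0, 0.0)))   # (0.5, 0.25)
print(update_position(None, (0.25, -0.5), (0.5, 0.25)))  # (0.75, -0.25)
```

What to do when both matches fail is a design choice; holding the last estimate, as done here, is just one simple option.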

**Note** The naive implementation simply sets the position of the drone when we see the first frame, and integrates it when we don’t. What happens when we haven’t seen the first frame in a while so we’ve been integrating, and then we see the first frame again? There may be some disagreement between our integrated position and the one we find from matching with our first frame due to accumulated error in the integral, so simply setting the position would cause a jump in our position estimate. The drone itself didn’t actually jump, just our estimate, so this will wreak havoc on whatever control algorithm we write based on our position estimate. To mitigate these jumps, one would normally use a filter to blend the integrated estimate and the new first-frame estimate. Since this project is only focused on publishing the measurements, you do not need to worry about these discrepancies here; you will address this problem in the UKF project.
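
To give a feel for what blending means, here is a deliberately simple weighted average of the two estimates. This is not the UKF and is not required for this project; the function name `blend` and the weight `alpha` are assumptions made for illustration.

```python
def blend(integrated_xy, first_frame_xy, alpha=0.2):
    """Move the integrated estimate a fraction alpha towards the first-frame estimate."""
    return tuple((1 - alpha) * i + alpha * f
                 for i, f in zip(integrated_xy, first_frame_xy))


# With alpha = 0.5 the result is the midpoint of the two estimates.
print(blend((1.0, 2.0), (2.0, 4.0), alpha=0.5))  # (1.5, 3.0)
```

A small `alpha` trusts the integrated estimate and removes the jump slowly, while `alpha = 1` reproduces the naive behaviour of snapping straight to the first-frame estimate.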

## Connect to the JavaScript Interface

Now that we’ve got a position estimate, let’s begin hooking our code
up to the rest of the flight stack.

To connect to the JavaScript interface, clone `pidrone_pkg` on your
base station machine.  Point any web browser at the `web/index.html`
file.  This will open up the web interface that we will be using
the rest of the semester.

  1. Create a subscriber (in the setup function) to the topic `/pidrone/reset_transform` and a callback owned by the class to handle messages. [ROS Empty messages](http://docs.ros.org/lunar/api/std_msgs/html/msg/Empty.html) are published on this topic when the user presses r for reset on the JavaScript interface. When you receive a reset message, you should take a new first frame, and set your position estimate to the origin again.
  2. Create a subscriber to the topic `/pidrone/position_control`. [ROS Bool messages](http://docs.ros.org/lunar/api/std_msgs/html/msg/Bool.html) are published on this topic when the user presses `p` or `v` on the JavaScript interface. When we’re not doing position hold we don’t need to be running this resource-intensive computer vision, so when you receive a message you should enable or disable your position estimation code.
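
One possible shape for these two callbacks is sketched below. The class and attribute names are hypothetical, and treating a `True` message as "enable" is an assumption; check the interface code for the actual convention. The ROS imports are kept inside `setup_subscribers` so that the callback logic can be exercised without ROS installed.

```python
from types import SimpleNamespace


class PositionEstimatorState:
    """Holds only the state touched by the two interface topics (illustrative)."""

    def __init__(self):
        self.first_frame = None     # None means "capture a new first frame"
        self.position = (0.0, 0.0)  # current (x, y) estimate
        self.enabled = False        # whether position estimation should run

    def reset_callback(self, msg):
        # std_msgs/Empty carries no data: receiving it is the whole signal.
        self.first_frame = None
        self.position = (0.0, 0.0)

    def position_control_callback(self, msg):
        # std_msgs/Bool stores its value in the `data` field.
        # Assumption: True means position hold is being switched on.
        self.enabled = bool(msg.data)


def setup_subscribers(state):
    """Wire the callbacks to ROS. Only call this where ROS is installed."""
    import rospy
    from std_msgs.msg import Bool, Empty
    rospy.Subscriber('/pidrone/reset_transform', Empty, state.reset_callback)
    rospy.Subscriber('/pidrone/position_control', Bool,
                     state.position_control_callback)


# Exercise the callbacks with stand-in messages; no ROS required.
state = PositionEstimatorState()
state.first_frame = "some image"
state.position = (0.4, -0.2)
state.position_control_callback(SimpleNamespace(data=True))
state.reset_callback(SimpleNamespace())
print(state.enabled, state.first_frame, state.position)  # True None (0.0, 0.0)
```

In the image callback you would then return early whenever `enabled` is false, and store the incoming image as the new first frame whenever `first_frame` is `None`.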

## Measurement Visualization
Debugging position measurements can also be made easier through the use of a visualizer. A few things to look for are sign of the position, magnitude of the position, and the position staying steady when the drone isn't moving. Note again that these measurements are unfiltered and will thus be noisy; don't be alarmed if the position jumps when it goes from not seeing the first frame to seeing it again.

**Exercises**

Use the web interface to visualize your position estimates

1. Connect to your drone and start a new screen
2. Run `rosrun project-sensors-yourGithubName student_rigid_transform_node.py` in \`4.
3. Hold your drone up about 0.25 m with your hand
4. In the web interface, press `r` and then `p` to engage position hold.
5. Use `rostopic echo /pidrone/picamera/pose` to view the output of your <i>student_analyze_phase</i> class
6. Move your drone around by hand to verify that the values make sense.
7. Look at the web interface and see if it tracks your drone. Pressing `r` should set the drone visualizer back to the origin.

## Checkoff 
1. Verify that the position values are reasonable (roughly in the range of -1m to 1m) and have the correct sign (positive when the drone is moving to the right or up, and negative to the left or down).
