# `S1`: Sensor Lab 1: Pose Estimation

Pose estimation refers to computer vision techniques that detect human figures in images and videos, so that one can determine, for example, where someone’s elbow appears in an image. Note that pose estimation only estimates where key body joints are; it does not recognize who is in an image or video.

In this lab we will be working with the [Raspberry Pi 4](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/), the [Pi Camera](https://projects.raspberrypi.org/en/projects/getting-started-with-picamera), and a [Coral USB Accelerator](https://coral.ai/products/accelerator/).

__Outline:__
* [1. Connect the Camera Module](#Ch1)
* [2. Try to Control the Camera with Python Code](#Ch2)
* [3. Create a Capture Booth GUI to Register Participants' Body Pictures](#Ch3)
* [4. Capture Participants' Poses](#Ch4)
    * [4.1. Classification](#Ch41)
    * [4.2. PoseNet](#Ch42)
    * [4.3. Save Pose Data to a CSV](#Ch43)



## 1. Connect the Camera Module <a class="anchor" id="Ch1"></a>

<div>
<img src="images/Camera_and_pi_4.png" width="300">
</div>

Ensure your Raspberry Pi is turned off.
1. Locate the Camera Module port
<div>
<img src="images/pi4-camera-port.png" width="500">
</div>

2. Gently pull up on the edges of the port’s plastic clip
<div>
<img src="images/pull_edges.png" width="300">
</div>

3. Insert the Camera Module ribbon cable; make sure the connectors at the bottom of the ribbon cable are facing the contacts in the port
<div>
<img src="images/facing_backwards.png" width="300">
</div>

4. Push the plastic clip back into place

5. Start up your Raspberry Pi
6. Go to the main menu and open the __Raspberry Pi Configuration__ tool.
7. Select the __Interfaces__ tab and ensure that the __camera__ is __enabled__
8. Reboot your Raspberry Pi


## 2. Try to Control the Camera with Python Code <a class="anchor" id="Ch2"></a>

The Python `picamera` library allows you to control your Camera Module. 

- Open a new file in the editor (e.g. in Mu) and save it as `camera_example.py`. __⚠️ Warning:__ never save the file as `picamera.py`! Python would then import your own file instead of the `picamera` library.
- Try the following code on your Pi:


```python
from picamera import PiCamera
from time import sleep

camera = PiCamera()

camera.start_preview()
sleep(5)
camera.stop_preview()
```

Save and run this program. The camera preview should be shown for five seconds and then close again. 

> ℹ️ **Note**: the camera preview only works when a monitor is directly connected to your Raspberry Pi. If you are using remote access (such as SSH or VNC), you won’t be able to see the camera preview. You can work around this by saving an image and viewing that instead (the next steps of this lab).

> ❓ **Test Yourselves**: Try to describe line-by-line what this Python code is doing.

<br />

> 🏆 **Challenge**: Save a picture from the camera by using the `camera.capture()` function. Save the image as `capture.jpg` in `/home/pi/Desktop`
>
> (note: it’s important to sleep for at least two seconds _before_ capturing an image, to give the camera time to adjust to the room's light levels.)


```python
# write your own code here #
```

If your picture is upside-down, you can rotate it by 180 degrees by adding the following:

```python
camera.rotation = 180
```

You can rotate the image by 90, 180, or 270 degrees. To reset the image, set `camera.rotation` to 0 degrees.

The Python `picamera` software provides a number of effects and configurations to change how your images look. Check out the following website to find some examples:
https://projects.raspberrypi.org/en/projects/getting-started-with-picamera/7

All documentation on the PiCamera project can be found here:
https://picamera.readthedocs.io/en/release-1.13/index.html

## 3. Create a Capture Booth GUI to Register Participants' Body Pictures <a class="anchor" id="Ch3"></a>

Now that you have gotten to know the `picamera` library a bit better, you will make a simple GUI that you can use to capture pictures of your participants' (clothed 😉) bodies.

> 🏆 **Challenge**: Use `guizero` (see [L3](../L3_PythonGUIsAndHardware/L3_PythonGUIsAndHardware.ipynb)) to create a capture booth GUI.
>
> - It should request a participant ID (e.g. via a text box in which you type an ID like  `P01`)
> - It should have a button that, when pressed, causes the application to take a picture of the participant's body
> - The picture should be saved as a file with a relevant name (e.g. `P01_front.png`) in a folder called `participants`
> - It should show the captured picture in the GUI
>
> ℹ️ **Note**:
>
> - The face of the participant should not be shown in the picture. Make sure that the camera only captures a picture of the body by verbally instructing the participant on where to stand in front of the camera (or move the camera around).
>
> 💡 **Tips**:
>
> - Start by creating your GUI layout without actually implementing the functionality. E.g. use a placeholder image where the participant's image will ultimately go, and later substitute it for the actual picture.

```python
# fill out your own code here
```

## 4. Capture Participants' Poses <a class="anchor" id="Ch4"></a>

Pose estimation is the task of using a machine learning model to estimate the pose of a person from an image or a video by estimating the spatial locations of key body joints (keypoints). In this lab you are going to set up a __portable pose estimation lab__ using the __Raspberry Pi 4__, the __Pi Camera__, and the __Coral USB Accelerator__. The __Coral USB Accelerator__ is a USB device that provides an __Edge TPU__ as a coprocessor for your device. It accelerates inference for your machine learning models when attached to a Linux, Mac, or Windows host computer.

The pose estimation model takes a processed camera image as input and outputs information about keypoints. Each detected keypoint is indexed by a part ID and has a confidence score between 0.0 and 1.0. The confidence score indicates the probability that a keypoint exists in that position.

There are two TensorFlow Lite pose estimation models:
- MoveNet: the state-of-the-art pose estimation model, available in two flavors: Lightning and Thunder.
- PoseNet: the previous generation pose estimation model released in 2017.

In this lab you'll work with the __PoseNet__ model.

The various body joints detected by the pose estimation model are tabulated below:


| Id | Part |
| --- | --- |
|0|	nose |
|1|	leftEye |
|2|	rightEye |
|3|	leftEar |
|4|	rightEar |
|5|	leftShoulder|
|6|	rightShoulder|
|7|	leftElbow|
|8|	rightElbow|
|9|	leftWrist|
|10|	rightWrist|
|11	| leftHip|
|12	| rightHip|
|13	| leftKnee|
|14	| rightKnee|
|15|	leftAnkle|
|16	| rightAnkle|
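
When you process keypoints in your own code, it can help to translate between part IDs and part names. The following is a minimal sketch that simply copies the table above into a Python list; the names `KEYPOINT_NAMES`, `part_name`, and `part_id` are made up for this illustration and are not part of the PoseNet project.

```python
# Part names in ID order, copied from the table above.
KEYPOINT_NAMES = [
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
]


def part_name(part_id):
    """Return the part name for a keypoint ID (0 to 16)."""
    return KEYPOINT_NAMES[part_id]


def part_id(name):
    """Return the keypoint ID for a part name such as 'leftElbow'."""
    return KEYPOINT_NAMES.index(name)


print(part_name(7))           # the part with ID 7
print(part_id("rightAnkle"))  # the ID of the right ankle
```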



![PoseNetExample](images/PoseNet_example.png)
source: https://www.tensorflow.org/lite/examples/pose_estimation/overview


### 4.1 Classification <a class="anchor" id="Ch41"></a>

We will first run some general examples to check whether the required libraries are installed. In this example we are going to classify the following image:

<div>
<img src="images/parrot.jpg" width="200">
</div>


1. Make sure your Raspberry Pi is switched off.
2. Connect the Raspberry Pi to:
    - Pi Camera
    - Coral USB Accelerator
3. Now connect the _charger_ to switch on the Raspberry Pi.
4. Connect to the Raspberry Pi via VNC Viewer (if you are not connected to an external screen).

#### 4.1.1 Check if TensorFlow works for image classification:
First, navigate to the folder containing the test by typing the following in the Terminal:

```bash
cd coral/tflite/python/examples/classification
```
    
Then we will try to classify an example image with a TensorFlow model. Copy-paste the following code and press enter:

```bash
python3 classify_image.py \
  --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite \
  --labels models/inat_bird_labels.txt \
  --input images/parrot.jpg
```

You should get something like this:

```text
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
317.5ms
288.7ms
286.4ms
286.4ms
286.5ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.77734
```

#### 4.1.2 Check if the Coral USB Accelerator works for image classification:

Now we will check that the Coral USB Accelerator is available by running the same test again. This time we use a model that is compiled to run on the Edge TPU.

Copy-paste the following code and run by pressing enter:

```bash
python3 classify_image.py \
  --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
  --labels models/inat_bird_labels.txt \
  --input images/parrot.jpg
```
    
You should see a result like this:

```text
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
11.8ms
3.0ms
2.8ms
2.9ms
2.9ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781
```
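
To put a number on the difference between the two runs, you can average the timings while skipping the first inference, which the output itself flags as slow. The following is a minimal sketch; the helper names `parse_timings_ms` and `steady_state_mean` are made up for illustration, and the timing lines are copied from the sample outputs above, so your own numbers will differ.

```python
def parse_timings_ms(output):
    """Collect every line of the form '<number>ms' as a float."""
    timings = []
    for line in output.splitlines():
        line = line.strip()
        if line.endswith("ms"):
            try:
                timings.append(float(line[:-2]))
            except ValueError:
                pass  # ends in "ms" but is not a timing line
    return timings


def steady_state_mean(timings):
    """Average all runs except the first, which includes model loading."""
    rest = timings[1:]
    return sum(rest) / len(rest)


# Timing lines copied from the two sample outputs above.
cpu_output = """----INFERENCE TIME----
317.5ms
288.7ms
286.4ms
286.4ms
286.5ms
"""
tpu_output = """----INFERENCE TIME----
11.8ms
3.0ms
2.8ms
2.9ms
2.9ms
"""

cpu_ms = steady_state_mean(parse_timings_ms(cpu_output))
tpu_ms = steady_state_mean(parse_timings_ms(tpu_output))
print(f"CPU: {cpu_ms:.1f} ms, Edge TPU: {tpu_ms:.1f} ms, "
      f"speed-up: {cpu_ms / tpu_ms:.0f}x")
```

With the sample numbers above this works out to a speed-up of roughly 99x.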

### 4.2 PoseNet <a class="anchor" id="Ch42"></a>

Project repository: https://github.com/google-coral/project-posenet

#### 4.2.1 How does it work? <a class="anchor" id="Ch421"></a>

At a high level, pose estimation happens in two phases:

1. An input RGB image is fed through a convolutional neural network, in our case a MobileNet V1 architecture. Instead of a classification head, however, it has a specialized head that produces a set of heatmaps (one for each kind of keypoint) and some offset maps. This step runs on the Edge TPU. The results are then fed into phase 2.
2. A special multi-pose decoding algorithm decodes poses, pose confidence scores, keypoint positions, and keypoint confidence scores.

If you're interested in the details of the decoding algorithm and how PoseNet works under the hood, you could take a look at the original research paper or this post, which describes the raw heatmaps produced by the convolutional model: https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5
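
To make the role of the heatmaps and offset maps more concrete, here is a minimal sketch of a simplified single-keypoint decoding step: find the heatmap cell with the highest score, scale its grid position by the output stride, and add the offset stored for that cell. This is not the multi-pose decoder that the project runs, which is considerably more involved. The function name `decode_keypoint`, the 3x3 grid, the stride of 16, and all values are made up for illustration.

```python
def decode_keypoint(heatmap, offsets_y, offsets_x, output_stride):
    """Return (x, y, score) for one keypoint.

    heatmap, offsets_y and offsets_x are 2D lists with the same shape.
    """
    best_row, best_col, best_score = 0, 0, heatmap[0][0]
    for row, scores in enumerate(heatmap):
        for col, score in enumerate(scores):
            if score > best_score:
                best_row, best_col, best_score = row, col, score
    # Coarse grid position in pixels, refined by the offset for that cell.
    y = best_row * output_stride + offsets_y[best_row][best_col]
    x = best_col * output_stride + offsets_x[best_row][best_col]
    return x, y, best_score


# Toy 3x3 heatmap for one keypoint (made-up scores).
heatmap = [
    [0.1, 0.2, 0.1],
    [0.1, 0.3, 0.9],
    [0.0, 0.2, 0.4],
]
# Offsets are zero everywhere except at the strongest cell (row 1, column 2).
offsets_y = [[0.0, 0.0, 0.0], [0.0, 0.0, 4.0], [0.0, 0.0, 0.0]]
offsets_x = [[0.0, 0.0, 0.0], [0.0, 0.0, -2.0], [0.0, 0.0, 0.0]]

x, y, score = decode_keypoint(heatmap, offsets_y, offsets_x, output_stride=16)
print(x, y, score)
```

In this toy example the strongest cell is at row 1, column 2, so the keypoint lands at x = 2 * 16 - 2 = 30 and y = 1 * 16 + 4 = 20.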

#### 4.2.2 Important PoseNet Concepts <a class="anchor" id="Ch422"></a>

<div> <img src="images/keypoints.png" width="800"></div>

| Concept | Description |
| ------- | ----------- |
| Pose    | At the highest level, PoseNet will return a pose object that contains a list of keypoints and an instance-level confidence score for each detected person. |
| Keypoint | A part of a person’s pose that is estimated, such as the nose, right ear, left knee, right ankle, etc. It contains both a position and a keypoint confidence score. PoseNet currently detects 17 keypoints, illustrated in the diagram above. |
| Keypoint Confidence Score | This determines the confidence that an estimated keypoint position is accurate. It ranges between 0.0 and 1.0. It can be used to hide keypoints that are not deemed strong enough. |
| Keypoint Position | 2D x and y coordinates in the original input image where a keypoint has been detected. |
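
The concepts in this table can be sketched as a small data structure. The containers `Keypoint` and `Pose` below are hypothetical stand-ins that mirror the table; the PoseNet project defines its own types, which may be laid out differently. The function `confident_keypoints` and the threshold of 0.5 are also assumptions, chosen only to show how a confidence score can be used to hide weak keypoints.

```python
from collections import namedtuple

# Hypothetical containers mirroring the concepts in the table above.
Keypoint = namedtuple("Keypoint", ["x", "y", "score"])
Pose = namedtuple("Pose", ["keypoints", "score"])


def confident_keypoints(pose, min_score=0.5):
    """Keep only the keypoints whose confidence score reaches min_score."""
    return {
        name: kp for name, kp in pose.keypoints.items()
        if kp.score >= min_score
    }


# One made-up pose with three of the 17 keypoints.
pose = Pose(
    keypoints={
        "nose": Keypoint(x=320.0, y=110.0, score=0.98),
        "leftElbow": Keypoint(x=250.0, y=300.0, score=0.71),
        "rightAnkle": Keypoint(x=400.0, y=470.0, score=0.12),
    },
    score=0.60,
)

visible = confident_keypoints(pose)
print(sorted(visible))  # names of the keypoints that pass the threshold
```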


#### PoseNet Setup

Now that we have verified that TensorFlow and the Coral are working, we will install the required packages for PoseNet. In a terminal window, run the following commands:

- Install `python3-pycoral`:

```bash
sudo apt-get install python3-pycoral
```

- Install `tflite-runtime`:

```bash
python3 -m pip install tflite-runtime
```

- Install the Edge TPU runtime:

```bash
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update  
sudo apt-get install libedgetpu1-std
```    

After installing these packages, you can physically connect your __Coral__ to the Raspberry Pi using the supplied USB-C cable. Because the Coral supports USB 3.0, it should be attached to one of the blue USB ports on the Raspberry Pi 4 to allow the fastest transfer speeds. If you had already attached the Coral, unplug it and connect it again.

Next, download the PoseNet project:

```bash     
cd ~/google-coral
```

```bash
git clone https://github.com/google-coral/project-posenet.git
```
    
If you receive the message `fatal: destination path 'project-posenet' already exists and is not an empty directory.`, PoseNet was already downloaded on your device. In either case, move on to the next step:

```bash
cd /home/pi/google-coral/project-posenet    
sh install_requirements.sh
```   
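
After the installation you may want to confirm from Python that the Coral is visible. The following is a minimal sketch that uses `list_edge_tpus` from PyCoral's `pycoral.utils.edgetpu` module; the wrapper `find_edge_tpus` is made up for this illustration and returns an empty list when PyCoral is not installed, so the script can also run on a machine without the Coral software.

```python
def find_edge_tpus():
    """Return the Edge TPU devices PyCoral can see, or [] if PyCoral is missing."""
    try:
        from pycoral.utils.edgetpu import list_edge_tpus
    except ImportError:
        return []
    return list(list_edge_tpus())


devices = find_edge_tpus()
if devices:
    for device in devices:
        print("Found Edge TPU:", device)
else:
    print("No Edge TPU found (is PyCoral installed and the Coral plugged in?)")
```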

#### 4.2.3 Example PoseNet Code <a class="anchor" id="Ch423"></a>

The example code, `pose_camera.py`, is a camera example that streams the camera's image through PoseNet and draws the pose on top as an overlay. This is a great first example to run to familiarize yourself with the network and its outputs.

Run the demo in a terminal:

```bash
cd /home/pi/google-coral/project-posenet
python3 pose_camera.py
```

If the camera and monitor are both facing you, consider adding the `--mirror` flag:

```bash
python3 pose_camera.py --mirror
```

> ℹ️ **Note**: The GitHub repository contains the following three PoseNet model files for different input resolutions. The larger resolutions are processed more slowly, but they allow a wider field of view, so that poses further away from the camera are processed correctly.
>
> ```text
> posenet_mobilenet_v1_075_721_1281_quant_decoder_edgetpu.tflite
> posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite
> posenet_mobilenet_v1_075_353_481_quant_decoder_edgetpu.tflite
> ```

You can change the camera resolution by using the `--res` parameter:

```bash
python3 pose_camera.py --res 480x360  # fast but low res
python3 pose_camera.py --res 640x480  # default
python3 pose_camera.py --res 1280x720 # slower but high res
```
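
The three `--res` values line up with the three model files listed in the note above. The following is a minimal sketch of how a resolution string could be parsed and paired with a model file. The pairing (smallest model for the smallest resolution) is an assumption based on the file names, and the names `MODEL_FOR_RES`, `parse_res`, and `model_for` are made up for illustration; check `pose_camera.py` for what the script actually does.

```python
# Assumed pairing of each --res value with one of the three model files.
MODEL_FOR_RES = {
    "480x360": "posenet_mobilenet_v1_075_353_481_quant_decoder_edgetpu.tflite",
    "640x480": "posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite",
    "1280x720": "posenet_mobilenet_v1_075_721_1281_quant_decoder_edgetpu.tflite",
}


def parse_res(res):
    """Turn a string such as '640x480' into a (width, height) tuple of ints."""
    width, height = res.lower().split("x")
    return int(width), int(height)


def model_for(res):
    """Return the model file name assumed to match a supported --res value."""
    if res not in MODEL_FOR_RES:
        raise ValueError(f"unsupported resolution: {res}")
    return MODEL_FOR_RES[res]


print(parse_res("1280x720"))
print(model_for("640x480"))
```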

### 4.3 Save Pose Data to a CSV <a class="anchor" id="Ch43"></a>

In the previous section you extracted keypoints from a live video using PoseNet. However, you cannot analyse this data later unless it is saved somewhere.

To do this, you are going to need to know how to generate unique timestamped filenames ([X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb)) and how to write to CSV files ([X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb)). You should then produce a modified version of `pose_camera.py` called `logging_pose_camera.py` that logs (writes) your keypoints to a data file.

> 🏆 **Challenge**: Go through the [X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb) and [X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb) "eXtra Content" materials.
>
> - After going through [X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb), you should know how to write CSV files
> - After going through [X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb), you should know how to generate timestamped file names
> - Save `pose_camera.py` under a different name: `logging_pose_camera.py` (i.e. a logging version of `pose_camera.py`)
> - Combine both techniques and edit `logging_pose_camera.py` such that it writes your keypoints to a timestamped CSV file. The file should be saved as `data/output_yourdatestring.csv`.
> - Make sure to close your CSV file at the end of the program/acquisition
>
> 💡 **Tips**:
>
> - You only need one file (via `open`) and one `csv.writer` for the entire acquisition.
> - `main` is only called once per acquisition.
> - You will need to insert multiple rows during an acquisition - one per frame. 
> - `render_overlay` is called once per frame.

```python
# 💡 tip: you may need to build a row of your CSV cell-by-cell

from datetime import datetime

row = []
row.append(datetime.now())  # append a timestamp in the first column
for label, keypoint in pose.keypoints.items():  # loop through all keypoints
    row.append(keypoint.point[0])  # append the x-coordinate of the keypoint to the row
    row.append(keypoint.point[1])  # append the y-coordinate of the keypoint to the row

# (and then you need to write this row to a CSV using a `csv.writer`: see X1)
```
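
A CSV file is easier to analyse when its first row names the columns. The following is a minimal sketch of a header that matches the row layout in the tip above (a timestamp followed by an x and a y column per keypoint). It assumes that the keypoints come out in the order of the ID table in section 4, which you should verify against your own output. The names `KEYPOINT_NAMES` and `build_header` are made up for illustration, and an in-memory buffer stands in for your actual CSV file.

```python
import csv
import io

# Part names in ID order (assumed to match the order of the keypoints).
KEYPOINT_NAMES = [
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
]


def build_header(names=KEYPOINT_NAMES):
    """Return ['timestamp', 'nose_x', 'nose_y', ...] for the given part names."""
    header = ["timestamp"]
    for name in names:
        header.append(f"{name}_x")
        header.append(f"{name}_y")
    return header


# In-memory stand-in for the CSV file you open in your own script.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(build_header())  # write the header once, before any data rows
print(buffer.getvalue())
```

Writing the header once, right after creating the `csv.writer`, keeps it in line with the tip that you only need one file and one writer per acquisition.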