# `S1`: Sensor Lab 1: Pose Estimation

Pose estimation refers to computer vision techniques that detect human figures in images and videos, so that one can determine, for example, where someone’s elbow appears in an image. Note that pose estimation only estimates where key body joints are; it does not recognize who is in an image or video.

In this lab we will be working with the [Raspberry Pi 4](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/), the [Pi Camera](https://projects.raspberrypi.org/en/projects/getting-started-with-picamera), and a [Coral USB Accelerator](https://coral.ai/products/accelerator/).

## Outline

* [1. Setup Hardware](#Ch1)
  * [1.1 Connect the Camera Module](#Ch11)
  * [1.2 Connect the USB Coral Accelerator](#Ch12)
  * [1.3 Power Up the Pi](#Ch13)
* [2. Setup Software](#Ch2)
* [3. Try to Control the Camera with Python Code](#Ch3)
* [4. Create a Capture Booth GUI to Register Participants' Body Pictures](#Ch4)
* [5. Capture Participants' Poses](#Ch5)
  * [5.1 Classification](#Ch51)
    * [5.1.1 Check if TensorFlow works for image classification](#Ch511)
    * [5.1.2 Check if the USB Coral Accelerator works for image classification](#Ch512)
  * [5.2 PoseNet](#Ch52)
    * [5.2.1 How does it work?](#Ch521)
    * [5.2.2 Important PoseNet Concepts](#Ch522)
    * [5.2.3 Example PoseNet Code](#Ch523)
  * [5.3 Save Pose Data to a CSV](#Ch53)


## 1. Setup Hardware <a id="Ch1"></a>


### 1.1 Connect the Camera Module <a id="Ch11"></a>

<div>
<img src="images/Camera_and_pi_4.png" width="300">
</div>

Ensure your Raspberry Pi is turned off.

1. Locate the Camera Module port

<div>
<img src="images/pi4-camera-port.png" width="500">
</div>

2. Gently pull up on the edges of the port’s plastic clip

<div>
<img src="images/pull_edges.png" width="300">
</div>

3. Insert the Camera Module ribbon cable; make sure the connectors at the bottom of the ribbon cable are facing the contacts in the port

<div>
<img src="images/facing_backwards.png" width="300">
</div>

4. Push the plastic clip back into place


### 1.2 Connect the USB Coral Accelerator <a id="Ch12" />

1. Make sure your Raspberry Pi is switched off.
2. Plug the USB Coral Accelerator dongle into a **blue** (USB3) USB slot
3. **Note**: the <span style="color:#0000ff">blue</span> USB ports are faster than the non-blue ones


### 1.3 Power Up the Pi  <a id="Ch13" />

1. Connect the USB-C _charger_ to the Pi
2. The Pi will automatically switch on as soon as it has power
3. You should now be able to connect to the Pi via VNC (if you are not connected with a physical external screen).


## 2. Setup Software  <a id="Ch2" />

The lab organizers already went through [X0_SoftwareSetup](../X0_SoftwareSetup/README.md), which sets up the necessary software, before you were given the Pi, so you probably don't need to set up any software yourself.

The configuration script does things like enabling the Pi camera interface and installing `guizero`. If you're curious about what it did, you can read through `s1.py`'s source code in the [pbl](../X0_SoftwareSetup/pbl/pbl) module.

> ℹ️ **Problem With Your Pi?**
>
> The course organizers have tried their best to ensure all the configuration options and software you'll need are already set up before the course begins, but we can miss things. If you find that the Pi isn't working for you, you can try:
>
> - Asking for help
> - Running `pbl test` in the terminal, which runs some basic checks that ensure things like libraries etc. are installed
> - Reinstalling the necessary software by running `sudo pbl install` in the terminal (⚠️ **warning**: takes a long time)
> - Manually going through the legacy setup guide [here](Legacy/S1_LegacySoftwareSetup.ipynb)

## 3. Try to Control the Camera with Python Code <a id="Ch3" />

The Python `picamera` library allows you to control your Camera Module.

1. Open a Python editor on the Raspberry Pi and start a new Python 3 script
2. **Save the script in a new folder: `/home/pbl/Documents/Lab1/CAM_test.py`**. __⚠️ Warning:__ never save the file as `picamera.py`, because Python would then import your script instead of the library!
3. Try the following code on your Pi:

In [None]:
from picamera import PiCamera
from time import sleep

camera = PiCamera()

camera.start_preview()
sleep(5)
camera.stop_preview()

Save and run this program. The camera preview should be shown for five seconds and then close again. 

> ℹ️ **Note**: the camera preview only works when a monitor is directly connected to your Raspberry Pi. If you are using remote access (such as SSH or VNC), you won’t be able to see the camera preview. You can work around this by saving an image and viewing that instead (the next steps of this lab).

> ❓ **Test Yourselves**: Try to describe line-by-line what this python code is doing.

<br />

> 🏆 **Challenge `S1.3`**: Adjust the `CAM_test.py` script to save a picture from the camera by using the `camera.capture()` function. Save the image as `capture.jpg` in a folder you make: **`/home/pbl/Documents/Lab1/Captures`**
>
> ℹ️ **Note**: it’s important to sleep for at least two seconds _before_ capturing an image, to give the camera time to adjust to the room's light levels.


In [None]:
# Add this to your code (the Captures folder must already exist)
camera.capture("/home/pbl/Documents/Lab1/Captures/capture.jpg")

If your picture is upside-down, you can rotate it by 180 degrees by adding the following:

In [None]:
camera.rotation = 180

You can rotate the image by 90, 180, or 270 degrees. To reset the image, set `camera.rotation` to 0 degrees.
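
The warm-up note, the capture call, and the rotation setting can be combined into one small helper. The sketch below is only one way to do it: `capture_still` and `VALID_ROTATIONS` are hypothetical names, and the function accepts any object that has a `rotation` attribute and a `capture()` method, so you can read and try the logic without a camera attached.

```python
import time

# The four rotation values mentioned above.
VALID_ROTATIONS = (0, 90, 180, 270)


def capture_still(camera, path, warmup_s=2.0, rotation=0):
    """Rotate the image, wait for the exposure to settle, then save one still."""
    if rotation not in VALID_ROTATIONS:
        raise ValueError(f"rotation must be one of {VALID_ROTATIONS}, got {rotation}")
    camera.rotation = rotation
    time.sleep(warmup_s)  # give the camera time to adjust to the light level
    camera.capture(path)
    return path


# On the Pi, usage would look roughly like this (not run here):
#
#     from picamera import PiCamera
#     camera = PiCamera()
#     try:
#         capture_still(camera, "/home/pbl/Documents/Lab1/Captures/capture.jpg", rotation=180)
#     finally:
#         camera.close()
```

Rejecting other rotation values early gives a clearer error message than passing them on to the camera.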

The Python `picamera` software provides a number of effects and configurations to change how your images look. Check out the following website to find some examples:
https://projects.raspberrypi.org/en/projects/getting-started-with-picamera/7

All documentation on the PiCamera project can be found here:
https://picamera.readthedocs.io/en/release-1.13/index.html

## 4. Create a Capture Booth GUI to Register Participants' Body Pictures <a id="Ch4" />
Now that you have gotten to know the `picamera` library a bit better, you will make a simple GUI that you can use to capture pictures of your participants' (clothed 😉) bodies. You will do this in the following two challenges (🏆`S1.4a` and 🏆`S1.4b`). **Make a new script: `/home/pbl/Documents/Lab1/CAM_register_participants.py`**.

### 4.1 Step 1
> 🏆 **Challenge `S1.4a`**: Use `guizero` (see [L3](../L3_PythonGUIsAndHardware/L3_PythonGUIsAndHardware.ipynb)) to create a Preview/Save Image GUI. 
> - Create two buttons, `Preview` and `Save image`:
>   - The `Preview` button should cause the camera to show a preview
>   - The `Save image` button should cause the camera to save an image (Tip: use the `camera.capture()` function again). Save the image in a folder.

In [1]:
### Create the GUI described in Challenge S1.4a here


### 4.2 Step 2 
Now that you have made this GUI, you will improve it to create a **Capture Booth GUI to Register Participants' Body Pictures**.

> 🏆 **Challenge `S1.4b`**: Use `guizero` (see [L3](../L3_PythonGUIsAndHardware/L3_PythonGUIsAndHardware.ipynb)) to create a capture booth GUI. 
> - The GUI should request a participant ID (e.g. via a text box in which you type an ID like `P01`) and let you select a registered ID
> - The GUI should have a button that, when pressed, causes the application to take a picture of the participant's body
> - The picture should be saved as a file with a relevant name (e.g. `P01_front.png`) in a folder called `Participants_captures` (`/home/pbl/Documents/Lab1/Participants_captures`).
> - The GUI should show the captured picture in the GUI
> - **The majority of this script is given in the box below. Read through the script carefully!**
> - **Adjust the `capture()` function so that it saves the image in the correct location**
>
> ℹ️ **Note**:
> - The face of the participant should not be shown in the picture. Make sure that the camera only captures a picture of the body by verbally instructing the participant on where to stand in front of the camera (or move the camera around).


In [None]:
### Adjust this code to create the GUI described in Challenge S1.4b here

# Import packages
from picamera import PiCamera
from time import sleep
from guizero import *
import csv
import os.path

# Handling the participant ID is done using class_csv_handler_id 
# This is a custom-made class for this sensor, which was installed
# previously on your Pi in /home/pbl/Desktop/
import sys
sys.path.insert(0, '/home/pbl/Desktop/')
print(sys.path)
from class_csv_handler_id import *

# This creates the GUI with 3 boxes 
app = App(title="Capture Booth", width=800, height=400)
global camera
camera = PiCamera()

box1 = Box(app, align ="left", layout="auto", width=300, height=350)
box2 = Box(app, align ="right", width=500, height=350)
box3 = Box(app, align ="top", width=300, height=50)

# Code for registering a new participant ID
# This function saves participant IDs that are provided through the GUI to a .csv file
def ask_id(): 
    new_id = app.question("New participant", "Enter participant ID")
    if new_id is not None:
        if [new_id] in id_list:
            info("","That participant is already registered")

        else:
            new_id = new_id.split()
            id_handler.write_id("/home/pbl/Documents/Lab1/Participants_captures/Participant_IDs.csv", new_id)
            id_list.append(new_id)
            id_box.append(new_id)
    return id_list

# Code for selecting a participant ID
# This function lets you select a participant ID from the box in the GUI
def select_id(): 
    p_id = id_box.value[0]
    print(p_id, type(p_id))
    return p_id

# Code for capturing an image using the Raspberry Pi camera
def capture(): 
    p_id = select_id() #First, the participant ID should be selected on the GUI
    
    ### YOUR CODE HERE
    # set this variable: 
    # capture_name = 
    ### YOUR CODE HERE

    picture = Picture(box2, image=capture_name, width=300, height=300)
       
# Creates a directory /home/pbl/Documents/Lab1/Participants_captures/ if it does not exist yet
if not os.path.isdir("/home/pbl/Documents/Lab1/Participants_captures/"):
    os.mkdir("/home/pbl/Documents/Lab1/Participants_captures/")
    
# Uses the class_csv_handler_id
id_handler = csv_handler_id()
id_list = id_handler.read_ids("/home/pbl/Documents/Lab1/Participants_captures/Participant_IDs.csv")

# Adds things to the GUI boxes
Text(box1, text="Select participant", size=11)
id_box = ListBox(box1, id_list, command=select_id, scrollbar=True)

button_new = PushButton(box1, text="Register new participant", command=ask_id)
button_cap = PushButton(box1, text="Capture!", command=capture)

# Shows the GUI
app.display()
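
If you are unsure how to build a file name such as `P01_front.png`, the sketch below shows one possible approach. It assumes the participant ID is a plain string; `capture_path`, `CAPTURE_DIR`, and the `view` parameter are hypothetical names that do not appear in the script above.

```python
from pathlib import PurePosixPath

# Folder used in this lab for participant pictures.
CAPTURE_DIR = PurePosixPath("/home/pbl/Documents/Lab1/Participants_captures")


def capture_path(participant_id, view="front", directory=CAPTURE_DIR):
    """Return e.g. .../Participants_captures/P01_front.png for the ID 'P01'."""
    clean_id = str(participant_id).strip()
    if not clean_id:
        raise ValueError("participant ID must not be empty")
    return str(PurePosixPath(directory) / f"{clean_id}_{view}.png")


print(capture_path("P01"))
# → /home/pbl/Documents/Lab1/Participants_captures/P01_front.png
```

Joining the folder and file name with a path object avoids mistakes with missing or doubled `/` characters.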

## 5. Capture Participants' Poses <a id="Ch5"></a>

Pose estimation is the task of using a machine learning model to estimate the pose of a person from an image or a video by estimating the spatial locations of key body joints (keypoints). In this lab you are going to set up a __portable pose estimation lab__ using the __Raspberry Pi 4__, the __Pi Camera__, and the __Coral USB Accelerator__.

The __Coral USB Accelerator__ is a USB device that provides an __Edge TPU__ as a coprocessor for your device. It accelerates inference for your machine learning models when attached to a Linux, Mac, or Windows host computer.

A pose estimation model takes a processed camera image as input and outputs information about keypoints. The detected keypoints are indexed by a part ID, and each has a confidence score between 0.0 and 1.0. The confidence score indicates the probability that a keypoint exists in that position.

There are two TensorFlow Lite pose estimation models:
- MoveNet: the state-of-the-art pose estimation model, available in two flavors: Lightning and Thunder.
- PoseNet: the previous generation pose estimation model released in 2017.

In this lab you'll work with the __PoseNet__ model.

The various body joints detected by the pose estimation model are tabulated below:


| Id | Part |
| --- | --- |
| 0 | nose |
| 1 | leftEye |
| 2 | rightEye |
| 3 | leftEar |
| 4 | rightEar |
| 5 | leftShoulder |
| 6 | rightShoulder |
| 7 | leftElbow |
| 8 | rightElbow |
| 9 | leftWrist |
| 10 | rightWrist |
| 11 | leftHip |
| 12 | rightHip |
| 13 | leftKnee |
| 14 | rightKnee |
| 15 | leftAnkle |
| 16 | rightAnkle |
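
In code, a table like this is often kept as a tuple whose index is the part ID. The snippet below is a small sketch of that idea; `KEYPOINT_NAMES` and `PART_IDS` are hypothetical names chosen for this example and are not taken from the PoseNet code.

```python
# Part names in part-ID order, copied from the table above.
KEYPOINT_NAMES = (
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
)

# Reverse lookup: part name -> part ID.
PART_IDS = {name: part_id for part_id, name in enumerate(KEYPOINT_NAMES)}

print(KEYPOINT_NAMES[7])       # → leftElbow
print(PART_IDS["rightAnkle"])  # → 16
```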



![PoseNetExample](images/PoseNet_example.png)

source: https://www.tensorflow.org/lite/examples/pose_estimation/overview


### 5.1 Classification <a class="anchor" id="Ch51"></a>

We will first run general examples to check if the required libraries are installed. In this example we are going to classify the following image:

<div>
<img src="images/parrot.jpg" width="200">
</div>

#### 5.1.1 Check if TensorFlow works for image classification <a id="Ch511" />

First, navigate to the folder containing our test by typing the following in the terminal:

```bash
cd /opt/coral_example
```
    
Then we will try to classify an example image with a TensorFlow model. Copy-paste the following command and press Enter:

```bash
python3 classify_image.py \
  --model mobilenet_v2_1.0_224_inat_bird_quant.tflite \
  --labels inat_bird_labels.txt \
  --input parrot.jpg
```

You should get something like this:

```text
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
317.5ms
288.7ms
286.4ms
286.4ms
286.5ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.77734
```

#### 5.1.2 Check if the USB Coral Accelerator works for image classification <a id="Ch512" />

Now we will check that the USB Coral Accelerator is available by running the same test again. This time we use a model that is compiled to run on the Edge TPU.

Copy-paste the following command and run it by pressing Enter:

```bash
python3 classify_image.py \
  --model mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
  --labels inat_bird_labels.txt \
  --input parrot.jpg
```
    
You should see a result like this:

```text
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
11.8ms
3.0ms
2.8ms
2.9ms
2.9ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781
```
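
To compare the two runs, you can average the inference times while skipping the first (slower) one. The sketch below uses the example timings printed above; your own numbers will differ, and `steady_state_mean` is a hypothetical helper name.

```python
from statistics import mean

# Example timings (ms) copied from the two outputs shown above.
cpu_ms = [317.5, 288.7, 286.4, 286.4, 286.5]
tpu_ms = [11.8, 3.0, 2.8, 2.9, 2.9]


def steady_state_mean(times_ms):
    """Average inference time, ignoring the first (warm-up) run."""
    return mean(times_ms[1:])


cpu = steady_state_mean(cpu_ms)
tpu = steady_state_mean(tpu_ms)
print(f"CPU: {cpu:.1f} ms, Edge TPU: {tpu:.1f} ms, speed-up: {cpu / tpu:.0f}x")
# → CPU: 287.0 ms, Edge TPU: 2.9 ms, speed-up: 99x
```

With these example values, the Edge TPU model is roughly two orders of magnitude faster per inference than the CPU model.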

### 5.2 PoseNet <a id="Ch52" />

Source: https://github.com/google-coral/project-posenet

#### 5.2.1 How does it work? <a id="Ch521" />

At a high level, pose estimation happens in two phases:

1. An input RGB image is fed through a convolutional neural network. In our case this is a MobileNet V1 architecture. Instead of a classification head, however, there is a specialized head which produces a set of heatmaps (one for each kind of keypoint) and some offset maps. This step runs on the Edge TPU. The results are then fed into step 2.
2. A special multi-pose decoding algorithm is used to decode poses, pose confidence scores, keypoint positions, and keypoint confidence scores.

If you're interested in the details of the decoding algorithm and how PoseNet works under the hood, take a look at the original research paper or this post: https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5, which describes the raw heatmaps produced by the convolutional model.
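
To get a feeling for how a heatmap and its offset maps turn into a keypoint position, here is a simplified sketch of single-pose decoding as described in the blog post linked above. It is not the multi-pose decoder used in this lab. The function name, the toy 3x3 maps, and the output stride of 16 are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Squash a raw heatmap value into a 0.0 to 1.0 confidence score."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_keypoint(heatmap, offsets_y, offsets_x, output_stride):
    """Decode one keypoint from its heatmap and offset maps.

    All three maps are 2D lists with the same shape (rows x columns).
    Returns (y, x, score) with y and x in input-image pixels.
    """
    cells = ((r, c) for r in range(len(heatmap)) for c in range(len(heatmap[0])))
    # 1. Find the heatmap cell with the highest activation.
    row, col = max(cells, key=lambda rc: heatmap[rc[0]][rc[1]])
    # 2. Scale the cell index to image pixels and refine it with the offset.
    y = row * output_stride + offsets_y[row][col]
    x = col * output_stride + offsets_x[row][col]
    return y, x, sigmoid(heatmap[row][col])


# Toy 3x3 maps: the strongest activation is in the centre cell (1, 1).
heatmap = [[-4.0, -3.0, -4.0],
           [-2.0, 3.0, -1.0],
           [-4.0, -3.0, -4.0]]
offsets_y = [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
offsets_x = [[0.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, 0.0]]

y, x, score = decode_keypoint(heatmap, offsets_y, offsets_x, output_stride=16)
print(y, x, round(score, 3))  # → 21.0 14.0 0.953
```

The heatmap gives a coarse location (one cell per 16 pixels in this toy case), and the offset maps refine it to a pixel position.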

#### 5.2.2 Important PoseNet Concepts <a id="Ch522" />

<div> <img src="images/keypoints.png" width="800"></div>

| Concept | Description |
| ------- | ----------- |
| Pose    | At the highest level, PoseNet will return a pose object that contains a list of keypoints and an instance-level confidence score for each detected person. |
| Keypoint | A part of a person’s pose that is estimated, such as the nose, right ear, left knee, right ankle, etc. It contains both a position and a keypoint confidence score. PoseNet currently detects the 17 keypoints illustrated in the diagram above. |
| Keypoint Confidence Score | The confidence that an estimated keypoint position is accurate. It ranges between 0.0 and 1.0. It can be used to hide keypoints that are not deemed strong enough. |
| Keypoint Position | 2D x and y coordinates in the original input image where a keypoint has been detected. |
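
The concepts in this table map naturally onto a few small data structures. The sketch below uses hypothetical `Point`, `Keypoint`, and `Pose` tuples with made-up values, which may not match the classes used by the PoseNet code on your Pi. It shows how a keypoint confidence score can be used to hide weak keypoints.

```python
from collections import namedtuple

# Hypothetical stand-ins for the concepts in the table above.
Point = namedtuple("Point", ["x", "y"])
Keypoint = namedtuple("Keypoint", ["point", "score"])
Pose = namedtuple("Pose", ["keypoints", "score"])


def visible_keypoints(pose, min_score=0.5):
    """Keep only the keypoints whose confidence score reaches min_score."""
    return {name: kp for name, kp in pose.keypoints.items() if kp.score >= min_score}


# Made-up example values for one detected person.
pose = Pose(
    keypoints={
        "nose": Keypoint(Point(320.0, 110.0), 0.98),
        "leftWrist": Keypoint(Point(210.0, 300.0), 0.35),
        "rightWrist": Keypoint(Point(430.0, 305.0), 0.81),
    },
    score=0.71,
)

for name, kp in visible_keypoints(pose).items():
    print(f"{name}: ({kp.point.x:.0f}, {kp.point.y:.0f}) score={kp.score:.2f}")
```

With a threshold of 0.5, the `leftWrist` keypoint in this example is dropped because its score is only 0.35.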

#### 5.2.3 Example PoseNet Code <a id="Ch523" />

> ℹ️ **Note**: PoseNet should already have been installed for you
>
> This is explained in the Software Setup section, but if you are having issues then you can also try to manually go through the legacy notes [here](Legacy/S1_LegacySoftwareSetup.ipynb)

The example code, `pose_camera.py`, is a camera example that streams the camera's image through PoseNet and draws the pose on top as an overlay. This is a great first example to run to familiarize yourself with the network and its outputs.

Run the demo in a terminal:

```bash
cd /opt/project-posenet
python3 pose_camera.py
```

If the camera and monitor are both facing you, consider adding the `--mirror` flag:

```bash
python3 pose_camera.py --mirror
```

> ℹ️ **Note**: The GitHub repository (https://github.com/google-coral/project-posenet.git) contains the following 3 PoseNet model files in `models/mobilenet` for different input resolutions. Larger resolutions are slower to process but allow a wider field of view, so that poses further away are processed correctly.
>
> ```text
> posenet_mobilenet_v1_075_721_1281_quant_decoder_edgetpu.tflite
> posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite
> posenet_mobilenet_v1_075_353_481_quant_decoder_edgetpu.tflite
> ```

You can change the camera resolution by using the `--res` parameter:

```bash
python3 pose_camera.py --res 480x360  # fast but low res
python3 pose_camera.py --res 640x480  # default
python3 pose_camera.py --res 1280x720 # slower but high res
```
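
The numbers in the model file names appear to encode the model's input height and width. The sketch below pairs each `--res` value with the model file that looks closest in size. This pairing is inferred from the file names only and is an assumption, as are the names `MODEL_INPUT_SIZES` and `model_for_resolution`. Check `pose_camera.py` to see how the script really selects its model.

```python
MODEL_TEMPLATE = "posenet_mobilenet_v1_075_{h}_{w}_quant_decoder_edgetpu.tflite"

# Camera resolution -> (height, width) as it appears in the model file name.
# Assumption: pairing inferred from the file names listed above.
MODEL_INPUT_SIZES = {
    "480x360": (353, 481),
    "640x480": (481, 641),
    "1280x720": (721, 1281),
}


def model_for_resolution(res):
    """Return the model file name that is assumed to match a --res value."""
    try:
        height, width = MODEL_INPUT_SIZES[res]
    except KeyError:
        raise ValueError(f"unsupported resolution: {res!r}") from None
    return MODEL_TEMPLATE.format(h=height, w=width)


print(model_for_resolution("640x480"))
# → posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite
```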

### 5.3 Save Pose Data to a CSV <a id="Ch53" />
In the previous section you extracted keypoints from a live video using PoseNet. However, the data cannot be analysed later unless it is saved somewhere.

To do this, you need to know how to generate unique timestamped filenames ([X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb)) and how to write to CSV files ([X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb)). **You will then produce a new Python script, `/home/pbl/Desktop/project-posenet/CAM_logging_data.py`, that writes your keypoint values (and a timestamp) to a CSV file.**

> 🏆 **Challenge `S1.5.3a`**: Go through the [X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb) and [X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb) "eXtra Content" materials. 
>
> - After going through [X1](../X1_WritingCSVFiles/X1_WritingCSVFiles.ipynb), you should know how to write CSV files
> - After going through [X2](../X2_GeneratingTimestampedFilenames/X2_GeneratingTimestampedFilenames.ipynb), you should know how to generate timestamped file names
> - Combine both techniques to write your data to a *timestamped* CSV file
>
>
> 🏆 **Challenge `S1.5.3b`**: **Create a new script: `CAM_logging_data.py`**. This script should create a timestamped CSV file containing the raw data.
> 1. First, verify that you can still run the `pose_camera.py` script in the terminal, as before:
> ```bash
> cd /opt/project-posenet
> python3 pose_camera.py
> ```
> 2. Make a copy of `/opt/project-posenet` on your desktop with `cp -ar /opt/project-posenet ~/Desktop`
> 3. Verify that you can run this `pose_camera.py` script using: 
> ```bash
> cd ~/Desktop/project-posenet
> python3 pose_camera.py
> ```
> 4. Create a script in the new folder: **`/home/pbl/Desktop/project-posenet/CAM_logging_data.py`**
> - Use the script that is provided below to start the `CAM_logging_data.py` script. 
> - Adjust the script to generate a timestamped CSV filename (e.g. `$yourpath$/output_$timestamp$.csv`). Create a list (or other data type) and append a row of keypoint data each time new datapoints are generated by the loop. 
> - Make sure all pose estimation datapoints are written to the CSV file
> - Make sure that when the program stops, you **close** the CSV file
>
>
> 💡 **Tips**:
>
> - You only need one file (via `open`) and one `csv.writer` for the entire acquisition.
> - In the code below, it is indicated where you should add code
> - In the last cell of this notebook, an example is given for the loop that you should write
> - You will need to insert multiple rows during an acquisition - one per frame. 
> - `render_overlay` is called once per frame 
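
As a reminder of the two techniques from X1 and X2, here is a minimal sketch that opens one timestamped CSV file, writes a header, and then writes one row per frame. The helper names, the column names, and the example values are made up, and the file is written to a temporary folder so the sketch can run anywhere. In `CAM_logging_data.py` the rows arrive through `render_overlay`, so there you would open the file before `run(...)` and close it afterwards.

```python
import csv
import os
import tempfile
from datetime import datetime


def timestamped_csv_path(directory, prefix="output"):
    """Build a file name such as output_20240131_142501.csv inside directory."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return os.path.join(directory, f"{prefix}_{stamp}.csv")


def log_frames(path, header, frames):
    """Open the file once, write the header, then write one row per frame."""
    with open(path, "w", newline="") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(header)
        for row in frames:
            writer.writerow(row)
    # Leaving the `with` block closes the file, even if an error occurred.


out_dir = tempfile.mkdtemp()  # stand-in for your own output folder
path = timestamped_csv_path(out_dir)
log_frames(
    path,
    ["date", "nose_x", "nose_y"],
    [
        [str(datetime.now()), 320.0, 110.0],
        [str(datetime.now()), 322.5, 111.0],
    ],
)
print("wrote", path)
```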

In [None]:
import sys
sys.path.insert(0, '/home/pbl/Desktop/project-posenet/')
from pose_camera import *

import time
from datetime import datetime
import csv

n = 0
sum_process_time = 0
sum_inference_time = 0
ctr = 0
fps_counter = avg_fps_counter(30)

### YOUR CODE HERE
# Create filename and open the file
# Create csvwriter
### YOUR CODE HERE

# USE THIS PIECE OF CODE TO GET THE NAMES (KEYS) OF THE KEYPOINTS   
# listkeys = [element for tupl in EDGES for element in tupl]
# listkeys = list(set(listkeys))
# listkeys = [str(p) for p in listkeys]
# listX = [s + '_x' for s in listkeys]
# listY = [s + '_y' for s in listkeys]
# xykeyslist = listX + listY
# USE THIS PIECE OF CODE TO GET THE NAMES (KEYS) OF THE KEYPOINTS   

### YOUR CODE HERE
# Write a header row to your csv file that contains 'date' and all keys
### YOUR CODE HERE

def run_inference(engine, input_tensor):
    return engine.run_inference(input_tensor)

def render_overlay(engine, output, src_size, inference_box):
    global n, sum_process_time, sum_inference_time, fps_counter

    svg_canvas = svgwrite.Drawing('', size=src_size)
    start_time = time.monotonic()
    outputs, inference_time = engine.ParseOutput()
    end_time = time.monotonic()
    n += 1
    sum_process_time += 1000 * (end_time - start_time)
    sum_inference_time += inference_time * 1000

    avg_inference_time = sum_inference_time / n
    text_line = 'PoseNet: %.1fms (%.2f fps) TrueFPS: %.2f Nposes %d' % (
        avg_inference_time, 1000 / avg_inference_time, next(fps_counter), len(outputs)
    )

    shadow_text(svg_canvas, 10, 20, text_line)
    for pose in outputs:
        
        draw_pose(svg_canvas, pose, src_size, inference_box)
        
        ### YOUR CODE HERE
        # Make empty lists 
        # Create loop: for i in pose.keypoints: and append keypoints for x and y
        # Add xpoints to ypoints and insert a datestamp
        # Write the row to csv
        ### YOUR CODE HERE
    
    return (svg_canvas.tostring(), False)

run(run_inference, render_overlay)

### YOUR CODE HERE
# Close the file here
### YOUR CODE HERE


In [None]:
# 💡 tip: you may need to build a row of your CSV cell-by-cell
from datetime import datetime

xpoints = []
ypoints = []
for i in pose.keypoints: # loop through all keypoints
    xpoints.append(pose.keypoints[i].point[0]) # append the x-coordinate of the keypoint to the row 
    ypoints.append(pose.keypoints[i].point[1]) # append the y-coordinate of the keypoint in the row

xypoints = xpoints + ypoints
xypoints.insert(0, str(datetime.now())) # insert a timestamp in the first column 

# (and then you need to write this row to a CSV using a `csv.writer`: see X1)
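
One thing to watch out for: a Python `set` has no guaranteed order, so a header built with `list(set(...))` may not line up with the order in which you loop over `pose.keypoints`. The sketch below shows one way to keep the header and every row aligned by fixing the column order once. `Point`, `Keypoint`, and the three helper functions are hypothetical stand-ins, and the values are made up.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-ins for the objects found in pose.keypoints.
Point = namedtuple("Point", ["x", "y"])
Keypoint = namedtuple("Keypoint", ["point", "score"])


def column_order(keypoints):
    """Fix one order for the keypoint names, reused for the header and all rows."""
    return sorted(keypoints, key=str)


def header_row(order):
    """'date', then all x columns, then all y columns."""
    return ["date"] + [f"{k}_x" for k in order] + [f"{k}_y" for k in order]


def data_row(keypoints, order, timestamp=None):
    """One CSV row for one pose, using the same column order as the header."""
    stamp = timestamp if timestamp is not None else str(datetime.now())
    xpoints = [keypoints[k].point[0] for k in order]
    ypoints = [keypoints[k].point[1] for k in order]
    return [stamp] + xpoints + ypoints


# Made-up keypoints for a single frame.
keypoints = {
    "rightWrist": Keypoint(Point(430.0, 305.0), 0.81),
    "nose": Keypoint(Point(320.0, 110.0), 0.98),
}
order = column_order(keypoints)
print(header_row(order))
print(data_row(keypoints, order, timestamp="2024-01-31 14:25:01"))
```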