# PBL Lab 1: portable pose estimation lab

Pose estimation refers to computer vision techniques that detect human figures in images and videos, so that one can determine, for example, where someone’s elbow shows up in an image. Note that pose estimation only estimates where key body joints are; it does not recognize who is in an image or video. In this lab we will work with the [Raspberry Pi 4](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/), the [Pi Camera](https://projects.raspberrypi.org/en/projects/getting-started-with-picamera), and a [Coral USB Accelerator](https://coral.ai/products/accelerator/).

__Outline:__
* [1. Connect the camera Module](#Ch1)
* [2. Controlling the Camera with Python code](#Ch2)
* [3. Creating a Capture Booth to Register participants' pictures](#Ch3)
* [4. Pose estimation](#Ch4)
    * [4.1. Classification](#Ch41)
    * [4.2. PoseNet](#Ch42)
    * [4.3. Save data to a file](#Ch43)



## 1. Connect the camera Module <a class="anchor" id="Ch1"></a>

<div>
<img src="images/Camera_and_pi_4.png" width="300">
</div>

Ensure your Raspberry Pi is turned off.
1. Locate the Camera Module port
<div>
<img src="images/pi4-camera-port.png" width="500">
</div>

2. Gently pull up on the edges of the port’s plastic clip
<div>
<img src="images/pull_edges.png" width="300">
</div>

3. Insert the Camera Module ribbon cable; make sure the connectors at the bottom of the ribbon cable are facing the contacts in the port
<div>
<img src="images/facing_backwards.png" width="300">
</div>

4. Push the plastic clip back into place

5. Start up your Raspberry Pi
6. Go to the main menu and open the __Raspberry Pi Configuration__ tool.
7. Select the __Interfaces__ tab and ensure that the __camera__ is __enabled__
8. Reboot your Raspberry Pi.


## 2. Controlling the Camera with Python code <a class="anchor" id="Ch2"></a>
The Python __picamera__ library allows you to control your Camera Module. 
1. Open a new file in the editor (e.g. in Mu) and save it as __camera_example.py__
__Note:__ never save the file as __picamera.py__! Python would then import your own file instead of the picamera library.

2. Try the following code on your RP:


In [None]:
from picamera import PiCamera
from time import sleep

camera = PiCamera()

camera.start_preview()
sleep(5)
camera.stop_preview()

__Test yourselves: Try to describe line by line what is happening__

Save and run this program. The camera preview should be shown for five seconds and then close again. 

__Note:__ the camera preview only works when a monitor is connected to your Raspberry Pi. If you are using remote access (such as SSH or VNC), you won’t see the camera preview. Instead, save a picture using __camera.capture()__:
- Save the image as 'capture01.jpg' in the directory '/home/pi/Desktop'.
- It’s important to sleep for at least two seconds _before_ capturing an image, because this gives the camera’s sensor time to sense the light levels.
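
As a starting point for this exercise, the capture step could be wrapped in a small helper. This is a minimal sketch, not the only way to do it: `capture_still` and `warmup_s` are hypothetical names, and the camera is passed in as an argument so the function can be tried without hardware. It relies only on the picamera methods `start_preview()`, `capture()`, and `stop_preview()`.

```python
from time import sleep


def capture_still(camera, path, warmup_s=2.0):
    """Warm up the sensor, save one still image to `path`, and return `path`.

    `camera` is expected to behave like picamera's PiCamera
    (start_preview / capture / stop_preview).
    """
    camera.start_preview()
    try:
        # Give the sensor time to adjust to the light levels before capturing.
        sleep(warmup_s)
        camera.capture(path)
    finally:
        # Always close the preview, even if the capture fails.
        camera.stop_preview()
    return path
```

On the Raspberry Pi this could be called as `capture_still(PiCamera(), '/home/pi/Desktop/capture01.jpg')` after `from picamera import PiCamera`.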


In [None]:
# write your own code here #



If your picture is upside-down, you can rotate it by 180 degrees by adding the following:

In [None]:
camera.rotation = 180

You can rotate the image by 90, 180, or 270 degrees. To reset the image, set rotation to 0 degrees.

The Python picamera software provides a number of effects and configurations to change how your images look. Check out the following website to find some examples:
https://projects.raspberrypi.org/en/projects/getting-started-with-picamera/7

All documentation on the PiCamera project can be found here:
https://picamera.readthedocs.io/en/release-1.13/index.html

## 3. Creating a Capture Booth to Register participants' pictures <a class="anchor" id="Ch3"></a>

Now that you know the picamera library a bit better, make a simple graphical interface (GUI) that you can use to capture pictures of your participants. The GUI should be able to do the following:

- Enter a participant label (e.g. "P01")
- Have a button that takes a picture from the front and automatically saves it ("participant_number + front") in the folder 'participants'
- Put a text overlay "participant_number" over the picture
- Show the picture once it has been taken

Tip:
- First create your layout without functionality; use a placeholder image at the position where the captured image should be placed


__Note!__ you do not want the face of the participant to be recognizable, so make sure that the picture only captures the body by verbally instructing the participant on where to stand in front of the camera (or by moving the camera around).
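
One part of the booth that can be written and tested separately from the GUI is the file name of each picture. The sketch below is only one possible naming scheme: `picture_path` and its parameters are hypothetical, and the "label_view.jpg" pattern is an assumption based on the requirements above.

```python
from pathlib import Path


def picture_path(label, view="front", folder="participants"):
    """Return the path for a participant picture, e.g. participants/P01_front.jpg."""
    label = label.strip()
    if not label:
        raise ValueError("participant label must not be empty")
    directory = Path(folder)
    # Create the folder on first use so saving the picture does not fail.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{label}_{view}.jpg"
```

The returned path can be converted with `str()` and passed to `camera.capture()`. For the text overlay, picamera exposes an `annotate_text` attribute on the camera object.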

In [None]:
# Fill out your own code here #




## 4. Pose estimation <a class="anchor" id="Ch4"></a>

Pose estimation is the task of using a machine learning model to estimate the pose of a person from an image or a video by estimating the spatial locations of key body joints (keypoints). In this lab you are going to set up a __portable pose estimation lab__ using the __Raspberry Pi 4__, the __Pi Camera__, and the __Coral USB Accelerator__. The __Coral USB Accelerator__ is a USB device that provides an __Edge TPU__ as a coprocessor for your device. It accelerates inference for your machine learning models when attached to a Linux, Mac, or Windows host computer.

The pose estimation model takes a processed camera image as input and outputs information about keypoints. The detected keypoints are indexed by a part ID, each with a confidence score between 0.0 and 1.0. The confidence score indicates the probability that a keypoint exists in that position.

There are two TensorFlow Lite pose estimation models:
- MoveNet: the state-of-the-art pose estimation model, available in two flavors: Lightning and Thunder.
- PoseNet: the previous generation pose estimation model released in 2017.

In this lab you'll work with the __PoseNet__ model.

The various body joints detected by the pose estimation model are tabulated below:


| Id | Part |
| --- | --- |
|0|	nose |
|1|	leftEye |
|2|	rightEye |
|3|	leftEar |
|4|	rightEar |
|5|	leftShoulder|
|6|	rightShoulder|
|7|	leftElbow|
|8|	rightElbow|
|9|	leftWrist|
|10|	rightWrist|
|11	| leftHip|
|12	| rightHip|
|13	| leftKnee|
|14	| rightKnee|
|15|	leftAnkle|
|16	| rightAnkle|
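
When you process the model output in Python, the table above can be kept as a simple list, so that the list index is the part ID. This is a small sketch; the names `KEYPOINT_NAMES` and `KEYPOINT_IDS` are chosen here for illustration and do not come from the PoseNet code.

```python
# Part names in ID order, copied from the table above.
KEYPOINT_NAMES = [
    "nose", "leftEye", "rightEye", "leftEar", "rightEar",
    "leftShoulder", "rightShoulder", "leftElbow", "rightElbow",
    "leftWrist", "rightWrist", "leftHip", "rightHip",
    "leftKnee", "rightKnee", "leftAnkle", "rightAnkle",
]

# Reverse lookup: part name -> ID.
KEYPOINT_IDS = {name: idx for idx, name in enumerate(KEYPOINT_NAMES)}

print(KEYPOINT_NAMES[7])          # leftElbow
print(KEYPOINT_IDS["rightKnee"])  # 14
```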



![PoseNetExample](images/PoseNet_example.png)
[source: https://www.tensorflow.org/lite/examples/pose_estimation/overview]


### 4.1 Classification <a class="anchor" id="Ch41"></a>

We will first run a general example to check that the required libraries are installed. In this example we are going to classify the following image:

<div>
<img src="images/parrot.jpg" width="200">
</div>


1. Make sure your RP is switched off.
2. Connect the RP to:
    - Pi Camera
    - USB Coral Accelerator
3. Now connect the _charger_ to switch on the Raspberry Pi
4. Connect to the RP via the VNC Viewer (if you are not connected to an external screen).

#### 4.1.1 Check if TensorFlow works for image classification:
First, navigate to the folder containing our test by typing the following in the Terminal:

    cd coral/tflite/python/examples/classification
    
Then we will classify an example image with a TensorFlow model. Copy-paste the following command and press Enter:

    python3 classify_image.py \
    --model models/mobilenet_v2_1.0_224_inat_bird_quant.tflite \
    --labels models/inat_bird_labels.txt \
    --input images/parrot.jpg

You should get something like this:

        ----INFERENCE TIME----
        Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
        317.5ms
        288.7ms
        286.4ms
        286.4ms
        286.5ms
        -------RESULTS--------
        Ara macao (Scarlet Macaw): 0.77734

#### 4.1.2 Check if the Coral USB Accelerator works for image classification:
Now we will check that the Coral USB Accelerator is available by running the same test again. This time we use a model that is compiled to run on the Edge TPU. Copy-paste the following command and press Enter:

    python3 classify_image.py \
    --model models/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
    --labels models/inat_bird_labels.txt \
    --input images/parrot.jpg
    
You should see a result like this:

    ----INFERENCE TIME----
    Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
    11.8ms
    3.0ms
    2.8ms
    2.9ms
    2.9ms
    -------RESULTS--------
    Ara macao (Scarlet Macaw): 0.75781
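
To see how much the accelerator helps, you can compare the two sample outputs above. The sketch below uses the timings printed in this section (yours will differ) and skips the first inference of each run, since that one includes loading the model.

```python
from statistics import mean

# Inference times in ms, copied from the sample outputs above.
cpu_times = [317.5, 288.7, 286.4, 286.4, 286.5]
tpu_times = [11.8, 3.0, 2.8, 2.9, 2.9]

# Skip the first (warm-up) inference of each run.
cpu_ms = mean(cpu_times[1:])
tpu_ms = mean(tpu_times[1:])
speedup = cpu_ms / tpu_ms

print(f"CPU: {cpu_ms:.1f} ms, Edge TPU: {tpu_ms:.1f} ms, "
      f"speed-up: about {speedup:.0f}x")
# CPU: 287.0 ms, Edge TPU: 2.9 ms, speed-up: about 99x
```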

### 4.2 PoseNet <a class="anchor" id="Ch42"></a>

(https://github.com/google-coral/project-posenet)

#### 4.2.1 How does it work? <a class="anchor" id="Ch421"></a>
At a high level, pose estimation happens in two phases:

1. An input RGB image is fed through a convolutional neural network. In our case this is a MobileNet V1 architecture. Instead of a classification head, however, there is a specialized head which produces a set of heatmaps (one for each kind of keypoint) and some offset maps. This step runs on the Edge TPU. The results are then fed into phase 2.

2. A special multi-pose decoding algorithm is used to decode poses, pose confidence scores, keypoint positions, and keypoint confidence scores.

If you're interested in the details of the decoding algorithm and how PoseNet works under the hood, you could take a look at the original research paper or this post: https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5 which describes the raw heatmaps produced by the convolutional model.
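
To make the role of the heatmaps and offset maps more concrete, here is a simplified sketch of how a single keypoint could be decoded. It follows the simpler single-pose idea described in the linked post (take the highest-scoring heatmap cell, scale it by the output stride, add the offset); it is not the multi-pose decoder used by the Coral model. The function name, the toy numbers, and the output stride of 16 are assumptions for illustration, and the heatmap scores are assumed to be already scaled to the range 0 to 1.

```python
def decode_keypoint(heatmap, offsets_y, offsets_x, output_stride):
    """Decode one keypoint from its heatmap and offset maps.

    heatmap, offsets_y, offsets_x: 2D lists (rows x cols) of floats.
    Returns (x, y, score) in input-image pixel coordinates.
    """
    # 1. Find the heatmap cell with the highest score.
    best_row, best_col, best_score = 0, 0, float("-inf")
    for row, scores in enumerate(heatmap):
        for col, score in enumerate(scores):
            if score > best_score:
                best_row, best_col, best_score = row, col, score

    # 2. Scale the cell index back to image pixels and refine with the offset.
    y = best_row * output_stride + offsets_y[best_row][best_col]
    x = best_col * output_stride + offsets_x[best_row][best_col]
    return x, y, best_score


# Toy 3x3 example with made-up numbers.
heatmap = [[0.1, 0.2, 0.1],
           [0.1, 0.9, 0.3],
           [0.0, 0.2, 0.1]]
offsets_y = [[0.0] * 3, [0.0, 4.0, 0.0], [0.0] * 3]
offsets_x = [[0.0] * 3, [0.0, -2.5, 0.0], [0.0] * 3]

print(decode_keypoint(heatmap, offsets_y, offsets_x, output_stride=16))
# (13.5, 20.0, 0.9)
```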

#### 4.2.2 Important concepts <a class="anchor" id="Ch422"></a>
__Pose__: at the highest level, PoseNet will return a pose object that contains a list of keypoints and an instance-level confidence score for each detected person.

__Keypoint__: a part of a person’s pose that is estimated, such as the nose, right ear, left knee, right ankle, etc. It contains both a position and a keypoint confidence score. PoseNet currently detects 17 keypoints, illustrated in the following diagram:

<div> <img src="images/keypoints.png" width="800"></div>

__Keypoint Confidence Score__: this determines the confidence that an estimated keypoint position is accurate. It ranges between 0.0 and 1.0. It can be used to hide keypoints that are not deemed strong enough.

__Keypoint Position__: 2D x and y coordinates in the original input image where a keypoint has been detected.
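
The concepts above map naturally onto a small data structure. The classes below are illustrative only (they are not the classes used in project-posenet), and the threshold of 0.5 is an arbitrary assumption; the sketch shows how the keypoint confidence score can be used to hide weak keypoints.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Keypoint:
    position: Tuple[float, float]  # (x, y) in input-image pixels
    score: float                   # keypoint confidence, 0.0 to 1.0


@dataclass
class Pose:
    keypoints: Dict[str, Keypoint]  # part name -> keypoint
    score: float                    # instance-level confidence


def confident_keypoints(pose, min_score=0.5):
    """Return only the keypoints whose score reaches min_score."""
    return {name: kp for name, kp in pose.keypoints.items()
            if kp.score >= min_score}


pose = Pose(
    keypoints={
        "nose": Keypoint((301.4, 120.9), 0.98),
        "leftWrist": Keypoint((210.0, 388.2), 0.21),
    },
    score=0.6,
)
print(list(confident_keypoints(pose)))  # ['nose']
```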

Now that we know that TensorFlow and the Coral are working, we will install the required packages for PoseNet:

    sudo apt-get install python3-pycoral

Install tflite:
    
    python3 -m pip install tflite-runtime

Install the Edge TPU runtime:

    echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

    sudo apt-get update
    
    
    sudo apt-get install libedgetpu1-std
    

Now it’s time to connect your __Coral__ using the supplied USB-C cable. This cable conforms to the USB 3.0 standard and should be attached to one of the blue USB ports on the Raspberry Pi to allow the fastest transfer speeds. If you already attached the Coral, unplug it and connect it again so that the newly installed runtime is picked up.
     
    cd ~/google-coral
    
Then try to clone the PoseNet project:

    git clone https://github.com/google-coral/project-posenet.git
    
If you receive the message 'fatal: destination path 'project-posenet' already exists and is not an empty directory.', PoseNet was already downloaded to your device. Either way, move on to the next step:

    cd /home/pi/google-coral/project-posenet
    
    sh install_requirements.sh
    
   

#### 4.2.3 Example <a class="anchor" id="Ch423"></a>
The example code _pose_camera.py_ is a camera example that streams the camera image through posenet and draws the pose on top as an overlay. This is a great first example to run to familiarize yourself with the network and its outputs.

Run the demo like this:

    cd /home/pi/google-coral/project-posenet

    python3 pose_camera.py

If the camera and monitor are both facing you, consider adding the --mirror flag:

    python3 pose_camera.py --mirror


_Optional:_ \
The repo includes 3 PoseNet model files for different input resolutions. The larger resolutions are slower, but allow a wider field of view, or allow poses that are further away to be processed correctly.

    posenet_mobilenet_v1_075_721_1281_quant_decoder_edgetpu.tflite
    posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite
    posenet_mobilenet_v1_075_353_481_quant_decoder_edgetpu.tflite

You can change the camera resolution by using the --res parameter:

    python3 pose_camera.py --res 480x360  # fast but low res
    python3 pose_camera.py --res 640x480  # default
    python3 pose_camera.py --res 1280x720 # slower but high res
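
Each `--res` value goes with one of the three model files listed above. The mapping below is an assumption inferred from the sizes in the file names (height and width plus one pixel); `model_for_resolution` is a hypothetical helper, not a function from the repo.

```python
MODEL_TEMPLATE = "posenet_mobilenet_v1_075_{h}_{w}_quant_decoder_edgetpu.tflite"

# Camera resolution -> (height, width) as they appear in the model file name.
MODEL_INPUT_FOR_RES = {
    "480x360": (353, 481),
    "640x480": (481, 641),
    "1280x720": (721, 1281),
}


def model_for_resolution(res):
    """Return the model file name assumed to match a --res value."""
    try:
        height, width = MODEL_INPUT_FOR_RES[res]
    except KeyError:
        raise ValueError(f"unsupported resolution: {res!r}") from None
    return MODEL_TEMPLATE.format(h=height, w=width)


print(model_for_resolution("640x480"))
# posenet_mobilenet_v1_075_481_641_quant_decoder_edgetpu.tflite
```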

### 4.3 Save data to a file <a class="anchor" id="Ch43"></a>

In the previous section you extracted keypoints from a live video using PoseNet. However, we cannot analyse the data if they are not saved. You are now going to edit the '_pose_camera.py_' script so that the keypoints are written to a csv file.

1. Save a copy of pose_camera.py under a different name: logging_pose_camera.py

The first thing we need when we want to save experimental data is a timestamp for each set of keypoints. And, while we are at it, it would be great if the data were automatically saved with a filename that holds the date and time.

2. Therefore, we need to import the __datetime__ class from the __datetime__ module

To explore what __datetime__ does, you can run the following example.

In [None]:
from datetime import datetime

x = datetime.now()
print(x)

The datetime object contains year, month, day, hour, minute, second, and microsecond. Its __strftime()__ method formats the date as a string, using the directives listed below:

| Directive | Description | Example |
| --- | --- | --- |
| %a | Weekday, short version | Wed |
| %A | Weekday, full version | Wednesday |
| %w | Weekday as a number 0-6, 0 is Sunday | 3 |
| %d | Day of month 01-31 | 31 |
| %b | Month name, short version | Dec |
| %B | Month name, full version | December |
| %m | Month as a number 01-12 | 12 |
| %y | Year, short version, without century | 18 |
| %Y | Year, full version | 2018 |
| %H | Hour 00-23 | 17 |
| %I | Hour 01-12 | 05 |
| %p | AM/PM | PM |
| %M | Minute 00-59 | 41 |
| %S | Second 00-59 | 08 |
| %f | Microsecond 000000-999999 | 548513 |
| %z | UTC offset | +0100 |
| %Z | Timezone | CST |
| %j | Day number of year 001-366 | 365 |
| %U | Week number of year, Sunday as the first day of week, 00-53 | 52 |
| %W | Week number of year, Monday as the first day of week, 00-53 | 52 |
| %c | Local version of date and time | Mon Dec 31 17:41:00 2018 |
| %C | Century | 20 |
| %x | Local version of date | 12/31/18 |
| %X | Local version of time | 17:41:00 |
| %% | A % character | % |
| %G | ISO 8601 year | 2018 |
| %u | ISO 8601 weekday (1-7) | 1 |
| %V | ISO 8601 weeknumber (01-53) | 01 |
_(source: https://www.w3schools.com/python/python_datetime.asp )_
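
The directives are combined into a format string and passed to `strftime()`. The short example below uses a fixed date (the one from the table's examples) so that the output is reproducible; the format strings are just illustrations and are deliberately different from the one you need in the next step.

```python
from datetime import datetime

# Fixed moment so the output is reproducible (matches the table's examples).
moment = datetime(2018, 12, 31, 17, 41, 8)

print(moment.strftime("%d/%m/%Y"))  # 31/12/2018
print(moment.strftime("%H:%M:%S"))  # 17:41:08
print(moment.strftime("%j"))        # 365
```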

3. Write code that produces a string similar to: _20220404-114200_ (year, month, day - hour, minute, second)

In [None]:
from datetime import datetime 

# Enter your own code here!




4. Use the code you just wrote in the '__def__ main()' of your _logging_pose_camera.py_ script to create a filename _data/output_yourdatestring.csv_. Don't forget to also import the library at the start of your script.

You will need the __datetime__ module again later on, but for now let's focus on creating a file in which we can log the data.

5. In Lecture 3, you have learnt how to open a file with a specific filename. Use this to open a file in which we can _write_ with the filename you created in (4). Place this code also in the '__def__ main()' function.

Instead of opening a text file like we did in Lecture 3 we now want to create a csv file. There is a special Python module we can use: __csv__

6. Import the csv module in _logging_pose_camera.py_

The __csv.writer()__ function is used to write data to the csv file. It returns a writer object which is responsible for converting the user’s data into a delimited string. A csv file should be opened with _newline=''_; otherwise newline characters inside quoted fields will not be interpreted correctly. Therefore, we need to slightly adjust the code we wrote in (5).

7. Add _newline=''_ within the brackets of the line in which you open the csv file.

The writer object provides two methods for writing to a csv file: __writerow()__ and __writerows()__. writerow(fields) writes a single row at a time; writerows(rows) writes multiple rows at once. Below is an example of how to write data into a csv file:

In [None]:
# CSV Example

import csv 
    
# field names 
fields = ['Name', 'Course', 'Year', 'Grade'] 
    
# data rows of csv file 
rows = [ ['Nikhil', 'KT2502', '2022', '9.0'], 
         ['Sanchit', 'KT2501', '2021', '7.1'], 
         ['Aditya', 'KT2502', '2022', '9.3'], 
         ['Sagar', 'KT2502', '2022', '9.5'], 
         ['Prateek', 'KT2502', '2022', '7.8'], 
         ['Sahil', 'KT2502', '2022', '9.1']] 


    
# name of csv file 
filename = "course_records.csv"
    
# writing to csv file 
csvfile = open(filename, 'w',newline='') 
    
# creating a csv writer object 
csvwriter = csv.writer(csvfile) 
        
# writing the fields 
csvwriter.writerow(fields) 
        
# writing the data rows 
csvwriter.writerows(rows)
    
csvfile.close()

The file is saved in the same folder as this python notebook.
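
A common alternative to calling `close()` yourself is to open the file in a `with` block, which closes it automatically, even if an error occurs. The sketch below repeats part of the example in that style and reads the file back to check the result. It writes to a temporary folder, which is a choice made here only to keep the example free of leftover files.

```python
import csv
import os
import tempfile

fields = ["Name", "Course", "Year", "Grade"]
rows = [["Nikhil", "KT2502", "2022", "9.0"],
        ["Sanchit", "KT2501", "2021", "7.1"]]

# Write to a temporary folder so the example leaves no files behind.
with tempfile.TemporaryDirectory() as folder:
    filename = os.path.join(folder, "course_records.csv")

    # The file is closed automatically at the end of the with-block.
    with open(filename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(fields)   # one row: the header
        writer.writerows(rows)    # several rows at once

    # Read the file back to check what was written.
    with open(filename, newline="") as csvfile:
        read_back = list(csv.reader(csvfile))

print(read_back[0])    # ['Name', 'Course', 'Year', 'Grade']
print(len(read_back))  # 3
```

In _logging_pose_camera.py_ the file has to stay open for as long as the camera is running, so there either the `with` block has to enclose the whole measurement, or the file is closed explicitly at the end of main().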

8. Use this example to create a csv writer object in __def__ main(). Think about how you would like to export the keypoints: what will your headers be? Create this header (fields) and write it to the csv file.

The rows will be filled with the keypoint data while you are measuring. This will be added to the '__def__ render_overlay' function.

9. As an example, the following code adds the data to a single row:


In [None]:
# assumes 'from datetime import datetime' at the top of the script
now = datetime.now() # timestamp for this row
row = []
for label, keypoint in pose.keypoints.items(): # loops through all keypoints
    row.append(keypoint.point[0]) # appends the x-coordinate of the keypoint to the row
    row.append(keypoint.point[1]) # appends the y-coordinate of the keypoint to the row

# add date to row
row.insert(0, now) # inserts the timestamp as the first entry of the row


10. Write the data row that you constructed to the csv file.

11. Finally, make sure that at the end of the __def__ main() function, you close the csv file.

In [None]:
## Add your code to the Raspberry Pi Code
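
Before editing the script on the Raspberry Pi, it can help to try the header and row logic on its own. The sketch below is one possible layout, not the required one: `header_for` and `row_for` are hypothetical helpers, the `Keypoint` tuple is a stand-in for the real keypoint objects (only `.point` and `.score` are assumed), the part names are assumed to be plain strings, and the data are written to an in-memory buffer instead of a file.

```python
import csv
import io
from collections import namedtuple
from datetime import datetime

# Stand-in for the keypoint objects used in the row-building snippet above.
Keypoint = namedtuple("Keypoint", ["point", "score"])


def header_for(part_names):
    """Build the header: timestamp, then <part>_x and <part>_y per keypoint."""
    header = ["timestamp"]
    for name in part_names:
        header += [f"{name}_x", f"{name}_y"]
    return header


def row_for(keypoints, now):
    """Flatten a {part name: keypoint} dict into one csv row."""
    row = [now.isoformat()]
    for keypoint in keypoints.values():
        row += [keypoint.point[0], keypoint.point[1]]
    return row


# Made-up keypoints for two body parts.
keypoints = {"nose": Keypoint((301.4, 120.9), 0.98),
             "leftEye": Keypoint((310.2, 112.5), 0.95)}

buffer = io.StringIO()  # in-memory file, so nothing is written to disk
writer = csv.writer(buffer)
writer.writerow(header_for(keypoints))
writer.writerow(row_for(keypoints, datetime(2022, 4, 4, 11, 42, 0)))
print(buffer.getvalue())
```

Because the header and the row are built from the same dictionary order, each coordinate ends up under the matching column name.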
