# Lab 1 Presentation

In [None]:
import imageio.v3 as iio
import cv2
from matplotlib import pyplot as plt
from camera_widget import cv2_snap, cv2_vid
from IPython.display import Image, Video
from feat import Detector
import opencv_jupyter_ui as jcv2

## Image Manipulation

We can load images as `ndarray`s shaped `(height, width, channels)` using `iio.imread()`, and display them with `plt.imshow()`.

In [None]:
seagull = iio.imread("seagull.jpg") # Copyright Yuqiong Wang, 2023. Used with permission from the author.
plt.imshow(seagull)
type(seagull), seagull.shape

Typically, the array's `dtype` is `uint8`: integers in the range 0 to 255.

In [None]:
seagull.dtype, seagull.min(), seagull.mean(), seagull.max()
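
One consequence of `uint8` worth keeping in mind: arithmetic between two `uint8` arrays wraps around modulo 256 instead of saturating. The sketch below uses two tiny made-up arrays (not the seagull image) to show the wrap-around, and one possible way to avoid it by widening the type before doing the math.

```python
import numpy as np

# Two tiny stand-in "images" with 8-bit pixels (hypothetical values).
a = np.array([250, 100], dtype=np.uint8)
b = np.array([10, 100], dtype=np.uint8)

# uint8 + uint8 stays uint8, so 250 + 10 = 260 wraps around to 4.
wrapped = a + b

# Widen one operand first to keep the true sums.
widened = a.astype(np.int32) + b

# Saturate back into the 0..255 range before converting to uint8 again.
clipped = np.clip(widened, 0, 255).astype(np.uint8)

print(wrapped, widened, clipped)
```

If brightened images come out with odd dark speckles, this kind of overflow is a likely suspect.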

Since the picture is represented as an array, we can easily manipulate it. E.g., we can use *index slicing* to crop it:

In [None]:
cutout = seagull[190:640, 525:975]
plt.imshow(cutout)
cutout.shape
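
Two details about slicing are easy to miss: rows (the vertical axis) come first, and a basic slice is a *view* into the original array rather than a copy. The following sketch uses a small synthetic array as a stand-in for an image to illustrate both points.

```python
import numpy as np

# Stand-in image: 6 rows (height) x 8 columns (width) x 3 channels.
image = np.zeros((6, 8, 3), dtype=np.uint8)

# Rows (y) come first, then columns (x): rows 1..3, columns 2..6.
crop = image[1:4, 2:7]
print(crop.shape)

# A basic slice is a view: writing to it also changes the original.
crop[:] = 255
print(image[1, 2])

# Use .copy() to get an independent array.
safe = image[1:4, 2:7].copy()
safe[:] = 0
print(image[1, 2])
```

Calling `.copy()` before modifying a crop keeps the original image untouched.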

If we do an extreme crop, we can see the individual pixels that form the image. In RGB, each pixel is represented by 3 values: Red, Green, and Blue intensities.

In [None]:
extreme_cutout = seagull[230:255, 595:620]
fig = plt.figure(figsize=(10,10))
plt.imshow(extreme_cutout)
extreme_cutout.shape

To show that our array stores the colors as RGB, let's set R to zero:

In [None]:
no_red = cutout.copy()
no_red[:, :, 0] = 0
plt.imshow(no_red)

Similarly, we can invert colors with some math:

In [None]:
inverse = 255 - cutout
plt.imshow(inverse)

`iio.imwrite()` lets us save the image in a variety of formats. In this case, JPEG:

In [None]:
iio.imwrite("inverse.jpg", inverse)

Let's load a PNG with transparency now. In this case, the channels are `RGBA`: Red, Green, Blue, Alpha.
* `A = 0` means fully transparent
* `A = 255` means fully opaque.

In [None]:
uu_logo = iio.imread("uu_logo.png") # Trademark owned by Uppsala University.
plt.imshow(uu_logo)
uu_logo.shape

Since PyPlot adds a white background, it's not so obvious that we have transparency. Let's make everything fully opaque to reveal the hidden RGB colors:

In [None]:
full_alpha = uu_logo.copy()
full_alpha[:, :, 3] = 255
plt.imshow(full_alpha)
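
To make the role of the alpha channel concrete, here is a minimal sketch of blending an RGBA image over a solid background, which is roughly what a viewer does when it shows a transparent picture on a white page. The helper `composite_over` and the sample pixels are hypothetical, written only for illustration.

```python
import numpy as np

def composite_over(rgba, background=(255, 255, 255)):
    """Blend an RGBA uint8 image over a solid background color."""
    rgb = rgba[..., :3].astype(np.float32)
    # Scale alpha to 0.0 (transparent) .. 1.0 (opaque); keep the last axis
    # so it broadcasts against the three color channels.
    alpha = rgba[..., 3:4].astype(np.float32) / 255.0
    bg = np.asarray(background, dtype=np.float32)
    out = alpha * rgb + (1.0 - alpha) * bg
    return np.round(out).astype(np.uint8)

# A 1x3 test image (hypothetical values).
pixels = np.array([[[255, 0, 0, 255],   # opaque red
                    [255, 0, 0, 0],     # fully transparent "red"
                    [0, 0, 0, 128]]],   # roughly half-transparent black
                  dtype=np.uint8)

result = composite_over(pixels)
print(result)
```

The second pixel shows why hidden RGB values can exist: its color is stored as red, but with `A = 0` only the background is visible.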

If you get tired of PyPlot's axis ticks, you can use `IPython.display.Image` in a Jupyter notebook. It even respects transparency.

In [None]:
Image("uu_logo.png")

## Video Manipulation

There is no good way to display videos using PyPlot, but we can use `IPython.display.Video` to embed a video in a notebook:

In [None]:
Video("cats.mp4")

The same command `iio.imread()` we used for images can also be used for videos. In this case, the array is shaped `(frames, height, width, channels)`.

In [None]:
# "cats": copyright Marc Fraile, 2023.
cats = iio.imread("cats.mp4")
cats.shape

Loading the video as a sequence of frames discards some important information, like the framerate. We can recover this information (and other *metadata*) using `iio.immeta()`:

In [None]:
cats_meta = iio.immeta("cats.mp4")
cats_meta

Of course, we can display individual frames just like we display images:

In [None]:
plt.imshow(cats[200])
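
Once the frame rate is known, timestamps and frame indices can be converted into each other. The sketch below assumes made-up values for the frame count and frame rate, and a hypothetical helper `frame_index`. In practice the frame count is `cats.shape[0]`, and the frame rate comes from the metadata, under a key that depends on the backend in use.

```python
def frame_index(seconds, fps, n_frames):
    """Return the index of the frame shown at a given timestamp."""
    index = int(seconds * fps)
    # Clamp so that out-of-range timestamps map to the first or last frame.
    return min(max(index, 0), n_frames - 1)

# Hypothetical values, standing in for cats.shape[0] and the metadata.
n_frames = 300
fps = 30.0

duration = n_frames / fps
print(duration)
print(frame_index(2.5, fps, n_frames))
print(frame_index(60.0, fps, n_frames))
```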

## Exercises

In this first block, we will practice one of the oldest and simplest tricks in Image Analysis: **color thresholding**. We will try to isolate a brightly colored object in an image by checking color values.

1. `exercise_1_toy_knife_rgb.py`: segment the object by color, using RGB values directly.
2. `exercise_2_toy_knife_hsv.py`: try again with a different color representation: HSV (Hue, Saturation, Value).

### Hints

* `np.zeros_like(image, shape=...)`
* You can use comparisons to index into an `ndarray`.
* You can copy most code from Exercise 1 into Exercise 2.
* `cv2.cvtColor()`
* `cv2` also has loading and saving functions `cv2.imread()` and `cv2.imwrite()`, but they expect BGR and don't work with videos.
* Use the internet!
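
As an illustration of the second hint, the sketch below applies a comparison to a small grayscale array, which yields a boolean mask that can then be used as an index. The values and the threshold are made up, and this is deliberately not the exercise solution, only the indexing mechanism.

```python
import numpy as np

# A tiny grayscale stand-in image (hypothetical values).
gray = np.array([[10, 200, 30],
                 [250, 40, 220]], dtype=np.uint8)

# A comparison produces a boolean array with the same shape.
mask = gray > 128
print(mask)

# Boolean arrays can index into an ndarray: keep bright pixels, zero the rest.
bright_only = np.zeros_like(gray)
bright_only[mask] = gray[mask]
print(bright_only)
```

Masks can be combined with `&` (and), `|` (or), and `~` (not). Wrap each comparison in parentheses when doing so.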

## Webcam Access

If you look up online how to do any image processing task in Python, you will be told to use [OpenCV](https://opencv.org/). This is an old C++ library with a clunky Python interface, and has plenty of downsides, but it's hard to beat it in number of features or speed of execution.

We will use OpenCV for real-time webcam access. To smooth over its usage in notebooks, we will also use `opencv_jupyter_ui`.

We have written a utility to take snapshots and another to take videos using OpenCV. You can find the code in the module `camera_widget`. Let's test them out, starting with the picture-taking app:

In [None]:
snap = cv2_snap()
snap.shape

In [None]:
plt.imshow(snap)

And let's check the video-taking solution next:

In [None]:
fps, vid = cv2_vid()
fps, vid.shape

In [None]:
plt.imshow(vid[0])

To display the video, let's save it to a local file first:

In [None]:
iio.imwrite("webcam_video.mp4", vid, fps=fps)
Video("webcam_video.mp4")

We can use a simplified version of the code in `camera_widget` to show a live feed from our webcam using OpenCV:

In [None]:
cam = cv2.VideoCapture(0)

while True:
    # check = True means we managed to get a frame.
    # If check = False, the device is not available, and we should quit.
    check, frame = cam.read()
    if not check:
        break

    # OpenCV uses a separate window to display output.
    jcv2.imshow("video", frame)

    # Press ESC to exit.
    key = jcv2.waitKey(1) & 0xFF
    if key == 27:
        break

cam.release()
jcv2.destroyAllWindows()

We can get creative and output the channels separately. Note that OpenCV does not follow the standard RGB convention, using BGR instead.

In [None]:
import cv2
import opencv_jupyter_ui as jcv2
import numpy as np

cam = cv2.VideoCapture(0)

while True:
    check, in_frame = cam.read()
    if not check:
        break

    (h, w, c) = in_frame.shape

    out_frame = np.zeros_like(in_frame, shape=(h*2, w*2, c))

    out_frame[:h,:w,:] = in_frame
    out_frame[:h,w:,0] = in_frame[:,:,0]
    out_frame[h:,:w,1] = in_frame[:,:,1]
    out_frame[h:,w:,2] = in_frame[:,:,2]

    jcv2.imshow("video", out_frame)

    # Press ESC to exit.
    key = jcv2.waitKey(10) & 0xFF
    if key == 27:
        break

cam.release()
jcv2.destroyAllWindows()
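
Because of the BGR ordering, a frame read with OpenCV needs its channels reversed before it is passed to RGB-based tools such as `plt.imshow()`. A minimal sketch with a made-up two-pixel frame, reversing the last axis with NumPy (`cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` achieves the same result):

```python
import numpy as np

# A 1x2 frame in BGR order (hypothetical values): pure blue, then pure red.
bgr = np.array([[[255, 0, 0],
                 [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis turns BGR into RGB (and vice versa).
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # blue in RGB order
print(rgb[0, 1])  # red in RGB order
```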

## Feature Extraction

In this course, we are interested in features related to the expression of emotion. We will focus on facial features: the position of the face, the activation of different muscles used to express emotion, or even the expressed emotion itself. [Py-Feat](https://py-feat.org/) is a modern Python library that allows us to easily work with all these feature types.

Let's load a detector and test it on the faces of the TAs.

In [None]:
detector = Detector(device="cuda")
detector

Note that the detector packages several models with different functions: finding faces in a picture, detecting key points (landmarks) in each face, deducing facial muscle activations (AUs), detecting emotion...

We can pass a filename to `detector.detect_image()`:

In [None]:
Image("lux.jpg", width=480)

In [None]:
lux_prediction = detector.detect_image("lux.jpg")
print(type(lux_prediction))
lux_prediction

We can see it returned a `Fex`, which is a subclass of a Pandas `DataFrame`. It contains one row per detected face, and a bunch of features related to each face. `Fex` has a few added helper functions, like `plot_detections()`:

In [None]:
lux_prediction.plot_detections()

What about Alessio?

In [None]:
alessio_prediction = detector.detect_image("alessio.jpg")
alessio_prediction.plot_detections()
display(Image("alessio.jpg", width=480))
alessio_prediction

Note that in this case the detector found a "false positive" in the background, so it's returning the data for two "faces". Apparently, the building in the background was angry.
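
Since the result behaves like a `DataFrame`, one possible way to discard such false positives is ordinary Pandas filtering on the detection confidence. The sketch below uses a plain `DataFrame` with made-up values as a stand-in for the real result, and assumes the confidence is stored in a column named `FaceScore` (check `alessio_prediction.columns` to confirm). The cut-off value is also an assumption.

```python
import pandas as pd

# Hypothetical stand-in for a detection result with two "faces".
detections = pd.DataFrame({
    "FaceRectX": [412.0, 35.0],
    "FaceScore": [0.99, 0.62],   # assumed confidence column
    "anger": [0.05, 0.81],
})

MIN_SCORE = 0.9  # assumed cut-off; tune it for your own images

# Boolean indexing keeps only the rows that pass the threshold.
confident = detections[detections["FaceScore"] >= MIN_SCORE]
print(len(detections), "->", len(confident))
```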

Speaking of angry faces...

In [None]:
marc_prediction = detector.detect_image("marc.jpg")
marc_prediction.plot_detections()
display(Image("marc.jpg"))
marc_prediction

The AUs correspond to muscle activations in the face, and can be used to predict emotion. We can ask Py-Feat to display the detected AU activations:

In [None]:
marc_prediction.plot_detections(faces="aus", muscles={"all": "heatmap"})

In [None]:
lux_prediction.plot_detections(faces="aus", muscles={"all": "heatmap"})

## Exercises

In this block, we will apply two **face tracking** approaches to a live feed from our webcam: an old-school solution and a modern one. In the modern case, we will add **emotion detection**.

3. `exercise_3_face_tracking.py`: use OpenCV to run a classic algorithm on the CPU. It only detects faces.
4. `exercise_4_emotion_detection.ipynb`: use Py-Feat to run a neural-network-based algorithm on the GPU. It detects faces and the emotion each one expresses.

### Hints

* Exercise 3:
    * You can copy a lot from previous exercises.
    * How good is the "classic" solution?
        * Does it run fast?
        * How easy is it to make it lose track of your face?
        * How easy is it to get false positives? (detect faces where there are none).
* Exercise 4:
    * You need to use a more involved part of the `Detector` API:
        * `detector.detect_faces()`
        * `detector.detect_landmarks()`
        * `detector.detect_emotions()`
        * What do these functions do?
    * Another helpful function call: `cam.set(cv2.CAP_PROP_BUFFERSIZE, 1)`
        * What does this function call do?
    * How good is the "modern" solution?
        * Does it run fast?
        * How easy is it to make it lose track of your face?
        * How easy is it to get false positives? (detect faces where there are none).
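
To answer "does it run fast?" with a number rather than an impression, one option is to time the per-frame work and report frames per second. The sketch below is only an outline: `measure_fps` is a hypothetical helper, and the dummy workload stands in for whatever detection code runs on each frame.

```python
import time

def measure_fps(process_frame, n_frames=50):
    """Call process_frame(i) n_frames times and return the average rate."""
    start = time.perf_counter()
    for i in range(n_frames):
        process_frame(i)
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Dummy workload standing in for a real per-frame detection step.
def dummy_work(i):
    return sum(range(1000))

fps = measure_fps(dummy_work)
print(f"{fps:.1f} frames per second")
```

In a live loop, the same idea applies: record `time.perf_counter()` once per iteration and average over several frames, since single-frame timings tend to be noisy.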