# CS4243 - Lab Session 6

Computer Vision & Pattern Recognition

Week 9, Mon 17 Oct, AY 2023/24

Author: Dr. Amirhassan MONAJEMI. Modified by: Lingdong KONG

## Welcome! 👋

This notebook contains the tutorial for the sixth lab session. The following material is covered:

- Part 1: Optical Flow

Let's get started!

<hr>

# Part 1: Optical Flow

#### Goal of this section:
- Learn how to track feature points using the *Lucas-Kanade algorithm* in the `lucas_kanade_point_tracking` function.
- Learn how to draw the tracked points and the motion vectors on each frame.

Note:

- Because the loop calls `cv2.waitKey(0)`, the display pauses on each frame: press any key to advance to the next frame, or press the `'Esc'` key to exit the application.
- Make sure that the path to your video file is correct.

Extra:

- You can configure the code to use your camera as the video source.
- You might need to adjust the parameters of the feature point detector (e.g., `cv2.goodFeaturesToTrack`) to suit your specific tracking needs.

We start by capturing video frames from a video file.


#### Videos for testing: 
- Example 1: `vtest.avi`
- Example 2: `10142.mp4`
- Example 3: `10231.mp4`
- Example 4: `10236.mp4`

You can download these videos from Canvas: CS4243 -> Files -> Python_notebooks -> set5 -> `OF_samples.zip`

#### Procedures:
1. Run the code using a few test videos (as listed above).
2. Understand the role of each parameter and change the parameters according to the comments.
3. Try to improve the results further, again based on the comments.
4. Answer the questions.

In [1]:
# importing necessary libraries 
import cv2
import numpy as np

**Main Function** (`lucas_kanade_point_tracking`)

- Input parameters:
    - Two frames, `prev_frame` and `curr_frame`, and an array of point coordinates, `prev_points`.
- Returns:
    - The updated `prev_points` and the new `next_points` arrays.

In [2]:
def lucas_kanade_point_tracking(prev_frame, curr_frame, prev_points):
    lk_params = dict(
        winSize=(15, 15),
        maxLevel=2,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03),
    )

    # Step 1: calculate optical flow using Lucas-Kanade method
    next_points, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_frame, curr_frame, prev_points, None, **lk_params
    )
    status = status.ravel()
    
    # Step 2: filter out points with status = 1 (successfully tracked)
    prev_points = prev_points[status == 1]
    next_points = next_points[status == 1]

    return prev_points, next_points

**Questions:**

> **Q1** - What are the types and shapes of those parameters?
> - **Answer:** 

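`prev_frame`: the previous frame as an 8-bit image array. In this notebook it is a grayscale frame, i.e., a `uint8` array of shape `(H, W)`.

`curr_frame`: the current frame, with the same type and size as `prev_frame`.

`prev_points`: a `float32` array of point coordinates. `cv2.goodFeaturesToTrack` returns it with shape `(n, 1, 2)`; an array of shape `(n, 2)` is also accepted.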

> **Q2** - What are the shapes of those returned arrays? Any changes in `prev_points`?
> - **Answer:** 

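`prev_points` and `next_points` are filtered along their first axis, so both end up with `m` rows, where `m` is the number of successfully tracked points (less than or equal to `n`, the original number of points in `prev_points`). With the `(n, 1, 2)` input produced by `cv2.goodFeaturesToTrack`, the returned shape is `(m, 1, 2)`; with an `(n, 2)` input it is `(m, 2)`.

The returned `prev_points` can differ from the input: only the points that were successfully tracked (i.e., where the status is 1) are retained. The caller's original array is not modified in place, because boolean indexing creates a new array.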
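
To make the shape change concrete, here is a minimal NumPy-only sketch of the filtering step. The arrays are hypothetical stand-ins for the outputs of `cv2.goodFeaturesToTrack` and `cv2.calcOpticalFlowPyrLK`; no OpenCV call is made.

```python
import numpy as np

# Hypothetical stand-ins: 5 points shaped like the goodFeaturesToTrack output.
prev_points = np.arange(10, dtype=np.float32).reshape(5, 1, 2)
next_points = prev_points + 1.5
# Status flags shaped (5, 1), as returned by calcOpticalFlowPyrLK.
status = np.array([[1], [0], [1], [1], [0]], dtype=np.uint8)

mask = status.ravel() == 1      # boolean mask of shape (5,)
kept_prev = prev_points[mask]   # new array of shape (3, 1, 2)
kept_next = next_points[mask]   # new array of shape (3, 1, 2)

print(prev_points.shape, "->", kept_prev.shape)
```

The mask only selects rows, so the trailing `(1, 2)` axes are preserved and the original `prev_points` array keeps all five rows.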

> **Q3** - What are the parameters for the Lucas-Kanade optical flow method? What do they do?
> - **Answer:**

`winSize=(15, 15)` - Size of the search window at each pyramid level, in pixels. Changing the window size affects both accuracy and computation time. A larger window is more robust to noise and can follow larger motions, but it costs more computation and may average over neighbouring regions that move differently.

`maxLevel=2` - The 0-based maximal pyramid level number. In this function, it's set to 2. This means that there are three levels in total: level `0` (original image), level `1`, and level `2`. Adjusting the number of levels affects the algorithm's ability to capture motion at various scales. A value of 0 means only the original scale, while a higher value allows for capturing motion at reduced resolutions.
        
`criteria` - The termination criteria of the iterative search algorithm. Here the flags combine a maximum number of iterations (`cv2.TERM_CRITERIA_COUNT`, set to 10) with an epsilon threshold (`cv2.TERM_CRITERIA_EPS`, set to 0.03): the search also stops once the search window moves by less than epsilon in one iteration.
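
To make the `maxLevel` description concrete, the sketch below lists the image size at each pyramid level for a frame of height 1920 and width 1080. It assumes each level halves the previous one with rounding up, which is the default output size of `cv2.pyrDown`; the helper name `pyramid_shapes` is hypothetical.

```python
def pyramid_shapes(height, width, max_level):
    """Return the (height, width) of every pyramid level from 0 to max_level."""
    shapes = [(height, width)]
    for _ in range(max_level):
        # Assumption: each level halves the previous one, rounding up.
        height, width = (height + 1) // 2, (width + 1) // 2
        shapes.append((height, width))
    return shapes

shapes = pyramid_shapes(1920, 1080, max_level=2)

# A 40-pixel displacement at level 0 appears 2**level times smaller higher up.
apparent_motion = [40 / 2**level for level in range(3)]

for level, (shape, motion) in enumerate(zip(shapes, apparent_motion)):
    print(f"level {level}: shape={shape}, apparent motion={motion} px")
```

At level 2 the 40-pixel motion shrinks to 10 pixels, which fits inside a 15x15 window; this is why extra levels help with large displacements.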
 
> **Q4** - What are the criteria? How does an iterative algorithm finish?
> - **Answer:** 

- The criteria combine two conditions:
    1. A maximum number of iterations (set to 10 in this function).
    2. An epsilon threshold (set to 0.03 here) on how much the estimate changes in one iteration.
- The iterative algorithm stops as soon as either condition is met: it reaches the maximum number of iterations, or the change between iterations falls below epsilon.
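
The same "count or epsilon" stopping rule can be sketched on a toy problem. This is not OpenCV's internal code: `iterate_until` is a hypothetical helper, and the update step is Newton's method for the square root of 2, chosen only because it converges quickly.

```python
def iterate_until(step, x0, max_iter=10, eps=0.03):
    """Apply `step` repeatedly; stop on small change (eps) or iteration limit (count)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        x_new = step(x)
        if abs(x_new - x) < eps:
            return x_new, iteration, "eps"    # converged: change below epsilon
        x = x_new
    return x, max_iter, "count"               # gave up: iteration budget used

def newton_sqrt2(x):
    """One Newton update towards sqrt(2)."""
    return 0.5 * (x + 2.0 / x)

value, n_iter, reason = iterate_until(newton_sqrt2, x0=1.0)
print(value, n_iter, reason)

# With a budget of only 2 iterations, the count limit triggers first.
value2, n_iter2, reason2 = iterate_until(newton_sqrt2, x0=1.0, max_iter=2)
print(value2, n_iter2, reason2)
```

A tighter epsilon or a larger iteration budget trades speed for accuracy, which is the same trade-off the `criteria` tuple controls in `cv2.calcOpticalFlowPyrLK`.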


**References:**

- Source 1: https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html
- Source 2: https://www.geeksforgeeks.org/python-opencv-optical-flow-with-lucas-kanade-method

In [3]:
# load the video
cap = cv2.VideoCapture('./OF_samples/10231.mp4')

Learn more about the `VideoCapture` class of OpenCV.

**Questions:**

> **Q5** - Do we need other parameters here?
> - **Answer:**
- The `cv2.VideoCapture()` constructor needs one main argument, which can be either the path to a video file (as in the cell above) or an integer ID of the camera device you want to use.
- So, in the cell above, only the path to the video file is needed.
- To access a camera instead, you would pass an integer (like `0` for the default camera). In some situations, you might also want to change properties of the `VideoCapture` object afterwards using its `set()` method, but these are not arguments of the constructor itself.

> **Q6** - How can we read a live video stream?
> - **Answer:**
- To read a live video stream using OpenCV's `VideoCapture` function, you would typically pass an integer representing the camera's ID.
- For most systems, `0` would be the default built-in camera (like a laptop's webcam). `1` would be a secondary camera if connected, and so on.
- So, to capture video from the default camera, you would initialize the `VideoCapture` object like this:
```python
cap = cv2.VideoCapture(0)
```
- After initializing, you can then read frames in a loop using the `read()` method until you decide to break the loop or the stream ends.
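
The read loop described above can be sketched as a small generator. `read_frames` is a hypothetical helper, not part of OpenCV; it only relies on the capture object exposing `read()` that returns `(ret, frame)`, so it works with both a file and a camera.

```python
def read_frames(cap, max_frames=None):
    """Yield frames from `cap` until reading fails or `max_frames` is reached."""
    count = 0
    while max_frames is None or count < max_frames:
        ret, frame = cap.read()
        if not ret:
            # End of file, or the camera failed to deliver a frame.
            break
        yield frame
        count += 1

# Intended usage with a camera (assumes OpenCV is installed and camera 0 exists):
#   import cv2
#   cap = cv2.VideoCapture(0)
#   for frame in read_frames(cap, max_frames=100):
#       ...  # process the frame
#   cap.release()
```

Capping the number of frames is optional, but it is a convenient safeguard for live streams, which otherwise never end on their own.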

**References:**
- Source 3: https://docs.opencv.org/4.x/dd/d43/tutorial_py_video_display.html

In [4]:
# read the first frame
ret, prev_frame = cap.read()

In [5]:
print(ret)

True


In [6]:
print(prev_frame.shape)

(1920, 1080, 3)


**Questions:**

> **Q7** - What does the `ret` return value tell us here?
> - **Answer:**
- `ret` is a boolean that is `True` if the frame was successfully grabbed (or read) and `False` otherwise.
- When working with video files or live video streams in OpenCV, it's common to use `ret` to check if a frame has been successfully retrieved. 
- If `ret` is `False`, it often indicates that there are no more frames to read (in case of a video file) or there was some issue in capturing the frame (in case of a live stream).

In [7]:
# convert the frame to gray level
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

Define the initial points to track (e.g., using `cv2.goodFeaturesToTrack`).
- Any function that returns an array of [x, y] coordinate pairs can be used here.

Input Parameters:
- `prev_gray`: the frame from which the feature points (anchor points) are extracted.
- `maxCorners`: the maximum number of points to detect.

In [8]:
prev_points = cv2.goodFeaturesToTrack(
    prev_gray,
    maxCorners=100,
    qualityLevel=0.3,
    minDistance=7,
)

**Questions:**

> **Q8** - What happens if we change `maxCorners` to `4`?
> - **Answer:** 

- The `maxCorners` parameter determines the maximum number of corners or feature points to return.
- If set to 4, the `cv2.goodFeaturesToTrack()` function will return at most 4 of the strongest corners (or feature points) it can find in the `prev_gray` image. 
- This means that even if there are more potential corners available, only the top 4 (based on the given qualityLevel criteria) will be selected.


> **Q9** - What happens if we change `qualityLevel` to `0.9` or `0.01`?
> - **Answer:** 

- The `qualityLevel` parameter sets the minimum accepted corner quality relative to the best corner in the image. Each candidate corner has a quality measure (by default the minimal eigenvalue of its local gradient matrix, as in the Shi-Tomasi detector), and candidates whose measure is below `qualityLevel` times the best measure are rejected. In simpler terms, it sets the threshold for the quality of the corners.
- If set to 0.9: The function will be stricter in determining which corners are considered "good". It will only select corners that have a quality score greater than 90% of the highest quality corner detected. As a result, you might end up with fewer corners than the specified maxCorners value, unless there are many very high-quality corners.
- If set to 0.01: The function will be more lenient, and you'll get many more corners, potentially up to your `maxCorners` limit, but many of these corners might not be very prominent or distinct.
    

> **Q10** - The `minDistance` parameter is the minimum distance between detected feature points, in pixels. What happens if we change it to `1` or `50`?
> - **Answer:** 
- The `minDistance` parameter defines the minimum Euclidean distance between the returned corners.
- If set to 1: Detected corners can be very close to each other, with a minimum distance of 1 pixel between them. This might result in clusters of points in regions with high texture.
- If set to 50: Detected corners will be at least 50 pixels apart from each other. This ensures that the corners are well distributed over the image, but in small or tightly textured regions, you might miss out on potential feature points because of this larger distance constraint.
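
The combined effect of `maxCorners`, `qualityLevel`, and `minDistance` can be illustrated with a simplified pure-Python sketch of the selection logic. It is an approximation for intuition, not OpenCV's implementation; `select_corners` and the candidate list are hypothetical.

```python
import math

def select_corners(candidates, max_corners, quality_level, min_distance):
    """Pick corners from (x, y, quality) tuples, strongest first."""
    if not candidates:
        return []
    best = max(q for _, _, q in candidates)
    # 1. Reject candidates weaker than quality_level * best quality.
    strong = [c for c in candidates if c[2] >= quality_level * best]
    # 2. Visit the survivors from strongest to weakest.
    strong.sort(key=lambda c: c[2], reverse=True)
    kept = []
    for x, y, q in strong:
        # 3. Keep a corner only if it is far enough from every stronger kept corner.
        if all(math.hypot(x - kx, y - ky) >= min_distance for kx, ky, _ in kept):
            kept.append((x, y, q))
        # 4. Stop once max_corners corners have been collected.
        if len(kept) == max_corners:
            break
    return kept

candidates = [(10, 10, 1.0), (12, 10, 0.95), (100, 100, 0.5), (200, 50, 0.05)]

default_like = select_corners(candidates, 100, quality_level=0.3, min_distance=7)
lenient = select_corners(candidates, 100, quality_level=0.01, min_distance=1)
strict = select_corners(candidates, 100, quality_level=0.9, min_distance=1)
only_one = select_corners(candidates, 1, quality_level=0.01, min_distance=1)

print(len(default_like), len(lenient), len(strict), len(only_one))
```

In this toy data the two neighbouring corners at `(10, 10)` and `(12, 10)` survive together only when `min_distance` is small, and the weak corner at `(200, 50)` survives only when `quality_level` is low.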

**References:**
- Source 4: https://docs.opencv.org/3.4/d4/d8c/tutorial_py_shi_tomasi.html 
- Source 5: https://theailearner.com/tag/cv2-goodfeaturestotrack

In [9]:
while True:
    
    ret, curr_frame = cap.read()
    
    if not ret:
        break

    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

    # track points using Lucas-Kanade method
    prev_points, next_points = lucas_kanade_point_tracking(prev_gray, curr_gray, prev_points)

    # draw tracks on the current frame
    for prev, nxt in zip(prev_points, next_points):  # `nxt` avoids shadowing the built-in `next`
        x1, y1 = prev.ravel()
        x2, y2 = nxt.ravel()
        # red line: motion vector; green dot: tracked position in the current frame
        cv2.line(curr_frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
        cv2.circle(curr_frame, (int(x2), int(y2)), 5, (0, 255, 0), -1)
        
    cv2.namedWindow("Optical Flow", cv2.WINDOW_NORMAL)
    cv2.imshow('Optical Flow', curr_frame)
    
    # press 'Esc' to exit
    if cv2.waitKey(0) & 0xFF == 27:
        break
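
The loop above always compares each new frame with the first frame, because `prev_gray` and `prev_points` are never advanced to the new frame. A common alternative, used in the OpenCV tutorial listed as Source 1, tracks from frame to frame by updating both after every iteration. The sketch below isolates that update pattern under stated assumptions: `track_sequence` and the stand-in tracker `shift_by_one` are hypothetical names, and in this notebook `track_fn` would be `lucas_kanade_point_tracking`.

```python
import numpy as np

def track_sequence(gray_frames, init_points, track_fn):
    """Yield (prev_points, next_points) for each consecutive frame pair."""
    frames = iter(gray_frames)
    prev_gray = next(frames, None)
    if prev_gray is None:
        return
    prev_points = init_points
    for curr_gray in frames:
        if len(prev_points) == 0:
            break  # nothing left to track
        kept_prev, next_points = track_fn(prev_gray, curr_gray, prev_points)
        yield kept_prev, next_points
        # Advance the reference: the current frame becomes the previous one.
        prev_gray = curr_gray
        prev_points = next_points.reshape(-1, 1, 2)

def shift_by_one(prev_gray, curr_gray, points):
    """Stand-in tracker for the demo: every point moves by (+1, +1)."""
    return points, points + 1.0

demo_frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(4)]
demo_points = np.zeros((2, 1, 2), dtype=np.float32)
results = list(track_sequence(demo_frames, demo_points, shift_by_one))
print(len(results), results[-1][1].ravel())
```

With this pattern the drawn motion vectors show the displacement between consecutive frames rather than the displacement since the first frame; which one is preferable depends on what you want to visualize.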


**Questions:**

> **Q11** - How can we break the `while` loop? Why is the result ANDed with `0xFF`?
> - **Answer:** 
- There are two conditions in the loop that can break it:
1. When `ret` is `False`, which indicates that there are no more frames to read from the video or that there was an issue in capturing the frame. This is checked by the condition:
```python
if not ret: break
```
2. When the user presses the 'Esc' key. This is checked by the condition:
```python
if cv2.waitKey(0) & 0xFF == 27: break
```
- The `cv2.waitKey(0)` function waits indefinitely for a key event and returns the key code as an integer. The `& 0xFF` operation is a bitwise `AND` that keeps only the lowest 8 bits of that integer and masks off everything above them.
- The ASCII value for the 'Esc' key is `27`, so the condition checks whether the lowest 8 bits of the returned integer equal `27`, meaning the 'Esc' key was pressed. The mask is there because, on some platforms and GUI backends, the returned integer can carry extra flag bits above the lowest 8 (for example when NumLock is on), which would make a direct comparison with `27` fail. Masking gives consistent behavior across platforms.
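
The effect of the mask can be checked with plain integers. The value `0x100000 | 27` is an illustrative return value with one extra flag bit set, an assumption chosen for the demo rather than something every system produces.

```python
ESC = 27

plain = 27                    # return value when only the key code is reported
with_flag = 0x100000 | ESC    # hypothetical: same key with an extra flag bit set

print(plain == ESC)                 # direct comparison works here
print(with_flag == ESC)             # direct comparison fails with the flag bit
print((with_flag & 0xFF) == ESC)    # masking recovers the key code

# waitKey returns -1 when a timeout expires without a key press;
# after masking, that becomes 255, which still does not match 27.
no_key_masked = -1 & 0xFF
print(no_key_masked)
```

Comparing the masked value therefore works in both cases, while the unmasked comparison only works when no extra bits are set.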


In [4]:
cap.release()
cv2.destroyAllWindows()

<hr>

## Next:

We have seen an example using video `10231.mp4`. You might want to explore other videos in this tutorial:

- Example 1: `vtest.avi`
- Example 2: `10142.mp4`
- Example 3: `10231.mp4` (Completed)
- Example 4: `10236.mp4`

### 🎉 Congratulations! You have finished this lab tutorial.