# <strong style="color: tomato;">Capstone Project</strong>
---

## <span style="color: yellowgreen;">1. </span>Introduction.

We will be creating a program that can detect a hand, segment the hand, and count the number of fingers being held up.

&ensp;


## <span style="color: yellowgreen;">2. </span>Variables and background.

**Part One**:

- First let’s define some global variables.
- Afterwards, we’ll set up a function that updates a running average of the background values in an ROI (region of interest). This will later allow us to detect new objects (the hand) entering the ROI.

&ensp;

**Strategy for counting fingers**:
- Grab an ROI.
- Calculate a running average of the background values for the first 60 frames of video.
- Once the average is found, the hand can enter the ROI.

When a hand enters the ROI, the frame differs from the averaged background, so we can detect the change and apply thresholding. After the thresholding:

- We will use a Convex Hull to draw a polygon around the hand.
- Using some math, we’ll estimate the center of the hand and compare it against the outer points of the hull to infer the finger count. Keep in mind this strategy won’t be perfect.

**Imports**:

In [22]:
import cv2
import numpy as np

# used for distance calculation later on
from sklearn.metrics import pairwise

**Global variables**:

In [23]:
# this will be a global variable that we update through a few functions
background = None

# start with a halfway point between 0 and 1 of accumulated weight
accumulated_weight = 0.5

# manually set up our ROI for grabbing the hand.
# feel free to change these (top right); the columns run from roi_right (300) to roi_left (600)
roi_top = 20
roi_bottom = 300
roi_right = 300
roi_left = 600
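
To see what these four values select, here is a minimal sketch of the slicing on a blank frame. The 480x640 frame size is an assumption for illustration only; a real webcam may deliver a different resolution.

```python
import numpy as np

# Hypothetical blank 480x640 BGR frame standing in for a webcam image.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

roi_top, roi_bottom = 20, 300
roi_right, roi_left = 300, 600

# NumPy images are indexed as [rows (y), columns (x)].
roi = frame[roi_top:roi_bottom, roi_right:roi_left]
print(roi.shape)  # (280, 300, 3)
```

The ROI is 280 pixels tall and 300 pixels wide, and it is a view into the frame rather than a copy.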

**Finding average background value**:

We rely on cv2.accumulateWeighted, which calculates the weighted sum of the input image (src) and the accumulator (dst) so that dst becomes a running average of a frame sequence. Our function wraps that call:

In [24]:
def calculate_accumulated_avg(frame, accumulated_weight):
    '''
    Given a frame and an accumulated weight, update the running weighted average stored in the global background.
    '''

    global background

    # the first time through, create the background from a copy of the frame.
    if background is None:
        background = frame.copy().astype('float')
        return None

    # compute the weighted average and accumulate it into the background in place.
    # nothing is returned: the result lives in the global background.
    # arguments: src, dst, alpha
    cv2.accumulateWeighted(frame, background, accumulated_weight)
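
As a rough illustration of the update rule, the sketch below reproduces the arithmetic in plain NumPy: the new background is (1 - alpha) * background + alpha * frame. The helper name running_average and the tiny 2x2 frames are made up for this example; the notebook itself uses cv2.accumulateWeighted.

```python
import numpy as np

def running_average(background, frame, alpha):
    """NumPy stand-in for the update rule: (1 - alpha) * background + alpha * frame."""
    return (1.0 - alpha) * background + alpha * frame.astype("float")

# Three hypothetical 2x2 grayscale frames with constant brightness.
frames = [np.full((2, 2), value, dtype=np.uint8) for value in (100, 120, 80)]

# The first frame seeds the background, as in calculate_accumulated_avg.
background = frames[0].astype("float")
for frame in frames[1:]:
    background = running_average(background, frame, alpha=0.5)

print(background[0, 0])  # 95.0
```

With a weight of 0.5, each new frame counts as much as everything seen before it: 100 and 120 average to 110, then 110 and 80 average to 95.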


The next step is to use thresholding to grab the hand segment from the ROI. We will create a function that can do this.

&ensp;


## <span style="color: yellowgreen;">3. </span>Segmentation.

**Segment the hand region in frame**:

In [25]:
def segment(frame, threshold=25): # you may want to play around with this value
    global background

    # calculate the absolute difference between the background and the passed-in frame
    diff = cv2.absdiff(background.astype('uint8'), frame)

    # Apply a threshold to the image so we can grab the foreground
    # We only need the thresholded image, so we can throw away the first item in the tuple with an underscore _
    _, thresholded = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)

    # grab external contours from the image
    # findContours returns (image, contours, hierarchy) in OpenCV 3 and (contours, hierarchy) in OpenCV 4,
    # so we take the last two items to work with either version
    contours, hierarchy = cv2.findContours(thresholded.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

    # if the length of the contours list is 0, then we didn't grab any contours => return None
    if len(contours) == 0:
        return None
    else:
        # assuming the largest external contour in ROI is the hand (largest by area)
        # This will be our segment
        hand_segment = max(contours, key=cv2.contourArea)

        # return both the hand segment and the thresholded hand image
        return (thresholded, hand_segment)
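
To make the difference-then-threshold step concrete, here is a small NumPy sketch of the same arithmetic on a 2x3 image. The helper threshold_difference is a hypothetical stand-in for the cv2.absdiff and cv2.threshold calls above, and the pixel values are made up.

```python
import numpy as np

def threshold_difference(background, frame, threshold=25):
    # Cast to a signed type first so the subtraction cannot wrap around.
    diff = np.abs(background.astype(np.int16) - frame.astype(np.int16))
    # Binary threshold: pixels that changed by more than `threshold` become 255.
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

background = np.array([[10, 10, 10],
                       [10, 10, 10]], dtype=np.uint8)
frame = np.array([[12, 200, 10],
                  [36, 35, 0]], dtype=np.uint8)

mask = threshold_difference(background, frame)
print(mask.tolist())  # [[0, 255, 0], [255, 0, 0]]
```

Note that a difference of exactly 25 stays black: only differences strictly greater than the threshold are kept as foreground.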


## <span style="color: yellowgreen;">4. </span>Counting and Convex Hull.

**Counting fingers with a Convex Hull**:

Now that we have the hand segment, the next step is to actually count the fingers being held up. We can do this by utilizing a **Convex Hull**. The convex hull of a set of points is the smallest convex polygon that encloses all of them, so it is drawn by connecting the outermost points in a frame.

<div style="display: flex; justify-content: center; align-items: center; text-align: center;"><div style="margin-top: 0.5em; margin-bottom: -0.3em; width: 35%;">
<img src="./src/img/CAP/CAP_1.png">
</div></div>
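
For intuition about which points end up on the hull, here is a minimal pure-Python sketch using Andrew's monotone chain algorithm. This is a swapped-in illustration, not how cv2.convexHull is implemented, and the sample points are made up.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices of a list of (x, y) tuples (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower chain left to right, dropping points that do not turn left.
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper chain right to left in the same way.
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(points)
print(hull)  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```

The interior points (2, 2) and (1, 3) are dropped; only the four corners remain on the hull.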

&ensp;

In our case, this set of points is actually just our thresholded image of a hand (and the external contour information):

<div style="display: flex; justify-content: center; align-items: center; text-align: center;"><div style="margin-top: 0.5em; margin-bottom: -0.3em; width: 35%;">
<img src="./src/img/CAP/CAP_2.png">
</div></div>

&ensp;

We can expect a general shape of our polygon to be something like this (notice that we’ll need to account for lines from the wrist):

<div style="display: flex; justify-content: center; align-items: center; text-align: center;"><div style="margin-top: 0.5em; margin-bottom: -0.3em; width: 35%;">
<img src="./src/img/CAP/CAP_3.png">
</div></div>

&ensp;

- First we will calculate the most extreme points (top, bottom, left, and right).
- We can then take the point halfway between them (roughly where the top-bottom and left-right lines cross) as an estimate of the center of the hand.
- Next we will calculate the distance from the center to the extreme point furthest away from it.

<div style="display: flex; justify-content: center; align-items: center; text-align: center;"><div style="margin-top: 0.5em; margin-bottom: -0.3em; width: 35%;">
<img src="./src/img/CAP/CAP_4.png">
</div></div>
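
The three steps above can be sketched on a tiny, made-up hull. The array uses the (N, 1, 2) layout that OpenCV contours have, and np.linalg.norm is used here in place of sklearn's pairwise.euclidean_distances purely to keep the sketch NumPy-only.

```python
import numpy as np

# Hypothetical hull in OpenCV's contour layout: shape (N, 1, 2), entries [[x, y]].
hull = np.array([[[50, 10]], [[90, 60]], [[60, 120]], [[10, 70]]], dtype=np.int32)

# Column 1 holds y values (top/bottom), column 0 holds x values (left/right).
top = tuple(hull[hull[:, :, 1].argmin()][0])
bottom = tuple(hull[hull[:, :, 1].argmax()][0])
left = tuple(hull[hull[:, :, 0].argmin()][0])
right = tuple(hull[hull[:, :, 0].argmax()][0])

# Estimate the center as the midpoint between the extremes.
center_x = int((left[0] + right[0]) // 2)
center_y = int((top[1] + bottom[1]) // 2)

# Distance from the center to each extreme point; keep the largest.
extremes = np.array([left, right, top, bottom], dtype=float)
distances = np.linalg.norm(extremes - np.array([center_x, center_y]), axis=1)
max_distance = distances.max()

print((center_x, center_y))  # (50, 65)
```

Here the bottom point (60, 120) is the furthest from the center, at a distance of about 55.9 pixels.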

&ensp;

Then, using a ratio of that distance as the radius, we create a circle around the center. Any points outside of this circle that are far enough away from the bottom should be extended fingers.

<div style="display: flex; justify-content: center; align-items: center; text-align: center;"><div style="margin-top: 0.5em; margin-bottom: -0.3em; width: 35%;">
<img src="./src/img/CAP/CAP_5.png">
</div></div>
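
As a rough NumPy-only sketch of the circle idea, the code below builds a thin ring mask and keeps only the foreground pixels that fall on it. The helper ring_mask, the image size, and the rectangular "finger" blob are all assumptions for illustration; the notebook itself draws the ring with cv2.circle and masks with cv2.bitwise_and.

```python
import numpy as np

def ring_mask(shape, center, radius, thickness):
    """Return a uint8 mask that is 255 on a ring around `center`, 0 elsewhere."""
    height, width = shape
    cx, cy = center
    yy, xx = np.ogrid[:height, :width]
    dist = np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)
    ring = np.abs(dist - radius) <= thickness / 2
    return ring.astype(np.uint8) * 255

# Hypothetical thresholded image with one vertical "finger" blob.
thresholded = np.zeros((100, 100), dtype=np.uint8)
thresholded[10:60, 45:55] = 255

mask = ring_mask(thresholded.shape, center=(50, 60), radius=30, thickness=4)

# Keep only the foreground pixels that lie on the ring.
ring_hits = thresholded & mask
```

The finger crosses the ring directly above the center, so ring_hits is white there, while ring pixels to the side of the center stay black because there is no foreground under them.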

In [26]:
def count_fingers(thresholded, hand_segment):
    # global background

    # Calculate the convex hull of the hand segment
    conv_hull = cv2.convexHull(hand_segment)

    # grab the extreme points

    # The convex hull will have at least 4 outermost points: on the top, bottom, left, and right.
    # Let's grab those points by using argmin and argmax. Keep in mind, this requires reading the documentation
    # and understanding the general array shape returned by the convex hull.
    # We rely on the documentation here, as the implementation of the convex hull is complicated

    # Find the top, bottom, left, and right. Then make sure they are in tuple format
    top    = tuple(conv_hull[conv_hull[:, :, 1].argmin()][0]) # tuple of X and Y coordinates
    bottom = tuple(conv_hull[conv_hull[:, :, 1].argmax()][0])
    left   = tuple(conv_hull[conv_hull[:, :, 0].argmin()][0])
    right  = tuple(conv_hull[conv_hull[:, :, 0].argmax()][0])

    # In theory, the center of the hand is halfway between the top and bottom and halfway between left and right
    centerX = int((left[0] + right[0]) // 2) # cast to a plain int for the drawing functions
    centerY = int((top[1] + bottom[1]) // 2)

    # find the maximum euclidean distance between the center of the palm
    # and the most extreme points of the convex hull
    distance = pairwise.euclidean_distances([(centerX, centerY)], Y=[left, right, top, bottom])[0]

    # Grab the largest distance
    max_distance = distance.max()

    # Create a circle whose radius is 85% of the max euclidean distance
    radius = int(max_distance * 0.85)
    circumference = (2 * np.pi * radius)

    # Now grab an ROI of only that circle
    # we only want the height and width, we do not care about the color channel
    circular_roi = np.zeros(thresholded.shape[:2], dtype="uint8")

    # draw the circular ROI
    cv2.circle(circular_roi, (centerX, centerY), radius, 255, 10)

    # Use bit-wise AND with the circle ROI as a mask.
    # This returns the cut-out obtained by applying the mask to the thresholded hand image.
    circular_roi = cv2.bitwise_and(thresholded, thresholded, mask=circular_roi)

    # grab all the contours in this circular ROI image
    # taking the last two return values works with both OpenCV 3 and OpenCV 4
    contours, hierarchy = cv2.findContours(circular_roi.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]

    # count the no of points outside the circle
    # Finger count starts at 0
    count = 0

    # loop through the contours to see if we count any more fingers.
    for cnt in contours:

        # Bounding box of contour
        (x, y, w, h) = cv2.boundingRect(cnt)

        # Increment count of fingers based on two conditions:
        
        # 1. Contour region is not the very bottom of hand area (the wrist)
        out_of_wrist = ((centerY + (centerY * 0.25)) > (y + h))
                
        # 2. Number of points along the contour does not exceed 25% of the circumference of the circular ROI (otherwise we're counting points off the hand)
        limit_points = ((circumference * 0.25) > cnt.shape[0])

        if out_of_wrist and limit_points:
            count += 1

    return count
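
The two conditions inside the loop can be looked at in isolation. The sketch below pulls them into a small helper, looks_like_finger, which is a hypothetical name used only here; the bounding boxes and point counts are made-up values.

```python
import math

def looks_like_finger(bounding_box, num_points, center_y, radius):
    """Apply the two filters from count_fingers to a single contour."""
    x, y, w, h = bounding_box
    circumference = 2 * math.pi * radius

    # 1. The bottom edge of the contour must sit above the wrist line.
    out_of_wrist = (center_y + center_y * 0.25) > (y + h)

    # 2. The contour must cover less than a quarter of the circle.
    limit_points = (circumference * 0.25) > num_points

    return out_of_wrist and limit_points

# Assumed values: hand center at y=100, ring radius of 50 pixels.
finger = looks_like_finger((40, 30, 12, 20), num_points=30, center_y=100, radius=50)
wrist = looks_like_finger((40, 120, 30, 20), num_points=30, center_y=100, radius=50)
wide_arc = looks_like_finger((10, 20, 80, 40), num_points=120, center_y=100, radius=50)

print(finger, wrist, wide_arc)  # True False False
```

The second contour is rejected because its bottom edge (y + h = 140) lies below the wrist line at 125, and the third because its 120 points exceed a quarter of the circumference (about 78.5).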

## <span style="color: yellowgreen;">5. </span>Bringing it all together.

**Running the program**:

In [29]:
cam = cv2.VideoCapture(0)

# Initialize a frame count; the first frames are used to calculate the background
num_frames = 0

while True:
    # get the current frame
    ret, frame = cam.read()

    # stop if the camera did not return a frame
    if not ret:
        break

    # flip the frame horizontally (left-right)
    frame = cv2.flip(frame, 1)

    # clone the frame
    frame_copy = frame.copy()

    # Grab the ROI from the frame
    roi = frame_copy[roi_top:roi_bottom, roi_right:roi_left]

    # Apply grayscale and blur to ROI
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)

    # For the first 60 frames we will calculate the average of the background.
    # We will tell the user while this is happening
    if num_frames < 60:
        calculate_accumulated_avg(gray, accumulated_weight)

        cv2.putText(frame_copy, "WAIT! GETTING BACKGROUND AVG.", (200, 400), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        cv2.imshow("Finger Count", frame_copy)

    else:
        # now that we have the background, we can segment the hand.
        
        # segment the hand region
        hand = segment(gray)

        # First check if we were able to actually detect a hand
        if hand is not None:
            
            # unpack
            thresholded, hand_segment = hand

            # Draw contours around hand segment in live stream
            cv2.drawContours(frame_copy, [hand_segment + (roi_right, roi_top)], -1, (255, 0, 0), 3)

            # Count the fingers
            fingers = count_fingers(thresholded, hand_segment)

            # Display count
            cv2.putText(frame_copy, str(fingers), (70, 85), cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 0, 255), 2)

            # Also display the thresholded image
            cv2.imshow("Thresholded", thresholded)
            
    # Draw ROI Rectangle on frame copy
    cv2.rectangle(frame_copy, (roi_left, roi_top), (roi_right, roi_bottom), (0, 0, 255), 5)

    # increment the number of frames for tracking
    num_frames += 1

    # Display the frame with segmented hand
    cv2.imshow("Finger Count", frame_copy)


    # Close windows with Esc
    k = cv2.waitKey(1) & 0xFF

    if k == 27:
        break

# Release the camera and destroy all the windows
cam.release()
cv2.destroyAllWindows()
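
One detail in the loop is worth a closer look: the contour returned by segment is expressed in ROI coordinates, so the (roi_right, roi_top) offset is added before drawing it on the full frame. A minimal sketch of that shift, using a made-up three-point contour:

```python
import numpy as np

roi_top, roi_right = 20, 300

# Hypothetical contour in ROI coordinates, in OpenCV's (N, 1, 2) layout of [[x, y]].
hand_segment = np.array([[[0, 0]], [[10, 5]], [[3, 40]]], dtype=np.int32)

# Broadcasting adds roi_right to every x and roi_top to every y.
shifted = hand_segment + (roi_right, roi_top)

print(shifted.tolist())  # [[[300, 20]], [[310, 25]], [[303, 60]]]
```

The shape of the array is unchanged; only the coordinates move from the ROI's origin to the frame's origin.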