# **<center><font style="color:rgb(100,109,254)">Module 3: Advanced Gesture Controlled Shape/Object Manipulation </font></center>**

<center>
    <img src='https://drive.google.com/uc?export=download&id=1OGxEgnz1eeMKP-y9dvYtBfV1Zc5VR1to'>
    <a href='https://www.microsoft.com/en-us/hololens/developers'>HoloLens photo courtesy of Microsoft</a>
</center>


## **<font style="color:rgb(134,19,348)"> Module Outline </font>**

The module can be split into the following parts:


- ***Lesson 1:* Create a Basic Hand Paint Application** *(This Tutorial)*

- *Lesson 2: Add Adjustable Paint Color Functionality* 

- *Lesson 3: Draw Shapes/Objects utilizing Hand Gestures*

- *Lesson 4: Manipulate Shapes/Objects utilizing Hand Gestures*

**Please note:** these Jupyter Notebooks are not for sharing. Please read the copyright message under the Code License Agreement section, which is in the last cell of this notebook.
-Taha Anwar

Alright, without further ado, let's dive in.

### **<font style="color:rgb(134,19,348)"> Import the Libraries</font>**

First, we will import the required libraries.

In [1]:
import cv2
import numpy as np
import mediapipe as mp
from collections import deque
from previous_lesson import (detectHandsLandmarks, recognizeGestures)

## **<font style="color:rgb(134,19,348)">Initialize the Hands Landmarks Detection Model</font>**

Next, we initialize the **`mp.solutions.hands`** class and set up the **`mp.solutions.hands.Hands()`** function with appropriate arguments. We also initialize the **`mp.solutions.drawing_utils`** class, which is needed to visualize the detected landmarks, as in the previous lessons.

In [2]:
# Initialize the mediapipe hands class.
mp_hands = mp.solutions.hands

# Set up the Hands functions for videos.
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=2, 
                       min_detection_confidence=0.8, min_tracking_confidence=0.8)

# Initialize the mediapipe drawing class.
mp_drawing = mp.solutions.drawing_utils

## **<font style="color:rgb(134,19,348)">Create a Function to Draw utilizing Hands Landmarks</font>**

Now we will create a function **`draw()`** that uses the hand gestures recognized by the functions from the previous lessons to draw (with the ☝️ gesture), erase (with the ✋ gesture), and clear all the drawings (with the 🤟 gesture) on a `canvas` *(i.e. just an empty black image)*.

In [4]:
def draw(frame, canvas, current_gesture, hands_tips_positions, prev_coordinates, paint_color, brush_size=20, eraser_size=80):
    '''
    This function will draw, erase and clear a canvas based on different hand gestures.
    Args:
        frame:                A frame/image from the webcam feed.
        canvas:               A black image equal to the webcam feed size, to draw on.
        current_gesture:      The current gesture of the hand recognized using our gesture recognizer from a previous lesson.
        hands_tips_positions: A dictionary containing the landmarks of the tips of the fingers of a hand.
        prev_coordinates:     The hand brush x and y coordinates from the previous frame.
        paint_color:          The color to draw with, on the canvas.
        brush_size:           The size of the paint brush to draw with, on the canvas.
        eraser_size:          The size of the eraser to erase with, on the canvas.
    Returns:
        canvas:           The black image with the intended drawings on it, in the paint color.
        prev_coordinates: The updated hand brush x and y coordinates, to pass in with the next frame.
    '''
    
    # Get the hand brush previous x and y coordinates values (i.e. from the previous frame).
    prev_x, prev_y = prev_coordinates
    
    # Get the height and width of the frame of the webcam video.
    frame_height, frame_width, _ = frame.shape
    
    # Check if the current hand gesture is INDEX POINTING UP.
    if current_gesture == 'INDEX POINTING UP':

        # Write the current mode on the frame with the paint color.
        cv2.putText(img=frame, text='Paint Mode Enabled', org=(10, frame_height-30),
                    fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1,
                    color=paint_color, thickness=2)

        # Get the x and y coordinates of tip of the index finger of the hand.
        x, y = hands_tips_positions['INDEX']

        # Check if x and y have valid values.
        # These will be None if the selected hand was not detected in the frame.
        # This check is only necessary if you recognize the gesture of one hand but
        # use the tips landmarks of the other one. We are not doing that right now,
        # so you can remove this check if you want.
        # Comparing against None (instead of relying on truthiness) keeps 0 a valid coordinate.
        if x is not None and y is not None:

            # Check if the previous x and y do not have valid values.
            if prev_x is None or prev_y is None:

                # Set the previous x and y to the current x and y values.
                prev_x, prev_y = x, y

            # Draw a line on the canvas from previous x and y to the current x and y with the paint color 
            # and thickness equal to the brush_size.
            cv2.line(img=canvas, pt1=(prev_x, prev_y), pt2=(x, y), color=paint_color, thickness=brush_size)

            # Update the previous x and y to the current x and y values.
            prev_x, prev_y = x, y

        
    # Check if the current hand gesture is HIGH-FIVE.
    elif current_gesture == 'HIGH-FIVE':

        # Write the current mode on the frame.
        cv2.putText(img=frame, text='Erase Mode Enabled', org=(10, frame_height-30),
                    fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, 
                    color=paint_color, thickness=2)

        # Get the x and y coordinates of tip of the middle finger of the hand.
        x1, y = hands_tips_positions['MIDDLE'] 

        # Get the x coordinate of tip of the ring finger of the hand.
        x2, _ = hands_tips_positions['RING'] 

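        # Check if the selected hand was detected in the frame.
        # Comparing against None (instead of relying on truthiness) keeps 0 a valid coordinate.
        if x1 is not None and x2 is not None and y is not None:

            # Calculate the midpoint between the tip x coordinates of the middle and ring fingers.
            x = (x1 + x2) // 2

            # Check if the previous x and y do not have valid values.
            if prev_x is None or prev_y is None: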

                # Set the previous x and y to the current x and y values.
                prev_x, prev_y = x, y

            # Draw a circle on the frame at the current x and y coordinates, equal to the eraser size.
            # This is drawn just to represent an eraser on the current x and y values.
            cv2.circle(img=frame, center=(x, y), radius=int(eraser_size/2), color=(255,255,255), thickness=-1)

            # Draw a black line on the canvas from previous x and y to the current x and y.
            # This will erase the paint between previous x and y and the current x and y.
            cv2.line(img=canvas, pt1=(prev_x, prev_y), pt2=(x, y), color=(0,0,0), thickness=eraser_size)

            # Update the previous x and y to the current x and y values.
            prev_x, prev_y = x, y
    
    # Check if the current hand gesture is SPIDERMAN.
    elif current_gesture == 'SPIDERMAN':

        # Write 'Clear Everything' on the frame.
        cv2.putText(img=frame, text='Clear Everything', org=(10, frame_height-30), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=1, color=paint_color, thickness=2)

        # Clear the canvas, by re-initializing it to a complete black image.
        canvas = np.zeros((frame_height, frame_width, 3), np.uint8)
    
    # Return the canvas along with the previous x and y coordinates.
    return canvas, (prev_x, prev_y)
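
To see why **`draw()`** joins the previous and current fingertip positions with a line instead of only marking the current position, consider the small sketch below. It is not part of the application: `stamp_stroke()` is a hypothetical, simplified NumPy-only stand-in for `cv2.line()`, and the fingertip positions are made-up values. The fingertip can move many pixels between two frames, so marking only the detected positions would leave gaps in the stroke.

```python
import numpy as np

def stamp_stroke(canvas, prev_point, point, color, brush_size):
    """Paint a stroke from prev_point to point by stamping square brush marks.

    A simplified stand-in for cv2.line(), used here for illustration only.
    """
    (x0, y0), (x1, y1) = prev_point, point
    # Use one stamp per pixel of travel so that consecutive stamps overlap.
    steps = max(abs(x1 - x0), abs(y1 - y0)) + 1
    half = brush_size // 2
    xs = np.linspace(x0, x1, steps).round().astype(int)
    ys = np.linspace(y0, y1, steps).round().astype(int)
    for x, y in zip(xs, ys):
        canvas[max(y - half, 0):y + half + 1, max(x - half, 0):x + half + 1] = color
    return canvas

green = (0, 255, 0)
blank = np.zeros((100, 200, 3), np.uint8)

# Fingertip positions reported in two consecutive frames (made-up values).
prev_point, point = (20, 50), (180, 50)

# Marking only the two detected positions leaves a gap between them.
dots_only = blank.copy()
for p in (prev_point, point):
    stamp_stroke(dots_only, p, p, green, brush_size=5)

# Connecting the previous position to the current one fills the gap.
connected = stamp_stroke(blank.copy(), prev_point, point, green, brush_size=5)

# Check the pixel halfway between the two positions.
print(dots_only[50, 100].any(), connected[50, 100].any())  # False True
```

This is also why the previous coordinates are reset to `None` whenever drawing stops: otherwise the next stroke would start with a line from wherever the last one ended.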


Now that we have the **`draw()`** function, we will use it to paint with hand gestures on a webcam feed by copying the drawings from the `canvas` to the frames of the video (webcam feed) in real time. You may be wondering why we don't draw directly on the frames. To get the intuition behind this, we first need to understand what exactly videos are. It's no secret that a video is just a sequence of still images (a.k.a. frames) that are updated really fast, creating the appearance of motion. Consider the video (converted into .gif format) below of a cat jumping on a bookshelf; it is just a combination of 15 different still images that are shown one after the other.

<center><img src='https://lh6.googleusercontent.com/y1-ePQlu3v8HKxuwPHBFxxg70iudUmRhwPiNWMKPKMMyoIl2TSz4jYjgGV_K0XGWrjrxanhIHRqlsgdLbyXFMx8bt1g2hQeQPzwLNW6tpUIr9mBq-53A4msRDBGEqSmsUgzksyC7=s0'></center>


So if we draw on a frame, that drawing will disappear from the webcam feed as soon as the frame is updated, which normally happens within milliseconds. That's why we need a `canvas` to keep track of all the drawings that we want to visualize on a webcam feed.
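
Here is a minimal sketch of that idea, using two tiny synthetic arrays in place of real webcam frames. The `overlay()` helper and the pixel values are assumptions made for illustration; the masking step follows the same principle as the copy step in the main loop further down.

```python
import numpy as np

def overlay(frame, canvas):
    """Return a copy of frame with the non-black canvas pixels copied onto it."""
    drawn = canvas.any(axis=2)  # True wherever any color channel is non-zero
    result = frame.copy()
    result[drawn] = canvas[drawn]
    return result

height, width = 4, 6

# Two consecutive "webcam frames" (uniform gray images with made-up values).
frame_1 = np.full((height, width, 3), 50, np.uint8)
frame_2 = np.full((height, width, 3), 90, np.uint8)

# Option 1: paint a green pixel directly on the first frame.
frame_1[1, 2] = (0, 255, 0)
# The next frame comes fresh from the camera, so the paint is gone.
lost = frame_2[1, 2].tolist()

# Option 2: paint on a persistent canvas and overlay it on every frame.
canvas = np.zeros((height, width, 3), np.uint8)
canvas[1, 2] = (0, 255, 0)
kept = overlay(frame_2, canvas)[1, 2].tolist()

print(lost, kept)  # [90, 90, 90] [0, 255, 0]
```

A side effect of this masking approach is that pure black cannot be used as a paint color, because black canvas pixels are treated as empty.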

In [4]:
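# Initialize the VideoCapture object to read from the webcam.
# Note: cv2.CAP_DSHOW selects the DirectShow backend, which is only available on Windows.
# On other platforms, use cv2.VideoCapture(0) instead.
camera_video = cv2.VideoCapture(0, cv2.CAP_DSHOW)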


# Create named window for resizing purposes.
cv2.namedWindow('Hand Paint', cv2.WINDOW_NORMAL)

# Initialize variables to store the previous x and y location.
# These are the hand brush x and y coordinates in the previous frame.
prev_x = None 
prev_y = None

# Initialize a canvas to draw on.
canvas = np.zeros(shape=(int(camera_video.get(cv2.CAP_PROP_FRAME_HEIGHT)),
                         int(camera_video.get(cv2.CAP_PROP_FRAME_WIDTH)), 3),
                  dtype=np.uint8)

# Initialize a variable to store the color value.
paint_color = 0, 255, 0

# Initialize a variable to store the buffer length.
BUFFER_MAX_LENGTH = 10

# Initialize a buffer to store recognized gestures.
buffer = deque([], maxlen=BUFFER_MAX_LENGTH)

# Initialize a variable to store the hand label for gesture recognition.
hand_label = 'RIGHT'

# Initialize a variable to store the erase mode.
erase_mode = False

# Iterate while the webcam is open.
while camera_video.isOpened():
   
    # Read a frame.
    ok, frame = camera_video.read()
    
    # Check if frame is not read properly then 
    # continue to the next iteration to read the next frame.
    if not ok:
        continue
    
    # Flip the frame horizontally for natural (selfie-view) visualization.
    frame = cv2.flip(frame, 1)

    # Get the height and width of the frame of the webcam video.
    frame_height, frame_width, _ = frame.shape
    
    # Perform Hands landmarks detection on the frame.
    frame, results = detectHandsLandmarks(frame, hands, draw=True, display=False)
    
    # Check if the hands landmarks in the frame are detected.
    if results.multi_hand_landmarks:
        
        # Perform hand gesture recognition.
        # I have modified this recognizeGestures() function
        # to return the fingertip positions of both hands.
        current_gesture, hands_tips_positions = recognizeGestures(frame, results,
                                                                  hand_label, draw=False,
                                                                  display=False)
        # Check if a known gesture is recognized.
        if current_gesture != 'UNKNOWN':
            
            # Check if all the gestures stored in the buffer are equal to the current gesture.
            if all(current_gesture==gesture for gesture in buffer):
                
                # Append the current gesture into the buffer.
                buffer.append(current_gesture)
                
            # Otherwise.
            else:
                
                # Clear the buffer.
                buffer.clear()
            
            # Check if the length of the buffer is equal to the maxlength, that is 10.
            if len(buffer) == BUFFER_MAX_LENGTH:
                
                # Draw, Erase or Clear the canvas depending upon the current gesture.
                canvas, (prev_x, prev_y) = draw(frame, canvas, current_gesture,
                                                hands_tips_positions[hand_label],
                                                (prev_x, prev_y), paint_color)

            # Otherwise.
            else:

                # Reset, by updating the previous x and y values to None.
                # This is required to start a new drawing.
                prev_x, prev_y = None, None
               
    # Otherwise.
    else:
        
        # Clear the buffer.
        buffer.clear()
        
    # Write instructions to switch hand for gesture recognition on the frame.
    cv2.putText(img=frame, text=f'{hand_label} hand selected, press s to switch.',
                org=(10, 30), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, color=paint_color,
                thickness=2)

    # Find the pixels where the canvas is not black, i.e. where something is drawn,
    # and update the frame with the canvas's values at those pixels.
    # In short, this will copy the drawings from canvas to the frame.
    drawn_mask = np.any(canvas, axis=2)
    frame[drawn_mask] = canvas[drawn_mask]
    
    # Display the frame.
    cv2.imshow("Hand Paint", frame)
    
    # Wait for 1ms. If a key is pressed, retrieve the ASCII code of the key.
    k = cv2.waitKey(1) & 0xFF
    
    # Check if 'ESC' is pressed and break the loop.
    if k == 27:
        break
    
    # Check if 's' key is pressed and switch the hand label.
    elif k == ord('s'):
        
        # Set gesture hand label to 'LEFT', if it was 'RIGHT',
        # Otherwise if it was 'LEFT', set it to 'RIGHT'.
        hand_label = 'LEFT' if hand_label == 'RIGHT' else 'RIGHT'
        
# Release the VideoCapture Object and close the windows.
camera_video.release()
cv2.destroyAllWindows()




# Additional comments:
#           - In summary, we use hand gestures to draw and erase on
#             a black canvas equal in size to our video frame.
#           - The program keeps tracking the user's gestures. Once the
#             set conditions have been met (i.e. the buffer is full and the fingertip
#             positions are valid), the program performs the corresponding action.
#           - The user's action affects the canvas. In each frame, all
#             the canvas pixels that are not black are copied into the video frame.
#           - As for the drawing function, it takes the previous and current
#             x and y coordinates of the index fingertip and draws
#             a line connecting the two points.
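
The gesture buffer used in the loop above can also be studied in isolation. The sketch below mirrors the same append-or-clear logic on a made-up stream of gestures; the `update()` helper is a hypothetical name, and the buffer length is shortened to 3 (instead of 10) only to keep the trace readable.

```python
from collections import deque

BUFFER_MAX_LENGTH = 3  # The application uses 10; 3 keeps this trace short.

def update(buffer, gesture):
    """Feed one recognized gesture in and report whether it is stable yet."""
    # Same rule as in the main loop: keep the gesture only if it matches
    # everything already stored, otherwise start over with an empty buffer.
    if all(gesture == stored for stored in buffer):
        buffer.append(gesture)
    else:
        buffer.clear()
    return len(buffer) == buffer.maxlen

buffer = deque([], maxlen=BUFFER_MAX_LENGTH)

# A made-up stream: one misrecognized HIGH-FIVE interrupts the pointing gesture.
stream = ['INDEX POINTING UP'] * 4 + ['HIGH-FIVE'] + ['INDEX POINTING UP'] * 3
decisions = [update(buffer, gesture) for gesture in stream]

print(decisions)
# [False, False, True, True, False, False, False, True]
```

The application acts on a gesture only after it has been seen in enough consecutive frames, so a single misrecognized frame pauses drawing for a few frames rather than triggering an erase or a clear.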

Awesome! The application is working as intended. We can draw anything we want, but only with one color at a time, so let's change that in the next lesson.

### **<font style="color:rgb(255,140,0)"> Code License Agreement </font>**
```
Copyright (c) 2022 Bleedai.com

Feel free to use this code for your own projects commercial or noncommercial, these projects can be Research-based, just for fun, for-profit, or even Education with the exception that you’re not going to use it for developing a course, book, guide, or any other educational products.

Under *NO CONDITION OR CIRCUMSTANCE* you may use this code for your own paid educational or self-promotional ventures without written consent from Taha Anwar (BleedAI.com).

```
