# Gesture Recognition Based on MediaPipe

This section introduces how to implement gesture recognition using MediaPipe + OpenCV.

## What is MediaPipe?

MediaPipe is an open-source framework developed by Google for building machine learning-based multimedia processing applications. It provides a set of tools and libraries for processing video, audio, and image data, and applies machine learning models to achieve various functionalities such as pose estimation, gesture recognition, and face detection. MediaPipe is designed to offer efficient, flexible, and easy-to-use solutions, enabling developers to quickly build a variety of multimedia processing applications.

## Preparation

The robot automatically runs its main program at startup, and that program occupies the camera, so this tutorial cannot run while it is active. Before continuing, either terminate the main program, or disable its automatic startup and then restart the robot.

Note that because the robot's main program is multi-threaded and is configured to start automatically through crontab, the usual `sudo killall python` typically does not work. Therefore, we introduce the method of disabling the automatic startup of the main program here.

### Terminate the Main Program

1. Click the "+" icon next to the tab for this page to open a new tab called "Launcher."
2. Click on "Terminal" under "Other" to open a terminal window.
3. Type `bash` into the terminal window and press Enter.
4. Now you can use the Bash Shell to control the robot.
5. Enter the command: `sudo killall -9 python`.

## Example

The following code block can be run directly:

1. Select the code block below.
2. Press Shift + Enter to run the code block.
3. Watch the real-time video window.
4. Press `STOP` to close the real-time video and release the camera resources.

### If you cannot see the real-time camera feed when running:

- Click on Kernel -> Shut down all kernels above.
- Close the current section tab and open it again.
- Click `STOP` to release the camera resources, then run the code block again.
- Reboot the device.
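
Before working through the steps above, it can help to confirm whether the camera is actually free. The following is a minimal sketch, not part of the product's software: `camera_available` is a hypothetical helper, and the `open_capture` parameter exists only so the function can be exercised without a real camera. By default it assumes OpenCV's `cv2.VideoCapture`, the same class the example in this section uses.

```python
def camera_available(index=-1, open_capture=None):
    """Return True if a frame can be read from the camera at `index`.

    `open_capture` is a hypothetical hook: any callable that returns an
    object with isOpened(), read() and release(), like cv2.VideoCapture.
    """
    if open_capture is None:
        import cv2  # Imported lazily so the helper can be tested without OpenCV
        open_capture = cv2.VideoCapture

    cap = open_capture(index)
    try:
        if not cap.isOpened():
            return False  # Device missing or held by another process
        ok, _ = cap.read()
        return bool(ok)
    finally:
        cap.release()  # Always release so the check itself does not hold the camera
```

Calling `camera_available()` in a notebook cell returns `False` when the camera cannot be opened or read, which usually means the main program or an earlier run still holds it.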

### Features of this Section

When the code block runs successfully, place your hand in front of the camera. The real-time video frame displays annotations marking the joints of the hand, and these annotations follow the movement of your hand. The code also computes the position of each joint, which you can use for further gesture-control development.

MediaPipe assigns each hand joint (landmark) a name and a number. You can retrieve the position of a joint by looking it up with its name or number, for example `handLms.landmark[mpHands.HandLandmark.INDEX_FINGER_TIP]`.

#### MediaPipe Hand

The numbers below are the zero-based indices used by MediaPipe, so `WRIST` is 0 and `PINKY_TIP` is 20.

0. WRIST
1. THUMB_CMC
2. THUMB_MCP
3. THUMB_IP
4. THUMB_TIP
5. INDEX_FINGER_MCP
6. INDEX_FINGER_PIP
7. INDEX_FINGER_DIP
8. INDEX_FINGER_TIP
9. MIDDLE_FINGER_MCP
10. MIDDLE_FINGER_PIP
11. MIDDLE_FINGER_DIP
12. MIDDLE_FINGER_TIP
13. RING_FINGER_MCP
14. RING_FINGER_PIP
15. RING_FINGER_DIP
16. RING_FINGER_TIP
17. PINKY_MCP
18. PINKY_PIP
19. PINKY_DIP
20. PINKY_TIP
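
To make the name-to-number mapping concrete, here is a small sketch in plain Python. `HAND_LANDMARK_NAMES`, `landmark_index`, and `landmark_name` are illustrative names, not part of MediaPipe. Note that in MediaPipe's Python API the indices are zero-based, so `WRIST` is 0 and `PINKY_TIP` is 20. With MediaPipe installed, the same numbers are available from the `mp.solutions.hands.HandLandmark` enum.

```python
# Landmark names in MediaPipe Hands order; the list position is the landmark index.
HAND_LANDMARK_NAMES = [
    "WRIST",
    "THUMB_CMC", "THUMB_MCP", "THUMB_IP", "THUMB_TIP",
    "INDEX_FINGER_MCP", "INDEX_FINGER_PIP", "INDEX_FINGER_DIP", "INDEX_FINGER_TIP",
    "MIDDLE_FINGER_MCP", "MIDDLE_FINGER_PIP", "MIDDLE_FINGER_DIP", "MIDDLE_FINGER_TIP",
    "RING_FINGER_MCP", "RING_FINGER_PIP", "RING_FINGER_DIP", "RING_FINGER_TIP",
    "PINKY_MCP", "PINKY_PIP", "PINKY_DIP", "PINKY_TIP",
]


def landmark_index(name):
    """Return the zero-based index for a landmark name."""
    return HAND_LANDMARK_NAMES.index(name)


def landmark_name(index):
    """Return the landmark name for a zero-based index."""
    return HAND_LANDMARK_NAMES[index]


print(landmark_index("INDEX_FINGER_TIP"))  # 8
print(landmark_name(0))                    # WRIST
```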

```python
import cv2  # OpenCV for camera capture and image processing
from IPython.display import display, Image  # Display images in Jupyter Notebook
import ipywidgets as widgets  # Interactive widgets such as buttons
import threading  # Run the video loop in a background thread
import mediapipe as mp  # MediaPipe for hand keypoint detection

# Create a "Stop" button that allows the user to stop the video stream by clicking on it
# ================
stopButton = widgets.ToggleButton(
    value=False,
    description='Stop',
    disabled=False,
    button_style='danger',  # Button style: 'success', 'info', 'warning', 'danger', or ''
    tooltip='Stop the video stream',
    icon='square'  # FontAwesome icon name (without the `fa-` prefix)
)

# Initialize MediaPipe drawing utilities and hand keypoint detection model
mpDraw = mp.solutions.drawing_utils
mpHands = mp.solutions.hands
hands = mpHands.Hands(max_num_hands=1)  # Detect at most one hand

# Define the display function to process video frames and perform hand keypoint detection
def view(button):
    camera = cv2.VideoCapture(-1)
    camera.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    camera.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

    display_handle = display(None, display_id=True)  # Create a display handle to update the displayed image

    while not button.value:  # Run until the "Stop" button is pressed
        ok, frame = camera.read()  # OpenCV returns frames in BGR order
        if not ok:
            break  # Stop if the camera returns no frame
        # frame = cv2.flip(frame, 1)  # Uncomment if your camera mirrors the image

        img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB input
        results = hands.process(img)

        # If hand keypoints are detected
        if results.multi_hand_landmarks:
            h, w, _ = frame.shape
            for handLms in results.multi_hand_landmarks:  # Iterate through each detected hand
                # Draw hand keypoints on the frame that is displayed
                for idx, lm in enumerate(handLms.landmark):
                    cx, cy = int(lm.x * w), int(lm.y * h)  # Convert normalized coordinates to pixels
                    cv2.circle(frame, (cx, cy), 5, (255, 0, 0), -1)  # Draw a circle at the keypoint position

                mpDraw.draw_landmarks(frame, handLms, mpHands.HAND_CONNECTIONS)  # Draw hand skeleton connections

                # Normalized position of the index fingertip, available for further development
                target_pos = handLms.landmark[mpHands.HandLandmark.INDEX_FINGER_TIP]

        _, jpeg = cv2.imencode('.jpeg', frame)
        display_handle.update(Image(data=jpeg.tobytes()))

    camera.release()  # Release the camera once the loop ends
    display_handle.update(None)

# Display the "Stop" button and start a thread to execute the display function
# ================
display(stopButton)
thread = threading.Thread(target=view, args=(stopButton,))
thread.start()
```
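
As a starting point for gesture control, the joint positions can be turned into a simple gesture signal. The sketch below is one possible heuristic, not MediaPipe's own gesture classifier: it treats a finger as raised when its tip is above its PIP joint in the image, which assumes an upright hand facing the camera. `FINGER_JOINTS`, `to_pixel`, and `raised_fingers` are hypothetical names. The demo uses synthetic landmarks in place of `handLms.landmark`, which exposes the same `.x` and `.y` attributes.

```python
from types import SimpleNamespace

# (tip, PIP) landmark indices for four fingers, using MediaPipe Hands numbering.
# The thumb is left out because its test depends on which hand is shown.
FINGER_JOINTS = {
    "index": (8, 6),
    "middle": (12, 10),
    "ring": (16, 14),
    "pinky": (20, 18),
}


def to_pixel(lm, width, height):
    """Convert a normalized landmark (x, y in [0, 1]) to integer pixel coordinates."""
    return int(lm.x * width), int(lm.y * height)


def raised_fingers(landmarks):
    """Return the names of fingers whose tip lies above its PIP joint.

    Image y grows downward, so "above" means a smaller y value.
    """
    return [name for name, (tip, pip) in FINGER_JOINTS.items()
            if landmarks[tip].y < landmarks[pip].y]


# Synthetic landmarks standing in for handLms.landmark (21 points)
demo = [SimpleNamespace(x=0.5, y=0.5) for _ in range(21)]
demo[8].y = 0.2   # Index fingertip above its PIP joint
demo[12].y = 0.2  # Middle fingertip above its PIP joint

print(raised_fingers(demo))          # ['index', 'middle']
print(to_pixel(demo[8], 640, 480))   # (320, 96)
```

Inside the `view` loop, `raised_fingers(handLms.landmark)` could be called after the landmarks are drawn, and the result mapped to robot commands.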