# Real-time Hand Gesture Recognition using OpenCV & MediaPipe

Gesture recognition is an active research area in Human-Computer Interaction. Its applications include virtual environment control, sign language translation, robot control, and music creation. In this machine learning project, we build a real-time hand gesture recognizer using the MediaPipe framework and OpenCV in a Jupyter notebook.

# What is OpenCV?

OpenCV is an open-source library for image processing and computer vision. It can be used for tasks such as face detection, object tracking, landmark detection, and much more.

# What is MediaPipe?

MediaPipe is a customizable machine learning solutions framework developed by Google. It is an open-source and cross-platform framework, and it is very lightweight. MediaPipe comes with some pre-trained ML solutions such as face detection, pose estimation, hand recognition, object detection, etc.

We’ll first use MediaPipe to recognize the hand and the hand key points. MediaPipe returns a total of 21 key points for each detected hand.

These key points will be fed into a pre-trained gesture recognizer network to recognize the hand pose.
![image.png](attachment:image.png)
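
As a quick reference for the 21 key points, the sketch below lists the landmark names in index order as they are documented for MediaPipe Hands (index 0 is the wrist, followed by four joints per finger). The list `HAND_LANDMARK_NAMES` and the helper `FINGERTIP_IDS` are names introduced here for illustration only; they are not part of the MediaPipe API.

```python
# Landmark names in index order (0 = wrist, then four joints per finger).
# This list is written out by hand for illustration.
HAND_LANDMARK_NAMES = [
    "WRIST",
    "THUMB_CMC", "THUMB_MCP", "THUMB_IP", "THUMB_TIP",
    "INDEX_FINGER_MCP", "INDEX_FINGER_PIP", "INDEX_FINGER_DIP", "INDEX_FINGER_TIP",
    "MIDDLE_FINGER_MCP", "MIDDLE_FINGER_PIP", "MIDDLE_FINGER_DIP", "MIDDLE_FINGER_TIP",
    "RING_FINGER_MCP", "RING_FINGER_PIP", "RING_FINGER_DIP", "RING_FINGER_TIP",
    "PINKY_MCP", "PINKY_PIP", "PINKY_DIP", "PINKY_TIP",
]

# Indices of the five fingertips, derived from the names above.
FINGERTIP_IDS = [i for i, name in enumerate(HAND_LANDMARK_NAMES) if name.endswith("_TIP")]

print(len(HAND_LANDMARK_NAMES))  # 21
print(FINGERTIP_IDS)             # [4, 8, 12, 16, 20]
```

These are the same indices (0, 4, 8, 12, 16, 20) that the final step highlights with circles: the wrist plus the five fingertips.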

Prerequisites for this project:

1. OpenCV – 4.5
Run "pip install opencv-python" to install OpenCV.

2. MediaPipe – 0.8.5
Run "pip install mediapipe" to install MediaPipe.


# Steps For The Project

1. Import the necessary packages.

2. Check the webcam.

3. Run the hand recognition model.

4. Detect the hand key points.

5. Recognize hand gestures.

Step 1: Import necessary packages

In [1]:
import cv2
import mediapipe as mp
import time

Step 2: Checking Webcam

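The loop below exits when the key 'q' is pressed. ord('q') is 113, which is 01110001 in binary, and 0xFF is 11111111, so the bitwise AND keeps only the lowest 8 bits of the value returned by cv2.waitKey():

q    = 01110001

0xFF = 11111111

AND  = 01110001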
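
The masking step can be checked in plain Python without a webcam. This is a minimal sketch: `raw_key` is a made-up value standing in for a key code that has extra high bits set, which is the situation the `& 0xFF` mask is meant to guard against.

```python
# ord('q') is the integer code of the character 'q'.
key_q = ord('q')
print(key_q, format(key_q, '08b'))   # 113 01110001

# Hypothetical raw key code: the code for 'q' with an extra high bit set.
raw_key = 0x100000 | key_q

# Masking with 0xFF keeps only the lowest 8 bits.
masked = raw_key & 0xFF
print(format(masked, '08b'))         # 01110001
print(masked == ord('q'))            # True

# cv2.waitKey() returns -1 when no key is pressed; masked, that becomes 255,
# which never equals ord('q').
print(-1 & 0xFF)                     # 255
```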

In [2]:
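cap = cv2.VideoCapture(0)

while True:
    # cap.read() returns a success flag and the captured frame (a BGR NumPy array).
    success, image = cap.read()
    print(success)
    print(image)

    # Stop if no frame could be read (for example, the webcam is unavailable).
    if not success:
        break

    # cv2.imshow() displays the frame in an OpenCV window.
    cv2.imshow("Hand Tracking", image)

    # cv2.waitKey(10) waits up to 10 ms for a key press; the loop exits when 'q' is pressed.
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()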

True
[[[ 99  56  46]
  [ 99  56  46]
  [ 96  57  49]
  ...
  [139  97  80]
  [145  95  75]
  [153  95  71]]

 ...

 [[ 56  31  40]
  [ 57  32  41]
  [ 58  34  46]
  ...
  [ 80  39  37]
  [ 84  39  38]
  [ 88  38  38]]]
True
[[[132  82  73]
  [128  77  68]
  [132  81  66]
  ...

Step 3: The mp.solutions.hands module performs the hand recognition algorithm, so we store a reference to it in mp_hands.

Calling mp_hands.Hands() creates the model with its default configuration.

In [3]:
mp_hands = mp.solutions.hands

hands = mp_hands.Hands()

# Convert the BGR image to RGB.
# OpenCV reads images in BGR order, but MediaPipe expects RGB,
# so cv2.cvtColor() converts the last frame captured in Step 2 to RGB.

img_RGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

results = hands.process(img_RGB)
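
To make the BGR-to-RGB step concrete, here is a conceptual sketch using plain nested lists instead of a NumPy array. The helper `bgr_to_rgb` is hypothetical and only illustrates the idea; it is not how OpenCV implements the conversion. For a 3-channel image, the effect is the same as reversing the channel order of every pixel.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image."""
    return [[pixel[::-1] for pixel in row] for row in image]

# The first two pixels of the frame printed in Step 2, in BGR order.
bgr_frame = [[[99, 56, 46], [99, 56, 46]]]

rgb_frame = bgr_to_rgb(bgr_frame)
print(rgb_frame[0][0])  # [46, 56, 99]
```

Blue (99) and red (46) swap places while green (56) stays in the middle.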

In [4]:
cap = cv2.VideoCapture(0)
while True:
    success, image = cap.read()
    if not success:
        break

    # Run the hand model on the current frame (converted to RGB) and
    # print the returned results object.
    img_RGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(img_RGB)
    print(results)

    cv2.imshow("Hand Tracking", image)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

<class 'mediapipe.python.solution_base.SolutionOutputs'>
<class 'mediapipe.python.solution_base.SolutionOutputs'>
<class 'mediapipe.python.solution_base.SolutionOutputs'>
...

Step 4: Check whether a hand is detected

In [5]:
cap = cv2.VideoCapture(0)
while True:
    success, image = cap.read()
    if not success:
        break

    img_RGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    results = hands.process(img_RGB)
    # Prints None when no hand is detected; otherwise prints the
    # normalized x, y, z coordinates of each landmark.
    print(results.multi_hand_landmarks)

    cv2.imshow("Hand Tracking", image)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

None
None
None
...
[landmark {
  x: 0.8999204635620117
  y: 0.9107288122177124
  z: 2.075264688983225e-07
}
landmark {
  x: 0.8438243865966797
  y: 0.8520198464393616
  z: -0.014971405267715454
}
landmark {
  x: 0.8039233684539795
  y: 0.7693783640861511
  z: -0.020920997485518456
}
...
Step 5: Recognize hand gestures.

In [1]:
cap = cv2.VideoCapture(0)
hands = mp_hands.Hands()

# Utility for drawing landmarks and their connections
mp_draw = mp.solutions.drawing_utils

# Landmark ids to highlight: the wrist (0) and the five fingertips
HIGHLIGHT_IDS = {0, 4, 8, 12, 16, 20}

# Timestamp of the previous frame, used to compute the frame rate
p_time = time.time()

while True:
    success, image = cap.read()
    if not success:
        break
    img_RGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(img_RGB)

    # Height, width, and channels (h, w, c) of the frame
    h, w, c = image.shape

    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            for idx, lm in enumerate(hand_landmarks.landmark):
                # Landmarks are normalized; scale them to integer pixel coordinates
                cx, cy = int(lm.x * w), int(lm.y * h)

                # Highlight the wrist and fingertips with a filled circle
                if idx in HIGHLIGHT_IDS:
                    cv2.circle(image, (cx, cy), 15, (255, 255, 0), cv2.FILLED)

            mp_draw.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS)

    # Frame rate = 1 / time elapsed since the previous frame
    c_time = time.time()
    fps = 1 / max(c_time - p_time, 1e-6)
    p_time = c_time
    # Display FPS
    cv2.putText(image, str(int(fps)), (10, 70), cv2.FONT_HERSHEY_COMPLEX, 3, (255, 0, 255), 3)

    cv2.imshow("Hand Tracking", image)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

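NameError: name 'mp_hands' is not defined

This error appears when the cell is run in a fresh kernel (note the In [1] prompt): mp_hands is defined in the Step 3 cell, so run the Step 1 and Step 3 cells before this one.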
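
The cell above draws the landmarks but does not yet assign a gesture label. As one possible next step, the sketch below uses a simple rule-based heuristic: a finger counts as extended when its tip lies above its PIP joint in the image. This is a swapped-in illustration, not the pre-trained gesture recognizer network mentioned in the introduction. The functions `count_extended_fingers`, `label_gesture`, and `make_hand`, and the gesture labels, are hypothetical names chosen here. The heuristic assumes an upright hand and ignores the thumb.

```python
# Landmark indices of the fingertips and PIP joints for the four non-thumb fingers.
FINGER_TIPS = [8, 12, 16, 20]
FINGER_PIPS = [6, 10, 14, 18]

def count_extended_fingers(landmarks):
    """Count extended fingers from 21 normalized (x, y) pairs.

    Image y grows downward, so an extended finger on an upright hand
    has a smaller y at the tip than at the PIP joint.
    """
    return sum(
        1
        for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if landmarks[tip][1] < landmarks[pip][1]
    )

def label_gesture(landmarks):
    """Map the finger count to an illustrative gesture label."""
    n = count_extended_fingers(landmarks)
    return {0: "fist", 4: "open hand"}.get(n, f"{n} fingers")

def make_hand(tip_y):
    """Build synthetic landmarks (not real data) with all fingertips at tip_y."""
    hand = [(0.5, 0.5)] * 21
    for tip in FINGER_TIPS:
        hand[tip] = (0.5, tip_y)
    return hand

print(label_gesture(make_hand(0.2)))  # open hand
print(label_gesture(make_hand(0.8)))  # fist
```

Inside the loop, the input could be built with `[(lm.x, lm.y) for lm in hand_landmarks.landmark]`, and the returned label drawn on the frame with `cv2.putText`.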

In [None]:
# Missing: a performance metric
# Graph plotting is essential in the project
# More than two algorithms must be used in the project