First we installed and imported our dependencies and took a look at the landmark map for the hand model.
We then made detections from our webcam feed, rendered them, and applied some custom coloring.
Finally, we saved our output images.

1. Installing MediaPipe for Python using pip install
2. Detecting hand poses from a real-time webcam feed
3. Outputting images using OpenCV

In [15]:
%pip install mediapipe opencv-python

Note: you may need to restart the kernel to use updated packages.


In [16]:
import mediapipe as mp # MediaPipe solutions; we'll use the Hands model, which is really quick
import cv2 # OpenCV, used for connecting to the webcam and for showing and saving images
import numpy as np # makes it easier to work with the array outputs from the MediaPipe model

# Used when writing the output images in jpg format
import uuid # Generates a universally unique identifier (UUID) that we use as the image file name,
            # so captured images never overwrite each other
#import time
import os # Standard-library module that makes it easy to work with paths and folders on different operating systems

In [17]:
# Two MediaPipe components

mp_drawing = mp.solutions.drawing_utils # Drawing utilities: make it easier to render the landmarks of the hand.
                                        # The output is a series of landmarks, one for each joint in the hand.
mp_hands = mp.solutions.hands # The Hands solution, which gives us the hand detection and tracking model.

<img src=https://google.github.io/mediapipe/images/mobile/hand_landmarks.png />

In [18]:
#execution_path = os.getcwd()
cap = cv2.VideoCapture(0) # Open the webcam feed; 0 is the number of the video capture device

# Pass max_num_hands to change how many hands can be detected
with mp_hands.Hands(min_detection_confidence=0.8, min_tracking_confidence=0.5) as hands: # instantiate the MediaPipe Hands model with a 'with'
                                                                                         # statement and refer to it as 'hands' from here on
  # Two keyword arguments are passed: min_detection_confidence = 80% and min_tracking_confidence = 50%.
  # The model first detects the hand in a frame and then tracks it across the following frames.
  # Detecting at 80% and tracking at 50% is a balanced setting.

  # Confidence
  # Detection : Threshold for the initial detection to be successful

  # Tracking : Threshold for tracking after the initial detection

  while cap.isOpened(): # loop over the frames from the video capture for as long as the webcam is connected
    ret, frame = cap.read() # ret reports whether a frame was read; frame is the image from the webcam
    if not ret: # stop if no frame could be read, otherwise cvtColor below would fail on an empty frame
      break

    # OpenCV delivers frames in BGR order, but MediaPipe expects RGB input,
    # so cvtColor() reorders the color channels before detection.
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # Convert the webcam frame from BGR to RGB

    # Flip the image horizontally (important): this gives a mirrored, selfie-style view
    image = cv2.flip(image, 1) # flip code 1 flips around the vertical axis, i.e. left to right

    # Mark the image as read-only before detection
    image.flags.writeable = False

    # Important: this line runs the detection on the read-only RGB image
    results = hands.process(image) # Detections

    # Make the image writeable again so that we can draw on it
    image.flags.writeable = True

    # Recolor back from RGB to BGR for rendering with OpenCV
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    
    # Print the detection results for each frame
    print(results)
    # This only prints the type of the results object; nothing is rendered to the image yet

    # Rendering results: draw the landmarks onto the image
    if results.multi_hand_landmarks: # Only render when at least one hand was detected; otherwise we skip the drawing
      for num, hand in enumerate(results.multi_hand_landmarks): # Loop over each detected hand; 'hand' is that hand's set of landmarks
                                                                # (enumerate also gives us the index 'num', which is handy for future extensions)
        # The simplest call uses the default colors:
        # mp_drawing.draw_landmarks(image, hand, mp_hands.HAND_CONNECTIONS)
        # 'image' is the webcam image we have been working with, 'hand' is the set of landmarks for one hand,
        # and mp_hands.HAND_CONNECTIONS lists which landmarks are joined by a line.

        # Alternatively, to change the color of the landmarks and lines, pass two drawing specs to draw_landmarks.
        # Each DrawingSpec takes a color (in (B, G, R) order, because the image was converted back to BGR before rendering),
        # a thickness (the line thickness) and a circle_radius (the size of the landmark circles).
        mp_drawing.draw_landmarks(image, hand, mp_hands.HAND_CONNECTIONS,
          mp_drawing.DrawingSpec(color=(121, 22, 76), thickness=2, circle_radius=4), # landmark drawing spec: the look of the joints
          mp_drawing.DrawingSpec(color=(250, 44, 250), thickness=2, circle_radius=2) # connection drawing spec: the look of the lines
          )

    # Earlier we rendered the raw webcam feed that we got from cap.read():
    # cv2.imshow('Hand Tracking', frame) # imshow takes two arguments: 1. the window name 2. the image to show
    # Now we pass 'image' instead, which has the joints and connections drawn onto it
    cv2.imshow('Hand Tracking', image)
    # Gracefully stop the feed once you are done with it
    if cv2.waitKey(10) & 0xFF == ord('q'): # Pressing q breaks out of the loop and stops the feed
      break

cap.release() # Once the loop ends, release the webcam
cv2.destroyAllWindows() # and close the display window


<class 'mediapipe.python.solution_base.SolutionOutputs'>

Okay, so we've rendered our images and drawn the results onto the image array.
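The loop above stops when q is pressed. As a small illustration of that check, here is a sketch that applies the same & 0xFF mask to a few made-up key codes. The helper name is_quit_key and the sample values are hypothetical; in the real loop the value comes from cv2.waitKey, which returns -1 when no key was pressed.

```python
def is_quit_key(raw_key: int, quit_char: str = "q") -> bool:
    """Return True when the low byte of raw_key matches quit_char."""
    # Keep only the lowest 8 bits, then compare with the character code.
    return (raw_key & 0xFF) == ord(quit_char)


# Hypothetical raw key codes, for illustration only.
samples = {
    "plain q": ord("q"),
    "q with extra high bits set": 0x100000 | ord("q"),
    "no key pressed": -1,
    "some other key": ord("a"),
}

for label, raw in samples.items():
    print(label, "->", is_quit_key(raw))
```

Masking first means the comparison still works if the returned code carries extra high bits.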

We also flip the image horizontally, because when we go and do more with this we want the frame to be mirrored. Ideally, I want to be able to detect our left hand and our right hand.
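For telling left from right, MediaPipe exposes a multi_handedness field on the results object next to multi_hand_landmarks. As far as I understand the MediaPipe Hands documentation, the labels assume a mirrored (flipped) input image, which is one reason the flip matters. Below is a minimal sketch of reading those labels. The helper name hand_labels is made up for this example, and the fake object only mimics the shape of a real result so the snippet runs without a webcam.

```python
from types import SimpleNamespace


def hand_labels(results):
    """Return a (label, score) pair for each detected hand, or [] if none."""
    handedness = getattr(results, "multi_handedness", None)
    if not handedness:
        return []
    # Each entry holds a classification list; the first item is the prediction.
    return [(h.classification[0].label, h.classification[0].score) for h in handedness]


# Stand-in for a real results object (assumption: same attribute layout).
fake_results = SimpleNamespace(multi_handedness=[
    SimpleNamespace(classification=[SimpleNamespace(label="Left", score=0.97)]),
    SimpleNamespace(classification=[SimpleNamespace(label="Right", score=0.91)]),
])

print(hand_labels(fake_results))
```

In the webcam loop you would call hand_labels(results) right after hands.process(image).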

So, we have our webcam feed, and now we want to overlay detections onto it. We take the feed from our webcam, pass it to MediaPipe, make detections, and then render the results onto the image, which starts out as frame.

Before we pass it to cv2.imshow, we apply our detections to that image.

That way you get back not just a webcam feed, but a webcam feed with all of the real-time detections drawn on it.

To do this, we set the image's writeable flag to False before detection, which avoids an unnecessary copy of the image and helps performance. We then make the detections, set the flag back to True, and recolor the image back to BGR so that we can render our results.
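To make the color swap, the flip and the writeable flag a bit more concrete, here is a small sketch on a tiny hand-made array instead of a webcam frame. It uses plain NumPy slicing as a stand-in for cv2.cvtColor and cv2.flip, so treat it as an illustration of what those calls do to the array, not as the code from the loop above.

```python
import numpy as np

# A tiny 1x2 "frame" in BGR order: one pure-blue pixel and one pure-red pixel.
frame_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps B and R, the same reorder as BGR -> RGB.
frame_rgb = frame_bgr[..., ::-1].copy()

# Reversing the column axis mirrors the image left to right, like cv2.flip(image, 1).
mirrored = frame_rgb[:, ::-1].copy()

# A read-only array rejects in-place writes until the flag is set back.
mirrored.flags.writeable = False
try:
    mirrored[0, 0, 0] = 1
    write_blocked = False
except ValueError:
    write_blocked = True

mirrored.flags.writeable = True
mirrored[0, 0, 0] = 1  # succeeds now that the array is writeable again

print(frame_rgb.tolist(), write_blocked)
```

This is why the flag has to go back to True before draw_landmarks is called: drawing writes into the image array.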

In [19]:
# DrawingSpec is a MediaPipe class that allows you to customize the look of your detections.
# In this case we use it to customize the landmarks and connections.
# To see its definition and parameters, uncomment and run the following line:
# mp_drawing.DrawingSpec??

In [20]:
# The draw_landmarks parameters we use are: image, landmark_list (the hand landmarks, 'hand' in our code),
# connections (HAND_CONNECTIONS in our code), landmark_drawing_spec (the look of your joints or landmarks)
# and connection_drawing_spec (the look of the lines).
# Uncomment and run the following line to learn more:
# mp_drawing.draw_landmarks??

In [21]:
#results

# Pretty Fast with rendering 

In [22]:
# This gives us the landmarks from the last frame that was processed
#results.multi_hand_landmarks

# The result contains every landmark of each detected hand

# Coordinates: x is the landmark position on the horizontal axis and y is the position on the vertical axis,
# both normalized to the range 0 to 1 by the image width and height.
# z is the estimated landmark depth, measured relative to the wrist; smaller values are closer to the camera.
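Because x and y come back normalized, you usually have to scale them by the image size before drawing your own overlays. Here is a minimal sketch of that conversion. The helper name to_pixel is made up for this example, and the clamping is my own addition, on the assumption that a landmark can land slightly outside the 0 to 1 range when the hand is partly out of frame.

```python
def to_pixel(x_norm, y_norm, width, height):
    """Scale normalized coordinates to integer pixel coordinates inside the image."""
    px = min(max(int(x_norm * width), 0), width - 1)
    py = min(max(int(y_norm * height), 0), height - 1)
    return px, py


# Example with a 640x480 image.
print(to_pixel(0.5, 0.25, 640, 480))   # (320, 120)
print(to_pixel(1.0, 1.0, 640, 480))    # (639, 479)
print(to_pixel(-0.1, 0.0, 640, 480))   # (0, 0)
```

In the webcam loop, the width and height would come from image.shape (which is ordered height, width, channels), and x_norm and y_norm from a single landmark such as hand.landmark[mp_hands.HandLandmark.WRIST].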

<img src=https://google.github.io/mediapipe/images/mobile/hand_landmarks.png />

In [23]:
# This is the set of hand connections.
# For example, (0, 1) means WRIST (0) is connected to THUMB_CMC (1). Together the pairs describe which
# landmarks are joined, which is what allows the connections to be drawn.
mp_hands.HAND_CONNECTIONS

frozenset({(0, 1),
           (0, 5),
           (0, 17),
           (1, 2),
           (2, 3),
           (3, 4),
           (5, 6),
           (5, 9),
           (6, 7),
           (7, 8),
           (9, 10),
           (9, 13),
           (10, 11),
           (11, 12),
           (13, 14),
           (13, 17),
           (14, 15),
           (15, 16),
           (17, 18),
           (18, 19),
           (19, 20)})
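To see what these pairs describe, here is a small sketch that turns the set above into a neighbor lookup. The pairs are copied from the output above so the snippet runs without MediaPipe installed; the variable names are just for this example.

```python
from collections import defaultdict

# Copied from the mp_hands.HAND_CONNECTIONS output above.
HAND_CONNECTIONS = frozenset({
    (0, 1), (0, 5), (0, 17), (1, 2), (2, 3), (3, 4), (5, 6),
    (5, 9), (6, 7), (7, 8), (9, 10), (9, 13), (10, 11), (11, 12),
    (13, 14), (13, 17), (14, 15), (15, 16), (17, 18), (18, 19), (19, 20),
})

# Build an undirected neighbor map: landmark index -> connected landmark indices.
neighbors = defaultdict(set)
for a, b in HAND_CONNECTIONS:
    neighbors[a].add(b)
    neighbors[b].add(a)

# Landmarks with a single neighbor sit at the end of a chain (the fingertips).
tips = sorted(n for n, linked in neighbors.items() if len(linked) == 1)

print(sorted(neighbors[0]))  # [1, 5, 17]
print(tips)                  # [4, 8, 12, 16, 20]
```

The wrist (0) links to the base of the thumb, the index finger and the pinky, and the five landmarks with only one neighbor match the fingertip numbers on the landmark map.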

3. Output these images 

In [2]:
os.makedirs('Output images', exist_ok=True) # create the output folder; no error if it already exists

In [26]:
cap = cv2.VideoCapture(0)

with mp_hands.Hands(min_detection_confidence=0.8, min_tracking_confidence=0.5) as hands:
  
  while cap.isOpened():
    ret, frame = cap.read()
    if not ret: # stop if no frame could be read from the webcam
      break

    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) 
    
    # Flip horizontally
    image = cv2.flip(image, 1)

    # Mark the image as read-only before detection
    image.flags.writeable = False

    # Important: this line runs the detection on the read-only RGB image
    results = hands.process(image) # Detections

    # Make the image writeable again so that we can draw on it
    image.flags.writeable = True

    # Recolor back from RGB to BGR
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Print the detection results for each frame
    print(results)
    
    if results.multi_hand_landmarks:
      for num, hand in enumerate(results.multi_hand_landmarks):        
        mp_drawing.draw_landmarks(image, hand, mp_hands.HAND_CONNECTIONS, 
          mp_drawing.DrawingSpec(color=(121, 22, 76), thickness=2, circle_radius=4),
          mp_drawing.DrawingSpec(color=(250, 44, 250), thickness=2, circle_radius=2)
          )

    # Save the annotated image; note that this writes one jpg for every frame while the loop runs
    cv2.imwrite(os.path.join('Output images', '{}.jpg'.format(uuid.uuid1())), image)
    # Breaking the call down:
    # cv2.imwrite(                          # writes an image out to disk
    #   os.path.join(                       # builds the path where the output image is stored
    #     'Output images',                  # the folder we save into
    #     '{}.jpg'.format(uuid.uuid1())),   # the file name: a unique identifier formatted into a .jpg name
    #   image)                              # the image to write

    cv2.imshow('Hand Tracking', image)
    
    if cv2.waitKey(10) & 0xFF == ord('q'):
      break

cap.release()
cv2.destroyAllWindows()


<class 'mediapipe.python.solution_base.SolutionOutputs'>

In [25]:
# uuid.uuid1() returns a different identifier on every call,
# for example: UUID('a013f1b4-8b5d-11ec-9c50-f8da0c2086ac')
# Uncomment and run the following line several times to see unique values
# uuid.uuid1()


# '{}.jpg'.format(uuid.uuid1()) ---> 'ecfd5558-8b5d-11ec-9917-f8da0c2086ac.jpg'
# This formats the unique identifier into a file name ending in .jpg.
# Because every name is unique, we don't get any conflicts when we output our images.

# os.path.join('Output images', '{}.jpg'.format(uuid.uuid1())) ----> 'Output images\\c7a4530a-8b5e-11ec-b622-f8da0c2086ac.jpg'
# This says that the image goes into the 'Output images' folder, followed by the .jpg file name generated on the fly.
# I'm on a Windows machine, so the path separator is a backslash (shown escaped as a double backslash);
# on Mac or Linux it would be a forward slash.


# cv2.imwrite(os.path.join('Output images', '{}.jpg'.format(uuid.uuid1())), image) returns True when the image is written.
# The first parameter is the file path to our image -> os.path.join('Output images', '{}.jpg'.format(uuid.uuid1()))
# The second parameter is the image itself -> image
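Putting the naming pieces together, here is a small standalone sketch of building unique output paths. It only uses the standard library and writes into a temporary folder, so it can be run without a webcam. The helper name unique_image_path is made up for this example.

```python
import os
import tempfile
import uuid


def unique_image_path(folder, ext=".jpg"):
    """Build a path inside folder with a UUID-based file name."""
    return os.path.join(folder, "{}{}".format(uuid.uuid1(), ext))


with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "Output images")
    os.makedirs(out_dir, exist_ok=True)
    os.makedirs(out_dir, exist_ok=True)  # calling it again does not raise

    # Generate a few paths in quick succession and check that none collide.
    paths = [unique_image_path(out_dir) for _ in range(3)]
    all_unique = len(set(paths)) == len(paths)

    for p in paths:
        print(os.path.basename(p))
    print(all_unique)
```

Using os.path.join here keeps the separator correct on Windows, Mac and Linux without changing the code.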