# Purpose of This Notebook  
1. We will come up with the name of our hand poses we want detected.
1. Take pictures to train our model found in our `02_Training_The_model` notebook.
1. Take pictures to test our model on different images not found in step 2

## Table Of Content

1. [Imports](#imports)
1. [Preparing to Take the Images](#prep)
1. [Taking the Pictures for Training](#ready)
1. [Taking Validation Pictures](#val)

## Imports<span id='imports'>
---
`mediapipe`: What will be drawing our hands for easier recognition by models  
`uuid`: Used to create unique id labels for all of our images  
`time`: To set a timer for how long we will take pictures for  
`cv2`: OpenCv is what's used to access the webcam   
`os`: used to add images to the folders we will make

In [1]:
import mediapipe as mp
import uuid
import time
import cv2 
import os

<span id='prep'></span>
## Preparing for the Loop
---

#### Create Labels for each hand pose, and where we want to save the images

In [2]:
# The Hand poses we want to track
labels = ['play_pause', 'forward', 'rewind', 'volume_up', 'volume_down', 'screenshot']

# path to what folder it will be saved in
image_path = '../imgs/collected/'

**The `collected` folder should start off empty. Folders named after the labels will be created automatically**
<img src='./notebook_imgs/collected_folder.png'>

### Instantiate MediaPipe to draw the hands.   
<img src = https://google.github.io/mediapipe/images/mobile/hand_landmarks.png width="650">  

> Will look similar to this but without the numbers on there

In [4]:
mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils
hands = mp_hands.Hands(
    max_num_hands=1,
    min_detection_confidence=0.7, # need a 70% confidence of hands to show up
    min_tracking_confidence=0.7) # 70% confidence needed to keep tracking

<span id = 'ready'></span>
## Put everything together in a `While Loop` to take all the pictures
---

In [4]:
# because sleep is set for 1 sec, should get AROUND same number of pictures as we set this for
secs_for_action = 150

# instantiate openCV
cap = cv2.VideoCapture(0)

# open up my webcam 
while cap.isOpened():
    for idx, label in enumerate(labels):
        try:
            !mkdir {'..\imgs\collected\\'+label}
        except:
            pass
        print(f'Collecting images for {label}')
        
        ret, img = cap.read()
        
        # tell us what action we're going to be a taking a picture for.
        cv2.putText(img, f'Waiting for collecting {label.upper()} action...', org=(10, 30), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, color=(255, 255, 255), thickness=2)
        cv2.imshow('Collecting Images', img)
        cv2.waitKey(3000)
        
        # go for as long as we set to take pictures for
        start_time = time.time()
        while time.time() - start_time < secs_for_action:
            ret, img = cap.read()
            
            # Detections
            image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # convert image from BGR to RGB to work with mediapipe
            image = cv2.flip(image, 1) # flip on horizontal
            image.flags.writeable = False    # set flag to False
            results = hands.process(image)   # actually makes the detections
            image.flags.writeable = True     # set flag back to True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # set color back to BRG
            
            # Drawing landmarks to the images
            if results.multi_hand_landmarks:
                for num, hand in enumerate(results.multi_hand_landmarks):
                    mp_drawing.draw_landmarks(image, hand, mp_hands.HAND_CONNECTIONS,
                                             mp_drawing.DrawingSpec(color=(51, 51, 255), thickness = 2, circle_radius=2),
                                             mp_drawing.DrawingSpec(color=(0, 0, 0), thickness = 2, circle_radius=2))

                    
            
            image_name = os.path.join(image_path, label, label+'.'+f'{str(uuid.uuid1())}.jpg')
            cv2.imwrite(image_name, image)
            cv2.imshow('Collecting Images', image)
            
            time.sleep(1)
            
            # press q to break out
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        
        # once we do all 6 labels, lets break out
        if idx == (len(labels) - 1):
            cap.release()
            cv2.destroyAllWindows()

A subdirectory or file .\imgs\collected\play_pause already exists.


Collecting images for play_pause


A subdirectory or file .\imgs\collected\forward already exists.


Collecting images for forward


A subdirectory or file .\imgs\collected\rewind already exists.


Collecting images for rewind
Collecting images for volume_up


A subdirectory or file .\imgs\collected\volume_up already exists.
A subdirectory or file .\imgs\collected\volume_down already exists.


Collecting images for volume_down


A subdirectory or file .\imgs\collected\screenshot already exists.


Collecting images for screenshot


<span id = 'val'></span>
## Getting validation images to test our finished model on new unseen images
---

In [1]:
image_path = '../imgs/collected/test/'

In [6]:
# because sleep is set for 1 sec, should get AROUND same number of pictures as we set this for
secs_for_action = 3

# instantiate openCV
cap = cv2.VideoCapture(0)

# open up my webcam 
while cap.isOpened():
    for idx, label in enumerate(labels):
        print(f'Collecting images for {label}')
        
        ret, img = cap.read()
        
        # tell us what action we're going to be a taking a picture for.
        cv2.putText(img, f'Waiting for collecting {label.upper()} action...', org=(10, 30), fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1, color=(255, 255, 255), thickness=2)
        cv2.imshow('Collecting Images', img)
        cv2.waitKey(3000)
        
        # go for as long as we set to take pictures for
        start_time = time.time()
        while time.time() - start_time < secs_for_action:
            ret, img = cap.read()
            
            # Detections
            image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # convert image from BGR to RGB to work with mediapipe
            image = cv2.flip(image, 1) # flip on horizontal
            image.flags.writeable = False    # set flag to False
            results = hands.process(image)   # actually makes the detections
            image.flags.writeable = True     # set flag back to True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # set color back to BRG
            
            # Drawing landmarks to the images
            if results.multi_hand_landmarks:
                for num, hand in enumerate(results.multi_hand_landmarks):
                    mp_drawing.draw_landmarks(image, hand, mp_hands.HAND_CONNECTIONS,
                                             mp_drawing.DrawingSpec(color=(51, 51, 255), thickness = 2, circle_radius=2),
                                             mp_drawing.DrawingSpec(color=(0, 0, 0), thickness = 2, circle_radius=2))

                    
            
            image_name = os.path.join(image_path, label+'.'+f'{str(uuid.uuid1())}.jpg')
            cv2.imwrite(image_name, image)
            cv2.imshow('Collecting Images', image)
            
            time.sleep(1)
            
            # press q to cancel 
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        
        # once we do all 6 labels, lets break out
        if idx == (len(labels) - 1):
            cap.release()
            cv2.destroyAllWindows()

Collecting images for picture_1
Collecting images for picture_2
Collecting images for picture_3
Collecting images for picture_4
Collecting images for picture_5
Collecting images for picture_6
