# Hand Talk Challenge 2 - Action Recognition in Video + Bonus Challenge (Real Time Classification)

## Objective

You need to build a system that recognizes specific actions in a video. Choose at least 20 different actions (the number of classes). You may use any dataset available on the web, but your system will be validated, so make sure it can recognize the action in any new video input, including a webcam feed.

## Bonus challenge

Once the main challenge is done, we have a bonus challenge for you. It is not mandatory, buuuut it will catch the evaluator's eye and earn a ⭐. So here we go: the bonus challenge is for your system to recognize actions in **real time**.

## Requirements

- Python 3+
- TensorFlow 2.x
- The frame must be processed with MediaPipe Holistic before it is sent to the model.

### Importing libraries and packages

In [4]:
# Importing libraries
import pandas as pd
import numpy as np
import tensorflow as tf
import mediapipe as mp
import os
import csv
import cv2 as cv
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from keras.utils import to_categorical

In [5]:
# Configure how NumPy displays values.
np.set_printoptions(precision=3, suppress=True)

### Let's define a function that extracts features from the images (the hand skeleton coordinates, or Hand Landmarks) with MediaPipe, to train the model later.

In [6]:
# Function that extracts features from an image
def extract_feature(input_image):
    mp_hands = mp.solutions.hands
    mp_drawing = mp.solutions.drawing_utils
    image = cv.imread(input_image)
    with mp_hands.Hands(static_image_mode=True, max_num_hands=2, min_detection_confidence=0.1) as hands:
        # MediaPipe expects RGB input; the image is also mirrored horizontally.
        results = hands.process(cv.flip(cv.cvtColor(image, cv.COLOR_BGR2RGB), 1))
        image_height, image_width, _ = image.shape

        if not results.multi_hand_landmarks:
            # No hand in the image: set all 63 landmark values and the annotated image to zero.
            return (0,) * 63 + (0,)

        # Draw the hand landmarks on a mirrored copy of the image.
        annotated_image = cv.flip(image.copy(), 1)
        for hand_landmarks in results.multi_hand_landmarks:
            # The 21 landmarks come in HandLandmark order: wrist, thumb (CMC, MCP, IP, TIP),
            # then index, middle, ring and pinky fingers (MCP, PIP, DIP, TIP each).
            # X and Y are scaled to pixels; Z is kept as returned by MediaPipe.
            features = []
            for landmark in hand_landmarks.landmark:
                features.extend([landmark.x * image_width,
                                 landmark.y * image_height,
                                 landmark.z])

            # Draw the hand "skeleton"
            mp_drawing.draw_landmarks(annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)

        # When more than one hand is detected, the values of the last hand are returned.
        # Return all landmark values followed by the annotated image.
        return (*features, annotated_image)
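
To make the layout of the returned values easier to follow, here is a minimal sketch that rebuilds the 63 feature names in the order `extract_feature` returns them. The names `FINGER_JOINTS` and `feature_names` are illustrative helpers, not part of the original notebook; the order assumed is the one used in the function above (wrist, then thumb, index, middle, ring and pinky).

```python
# Illustrative helper (not in the original notebook): rebuild the 63 column
# names in the same order as the values returned by extract_feature.
FINGER_JOINTS = {
    "thumb": ["Cmc", "Mcp", "Ip", "Tip"],
    "index": ["Mcp", "Pip", "Dip", "Tip"],
    "middle": ["Mcp", "Pip", "Dip", "Tip"],
    "ring": ["Mcp", "Pip", "Dip", "Tip"],
    "pinky": ["Mcp", "Pip", "Dip", "Tip"],
}


def feature_names():
    """Return the landmark column names: wrist first, then each finger joint."""
    points = ["wrist"]
    for finger, joints in FINGER_JOINTS.items():
        points.extend(f"{finger}_{joint}" for joint in joints)
    # Each landmark contributes an X, a Y and a Z value.
    return [f"{point}{axis}" for point in points for axis in "XYZ"]


names = feature_names()
print(len(names))  # 21 landmarks * 3 axes
print(names[:6])
```

The same list could serve as the CSV header (after a leading `class_type` column), which would keep the column order in a single place.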

### And let's define a function that creates the CSV with new information (features extracted from new images), to build the dataset used to train the model.

In [7]:
# Function that creates a CSV file, or appends new information (features extracted from new images) to an existing CSV.
# This CSV is the dataset we will use to train the model.
def toCSV(filecsv, class_type,
          wristX, wristY, wristZ,
          thumb_CmcX, thumb_CmcY, thumb_CmcZ,
          thumb_McpX, thumb_McpY, thumb_McpZ,
          thumb_IpX, thumb_IpY, thumb_IpZ,
          thumb_TipX, thumb_TipY, thumb_TipZ,
          index_McpX, index_McpY, index_McpZ,
          index_PipX, index_PipY, index_PipZ,
          index_DipX, index_DipY, index_DipZ,
          index_TipX, index_TipY, index_TipZ,
          middle_McpX, middle_McpY, middle_McpZ,
          middle_PipX, middle_PipY, middle_PipZ,
          middle_DipX, middle_DipY, middle_DipZ,
          middle_TipX, middle_TipY, middle_TipZ,
          ring_McpX, ring_McpY, ring_McpZ,
          ring_PipX, ring_PipY, ring_PipZ,
          ring_DipX, ring_DipY, ring_DipZ,
          ring_TipX, ring_TipY, ring_TipZ,
          pinky_McpX, pinky_McpY, pinky_McpZ,
          pinky_PipX, pinky_PipY, pinky_PipZ,
          pinky_DipX, pinky_DipY, pinky_DipZ,
          pinky_TipX, pinky_TipY, pinky_TipZ):
    if os.path.isfile(filecsv):
        # The file already exists, so append the new row to it.
        with open(filecsv, 'a+', newline='') as file:
            # Create a writer object from csv module
            writer = csv.writer(file)
            writer.writerow([class_type,
                             wristX, wristY, wristZ,
                             thumb_CmcX, thumb_CmcY, thumb_CmcZ,
                             thumb_McpX, thumb_McpY, thumb_McpZ,
                             thumb_IpX, thumb_IpY, thumb_IpZ,
                             thumb_TipX, thumb_TipY, thumb_TipZ,
                             index_McpX, index_McpY, index_McpZ,
                             index_PipX, index_PipY, index_PipZ,
                             index_DipX, index_DipY, index_DipZ,
                             index_TipX, index_TipY, index_TipZ,
                             middle_McpX, middle_McpY, middle_McpZ,
                             middle_PipX, middle_PipY, middle_PipZ,
                             middle_DipX, middle_DipY, middle_DipZ,
                             middle_TipX, middle_TipY, middle_TipZ,
                             ring_McpX, ring_McpY, ring_McpZ,
                             ring_PipX, ring_PipY, ring_PipZ,
                             ring_DipX, ring_DipY, ring_DipZ,
                             ring_TipX, ring_TipY, ring_TipZ,
                             pinky_McpX, pinky_McpY, pinky_McpZ,
                             pinky_PipX, pinky_PipY, pinky_PipZ,
                             pinky_DipX, pinky_DipY, pinky_DipZ,
                             pinky_TipX, pinky_TipY, pinky_TipZ])
    else:
        # The file does not exist yet, so create it and write the header first.
        with open(filecsv, 'w', newline='') as file:
            # Create a writer object from csv module
            writer = csv.writer(file)
            writer.writerow(["class_type",
                             "wristX", "wristY", "wristZ",
                             "thumb_CmcX", "thumb_CmcY", "thumb_CmcZ",
                             "thumb_McpX", "thumb_McpY", "thumb_McpZ",
                             "thumb_IpX", "thumb_IpY", "thumb_IpZ",
                             "thumb_TipX", "thumb_TipY", "thumb_TipZ",
                             "index_McpX", "index_McpY", "index_McpZ",
                             "index_PipX", "index_PipY", "index_PipZ",
                             "index_DipX", "index_DipY", "index_DipZ",
                             "index_TipX", "index_TipY", "index_TipZ",
                             "middle_McpX", "middle_McpY", "middle_McpZ",
                             "middle_PipX", "middle_PipY", "middle_PipZ",
                             "middle_DipX", "middle_DipY", "middle_DipZ",
                             "middle_TipX", "middle_TipY", "middle_TipZ",
                             "ring_McpX", "ring_McpY", "ring_McpZ",
                             "ring_PipX", "ring_PipY", "ring_PipZ",
                             "ring_DipX", "ring_DipY", "ring_DipZ",
                             "ring_TipX", "ring_TipY", "ring_TipZ",
                             "pinky_McpX", "pinky_McpY", "pinky_McpZ",
                             "pinky_PipX", "pinky_PipY", "pinky_PipZ",
                             "pinky_DipX", "pinky_DipY", "pinky_DipZ",
                             "pinky_TipX", "pinky_TipY", "pinky_TipZ"])
            writer.writerow([class_type,
                             wristX, wristY, wristZ,
                             thumb_CmcX, thumb_CmcY, thumb_CmcZ,
                             thumb_McpX, thumb_McpY, thumb_McpZ,
                             thumb_IpX, thumb_IpY, thumb_IpZ,
                             thumb_TipX, thumb_TipY, thumb_TipZ,
                             index_McpX, index_McpY, index_McpZ,
                             index_PipX, index_PipY, index_PipZ,
                             index_DipX, index_DipY, index_DipZ,
                             index_TipX, index_TipY, index_TipZ,
                             middle_McpX, middle_McpY, middle_McpZ,
                             middle_PipX, middle_PipY, middle_PipZ,
                             middle_DipX, middle_DipY, middle_DipZ,
                             middle_TipX, middle_TipY, middle_TipZ,
                             ring_McpX, ring_McpY, ring_McpZ,
                             ring_PipX, ring_PipY, ring_PipZ,
                             ring_DipX, ring_DipY, ring_DipZ,
                             ring_TipX, ring_TipY, ring_TipZ,
                             pinky_McpX, pinky_McpY, pinky_McpZ,
                             pinky_PipX, pinky_PipY, pinky_PipZ,
                             pinky_DipX, pinky_DipY, pinky_DipZ,
                             pinky_TipX, pinky_TipY, pinky_TipZ])
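
Since `toCSV` writes the same 63 values in both branches, the same behaviour can be sketched more compactly by passing the values as a list. This is only an illustration under assumptions: `append_row` and its `header` parameter are hypothetical names, and the example writes to a temporary file with a shortened header instead of the 63 real columns.

```python
import csv
import os
import tempfile


def append_row(filecsv, header, class_type, features):
    """Append one labelled row; write the header first if the file is new."""
    is_new = not os.path.isfile(filecsv)
    with open(filecsv, "a", newline="") as file:
        writer = csv.writer(file)
        if is_new:
            writer.writerow(["class_type", *header])
        writer.writerow([class_type, *features])


# Usage with a shortened, made-up header (the notebook uses 63 feature columns).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    append_row(path, ["wristX", "wristY", "wristZ"], "A", [1.0, 2.0, 0.0])
    append_row(path, ["wristX", "wristY", "wristZ"], "B", [3.0, 4.0, 0.0])
    with open(path, newline="") as file:
        rows = list(csv.reader(file))

print(rows)
```

Opening the file in append mode covers both cases, because `"a"` creates the file when it is missing; only the header needs the existence check.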

## Extracting features from the images of the Dataset obtained from Kaggle (link at the beginning of the document) to build the training dataset.
### * I will use only the SIBI datasets version V02 - images from the "training" folder.

In [10]:
paths = "./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/"
csv_path = "hands_SIBI_training.csv"

In [12]:
if os.path.exists(csv_path):
    print("The CSV already exists; deleting it before extraction to create a new dataset.")
    os.remove(csv_path)
else:
    print("The CSV", csv_path, "does not exist; a dataset will be created after the data is extracted.")

for dirlist in os.listdir(paths):
    for root, directories, filenames in os.walk(os.path.join(paths, dirlist)):
        print("Folder", dirlist, "contains:", len(filenames), "images")
        for filename in filenames:
            if filename.endswith(".jpg") or filename.endswith(".JPG"):
                # The first 63 values are the landmark coordinates; the last one is the annotated image.
                *features, annotated_image = extract_feature(os.path.join(root, filename))

                # features[0] and features[1] are wristX and wristY; they are zero when no hand was detected.
                if features[0] != 0 and features[1] != 0:
                    # The folder name (dirlist) is the class label.
                    toCSV(csv_path, dirlist, *features)
                else:
                    print(os.path.join(root, filename), "Image without hand landmarks.")

print("===================Data extraction for training is complete!!!===================")

The CSV already exists; deleting it before extraction to create a new dataset.
Folder A contains: 42 images
Folder B contains: 42 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/B\IMG_20210605_174907.jpg Image without hand landmarks.
Folder C contains: 42 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/C\C_2.jpg Image without hand landmarks.
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/C\IMG_20210605_174952.jpg Image without hand landmarks.
Folder D contains: 42 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/D\IMG_20210605_175052.jpg Image without hand landmarks.
Folder E contains: 42 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/training/E\IMG_20210605_175140.jp
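
Note that the count printed for each folder comes from `len(filenames)`, so it includes every file in the folder, not only the `.jpg` images. A minimal sketch of counting only the images, with a case-insensitive extension check, could look like this. `count_images` is a hypothetical helper and the folder layout (one subfolder per class) is assumed from the loop above; the demo builds a throwaway directory instead of using the real dataset.

```python
import os
import tempfile


def count_images(dataset_dir, extensions=(".jpg",)):
    """Return {class_folder: number_of_image_files}, ignoring extension case."""
    counts = {}
    for class_name in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, class_name)
        if not os.path.isdir(class_dir):
            continue
        total = 0
        for _root, _dirs, filenames in os.walk(class_dir):
            total += sum(name.lower().endswith(extensions) for name in filenames)
        counts[class_name] = total
    return counts


# Demo on a temporary folder tree with two classes.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = {"A": ["a1.jpg", "a2.JPG", "notes.txt"], "B": ["b1.jpg"]}
    for class_name, files in demo_files.items():
        os.makedirs(os.path.join(tmp, class_name))
        for name in files:
            open(os.path.join(tmp, class_name, name), "w").close()
    counts = count_images(tmp)

print(counts)
```

Lower-casing the file name before the check also accepts mixed-case extensions such as `.Jpg`, which the two explicit `endswith` calls would skip.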

### Extracting features from the Dataset images to build the validation dataset.

In [14]:
paths = "./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/validation/"
csv_path = "hands_SIBI_validation.csv"

if os.path.exists(csv_path):
    print("The CSV already exists; deleting it before extraction to create a new dataset.")
    os.remove(csv_path)
else:
    print("The CSV", csv_path, "does not exist; a dataset will be created after the data is extracted.")

for dirlist in os.listdir(paths):
    for root, directories, filenames in os.walk(os.path.join(paths, dirlist)):
        print("Folder", dirlist, "contains:", len(filenames), "images")
        for filename in filenames:
            if filename.endswith(".jpg") or filename.endswith(".JPG"):
                # The first 63 values are the landmark coordinates; the last one is the annotated image.
                *features, annotated_image = extract_feature(os.path.join(root, filename))

                # features[0] and features[1] are wristX and wristY; they are zero when no hand was detected.
                if features[0] != 0 and features[1] != 0:
                    # The folder name (dirlist) is the class label.
                    toCSV(csv_path, dirlist, *features)
                else:
                    print(os.path.join(root, filename), "Image without hand landmarks.")

print("===================Data extraction for validation is complete!!!===================")

The CSV already exists; deleting it before extraction to create a new dataset.
Folder A contains: 9 images
Folder B contains: 10 images
Folder C contains: 10 images
Folder D contains: 9 images
Folder E contains: 9 images
Folder F contains: 9 images
Folder G contains: 10 images
Folder H contains: 10 images
Folder I contains: 10 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/validation/I\IMG_20210605_172731.jpg Image without hand landmarks.
Folder J contains: 9 images
Folder K contains: 10 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/validation/K\IMG_20210605_172855.jpg Image without hand landmarks.
Folder L contains: 10 images
./SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/SIBI_datasets_LEMLITBANG_SIBI_R_90.10_V02/validation/L\D54CB5CC-4353-4ABC-96C7-5DB3C0378E31.jpg

### Reading the training Dataset we built

In [16]:
# Reading the dataset with Pandas
df_train = pd.read_csv("hands_SIBI_training.csv", header=0)

# Sorting alphabetically by class
df_train = df_train.sort_values(by=["class_type"])

display(df_train)

Unnamed: 0,class_type,wristX,wristY,wristZ,thumb_CmcX,thumb_CmcY,thumb_CmcZ,thumb_McpX,thumb_McpY,thumb_McpZ,...,pinky_McpZ,pinky_PipX,pinky_PipY,pinky_PipZ,pinky_DipX,pinky_DipY,pinky_DipZ,pinky_TipX,pinky_TipY,pinky_TipZ
0,A,633.076921,1031.794568,-1.029208e-06,552.740470,1014.596629,-0.016873,488.762230,950.072316,-0.015834,...,-0.014528,625.415772,841.898928,-0.030686,621.903911,885.461414,-0.008079,634.787813,893.351131,0.015991
23,A,1125.083566,1526.296854,-1.574101e-06,802.092314,1375.602007,-0.084639,595.728397,1025.357246,-0.107311,...,-0.018440,1378.066421,723.765910,-0.075082,1310.911298,947.677612,-0.045380,1293.768644,996.228576,-0.004563
24,A,606.059119,997.371497,-1.021800e-06,542.077795,975.891567,-0.016489,488.507822,912.044861,-0.018623,...,-0.025010,619.540215,819.682154,-0.043586,611.695915,864.094074,-0.022174,620.061144,878.710000,0.001083
25,A,640.976742,1024.759770,-1.163735e-06,567.882031,1006.959683,-0.013950,507.874981,935.404389,-0.011033,...,-0.023898,643.821225,825.429484,-0.040171,637.996227,872.443930,-0.016840,648.244545,891.350276,0.007488
26,A,314.847702,3331.943756,2.845621e-07,354.272716,3418.695923,-0.017547,387.147813,3487.958405,-0.011431,...,0.080590,363.476163,3488.726074,0.105393,378.224711,3506.369568,0.118019,389.623101,3521.623810,0.127040
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1030,Z,2538.824587,2736.188367,1.260576e-07,2462.919574,2616.520126,0.000939,2299.346423,2532.108889,-0.009162,...,-0.066991,2081.181879,2830.343569,-0.082286,2132.875171,2843.416618,-0.070778,2205.996637,2830.635519,-0.058647
1031,Z,2119.173346,2558.475957,7.089590e-08,2055.647750,2436.629878,-0.000313,1902.685847,2340.947706,-0.009532,...,-0.058520,1667.197409,2606.632695,-0.073744,1723.730664,2629.507397,-0.063213,1792.579737,2616.742350,-0.052031
1032,Z,2347.059660,2586.737692,1.008877e-07,2275.578604,2460.131370,0.003507,2115.500321,2366.809129,-0.006253,...,-0.064689,1868.288369,2641.144021,-0.077718,1923.779283,2661.967546,-0.065333,1993.641500,2650.382339,-0.053380
1023,Z,738.849841,1859.723072,5.368790e-07,774.882208,1841.190759,-0.136386,918.720272,1815.288757,-0.175115,...,0.014722,1233.930645,1812.526391,-0.036510,1143.906039,1863.667094,-0.040690,1065.033158,1832.436586,-0.027891


### Reading the validation dataset we built

In [17]:
# Read the validation dataset with pandas
df_test = pd.read_csv("hands_SIBI_validation.csv", header=0)

# Sort the rows alphabetically by class label
df_test = df_test.sort_values(by=["class_type"])

df_test

Unnamed: 0,class_type,wristX,wristY,wristZ,thumb_CmcX,thumb_CmcY,thumb_CmcZ,thumb_McpX,thumb_McpY,thumb_McpZ,...,pinky_McpZ,pinky_PipX,pinky_PipY,pinky_PipZ,pinky_DipX,pinky_DipY,pinky_DipZ,pinky_TipX,pinky_TipY,pinky_TipZ
0,A,1609.969445,1784.219007,-1.749426e-06,1369.532337,1709.749967,-0.042583,1148.541421,1496.676805,-0.062476,...,-0.052162,1657.074078,1124.225888,-0.091846,1640.837749,1317.489039,-0.057767,1681.876300,1367.194283,-0.020742
1,A,721.896172,1655.642509,-1.420811e-06,530.854702,1427.744269,-0.034982,442.552775,1126.326799,-0.054139,...,-0.051890,1116.067052,1094.979286,-0.091447,1040.838599,1238.942862,-0.066167,998.329639,1323.051572,-0.035349
2,A,749.508023,1684.742451,-1.402071e-06,563.867211,1453.759909,-0.036050,486.350209,1149.080753,-0.054777,...,-0.049351,1145.679832,1128.134489,-0.089399,1066.006184,1271.043181,-0.065400,1017.981291,1357.542038,-0.034858
3,A,626.759470,1036.433014,-1.010604e-06,557.352439,1013.014010,-0.019213,502.976462,940.376073,-0.019421,...,-0.026154,646.706343,850.894426,-0.045288,638.890430,895.976001,-0.022625,649.126634,910.227441,0.002208
4,A,303.717607,3094.467133,2.660258e-07,373.753236,3178.387299,-0.009904,384.873405,3259.094238,-0.005669,...,0.074729,270.845872,3268.955566,0.102029,289.052582,3289.632111,0.118794,303.846127,3296.206604,0.131020
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
210,Z,607.813954,1321.033716,7.040624e-07,662.446499,1287.171960,-0.190697,824.793935,1257.432699,-0.242297,...,0.023394,1208.488584,1214.890003,-0.046840,1095.316410,1289.378405,-0.056087,1015.204191,1262.588382,-0.042592
205,Z,419.206649,803.872267,1.096508e-06,413.938612,797.197503,-0.082267,452.044219,781.140809,-0.114144,...,-0.026700,536.410868,793.225133,-0.078523,514.125288,812.778553,-0.085061,493.014157,825.348620,-0.075572
204,Z,978.328505,1808.656837,1.034333e-06,945.278280,1858.174145,-0.115397,1007.886907,1893.349442,-0.177530,...,-0.064582,1375.302467,1773.535432,-0.110885,1278.924811,1821.105372,-0.110281,1192.749185,1793.148831,-0.099464
206,Z,714.603248,1737.283194,5.701007e-07,738.054170,1731.209089,-0.164280,868.574480,1711.071499,-0.212323,...,0.015985,1198.332878,1661.383299,-0.047904,1114.986639,1718.931599,-0.063430,1047.808903,1694.959461,-0.058270


### Encoding the class labels as integers

In [18]:
# Convert each letter label to an integer code
df_train["class_type"] = pd.Categorical(df_train["class_type"])
df_train["class_type"] = df_train.class_type.cat.codes

df_test["class_type"] = pd.Categorical(df_test["class_type"])
df_test["class_type"] = df_test.class_type.cat.codes
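
One caveat with the cell above: `pd.Categorical` is applied to each DataFrame separately, so the integer codes only agree if both files contain exactly the same set of letters. A minimal sketch of a safer variant, using toy data and a hypothetical `encode_labels` helper (not part of the original notebook), fixes the category list up front:

```python
import pandas as pd

def encode_labels(labels, categories):
    """Map string labels to integer codes using a fixed, shared category list."""
    return pd.Categorical(labels, categories=categories).codes

# Toy stand-ins for df_train["class_type"] and df_test["class_type"] (illustrative only)
train_labels = pd.Series(["A", "B", "C", "C"])
test_labels = pd.Series(["A", "C"])  # "B" is missing here

# Encoding each series on its own gives "C" two different codes
separate_train = pd.Categorical(train_labels).codes
separate_test = pd.Categorical(test_labels).codes

# Encoding with the training categories keeps the mapping consistent
categories = sorted(train_labels.unique())
shared_train = encode_labels(train_labels, categories)
shared_test = encode_labels(test_labels, categories)

print(separate_train, separate_test)
print(shared_train, shared_test)
```

If every letter appears in both files, the two approaches give the same codes, so this only matters when a class is absent from one split.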

In [19]:
# Separate the labels (y) from the features (x)
y_train = df_train.pop("class_type")
x_train = df_train.copy()

y_test = df_test.pop("class_type")
x_test = df_test.copy()

# Convert the features to NumPy arrays
x_train = np.array(x_train)
x_test = np.array(x_test)

### Reshaping the arrays to feed the model

In [20]:
# Check the array shapes before reshaping
print(x_train.shape)
print(x_test.shape)

# Add a trailing channel axis, since Conv1D expects (samples, steps, channels)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

# Check the new array shapes
print(x_train.shape)
print(x_test.shape)

(1056, 63)
(212, 63)
(1056, 63, 1)
(212, 63, 1)
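
The reshape only appends a channel axis of length 1, which `Conv1D` expects as `(batch, steps, channels)`; no values change. A small sketch with a toy array (the shapes here are illustrative, not the real dataset) shows three equivalent ways to do it:

```python
import numpy as np

x = np.arange(12, dtype=float).reshape(4, 3)  # toy stand-in: 4 samples, 3 features

a = np.reshape(x, (x.shape[0], x.shape[1], 1))  # form used in the notebook
b = np.expand_dims(x, axis=-1)                  # same result, axis added explicitly
c = x[..., np.newaxis]                          # same result, indexing syntax

print(a.shape)
print(np.array_equal(a, b) and np.array_equal(a, c))
```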


### Inspecting the data that will be fed to the model

In [22]:
# Inspect one training sample and one test label
print(x_train[1])
print(y_test[200])

[[1125.084]
 [1526.297]
 [  -0.   ]
 [ 802.092]
 [1375.602]
 [  -0.085]
 [ 595.728]
 [1025.357]
 [  -0.107]
 [ 559.411]
 [ 662.405]
 [  -0.128]
 [ 543.397]
 [ 396.204]
 [  -0.127]
 [ 750.58 ]
 [ 685.552]
 [   0.027]
 [ 780.826]
 [ 417.263]
 [  -0.074]
 [ 797.145]
 [ 691.046]
 [  -0.12 ]
 [ 789.808]
 [ 814.798]
 [  -0.132]
 [ 970.713]
 [ 714.971]
 [   0.032]
 [1006.45 ]
 [ 473.353]
 [  -0.073]
 [ 962.678]
 [ 822.1  ]
 [  -0.092]
 [ 925.045]
 [ 865.597]
 [  -0.073]
 [1167.153]
 [ 778.265]
 [   0.011]
 [1208.52 ]
 [ 585.606]
 [  -0.1  ]
 [1135.388]
 [ 906.241]
 [  -0.073]
 [1099.269]
 [ 938.183]
 [  -0.021]
 [1364.387]
 [ 854.725]
 [  -0.018]
 [1378.066]
 [ 723.766]
 [  -0.075]
 [1310.911]
 [ 947.678]
 [  -0.045]
 [1293.769]
 [ 996.229]
 [  -0.005]]
24
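
Note that `y_test` is a pandas Series that still carries the row labels from before `sort_values`, so with an integer index `y_test[200]` selects the row whose index label is 200, not the 201st row in sorted order. The sketch below illustrates the difference on toy data. The names `labels` and `LETTERS` are illustrative, and the decoding step assumes all 26 letters are present so that code 0 is A and code 25 is Z.

```python
import string
import pandas as pd

# Toy labels with a shuffled index, mimicking a DataFrame after sort_values
labels = pd.Series([0, 1, 24, 25], index=[2, 3, 0, 1])

by_label = labels.loc[2]      # row whose index label is 2
by_position = labels.iloc[2]  # third row in the current order

# Decode an integer code back to a letter (assumes codes cover A..Z in order)
LETTERS = string.ascii_uppercase
print(by_label, by_position, LETTERS[by_position])
```

Under that assumption, the `24` printed above would correspond to the letter Y.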


### One-hot encoding the labels

In [26]:
# Number of classes: the letters of the alphabet
num_classes = 26

# Use the Keras utility to one-hot encode the labels
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
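
`to_categorical` turns each integer code into a one-hot row of length `num_classes`, which is the target format `categorical_crossentropy` expects. A NumPy-only sketch of the same transformation, with a hypothetical `one_hot` helper standing in for the Keras utility:

```python
import numpy as np

def one_hot(codes, num_classes):
    """Return a (len(codes), num_classes) array with a single 1 per row."""
    codes = np.asarray(codes, dtype=int)
    return np.eye(num_classes, dtype="float32")[codes]

encoded = one_hot([0, 24, 25], num_classes=26)
print(encoded.shape)
print(encoded.argmax(axis=1))  # argmax recovers the original integer codes
```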

### Defining the model: a one-dimensional convolutional neural network

In [27]:
# 1D CNN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=5, strides=1, padding="causal", activation="relu", input_shape=x_train.shape[1:3]),
    tf.keras.layers.Conv1D(filters=32, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(filters=64, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.Conv1D(filters=64, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(filters=128, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.Conv1D(filters=128, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(filters=256, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.Conv1D(filters=256, kernel_size=5, strides=1, padding="causal", activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Dropout(rate=0.2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'), 
    tf.keras.layers.Dense(num_classes, activation='softmax')])

model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d_8 (Conv1D)           (None, 63, 32)            192       
                                                                 
 conv1d_9 (Conv1D)           (None, 63, 32)            5152      
                                                                 
 max_pooling1d_4 (MaxPooling  (None, 31, 32)           0         
 1D)                                                             
                                                                 
 conv1d_10 (Conv1D)          (None, 31, 64)            10304     
                                                                 
 conv1d_11 (Conv1D)          (None, 31, 64)            20544     
                                                                 
 max_pooling1d_5 (MaxPooling  (None, 15, 64)           0         
 1D)                                                    
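
The shapes and parameter counts in the summary can be checked by hand. With `padding="causal"` and `strides=1` a `Conv1D` layer keeps the sequence length, each `MaxPooling1D(pool_size=2)` halves it (rounding down), and a `Conv1D` layer has `(kernel_size * in_channels + 1) * filters` parameters. The helper names below are illustrative:

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * in_channels + 1) * filters

def pooled_length(length, pool_size=2):
    """Output length of MaxPooling1D with 'valid' padding and stride equal to pool_size."""
    return length // pool_size

length, channels = 63, 1
for block_filters in (32, 64, 128, 256):
    for _ in range(2):  # two Conv1D layers per block
        params = conv1d_params(5, channels, block_filters)
        channels = block_filters
        print(f"Conv1D  -> ({length}, {channels}), params={params}")
    length = pooled_length(length)
    print(f"MaxPool -> ({length}, {channels})")
```

The first four parameter counts it prints are 192, 5152, 10304 and 20544, matching the rows of the summary shown above.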

### Training the model!!!

In [28]:
# Train the model
model.fit(x_train, y_train, epochs=50, batch_size=32, validation_data=(x_test, y_test))

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x2afa35ea430>

### Saving the model to attempt the bonus challenge: I will classify the dataset's sign alphabet from my own webcam, in real time.

In [29]:
model.save('./cnn_sibi/')


INFO:tensorflow:Assets written to: ./cnn_sibi/assets
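
For the real-time step, each webcam frame has to be turned into the same `(1, 63, 1)` layout the model was trained on, and the softmax output has to be mapped back to a letter. A minimal sketch of those two steps follows. `to_model_input`, `decode_prediction` and the dummy prediction are hypothetical and not taken from the notebook. In practice the 63 values would come from the MediaPipe landmark extraction and the probabilities from `model.predict`, and the decoding assumes the 26 classes are A to Z in alphabetical order.

```python
import string
import numpy as np

NUM_FEATURES = 63                 # 21 hand landmarks x (x, y, z)
LETTERS = string.ascii_uppercase  # assumed class order: 0 -> A, ..., 25 -> Z

def to_model_input(features):
    """Reshape one flat feature vector to (batch=1, steps=63, channels=1)."""
    features = np.asarray(features, dtype="float32")
    if features.size != NUM_FEATURES:
        raise ValueError(f"expected {NUM_FEATURES} values, got {features.size}")
    return features.reshape(1, NUM_FEATURES, 1)

def decode_prediction(probabilities):
    """Return the most likely letter and its probability."""
    probabilities = np.asarray(probabilities).reshape(-1)
    best = int(probabilities.argmax())
    return LETTERS[best], float(probabilities[best])

# Dummy data standing in for one frame's landmarks and the model output
frame_features = np.zeros(NUM_FEATURES)
batch = to_model_input(frame_features)
fake_probs = np.full((1, 26), 0.01)
fake_probs[0, 24] = 0.75
print(batch.shape, decode_prediction(fake_probs))
```

Keeping the reshape and the decoding in small functions makes it easier to reuse the same preprocessing for both recorded videos and live webcam frames.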
