# Description and Goal
- Deaf children are often born to hearing parents who do not know sign language. Your challenge in this competition is to help identify signs made in processed videos, which will support the development of mobile apps to help teach parents sign language so they can communicate with their Deaf children.
- The goal of this competition is to classify isolated American Sign Language (ASL) signs. 
- The evaluation metric for this contest is simple classification accuracy.
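
As a minimal sketch of the metric, simple classification accuracy is the fraction of predicted signs that exactly match the true labels. The helper name `classification_accuracy` and the toy labels are illustrative assumptions, not part of the competition code.

```python
def classification_accuracy(y_true, y_pred):
    """Return the fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative labels only (hypothetical predictions).
y_true = ["blow", "wait", "cloud", "bird"]
y_pred = ["blow", "wait", "bird", "bird"]
accuracy = classification_accuracy(y_true, y_pred)
print(accuracy)  # 3 of 4 predictions are correct
```

Every sign counts equally under this metric, so a mistake on a rare sign costs exactly as much as a mistake on a common one.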

# Import Libraries & Load Data


In [4]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

## Exploratory Data Analysis

In [5]:
BASE_PATH = "./"

# Load the training index: one row per landmark sequence.
train_data = pd.read_csv(f'{BASE_PATH}train.csv')


# Explore Files

In [6]:
train_data

Unnamed: 0,path,participant_id,sequence_id,sign
0,train_landmark_files/26734/1000035562.parquet,26734,1000035562,blow
1,train_landmark_files/28656/1000106739.parquet,28656,1000106739,wait
2,train_landmark_files/16069/100015657.parquet,16069,100015657,cloud
3,train_landmark_files/25571/1000210073.parquet,25571,1000210073,bird
4,train_landmark_files/62590/1000240708.parquet,62590,1000240708,owie
...,...,...,...,...
94472,train_landmark_files/53618/999786174.parquet,53618,999786174,white
94473,train_landmark_files/26734/999799849.parquet,26734,999799849,have
94474,train_landmark_files/25571/999833418.parquet,25571,999833418,flower
94475,train_landmark_files/29302/999895257.parquet,29302,999895257,room


In total, we have 4 columns with 94477 entries. The columns are path, participant_id, sequence_id, and sign. The path is the location of the landmark file used to predict the sign. These files are stored in the Parquet format, a columnar binary format that is typically faster to read than JSON or CSV.

* path - The path to the landmark file.
* participant_id - A unique identifier for the data contributor.
* sequence_id - A unique identifier for the landmark sequence.
* sign - The label for the landmark sequence.
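
In the rows shown above, each path appears to have the form `train_landmark_files/<participant_id>/<sequence_id>.parquet`. The sketch below checks that pattern on three of the displayed rows. The helper `ids_from_path` is a hypothetical name, and whether the pattern holds for every row of train.csv is an assumption that would need to be verified on the full table.

```python
from pathlib import PurePosixPath


def ids_from_path(path):
    """Split '<folder>/<participant_id>/<sequence_id>.parquet' into two ints."""
    p = PurePosixPath(path)
    return int(p.parent.name), int(p.stem)


# (path, participant_id, sequence_id) copied from the first rows of the table.
rows = [
    ("train_landmark_files/26734/1000035562.parquet", 26734, 1000035562),
    ("train_landmark_files/28656/1000106739.parquet", 28656, 1000106739),
    ("train_landmark_files/16069/100015657.parquet", 16069, 100015657),
]

# Rows whose path disagrees with the id columns.
mismatches = [r for r in rows if ids_from_path(r[0]) != (r[1], r[2])]
print(len(mismatches))
```

If the list of mismatches is empty on the full table, the id columns can be recovered from the path alone.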

In [7]:
train_data['sign'].value_counts()

sign
listen    415
look      414
shhh      411
donkey    410
mouse     408
         ... 
dance     312
person    312
beside    310
vacuum    307
zipper    299
Name: count, Length: 250, dtype: int64

We can see that there are 250 unique signs to consider for this project. The most frequent sign is listen (415 samples), while zipper is the least frequent (299 samples).
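
To put these counts in context, here is a small worked calculation using only the numbers reported above (94477 entries, 250 signs, 415 and 299 samples for the most and least frequent signs). The variable names are illustrative.

```python
n_samples = 94477   # entries in train.csv, as reported above
n_classes = 250     # unique signs
most_common = 415   # samples of "listen"
least_common = 299  # samples of "zipper"

# Ratio between the largest and smallest class.
imbalance_ratio = most_common / least_common

# Accuracy of always predicting the most frequent sign on the training set.
majority_baseline = most_common / n_samples

# Accuracy of guessing uniformly at random among all signs.
uniform_chance = 1 / n_classes

print(f"imbalance ratio:   {imbalance_ratio:.2f}")
print(f"majority baseline: {majority_baseline:.4f}")
print(f"uniform chance:    {uniform_chance:.4f}")
```

The classes are fairly balanced, so both trivial baselines sit below half a percent accuracy, and any useful model should score far above that.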

In [8]:
landmark = pd.read_parquet(BASE_PATH + train_data.iloc[0].path)

The `landmark` DataFrame holds the contents of one landmark parquet file, which is the kind of data the model will process.

Next, we count the unique frames and landmark types in this file.

In [9]:
unique_frames = landmark["frame"].nunique()
unique_types = landmark["type"].nunique()
types_in_video = landmark["type"].unique()
print(
    f"We have {unique_frames} unique frames and {unique_types} unique types: {types_in_video} in the file."
)

We have 23 unique frames and 4 unique types: ['face' 'left_hand' 'pose' 'right_hand'] in the file.
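
The four types can be related to the 543 landmarks per frame used later in this notebook. The sketch below builds a synthetic table and counts rows per frame and per type. The per-type counts (468 face, 21 per hand, 33 pose) are an assumption based on the MediaPipe Holistic layout, and the `landmark_index` column is a hypothetical stand-in, so the real files should be checked with the same `groupby` calls.

```python
import pandas as pd

# Assumed landmarks per type in one frame; they sum to 543.
LANDMARKS_PER_TYPE = {"face": 468, "left_hand": 21, "pose": 33, "right_hand": 21}


def make_synthetic_landmarks(n_frames):
    """Build a long-format table with one row per (frame, type, landmark)."""
    rows = [
        {"frame": frame, "type": kind, "landmark_index": i}
        for frame in range(n_frames)
        for kind, count in LANDMARKS_PER_TYPE.items()
        for i in range(count)
    ]
    return pd.DataFrame(rows)


landmark_demo = make_synthetic_landmarks(n_frames=23)

# Rows in each frame, and rows of each type within a single frame.
rows_per_frame = landmark_demo.groupby("frame").size()
rows_per_type = landmark_demo.groupby("type").size() // landmark_demo["frame"].nunique()

print(rows_per_frame.unique())
print(rows_per_type.to_dict())
```

Running the same two `groupby` lines on the real `landmark` DataFrame would confirm whether every frame has the same number of rows.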


In [10]:
import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands

## Real time hand recognition

In [11]:
cap = cv2.VideoCapture(0)
with mp_hands.Hands(
    model_complexity=0,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5) as hands:
  while cap.isOpened():
    success, image = cap.read()
    if not success:
      print("Ignoring empty camera frame.")
      # If loading a video, use 'break' instead of 'continue'.
      continue

    # To improve performance, optionally mark the image as not writeable to
    # pass by reference.
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(image)

    # Draw the hand annotations on the image.
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:
      for hand_landmarks in results.multi_hand_landmarks:
        mp_drawing.draw_landmarks(
            image,
            hand_landmarks,
            mp_hands.HAND_CONNECTIONS,
            mp_drawing_styles.get_default_hand_landmarks_style(),
            mp_drawing_styles.get_default_hand_connections_style())
    # Flip the image horizontally for a selfie-view display.
    cv2.imshow('MediaPipe Hands', cv2.flip(image, 1))
    if cv2.waitKey(5) & 0xFF == ord("q"):
      break
cap.release()
cv2.destroyAllWindows()  # close the preview window opened by cv2.imshow
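
The loop above only draws the detected hands. To feed them to a model, the landmarks would first need to become a numeric array. The sketch below shows one way to do that. The helper `hand_landmarks_to_array` is a hypothetical name, and the `fake_hand` object merely imitates the `.landmark` list of points with `x`, `y`, `z` attributes that each entry of `results.multi_hand_landmarks` exposes, so the example runs without a camera.

```python
from types import SimpleNamespace

import numpy as np


def hand_landmarks_to_array(hand_landmarks):
    """Return an (n_landmarks, 3) float32 array of x, y, z coordinates."""
    return np.array(
        [[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark],
        dtype=np.float32,
    )


# Stand-in for one detected hand: MediaPipe Hands reports 21 landmarks per hand.
fake_hand = SimpleNamespace(
    landmark=[SimpleNamespace(x=i / 21, y=0.5, z=0.0) for i in range(21)]
)

hand_array = hand_landmarks_to_array(fake_hand)
print(hand_array.shape)  # (21, 3)
```

Inside the capture loop, the same helper could be called on each `hand_landmarks` object before or after drawing.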

# Create Model

# Submit Model in TensorFlow Lite
- The data loading and inference code below follows the submission method provided by the competition.

In [12]:
ROWS_PER_FRAME = 543  # number of landmarks per frame

def load_relevant_data_subset(pq_path):
    data_columns = ['x', 'y', 'z']
    data = pd.read_parquet(pq_path, columns=data_columns)
    n_frames = int(len(data) / ROWS_PER_FRAME)
    data = data.values.reshape(n_frames, ROWS_PER_FRAME, len(data_columns))
    return data.astype(np.float32)
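
The reshape inside `load_relevant_data_subset` can be illustrated on synthetic data, without reading a parquet file. The sketch assumes the rows are ordered frame by frame with exactly 543 rows per frame. The helper `reshape_landmarks` is a hypothetical name, and its divisibility check is an added safeguard, not part of the competition code.

```python
import numpy as np

ROWS_PER_FRAME = 543  # number of landmarks per frame


def reshape_landmarks(flat_xyz, rows_per_frame=ROWS_PER_FRAME):
    """Reshape (n_rows, 3) coordinates into (n_frames, rows_per_frame, 3)."""
    flat_xyz = np.asarray(flat_xyz, dtype=np.float32)
    if flat_xyz.ndim != 2 or len(flat_xyz) % rows_per_frame != 0:
        raise ValueError("row count must be a multiple of rows_per_frame")
    n_frames = len(flat_xyz) // rows_per_frame
    return flat_xyz.reshape(n_frames, rows_per_frame, flat_xyz.shape[1])


# Two synthetic frames: 2 * 543 rows of (x, y, z) values.
flat = np.arange(2 * ROWS_PER_FRAME * 3, dtype=np.float32).reshape(-1, 3)
frames_demo = reshape_landmarks(flat)
print(frames_demo.shape)  # (2, 543, 3)
```

The first axis indexes frames, the second indexes landmarks within a frame, and the last holds the x, y, z coordinates.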

In [13]:
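import tflite_runtime.interpreter as tflite

# Assumed placeholders: the competition snippet leaves these names undefined.
model_path = "model.tflite"  # hypothetical path to the exported TFLite model
REQUIRED_SIGNATURE = "serving_default"
frames = load_relevant_data_subset(BASE_PATH + train_data.iloc[0].path)

interpreter = tflite.Interpreter(model_path)
found_signatures = list(interpreter.get_signature_list().keys())

if REQUIRED_SIGNATURE not in found_signatures:
    # KernelEvalException is defined only in the competition's scoring code,
    # so a built-in exception is raised here instead.
    raise ValueError('Required input signature not found.')

prediction_fn = interpreter.get_signature_runner(REQUIRED_SIGNATURE)
output = prediction_fn(inputs=frames)
sign = np.argmax(output["outputs"])  # index of the predicted sign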

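ModuleNotFoundError: No module named 'tflite_runtime'

The cell fails because the `tflite_runtime` package is not installed in this environment, so the inference code has not been run here. Installing it (for example with `pip install tflite-runtime`, where a build exists for the platform) should resolve the import error.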
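
The inference cell ends with an integer class index, not a sign name. The sketch below shows how such an index could be turned back into a label. The three-entry `sign_to_index` mapping and the `mock_output` dictionary are hypothetical stand-ins. A real run would use the label mapping for all 250 signs and the actual output of `prediction_fn`.

```python
import numpy as np

# Hypothetical mapping for illustration; the real one would cover all 250 signs.
sign_to_index = {"bird": 0, "blow": 1, "cloud": 2}
index_to_sign = {index: sign for sign, index in sign_to_index.items()}


def decode_prediction(output, index_to_sign, key="outputs"):
    """Map the highest-scoring class index in output[key] to its sign name."""
    scores = np.asarray(output[key]).reshape(-1)
    return index_to_sign[int(np.argmax(scores))]


# Mock of the dictionary returned by the signature runner.
mock_output = {"outputs": np.array([0.1, 0.7, 0.2], dtype=np.float32)}
predicted_sign = decode_prediction(mock_output, index_to_sign)
print(predicted_sign)  # blow
```

The same index-to-sign mapping must be used at training and inference time, otherwise the decoded labels will not match the model's classes.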