# ASL Fingerspelling

## What is American Sign Language Fingerspelling Recognition?

American Sign Language (ASL) fingerspelling recognition uses computer vision and machine learning to recognize and interpret the hand gestures that spell out words letter by letter in ASL. It can be used to build tools and applications that help people with hearing impairments communicate more effectively with others. Common approaches include comparing the input image of a hand gesture against a predefined set of templates, extracting relevant features from the input image, or training a neural network on a large dataset of ASL fingerspelling images so that it learns the patterns and features most important for recognition. Despite some challenges, ASL fingerspelling recognition has the potential to greatly improve the lives of people with hearing impairments.

## Data Overview

### Files
#### [train/supplemental_metadata].csv


* path - The path to the landmark file.
* file_id - A unique identifier for the data file.
* participant_id - A unique identifier for the data contributor.
* sequence_id - A unique identifier for the landmark sequence. Each data file may contain many sequences.
* phrase - The labels for the landmark sequence. The train and test datasets contain randomly generated addresses, phone numbers, and urls derived from components of real addresses/phone numbers/urls. Any overlap with real addresses, phone numbers, or urls is purely accidental. The supplemental dataset consists of fingerspelled sentences. Note that some of the urls include adult content. The intent of this competition is to support the Deaf and Hard of Hearing community in engaging with technology on an equal footing with other adults.

#### character_to_prediction_index.json

#### [train/supplemental]_landmarks/ 
The landmark data. The landmarks were extracted from raw videos with the MediaPipe holistic model. Not all of the frames necessarily had visible hands or hands that could be detected by the model.
The landmark files contain the same data as in the ASL Signs competition (minus the row ID column) but reshaped into a wide format. This allows you to take advantage of the Parquet format to entirely skip loading landmarks that you aren't using.

* sequence_id - A unique identifier for the landmark sequence. Most landmark files contain 1,000 sequences. The sequence ID is used as the dataframe index.
* frame - The frame number within a landmark sequence.
* [x/y/z]_[type]_[landmark_index] - There are now 1,629 spatial coordinate columns for the x, y and z coordinates for each of the 543 landmarks. The type of landmark is one of ['face', 'left_hand', 'pose', 'right_hand']. Details of the hand landmark locations can be found here. The spatial coordinates have already been normalized by MediaPipe. Note that the MediaPipe model is not fully trained to predict depth so you may wish to ignore the z values. The landmarks have been converted to float32.
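
The 1,629 figure follows from 543 landmarks times three axes. The sketch below rebuilds the column names from the naming pattern described above. The per-type landmark counts (468 face, 21 per hand, 33 pose) are those of MediaPipe Holistic and are assumed here rather than read from the files, and `coordinate_columns` is a hypothetical helper.

```python
# Landmark counts per type as produced by MediaPipe Holistic (assumed, not read from the data)
LANDMARK_COUNTS = {"face": 468, "left_hand": 21, "pose": 33, "right_hand": 21}


def coordinate_columns(types=None, axes=("x", "y", "z")):
    """Build wide-format column names such as 'x_right_hand_0' (hypothetical helper)."""
    types = list(LANDMARK_COUNTS) if types is None else list(types)
    return [
        f"{axis}_{landmark_type}_{index}"
        for axis in axes
        for landmark_type in types
        for index in range(LANDMARK_COUNTS[landmark_type])
    ]


all_columns = coordinate_columns()
# Hands only, without the z axis that the description suggests ignoring
hand_columns = coordinate_columns(types=["left_hand", "right_hand"], axes=("x", "y"))
print(len(all_columns), len(hand_columns))  # 1629 84
```

Passing `["frame"] + hand_columns` as the `columns` argument of `pd.read_parquet` should then load only the frame number and the hand x/y coordinates, which is the column skipping the description refers to.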

In [None]:
# Import libraries
import json
import os
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import seaborn as sns

pio.templates.default = "simple_white"
print("importing..")

In [None]:
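def map_new_to_old_style(sequence):
    """Reshape a wide landmark sequence into long format (one row per frame and landmark)."""
    types = []
    landmark_indexes = []
    # Assumes sequence_id is the index, so column 0 is `frame`
    # and columns 1..543 hold the x coordinates of the 543 landmarks
    for column in list(sequence.columns)[1:544]: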
        parts = column.split("_")
        if len(parts) == 4:
            types.append(parts[1] + "_" + parts[2])
        else:
            types.append(parts[1])

        landmark_indexes.append(int(parts[-1]))

    data = {
        "frame": [],
        "type": [],
        "landmark_index": [],
        "x": [],
        "y": [],
        "z": []
    }

    for index, row in sequence.iterrows():
        data["frame"] += [int(row.frame)]*543
        data["type"] += types
        data["landmark_index"] += landmark_indexes

        for _type, landmark_index in zip(types, landmark_indexes):
            data["x"].append(row[f"x_{_type}_{landmark_index}"])
            data["y"].append(row[f"y_{_type}_{landmark_index}"])
            data["z"].append(row[f"z_{_type}_{landmark_index}"])

    return pd.DataFrame.from_dict(data)

# assign desired colors to landmarks
def assign_color(row):
    if row == 'face':
        return 'red'
    elif 'hand' in row:
        return 'dodgerblue'
    else:
        return 'green'

# specifies the plotting order
def assign_order(row):
    if row.type == 'face':
        return row.landmark_index + 101
    elif row.type == 'pose':
        return row.landmark_index + 30
    elif row.type == 'left_hand':
        return row.landmark_index + 80
    else:
        return row.landmark_index
    
def visualise2d_landmarks(parquet_df, title=""):
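    # Each list is a polyline through landmarks in plot_order space (see assign_order)
    connections = [
        # right hand (plot_order 0-20): thumb, index, middle, ring, pinky
        [0, 1, 2, 3, 4],
        [0, 5, 6, 7, 8],
        [0, 9, 10, 11, 12],
        [0, 13, 14, 15, 16],
        [0, 17, 18, 19, 20],

        # pose (plot_order 30-62): head, mouth, arms and shoulders, torso sides and legs, hips
        [38, 36, 35, 34, 30, 31, 32, 33, 37],
        [40, 39],
        [52, 46, 50, 48, 46, 44, 42, 41, 43, 45, 47, 49, 45, 51],
        [42, 54, 56, 58, 60, 62, 58],
        [41, 53, 55, 57, 59, 61, 57],
        [54, 53],

        # left hand (plot_order 80-100): thumb, index, middle, ring, pinky
        [80, 81, 82, 83, 84],
        [80, 85, 86, 87, 88],
        [80, 89, 90, 91, 92],
        [80, 93, 94, 95, 96],
        [80, 97, 98, 99, 100],
    ]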

    parquet_df = map_new_to_old_style(parquet_df)
    frames = sorted(set(parquet_df.frame))
    first_frame = min(frames)
    parquet_df['color'] = parquet_df.type.apply(lambda row: assign_color(row))
    parquet_df['plot_order'] = parquet_df.apply(lambda row: assign_order(row), axis=1)
    first_frame_df = parquet_df[parquet_df.frame == first_frame].copy()
    first_frame_df = first_frame_df.sort_values(["plot_order"]).set_index('plot_order')


    frames_l = []
    for frame in frames:
        filtered_df = parquet_df[parquet_df.frame == frame].copy()
        filtered_df = filtered_df.sort_values(["plot_order"]).set_index("plot_order")
        traces = [go.Scatter(
            x=filtered_df['x'],
            y=filtered_df['y'],
            mode='markers',
            marker=dict(
                color=filtered_df.color,
                size=9))]

        for seg in connections:
            trace = go.Scatter(
                    x=filtered_df.loc[seg]['x'],
                    y=filtered_df.loc[seg]['y'],
                    mode='lines',
            )
            traces.append(trace)
        # Update every trace: one marker trace plus one line trace per connection
        frame_data = go.Frame(data=traces, traces=list(range(len(traces))))
        frames_l.append(frame_data)

    traces = [go.Scatter(
        x=first_frame_df['x'],
        y=first_frame_df['y'],
        mode='markers',
        marker=dict(
            color=first_frame_df.color,
            size=9
        )
    )]
    for i, seg in enumerate(connections):
        trace = go.Scatter(
            x=first_frame_df.loc[seg]['x'],
            y=first_frame_df.loc[seg]['y'],
            mode='lines',
            line=dict(
                color='black',
                width=2
            )
        )
        traces.append(trace)
    fig = go.Figure(
        data=traces,
        frames=frames_l
    )


    fig.update_layout(
        width=500,
        height=800,
        scene={
            'aspectmode': 'data',
        },
        updatemenus=[
            {
                "buttons": [
                    {
                        "args": [None, {"frame": {"duration": 100,
                                                  "redraw": True},
                                        "fromcurrent": True,
                                        "transition": {"duration": 0}}],
                        "label": "&#9654;",
                        "method": "animate",
                    },
                    {
                        "args": [[None], {"frame": {"duration": 0, "redraw": False},
                                          "mode": "immediate",
                                          "transition": {"duration": 0}}],
                        "label": "&#9612;&#9612;",
                        "method": "animate",
                    },
                ],
                "direction": "left",
                "pad": {"r": 100, "t": 100},
                "font": {"size":20},
                "type": "buttons",
                "x": 0.1,
                "y": 0,
            }
        ],
    )
    camera = dict(
        up=dict(x=0, y=-1, z=0),
        eye=dict(x=0, y=0, z=2.5)
    )
    fig.update_layout(title_text=title, title_x=0.5)
    fig.update_layout(scene_camera=camera, showlegend=False)
    fig.update_layout(xaxis = dict(visible=False),
            yaxis = dict(visible=False),
    )
    fig.update_yaxes(autorange="reversed")

    fig.show()
    
def get_phrase(df, file_id, sequence_id):
    return df[
        np.logical_and(
            df.file_id == file_id, 
            df.sequence_id == sequence_id
        )
    ].phrase.iloc[0]
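
The helpers above hinge on two small steps: splitting a wide column name into axis, type, and landmark index, and offsetting each landmark index so that the `connections` lists can address right hand, pose, left hand, and face in one shared index space. A minimal standalone sketch of both steps follows; `parse_column` and `plot_order` are hypothetical names, and the offsets are copied from `assign_order`.

```python
# Offsets copied from assign_order: right hand 0-20, pose 30-62, left hand 80-100, face 101+
PLOT_OFFSETS = {"right_hand": 0, "pose": 30, "left_hand": 80, "face": 101}


def parse_column(column):
    """Split 'x_right_hand_3' into ('x', 'right_hand', 3)."""
    axis, rest = column.split("_", 1)            # the axis is the first token
    landmark_type, index = rest.rsplit("_", 1)   # the landmark index is the last token
    return axis, landmark_type, int(index)


def plot_order(landmark_type, landmark_index):
    """Position of a landmark in the shared index space used by `connections`."""
    return landmark_index + PLOT_OFFSETS[landmark_type]


for name in ["x_right_hand_3", "y_pose_12", "z_left_hand_20", "x_face_0"]:
    axis, landmark_type, index = parse_column(name)
    print(name, "->", (axis, landmark_type, index), "plot order", plot_order(landmark_type, index))
```

Splitting from both ends avoids the special case for two-word types such as `left_hand` that `map_new_to_old_style` handles by checking the number of parts.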

# Explore Metadata

### explore supplemental_metadata

In [None]:
# Load the supplemental_metadata.csv file into memory
supplemental_df = pd.read_csv("/kaggle/input/asl-fingerspelling/supplemental_metadata.csv")
pd.set_option('display.max_columns', None)
supplemental_df.head(3)

In [None]:
# All phrases, one per sequence; the length of this Series is the total phrase count
phrase_count = supplemental_df["phrase"]

In [None]:
# Unique phrases; the length of this array is the number of distinct phrases
unique_phrase = supplemental_df["phrase"].unique()

In [None]:
print("number of phrases: {} and number of unique phrases: {}".format(len(phrase_count), len(unique_phrase)))

In [None]:
# type(phrase_count),type(unique_phrase)

### Create a separate DataFrame to store phrases and their value counts

In [None]:
# Get the value counts for the 'phrase' column
phrase_counts = supplemental_df['phrase'].value_counts()

# Create a new DataFrame with 'phrases' and 'phrase_count' columns
phrase_data = pd.DataFrame({'phrases': phrase_counts.index, 'phrase_count': phrase_counts.values})

In [None]:
phrase_data.head(10)
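
As a cross-check that does not depend on pandas, the same kind of frequency table can be sketched with the standard library. The `phrases` list below is made-up illustration data, not taken from the dataset.

```python
from collections import Counter

# Made-up phrases for illustration only (not taken from the dataset)
phrases = ["hello world"] * 3 + ["good morning"] * 2 + ["see you"]

counts = Counter(phrases)

# Counterpart of value_counts(): (phrase, count) pairs sorted by descending count
ranked = counts.most_common()

most_frequent = ranked[:2]    # top of the table
least_frequent = ranked[-1:]  # bottom of the table

print(len(phrases), len(counts))  # 6 3
print(most_frequent)              # [('hello world', 3), ('good morning', 2)]
print(least_frequent)             # [('see you', 1)]
```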


### visualize data for 5 most frequent and least frequent phrases

In [None]:
fig = px.bar(phrase_data.iloc[:5,:], x='phrase_count', y='phrases', color='phrases', orientation='h')
fig.update_layout(
    title={
        'text': "count of top 5 most frequent phrases",
        'y':0.96,
        'x':0.4,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    legend_title_text='Phrase:'
)

fig.show()

In [None]:
# tail(5) selects the five least frequent phrases without hardcoding row positions
fig = px.bar(phrase_data.tail(5), x='phrase_count', y='phrases', color='phrases', orientation='h')
fig.update_layout(
    title={
        'text': "count of 5 least frequent phrases",
        'y':0.96,
        'x':0.4,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    legend_title_text='Phrase:'
)

fig.show()

# Explore Landmark

## Loading a parquet file

In [None]:
# Path of the first landmark file that contains the phrase "coming up with killer sound bites"
top_phrase = supplemental_df[supplemental_df["phrase"]=="coming up with killer sound bites"]['path'].values[0]
top_phrase

In [None]:
base_dir=Path("/kaggle/input/asl-fingerspelling")

### explore landmark file of top_phrase

In [None]:
landmark_file = pd.read_parquet(base_dir/top_phrase)
landmark_file.head()

In [None]:
# Move sequence_id from the index into a regular column
landmark_file = landmark_file.reset_index()

In [None]:
landmark_file.head()

In [None]:
# len(landmark_file.columns)  ## 1630

In [None]:
landmark_file.shape

In [None]:
# view number of unique sequence_ids in dataset 
# landmark_file["sequence_id"].nunique() # 1000

In [None]:
# return 1st two sequence_ids
landmark_file["sequence_id"].unique()[:2]

In [None]:
# landmark_file["frame"].nunique() # 507

In [None]:
#fetch landmark data for sequence id=1535467051
landmark_1st_id=landmark_file[landmark_file["sequence_id"]==1535467051]

In [None]:
landmark_1st_id

# Explore Train File

In [None]:
train_data=pd.read_csv("/kaggle/input/asl-fingerspelling/train.csv")
train_data.shape

In [None]:
train_data.head()

#### Explore file: /kaggle/input/asl-fingerspelling/character_to_prediction_index.json

In [None]:
char_to_pred = "/kaggle/input/asl-fingerspelling/character_to_prediction_index.json"

# Read the JSON file into a dictionary that maps each character to its prediction index.
# The `with` block closes the file even if loading fails.
with open(char_to_pred) as f:
    data = json.load(f)

# Split the mapping into parallel lists for the DataFrame built in the next cell
char = list(data.keys())
values = list(data.values())

# print("\n characters list:",char)
# print("\n values list:",values)

In [None]:
char_to_pred_index=pd.DataFrame({"char":char,"values":values})
char_to_pred_index.head(20)
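
The mapping loaded above turns each character of a phrase into an integer label. A minimal sketch of encoding and decoding with such a mapping follows. The four-entry `toy_mapping` is a made-up stand-in for the real JSON file, and `encode` and `decode` are hypothetical helpers.

```python
import json

# Made-up stand-in for character_to_prediction_index.json
# (the real file has its own characters and indices)
toy_mapping = json.loads('{"a": 0, "b": 1, "c": 2, " ": 3}')

# Reverse lookup: prediction index -> character
index_to_char = {index: char for char, index in toy_mapping.items()}


def encode(phrase, mapping):
    """Convert a phrase into a list of prediction indices."""
    return [mapping[char] for char in phrase]


def decode(indices, reverse_mapping):
    """Convert prediction indices back into a phrase."""
    return "".join(reverse_mapping[index] for index in indices)


encoded = encode("a cab", toy_mapping)
print(encoded)                         # [0, 3, 2, 0, 1]
print(decode(encoded, index_to_char))  # a cab
```

With the real file, `data` from the cell above would take the place of `toy_mapping`; a character missing from the mapping raises a `KeyError` in this sketch.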

# Visualizing the Sequence

### Let's pick a random file ID and sequence to visualize

In [None]:
# Unique file_ids listed in the supplemental metadata
file_ids = np.unique(supplemental_df['file_id'])
unique_file_ids = len(file_ids)

# Generate a random integer in [0, unique_file_ids)
random_id = np.random.randint(0, unique_file_ids)

# Pick the file_id at that position
random_file_id = file_ids[random_id]

In [None]:
# Get all different sequences in random file
signs = supplemental_df[supplemental_df['file_id'] == random_file_id ]

In [None]:
signs.head()

Most .parquet files contain 1,000 unique sequences. The cell below counts the sequences listed in the metadata for the selected file.

In [None]:
signs['sequence_id'].nunique()

In [None]:
# Get a random sequence_id from the selected file
random_squence_id = signs.sample()['sequence_id'].item()

#### Let's load the random file to visualize the sequence

In [None]:
path_to_sign = f"/kaggle/input/asl-fingerspelling/supplemental_landmarks/{random_file_id}.parquet"
parquet = pd.read_parquet(path_to_sign)

In [None]:
sequence = parquet[parquet.index == random_squence_id]
sequence

In [None]:
sequence_phrase = get_phrase(supplemental_df, random_file_id, random_squence_id)
visualise2d_landmarks(sequence, f"Phrase: {sequence_phrase}")
