# Tunisian Sign Language EDA

## Problem statement

The situation for deaf people in Tunisia is challenging: estimates of
the deaf population range from 40,000 to 60,000 people. These individuals
face significant difficulties communicating with hearing people, most of
whom do not know sign language. Because spoken communication is the norm
in most settings, deaf individuals often experience isolation and
exclusion from society.
The lack of access to professional interpretation services in many circumstances
exacerbates these challenges. This can lead to underemployment,
public health issues, and other difficulties that create a lasting
barrier between deaf individuals and society.
Unfortunately, there are currently no official

## Proposed solution

**Our proposed solution involves developing a deep learning model for Tunisian
Sign Language recognition and integrating it into a mobile application using
TensorFlow Lite.**

This work was inspired by these two notebooks:
* https://www.kaggle.com/code/danielpeshkov/animated-data-visualization
* https://www.kaggle.com/code/dschettler8845/gislr-learn-eda-baseline

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_addons as tfa

%matplotlib inline
import matplotlib.pyplot as plt # Matlab-style plotting
import seaborn as sns
color = sns.color_palette()
sns.set_style('darkgrid')

from tqdm.notebook import tqdm
from sklearn.model_selection import train_test_split, GroupShuffleSplit 

import glob
import sys
import os
import math
import gc
import sklearn
import scipy
import json


## EDA & Visualization 

In [None]:
# Load the train dataframe
train = pd.read_csv('/kaggle/input/tunisian-sign-language-landmarks/train.csv')
print(train.shape)

In [None]:
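# Rewrite the Windows-style relative paths so they point at the Kaggle input directory.
# regex=False treats the backslash literally, whatever the pandas version's default is.
train['path'] = train['path'].str.replace('output_dir\\', '/kaggle/input/tunisian-sign-language-landmarks/output_dir/', regex=False)
# Normalize inconsistent sign labels
train.loc[train["sign"]=="tleth", "sign"] = "thleth"
train.loc[train["sign"]=="inty", "sign"] = "Inty"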

In [None]:
train.head()

In [None]:
# Define the number of categories
num_signs = len(train['sign'].unique())
num_signs

The dataset includes 11 signs and covers the primary terms used in vital domains of the deaf community in Tunisia. These signs were classified into five categories:
* Demands: commonly used expressions such as "Hello", "How are you", and "Help me".
* Famille (family): family words such as mom, dad, and aunt.
* Destinations: popular destinations such as the hospital and the municipality.
* Jours (days): days of the week.
* Transportation: modes of transportation such as vehicle and bus.

In [None]:
# plt.figure(figsize=(10, 6))
# sns.countplot(data = train, y = 'Categorie', order = train.Categorie.value_counts().index)

In [None]:
fig, ax = plt.subplots(figsize=(8, 8))
train["sign"].value_counts().head(25).sort_values(ascending=True).plot(
    kind="barh", ax=ax, title="Signs in Training Dataset"
)
ax.set_xlabel("Number of Training Examples")
plt.show()

### Sequence landmarks data

**Overall nature of the data**

We have x-y-z coordinates for the landmarks of both the left and the right hand in every frame of a sequence. We need to use all or some of the frames to classify the sequence as a whole into one of the ten or so signs. Below are the hand landmark indices from the MediaPipe documentation.

![Mediapipe Hand landmarks](https://mediapipe.dev/images/mobile/hand_landmarks.png)
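
As a reading aid for the figure, the 21 indices can be spelled out in code. The sketch below rebuilds the MediaPipe hand landmark names in index order. The helper name `hand_landmark_names` is made up for this illustration and is not part of the dataset or of MediaPipe's API.

```python
# MediaPipe hand landmarks: index 0 is the wrist, then four joints per finger.
FINGERS = ["THUMB", "INDEX_FINGER", "MIDDLE_FINGER", "RING_FINGER", "PINKY"]
THUMB_JOINTS = ["CMC", "MCP", "IP", "TIP"]
FINGER_JOINTS = ["MCP", "PIP", "DIP", "TIP"]


def hand_landmark_names():
    """Return the 21 landmark names in index order (hypothetical helper)."""
    names = ["WRIST"]
    for finger in FINGERS:
        joints = THUMB_JOINTS if finger == "THUMB" else FINGER_JOINTS
        names.extend(f"{finger}_{joint}" for joint in joints)
    return names


HAND_LANDMARK_NAMES = hand_landmark_names()

# Print the index -> name table for one hand
for idx, name in enumerate(HAND_LANDMARK_NAMES):
    print(idx, name)
```

The same grouping (thumb 1-4, index 5-8, middle 9-12, ring 13-16, pinky 17-20) is what the plotting code later in the notebook relies on when it draws each finger as a separate line.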

In [None]:
example_fn = train.query('sign == "car"')["path"].values[0]

example_landmark = pd.read_parquet(f"{example_fn}")
example_landmark.head()

As we can see, each row holds the x-y-z location of one landmark. When we move on to the modeling part, we could combine all the landmarks of a frame into a single row. The following cells look at the data in a little more detail.
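
To make the "one row per frame" idea concrete, here is a minimal sketch in plain Python on synthetic records. It assumes each record carries `frame`, `type`, `x`, `y`, `z` as shown above, plus a `landmark_index` field, which is an assumption about the file layout. The function name `frames_to_rows` is hypothetical.

```python
from collections import defaultdict


def frames_to_rows(records):
    """Group long-format landmark records into one flat list per frame."""
    by_frame = defaultdict(list)
    for rec in records:
        by_frame[rec["frame"]].append(rec)

    rows = {}
    for frame, recs in by_frame.items():
        # Fixed ordering so every frame yields its values in the same column order
        recs = sorted(recs, key=lambda r: (r["type"], r["landmark_index"]))
        flat = []
        for r in recs:
            flat.extend([r["x"], r["y"], r["z"]])
        rows[frame] = flat
    return rows


# Tiny synthetic example: 2 frames with 2 landmarks of one hand each
records = [
    {"frame": f, "type": "left_hand", "landmark_index": i,
     "x": 0.1 * i, "y": 0.2 * i, "z": float(f)}
    for f in range(2)
    for i in range(2)
]
rows = frames_to_rows(records)
print(len(rows), len(rows[0]))  # 2 frames, 2 landmarks * 3 coordinates = 6 values
```

With the full 42 landmarks per frame, each flat row would hold 42 * 3 = 126 values.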

In [None]:
example_landmark.shape

The number of rows is 21 landmarks × 2 hands × 38 frames = 1596.
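
The arithmetic can be checked in both directions: from a frame count to the expected row count, and from a row count back to the number of complete frames. This is a small sketch. The helper names are hypothetical, and the value 42 matches the `ROWS_PER_FRAME` constant used later in the notebook.

```python
LANDMARKS_PER_HAND = 21
HANDS = 2
ROWS_PER_FRAME = LANDMARKS_PER_HAND * HANDS  # 42 rows describe one frame


def expected_rows(n_frames):
    """Number of rows a file with n_frames complete frames should contain."""
    return n_frames * ROWS_PER_FRAME


def frames_from_rows(n_rows):
    """Return (complete frames, leftover rows that do not fill a frame)."""
    return divmod(n_rows, ROWS_PER_FRAME)


print(expected_rows(38))       # 1596
print(frames_from_rows(1596))  # (38, 0)
```

A non-zero leftover would indicate a file whose last frame is incomplete.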

In [None]:
example_landmark.describe()

In [None]:
unique_frames = example_landmark["frame"].nunique()
unique_types = example_landmark["type"].nunique()

print(
    f"The file has {unique_frames} unique frames and {unique_types} types of landmarks"
)

In [None]:
cv_files = train.query('sign == "cv"')["path"].values
for i, f in enumerate(cv_files):
    example_landmark = pd.read_parquet(f)
    unique_frames = example_landmark["frame"].nunique()
    unique_types = example_landmark["type"].nunique()
    types_in_video = example_landmark["type"].unique()
    print(
        f"The file has {unique_frames} unique frames and {unique_types} unique types: {types_in_video}"
    )
    if i == 20:
        break

In [None]:
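N_PARQUETS_TO_READ = 155  # Cap on how many parquet files to read, so we don't have to load every file

combined_meta = {}
# enumerate gives a positional counter that does not depend on the DataFrame index
for i, (_, d) in enumerate(tqdm(train.iterrows(), total=len(train))):
    file_path = d["path"]
    example_landmark = pd.read_parquet(file_path)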
    # Get the number of landmarks with x,y,z data per type
    meta = (
        example_landmark.dropna(subset=["x", "y", "z"])["type"].value_counts().to_dict()
    )
    meta["frame"] = example_landmark["frame"].nunique()
    xyz_meta = (
        example_landmark.agg(
            {
                "x": ["min", "max", "mean"],
                "y": ["min", "max", "mean"],
                "z": ["min", "max", "mean"],
            }
        )
        .unstack()
        .to_dict()
    )

    for key in xyz_meta.keys():
        new_key = key[0] + "_" + key[1]
        meta[new_key] = xyz_meta[key]
    combined_meta[file_path] = meta
    # Stop once N_PARQUETS_TO_READ files have been processed (i starts at 0)
    if i + 1 >= N_PARQUETS_TO_READ:
        break


In [None]:
# Attach the per-file metadata to the train dataframe, matching on the file path
train_with_meta = train.merge(
    pd.DataFrame(combined_meta).T.reset_index().rename(columns={"index": "path"}),
    on="path",
    how="left",
)
train_with_meta.to_parquet("train_with_meta.parquet")

In [None]:
train_with_meta.head()

In [None]:
train_with_meta[["left_hand", "right_hand","pose"]].sum().sort_values().plot(
    kind="barh", title="Sum of Rows by Landmark Type"
)
plt.show()

## Data animation

This is the file we will be looking at. Feel free to change the directory to any of the files available to visualize them as well.

In [None]:
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

In [None]:
# Change this path to any other parquet file in the dataset
path_to_sign = '/kaggle/input/tunisian-sign-language-landmarks/output_dir/005cv.parquet'
sign = pd.read_parquet(path_to_sign)
# Flip y so the hand is drawn upright (image coordinates grow downward)
sign.y = sign.y * -1

#### sign.head(50)

In [None]:
sign.frame.unique()

In [None]:
def get_hand_points(hand):
    # Landmark indices for each polyline to draw, following the MediaPipe hand layout
    segments = [
        [0, 1, 2, 3, 4],        # Thumb
        [5, 6, 7, 8],           # Index
        [9, 10, 11, 12],        # Middle
        [13, 14, 15, 16],       # Ring
        [17, 18, 19, 20],       # Pinky
        [0, 5, 9, 13, 17, 0],   # Palm outline
    ]
    x = [[hand.iloc[i].x for i in seg] for seg in segments]
    y = [[hand.iloc[i].y for i in seg] for seg in segments]
    return x, y

In [None]:
sign["type"].unique()

In [None]:
sign = sign[sign.type=='left_hand'].dropna()
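def animation_frame(f):
    # Draw the left hand for frame f
    frame = sign[sign.frame==f]
    left = frame[frame.type=='left_hand']
    lx, ly = get_hand_points(left)
    ax.clear()
    for i in range(len(lx)):
        ax.plot(lx[i], ly[i])
    # Re-apply the fixed limits after clearing so the hand does not jump around
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(ymin, ymax)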

        
print(f"The sign being shown here is: {train[train.path==f'{path_to_sign}'].sign.values[0]}")

## These values set the limits on the graph to stabilize the video
xmin = sign.x.min() - 0.2
xmax = sign.x.max() + 0.2
ymin = sign.y.min() - 0.2
ymax = sign.y.max() + 0.2

fig, ax = plt.subplots()
l, = ax.plot([], [])
animation = FuncAnimation(fig, func=animation_frame, frames=sign.frame.unique())

HTML(animation.to_html5_video())

## Data preprocessing and modeling

In [None]:
DATA_COLUMNS    = ['x', 'y', 'z']
ROWS_PER_FRAME  = 42   # 21 landmarks x 2 hands
NUM_SHARDS      = 2
SAVE_PATH       = '/kaggle/working/'
BATCH_SIZE      = 32


In [None]:
def load_relevant_data_subset(pq_path):
    """Load the x/y/z columns of one parquet file as an array of shape (n_frames, ROWS_PER_FRAME, 3)."""
    data = pd.read_parquet(pq_path, columns=DATA_COLUMNS)
    # Drop trailing rows that do not fill a complete frame
    n_frames = len(data) // ROWS_PER_FRAME
    data = data.iloc[:n_frames * ROWS_PER_FRAME]
    data = data.values.astype(np.float32)
    return data.reshape(n_frames, ROWS_PER_FRAME, len(DATA_COLUMNS))


In [None]:
def tf_get_features(ftensor):
    def feat_wrapper(ftensor):
        return load_relevant_data_subset(ftensor.numpy().decode('utf-8'))
    return tf.py_function(
        feat_wrapper,
        [ftensor],
        Tout=tf.float32
    )

In [None]:
def set_shape(x):
    
    # None dimensions can be of any length
    return tf.ensure_shape(x, (None, ROWS_PER_FRAME, len(DATA_COLUMNS)))

In [None]:
train

In [None]:
# Keep only the signs that have more than 3 samples in the train set
sign_counts = train["sign"].value_counts()
# .copy() avoids a SettingWithCopyWarning when columns are added later
train_filtered = train[train["sign"].isin(sign_counts[sign_counts > 3].index)].copy()
print(train_filtered["sign"].nunique())

In [None]:
# Add an ordinally encoded sign column (assign a number to each sign name)
train_filtered['sign_ord'] = train_filtered['sign'].astype('category').cat.codes

# Dictionaries to translate sign <-> ordinal encoded sign
SIGN2ORD = train_filtered[['sign', 'sign_ord']].set_index('sign').squeeze().to_dict()
ORD2SIGN = {code: name for name, code in SIGN2ORD.items()}

In [None]:
SIGN2ORD

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split

# assume X contains the input data and y contains the class labels
X = train_filtered.drop("sign",axis=1)
y = train_filtered.sign

# perform stratified train-validation split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, stratify=y)
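
The idea behind the stratified split can be illustrated without scikit-learn: every sign contributes roughly the same fraction of its samples to the validation set. The sketch below is a simplified stand-in written in plain Python, not scikit-learn's actual algorithm. The function name `stratified_split` and the toy label counts are made up for the illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.3, seed=0):
    """Split indices so each label keeps about test_size of its samples for validation."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, val_idx = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        # At least one validation sample per label
        n_val = max(1, round(len(idxs) * test_size))
        val_idx.extend(idxs[:n_val])
        train_idx.extend(idxs[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels: 10 samples of one sign and 4 of another
labels = ["car"] * 10 + ["cv"] * 4
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 10 4
```

This also shows why signs with very few samples were filtered out beforehand: a sign needs enough examples to appear in both the training and the validation set.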



In [None]:
# Rejoin the features and the labels column-wise (axis=1)
train_final = pd.concat([X_train, y_train], axis=1)
train_final.head()

In [None]:
# Rejoin the features and the labels column-wise (axis=1)
val = pd.concat([X_val, y_val], axis=1)
val.head()

In [None]:
X_ds = tf.data.Dataset.from_tensor_slices(
    train_final.path.values                                              # start with a dataset of the parquet paths
).map(
    tf_get_features                                                   # load individual sequences
).map(
    set_shape                                                         # set and enforce element shape
).apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=BATCH_SIZE) # apply batching function
)

# load and batch the labels
y_ds = tf.data.Dataset.from_tensor_slices(
    train_final.sign.map(SIGN2ORD).values.reshape(-1,1)
).batch(BATCH_SIZE)

# zip the features and labels
train_ds = tf.data.Dataset.zip((X_ds, y_ds))

In [None]:
# Sharding could be improved, as the distribution of elements in different shards should optimally be equal.
# Currently, it will be a sample from a uniform distribution because this is simple to implement
def shard_func(*_):
    return tf.random.uniform(shape=[], maxval=NUM_SHARDS, dtype=tf.int64)
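
The comment above notes that uniform random sharding does not guarantee equally sized shards. A plain-Python comparison of random assignment against a round-robin rule (element index modulo the number of shards) makes the difference visible. This is only an illustration of the balancing idea. Carrying it over to `tf.data`, for example by enumerating the dataset before saving, is an assumption and not something this notebook does.

```python
import random
from collections import Counter

NUM_SHARDS = 2


def random_shard(rng):
    """Pick a shard uniformly at random (mirrors the current approach)."""
    return rng.randrange(NUM_SHARDS)


def round_robin_shard(index):
    """Assign shards in turn so sizes differ by at most one element."""
    return index % NUM_SHARDS


def shard_sizes(assignments):
    """Count how many elements landed in each shard."""
    counts = Counter(assignments)
    return [counts.get(s, 0) for s in range(NUM_SHARDS)]


n_elements = 11
rng = random.Random(0)
random_sizes = shard_sizes(random_shard(rng) for _ in range(n_elements))
balanced_sizes = shard_sizes(round_robin_shard(i) for i in range(n_elements))

print("random:", random_sizes)
print("round-robin:", balanced_sizes)  # [6, 5]
```

With only a handful of batches, as in this dataset, random assignment can easily leave one shard noticeably larger than the other.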

In [None]:
train_ds.prefetch(tf.data.AUTOTUNE).save(SAVE_PATH, shard_func=shard_func)

In [None]:
def check_throughput(ds_path):
    for x in tqdm(tf.data.Dataset.load(ds_path)):
        pass

In [None]:
check_throughput(SAVE_PATH)

In [None]:
for x, y in tf.data.Dataset.load(SAVE_PATH).take(1):
    print(x.shape)
    print(y.shape)

In [None]:
# Set constants and pick important landmarks
LANDMARK_IDX = list(range(1,41))
DS_CARDINALITY = 10
N_SIGNS = 14

In [None]:
dataset = tf.data.Dataset.load(SAVE_PATH)

In [None]:
def preprocess(ragged_batch, labels):
    ragged_batch = tf.gather(ragged_batch, LANDMARK_IDX, axis=2)
    ragged_batch = tf.where(tf.math.is_nan(ragged_batch), tf.zeros_like(ragged_batch), ragged_batch)
    return tf.concat([ragged_batch[...,i] for i in range(3)],-1), labels

train_ds = dataset.map(preprocess)
# val_ds = dataset.take(VAL_SIZE).cache().prefetch(tf.data.AUTOTUNE)
# train_ds = dataset.skip(VAL_SIZE).cache().shuffle(20).prefetch(tf.data.AUTOTUNE)

In [None]:
X_val_ds = tf.data.Dataset.from_tensor_slices(
    val.path.values                                              # start with a dataset of the parquet paths
).map(
    tf_get_features                                                   # load individual sequences
).map(
    set_shape                                                         # set and enforce element shape
).apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=BATCH_SIZE) # apply batching function
)

# load and batch the labels
y_val_ds = tf.data.Dataset.from_tensor_slices(
    val.sign.map(SIGN2ORD).values.reshape(-1,1)
).batch(BATCH_SIZE)

# zip the features and labels
val_ds = tf.data.Dataset.zip((X_val_ds, y_val_ds))

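# Save the validation set to its own directory so it does not overwrite the saved training set
VAL_SAVE_PATH = '/kaggle/working/val/'
val_ds.prefetch(tf.data.AUTOTUNE).save(VAL_SAVE_PATH, shard_func=shard_func)

check_throughput(VAL_SAVE_PATH)

dataset = tf.data.Dataset.load(VAL_SAVE_PATH)

val_ds = dataset.map(preprocess)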

In [None]:
val_ds

In [None]:
train_ds

#### WORK STILL IN PROGRESS

## Thank you!

* Thank you for taking the time to read through my notebook. I hope you found it interesting and informative. If you have any feedback or suggestions for improvement, please don't hesitate to let me know in the comments.
* If you liked this notebook, please consider upvoting it so that others can discover it too. Your support means a lot to me, and it helps to motivate me to create more content in the future.