# COGS 118B - Final Project Results & Discussion

# American Sign Language Recognition

# Names

- Allen Phu
- Kevin Yu
- Saksham Rai
- Rodrigo Lizaran-Molina

# Results

You may have done tons of work on this. Not all of it belongs here. 

Reports should have a __narrative__. Once you've looked through all your results over the quarter, decide on one main point and 2-4 secondary points you want us to understand. Include the detailed code and analysis results of those points only; you should spend more time/code/plots on your main point than the others.

If you went down any blind alleys that you later decided not to pursue, please don't abuse the TAs' time by throwing in 81 lines of code and 4 plots related to something you actually abandoned. Consider deleting things that are not important to your narrative. If it's slightly relevant to the narrative or you just want us to know you tried something, you could keep it in by summarizing the result in this report in a sentence or two, moving the actual analysis to another file in your repo, and providing us a link to that file.

### Main Takeaways
After working through this project, we found that unsupervised methods performed poorly on this dataset: multiple clustering algorithms, including GMM and K-Means, gave us relatively poor metrics. Supervised models fared much better. The CNN gave us the best performance, with the SVM coming in second. This matches their reputations: CNNs are known to be strong models for image processing, and SVMs are as well, though not to the same degree.

## All Library Imports

In [3]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score
from sklearn.metrics import rand_score, adjusted_rand_score
from scipy import stats


### CNN - Results

### CNN - Discussion

### GMM - Results
- Files are in gmm_traindata and gmm_testdata in the parent repository.

We used ARI, RI, AIC, BIC, and Silhouette Score to gauge how well GMMs cluster the Sign Language MNIST dataset for ASL. The extended code and analysis are in the aforementioned gmm_traindata and gmm_testdata; the metrics are summarized below. This section deals only with the "train" dataset, but we also did the same work on the "test" dataset, in the files listed up top. Of note, we also tried standardizing the data for GMMs, noted by "scaled_predicted"; the corresponding values below carry "scaled" in their names. In the interest of time, we print the final values directly in this report instead of recomputing them. All other information is within the two files. The values below come from a GMM with 5 components fit on the training data.
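For reference, the five metrics named above can be computed with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the code in gmm_traindata: it uses synthetic blobs from `make_blobs` as a stand-in for the image pixels, and the helper name `gmm_metrics` is hypothetical.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, rand_score, silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def gmm_metrics(X, y_true, n_components=5, seed=0):
    """Fit a GMM (hypothetical helper) and return the five metrics used here."""
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    predicted = gmm.fit_predict(X)
    return {
        # Label-based scores: compare cluster assignments to the true labels.
        "adjusted_rand": adjusted_rand_score(y_true, predicted),
        "rand": rand_score(y_true, predicted),
        # Model-selection criteria: lower is better, on the same data only.
        "aic": gmm.aic(X),
        "bic": gmm.bic(X),
        # Geometry-based score in [-1, 1]; needs no true labels.
        "silhouette": silhouette_score(X, predicted),
    }


# ASSUMPTION: synthetic stand-in data, NOT the Sign Language MNIST pixels.
X, y = make_blobs(n_samples=600, centers=5, n_features=8, random_state=0)

nonscaled = gmm_metrics(X, y)
scaled = gmm_metrics(StandardScaler().fit_transform(X), y)

for name, metrics in [("non-scaled", nonscaled), ("scaled", scaled)]:
    print(name)
    for key, value in metrics.items():
        print(f"  {key}: {value:.4f}")
```

The same helper could be called in a loop over several values of `n_components` to compare fits on one dataset.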

In [7]:
adjusted_rand_nonscaled = 0.04676689255258681
non_adjusted_rand_nonscaled = 0.7703969635432097
adjusted_rand_scaled = 0.023481092716432513
non_adjusted_rand_scaled = 0.769771644446842
aic = 104560606.05819455
scaled_aic = -59301488.58365816
bic = 117240621.61718772
scaled_bic = -46621473.02466498
silhouette_nonscaled = 0.02272728942562161
silhouette_scaled = 0.004961856247034123

# NON-SCALED
print("Non-scaled Adjusted Rand Score:", adjusted_rand_nonscaled)
print("Non-scaled Non-Adjusted Rand Score:", non_adjusted_rand_nonscaled, '\n')

# SCALED
print("Scaled Adjusted Rand Score:", adjusted_rand_scaled)
print("Scaled Non-Adjusted Rand Score:", non_adjusted_rand_scaled, '\n')

# aic
print("aic:", aic)
print("scaled_aic:", scaled_aic, '\n')

# bic
print("bic:", bic)
print("scaled_bic:", scaled_bic, '\n')

# silhouette score
print("silhouette_nonscaled:", silhouette_nonscaled)
print("silhouette_scaled:", silhouette_scaled)

Non-scaled Adjusted Rand Score: 0.04676689255258681
Non-scaled Non-Adjusted Rand Score: 0.7703969635432097 

Scaled Adjusted Rand Score: 0.023481092716432513
Scaled Non-Adjusted Rand Score: 0.769771644446842 

aic: 104560606.05819455
scaled_aic: -59301488.58365816 

bic: 117240621.61718772
scaled_bic: -46621473.02466498 

silhouette_nonscaled: 0.02272728942562161
silhouette_scaled: 0.004961856247034123


### GMM - Discussion

Our initial scores were very poor across the board, so we also tried scaling the data to see if it would make any difference. We didn't expect much, and didn't get much: the scaled ARI and Silhouette scores came out slightly lower than the non-scaled ones.

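While the non-adjusted Rand scores looked promising for GMM (about 0.77), the adjusted Rand scores were quite bad, at below 0.05 for both the scaled and non-scaled data. The non-adjusted score is not corrected for chance, so it stays high whenever most pairs of points fall in different clusters and in different classes; the adjusted score, which is close to 0 for random labelings, is the more informative of the two. This pattern continued with our Silhouette scores, which stayed close to 0 and so indicate heavily overlapping clusters. Note that AIC and BIC are relative measures: they are only comparable between models fit to the same data, so the scaled and non-scaled values cannot be ranked against each other. After trying various values of "n_components", and after trying GMM on the test dataset as well, we came to the conclusion that GMM was a poor fit for our dataset.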
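To illustrate why a Rand score near 0.77 says little on its own, the sketch below scores purely random cluster assignments. It assumes 24 letter classes in the labels and 5 clusters (matching the 5-component GMM); the sample size and seed are arbitrary, and the labels are synthetic rather than taken from our data. Under these assumptions a random pair of points "agrees" with probability (1/24)(1/5) + (23/24)(4/5) = 0.775, so the expected Rand score is roughly 0.775 even though the clustering carries no information, while the adjusted Rand score should land near 0.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, rand_score

rng = np.random.default_rng(0)

n_samples = 5000   # arbitrary sample size for the illustration
n_classes = 24     # ASSUMPTION: number of letter classes in the labels
n_clusters = 5     # matches the 5-component GMM discussed above

# Synthetic "true" labels and completely random cluster assignments.
y_true = rng.integers(0, n_classes, size=n_samples)
y_random = rng.integers(0, n_clusters, size=n_samples)

ri = rand_score(y_true, y_random)
ari = adjusted_rand_score(y_true, y_random)

# Chance-level agreement for independent, uniform labelings.
expected_ri = (1 / n_classes) * (1 / n_clusters) + (
    1 - 1 / n_classes
) * (1 - 1 / n_clusters)

print(f"Rand score of random clusters:          {ri:.3f}")
print(f"Expected Rand score under chance:       {expected_ri:.3f}")
print(f"Adjusted Rand score of random clusters: {ari:.3f}")
```

Seen this way, the adjusted Rand score is the number to read when judging whether the clusters line up with the letters.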

### K-Means - Results


### K-Means - Discussion

### SVM


# Discussion

### Interpreting the result

OK, you've given us quite a bit of tech information above; now it's time to tell us what to pay attention to in all that. Think clearly about your results, decide on one main point and 2-4 secondary points you want us to understand. Highlight HOW your results support those points. You probably want 2-5 sentences per point.

### Limitations

Are there any problems with the work?  For instance would more data change the nature of the problem? Would it be good to explore more hyperparams than you had time for?   

### Ethics & Privacy

Our American Sign Language recognition research project utilises the Sign Language MNIST dataset of hand signs. Even though this dataset has been fairly well corroborated and extensively updated over the years, our project has an application in helping the hard of hearing, so it is necessary to consider some possible ethical concerns:

1. Clustering Algorithm Bias: The Sign Language MNIST training data might lack diversity in hand shapes, skin tones, signing styles, or individual variations, so the algorithm may prioritise majority patterns and classify underrepresented groups inaccurately. For instance, a model trained on a dataset skewed towards younger signers might struggle to recognize signs used more frequently by older adults. Even with a perfect algorithm, implicit biases in the dataset could be amplified by our clustering. For example, the data might contain more examples of right-handed signing, which could reduce the model's ability to recognize signs performed with the left hand. While such cases are less common, they are important to consider because their implications are significant: misclassifications can lead to communication breakdowns and misunderstandings which, over an extended period and a larger population, may even lead to social exclusion for specific groups within the Deaf community. We aim to address these concerns by trialling different clustering algorithms to mitigate the risk of false clustering. Hierarchical clustering, for instance, builds a hierarchy of clusters, allowing us to zoom in on specific signing styles or hand shapes that are potentially underrepresented in the data. Similarly, density-based methods like DBSCAN identify clusters based on data density, which may make them less susceptible to majority patterns that could amplify bias. By evaluating these alternative algorithms on diverse test sets and comparing accuracy, precision, and F1 score across user groups, we will choose the model that best balances inclusivity and accuracy, ensuring equitable representation across all user groups.

2. Data Privacy Issue: The Sign Language MNIST dataset we are utilising contains 27,455 cases of training data. These are real-life pictures collected from multiple people. In high-security settings, pictures of hands can also serve as identifiers. For this reason, like any other dataset, ours is subject to protecting the privacy of the individuals in it. In addition, we will also collect input data from voluntary non-ASL communicators to test the real-time interpretation feature of the app. We hope to address this by keeping the identity of the volunteers undisclosed and by having users interact with the model only as a "black box": without giving them any information about existing ASL hand signs, we hope to test how accurately the model matches whatever signs people make to the closest actual ASL sign.

### Conclusion

Reiterate your main point and in just a few sentences tell us how your results support it. Mention how this work would fit in the background/context of other work in this field if you can. Suggest directions for future work if you want to.

# Footnotes

<a name="onenote"></a>1.[^](#one): Ashley Chow, Glenn Cameron, Manfred Georg, Mark Sherwood, Phil Culliton, Sam Sepah, Sohier Dane, Thad Starner. (2023). Google - American Sign Language Fingerspelling Recognition. Kaggle. https://kaggle.com/competitions/asl-fingerspelling<br> 

<a name="twonote"></a>2.[^](#two): El Moujahid, K. (2021, December 1). Machine learning to make sign language more accessible. Google. https://blog.google/outreach-initiatives/accessibility/ml-making-sign-language-more-accessible/<br> 

<a name="threenote"></a>3.[^](#three): Garimella, M. (2022, August 23). Sign Language Recognition with Advanced Computer Vision. Medium. https://towardsdatascience.com/sign-language-recognition-with-advanced-computer-vision-7b74f20f3442<br> 

<a name="fournote"></a>4.[^](#four): tecperson. (October 2017). Sign Language MNIST. Kaggle. https://www.kaggle.com/datasets/datamunge/sign-language-mnist<br> 

<a name="fivenote"></a>5.[^](#five): Pathan, R. K., Biswas, M., Yasmin, S., Khandaker, M. U., Salman, M., & Youssef, A. A. F. (2023). Sign language recognition using the fusion of image and hand landmarks through multi-headed convolutional neural network. Scientific Reports, 13(1), 16975. https://doi.org/10.1038/s41598-023-43852-x<br> 

<a name="sixnote"></a>6.[^](#six): Chen, Y. (2023, December 29). Learning American Sign Language (ASL) with Google’s Teachable Machine: A No-Code Experiment. Medium. https://medium.com/@dynotes/breaking-barriers-using-googles-no-code-approach-for-sign-language-recognition-and-learning-fc92ae16522c#bypass<br> 

<a name="MLTeaching"></a>7.[^](#MLTeachingNote): Google open Teachable Machine. https://teachablemachine.withgoogle.com/train/tiny_image<br>
