This was a semester-long independent research project auditing the accuracy of commercial facial emotion recognition classifiers. The paper and poster are included in this GitHub repository.
The use of artificial intelligence (AI) for decision-making, such as choosing whom to hire or predicting recidivism, is becoming more commonplace. However, ubiquity does not mean that AI is a perfected technology. This study focuses on one controversial application of AI: facial recognition. While prior studies have shown intersectional disparities in facial recognition performance, that work has not yet extended to auditing emotion detection, a subarea of facial recognition. To fill this gap, we modify the phenotypic features of faces from the AffectNet dataset. Using this augmented dataset, we then audit how well commercial systems determine the emotions of these faces. In particular, we look for intersectional disparities across our four target groups (lighter-skinned males, lighter-skinned females, darker-skinned males, and darker-skinned females). The aim of this project is to determine whether such differences exist and, if so, what biases these commercial systems might be upholding.
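The per-group comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the record format `(group, true_emotion, predicted_emotion)`, the group labels, and the function names `accuracy_by_group` and `largest_gap` are all assumptions made for this example, and no commercial API is called.

```python
from collections import defaultdict

# Hypothetical labels for the four target groups (names assumed for illustration).
GROUPS = ("lighter_male", "lighter_female", "darker_male", "darker_female")


def accuracy_by_group(records):
    """Return the fraction of correct emotion predictions per group.

    `records` is an iterable of (group, true_emotion, predicted_emotion)
    tuples; this layout is an assumption, not the study's actual format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    # Only groups that appear in the records are reported.
    return {group: correct[group] / total[group] for group in total}


def largest_gap(accuracies):
    """Return the difference between the best and worst group accuracy."""
    values = list(accuracies.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Toy records, invented purely to show the shape of the computation.
    sample = [
        ("lighter_male", "happy", "happy"),
        ("lighter_male", "sad", "sad"),
        ("darker_female", "happy", "sad"),
        ("darker_female", "sad", "sad"),
    ]
    per_group = accuracy_by_group(sample)
    print(per_group)
    print(largest_gap(per_group))
```

A single accuracy gap is only one possible summary; a fuller audit would likely also break results down per emotion class and test whether the differences are statistically meaningful.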