- About
- Features
- Installation
- Run
- Development info
- Testing
- Contributing
- Reading Material
- Ethical discussions
- Future work
- Data used
- Contact
Facial recognition software is used in many industries, including but not limited to law enforcement, airports, healthcare facilities, and technology manufacturing companies. As facial recognition technologies gain popularity, privacy and ethical concerns are rising about the development and use of these tools. According to a report by the National Institute of Standards and Technology, commercial facial recognition tools exhibited biases with respect to age, gender, and race. This is an important issue because biased facial recognition technology in law enforcement may lead to false accusations and arrests, and in airports it may cause delayed flights. Therefore, the purpose of this project is to highlight ethical issues with face recognition technologies, compare the efficiency of different classification algorithms, and raise questions about the development and use of face recognition tools. The project is funded by the Mozilla Foundation and will be used in an AI course at Allegheny College. Please visit the Allegheny Ethical CS for more information.
- Gender classification (binary)
- Experiment with models by adjusting the training data
- Compare classification approaches:
  - Convolutional Neural Network (CNN)
  - Multi-layer Perceptron (MLP)
  - Random Forest (RandomForest)
  - Support Vector Machine (SVM)
- Clone the source code onto your machine
With HTTPS:
https://github.com/Allegheny-Mozilla-Fellows/facial_recognition_bias.git
or with SSH:
git@github.com:Allegheny-Mozilla-Fellows/facial_recognition_bias.git
- Install Poetry (Recommended)
Poetry is a tool for dependency management and packaging in Python. Please follow the documentation here to install Poetry on your machine.
Use `poetry install` to install the dependencies. Please refer to the Poetry documentation here for more information about dependency installation.
Please note that the installation process may take a while.
After entering the virtual environment and installing the dependencies, please refer to the following links for detailed information on how to run each classifier.
- Convolutional Neural Network (CNN)
- Multi-layer Perceptron (MLP)
- Random Forest (RandomForest)
- Support Vector Machine (SVM)
Alternatively, you can install all dependencies required for this project locally on your machine. You may use `pip` for that purpose.
python3 -m pip install --upgrade pip
pip install package_name
After installing all the dependencies, please refer to the following links for detailed information on how to run each classifier.
- Convolutional Neural Network (CNN)
- Multi-layer Perceptron (MLP)
- Random Forest (RandomForest)
- Support Vector Machine (SVM)
During development, always install the dependencies with `poetry install` and run the program with `poetry run python program_name`.
You can add new dependencies to `pyproject.toml` either manually or with `poetry add package_name`. Please refer to the documentation here for more information.
Use `poetry update` to update the dependencies to their latest versions as necessary. Please refer to the documentation here for more information.
Please use `pre-commit` hooks for linting the code. Install pre-commit with `pip install pre-commit` or follow the documentation here. After cloning the repository locally, run `pre-commit install` to install pre-commit into your git hooks.
NOTE: You have to run `pre-commit install` every time you clone a repository. Please refer to the documentation here for more information.
NOTE: You will not be able to complete a commit unless all the linters pass. Only staged changes are checked at the time of commit.
Developers of this program can run the test suite with Pytest:
poetry run pytest
Use `poetry run pre-commit run --all-files` to check the code with the linters and get diagnostic information.
Currently this project uses the following linters:
- pylint
- pydocstyle
- flake8
- black
You may add more linters to `.pre-commit-config.yaml`.
We welcome everyone who is interested in helping improve this project! If you are interested in being a contributor, please review our Code of Conduct and Guidelines for Contributors before raising an issue or beginning a contribution.
To create a pull request, please follow this template.
Currently this project mainly examines gender bias and how easily a classification algorithm can be manipulated by modifying the training data. Users of this program can experiment with the classifiers and see that the more diverse the data, the more precise the trained model will be. Please refer to the README.md of each individual classifier for more information about how to experiment with this project. This work also allows users to compare the efficiency of various classification algorithms. The project can be extended further by examining biases related to race and age, adding more classification algorithms for comparison, or adding a feature to visualise the efficiency of the classifiers.
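As an illustration of modifying the training data, the following is a minimal sketch of how a gender-skewed training subset could be drawn. It is not the project's own code: the helpers `gender_of` and `skewed_sample` and the `female_share` parameter are hypothetical names, and the sketch assumes file names follow the `age_gender_race_date&time.jpg` format described in the data section, where gender is 0 (male) or 1 (female).

```python
import random
from pathlib import Path
from typing import List, Sequence


def gender_of(file_name: str) -> int:
    """Read the gender code (0 = male, 1 = female) from a file name.

    Assumes the age_gender_race_date&time.jpg naming format.
    """
    return int(Path(file_name).name.split("_")[1])


def skewed_sample(
    file_names: Sequence[str], female_share: float, size: int, seed: int = 0
) -> List[str]:
    """Draw `size` file names so that roughly `female_share` of them are female."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    females = [name for name in file_names if gender_of(name) == 1]
    males = [name for name in file_names if gender_of(name) == 0]

    n_female = round(size * female_share)
    n_male = size - n_female
    if n_female > len(females) or n_male > len(males):
        raise ValueError("not enough images to build the requested sample")

    sample = rng.sample(females, n_female) + rng.sample(males, n_male)
    rng.shuffle(sample)  # avoid presenting all female images first
    return sample
```

Training the same classifier on samples drawn with different `female_share` values, and evaluating each on a balanced test set, is one way to observe how the composition of the training data affects the results.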
Here is a list of articles that may give the user more insight into bias in facial recognition technologies.
- Facial recognition to 'predict criminals' sparks row over AI bias
- the-computer-got-it-wrong-how-facial-recognition-led-to-a-false-arrest-in-michig
- google-ai-will-no-longer-use-gender-labels-like-woman-or-man-on-images-of-people-to-avoid-bias
- Gender classification tools are usually binary. What ethical issues may this lead to? Why?
- What are some of the ways we can prevent biases in face recognition technologies as developers and as users?
- Which industries should be allowed to use facial recognition tools, and which should not? Why?
The images used in this project were retrieved from Kaggle and are stored in the `data/images` directory. The directory holds about 10 000 face images. The images are annotated with age, gender, and ethnicity, and they are cropped and aligned.
The labels of each face image are embedded in the file name, formatted like `age_gender_race_date&time.jpg`:
- age is an integer from 0 to 116, indicating the age
- gender is either 0 (male) or 1 (female)
- race is an integer from 0 to 4, denoting White, Black, Asian, Indian, and Others (like Hispanic, Latino, Middle Eastern).
More information on data can be found here.
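The file name format above can be decoded with a few lines of Python. The following is a minimal sketch, not part of the project's code: `FaceLabel`, `parse_label`, and `gender_counts` are hypothetical names, and the sketch assumes the first three underscore-separated fields are always age, gender, and race.

```python
from collections import Counter
from pathlib import Path
from typing import NamedTuple, Optional

# Label codes as documented in the list above.
GENDERS = {0: "male", 1: "female"}
RACES = {0: "White", 1: "Black", 2: "Asian", 3: "Indian", 4: "Others"}


class FaceLabel(NamedTuple):
    age: int
    gender: str
    race: str


def parse_label(file_name: str) -> Optional[FaceLabel]:
    """Return the labels encoded in a file name, or None if it does not match."""
    parts = Path(file_name).name.split("_")
    if len(parts) < 4:  # expect age, gender, race, and a date&time part
        return None
    try:
        age, gender, race = (int(part) for part in parts[:3])
    except ValueError:
        return None
    if gender not in GENDERS or race not in RACES or not 0 <= age <= 116:
        return None
    return FaceLabel(age, GENDERS[gender], RACES[race])


def gender_counts(image_dir: str = "data/images") -> Counter:
    """Count images per gender label, skipping files that do not parse."""
    labels = (parse_label(path.name) for path in Path(image_dir).glob("*.jpg"))
    return Counter(label.gender for label in labels if label is not None)
```

Calling `gender_counts()` from the repository root gives a quick view of how balanced the data set is before training a classifier.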
If you have any questions or concerns about this project please contact:
- Dr. Jumadinova (jjumadinova@allegheny.edu)
- Teona Bagashvili (bagashvilit@allegheny.edu)