MLOpsProj

Goal of the project

The project aims to classify facial expressions into 7 categories: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral. The user uploads a picture of a face and the model assigns it to one of these categories. Beyond the model itself, the project aims to orchestrate the whole process of training, deploying and monitoring it. There won't be any frontend; instead, the user uploads a picture through an API and gets the classification result back. One difficulty is aligning and cropping the faces in the uploaded image, but existing tools can do that and we will treat them as a black box.
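As a minimal sketch of the last step of such an API call, the snippet below maps seven class scores to the category ids listed above. The names `ID2LABEL` and `classify` are illustrative and not taken from the repository.

```python
# Label ids exactly as listed in the project goal.
ID2LABEL = {
    0: "Angry",
    1: "Disgust",
    2: "Fear",
    3: "Happy",
    4: "Sad",
    5: "Surprise",
    6: "Neutral",
}


def classify(scores):
    """Return (label_id, label_name) for the highest of the 7 class scores.

    `scores` is any sequence of 7 numbers, e.g. the logits of the model.
    This helper is hypothetical; the real API may format results differently.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    label_id = max(range(len(scores)), key=lambda i: scores[i])
    return label_id, ID2LABEL[label_id]
```

For example, `classify([0.1, 0.0, 0.2, 2.5, 0.3, 0.1, 0.4])` picks index 3, the "Happy" category.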

Frameworks used

The project will use vision transformer models from the Hugging Face library, trained with the PyTorch framework. For face alignment (needed only at inference time) we will use either the face-alignment deep learning models or simply the OpenCV library, similar to here.

Dataset

The data is taken from the FER2013 dataset on Kaggle. It contains 48x48 pixel grayscale images of faces and the corresponding labels. The data is fairly simple, which lets us focus on the MLOps part of the project.
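The following sketch shows one way to turn a row of the dataset into an image array. It assumes the commonly distributed Kaggle CSV layout, with an `emotion` column and a `pixels` column holding 48x48 = 2304 space-separated grayscale values; if the copy of the data used here is laid out differently, the column names need adjusting. `parse_row` and `load_rows` are hypothetical helper names.

```python
import csv
import io

import numpy as np

IMAGE_SIZE = 48  # FER2013 images are 48x48 grayscale


def parse_row(row):
    """Convert one CSV row (a dict) into a (48x48 uint8 image, label) pair."""
    # Assumption: `pixels` is a space-separated string of 2304 integers.
    values = [int(p) for p in row["pixels"].split()]
    if len(values) != IMAGE_SIZE * IMAGE_SIZE:
        raise ValueError(f"expected {IMAGE_SIZE * IMAGE_SIZE} pixels, got {len(values)}")
    image = np.array(values, dtype=np.uint8).reshape(IMAGE_SIZE, IMAGE_SIZE)
    return image, int(row["emotion"])


def load_rows(csv_text):
    """Parse CSV text with a header line into a list of (image, label) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [parse_row(row) for row in reader]
```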

Models

We will use a variant of ViT (Vision Transformer), a transformer architecture applied to images. The pretrained model will be taken from Hugging Face's model repository and fine-tuned on the FER2013 dataset.
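A minimal sketch of how such a model could be built with the `transformers` library is shown below. The function name `build_model` and the small configuration values are assumptions for illustration, not the project's actual settings. When no checkpoint name is given, the function returns a tiny randomly initialized ViT, which is convenient for quick checks without downloading weights.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

NUM_LABELS = 7  # the seven facial expression categories


def build_model(checkpoint=None):
    """Return a ViT classifier with a 7-way classification head.

    checkpoint: name of a pretrained ViT on the Hugging Face Hub to fine-tune.
    If None, a small randomly initialized model is returned instead
    (hypothetical sizes, chosen only to keep the example fast).
    """
    if checkpoint is not None:
        # Replaces the pretrained classification head with a new 7-class head.
        return ViTForImageClassification.from_pretrained(
            checkpoint,
            num_labels=NUM_LABELS,
            ignore_mismatched_sizes=True,
        )
    config = ViTConfig(
        image_size=48,
        patch_size=8,
        num_channels=1,
        hidden_size=32,
        num_hidden_layers=1,
        num_attention_heads=2,
        intermediate_size=64,
        num_labels=NUM_LABELS,
    )
    return ViTForImageClassification(config)


if __name__ == "__main__":
    model = build_model().eval()
    with torch.no_grad():
        logits = model(pixel_values=torch.zeros(2, 1, 48, 48)).logits
    print(logits.shape)  # one row of 7 class scores per image
```

Many pretrained ViT checkpoints expect 3-channel inputs at a larger resolution, so fine-tuning on 48x48 grayscale images would typically also involve resizing and repeating the channel in preprocessing.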

Face alignment will be done either with a pretrained deep learning model or with OpenCV, depending on which approach we choose. In both cases no training is needed for this part.

Automated testing

Unit tests in the CI process cover data preprocessing, model building and training. We test on a random subset of the protected raw dataset, and the training tests use models whose state_dict is randomized as well.
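To illustrate the idea of testing on a random subset, here is a small pytest-style sketch. The helpers `sample_subset` and `normalize` are hypothetical stand-ins for the project's own preprocessing code, and the images are generated randomly instead of being read from the real dataset.

```python
import numpy as np


def sample_subset(num_items, subset_size, seed=0):
    """Return indices of a reproducible random subset without repetition."""
    rng = np.random.default_rng(seed)
    return rng.choice(num_items, size=min(subset_size, num_items), replace=False)


def normalize(images):
    """Hypothetical preprocessing step: scale uint8 pixels to float32 in [0, 1]."""
    return images.astype(np.float32) / 255.0


def test_normalize_on_random_subset():
    rng = np.random.default_rng(0)
    # Stand-in for the raw data: 100 random 48x48 grayscale images.
    images = rng.integers(0, 256, size=(100, 48, 48), dtype=np.uint8)
    indices = sample_subset(len(images), 16)
    batch = normalize(images[indices])
    assert batch.shape == (16, 48, 48)
    assert batch.dtype == np.float32
    assert batch.min() >= 0.0 and batch.max() <= 1.0
```

Fixing the seed keeps the subset identical across CI runs, so a failing test can be reproduced locally.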

We use the pytest framework in CI and obtain 97% code coverage. In addition, GitHub Secrets store the wandb API key that the training script needs during automated testing.
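One possible way for a training script to cope with a missing key is sketched below. wandb reads its key from the `WANDB_API_KEY` environment variable, which a CI workflow can populate from a repository secret (the secret's name is an assumption here). The helper `wandb_mode` is hypothetical; its return value is meant to be passed as `wandb.init(mode=...)`.

```python
import os


def wandb_mode(env=None):
    """Choose a wandb mode based on whether an API key is available.

    Returns "online" when WANDB_API_KEY is set and non-empty, otherwise
    "disabled", so that tests can still run without credentials.
    """
    env = os.environ if env is None else env
    return "online" if env.get("WANDB_API_KEY") else "disabled"
```

With this, a run without credentials (for example on a fork, where secrets are not available) skips logging instead of failing at login.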



Bozhi-Lyu/MLOpsProj
