# AC215 - Milestone 5

## Project Organization
```
├── LICENSE
├── notebooks
│   ├── breed_labels.txt
│   ├── DogNet_Breed_Distillation.ipynb
│   ├── ExploratoryDataAnalysis.ipynb
│   └── model_testing.ipynb
├── README.md
├── requirements.txt
└── src
    ├── api-service
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── api
    │   │   ├── model.py
    │   │   └── service.py
    │   ├── config
    │   │   ├── breed-to-index.json
    │   │   ├── index-to-breed.json
    │   │   ├── model-controller-config.json
    │   │   └── util.py
    │   ├── docker-entrypoint.sh
    │   ├── docker-shell.sh
    │   └── secrets
    │       └── wandb.json
    ├── deployment
    │   ├── Dockerfile
    │   ├── deploy-create-instance.yml
    │   ├── deploy-docker-images.yml
    │   ├── deploy-provision-instance.yml
    │   ├── deploy-setup-containers.yml
    │   ├── deploy-setup-webserver.yml
    │   ├── docker-entrypoint.sh
    │   ├── docker-shell.sh
    │   ├── inventory.yml
    │   ├── loginProfile
    │   ├── nginx-conf
    │   │   └── nginx
    │   │       └── nginx.conf
    │   └── secrets
    │       ├── deployment.json
    │       ├── gcp-service.json
    │       ├── ssh-key-deployment
    │       └── ssh-key-deployment.pub
    ├── dvc
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── docker-shell.sh
    │   └── team-engai-dogs.dvc
    ├── frontend-react
    │   ├── docker-shell.sh
    │   ├── Dockerfile
    │   ├── Dockerfile.dev
    │   ├── package.json
    │   ├── public
    │   │   ├── favicon.ico
    │   │   ├── index.html
    │   │   └── manifest.json
    │   ├── src
    │   │   ├── app
    │   │   │   ├── App.css
    │   │   │   ├── App.js
    │   │   │   ├── background.png
    │   │   │   ├── components
    │   │   │   │   ├── Footer
    │   │   │   │   │   ├── Footer.css
    │   │   │   │   │   └── Footer.js
    │   │   │   │   ├── ImageUpload
    │   │   │   │   │   ├── ImageUpload.css
    │   │   │   │   │   └── ImageUpload.js
    │   │   │   │   └── ModelToggle
    │   │   │   │       ├── ModelToggle.css
    │   │   │   │       └── ModelToggle.js
    │   │   │   └── services
    │   │   │       ├── BreedParse.js
    │   │   │       └── DataService.js
    │   │   └── index.js
    │   └── yarn.lock
    ├── model-deployment
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── cli.py
    │   ├── docker-entrypoint.sh
    │   └── docker-shell.sh
    ├── models
    │   └── resnet152v2
    │       ├── Dockerfile
    │       ├── Pipfile
    │       ├── Pipfile.lock
    │       ├── distiller.py
    │       ├── docker-shell.sh
    │       ├── dog_breed_dataset
    │       │   └── images
    │       │       └── Images
    │       ├── model_training_age_dataset.py
    │       ├── model_training_breed_dataset.py
    │       ├── model_training_breed_dataset_distillation.py
    │       ├── model_training_breed_dataset_pruned.py
    │       ├── run-model.sh
    │       ├── secrets
    │       │   └── data-service-account.json
    │       └── util.py
    ├── preprocessing
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── ResizeDogImages.ipynb
    │   ├── docker-entrypoint.sh
    │   ├── docker-shell.sh
    │   ├── preprocess_age.py
    │   ├── preprocess_breed.py
    │   └── util.py
    ├── pwd
    ├── secrets
    │   ├── data-service-account.json
    │   └── wandb.json
    ├── tensorizing
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── curr_image
    │   ├── curr_image.jpg
    │   ├── docker-entrypoint.sh
    │   ├── docker-shell.sh
    │   ├── hold_working_age.py
    │   ├── secrets
    │   │   └── data-service-account.json
    │   ├── tensorize_age_dataset.py
    │   └── tensorize_breed_dataset.py
    ├── validation
    │   ├── Dockerfile
    │   ├── Pipfile
    │   ├── Pipfile.lock
    │   ├── cv_val.py
    │   ├── cv_val_sql.py
    │   ├── docker-shell.sh
    │   └── requirements.txt
    └── workflow
        ├── Dockerfile
        ├── Pipfile
        ├── Pipfile.lock
        ├── age_model_training.yaml
        ├── cli.py
        ├── data_preprocessing.yaml
        ├── docker-entrypoint.sh
        ├── docker-shell.sh
        ├── pipeline.yaml
        ├── secrets
        │   └── compute-service-account.json
        └── tensorizing.yaml

32 directories, 109 files
```
**Team Members** Nevil George, Juan Pablo Heusser, Curren Iyer, Annie Landefeld, Abhijit Pujare

**Group Name** EngAi Group

**Project** In this project, we aim to build an application that can predict a dog's breed and age using a photo.
In this milestone we worked on multiple aspects of the project:
(1) Deployment of the web service to GCP [/src/deployment/](src/deployment/)
(2) Frontend/React container [/src/frontend-react/](src/frontend-react/)
(3) API service [/src/api-service/](src/api-service/)
(4) Add model deployment to Vertex AI [/src/model-deployment/](src/model-deployment/)
(5) Switching from model pruning to knowledge distillation as our compression technique [/notebooks/DogNet_Breed_Distillation.ipynb](notebooks/DogNet_Breed_Distillation.ipynb)
You can find the Solutions Architecture and Technical Architecture diagrams below. The two diagrams detail how the various components of the system work together to classify dog images.
We used Ansible to automate the provisioning and deployment of our frontend and backend containers to GCP. Below you can find a screenshot of the VM that's running our service on GCP.
Additionally, you can find a screenshot showing the container images we have pushed to the GCP Container Registry:
## Deployment Container [/src/deployment/](src/deployment/)

This container builds the other containers, creates and provisions a GCP compute instance, and then deploys those containers to that instance.
If you wish to run the container locally:
- Navigate to `src/deployment` in your terminal
- Run `sh docker-shell.sh`
- Build and push the Docker containers to GCR (Google Container Registry) by running the following playbook:

  `ansible-playbook deploy-docker-images.yml -i inventory.yml`

- Create the compute instance (VM) server that will host the containers:

  `ansible-playbook deploy-create-instance.yml -i inventory.yml --extra-vars cluster_state=present`

- Provision the compute instance in GCP to set up all required software:

  `ansible-playbook deploy-provision-instance.yml -i inventory.yml`

- Install the Docker containers on the compute instance:

  `ansible-playbook deploy-setup-containers.yml -i inventory.yml`

- Set up the web server on the instance:

  `ansible-playbook deploy-setup-webserver.yml -i inventory.yml`
## Model Deployment [/src/model-deployment/](src/model-deployment/)

To finish out the model pipeline that powers the ML application, we added the final step of model deployment to the Vertex AI pipeline. This step uses a command-line interface to take the model from Weights & Biases, upload it to Google Cloud Storage, and deploy it to Vertex AI. With the final step in place, the end-to-end model development process, from data preprocessing to tensorizing to model training and now model deployment, is part of a unified pipeline.
To use just the model deployment service, first launch the service with `./docker-shell.sh` to get to the interpreter. Then:
- Upload the model from Weights & Biases to GCS:

  `python3 cli.py --upload`

- Deploy the model to Vertex AI:

  `python3 cli.py --deploy`
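For reference, the sketch below shows what these two steps could look like using the `wandb` and `google-cloud-aiplatform` SDKs. The artifact path, bucket name, project ID, and serving container image are illustrative assumptions, not the exact values used in `cli.py`.

```python
import os
import wandb
from google.cloud import storage, aiplatform

def upload_model():
    # Download the trained model artifact from Weights & Biases
    # (the artifact path below is hypothetical)
    artifact = wandb.Api().artifact("team-engai/dog-breed/distilled-student:latest")
    local_dir = artifact.download()

    # Copy the SavedModel files into a GCS bucket (bucket name is hypothetical)
    bucket = storage.Client().bucket("team-engai-models")
    for root, _, files in os.walk(local_dir):
        for name in files:
            path = os.path.join(root, name)
            blob = bucket.blob(f"distilled-student/{os.path.relpath(path, local_dir)}")
            blob.upload_from_filename(path)

def deploy_model():
    aiplatform.init(project="my-gcp-project", location="us-central1")
    # Register the model with Vertex AI using a prebuilt TF serving image
    model = aiplatform.Model.upload(
        display_name="dog-breed-distilled",
        artifact_uri="gs://team-engai-models/distilled-student",
        serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest",
    )
    # Create an endpoint and deploy the registered model to it
    model.deploy(machine_type="n1-standard-4")
```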
## Knowledge Distillation [/notebooks/DogNet_Breed_Distillation.ipynb](notebooks/DogNet_Breed_Distillation.ipynb)

In Milestone 4 we used model pruning as our compression technique, but we realized that distillation was more suitable for our application since most of the model's layers were not being trained. All of the code used to test different model combinations and distillation can be found in the notebook linked above.
We tested different base architectures for both the teacher and the student model.
With the ResNet152v2 base architecture we obtained a maximum validation accuracy of 82.5% on epoch 20. The model learned fairly quickly compared to other architectures, achieving a 68% validation accuracy on the first epoch.
This base architecture did not perform well on the dogs dataset, as we only achieved a 42.25% maximum validation accuracy on epoch 27.
Using the DenseNet201 architecture we achieved very good results for such a small model, yet its maximum validation accuracy of 81.9% was still lower than ResNet152v2's. The difference is minimal, but as a team we decided to use ResNet152v2 as our teacher model.
The ConvNeXtBase architecture did not perform well on the dataset. The training accuracy was around 84% by the end of the 30 epochs, while the validation accuracy was around just 24%, meaning the model was not generalizing well and was overfitting the training data.

Similar to the ConvNeXtBase architecture, this model did not generalize well and overfit the training data, achieving a maximum training accuracy of 87.7% and a maximum validation accuracy of 56.3%.

With the DenseNet121 base architecture we achieved a maximum validation accuracy of 71.6% by epoch 17. The model learned quickly in the early epochs, and although its accuracy was significantly lower than the teacher model's, its much smaller size made it a prime candidate to be the student in model distillation.
For model distillation we used the teacher model with the ResNet152v2 base architecture and built a new student model on the DenseNet121 architecture. Following the material covered in class, we implemented the distillation training loop and trained the student model by distilling from the teacher. We obtained a 92.6% validation accuracy on epoch 28, even greater than with the teacher model. Using distillation we managed to compress the teacher model 7.65x while achieving better validation accuracy.
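The distillation loop blends the hard-label loss on the student's predictions with a temperature-softened divergence against the teacher's outputs. Below is a minimal sketch in the style of the standard Keras distillation example (TF 2.x); the class layout, hyperparameters, and variable names are illustrative and do not reproduce the notebook's `Distiller` code exactly.

```python
import tensorflow as tf

class Distiller(tf.keras.Model):
    def __init__(self, student, teacher, temperature=3.0, alpha=0.1):
        super().__init__()
        self.student = student
        self.teacher = teacher
        self.temperature = temperature  # softens logits before comparing distributions
        self.alpha = alpha              # weight between hard-label and distillation losses

    def compile(self, optimizer, metrics, student_loss_fn, distillation_loss_fn):
        super().compile(optimizer=optimizer, metrics=metrics)
        self.student_loss_fn = student_loss_fn
        self.distillation_loss_fn = distillation_loss_fn

    def train_step(self, data):
        x, y = data
        # Teacher is frozen; only used to produce soft targets
        teacher_preds = self.teacher(x, training=False)
        with tf.GradientTape() as tape:
            student_preds = self.student(x, training=True)
            student_loss = self.student_loss_fn(y, student_preds)
            # KL divergence between temperature-softened distributions,
            # scaled by T^2 as in the standard formulation
            distillation_loss = self.distillation_loss_fn(
                tf.nn.softmax(teacher_preds / self.temperature, axis=1),
                tf.nn.softmax(student_preds / self.temperature, axis=1),
            ) * (self.temperature ** 2)
            loss = self.alpha * student_loss + (1 - self.alpha) * distillation_loss
        # Only the student's weights are updated
        grads = tape.gradient(loss, self.student.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.student.trainable_variables))
        self.compiled_metrics.update_state(y, student_preds)
        return {m.name: m.result() for m in self.metrics}

# Example wiring (models assumed to be built elsewhere in the notebook):
# distiller = Distiller(student=densenet121_student, teacher=resnet152v2_teacher)
# distiller.compile(
#     optimizer=tf.keras.optimizers.Adam(),
#     metrics=["accuracy"],
#     student_loss_fn=tf.keras.losses.CategoricalCrossentropy(),
#     distillation_loss_fn=tf.keras.losses.KLDivergence(),
# )
```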
This result is extremely positive, as the distilled student model achieved a better validation accuracy than the teacher model. Moreover, this model obtained a validation accuracy similar to top SOTA models for Fine-Grained Image Classification on the Stanford Dogs dataset
(https://paperswithcode.com/sota/fine-grained-image-classification-on-stanford-1).
The No. 1 model on this list, ViT-NeT, achieved a 93.6% accuracy on the same dataset. Our results would place our distilled student model in the top 10 of this list.
Below is a comparison table obtained from the ViT-NeT paper.
Source: Kim, S., Nam, J., & Ko, B. C. (2022). ViT-NeT: Interpretable Vision Transformers with Neural Tree Decoder. In Proceedings of the 39th International Conference on Machine Learning (PMLR 162). Baltimore, Maryland, USA.
## API Service [/src/api-service/](src/api-service/)

The `api-service` container provides two endpoints: the index endpoint and the `/predict` endpoint. The `/predict` endpoint is called from the frontend with an image to run model inference. The ModelController is responsible for calling either the local model (saved in the container) or the remote model (hosted on Vertex AI).
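Conceptually, the dispatch looks like the sketch below; class, parameter, and config names here are illustrative assumptions rather than the exact code in `api/model.py`. The remote path calls a Vertex AI endpoint, while the local path runs the Keras model bundled in the container.

```python
import tensorflow as tf
from google.cloud import aiplatform

class ModelController:
    def __init__(self, local_model_path, endpoint_id, project, location="us-central1"):
        # Local copy of the distilled model, shipped inside the container
        self.local_model = tf.keras.models.load_model(local_model_path)
        # Handle to the same model deployed on Vertex AI
        aiplatform.init(project=project, location=location)
        self.endpoint = aiplatform.Endpoint(endpoint_id)

    def predict(self, image_tensor, use_remote=False):
        """image_tensor: preprocessed batch of shape (1, H, W, 3)."""
        if use_remote:
            # Vertex AI expects JSON-serializable instances
            response = self.endpoint.predict(instances=image_tensor.numpy().tolist())
            return response.predictions[0]
        # Otherwise run inference in-process with the bundled model
        return self.local_model.predict(image_tensor).tolist()[0]
```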
We have three components in the `components` directory.

Footer contains the footer, which stores the history of the past five search results (just the predicted breed, not the probabilities).

ImageUpload contains the interface for uploading an image to the website, making a call to the model (depending on ModelToggle), returning the predicted breed and confidence level (probability), and storing that predicted breed in the Footer as part of the search history.

ModelToggle has a dropdown for the user to select either our hosted or local model. We included both to show the difference in response times; the model itself is the same, so accuracy is expected to be the same as well. The selection is passed from the dropdown as part of the formData argument that is read in DataService in the services section (see below).
We have two React files in the `services` directory.

BreedParse extracts the reader-friendly version of the predicted breed name so it can be displayed in the results section of ImageUpload and appended to the history of the past five results in the Footer.

DataService makes the server call to the API endpoint, selecting the right model depending on the selection in the ModelToggle component.
## GCP Bucket Structure

```
team-engai-dogs
├── dog_age_dataset/
│   ├── Expert_Train/
│   └── PetFinder_All/
├── dog_breed_dataset/
│   ├── annotations/
│   └── images/
└── dvc_store
```
We use the same structure for the tensorized data as well, in the bucket `team-engai-dogs-tensorized`.