API for the Quotify webapp to generate quotes using a fine-tuned GPT-2 model. The model can be downloaded from the releases page. The backend was built with FastAPI and deployed with Docker on Google Cloud Platform.
This was the best method we managed to find to deploy large models (>500MB) to the cloud.
API Docs: https://quotify-engine-l6lhxur2aq-uc.a.run.app/docs
- Clone Repo
git clone https://github.com/Quotify-Bot/quotify-backend.git
- Change directory
cd quotify-backend
- Download the model named
pytorch_model.bin
from releases and add it to the finetuned_models directory
- Install virtual environment
virtualenv env
- Activate environment (Windows shown; on Linux/macOS use source env/bin/activate instead)
env\Scripts\activate
- Install requirements
pip install -r requirements.txt
- Install the CPU-only PyTorch build
pip install torch==1.7.1+cpu -f https://download.pytorch.org/whl/torch_stable.html
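A missing or half-downloaded model file is an easy mistake at this point, and with a >500MB weights file a partial download is plausible; a small stdlib check before starting the server can save a confusing startup error. The path below follows the download step above; `model_ready` is just a helper name for this sketch:

```python
from pathlib import Path

def model_ready(path: str = "finetuned_models/pytorch_model.bin") -> bool:
    """Return True if the model file exists and is non-empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0
```

If this returns False, re-download pytorch_model.bin from the releases page before running uvicorn.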
- Start the server
uvicorn main:app --host 0.0.0.0 --port 8080
- Create docker image
docker build -t <image_name>:<tag_name> .
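The build step assumes a Dockerfile at the repo root. A plausible sketch, assembled from the install and run commands in this README (the repo's actual Dockerfile may differ, e.g. in base image or Python version):

```dockerfile
FROM python:3.8-slim
WORKDIR /app
# Install dependencies first so they cache independently of code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt \
    && pip install --no-cache-dir torch==1.7.1+cpu -f https://download.pytorch.org/whl/torch_stable.html
# Copy the app code and the finetuned_models directory (with pytorch_model.bin)
COPY . .
EXPOSE 8080
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
```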
- Login using gcloud CLI
gcloud auth login
- Tag the image in the correct format for deployment
docker tag <image_name>:<tag_name> gcr.io/<project_name>/<image_name>:<tag_name>
- Push to GCP container registry
docker push gcr.io/<project_name>/<image_name>:<tag_name>
- Go to the GCP Container Registry console, select the pushed image, and deploy it with Cloud Run (set the container port to 8080 to match the uvicorn command above)