Judging a Book by Its Cover is a web app that predicts the rating for a book based on its cover image. To see it in action, visit the website (update: link removed).
The training data (average user rating and book cover images) comes from about 15,000 Goodreads books with at least 10 reviews, collected using the goodreads Python package. The underlying model uses features extracted from a ResNet18 model pre-trained on ImageNet, with the last fully connected layer replaced to suit the regression task at hand.
The app is split into the frontend and backend modules. The backend is built with FastAPI, a relatively new and high-performing Python web framework, and handles the data processing and model scoring. The frontend is built using Streamlit, a convenient and opinionated library for building single-page interactive web applications that is great for prototyping.
To make the app more easily reproducible for deployment, I used Docker to containerize the frontend and backend components, and Docker Compose to orchestrate the two containers.
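A `docker-compose.yml` along these lines ties the two services together (the service names, build paths and ports here are illustrative, not the project's actual configuration):

```yaml
version: "3"
services:
  backend:
    build: ./backend          # FastAPI service
    ports:
      - "8000:8000"
  frontend:
    build: ./frontend         # Streamlit UI
    ports:
      - "8501:8501"
    depends_on:
      - backend
```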
To host the web app, I purchased a domain and rented an Ubuntu server from Vultr. It was fairly simple to set up Docker on the compute instance via apt-get, and I was able to get the app up and running in no time. I followed this tutorial for the DNS setup.
To automate the deployment process, I decided to go with GitHub Actions. The workflow I set up first builds the multi-container app and runs my backend tests to make sure the endpoints work as expected and that predictions are served with sufficiently low latency. It then connects to the remote server over SSH, pulls the changes from the repository, rebuilds the images and restarts the containers. A lot of the inspiration came from this helpful tutorial. The whole process takes only a few minutes to complete (most of which is spent building the Docker images during the test stage) and lets me make sure I haven't introduced any breaking changes, as shown below:
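A workflow of this shape might look roughly as follows; the job names, test command and secrets are illustrative, and the SSH step assumes a community action such as appleboy/ssh-action rather than the project's exact setup:

```yaml
name: CI/CD
on:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build images and run backend tests
        run: |
          docker-compose build
          docker-compose run backend pytest
  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Pull, rebuild and restart on the server
        uses: appleboy/ssh-action@master
        with:
          host: ${{ secrets.HOST }}
          username: ${{ secrets.USERNAME }}
          key: ${{ secrets.SSH_KEY }}
          script: |
            cd app && git pull
            docker-compose build && docker-compose up -d
```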
To summarize, the diagram below illustrates the architecture of the app.
Scripts for the web app can be found in the `app` folder. To run the app locally (Docker Compose required), run `docker-compose build` followed by `docker-compose up --detach` within the `app` directory, and the app will be accessible on `localhost`.
The ML pipeline (data collection and model training, along with logging) can be run with `run.sh`. Relevant files can be found within the `ml` folder.
Through this project, I had a chance to review the Docker Compose documentation and learn more about the tool, and I got to use GitHub Actions to automate the CI/CD process. It was also an interesting experience purchasing a domain for the first time, and very satisfying to see my own web app live on the Internet 👩‍💻!
- Work on improving the model: I can try fine-tuning the whole model on my book cover image dataset instead of using fixed weights from the pre-trained model, apply a deeper ResNet model, or use a different architecture (e.g. ResNeXt). I have not spent much time on model refinement yet because I wanted to focus on getting the app up and running to learn about the deployment process.
- Convert the site from HTTP to HTTPS by obtaining and configuring an SSL certificate.
- Learn more about reverse proxies like nginx.
- Automate SSH key rotation with GitHub Actions.
- Set up a monitoring solution for system-related (e.g. latency, error rate), model-related (e.g. prediction distribution) and resource-related (e.g. memory utilization) metrics, and simulate production traffic with Locust to test it out, similar to what's proposed here