Implementation of full stack deep learning project highlighting important components of:
- Web deployment
- CI/CD for machine learning
git clone https://github.com/SathesanThavabalasingam/fsdl-text-recognizer.git
cd fsdl-text-recognizerRun conda env create to create an environment called fsdl-text-recognizer, as defined in environment.yml.
This environment will provide us with the right Python version as well as the CUDA and CUDNN libraries.
We will install Python libraries using pip-sync, however, which will let us do three nice things:
- Separate out dev from production dependencies (
requirements-dev.invsrequirements.in). - Have a lockfile of exact versions for all dependencies (the auto-generated
requirements-dev.txtandrequirements.txt). - Allow us to easily deploy to targets that may not support the
condaenvironment.
So, after running conda env create, activate the new environment and install the requirements:
conda activate fsdl-text-recognizer
pip-sync requirements.txt requirements-dev.txtIf you add, remove, or need to update versions of some requirements, edit the .in files, then run
pip-compile requirements.in && pip-compile requirements-dev.in
Now, every time you work in this directory, make sure to start your session with conda activate fsdl-text-recognizer.
Before we get started, please run a command that will take a little bit of time to execute.
python text_recognizer/datasets/emnist_dataset.pyRunning tasks/lint.sh fully lints our codebase with a few different checkers:
pipenv checkscans our Python package dependency graph for known security vulnerabilitiespylintdoes static analysis of Python files and reports both style and bug problemspycodestylechecks for simple code style guideline violations (somewhat overlapping withpylint)mypyperforms static type checking of Python filesbanditperforms static analysis to find common security vulnerabilities in Python codeshellcheckfinds bugs and potential bugs in shell scrips
A note: in writing Bash scripts, I often refer to this excellent guide.
Note that the linters are configured using .pylintrc and setup.cfg files, as well as flags specified in lint.sh.
Getting linting right will pay off in no time, and is a must for any multi-developer codebase.
The relevant new files for setting up continuous integration are
evaluation/evaluate_character_predictor.pyevaluation/evaluate_line_predictor.pytasks/test_validation.sh
There is one additional file (in the top-level directory): .circleci/config.yml
Set up CircleCI first and then look at the new evaluation files.
Go to https://circleci.com and log in with your Github account.
Click on Add Project. Select your fork of the fsdl-text-recognizer-project repo.
It will ask you to place the config.yml file in the repo.
Good news -- it's already there, so you can just hit the "Start building" button.
Validation test files: they simply evaluate the trained predictors on respective test sets, and make sure they are above threshold accuracy.
Now that CircleCI is done building, push a commit so that we can see it build again, and check out the nice green chechmark in our commit history (https://github.com/SathesanThavabalasingam/fsdl-text-recognizer-project/commits/master)
First, we will get a Flask web server up and running and serving predictions.
python api/app.py
Open up another terminal tab (click on the '+' button under 'File' to open the launcher). In this terminal, send some test image to the web server running in the first terminal.
export API_URL=http://0.0.0.0:8000
curl -X POST "${API_URL}/v1/predict" -H 'Content-Type: application/json' --data '{ "image": "data:image/png;base64,'$(base64 -i text_recognizer/tests/support/emnist_lines/test_case.png)'" }'
If you want to look at the image you just sent, you can navigate to
text_recognizer/tests/support/emnist_lines in the file browser on the
left, and open the image.
The web server code should have a unit test just like the rest of our code.
Let's check it out: the tests are in api/tests/test_app.py.
You can run them with
tasks/test_api.shNow, we'll build a docker image with our application.
The Dockerfile in api/Dockerfile defines how we're building the docker image.
In root directory run:
tasks/build_api_docker.shThis should take a couple of minutes to complete.
When it's finished, you can run the server with tasks/run_api_docker.sh
You can run the same curl commands as you did when you ran the flask server earlier, and see that you're getting the same results.
curl -X POST "${API_URL}/v1/predict" -H 'Content-Type: application/json' --data '{ "image": "data:image/png;base64,'$(base64 -w0 -i text_recognizer/tests/support/emnist_lines/or\ if\ used\ the\ results.png)'" }'
If needed, you can connect to your running docker container by running:
```sh
docker exec -it api bash
TO DO:
- Add deployment to web service
- Add