ikhlestov/deployml

Contents

Pre-Requirements

  • Docker. You may get it here
  • Update the Docker memory limit if necessary
  • Git, Python >= 3.5

Environment Setup

Case 1

  • Clone the workshop repository

    git clone git@github.com:ikhlestov/deployml.git && cd deployml

  • Create virtualenv

    python3.6 -m venv .venv && source .venv/bin/activate

  • Install the corresponding requirements

    pip install -r requirements/dev_mac.txt

    or

    pip install -r requirements/dev_ubuntu_cpu.txt

Note: the requirements are based on Python 3.6. If you have another Python version, you should change the link to the PyTorch wheel in the requirements file, which you may get here

Case 1.1

Additionally, download the TensorFlow source code nearby:

git clone https://github.com/tensorflow/tensorflow.git -b v1.6.0

Case 2

Pull the small Docker container:

docker pull ikhlestov/deployml_dev_small

or

Pull the large Docker container (in case of a really good Internet connection):

docker pull ikhlestov/deployml_dev

Case 3

Build your own Docker container:

  • Clone the workshop repository: git clone git@github.com:ikhlestov/deployml.git && cd deployml
  • Check the Docker containers defined in the dockers folder
  • Run the build commands:
    • docker build -f dockers/Dev . -t ikhlestov/deployml_dev (for the workshop you should build only this image)
    • docker build -f dockers/Dev_small . -t ikhlestov/deployml_dev_small
    • docker build -f dockers/Prod . -t ikhlestov/deployml_prod
  • Compare their sizes: docker images | grep "deployml_dev\|deployml_dev_small\|deployml_prod"

Notes:

  • Don't forget about the .dockerignore file.
  • Try to organize your Dockerfiles to use the cache.
  • Optimize your Docker containers.
  • Try to release with some smaller distributions.
  • You may use multi-stage builds.
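
The notes above can be sketched as a minimal multi-stage Dockerfile. The stage layout, base image, and paths here are illustrative assumptions, not the repository's actual dockers/Prod file:

```dockerfile
# Build stage: install dependencies while build tools are available
FROM python:3.6-slim AS builder
COPY requirements/dev_ubuntu_cpu.txt /tmp/requirements.txt
RUN pip install --prefix=/install -r /tmp/requirements.txt

# Release stage: copy only the installed packages into a fresh small image
FROM python:3.6-slim
COPY --from=builder /install /usr/local
WORKDIR /deployml
COPY . /deployml
```

Keeping the COPY of the source code last preserves the layer cache for the dependency-install step, so code changes don't trigger a full reinstall.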

Frameworks comparison

  • Check the models defined in the models folder

  • Run the Docker container with a mounted directory:

    docker run -v $(pwd):/deployml -p 6060:6060 -p 8080:8080 -it ikhlestov/deployml_dev /bin/bash

  • Run the time measurements inside the Docker container:

    python benchmarks/compare_frameworks.py
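
The comparison script above boils down to a timing loop over each framework's forward pass. A minimal sketch of such a harness, with a stand-in model function instead of the real models (dummy_model and measure are made up for illustration):

```python
import time

def dummy_model(batch):
    # Stand-in for a real framework forward pass
    return [x * 2 for x in batch]

def measure(model_fn, batch, n_runs=100, n_warmup=10):
    """Return the mean inference time in seconds over n_runs calls."""
    for _ in range(n_warmup):
        model_fn(batch)  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(n_runs):
        model_fn(batch)
    return (time.perf_counter() - start) / n_runs

mean_time = measure(dummy_model, list(range(1000)))
print("mean inference time: %.6f s" % mean_time)
```

Warm-up runs matter because the first calls often pay one-time costs (memory allocation, kernel compilation) that would skew the mean.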

Tensorflow optimization methods

  1. Save the TensorFlow model.

    python optimizers/save_tensorflow_model.py

    1.1 Import the saved model to TensorBoard

    python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/usual_model.pbtxt --log_dir saves/tensorboard/usual_model --graph_type PbTxt

    1.2 Run TensorBoard in the background

    tensorboard --logdir saves/tensorboard --port 6060 --host=0.0.0.0 &

    If you encounter an error such as "ModuleNotFoundError: No module named 'html5lib.filters.base'", please install another version of html5lib: pip uninstall -y html5lib && pip install html5lib --no-cache

  2. Build a frozen graph. You may read more about it here

    python optimizers/get_frozen_graph.py

    python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/constant_graph.pb --log_dir saves/tensorboard/constant_graph

  3. Build optimized frozen graph

    python optimizers/get_optimized_frozen_graph.py

    python misc/import_pb_to_tensorboard.py --model_dir saves/tensorflow/optimized_graph.pb --log_dir saves/tensorboard/optimized_graph

  4. Get quantized graph:

    4.1 With plain Python (link to script)

     python /tensorflow/tensorflow/tools/quantization/quantize_graph.py \
         --input=saves/tensorflow/optimized_graph.pb \
         --output=saves/tensorflow/quantized_graph_python.pb \
         --output_node_names="output" \
         --mode=weights
    

    4.2 With Bazel (tensorflow tutorial)

     ../tensorflow/bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
         --in_graph=`pwd`/saves/tensorflow/optimized_graph.pb \
         --out_graph=`pwd`/saves/tensorflow/quantized_graph_bazel.pb  \
         --inputs="input:0" \
         --outputs="output:0" \
         --transforms='quantize_weights'
    

    4.3 Note: tf.contrib.quantize provides only simulated quantization.

    4.4 Import the quantized models to TensorBoard

     python misc/import_pb_to_tensorboard.py \
         --model_dir saves/tensorflow/quantized_graph_bazel.pb \
         --log_dir saves/tensorboard/quantized_graph_bazel
     
     python misc/import_pb_to_tensorboard.py \
         --model_dir saves/tensorflow/quantized_graph_python.pb \
         --log_dir saves/tensorboard/quantized_graph_python
    
  5. Compare the resulting graphs

    5.1 Sizes: ls -l saves/tensorflow/

    5.2 Architecture in TensorBoard

    5.3 Performance: python benchmarks/compare_tf_optimizations.py

  6. Try various restrictions

    6.1 CPU restriction

     docker run -v $(pwd):/deployml -it --cpus="1.0" ikhlestov/deployml_dev /bin/bash
    

    6.2 Memory restriction

     docker run -v $(pwd):/deployml -it --memory=1g ikhlestov/deployml_dev /bin/bash
    

    6.3 Use GPUs

     docker run --runtime=nvidia -v $(pwd):/deployml -it ikhlestov/deployml_dev /bin/bash
    

    6.4 Try to run two models on two different CPUs

    6.5 Try to run two models on the same CPU simultaneously
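
As a side note on step 4: the quantize_weights transform maps each float weight to an 8-bit level over the tensor's [min, max] range. A rough pure-Python illustration of that scheme (this is not the actual TensorFlow implementation, just the idea behind it):

```python
def quantize_weights(weights):
    """Map float weights to uint8 levels over the tensor's [min, max] range."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # avoid zero scale for constant tensors
    return [round((w - lo) / scale) for w in weights], lo, scale

def dequantize_weights(levels, lo, scale):
    """Recover approximate float weights from the 8-bit levels."""
    return [lo + level * scale for level in levels]

w = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, lo, scale = quantize_weights(w)
restored = dequantize_weights(q, lo, scale)
# each restored weight is within one quantization step of the original
```

Storing 8-bit levels plus a per-tensor (lo, scale) pair is what shrinks the graph roughly 4x, at the cost of a bounded rounding error per weight.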

Training optimization approaches

You may also take a look at other methods (list of resources) like:

  • Pruning
  • XNOR nets
  • Knowledge distillation

Simple servers

  • One-to-one server (servers/simple_server.py)
  • Scaling with multiprocessing (servers/processes_server.py)

You may start servers (not simultaneously) as:

python servers/simple_server.py

or

python servers/processes_server.py

and test them with:

python servers/tester.py
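
A one-to-one server of this kind can be sketched with the standard library alone. The handler and the doubling "model" below are illustrative assumptions, not the contents of servers/simple_server.py:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def model_predict(values):
    # Stand-in for a real model inference call
    return [v * 2 for v in values]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"outputs": model_predict(payload["inputs"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging out of benchmark output

server = HTTPServer(("0.0.0.0", 8080), PredictHandler)
# server.serve_forever()  # blocks: requests are handled one at a time
```

Because this server handles one request at a time, it is the baseline that the multiprocessing variant improves on.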

Queue-based (Kafka, RabbitMQ, etc.)

Serving with tf-serving

Testing

Preprocessing and code testing

Q: Where should data preprocessing be done: on the CPU, on the GPU, or even on another host?

  • Enter the preprocessing directory: cd preprocessing

  • Run the various resizer benchmarks: python benchmark.py

    • Note: opencv may be installed from PyPI for Python 3
  • Check the unified resizer in image_preproc.py

  • Try to run the tests for it: pytest test_preproc.py (they will fail)

  • Fix the resizer

  • Run the tests again: pytest test_preproc.py
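
The tests above follow the usual pytest shape. Here is a sketch with a toy nearest-neighbour resizer standing in for the real image_preproc.py code (the actual test_preproc.py will differ):

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of an image given as a list of rows."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def test_resize_shape():
    image = [[0, 1], [2, 3]]
    out = resize_nearest(image, 4, 4)
    assert len(out) == 4 and len(out[0]) == 4

def test_resize_identity():
    # Resizing to the same shape must be a no-op
    image = [[0, 1], [2, 3]]
    assert resize_nearest(image, 2, 2) == image
```

Shape and identity checks are cheap, framework-free invariants, which makes them good first tests before comparing against a reference resizer.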

What else should be tested (really, as much as possible):

  • General network inference
  • Model loading/saving
  • New models deploy
  • Any preprocessing
  • Corrupted inputs - NaN, Inf, zeros
  • Deterministic output
  • Input ranges/distributions
  • Output ranges/distributions
  • Test that model will fail in known cases
  • ...
  • Just check this video :)
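
Several of the checks above (corrupted inputs, deterministic output) can be expressed as tiny guard functions. A hedged sketch; validate_batch and is_deterministic are illustrative names, not from the repository:

```python
import math

def validate_batch(batch):
    """Reject inputs that would silently corrupt model outputs."""
    if not batch:
        raise ValueError("empty batch")
    for value in batch:
        if math.isnan(value) or math.isinf(value):
            raise ValueError("corrupted input: %r" % value)
    return batch

def is_deterministic(model_fn, batch, n_runs=3):
    """Check that repeated calls on the same input give the same output."""
    outputs = [model_fn(batch) for _ in range(n_runs)]
    return all(out == outputs[0] for out in outputs)
```

Running validation in front of the model turns silent garbage predictions into loud, testable failures.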

You may run the tests:

  • In various Docker containers
  • Under tox

Profiling

Routines automation

  • Continuous integration:

    • Jenkins
    • Travis
    • TeamCity
    • CircleCI
  • Clusters:

    • Kubernetes
    • Mesos
    • Docker swarm
  • Configuration management:

    • Terraform
    • Ansible
    • Chef
    • Puppet
    • SaltStack

Converting weights to TensorFlow

  • Converting from Keras to TensorFlow:

    • Get a saved Keras model: python converters/save_keras_model.py
    • Convert the Keras model to the TensorFlow save format: python converters/convert_keras_to_tf.py
  • Converting from PyTorch to TensorFlow:

In any case you should know about:

Conclusion

I'm grateful to Alexandr Onbysh, Aleksandr Obednikov, and Kyryl Truskovskyi for the cool ideas, and to Ring Ukraine overall.

Take a look at the checklist.

Thank you for reading!

About

Scripts for "Deploy ML to production" workshop
