MaybeShewill-CV/mortred_model_server

Mortred-AI-Web-Server: A Noob Web Server for AI Models

Mortred AI Model Server is a toy web server for deep learning models. The server tries its best to make the most of your CPU and GPU resources. All deep learning models are trained with TensorFlow/PyTorch, deployed via the MNN toolkit, and served over the web through the workflow framework.

Do not hesitate to let me know if you find bugs here cause I'm a c-with-struct noob 🙃

The three major components are illustrated in the architecture picture below.

simple_architecture

A quick overview and examples for both serving and model benchmarking are provided below. Detailed documentation and examples will be provided in the docs folder.

You're welcome to ask questions and help me make it better!

All models and detectors can be downloaded from my Hugging Face Page.

Quick Start

Before proceeding further with this document, make sure you have the following prerequisites:

1. Make sure you have CUDA, a GPU, and the GPU driver properly installed. You may refer to this to install them.

2. Make sure you have MNN installed. For installation instructions you may find some help here. The MNN-2.7.0 release version is recommended.

3. Make sure you have WORKFLOW installed. For installation instructions you may find some help here.

4. Make sure you have OPENCV installed. For installation instructions you may find some help here.

5. Make sure your GCC toolchain supports C++17.

6. Segment-Anything needs the ONNXRUNTIME and TensorRT libraries. You may refer to this to install onnxruntime>=1.16.0 and this to install TensorRT-8.6.1.6.
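Prerequisite 5 can be checked before building. The following is a minimal sketch, assuming `gcc` is on your PATH and taking GCC 7 as the first release that accepts `-std=c++17`. That threshold and the helper names are illustrative assumptions, not part of this project, and the project may in practice need a newer compiler.

```python
import re
import shutil
import subprocess

MIN_GCC_MAJOR = 7  # assumption: first GCC release accepting -std=c++17


def gcc_major(version_text: str) -> int:
    """Extract the major version from `gcc -dumpversion` output such as '9.4.0' or '9'."""
    match = re.search(r"\d+", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(0))


def supports_cpp17(version_text: str) -> bool:
    """True if the reported GCC major version meets the assumed minimum."""
    return gcc_major(version_text) >= MIN_GCC_MAJOR


def check_gcc() -> str:
    """Query the local gcc and return a one-line, human-readable verdict."""
    if shutil.which("gcc") is None:
        return "gcc not found on PATH"
    result = subprocess.run(["gcc", "-dumpversion"], capture_output=True, text=True)
    version = result.stdout.strip()
    try:
        ok = supports_cpp17(version)
    except ValueError:
        return f"could not parse gcc version from {version!r}"
    return f"gcc {version}: " + ("accepts -std=c++17" if ok else "too old for C++17")


if __name__ == "__main__":
    print(check_gcc())
```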

After all prerequisites are in place you may start to build the Mortred AI server framework.

Setup 🔥🔥🔥

Step 1: Prepare 3rd-party Libraries

Copy MNN headers and libs

cp -r $MNN_ROOT_DIR/include/MNN ./3rd_party/include
cp $MNN_ROOT_DIR/build/libMNN.so ./3rd_party/libs
cp $MNN_ROOT_DIR/build/source/backend/cuda/libMNN_Cuda_Main.so ./3rd_party/libs

Copy WORKFLOW headers and libs

cp -r $WORKFLOW_ROOT_DIR/_include/workflow ./3rd_party/include
cp -r $WORKFLOW_ROOT_DIR/_lib/libworkflow.so* ./3rd_party/libs

Copy ONNXRUNTIME headers and libs

cp -r $ONNXRUNTIME_ROOT_DIR/include/* ./3rd_party/include/onnxruntime
cp -r $ONNXRUNTIME_ROOT_DIR/_lib/libonnxruntime*.so* ./3rd_party/libs

Copy TensorRT headers and libs

cp -r $TENSORRT_ROOT_DIR/include/* ./3rd_party/include/TensorRT-8.6.1.6
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer.so* ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer_builder_resource.so.8.6.1 ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvinfer_plugin.so* ./3rd_party/libs
cp -r $TENSORRT_ROOT_DIR/_lib/libnvonnxparser.so* ./3rd_party/libs
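A missing library usually surfaces only at link time, so it may help to verify the copies before building. The sketch below assumes it is run from the project root and checks `./3rd_party/libs` against the file patterns used in the `cp` commands above. The function and constant names are illustrative, not part of the project.

```python
from pathlib import Path

# File patterns taken from the cp commands in Step 1.
EXPECTED_LIB_PATTERNS = [
    "libMNN.so",
    "libMNN_Cuda_Main.so",
    "libworkflow.so*",
    "libonnxruntime*.so*",
    "libnvinfer.so*",
    "libnvinfer_builder_resource.so.8.6.1",
    "libnvinfer_plugin.so*",
    "libnvonnxparser.so*",
]


def missing_libs(libs_dir, patterns=EXPECTED_LIB_PATTERNS):
    """Return the patterns that match no file directly inside libs_dir."""
    root = Path(libs_dir)
    return [pattern for pattern in patterns if not any(root.glob(pattern))]


if __name__ == "__main__":
    missing = missing_libs("./3rd_party/libs")
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All expected libraries are present.")
```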

Step 2: Build Mortred AI Server ☕☕☕

mkdir build && cd build
cmake ..
make -j10

Step 3: Download Pre-Built Models 🍵🍵🍵

Download the pre-built image models via BaiduNetDisk; the extraction code is 1y98. Create a directory named weights in $PROJECT_ROOT_DIR and unzip the downloaded models into it. The weights directory structure should look like this:

weights_folder_architecture

Step 4: Test MobileNetv2 Benchmark Tool

The benchmark and server apps will be built in $PROJECT_ROOT_DIR/_bin and the libs in $PROJECT_ROOT_DIR/_lib. Benchmark the mobilenetv2 classification model:

cd $PROJECT_ROOT_DIR/_bin
./mobilenetv2_benchmark.out ../conf/model/classification/mobilenetv2/mobilenetv2_config.ini

You should see the mobilenetv2 model benchmark profile as follows:

mobilenetv2_demo_benchmark

Step 5: Run MobileNetV2 Server Locally

A detailed description of the web server configuration can be found at Web Server Configuration. Now start serving the model:

cd $PROJECT_ROOT_DIR/_bin
./mobilenetv2_classification_server.out ../conf/server/classification/mobilenetv2/mobilenetv2_server_config.ini

The model service will start at http://localhost:8091 with 4 workers waiting to serve. A demo Python client is supplied to test the service:

cd $PROJECT_ROOT_DIR/scripts
export PYTHONPATH=$PWD:$PYTHONPATH
python server/test_server.py --server mobilenetv2 --mode single

The client will repeatedly post demo images 1000 times.

Server output should look like:

mobilenetv2_server_exam_output

Client output should look like:

mobilenetv2_client_exam_output
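For readers who want to see roughly what such a client does, here is a minimal standard-library sketch. Only the host and port come from the text above. The endpoint path, the JSON field names (`req_id`, `img_data`), and the base64 encoding are assumptions made for illustration, so check scripts/server/test_server.py for the request format the server actually expects.

```python
import base64
import json
import urllib.request

SERVER_URL = "http://localhost:8091"  # host and port from the text above
ENDPOINT_PATH = "/hypothetical/classification/mobilenetv2"  # assumption, not the real route


def build_payload(image_bytes: bytes, req_id: str = "demo") -> bytes:
    """Wrap an image as base64 text inside a JSON body (field names are assumptions)."""
    body = {
        "req_id": req_id,
        "img_data": base64.b64encode(image_bytes).decode("ascii"),
    }
    return json.dumps(body).encode("utf-8")


def post_image(image_path: str, url: str = SERVER_URL + ENDPOINT_PATH, timeout: float = 5.0) -> str:
    """POST one image file to the server and return the raw response text."""
    with open(image_path, "rb") as image_file:
        payload = build_payload(image_file.read())
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `post_image` in a loop would mimic the repeated posting of the demo client. Keeping `build_payload` separate lets the request body be inspected without a running server.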

For more server demos, see the Tutorials 👇👇👇

Benchmark

The benchmark test environment is as follows:

OS: Ubuntu 20.04.5 LTS / 5.15.0-87-generic

MEMORY: 32G DIMM DDR4 Synchronous 2666 MHz

CPU: Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz

GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

GPU: GeForce RTX 3080

CUDA: CUDA Version: 11.5

GPU Driver: Driver Version: 495.29.05

Model Inference Benchmark

Each model is run for several loops to avoid the influence of GPU warm-up, and only the model's inference time is counted.
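The measurement pattern described here (discard warm-up loops, then time only the inference call) can be sketched as follows. This is a language-neutral illustration in Python, not the project's benchmark code. The function name and default loop counts are assumptions, and a GPU backend that runs asynchronously would also need a synchronization step before each clock read.

```python
import time


def benchmark(infer, warmup_loops=10, timed_loops=100):
    """Return the mean wall-clock time of infer() in milliseconds, excluding warm-up."""
    if timed_loops <= 0:
        raise ValueError("timed_loops must be positive")
    # Warm-up: results and timings are discarded.
    for _ in range(warmup_loops):
        infer()
    # Timed section: only the inference call sits between the two clock reads.
    start = time.perf_counter()
    for _ in range(timed_loops):
        infer()
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds * 1000.0 / timed_loops
```

Pre-processing and post-processing belong outside `infer` so that they do not leak into the reported number.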

Benchmark Code Snippet

benchmakr_code_snappit

Tutorials

How To

Web Server Configuration

TODO

  • Add more models to the model zoo

Acknowledgement

mortred_model_server refers to the following projects:
