This project provides NanoDet image inference, webcam inference and benchmark using Tencent's NCNN framework.
Download and Install Visual Studio from https://visualstudio.microsoft.com/vs/community/
Download and install OpenCV from https://github.com/opencv/opencv/releases
Download and install Vulkan SDK from https://vulkan.lunarg.com/sdk/home
Clone the NCNN repository:

```shell
git clone --recursive https://github.com/Tencent/ncnn.git
```
Build NCNN following this tutorial: Build for Windows x64 using VS2017
Add `ncnn_DIR = YOUR_NCNN_PATH/build/install/lib/cmake/ncnn` to the system environment variables.
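As a sketch, the variable can also be set persistently from a Windows command prompt; the path below keeps the same `YOUR_NCNN_PATH` placeholder used above:

```shell
setx ncnn_DIR "YOUR_NCNN_PATH\build\install\lib\cmake\ncnn"
```

`setx` writes to the user environment, so open a new command prompt afterwards for the change to take effect.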
Build the project: open the x64 Native Tools Command Prompt for VS 2019 or 2017, then run:

```shell
mkdir build
cd build
cmake ..
msbuild nanodet_demo.vcxproj /p:configuration=release /p:platform=x64
```
On Linux, build and install OpenCV from https://github.com/opencv/opencv and download the Vulkan SDK from https://vulkan.lunarg.com/sdk/home
Clone the NCNN repository:

```shell
git clone --recursive https://github.com/Tencent/ncnn.git
```
Build NCNN following this tutorial: Build for Linux / NVIDIA Jetson / Raspberry Pi
Set the environment variable:

```shell
export ncnn_DIR=YOUR_NCNN_PATH/build/install/lib/cmake/ncnn
```
Build the project:

```shell
mkdir build
cd build
cmake ..
make
```
Download the NanoDet ncnn model. Unzip the archive, rename the files to `nanodet.param` and `nanodet.bin`, then copy them to the demo program folder (`demo_ncnn/build`).
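The unpacking steps above might look like the following sketch; the archive and extracted file names are assumptions, so adjust them to match the file you actually downloaded:

```shell
unzip nanodet_model.zip                    # assumed archive name
mv nanodet-plus-m_416.param nanodet.param  # assumed extracted names
mv nanodet-plus-m_416.bin nanodet.bin
cp nanodet.param nanodet.bin demo_ncnn/build/
```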
Run the demo; the first argument selects the mode:

```shell
# webcam (second argument is the camera id)
./nanodet_demo 0 0
# image inference
./nanodet_demo 1 ${IMAGE_FOLDER}/*.jpg
# video inference
./nanodet_demo 2 ${VIDEO_PATH}
# benchmark
./nanodet_demo 3 0
```
Notice: if benchmark speed is slow, try limiting the number of OpenMP threads. On Linux:

```shell
export OMP_THREAD_LIMIT=4
```
| Model | Resolution | COCO mAP | CPU Latency (i7-8700) | ARM CPU Latency (4×A76) | Vulkan GPU Latency (GTX 1060) |
|---|---|---|---|---|---|
| NanoDet-Plus-m | 320×320 | 27.0 | 10.32 ms / 96.9 FPS | 11.97 ms / 83.5 FPS | 3.40 ms / 294.1 FPS |
| NanoDet-Plus-m | 416×416 | 30.4 | 17.98 ms / 55.6 FPS | 19.77 ms / 50.6 FPS | 4.27 ms / 234.2 FPS |
| NanoDet-Plus-m-1.5x | 320×320 | 29.9 | 12.87 ms / 77.7 FPS | 15.90 ms / 62.9 FPS | 3.78 ms / 264.6 FPS |
| NanoDet-Plus-m-1.5x | 416×416 | 34.1 | 22.53 ms / 44.4 FPS | 25.49 ms / 39.2 FPS | 4.79 ms / 208.8 FPS |
Export the PyTorch model to ONNX:

```shell
python tools/export_onnx.py --cfg_path ${CONFIG_PATH} --model_path ${PYTORCH_MODEL_PATH}
```
Run `onnx2ncnn` from the ncnn tools to generate the ncnn `.param` and `.bin` files, then use `ncnnoptimize` to optimize the model. If you have questions about converting ncnn models, refer to the ncnn wiki: https://github.com/Tencent/ncnn/wiki. You can also convert the model with the online tool https://convertmodel.com/ .
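A sketch of the conversion pipeline, assuming the tools were built under `YOUR_NCNN_PATH/build/tools` and the exported file is named `nanodet.onnx`:

```shell
# Convert the ONNX model to ncnn .param / .bin files.
YOUR_NCNN_PATH/build/tools/onnx/onnx2ncnn nanodet.onnx nanodet.param nanodet.bin
# Optimize the converted model; the trailing 0 keeps fp32 weights.
YOUR_NCNN_PATH/build/tools/ncnnoptimize nanodet.param nanodet.bin nanodet-opt.param nanodet-opt.bin 0
```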
If you want to use a custom model, make sure the hyperparameters in `nanodet.h` are the same as those in your training config file:
```cpp
int input_size[2] = {416, 416};               // input height and width
int num_class = 80;                           // number of classes, 80 for COCO
int reg_max = 7;                              // `reg_max` set in the training config, default 7
std::vector<int> strides = { 8, 16, 32, 64 }; // strides of the multi-level feature maps
```