Mobile AI Bench



In recent years, on-device deep learning applications on smartphones and IoT devices have become more and more common. Deploying deep learning models on devices poses challenges for developers: mobile app developers need to pick a suitable framework from the many available deep learning frameworks, while IoT hardware developers additionally need to choose among different chip solutions. Beyond the choice of software and hardware, developers must also trade off model quantization and compression against model accuracy before finally deploying the model to the device. Benchmarking all these chips, frameworks, and quantization schemes, and selecting the best combination, is tedious and time-consuming work.

MobileAIBench is an end-to-end benchmarking tool that evaluates the performance and accuracy of the same models running on different hardware and software frameworks, giving developers objective reference data for their technology choices.

Daily Benchmark Results

Please check the results of the benchmark step in the latest CI pipeline page. Due to limited device availability, the results do not cover every device and framework.

FAQ

See the English documentation.

Environment Setup

MobileAIBench currently supports several frameworks (MACE, SNPE, ncnn, TensorFlow Lite, and HIAI). The following dependencies must be installed:

| Dependency | Installation command | Validated versions |
| ---------- | -------------------- | ------------------ |
| Python | | 2.7 |
| ADB | apt-get install android-tools-adb | Required by Android run, >= 1.0.32 |
| Android NDK | NDK installation guide | Required by Android build, r15c |
| Bazel | bazel installation guide | 0.13.0 |
| CMake | apt-get install cmake | >= 3.11.3 |
| FileLock | pip install -I filelock==3.0.0 | Required by Android run |
| PyYaml | pip install -I pyyaml==3.12 | 3.12.0 |
| sh | pip install -I sh==1.12.14 | 1.12.14 |
| SNPE (optional) | download and uncompress | 1.18.0 |

Note 1: Since SNPE's license does not allow third-party redistribution, the download URL in the current Bazel WORKSPACE configuration is only accessible from the CI server. To benchmark SNPE (by passing all or explicitly SNPE to --executors), download and uncompress the package from the official website, copy libgnustl_shared.so into it, and then modify the WORKSPACE file as follows.

#new_http_archive(
#    name = "snpe",
#    build_file = "third_party/snpe/snpe.BUILD",
#    sha256 = "8f2b92b236aa7492e4acd217a96259b0ddc1a656cbc3201c7d1c843e1f957e77",
#    strip_prefix = "snpe-1.22.2.233",
#    urls = [
#        "https://cnbj1-fds.api.xiaomi.net/aibench/third_party/snpe-1.22.2_with_libgnustl_shared.so.zip",
#    ],
#)

new_local_repository(
    name = "snpe",
    build_file = "third_party/snpe/snpe.BUILD",
    path = "/path/to/snpe",
)

Note 2: Since HIAI's license does not allow third-party redistribution, the download URL in the current Bazel WORKSPACE configuration is only accessible from the CI server. To benchmark HIAI (by passing all or explicitly HIAI to --executors), log in at the official website to download HiAI DDK.zip and uncompress it, then uncompress the HiAI_DDK_100.200.010.011.zip file inside it, and modify the WORKSPACE file as follows.

#new_http_archive(
#    name = "hiai",
#    build_file = "third_party/hiai/hiai.BUILD",
#    sha256 = "8da8305617573bc495df8f4509fcb1655ffb073d790d9c0b6ca32ba4a4e41055",
#    strip_prefix = "HiAI_DDK_100.200.010.011",
#    type = "zip",
#    urls = [
#        "http://cnbj1.fds.api.xiaomi.com/aibench/third_party/HiAI_DDK_100.200.010.011_LITE.zip",
#    ],
#)

new_local_repository(
    name = "hiai",
    build_file = "third_party/hiai/hiai.BUILD",
    path = "/path/to/hiai",
)

Data Structure

+-----------------+         +------------------+      +---------------+
|   Benchmark     |         |   BaseExecutor   | <--- | MaceExecutor  |
+-----------------+         +------------------+      +---------------+
| - executor      |-------> | - executor       |
| - model_name    |         | - device_type    |      +---------------+
| - quantize      |         |                  | <--- | SnpeExecutor  |
| - input_names   |         +------------------+      +---------------+
| - input_shapes  |         | + Init()         |
| - output_names  |         | + Prepare()      |      +---------------+
| - output_shapes |         | + Run()          | <--- | NcnnExecutor  |
| - run_interval  |         | + Finish()       |      +---------------+
| - num_threads   |         |                  |
+-----------------+         |                  |      +---------------+
| - Run()         |         |                  | <--- | TfLiteExecutor|
+-----------------+         |                  |      +---------------+
        ^     ^             |                  |
        |     |             |                  |      +---------------+
        |     |             |                  | <--- | HiaiExecutor  |
        |     |             +------------------+      +---------------+
        |     |
        |     |             +--------------------+
        |     |             |PerformanceBenchmark|
        |     --------------+--------------------+
        |                   | - Run()            |
        |                   +--------------------+
        |
        |                   +---------------+      +---------------------+
+--------------------+ ---> |PreProcessor   | <--- |ImageNetPreProcessor |
| PrecisionBenchmark |      +---------------+      +---------------------+
+--------------------+
| - pre_processor    |      +---------------+      +---------------------+
| - post_processor   | ---> |PostProcessor  | <--- |ImageNetPostProcessor|
| - metric_evaluator |      +---------------+      +---------------------+
+--------------------+
| - Run()            |      +---------------+
+--------------------+ ---> |MetricEvaluator|
                            +---------------+
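
Read as a lifecycle, the diagram says: a benchmark owns a BaseExecutor and drives it through Init, Prepare, repeated Run calls, and a final Finish. The sketch below is a minimal, self-contained C++ illustration of that flow; the Status enum, the BaseTensor struct, and the timing logic here are simplified stand-ins for the real types under aibench/executors and aibench/benchmark, not the project's actual implementation.

```cpp
// Minimal sketch of the executor lifecycle implied by the diagram above.
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

enum class Status { SUCCESS, RUNTIME_ERROR };

// Stand-in for aibench's tensor wrapper: a shape plus a float buffer.
struct BaseTensor {
  std::vector<int64_t> shape;
  std::vector<float> data;
};

// Stand-in for BaseExecutor: one instance per (framework, device) pair.
class BaseExecutor {
 public:
  virtual ~BaseExecutor() {}
  virtual Status Init(int num_threads) = 0;  // one-time engine setup
  virtual Status Prepare() = 0;              // load the model once
  virtual Status Run(const std::map<std::string, BaseTensor> &inputs,
                     std::map<std::string, BaseTensor> *outputs) = 0;
  virtual void Finish() = 0;                 // unload model, free memory
};

// Rough shape of PerformanceBenchmark::Run(): time repeated Run() calls
// and report the average latency.
Status RunPerformanceBenchmark(BaseExecutor *executor, int num_threads,
                               const std::map<std::string, BaseTensor> &inputs,
                               int rounds) {
  if (executor->Init(num_threads) != Status::SUCCESS ||
      executor->Prepare() != Status::SUCCESS) {
    return Status::RUNTIME_ERROR;
  }
  std::map<std::string, BaseTensor> outputs;
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < rounds; ++i) {
    if (executor->Run(inputs, &outputs) != Status::SUCCESS) {
      executor->Finish();
      return Status::RUNTIME_ERROR;
    }
  }
  auto end = std::chrono::steady_clock::now();
  double avg_ms =
      std::chrono::duration<double, std::milli>(end - start).count() / rounds;
  std::printf("avg latency: %.3f ms over %d rounds\n", avg_ms, rounds);
  executor->Finish();
  return Status::SUCCESS;
}
```

The precision path is analogous, except that PrecisionBenchmark feeds inputs through a PreProcessor, interprets outputs with a PostProcessor, and scores them with a MetricEvaluator, as shown in the diagram.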

How to Use

Benchmark the performance of all models on all frameworks

bash tools/benchmark.sh --benchmark_option=Performance \
                        --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf

A full run can take quite a long time. To benchmark only specific models and frameworks, add the options below. Note that only MACE currently supports precision benchmarking.

| option | type | default | explanation |
| ------ | ---- | ------- | ----------- |
| --benchmark_option | str | Performance | Benchmark options, Performance/Precision. |
| --output_dir | str | output | Benchmark output directory. |
| --executors | str | all | Executors (MACE/SNPE/NCNN/TFLITE/HIAI), comma-separated list or all. |
| --device_types | str | all | DeviceTypes (CPU/GPU/DSP/NPU), comma-separated list or all. |
| --target_abis | str | armeabi-v7a | Target ABIs (armeabi-v7a, arm64-v8a, aarch64, armhf), comma-separated list. |
| --model_names | str | all | Model names (InceptionV3, MobileNetV1, ...), comma-separated list or all. |
| --run_interval | int | 10 | Run interval between benchmarks, in seconds. |
| --num_threads | int | 4 | The number of threads. |
| --input_dir | str | "" | Input data directory for precision benchmark. |

Configure SSH-connected devices

ARM-Linux devices with the aarch64 or armhf ABI can be connected over SSH, which requires adding a YAML configuration entry. Configure SSH devices in generic-mobile-devices/devices_for_ai_bench.yml, for example:

devices:
  nanopi:
    target_abis: [aarch64, armhf]
    target_socs: RK3399
    models: Nanopi M4
    address: 10.231.46.118
    username: pi

Add a new model to an existing framework

  • Add the new model name in aibench/proto/base.proto.

  • Configure the model information in aibench/proto/model.meta.

  • Configure the benchmark information in aibench/proto/benchmark.meta.

  • Run the benchmarks. Performance benchmark:

    bash tools/benchmark.sh --benchmark_option=Performance \
                            --executors=MACE --device_types=CPU --model_names=MobileNetV1 \
                            --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf

    Precision benchmark. Currently only MACE precision benchmarking with ImageNet images as input is supported.

    bash tools/benchmark.sh --benchmark_option=Precision --input_dir=/path/to/inputs \
                            --executors=MACE --device_types=CPU --model_names=MobileNetV1 \
                            --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf
  • Check the results

    python report/csv_to_html.py

    Open the generated link in a browser to view the results.

Add a new AI framework

  • Define the executor class and implement its interfaces (a hypothetical skeleton is sketched after this list):

    class YourExecutor : public BaseExecutor {
     public:
      YourExecutor() :
          BaseExecutor(executor_type, device_type, model_file, weight_file) {}

      // Init method should invoke the initializing process for your executor
      // (e.g. MACE needs to compile OpenCL kernels once per target). It will
      // be called only once when creating the executor engine.
      virtual Status Init(int num_threads);

      // Load the model and prepare to run. It will be called only once before
      // benchmarking the model.
      virtual Status Prepare();

      // Run the model. It will be called more than once.
      virtual Status Run(const std::map<std::string, BaseTensor> &inputs,
                         std::map<std::string, BaseTensor> *outputs);

      // Unload the model and free the memory after benchmarking. It will be
      // called only once.
      virtual void Finish();
    };
  • Include the header in aibench/benchmark/benchmark_main.cc:

    #ifdef AIBENCH_ENABLE_YOUR_EXECUTOR
    #include "aibench/executors/your_executor/your_executor.h"
    #endif
  • Add the dependencies to third_party/your_executor, aibench/benchmark/BUILD and WORKSPACE.

  • Benchmark the model

    Follow the steps in "Add a new model to an existing framework" above.
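
For orientation, a hypothetical skeleton of such an executor is sketched below. It reuses the Status and BaseTensor declarations from the interface above, and everything under the your_fw namespace, as well as the engine_, num_threads_, and model_file_ members, is a made-up placeholder for the new framework's real API; the existing executors under aibench/executors are the authoritative examples.

```cpp
#include "aibench/executors/your_executor/your_executor.h"

// Hypothetical sketch only: your_fw::CreateEngine and the engine_ /
// num_threads_ / model_file_ members are placeholders, not a real API.
Status YourExecutor::Init(int num_threads) {
  num_threads_ = num_threads;  // remember the thread count for Prepare()
  return Status::SUCCESS;      // any one-time engine setup goes here
}

Status YourExecutor::Prepare() {
  // Load the model once; called a single time before benchmarking starts.
  engine_ = your_fw::CreateEngine(model_file_, num_threads_);
  return engine_ != nullptr ? Status::SUCCESS : Status::RUNTIME_ERROR;
}

Status YourExecutor::Run(const std::map<std::string, BaseTensor> &inputs,
                         std::map<std::string, BaseTensor> *outputs) {
  // Copy inputs in, invoke inference, copy the results back out.
  for (const auto &input : inputs) {
    engine_->SetInput(input.first, input.second);
  }
  if (!engine_->Invoke()) return Status::RUNTIME_ERROR;
  for (auto &output : *outputs) {
    engine_->GetOutput(output.first, &output.second);
  }
  return Status::SUCCESS;
}

void YourExecutor::Finish() {
  engine_.reset();  // unload the model and free its memory
}
```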

License

Apache License 2.0