diff --git a/README.md b/README.md
index 80276ad6f..f6eb9629a 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 [中文版本](README_CH.md)
-
+
 ## Introduction
@@ -67,7 +67,7 @@ At present, TNN has been launched in various major businesses, and its following
 * TNN architecture diagram:
-
+
 * TNN supports TensorFlow, Pytorch, MxNet, Caffe, and other training frameworks through ONNX, leveraging the continuous improvement of the ONNX open-source society. Currently, TNN supports 100+ ONNX operators, consisting of most of the mainstream CNN, NLP operators needed.
@@ -127,7 +127,7 @@ TNN referenced the following projects:
 * Everyone is welcome to participate to build the best inference framework in the industry.
-* Technical Discussion QQ Group: 913940506 Answer: TNN
+* Technical Discussion QQ Group: 704900079 Answer: TNN
 * Scan the QR code to join the TNN discussion group:
-
+
diff --git a/README_CH.md b/README_CH.md
index 0f71cb05a..214873322 100644
--- a/README_CH.md
+++ b/README_CH.md
@@ -1,5 +1,5 @@
 [English Version](README.md)
-
+
 ## 简介
@@ -68,7 +68,7 @@ demo
 * TNN架构图:
-
+
 * 通过 ONNX 支持 TensorFlow, PyTorch, MXNet, Caffe 等多种训练框架,充分利用和融入不断完善的 ONNX 开源生态。当前支持 ONNX 算子100+,覆盖主流CNN, NLP网络。
 * 支持主流安卓、iOS、Embedded Linux 操作系统, Windows, Linux,支持 ARM CPU, x86, Mali GPU, Adreno GPU, NV GPU, 达芬奇NPU,RK NPU。
@@ -127,7 +127,7 @@ TNN参考和借鉴了下列项目:
 * 欢迎大家参与,协同共建,打造业界最好的高性能推理框架。
-* 技术交流 QQ 群: 913940506 答案:TNN
+* 技术交流 QQ 群: 704900079 答案:TNN
 * QQ 群二维码:
-
+
diff --git a/doc/cn/development/architecture.md b/doc/cn/development/architecture.md
index 2dbd012c5..0a39362f2 100644
--- a/doc/cn/development/architecture.md
+++ b/doc/cn/development/architecture.md
@@ -10,7 +10,7 @@
 对模型解析相关接口进行了抽象,可支持多种模型格式解析和扩充,相关代码见source/tnn/interpreter模块。
-
+
 AbstractModelInterpreter定义了抽象的Interpret接口,不同的模型解析器解析不同类型模型。DefaultModelInterpreter相关的接口将相关结果存入NetStruture和NetResource结构中,部分第三方模型无法完成内部结构解析的有单独适配,如CoreMLModelInterpreter,以完成第三方库适配。
@@ -79,7 +79,7 @@ public:
 Blob节点构建核心在于内存的分配和优化,主要分为blob内存循环复用,blob内存拼接与监控。
-
+
 首先不同layer输出blob间内存会通过内部算法实现循环复用,不同blob间内存复用会优先选择尺寸接近的blob。
@@ -87,7 +87,7 @@ Blob节点构建核心在于内存的分配和优化,主要分为blob内存循
 ## 四、多平台加速算子实现
-
+
 抽象AbstractDevice接口,用于隐藏不同Device实现细节。提供Device Memory 尺寸计算,Device Memory分配释放,内存CPU Memory与Device meomoy拷贝,Device Layer加速算子构建,以及Instance对应Device Context构建等接口。
diff --git a/doc/cn/development/profiling.md b/doc/cn/development/profiling.md
index 064233178..f522ca9a7 100644
--- a/doc/cn/development/profiling.md
+++ b/doc/cn/development/profiling.md
@@ -18,13 +18,13 @@
 如下图点击benchmark工程,找到工程设置`Signing & Capabilities`,点击Team选项卡选择`Add an Account...`
-
+
 在如下界面输入Apple ID账号和密码,添加完成后回到`Signing & Capabilities`界面,并在Team选项卡中选中添加的账号。如果没有Apple ID也可以通过`Create Apple ID`选项根据相关提示进行申请。
 `PS:申请Apple ID无需付费,可以即时通过,通过后才可在真机上运行APP调试`
-
+
 4. 真机运行
@@ -33,19 +33,19 @@
 如图在现有`Bundle Identifier`后随机添加后缀(限数字和字母),避免个人账户遇到签名冲突。
-
+
 4.2 验证授权
 首次运行先利用快捷键`Command + Shift + K`对工程进行清理,再执行快捷键`Command + R`运行。如果是首次登陆Apple ID,Xcode会弹框报如下错误,需要在iOS设备上根据提示进行授权验证。一般来说手机上的授权路径为:设置 -> 通用 -> 描述文件与设备管理 -> Apple Development选项 -> 点击信任
-
+
 4.3 运行结果
 首次运行先利用快捷键`Command + Shift + K`对工程进行清理,再执行快捷键`Command + R`运行。在界面上点击Run按钮,界面会显示model目录下所有模型的CPU和GPU耗时情况。iPhone7真机运行结果如下图。
-
+
 PS:
@@ -104,7 +104,7 @@ P.S. 不指定 -t, 默认跑CPU和GPU, 华为npu benchmark需通过-t HUAWEI_NPU
 ./benchmark_models.sh -c
 ```
 结果如图:
-
+
 执行结果会保存在`benchmark_models_result.txt`中。
@@ -116,7 +116,7 @@ P.S. 不指定 -t, 默认跑CPU和GPU, 华为npu benchmark需通过-t HUAWEI_NPU
 ./benchmark_models.sh -c -f
 ```
 结果如图:
-
+
 执行结果会保存在`benchmark_models_result.txt`中。
 P.S. 华为npu不支持每层分析。
diff --git a/doc/cn/front_page.md b/doc/cn/front_page.md
index 3efeab349..7991fb116 100644
--- a/doc/cn/front_page.md
+++ b/doc/cn/front_page.md
@@ -1,4 +1,4 @@
-
+
 [English Version](../en/front_page_en.md)
@@ -77,7 +77,7 @@ TNN作为一个移动端高性能、轻量级的推断框架,同时拥有跨
 * TNN架构图:
-
+
 * 通过ONNX支持TensorFlow, Pytorch, MxNet, Caffe等多种训练框架,充分利用和融入不断完善的ONNX开源生态。当前支持ONNX算子55个,近期会完善到约80个,覆盖主流CNN网络
 * 支持主流安卓、iOS、embedded Linux,windows操作系统,支持ARM CPU, GPU硬件平台(近期还会加入达芬奇NPU支持)
@@ -118,7 +118,7 @@ TNN作为一个移动端高性能、轻量级的推断框架,同时拥有跨
 * 欢迎大家参与,协同共建,打造业界最好的移动端推理框架。
-* 技术交流QQ群: 913940506 答案:TNN
+* 技术交流QQ群: 704900079 答案:TNN
 * QQ群二维码:
-
+
diff --git a/doc/cn/get_started.md b/doc/cn/get_started.md
index be67e117f..ab2af81d5 100644
--- a/doc/cn/get_started.md
+++ b/doc/cn/get_started.md
@@ -1,4 +1,4 @@
-
+
 # 从0开始跑通一个Demo
diff --git a/doc/cn/user/convert.md b/doc/cn/user/convert.md
index 51a78a10c..04ac9015f 100755
--- a/doc/cn/user/convert.md
+++ b/doc/cn/user/convert.md
@@ -2,7 +2,7 @@
 [English Version](../../en/user/convert_en.md)
-
+
 目前 TNN 支持业界主流的模型文件格式,包括ONNX、PyTorch、TensorFlow、TesorFlow-Lite 以及 Caffe 等。如上图所示,TNN 将 ONNX 作为中间层,借助于ONNX 开源社区的力量,来支持多种模型文件格式。如果要将PyTorch、TensorFlow 以及 Caffe 等模型文件格式转换为 TNN,首先需要使用对应的模型转换工具,统一将各种模型格式转换成为 ONNX 模型格式,然后将 ONNX 模型转换成 TNN 模型。
diff --git a/doc/cn/user/demo.md b/doc/cn/user/demo.md
index 6aba96e4c..af2c15f08 100644
--- a/doc/cn/user/demo.md
+++ b/doc/cn/user/demo.md
@@ -48,13 +48,13 @@
 如下图点击TNNExamples工程,找到工程设置`Signing & Capabilities`,点击Team选项卡选择`Add an Account...`
-
+
 在如下界面输入Apple ID账号和密码,添加完成后回到`Signing & Capabilities`界面,并在Team选项卡中选中添加的账号。如果没有Apple ID也可以通过`Create Apple ID`选项根据相关提示进行申请。
 `PS:申请Apple ID无需付费,可以即时通过,通过后才可在真机上运行APP调试`
-
+
 4. 真机运行
@@ -62,13 +62,13 @@
 如图在现有`Bundle Identifier`后随机添加后缀(限数字和字母),避免个人账户遇到签名冲突。
-
+
 4.2 验证授权
 首次运行先利用快捷键`Command + Shift + K`对工程进行清理,再执行快捷键`Command + R`运行。如果是首次登陆Apple ID,Xcode会弹框报如下错误,需要在iOS设备上根据提示进行授权验证。一般来说手机上的授权路径为:设置 -> 通用 -> 描述文件与设备管理 -> Apple Development选项 -> 点击信任
-
+
 4.3 运行结果
@@ -94,7 +94,7 @@ c) 如果需要执行OCR demo,需要将tnn_sdk_sample.h中的宏HAS_OPENCV设
 效果示例:iPhone 7, ARM 单线程 6.3206ms
-
+
 2. 图像分类
@@ -102,7 +102,7 @@ c) 如果需要执行OCR demo,需要将tnn_sdk_sample.h中的宏HAS_OPENCV设
 效果示例:iPhone 7, ARM 单线程 13.83ms
-
+
 ## 二、Android Demo 介绍
@@ -176,11 +176,11 @@ NDK 22和23在链接第三方动态库可能会出错,例如opencv,hiai,
 效果示例:华为P40, ARM 单线程 32.2359ms
-
+
 效果示例: 华为P40, 华为NPU rom 100.320.010.022 9.04ms
-
+
 2. 人脸检测-视频
@@ -188,7 +188,7 @@ NDK 22和23在链接第三方动态库可能会出错,例如opencv,hiai,
 效果示例:华为P40, ARM 单线程 122.296ms
-
+
 效果示例: 华为P40, 华为NPU rom 100.320.010.022 28ms
@@ -200,7 +200,7 @@ NDK 22和23在链接第三方动态库可能会出错,例如opencv,hiai,
 效果示例:华为P40, ARM 单线程 81.4047ms
-
+
 效果示例: 华为P40, NPU rom 100.320.010.022 2.48ms
diff --git a/doc/cn/user/roadmap.md b/doc/cn/user/roadmap.md
index be872daf3..ef4c20218 100644
--- a/doc/cn/user/roadmap.md
+++ b/doc/cn/user/roadmap.md
@@ -1,3 +1,3 @@
 # Roadmap
-
+
diff --git a/doc/cn/user/tech_solution.md b/doc/cn/user/tech_solution.md
index 056db4582..2ac508211 100644
--- a/doc/cn/user/tech_solution.md
+++ b/doc/cn/user/tech_solution.md
@@ -62,7 +62,7 @@ TNN作为一个移动端高性能、轻量级的推理框架,同时拥有跨
 * TNN架构图:
-
+
 * 通过ONNX支持TensorFlow, Pytorch, MxNet, Caffe等多种训练框架,充分利用和融入不断完善的ONNX开源生态。当前支持ONNX算子55个,近期会完善到约80个,覆盖主流CNN网络
 * 支持主流安卓、iOS、embedded Linux,windows操作系统,支持ARM CPU, GPU硬件平台(近期还会加入达芬奇NPU支持)
diff --git a/doc/en/development/architecture_en.md b/doc/en/development/architecture_en.md
index f711aa391..d520b7ffa 100644
--- a/doc/en/development/architecture_en.md
+++ b/doc/en/development/architecture_en.md
@@ -10,7 +10,7 @@ Considering the maintenance and compatibility of the open-source library, all ex
 The interface related to the model interpreter is abstracted, which can support multiple model formats' parsing. See the source/tnn/interpreter module for related codes.
-
+
 AbstractModelInterpreter defines an abstract Interpret interface, and different model parsers parse different types of models. The interface related to DefaultModelInterpreter stores the relevant results in the NetStruture and NetResource structures. Some third-party models which cannot complete the interpretation would need a separate path such as CoreMLModelInterpreter, to complete third-party library adaptation.
@@ -65,7 +65,7 @@ Similar to the previous model registration mechanism, different Layers will regi
 The core of Blob node construction is memory allocation and optimization, which is mainly divided into blob memory recycling, blob memory splicing, and monitoring.
-
+
 First of all, the memory between the blobs output by different layers will be cyclically reused through an internal algorithm. The memory reuse between different blobs will preferentially select blobs of similar size.
@@ -74,7 +74,7 @@ The memory between multiple instances of the same thread/different threads has t
 ## IV. Multi-platform Acceleration Operator Implementation
-
+
 Abstract AbstractDevice interface, used to hide the implementation details of different Devices. Provide an interface for Device Memory size calculation, Device Memory allocation/release, CPU Memory and Device memory copy, Device Layer accelerated operator construction, and instance corresponding Device Context construction.
diff --git a/doc/en/development/profiling_en.md b/doc/en/development/profiling_en.md
index 9934aa917..6e903f39d 100644
--- a/doc/en/development/profiling_en.md
+++ b/doc/en/development/profiling_en.md
@@ -18,13 +18,13 @@ Analyze the running time of a model.
 Click the benchmark project as shown below, find the project setting `Signing & Capabilities`, click the Team tab, and select` Add an Account ...`
-
+
 Enter the Apple ID account and password in the following interface. After the addition is complete, return to the `Signing & Capabilities` interface and select the added account in the Team tab. If you don’t have an Apple ID, you can also use the `Create Apple ID` option to apply.
 `PS: There is no fee to apply for Apple ID, it can be passed immediately, and the APP can be debugged on the real machine after passing.`
-
+
 4. Run on real machines
@@ -32,20 +32,20 @@ Analyze the running time of a model.
 As shown in the figure, after the existing `Bundle Identifier`, a suffix (limited to numbers and letters) is randomly added to prevent personal accounts from encountering signature conflicts.
-
+
 4.2 Verify authorization
 For the first time, use the shortcut key `Command + Shift + K` to clean up the project, and then execute the shortcut key` Command + R` to run. If it is the first time to log in with Apple ID, Xcode will pop up a box and report the following error. You need to verify the authorization on the iOS device according to the prompt. Generally speaking, the authorization path on the phone is: Settings-> General-> Profile and Device Management-> Apple Development Options-> Click Trust
-
+
 4.3 Result
 For the first run, use the shortcut key `Command + Shift + K` to clean up the project, and then execute the shortcut key` Command + R` to run. Click the Run button on the interface, the interface will display the CPU and GPU time consumption of all models in the model directory. The running result of the iPhone7 real machine is shown below.
-
+
 PS:
@@ -115,7 +115,7 @@ Execute the script:
 The result is shown in the figure and saved to `benchmark_models_result.txt`.
-
+
 #### 4.2 Layer-by-layer Performance Analysis:
@@ -126,7 +126,7 @@ Execute script:
 ```
 P.S. Huawei NPU does not support layer by layer analysis. The result is shown in the figure and saved to `benchmark_models_result.txt`:
-
+
 ### 5.Special Instructions
diff --git a/doc/en/front_page_en.md b/doc/en/front_page_en.md
index 0ac98e28c..11a4deec0 100644
--- a/doc/en/front_page_en.md
+++ b/doc/en/front_page_en.md
@@ -1,4 +1,4 @@
-
+
 [中文版本](../cn/front_page.md)
@@ -77,7 +77,7 @@ At present, TNN has been launched in various major businesses, and its following
 * TNN architecture diagram:
-
TNN架构 +
TNN架构
 * TNN supports TensorFlow, Pytorch, MxNet, Caffe, and other training frameworks through ONNX, leveraging the continuous improvement of the ONNX open-source society. Currently TNN supports 55 ONNX operators, and will be developed to cover 80 operators shortly, consisting of most of the mainstream CNN operators needed.
@@ -120,8 +120,8 @@ At present, TNN has been launched in various major businesses, and its following
 * Everyone is welcome to participate to build the best mobile inference framework in the industry.
-* Technical Discussion QQ Group: 913940506 Answer: TNN
+* Technical Discussion QQ Group: 704900079 Answer: TNN
 * Scan the QR code to join the TNN discussion group:
-
+
diff --git a/doc/en/get_started_en.md b/doc/en/get_started_en.md
index 4d8ee4ec5..71bb2adb8 100644
--- a/doc/en/get_started_en.md
+++ b/doc/en/get_started_en.md
@@ -1,4 +1,4 @@
-
+
 # Run a demo from scratch
diff --git a/doc/en/user/convert_en.md b/doc/en/user/convert_en.md
index 56724d53c..bd5327e0a 100644
--- a/doc/en/user/convert_en.md
+++ b/doc/en/user/convert_en.md
@@ -4,7 +4,7 @@
 ## Overview
-
+
 TNN currently supports the industry's mainstream model file formats, including ONNX, Pytorch, Tensorflow and Caffe. As shown in the figure above, TNN utilizes ONNX as the intermediate port to support multiple model file formats. To convert model file formats such as Pytorch, Tensorflow, TensorFlow-Lite, and Caffe to TNN, you need to use corresponding tool to convert from the original format to ONNX model first, which then will be transferred into a TNN model.
diff --git a/doc/en/user/demo_en.md b/doc/en/user/demo_en.md
index 6cfa30a41..9eab95f1f 100644
--- a/doc/en/user/demo_en.md
+++ b/doc/en/user/demo_en.md
@@ -47,14 +47,14 @@
 Click the TNNExamples project as shown below, find the project setting `Signing & Capabilities`, click the Team tab and select `Add an Account...`
-
+
 Enter the Apple ID account and password in the following interface. Return to the `Signing & Capabilities` interface, and select the added account in the Team tab. If you don’t have an Apple ID, you can also use the “Create Apple ID” option to apply according to the relevant prompts.
 `PS: There is no fee to apply for Apple ID, it can be passed immediately, and the APP can be run on the real machine after debugging.`
-
+
 4. Run on real machine
@@ -63,14 +63,14 @@
 As shown in the figure, after the existing `Bundle Identifier`, a suffix (limited to numbers and letters) is randomly added to avoid personal account conflicts.
-
+
 4.2 Verify authorization
 For the first time, use the shortcut key `Command + Shift + K` to clean up the project, and then execute the shortcut key` Command + R` to run. If it is the first time to log in with Apple ID, Xcode will pop up a box and report the following error. You need to verify the authorization on the iOS device according to the prompt. Generally speaking, the authorization path on the phone is: Settings-> General-> Profile and Device Management-> Apple Development Options-> Click Trust
-
+
 4.3 Result
@@ -94,7 +94,7 @@
 Effect example: iPhone 7, ARM single thread 6.3206ms
-
+
 2. Image classification
@@ -102,7 +102,7 @@
 Example: iPhone 7, ARM single thread 13.83ms
-
+
 ## II. Introduction to Android Demo
@@ -175,7 +175,7 @@ NDK 22 and 23 are not suggested, because they may report error when link third p
 Effect example: Huawei P40, ARM single thread 32.2359ms
-
+
 Example: Huawei P40, NPU rom 100.320.010.022 9.04ms
@@ -188,7 +188,7 @@ NDK 22 and 23 are not suggested, because they may report error when link third p
 Effect example: Huawei P40, ARM single thread 122.296ms
-
+
 Example: Huawei P40, NPU rom 100.320.010.022 28ms
@@ -201,7 +201,7 @@ NDK 22 and 23 are not suggested, because they may report error when link third p
 Effect example: Huawei P40, ARM single thread 81.4047ms
-
+
 Example: Huawei P40, NPU rom 100.320.010.022 2.48ms
diff --git a/doc/en/user/tech_solution_en.md b/doc/en/user/tech_solution_en.md
index b1be28354..b7d6db0de 100644
--- a/doc/en/user/tech_solution_en.md
+++ b/doc/en/user/tech_solution_en.md
@@ -61,7 +61,7 @@ At present, TNN has been launched in various major businesses, and its following
 #### TNN Architecture Diagram:
-
+
* TNN supports TensorFlow, Pytorch, MxNet, Caffe, and other training frameworks through ONNX, leveraging the continuous improvement of the ONNX open-source society. Currently, TNN supports 55 ONNX operators and will be developed to cover 80 operators shortly, consisting of most of the mainstream CNN operators needed.