Merged
@@ -1,5 +1,5 @@
---
title: Build the MNN Command-line ViT Demo
title: Benchmark the Vision Transformer performance with KleidiAI
weight: 4

### FIXED, DO NOT MODIFY
@@ -35,12 +35,16 @@ $ git clone https://github.com/alibaba/MNN.git
$ mkdir build_64 && cd build_64
$ ../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true -DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
$ adb push *so llm_demo tools/cv/*so /data/local/tmp/
$ adb shell
```

The build parameter `-DMNN_KLEIDIAI` above enables KleidiAI in MNN; set it to `false` to disable KleidiAI.
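
As a rough sketch of how the two builds differ, the flag string from the build command above can be parameterized on a single variable. The `KLEIDIAI` and `MNN_FLAGS` variable names are illustrative and not part of the MNN build scripts; the flags themselves are copied from the command above.

```shell
# Hypothetical helper variable: set to "true" or "false" to toggle KleidiAI.
KLEIDIAI=false

# Same flags as the build command above, with only -DMNN_KLEIDIAI parameterized.
MNN_FLAGS="-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=${KLEIDIAI} -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true -DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"

# Print the flags so the KleidiAI setting can be checked before building.
echo "$MNN_FLAGS"

# Then, from the build_64 directory, as in the command above:
# ../build_64.sh "$MNN_FLAGS"
```

Building once with `KLEIDIAI=true` and once with `KLEIDIAI=false` gives the two binaries to compare.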

## Test the performance with and without KleidiAI

Switch to the adb shell environment on the Android device.
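
Before opening the shell, it can help to confirm that adb sees the device. The sketch below counts the devices that `adb devices` reports in the `device` state (connected and authorized). The function names `count_ready_devices` and `require_device` are illustrative, not part of adb or MNN.

```shell
# Count devices that adb lists in the "device" state.
# `adb devices` prints a header line, then one "<serial> <state>" line per device.
count_ready_devices() {
    adb devices | awk 'NR > 1 && $2 == "device" { n++ } END { print n + 0 }'
}

# Return non-zero (with a hint on stderr) when no authorized device is attached.
require_device() {
    if [ "$(count_ready_devices)" -eq 0 ]; then
        echo "No authorized device found; check the USB cable and USB debugging." >&2
        return 1
    fi
}
```

Calling `require_device && adb shell` would then open the shell only when a device is ready.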

```shell
$ adb shell
$ cd /data/local/tmp/
$ chmod +x llm_demo
$ export LD_LIBRARY_PATH=./
@@ -73,3 +77,10 @@ prefill speed = 135.29 tok/s
##################################
```

Here is my performance comparison with and without KleidiAI:

| Metric | KleidiAI OFF | KleidiAI ON |
|----------|----------|----------|
| Vision Process Time | 5.45 s | 5.43 s |
| Prefill Speed | 132.35 tok/s | 148.30 tok/s |
| Decode Speed | 21.61 tok/s | 33.26 tok/s |
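
To make the table easier to read, the relative change can be computed from the figures above. This is a small sketch using awk; the `speedup` function name is illustrative.

```shell
# Print the ratio new/old with two decimals.
# For tok/s figures, a value above 1.00 means the second run is faster.
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

speedup 132.35 148.30   # prefill speed, KleidiAI OFF vs ON
speedup 21.61 33.26     # decode speed, KleidiAI OFF vs ON
```

With the numbers in the table, this gives roughly 1.12x for prefill and 1.54x for decode, while the vision process time is essentially unchanged (5.45 s vs 5.43 s).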
@@ -24,7 +24,7 @@ Open up a Windows PowerShell or Git Bash and checkout the source tree:
cd C:\Users\$env:USERNAME
git clone https://github.com/HenryDen/MNN.git
cd MNN
git checkout 83b650fc8888d7ccd38dbc68330a87d048b9fe7a
git checkout origin/MNN_commit
```

{{% notice Note %}}
@@ -33,28 +33,15 @@ The app code is currently not merged into the MNN repo. The repo above is a fork

## Build the app using Android Studio

Create a signing.gradle file at android/app with the following template:
```shell
ext{
signingConfigs = [
release: [
storeFile: file('PATH_TO_jks_file'),
storePassword: "****",
keyAlias: "****",
keyPassword: "****"
]
]
}
```
Open Android Studio

If you don't need to compile a release version of the app, you can skip the following steps for creating a signing file and put any placeholder values in signing.gradle.
- Navigate to **Open**.
- Browse to the MNN/transformers/llm/engine/android folder; it is shown with an Android icon, as in the screenshot below.
- Press **OK** to load the Android project.
- Wait for the Gradle sync to finish.

- Navigate to **Build -> Generate Signed App Bundle or APK**.
- Select **APK** and click **next**.
- Press **Create new** and fill in the information.
- Fill in the information of the newly generated JKS file in the template above.
![Loading screenshot](open_project.png)

Open the MNN/transformers/llm/engine/android directory with Android Studio and wait for the Gradle project sync to finish.

## Prepare the model
You can download the model from ModelScope: https://www.modelscope.cn/models/qwen/qwen2-vl-2b-instruct
@@ -79,7 +66,7 @@ $ llmexport --path /path/to/mnn-llm/Qwen2-VL-2B-Instruct/ --export mnn --quant_b
- --sym: the quantization parameter; selects symmetric quantization.

## Build and run the app
Before launching the app, you need to push the model into the device manually:
Before launching the app, you need to push the model to the device manually. Connect the Android device to the host PC over USB, and make sure USB debugging is enabled on the device:

```shell
$ adb shell mkdir /data/local/tmp/models/