Original file line number Diff line number Diff line change
@@ -18,9 +18,9 @@ sudo apt update
sudo apt install cmake git-lfs -y
```

You can use Android Studio to obtain the NDK.

Click **Tools > SDK Manager** and navigate to the **SDK Tools** tab.

Select the **NDK (Side by side)** and **CMake** checkboxes, as shown below:

@@ -55,7 +55,7 @@ source vision_llm/bin/activate

## Set up Phone Connection

You need to set up an authorized connection with your phone. The Android SDK Platform Tools package, included with Android Studio, provides Android Debug Bridge (ADB) for transferring files.

Connect your phone to your computer using a USB cable, and enable USB debugging on your phone. To do this, tap the **Build Number** in your **Settings** app 7 times, then enable **USB debugging** in **Developer Options**.

@@ -79,7 +79,9 @@ The pre-quantized model is available in Hugging Face, you can download with the
```bash
git lfs install
git clone https://huggingface.co/taobao-mnn/Qwen2.5-VL-3B-Instruct-MNN
cd Qwen2.5-VL-3B-Instruct-MNN
git checkout a4622194b3c518139e2cb8099e147e3d71975f7a
cd ..
```
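Because the clone is pinned to a specific commit, it can be worth confirming that `HEAD` really points at it before you continue. The following is a minimal sketch, not part of the original instructions; `verify_pinned_commit` is a hypothetical helper name, and it relies only on `git rev-parse`.

```bash
# Hypothetical helper (not part of the learning path):
# succeed only when the repository in $1 is checked out at commit $2.
verify_pinned_commit() {
  local repo_dir="$1" expected="$2" actual
  # Ask git which commit HEAD currently resolves to; bail out if this is not a repository.
  actual="$(git -C "$repo_dir" rev-parse HEAD)" || return 2
  if [ "$actual" = "$expected" ]; then
    echo "OK: $repo_dir is at $expected"
  else
    echo "Mismatch: $repo_dir is at $actual, expected $expected" >&2
    return 1
  fi
}
```

For example, from the directory that contains the clone, `verify_pinned_commit Qwen2.5-VL-3B-Instruct-MNN a4622194b3c518139e2cb8099e147e3d71975f7a` should report a match if the checkout above succeeded.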

## (Optional) Download and Convert the Model
@@ -133,11 +135,11 @@ Verify that the model was built correctly by checking that the `Qwen2.5-VL-3B-In

## Push the model to Android device

Push the model onto the device:
Push the repository you cloned earlier onto the device:

```shell
adb shell mkdir /data/local/tmp/models/
adb push Qwen2.5-VL-3B-Instruct-MNN /data/local/tmp/models
adb push Qwen2.5-VL-3B-Instruct-MNN/ /data/local/tmp/models
```
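The push step can also be wrapped so that it is safe to repeat and shows what arrived on the device. This is a sketch under stated assumptions rather than the documented procedure: `push_model` is a hypothetical helper, `mkdir -p` is used so that an existing `models` directory is not an error, and the listing assumes that `adb push` places the directory under `/data/local/tmp/models` with its own name.

```bash
# Hypothetical helper (not part of the learning path):
# push the local model directory given as $1 to the device, then list the result.
push_model() {
  local model_dir="$1"
  local remote_root="/data/local/tmp/models"
  # Refuse to continue if the local directory is missing.
  [ -d "$model_dir" ] || { echo "No such directory: $model_dir" >&2; return 1; }
  # -p makes the command succeed even when the directory already exists.
  adb shell mkdir -p "$remote_root" || return 1
  adb push "$model_dir" "$remote_root" || return 1
  # Assumption: the pushed directory keeps its name under the remote root.
  adb shell ls "$remote_root/$(basename "$model_dir")"
}
```

With a device connected, `push_model Qwen2.5-VL-3B-Instruct-MNN/` would run the same two commands as above and then list the pushed files.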

With the model set up, you're ready to build and run an example application.
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
---
title: Benchmark the Vision Transformer performance with KleidiAI
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Clone Vision Language Models repo

In this section, you will see the Qwen model in action by running a demo application packaged as an Android Package Kit (APK).

This repository is set up to enable building the app as an Android Studio project.

Run the following command to clone the repository:

```bash
git clone https://gitlab.arm.com/kleidi/kleidi-examples/vision-language-models
```

## Build the App Using Android Studio

You can use Android Studio to build the app and create an APK.

### Open project and build

Open Android Studio.

Go to **File > Open**.

Navigate to the `vision-language-models` directory, and click **Open**.

This triggers a build of the project, and you should see output similar to the following on completion:

```output
BUILD SUCCESSFUL in 1m 42s
```

### Generate and Run the APK

Navigate to **Build > Generate App Bundles or APKs**. Select **Generate APKs**.

The build will be executed, and then the app will be copied and installed on the Android device.

After opening the app, you will see the splash screen:

![Loading screenshot](Loading_page.png)

Finally, you can use the UI to chat with the app. Try uploading an image and asking a question about it.

![Chat screenshot](chat2.png)

The final step is to examine how KleidiAI can improve the performance of the model. Continue to the next section to find out.
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: Build the MNN Command-line ViT Demo
weight: 4
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
@@ -29,7 +29,7 @@ Run the following commands to clone the MNN repository and checkout the source t
cd $HOME
git clone https://github.com/alibaba/MNN.git
cd MNN
git checkout a739ea5870a4a45680f0e36ba9662ca39f2f4eec
git checkout fa3b2161a9b38ac1e7dc46bb20259bd5eb240031
```

Create a build directory and run the build script.
@@ -40,10 +40,9 @@ The first time that you do this, build the binaries with the `-DMNN_KLEIDIAI` fl
cd $HOME/MNN/project/android
mkdir build_64 && cd build_64

../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=FALSE \
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
-DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
-DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
../build_64.sh "-DMNN_BUILD_LLM=true -DMNN_BUILD_LLM_OMNI=ON -DLLM_SUPPORT_VISION=true \
-DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DMNN_LOW_MEMORY=true \
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true"
```
{{% notice Note %}}
If your NDK toolchain isn't set up correctly, you might run into issues with the above script. Make a note of where the NDK was installed - this will be a directory named after the version you downloaded earlier. Try exporting the following environment variables before re-running `build_64.sh`:
@@ -102,14 +101,19 @@ prefill speed = 192.28 tok/s

## Enable KleidiAI and Re-run Inference

The next step is to re-generate the binaries with KleidiAI activated. This is done by updating the flag `-DMNN_KLEIDIAI` to `TRUE`.
The next step is to re-generate the binaries with KleidiAI activated. This is done by inserting a hint into the code.

From the `MNN` directory, run:
```bash
sed -i '/void Llm::setRuntimeHint(std::shared_ptr<Express::Executor::RuntimeManager> &rtg) {/a\
rtg->setHint(MNN::Interpreter::CPU_ENABLE_KLEIDIAI, 1);' transformers/llm/engine/src/llm.cpp
```
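Running the `sed` command a second time appends the hint again, so a guarded variant can be useful if you repeat this step. The following is a minimal sketch, assuming GNU `sed` and that the function signature in `llm.cpp` matches the pattern above; `enable_kleidiai_hint` is a hypothetical helper name, not part of MNN.

```bash
# Hypothetical helper (not part of MNN):
# insert the KleidiAI hint into the source file given as $1 only if it is not already there.
enable_kleidiai_hint() {
  local src="$1"
  local hint='rtg->setHint(MNN::Interpreter::CPU_ENABLE_KLEIDIAI, 1);'
  # Skip the edit when the exact hint text already appears anywhere in the file.
  if grep -qF -- "$hint" "$src"; then
    echo "Hint already present in $src"
    return 0
  fi
  # Same sed expression as above: append the hint after the function's opening line.
  sed -i '/void Llm::setRuntimeHint(std::shared_ptr<Express::Executor::RuntimeManager> &rtg) {/a\
rtg->setHint(MNN::Interpreter::CPU_ENABLE_KLEIDIAI, 1);' "$src"
  # Return non-zero if the anchor line was not found and nothing was inserted.
  grep -qF -- "$hint" "$src"
}
```

From the `MNN` directory, this would be called as `enable_kleidiai_hint transformers/llm/engine/src/llm.cpp`. A non-zero exit status suggests that the signature of `setRuntimeHint` differs from the pattern in the checked-out commit.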

From the `build_64` directory, run:
```bash
../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=TRUE \
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
-DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
-DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
../build_64.sh "-DMNN_BUILD_LLM=true -DMNN_BUILD_LLM_OMNI=ON -DLLM_SUPPORT_VISION=true \
-DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DMNN_LOW_MEMORY=true \
-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true"
```
## Update Files on the Device
