ArmDeveloperEcosystem · pareenaverma · Oct 10, 2025
diff --git a/...ing/vision-llm-inference-on-android-with-kleidiai-and-mnn/1-devenv-and-model.md b/...ing/vision-llm-inference-on-android-with-kleidiai-and-mnn/1-devenv-and-model.md
@@ -18,9 +18,9 @@ sudo apt update
 sudo apt install cmake git-lfs -y
 ```
 
-You can use Android Studio to obtain the NDK. 
+You can use Android Studio to obtain the NDK.
 
-Click **Tools > SDK Manager** and navigate to the **SDK Tools** tab. 
+Click **Tools > SDK Manager** and navigate to the **SDK Tools** tab.
 
 Select the **NDK (Side by side)** and **CMake** checkboxes, as shown below:
 
@@ -55,7 +55,7 @@ source vision_llm/bin/activate
 
 ## Set up Phone Connection
 
-You need to set up an authorized connection with your phone. The Android SDK Platform Tools package, included with Android Studio, provides Android Debug Bridge (ADB) for transferring files. 
+You need to set up an authorized connection with your phone. The Android SDK Platform Tools package, included with Android Studio, provides Android Debug Bridge (ADB) for transferring files.
 
 Connect your phone to your computer using a USB cable, and enable USB debugging on your phone. To do this, tap the **Build Number** in your **Settings** app 7 times, then enable **USB debugging** in **Developer Options**.
 
@@ -79,7 +79,9 @@ The pre-quantized model is available in Hugging Face, you can download with the
 ```bash
 git lfs install
 git clone https://huggingface.co/taobao-mnn/Qwen2.5-VL-3B-Instruct-MNN
+cd Qwen2.5-VL-3B-Instruct-MNN
 git checkout a4622194b3c518139e2cb8099e147e3d71975f7a
+cd ..
 ```
 
 ## (Optional) Download and Convert the Model
@@ -133,11 +135,11 @@ Verify that the model was built correctly by checking that the `Qwen2.5-VL-3B-In
 
 ## Push the model to Android device
 
-Push the model onto the device:
+Push the repository you cloned earlier onto the device:
 
 ```shell
 adb shell mkdir /data/local/tmp/models/
-adb push Qwen2.5-VL-3B-Instruct-MNN /data/local/tmp/models
+adb push Qwen2.5-VL-3B-Instruct-MNN/ /data/local/tmp/models
 ```
 
 With the model set up, you're ready to build and run an example application.
diff --git a/...-gaming/vision-llm-inference-on-android-with-kleidiai-and-mnn/2-generate-apk.md b/...-gaming/vision-llm-inference-on-android-with-kleidiai-and-mnn/2-generate-apk.md
@@ -0,0 +1,53 @@
+---
+title: Benchmark the Vision Transformer performance with KleidiAI
+weight: 4
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## Clone Vision Language Models repo
+
+In this section, you will run the Qwen model using a demo application packaged as an Android Package Kit (APK).
+
+The Vision Language Models repository is set up so that you can build the app as an Android Studio project.
+
+Run the following command to clone the repository:
+
+```bash
+git clone https://gitlab.arm.com/kleidi/kleidi-examples/vision-language-models
+```
+
+## Build the App Using Android Studio
+
+You can use Android Studio to build the app and create an APK.
+
+### Open project and build
+
+Open Android Studio.
+
+Go to **File > Open**.
+
+Navigate to the `vision-language-models` directory, and click **Open**.
+
+This triggers a build of the project, and you should see output similar to the following on completion:
+
+```output
+BUILD SUCCESSFUL in 1m 42s
+```
+
+### Generate and Run the APK
+
+Navigate to **Build > Generate App Bundles or APKs**. Select **Generate APKs**.
+
+Android Studio runs the build, and the app is then copied to and installed on the Android device.
+
+After opening the app, you will see the splash screen:
+
+![Loading screenshot](Loading_page.png)
+
+Finally, you can use the UI to chat with the app. Try uploading an image and asking a question about it.
+
+![Chat screenshot](chat2.png)
+
+The final step is to examine how KleidiAI can improve the performance of the model. Continue to the next section to find out.
diff --git a/...roid-with-kleidiai-and-mnn/2-benchmark.md → ...roid-with-kleidiai-and-mnn/3-benchmark.md b/...roid-with-kleidiai-and-mnn/2-benchmark.md → ...roid-with-kleidiai-and-mnn/3-benchmark.md
@@ -1,6 +1,6 @@
 ---
 title: Build the MNN Command-line ViT Demo
-weight: 4
+weight: 5
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
@@ -29,7 +29,7 @@ Run the following commands to clone the MNN repository and checkout the source t
 cd $HOME
 git clone https://github.com/alibaba/MNN.git
 cd MNN
-git checkout a739ea5870a4a45680f0e36ba9662ca39f2f4eec
+git checkout fa3b2161a9b38ac1e7dc46bb20259bd5eb240031
 ```
 
 Create a build directory and run the build script. 
@@ -40,10 +40,9 @@ The first time that you do this, build the binaries with the `-DMNN_KLEIDIAI` fl
 cd $HOME/MNN/project/android
 mkdir build_64 && cd build_64
 
-../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=FALSE  \
-  -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
-  -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
-  -DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
+../build_64.sh "-DMNN_BUILD_LLM=true -DMNN_BUILD_LLM_OMNI=ON -DLLM_SUPPORT_VISION=true \
+-DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DMNN_LOW_MEMORY=true \
+-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true"
 ```
 {{% notice Note %}}
 If your NDK toolchain isn't set up correctly, you might run into issues with the above script. Make a note of where the NDK was installed - this will be a directory named after the version you downloaded earlier. Try exporting the following environment variables before re-running `build_64.sh`:
@@ -102,14 +101,19 @@ prefill speed = 192.28 tok/s
 
 ## Enable KleidiAI and Re-run Inference
 
-The next step is to re-generate the binaries with KleidiAI activated. This is done by updating the flag `-DMNN_KLEIDIAI` to `TRUE`. 
+The next step is to regenerate the binaries with KleidiAI activated. This is done by inserting a runtime hint into the LLM engine source file, `transformers/llm/engine/src/llm.cpp`.
+
+From the `MNN` directory, run:
+```bash
+sed -i '/void Llm::setRuntimeHint(std::shared_ptr<Express::Executor::RuntimeManager> &rtg) {/a\
+    rtg->setHint(MNN::Interpreter::CPU_ENABLE_KLEIDIAI, 1);' transformers/llm/engine/src/llm.cpp
+```
 
 From the `build_64` directory, run:
 ```bash
-../build_64.sh "-DMNN_LOW_MEMORY=true -DLLM_SUPPORT_VISION=true -DMNN_KLEIDIAI=TRUE \
--DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true \
--DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true \
--DMNN_USE_LOGCAT=true -DMNN_IMGCODECS=true -DMNN_BUILD_OPENCV=true"
+../build_64.sh "-DMNN_BUILD_LLM=true -DMNN_BUILD_LLM_OMNI=ON -DLLM_SUPPORT_VISION=true \
+-DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true -DMNN_LOW_MEMORY=true \
+-DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true"
 ```
 ## Update Files on the Device