9 changes: 6 additions & 3 deletions backends/qualcomm/README.md
@@ -1,12 +1,14 @@
# Qualcomm AI Engine Direct Backend

Disclaimer: At present, we do not offer any backward compatibility guarantees
for any APIs. We are currently in a pre-alpha development phase, and as such,
for any APIs. We are currently in a development phase, and as such,
we reserve the right to modify interfaces and implementations.

This backend is implemented on top of
[Qualcomm AI Engine Direct SDK](https://developer.qualcomm.com/software/qualcomm-ai-engine-direct-sdk).
Please follow the [tutorial](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html) to set up the environment, build, and run ExecuTorch models with this backend (Qualcomm AI Engine Direct is also referred to as QNN in the source and documentation).
Please follow the [tutorial](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md) to set up the environment, build, and run ExecuTorch models with this backend (Qualcomm AI Engine Direct is also referred to as QNN in the source and documentation).

A website version of the tutorial is [here](https://pytorch.org/executorch/stable/build-run-qualcomm-ai-engine-direct-backend.html).

## Delegate Options

@@ -29,7 +31,7 @@ Add SoC model into QcomChipset enum in [schema](./serialization/schema.fbs) and
Insert new SoC information into `_soc_info_table` in [qnn_compile_spec_schema](./serialization/qnn_compile_spec_schema.py); a sketch is shown below.
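
For illustration, a new table entry might look like the following sketch. The `SocInfo`, `HtpInfo`, and `HtpArch` names and the HTP parameters here are assumptions mirroring the existing entries, not authoritative; check the file for the real definitions.

```python
# Hypothetical sketch only -- copy an existing entry in _soc_info_table for
# the real dataclass names and HTP parameters; the values are illustrative.
_soc_info_table = {
    # ... existing entries ...
    QcomChipset.SM8650: SocInfo(QcomChipset.SM8650, HtpInfo(HtpArch.V75, 8)),
}
```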

#### Step 3: Recompile the .pte file
Follow [setup](setup.md) to set up the environment and build the runtime with the new schema header.
Follow the [setup tutorial](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md) to set up the environment and build the runtime with the new schema header.

### Supported Inference Type
- Quantized
@@ -46,6 +48,7 @@ backends/qualcomm
├── partition # QNN Partitioner (AoT Part).
├── passes # Various passes helping lower models to QNN backend (AoT Part).
├── python # Places to put pybind artifacts for accessing QNN APIs, structures, etc. (AoT Part).
├── quantizer # QNN Quantizer
├── runtime # Here is the QNN runtime responsible for compiling a model on x64.
| | # Meanwhile, this is also the runtime responsible for executing compiled
| | # models on a device.
8 changes: 4 additions & 4 deletions backends/qualcomm/scripts/build.sh
@@ -16,7 +16,7 @@ usage() {
echo "Usage: Build the aarch64 version of executor runner or the python interface of Qnn Manager"
echo "First, you need to set the environment variable for QNN_SDK_ROOT"
echo ", and if you want to build the aarch64 version of executor runner"
echo ", you need to set ANDROID_NDK_ROOT"
echo ", you need to export ANDROID_NDK_ROOT=/path/to/android_ndkXX"
echo "e.g.: executorch$ ./backends/qualcomm/scripts/build.sh --skip_x86_64"
exit 1
}
@@ -25,9 +25,9 @@ usage() {
[ "$1" = -h ] && usage

BUILD_X86_64="true"
CMAKE_X86_64="build_x86_64"
CMAKE_X86_64="cmake-out"
BUILD_AARCH64="true"
CMAKE_AARCH64="build_android"
CMAKE_AARCH64="cmake-out-android"
CLEAN="true"
BUILD_TYPE="Debug"
BUILD_JOB_NUMBER="16"
@@ -61,7 +61,7 @@ PRJ_ROOT="$( cd "$(dirname "$0")/../../.." ; pwd -P)"

if [ "$BUILD_AARCH64" = true ]; then
if [[ -z ${ANDROID_NDK_ROOT} ]]; then
echo "Please export ANDROID_NDK_ROOT=/path/to/android_ndk"
echo "Please export ANDROID_NDK_ROOT=/path/to/android_ndkXX"
exit -1
fi
BUILD_ROOT=$PRJ_ROOT/$CMAKE_AARCH64
188 changes: 3 additions & 185 deletions backends/qualcomm/setup.md
@@ -1,189 +1,7 @@
# Setting up QNN Backend

This is a tutorial for building and running the Qualcomm AI Engine Direct backend,
Please refer to [Building and Running ExecuTorch with Qualcomm AI Engine Direct Backend](../../docs/source/build-run-qualcomm-ai-engine-direct-backend.md).

That is a tutorial for building and running the Qualcomm AI Engine Direct backend,
including compiling a model on an x64 host and running the inference
on an Android device.


## Prerequisite

Please finish the tutorial [Setting up ExecuTorch](../../docs/source/getting-started-setup.md).


## Conventions

`$QNN_SDK_ROOT` refers to the root of the Qualcomm AI Engine Direct SDK,
i.e., the directory containing `QNN_README.txt`.

`$ANDROID_NDK_ROOT` refers to the root of the Android NDK.

`$EXECUTORCH_ROOT` refers to the root of the ExecuTorch git repository.


## Environment Setup

### Download Qualcomm AI Engine Direct SDK

Navigate to the [Qualcomm AI Engine Direct SDK](https://developer.qualcomm.com/software/qualcomm-ai-engine-direct-sdk) page and click the download button.

You might need to apply for a Qualcomm account to download the SDK.

After logging in, search for Qualcomm AI Stack in the *Tool* panel.
You can find the Qualcomm AI Engine Direct SDK under the AI Stack group.

Please download the Linux version, and follow the instructions on the page to
extract the file.

The SDK should be installed somewhere like `/opt/qcom/aistack/qnn` by default.

### Download Android NDK

Please navigate to [Android NDK](https://developer.android.com/ndk) and download
a version of the NDK. We recommend the LTS version, currently r25c.

### Set up environment variables

We need to make sure the Qualcomm AI Engine Direct libraries can be found by
the dynamic linker on x64, so we set `LD_LIBRARY_PATH`. In production,
we recommend putting the libraries in the default search path or using `rpath`
to indicate their location.

Further, we set `$PYTHONPATH` because it makes developing and importing ExecuTorch Python APIs easier. Users may instead build and install ExecuTorch as a regular Python package.

```bash
export LD_LIBRARY_PATH=$QNN_SDK_ROOT/lib/x86_64-linux-clang/:$LD_LIBRARY_PATH
export PYTHONPATH=$EXECUTORCH_ROOT/..
```

Note: Since we set `PYTHONPATH`, we may have trouble finding `program.fbs`
and `scalar_type.fbs` when exporting a model, because they are installed into
the `pip-out` directory with the same package name pattern. A workaround is to
copy `$EXECUTORCH_ROOT/pip-out/lib.linux-x86_64-cpython-310/executorch/exir/_serialize/program.fbs`
and `$EXECUTORCH_ROOT/pip-out/lib.linux-x86_64-cpython-310/executorch/exir/_serialize/scalar_type.fbs`
to `$EXECUTORCH_ROOT/exir/_serialize/`, as in the sketch below.
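
A minimal sketch of that copy, assuming the `lib.linux-x86_64-cpython-310` directory name (it varies with your host and Python version):

```python
# Copy the generated .fbs files next to the in-tree _serialize package so the
# PYTHONPATH-based import can find them. Adjust the cpython-310 path segment
# to match your pip-out directory.
import os
import shutil

executorch_root = os.environ["EXECUTORCH_ROOT"]
src_dir = os.path.join(
    executorch_root,
    "pip-out/lib.linux-x86_64-cpython-310/executorch/exir/_serialize",
)
dst_dir = os.path.join(executorch_root, "exir", "_serialize")
for fbs in ("program.fbs", "scalar_type.fbs"):
    shutil.copy(os.path.join(src_dir, fbs), dst_dir)
```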


## End to End Inference

### Step 1: Build Python APIs for AOT compilation on x64

Python APIs on x64 are required to compile models to a Qualcomm AI Engine Direct binary.
Make sure `buck2` is in a directory on your `PATH`.

```bash
cd $EXECUTORCH_ROOT
mkdir build_x86_64
cd build_x86_64
cmake .. -DEXECUTORCH_BUILD_QNN=ON -DQNN_SDK_ROOT=${QNN_SDK_ROOT}
cmake --build . -t "PyQnnManagerAdaptor" "PyQnnWrapperAdaptor" -j8

# Install the Python APIs to the correct import path.
# The filename might vary depending on your Python and host version.
cp -f backends/qualcomm/PyQnnManagerAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
cp -f backends/qualcomm/PyQnnWrapperAdaptor.cpython-310-x86_64-linux-gnu.so $EXECUTORCH_ROOT/backends/qualcomm/python
```
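
As a quick sanity check that the bindings are importable, you can try the import below. This is a sketch assuming `PYTHONPATH` is set as described above; the module path mirrors where the `.so` files were copied.

```python
# Assumed import path, mirroring the copy destination above. If this fails,
# re-check PYTHONPATH and the copied filenames.
import executorch.backends.qualcomm.python.PyQnnManagerAdaptor as PyQnnManager  # noqa: F401

print("PyQnnManagerAdaptor loaded successfully")
```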


### Step 2: Build `qnn_executor_runner` for Android

`qnn_executor_runner` is an executable that runs the compiled model.

You might want to ensure the correct `flatc` is used. `flatc` can be built along with the step above; for example, it can be found in `build_x86_64/third-party/flatbuffers/`.

We can prepend `$EXECUTORCH_ROOT/build_x86_64/third-party/flatbuffers` to `PATH` so the cross-compilation below finds the correct flatbuffer compiler.

Commands to build `qnn_executor_runner` for Android:

```bash
cd $EXECUTORCH_ROOT
mkdir build_android
cd build_android
# build executorch & qnn_executorch_backend
cmake .. \
-DCMAKE_INSTALL_PREFIX=$PWD \
-DEXECUTORCH_BUILD_QNN=ON \
-DEXECUTORCH_BUILD_SDK=ON \
-DEXECUTORCH_ENABLE_EVENT_TRACER=ON \
-DQNN_SDK_ROOT=$QNN_SDK_ROOT \
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK_ROOT/build/cmake/android.toolchain.cmake \
-DANDROID_ABI='arm64-v8a' \
-DANDROID_NATIVE_API_LEVEL=23 \
-B$PWD

cmake --build $PWD -j16 --target install

cmake ../examples/qualcomm \
-DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK_ROOT/build/cmake/android.toolchain.cmake \
-DANDROID_ABI='arm64-v8a' \
-DANDROID_NATIVE_API_LEVEL=23 \
-DCMAKE_PREFIX_PATH="$PWD/lib/cmake/ExecuTorch;$PWD/third-party/gflags;" \
-DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=BOTH \
-Bexamples/qualcomm

cmake --build examples/qualcomm -j16
```
**Note:** If you want to build for release, add `-DCMAKE_BUILD_TYPE=Release` to the `cmake` command options.

You can find `qnn_executor_runner` under `build_android/examples/qualcomm/`.


### Step 3: Compile a model

```bash
python -m examples.qualcomm.scripts.export_example --model_name mv2
```

Then the generated `mv2.pte` can be run on the device with
`build_android/examples/qualcomm/qnn_executor_runner` and the Qualcomm AI Engine
Direct backend.

**Note:** To get proper accuracy, please calibrate with a representative
dataset; you can learn more from the examples under `examples/qualcomm/`. A minimal sketch follows.
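
For illustration only, a calibration pass with the QNN quantizer might look like the sketch below. The `QnnQuantizer` import path is an assumption based on this repo's `backends/qualcomm/quantizer` directory, and the capture step varies across PyTorch releases; the maintained end-to-end flow lives in the scripts under `examples/qualcomm/`.

```python
# Minimal PT2E calibration sketch (import paths and capture API assumed).
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torchvision.models import mobilenet_v2

from executorch.backends.qualcomm.quantizer.quantizer import QnnQuantizer

model = mobilenet_v2(weights=None).eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# The capture API has moved across PyTorch releases; adjust to your version.
graph_module = torch.export.export(model, example_inputs).module()
prepared = prepare_pt2e(graph_module, QnnQuantizer())

# Random tensors are placeholders only -- feed a representative calibration
# set here to get proper accuracy.
for _ in range(8):
    prepared(torch.randn(1, 3, 224, 224))

quantized = convert_pt2e(prepared)
```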


### Step 4: Model Inference

The backend relies on Qualcomm AI Engine Direct SDK libraries.

You might want to follow the docs in the Qualcomm AI Engine Direct SDK to set up the device environment,
or use the quick setup below for testing:

```bash
# make sure you have write permission on the path below.
DEVICE_DIR=/data/local/tmp/executorch_test/
adb shell "mkdir -p ${DEVICE_DIR}"
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV69Stub.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV73Stub.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV75Stub.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnSystem.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v69/unsigned/libQnnHtpV69Skel.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v73/unsigned/libQnnHtpV73Skel.so ${DEVICE_DIR}
adb push ${QNN_SDK_ROOT}/lib/hexagon-v75/unsigned/libQnnHtpV75Skel.so ${DEVICE_DIR}
```

We also need to tell the dynamic linkers on Android and Hexagon where to find these libraries
by setting `ADSP_LIBRARY_PATH` and `LD_LIBRARY_PATH`.

Then we can run `qnn_executor_runner`:
```bash
adb push mv2.pte ${DEVICE_DIR}
adb push ${EXECUTORCH_ROOT}/build_android/examples/qualcomm/qnn_executor_runner ${DEVICE_DIR}
adb shell "cd ${DEVICE_DIR} \
&& export LD_LIBRARY_PATH=${DEVICE_DIR} \
&& export ADSP_LIBRARY_PATH=${DEVICE_DIR} \
&& ./qnn_executor_runner --model_path ./mv2.pte"
```

You should see the following result.
Note that no output file will be generated in this example.
```
I 00:00:00.133366 executorch:qnn_executor_runner.cpp:156] Method loaded.
I 00:00:00.133590 executorch:util.h:104] input already initialized, refilling.
I 00:00:00.135162 executorch:qnn_executor_runner.cpp:161] Inputs prepared.
I 00:00:00.136768 executorch:qnn_executor_runner.cpp:278] Model executed successfully.
[INFO][Qnn ExecuTorch] Destroy Qnn backend parameters
[INFO][Qnn ExecuTorch] Destroy Qnn context
[INFO][Qnn ExecuTorch] Destroy Qnn device
[INFO][Qnn ExecuTorch] Destroy Qnn backend
```
44 changes: 31 additions & 13 deletions backends/qualcomm/tests/utils.py
@@ -231,25 +231,43 @@ def validate_profile():
qnn_sdk = os.environ.get("QNN_SDK_ROOT", None)
assert qnn_sdk, "QNN_SDK_ROOT was not found in environment variable"

build_path = "build_x86_64"
cmds = [
# export LD_LIBRARY_PATH to QNN_SDK_ROOT
f"export LD_LIBRARY_PATH={qnn_sdk}/lib/{target}/:{self.executorch_root}/{build_path}/lib && "
build_folder = self.build_folder
if os.path.isabs(self.build_folder):
# honor an absolute path as-is
pass
else:
# otherwise, resolve the path relative to the current working directory
build_folder = os.path.join(os.getcwd(), self.build_folder)

cmd = [
# qnn_executor_runner
f"{self.executorch_root}/{build_path}/examples/qualcomm/qnn_executor_runner",
f"--model_path {pte_fname}",
f"--input_list_path {tmp_dir}/input_list.txt",
f"--output_folder_path {output_dir}",
f"{build_folder}/examples/qualcomm/qnn_executor_runner",
"--model_path",
f"{pte_fname}",
"--input_list_path",
f"{tmp_dir}/input_list.txt",
"--output_folder_path",
f"{output_dir}",
]

subprocess.run(
" ".join(cmds),
shell=True,
executable="/bin/bash",
capture_output=True,
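# Copy the environment and point the dynamic linker at the QNN SDK and the freshly built libraries.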
env = dict(os.environ)
env["LD_LIBRARY_PATH"] = f"{qnn_sdk}/lib/{target}/:{build_folder}/lib"
proc = subprocess.run(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
env=env,
cwd=tmp_dir,
)

self.assertEqual(
proc.returncode,
0,
f"The process running qnn_executorch_runner return {proc.returncode}, "
"STDOUT=\n"
f"{proc.stdout.decode('utf-8')}",
)

# Verify the outputs
post_process()
self._assert_outputs_equal(outputs, ref_outputs)