Merged
9 changes: 9 additions & 0 deletions .github/workflows/android-perf.yml
@@ -206,6 +206,10 @@ jobs:
name: build-llm-demo
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
needs: set-parameters
strategy:
matrix:
delegate: ${{ fromJson(needs.set-parameters.outputs.delegates) }}
fail-fast: false
with:
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12-android
@@ -222,6 +226,11 @@ jobs:
PYTHON_EXECUTABLE=python bash .ci/scripts/setup-linux.sh cmake
export ARTIFACTS_DIR_NAME=artifacts-to-be-uploaded

if [[ ${{ matrix.delegate }} == "qnn" ]]; then
Contributor

I just noticed that the job fails: https://github.com/pytorch/executorch/actions/runs/10713455099/job/29705837269#step:12:6342. You will need to define the matrix strategy for this job with:

strategy:
  matrix:
    delegate: ${{ fromJson(needs.set-parameters.outputs.delegates) }}
  fail-fast: false
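
A minimal sketch of the per-job check that the matrix value feeds, runnable outside GitHub Actions. The helper name `needs_qnn_setup` and the example delegate values are hypothetical. In the real workflow the value is substituted for `${{ matrix.delegate }}` before the script runs. The empty string stands in for a job that defines no matrix, where the expression presumably expands to nothing and an unquoted `[[ ... ]]` test would be left without its left operand.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow: succeeds only for "qnn".
# Quoting the value keeps the test well-formed even when it is empty.
needs_qnn_setup() {
  local delegate="${1:-}"
  [[ "${delegate}" == "qnn" ]]
}

# Assumed example values; the real list comes from the set-parameters job.
# The empty string stands in for a job that defines no matrix.
for delegate in xnnpack qnn ""; do
  if needs_qnn_setup "${delegate}"; then
    echo "'${delegate}': run QNN setup"
  else
    echo "'${delegate}': skip QNN setup"
  fi
done
```

Assigning the substituted value to a quoted shell variable first, as in this sketch, is one way to make the step fail softly rather than with a syntax error if the matrix is ever missing.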

Contributor Author

Thank you!

Contributor

I just realized that we're building backend-specific apps in the Android benchmark workflow, and it seems the change was introduced here. @kirklandsign @huydhn I think we should not build a different app for each backend, for two reasons:

  1. It complicates the benchmarking workflow, because you have to add extra logic to match each model with its app. Since we are benchmarking the perf of a model rather than the app, app size is not a concern here.
  2. Delegating to hybrid backends will be supported for better perf at some point, though nobody is actively working on it today. Ideally, ops that QNN does not support should fall back to XNNPACK instead of Portable; otherwise a QNN-delegated model may run slower than a pure CPU model with XNNPACK. Our stack should already support building an app with multiple backends/delegates.
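
The build side of the single-app idea can be sketched as one function that assembles the CMake backend flags for a combined build. This is only an illustration: `backend_cmake_flags` and the `/opt/qnn` path are hypothetical, `-DEXECUTORCH_BUILD_XNNPACK=ON` is assumed to be the XNNPACK switch (it does not appear in this diff), and the QNN flags mirror the ones added to `build/build_android_llm_demo.sh`. It does not address op fallback, which is decided when the model is exported rather than when the app is built.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the backend-related CMake flags for one
# multi-backend build, one flag per line.
backend_cmake_flags() {
  local qnn_sdk_root="${1:-}"
  # Assumption: XNNPACK is always enabled so a CPU backend is present.
  local flags=("-DEXECUTORCH_BUILD_XNNPACK=ON")
  if [[ -n "${qnn_sdk_root}" ]]; then
    # QNN is added on top of XNNPACK when an SDK path is supplied.
    flags+=("-DEXECUTORCH_BUILD_QNN=ON" "-DQNN_SDK_ROOT=${qnn_sdk_root}")
  else
    flags+=("-DEXECUTORCH_BUILD_QNN=OFF")
  fi
  printf '%s\n' "${flags[@]}"
}

# Hypothetical SDK location, for illustration only.
backend_cmake_flags "/opt/qnn"
```

With this shape the app flavor depends only on whether the SDK is available on the runner, not on which delegate a given matrix job is benchmarking.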

Contributor Author

Right now we build the XNNPACK+QNN flavor when the matrix has qnn: https://github.com/pytorch/executorch/blob/main/.github/workflows/android-perf.yml#L147-L150

We could also build a single flavor (XNNPACK+QNN+other backends) regardless of the matrix.

Contributor Author

The catch with always building QNN is that we need to run

PYTHON_EXECUTABLE=python bash .ci/scripts/setup-qnn-deps.sh
PYTHON_EXECUTABLE=python bash .ci/scripts/build-qnn-sdk.sh

all the time. That probably costs some time and stability. Ideally we would have a Docker image.
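
One way to picture the Docker-image idea is a guard that reuses an SDK already present on the runner and only falls back to the setup scripts when it is missing. This is a sketch under assumptions, not the workflow's behavior: `ensure_qnn_sdk` is a hypothetical name, and treating an existing `QNN_SDK_ROOT` directory as "SDK already installed" is an assumption about how a prebuilt image would expose it.

```shell
#!/usr/bin/env bash
# Hypothetical guard: report whether the QNN SDK can be reused or must be
# set up. Prints "reuse" or "setup".
ensure_qnn_sdk() {
  local sdk_root="${QNN_SDK_ROOT:-}"
  # Assumption: a prebuilt image exports QNN_SDK_ROOT pointing at a real directory.
  if [[ -n "${sdk_root}" && -d "${sdk_root}" ]]; then
    echo "reuse"
    return 0
  fi
  echo "setup"
  # In the workflow, this is where the two scripts quoted above would run:
  #   PYTHON_EXECUTABLE=python bash .ci/scripts/setup-qnn-deps.sh
  #   PYTHON_EXECUTABLE=python bash .ci/scripts/build-qnn-sdk.sh
}

ensure_qnn_sdk
```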

PYTHON_EXECUTABLE=python bash .ci/scripts/setup-qnn-deps.sh
PYTHON_EXECUTABLE=python bash .ci/scripts/build-qnn-sdk.sh
fi

# TODO: This needs to be replaced with a generic loader .apk
# Build LLM Demo for Android
bash build/build_android_llm_demo.sh ${ARTIFACTS_DIR_NAME}
9 changes: 9 additions & 0 deletions build/build_android_llm_demo.sh
@@ -19,6 +19,13 @@ build_android_native_library() {
ANDROID_ABI="$1"
ANDROID_NDK="${ANDROID_NDK:-/opt/ndk}"
CMAKE_OUT="cmake-out-android-${ANDROID_ABI}"
QNN_SDK_ROOT="${QNN_SDK_ROOT:-}"
if [ -n "$QNN_SDK_ROOT" ]; then
EXECUTORCH_BUILD_QNN=ON
else
EXECUTORCH_BUILD_QNN=OFF
fi


cmake . -DCMAKE_INSTALL_PREFIX="${CMAKE_OUT}" \
-DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake" \
@@ -34,6 +41,8 @@ build_android_native_library() {
-DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
-DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
-DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
-DEXECUTORCH_BUILD_QNN="${EXECUTORCH_BUILD_QNN}" \
-DQNN_SDK_ROOT="${QNN_SDK_ROOT}" \
-DCMAKE_BUILD_TYPE=Release \
-B"${CMAKE_OUT}"
