From e5dab18f30d358b7cc6b9d0967f9618cfc4ad556 Mon Sep 17 00:00:00 2001
From: Chester Hu
Date: Wed, 25 Sep 2024 11:06:02 -0700
Subject: [PATCH] Demo app android xnnpack quick-fix for the bookmark link
 (#5642)

Summary:
Pull Request resolved: https://github.com/pytorch/executorch/pull/5642

quick fix for the in page link

Reviewed By: kirklandsign

Differential Revision: D63400245

fbshipit-source-id: 2fe6c71117851b22dd80654f9c19a2c3e0036a03
(cherry picked from commit 6e9efa1418d2803bb3fda58d175da2df5e867fb9)
---
 .../android/LlamaDemo/docs/delegates/xnnpack_README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md b/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md
index cb6193942ae..f9bcc3c7758 100644
--- a/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md
+++ b/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md
@@ -1,5 +1,7 @@
 # Building ExecuTorch Android Demo App for Llama running XNNPack
+
+**[UPDATE - 09/25]** We have added support for running [Llama 3.2 models](#for-llama-32-1b-and-3b-models) on the XNNPack backend. We currently support inference on their original data type (BFloat16). We have also added instructions to run [Llama Guard 1B models](#for-llama-guard-1b-models) on-device.
 
 This tutorial covers the end to end workflow for building an android demo app using CPU on device via XNNPack framework. More specifically, it covers:
 1. Export and quantization of Llama and Llava models against the XNNPack backend.