Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Building Llama iOS Demo for XNNPack Backend

**[UPDATE - 09/25]** We have added support for running [Llama 3.2 models](#for-llama-3.2-1b-and-3b-models) on the XNNPack backend. We currently support inference on their original data type (BFloat16).
**[UPDATE - 09/25]** We have added support for running [Llama 3.2 models](#for-llama-32-1b-and-3b-models) on the XNNPack backend. We currently support inference on their original data type (BFloat16).

This tutorial covers the end to end workflow for building an iOS demo app using XNNPack backend on device.
More specifically, it covers:
Expand Down Expand Up @@ -49,7 +49,7 @@ sh examples/models/llama2/install_requirements.sh

### For Llama 3.2 1B and 3B models
We have supported BFloat16 as a data type on the XNNPack backend for Llama 3.2 1B/3B models.
* You can download original model weights for Llama through Meta official [website](https://llama.meta.com/), or via Huggingface (Link to specific 3.2 1B repo)
* You can download original model weights for Llama through Meta official [website](https://llama.meta.com/).
* For chat use-cases, download the instruct models instead of pretrained.
* Run “examples/models/llama2/install_requirements.sh” to install dependencies.
* The 1B model in BFloat16 format can run on mobile devices with 8GB RAM (iPhone 15 Pro and later). The 3B model will require 12GB+ RAM and hence will not fit on 8GB RAM phones.
Expand Down
Loading