pytorch · cmodi-meta · Sep 25, 2024 · Sep 25, 2024
@@ -1,6 +1,6 @@
 # Building Llama iOS Demo for XNNPack Backend
 
-**[UPDATE - 09/25]** We have added support for running [Llama 3.2 models](#for-llama-3.2-1b-and-3b-models) on the XNNPack backend. We currently support inference on their original data type (BFloat16).
+**[UPDATE - 09/25]** We have added support for running [Llama 3.2 models](#for-llama-32-1b-and-3b-models) on the XNNPack backend. We currently support inference on their original data type (BFloat16).
 
 This tutorial covers the end-to-end workflow for building an iOS demo app that uses the XNNPack backend on device.
 More specifically, it covers:
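The anchor change in the hunk above is consistent with how GitHub appears to derive heading anchors: the heading text is lowercased, punctuation such as `.` is dropped, and spaces become hyphens, so `3.2` turns into `32`. A minimal sketch of that rule follows; the function name `github_anchor` is hypothetical, and the regex is an approximation of GitHub's behaviour rather than its actual implementation.

```python
import re


def github_anchor(heading: str) -> str:
    """Approximate the anchor GitHub generates for a Markdown heading.

    Assumption: lowercase the text, drop every character that is not a
    word character, hyphen, or space, then turn spaces into hyphens.
    """
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # removes punctuation such as "."
    return slug.replace(" ", "-")


# Heading text taken from the README section this diff links to.
print(github_anchor("For Llama 3.2 1B and 3B models"))
```

Under these assumptions the printed anchor matches the corrected link target, `for-llama-32-1b-and-3b-models`, whereas the old target kept the dot and so did not resolve.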
@@ -49,7 +49,7 @@ sh examples/models/llama2/install_requirements.sh
 
 ### For Llama 3.2 1B and 3B models
 The XNNPack backend supports BFloat16 as a data type for Llama 3.2 1B/3B models.
-* You can download original model weights for Llama through Meta official [website](https://llama.meta.com/), or via Huggingface (Link to specific 3.2 1B repo)
+* You can download the original Llama model weights from Meta's official [website](https://llama.meta.com/).
 * For chat use-cases, download the instruct models instead of pretrained.
 * Run `examples/models/llama2/install_requirements.sh` to install dependencies.
 * The 1B model in BFloat16 format can run on mobile devices with 8GB RAM (iPhone 15 Pro and later). The 3B model requires 12GB+ RAM and therefore does not fit on phones with 8GB RAM.
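
The RAM guidance in the last bullet can be sanity-checked with a rough estimate of the weight footprint. The sketch below assumes 2 bytes per BFloat16 value and uses the nominal parameter counts implied by the model names (1e9 and 3e9); the real checkpoints differ somewhat, and the helper name `bf16_weight_gib` is hypothetical.

```python
BYTES_PER_BF16 = 2  # BFloat16 stores each value in 16 bits


def bf16_weight_gib(num_params: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return num_params * BYTES_PER_BF16 / 2**30


# Assumption: nominal parameter counts taken from the model names.
for name, params in (("1B", 1e9), ("3B", 3e9)):
    print(f"Llama 3.2 {name}: ~{bf16_weight_gib(params):.1f} GiB of weights")
```

This counts weights only. The KV cache, activations, runtime overhead, and whatever share of device memory the OS grants a single app all come on top, which is presumably why the 3B model needs considerably more headroom than its raw weight size suggests.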