From 292ae2f3c99f0209abf3b851dc36492161ea8274 Mon Sep 17 00:00:00 2001
From: Scott Roy
Date: Fri, 10 Oct 2025 17:26:11 -0700
Subject: [PATCH 1/3] up

---
 .../mps/mps-overview.md} | 63 +++++--------------
 1 file changed, 15 insertions(+), 48 deletions(-)
 rename docs/source/{backends-mps.md => backends/mps/mps-overview.md} (60%)

diff --git a/docs/source/backends-mps.md b/docs/source/backends/mps/mps-overview.md
similarity index 60%
rename from docs/source/backends-mps.md
rename to docs/source/backends/mps/mps-overview.md
index 184bd88e3a7..a2280defad5 100644
--- a/docs/source/backends-mps.md
+++ b/docs/source/backends/mps/mps-overview.md
@@ -1,55 +1,27 @@
 # MPS Backend
 
-In this tutorial we will walk you through the process of getting setup to build the MPS backend for ExecuTorch and running a simple model on it.
+The MPS delegate is ExecuTorch's solution for taking advantage of Apple's GPU for on-device ML, using the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc).
 
-The MPS backend device maps machine learning computational graphs and primitives on the [MPS Graph](https://developer.apple.com/documentation/metalperformanceshadersgraph/mpsgraph?language=objc) framework and tuned kernels provided by [MPS](https://developer.apple.com/documentation/metalperformanceshaders?language=objc).
+## Target Requirements
 
-::::{grid} 2
-:::{grid-item-card} What you will learn in this tutorial:
-:class-card: card-prerequisites
-* In this tutorial you will learn how to export [MobileNet V3](https://pytorch.org/vision/main/models/mobilenetv3.html) model to the MPS delegate.
-* You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS.
-:::
-:::{grid-item-card} Tutorials we recommend you complete before this:
-:class-card: card-prerequisites
-* [Introduction to ExecuTorch](intro-how-it-works.md)
-* [Getting Started](getting-started.md)
-* [Building ExecuTorch with CMake](using-executorch-building-from-source.md)
-* [ExecuTorch iOS Demo App](https://github.com/meta-pytorch/executorch-examples/tree/main/mv3/apple/ExecuTorchDemo)
-* [ExecuTorch LLM iOS Demo App](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/apple)
-:::
-::::
+Below are the minimum OS requirements on various hardware for running an MPS-delegated ExecuTorch model:
+- [macOS](https://developer.apple.com/macos) >= 12.4
+- [iOS](https://www.apple.com/ios) >= 15.4
 
+## Development Requirements
+To develop for the MPS backend, you need:
 
-## Prerequisites (Hardware and Software)
+- [Xcode](https://developer.apple.com/xcode/) >= 14.1
 
-In order to be able to successfully build and run a model using the MPS backend for ExecuTorch, you'll need the following hardware and software components:
+Before starting, make sure you install the Xcode Command Line Tools:
 
-### Hardware:
- - A [mac](https://www.apple.com/mac/) for tracing the model
-
-### Software:
-
-  - **Ahead of time** tracing:
-    - [macOS](https://www.apple.com/macos/) 12
-
-  - **Runtime**:
-    - [macOS](https://www.apple.com/macos/) >= 12.4
-    - [iOS](https://www.apple.com/ios) >= 15.4
-    - [Xcode](https://developer.apple.com/xcode/) >= 14.1
-
-## Setting up Developer Environment
-
-***Step 1.*** Complete the steps in [Getting Started](getting-started.md) to set up the ExecuTorch development environment.
-
-You will also need a local clone of the ExecuTorch repository. See [Building ExecuTorch from Source](using-executorch-building-from-source.html) for instructions. All commands in this document should be run from the executorch repository.
-
-## Build
+```bash
+xcode-select --install
+```
 
-### AOT (Ahead-of-time) Components
+## Using the MPS Backend
 
-**Compiling model for MPS delegate**:
-- In this step, you will generate a simple ExecuTorch program that lowers MobileNetV3 model to the MPS delegate. You'll then pass this Program (the `.pte` file) during the runtime to run it using the MPS backend.
+In this step, you will generate a simple ExecuTorch program that lowers the MobileNetV3 model to the MPS delegate. You'll then pass this program (the `.pte` file) to the runtime, which runs it using the MPS backend.
 
 ```bash
 cd executorch
@@ -121,7 +93,7 @@ python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_
 python3 -m devtools.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin
 ```
 
-## Deploying and Running on Device
+## Runtime Integration
 
 ***Step 1***. Create the ExecuTorch core and MPS delegate frameworks to link on iOS
 ```bash
@@ -146,8 +118,3 @@ From the same page, include the needed libraries for the MPS delegate:
 - `Metal.framework`
 
 In this tutorial, you have learned how to lower a model to the MPS delegate, build the mps_executor_runner and run a lowered model through the MPS delegate, or directly on device using the MPS delegate static library.
-
-
-## Frequently encountered errors and resolution.
-
-If you encountered any bugs or issues following this tutorial please file a bug/issue on the [ExecuTorch repository](https://github.com/pytorch/executorch/issues), with hashtag **#mps**.
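[Reviewer note] The overview rewritten in this patch drives the export through the bundled `mps_example` script. The same lowering can also be done directly with the ExecuTorch Python APIs; the sketch below is illustrative only. The `MPSPartitioner` import path and its `compile_specs` argument are assumptions based on the repository layout (verify against your checkout), and running it requires `torch`, `torchvision`, and the `executorch` package on macOS.

```python
# Illustrative sketch: lower MobileNetV3 to the MPS delegate and save a .pte.
# Assumption: MPSPartitioner lives under executorch.backends.apple.mps.partition
# and accepts a list of CompileSpec objects.
import torch
import torchvision.models as models

from executorch.backends.apple.mps.partition import MPSPartitioner
from executorch.exir import to_edge_transform_and_lower

model = models.mobilenet_v3_small(weights=None).eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# Capture the model as an exported program.
exported = torch.export.export(model, example_inputs)

# Partition MPS-supported subgraphs and lower them to the delegate.
executorch_program = to_edge_transform_and_lower(
    exported,
    partitioner=[MPSPartitioner(compile_specs=[])],
).to_executorch()

# The resulting .pte file is what the runtime loads on device.
with open("mv3_mps.pte", "wb") as f:
    f.write(executorch_program.buffer)
```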
From 240ff8b7268990ee9d7a0b2435ed67eaba972092 Mon Sep 17 00:00:00 2001 From: Scott Roy Date: Wed, 15 Oct 2025 16:52:58 -0700 Subject: [PATCH 2/3] update refs --- CONTRIBUTING.md | 4 +-- README-wheel.md | 2 +- backends/apple/coreml/README.md | 2 +- docs/source/backends-overview.md | 30 +++++++++---------- docs/source/ios-coreml.md | 2 +- docs/source/ios-mps.md | 2 +- docs/source/quantization-overview.md | 2 +- .../using-executorch-building-from-source.md | 2 +- docs/source/using-executorch-export.md | 4 +-- docs/source/using-executorch-ios.md | 2 +- 10 files changed, 26 insertions(+), 26 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 45e03bd36e1..a94f78cb9f2 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -24,8 +24,8 @@ For Apple, please refer to the [iOS documentation](docs/source/using-executorch- executorch ├── backends - Backend delegate implementations for various hardware targets. Each backend uses partitioner to split the graph into subgraphs that can be executed on specific hardware, quantizer to optimize model precision, and runtime components to execute the graph on target hardware. For details refer to the backend documentation and the Export and Lowering tutorial for more information. │ ├── apple - Apple-specific backends. -│ │ ├── coreml - CoreML backend for Apple devices. See doc. -│ │ └── mps - Metal Performance Shaders backend for Apple devices. See doc. +│ │ ├── coreml - CoreML backend for Apple devices. See doc. +│ │ └── mps - Metal Performance Shaders backend for Apple devices. See doc. │ ├── arm - ARM architecture backends. See doc. │ ├── cadence - Cadence-specific backends. See doc. │ ├── example - Example backend implementations. 
diff --git a/README-wheel.md b/README-wheel.md index 7ae9b0aa2e0..e20b447f96a 100644 --- a/README-wheel.md +++ b/README-wheel.md @@ -12,7 +12,7 @@ The prebuilt `executorch.runtime` module included in this package provides a way to run ExecuTorch `.pte` files, with some restrictions: * Only [core ATen operators](docs/source/ir-ops-set-definition.md) are linked into the prebuilt module * Only the [XNNPACK backend delegate](docs/source/backends-xnnpack.md) is linked into the prebuilt module. -* \[macOS only] [Core ML](docs/source/backends-coreml.md) and [MPS](docs/source/backends-mps.md) backend +* \[macOS only] [Core ML](docs/source/backends/coreml/coreml-overview.md) and [MPS](docs/source/backends/mps/mps-overview.md) backend are also linked into the prebuilt module. Please visit the [ExecuTorch website](https://pytorch.org/executorch) for diff --git a/backends/apple/coreml/README.md b/backends/apple/coreml/README.md index d063dfc8b71..d72f04da1a1 100644 --- a/backends/apple/coreml/README.md +++ b/backends/apple/coreml/README.md @@ -1,7 +1,7 @@ # ExecuTorch Core ML Delegate This subtree contains the Core ML Delegate implementation for ExecuTorch. -Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices. To learn how to use the CoreML delegate, see the [documentation](https://github.com/pytorch/executorch/blob/main/docs/source/backends-coreml.md). +Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices. To learn how to use the CoreML delegate, see the [documentation](https://github.com/pytorch/executorch/blob/main/docs/source/backends/coreml/coreml-overview.md). ## Layout - `compiler/` : Lowers a module to Core ML backend. 
diff --git a/docs/source/backends-overview.md b/docs/source/backends-overview.md index bfa17bc9a9c..dfeb6243d37 100644 --- a/docs/source/backends-overview.md +++ b/docs/source/backends-overview.md @@ -18,20 +18,20 @@ Backends are the bridge between your exported model and the hardware it runs on. ## Choosing a Backend -| Backend | Platform(s) | Hardware Type | Typical Use Case | -|------------------------------------------------|---------------------|---------------|---------------------------------| -| [XNNPACK](backends-xnnpack) | All | CPU | General-purpose, fallback | -| [Core ML](/backends/coreml/coreml-overview.md) | iOS, macOS | NPU/GPU/CPU | Apple devices, high performance | -| [Metal Performance Shaders](backends-mps) | iOS, macOS | GPU | Apple GPU acceleration | -| [Vulkan ](backends-vulkan) | Android | GPU | Android GPU acceleration | -| [Qualcomm](backends-qualcomm) | Android | NPU | Qualcomm SoCs | -| [MediaTek](backends-mediatek) | Android | NPU | MediaTek SoCs | -| [ARM EthosU](backends-arm-ethos-u) | Embedded | NPU | ARM MCUs | -| [ARM VGF](backends-arm-vgf) | Android | NPU | ARM platforms | -| [OpenVINO](build-run-openvino) | Embedded | CPU/GPU/NPU | Intel SoCs | -| [NXP](backends-nxp) | Embedded | NPU | NXP SoCs | -| [Cadence](backends-cadence) | Embedded | DSP | DSP-optimized workloads | -| [Samsung Exynos](backends-samsung-exynos) | Android | NPU | Samsung SoCs | +| Backend | Platform(s) | Hardware Type | Typical Use Case | +|-----------------------------------------------------------------|---------------------|---------------|---------------------------------| +| [XNNPACK](backends-xnnpack) | All | CPU | General-purpose, fallback | +| [Core ML](/backends/coreml/coreml-overview.md) | iOS, macOS | NPU/GPU/CPU | Apple devices, high performance | +| [Metal Performance Shaders](/backends/mps/mps-overview.md) | iOS, macOS | GPU | Apple GPU acceleration | +| [Vulkan ](backends-vulkan) | Android | GPU | Android GPU acceleration | +| 
[Qualcomm](backends-qualcomm) | Android | NPU | Qualcomm SoCs | +| [MediaTek](backends-mediatek) | Android | NPU | MediaTek SoCs | +| [ARM EthosU](backends-arm-ethos-u) | Embedded | NPU | ARM MCUs | +| [ARM VGF](backends-arm-vgf) | Android | NPU | ARM platforms | +| [OpenVINO](build-run-openvino) | Embedded | CPU/GPU/NPU | Intel SoCs | +| [NXP](backends-nxp) | Embedded | NPU | NXP SoCs | +| [Cadence](backends-cadence) | Embedded | DSP | DSP-optimized workloads | +| [Samsung Exynos](backends-samsung-exynos) | Android | NPU | Samsung SoCs | **Tip:** For best performance, export a `.pte` file for each backend you plan to support. @@ -52,7 +52,7 @@ Backends are the bridge between your exported model and the hardware it runs on. backends-xnnpack backends/coreml/coreml-overview -backends-mps +backends/mps/mps-overview backends-vulkan backends-qualcomm backends-mediatek diff --git a/docs/source/ios-coreml.md b/docs/source/ios-coreml.md index 48271326d87..ff6551aa0c2 100644 --- a/docs/source/ios-coreml.md +++ b/docs/source/ios-coreml.md @@ -1 +1 @@ -```{include} backends-coreml.md +```{include} backends/coreml/coreml-overview.md diff --git a/docs/source/ios-mps.md b/docs/source/ios-mps.md index d6f305d33aa..13717675ba5 100644 --- a/docs/source/ios-mps.md +++ b/docs/source/ios-mps.md @@ -1 +1 @@ -```{include} backends-mps.md +```{include} backends/mps/mps-overview.md diff --git a/docs/source/quantization-overview.md b/docs/source/quantization-overview.md index 4ff8d34a4a8..4ac886b9ed2 100644 --- a/docs/source/quantization-overview.md +++ b/docs/source/quantization-overview.md @@ -29,7 +29,7 @@ These quantizers usually support configs that allow users to specify quantizatio Not all quantization options are supported by all backends. 
Consult backend-specific guides for supported quantization modes and configuration, and how to initialize the backend-specific PT2E quantizer: * [XNNPACK quantization](backends-xnnpack.md#quantization) -* [CoreML quantization](backends-coreml.md#quantization) +* [CoreML quantization](backends/coreml/coreml-quantization.md) * [QNN quantization](backends-qualcomm.md#step-2-optional-quantize-your-model) diff --git a/docs/source/using-executorch-building-from-source.md b/docs/source/using-executorch-building-from-source.md index d48f9d26db7..39600adad83 100644 --- a/docs/source/using-executorch-building-from-source.md +++ b/docs/source/using-executorch-building-from-source.md @@ -385,7 +385,7 @@ xcode-select --install ``` Run the above command with `--help` flag to learn more on how to build additional backends -(like [Core ML](backends-coreml.md), [MPS](backends-mps.md) or XNNPACK), etc. +(like [Core ML](backends/coreml/coreml-overview.md), [MPS](backends/mps/mps-overview.md) or XNNPACK), etc. Note that some backends may require additional dependencies and certain versions of Xcode and iOS. See backend-specific documentation for more details. diff --git a/docs/source/using-executorch-export.md b/docs/source/using-executorch-export.md index 7abf5cbd30a..f0ad7c18467 100644 --- a/docs/source/using-executorch-export.md +++ b/docs/source/using-executorch-export.md @@ -33,8 +33,8 @@ As part of the .pte file creation process, ExecuTorch identifies portions of the Commonly used hardware backends are listed below. For mobile, consider using XNNPACK for Android and XNNPACK or Core ML for iOS. To create a .pte file for a specific backend, pass the appropriate partitioner class to `to_edge_transform_and_lower`. See the appropriate backend documentation and the [Export and Lowering](#export-and-lowering) section below for more information. 
- [XNNPACK (CPU)](backends-xnnpack.md) -- [Core ML (iOS)](backends-coreml.md) -- [Metal Performance Shaders (iOS GPU)](backends-mps.md) +- [Core ML (iOS)](backends/coreml/coreml-overview.md) +- [Metal Performance Shaders (iOS GPU)](backends/mps/mps-overview.md) - [Vulkan (Android GPU)](backends-vulkan.md) - [Qualcomm NPU](backends-qualcomm.md) - [MediaTek NPU](backends-mediatek.md) diff --git a/docs/source/using-executorch-ios.md b/docs/source/using-executorch-ios.md index 3e12f174177..b77d5d1b252 100644 --- a/docs/source/using-executorch-ios.md +++ b/docs/source/using-executorch-ios.md @@ -107,7 +107,7 @@ git clone -b viable/strict https://github.com/pytorch/executorch.git --depth 1 - python3 -m venv .venv && source .venv/bin/activate && pip install --upgrade pip ``` -4. Install the required dependencies, including those needed for the backends like [Core ML](backends-coreml.md) or [MPS](backends-mps.md), if you plan to build them later: +4. Install the required dependencies, including those needed for the backends like [Core ML](backends/coreml/coreml-overview.md) or [MPS](backends/mps/mps-overview.md), if you plan to build them later: ```bash ./install_requirements.sh From bd23c93328cd5f3c8dc3d2cd9d7b420b6dbc4406 Mon Sep 17 00:00:00 2001 From: Scott Roy Date: Wed, 15 Oct 2025 17:06:18 -0700 Subject: [PATCH 3/3] up --- CONTRIBUTING.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index a94f78cb9f2..d53e5a89c94 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -25,7 +25,7 @@ executorch ├── backends - Backend delegate implementations for various hardware targets. Each backend uses partitioner to split the graph into subgraphs that can be executed on specific hardware, quantizer to optimize model precision, and runtime components to execute the graph on target hardware. For details refer to the backend documentation and the Export and Lowering tutorial for more information. 
│ ├── apple - Apple-specific backends. │ │ ├── coreml - CoreML backend for Apple devices. See doc. -│ │ └── mps - Metal Performance Shaders backend for Apple devices. See doc. +│ │ └── mps - Metal Performance Shaders backend for Apple devices. See doc. │ ├── arm - ARM architecture backends. See doc. │ ├── cadence - Cadence-specific backends. See doc. │ ├── example - Example backend implementations.
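[Reviewer note] Patches 2 and 3 above update cross-references to the moved pages one at a time. Most of the changes follow a mechanical old-path to new-path mapping (a few, like the CoreML quantization link, are redirected by hand to a new dedicated page), so the bulk of such a sweep can be scripted. A minimal illustrative sketch, not part of the repository; `PATH_MAP` mirrors the renames in this series:

```python
import re

# Old doc paths -> new locations (mirrors the renames in this patch series).
PATH_MAP = {
    "backends-mps.md": "backends/mps/mps-overview.md",
    "backends-coreml.md": "backends/coreml/coreml-overview.md",
}

# Matches markdown links, splitting the target from an optional #anchor.
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)#]+)(#[^)]*)?\)")

def rewrite_links(markdown: str) -> str:
    """Rewrite markdown link targets according to PATH_MAP, keeping anchors."""
    def repl(m: re.Match) -> str:
        text, target, anchor = m.group(1), m.group(2), m.group(3) or ""
        return f"[{text}]({PATH_MAP.get(target, target)}{anchor})"
    return LINK_RE.sub(repl, markdown)

print(rewrite_links("See [MPS](backends-mps.md) and [XNNPACK](backends-xnnpack.md)."))
# -> See [MPS](backends/mps/mps-overview.md) and [XNNPACK](backends-xnnpack.md).
```

Links whose targets are not in the map pass through unchanged, so hand-redirected cases (such as anchors that now point at a different page) still need individual review.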