@robell robell commented Jul 14, 2025

This is a first version of a VGF runtime with support for simple VGF files containing inputs and outputs (no weights). It prepares the appropriate Vulkan structures and dispatches the workload following the normal backend delegate interfaces. It's intended to be extended to take advantage of the existing Vulkan delegate by replacing the basic object creation and by re-using the VgfRepr in the appropriate way, either in a "direct" Arm backend for testing and simple deployment, or integrated with the Vulkan backend to get good memory, sync and performance interop with existing Vulkan delegate operators.
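
The lifecycle described above (one-time setup from the serialized payload, then repeated dispatch) can be pictured with a toy model. This is only a sketch, not the PR's C++ code and not the ExecuTorch API: `ToyVgfBackend` and `ToyVgfRepr` are hypothetical names, the method names loosely mirror the usual `is_available` / `init` / `execute` / `destroy` shape of a backend delegate, and the "dispatch" is a plain Python element-wise add standing in for the add test kernel.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToyVgfRepr:
    """Hypothetical stand-in for a parsed VGF file: only the I/O names."""
    inputs: List[str]
    outputs: List[str]
    released: bool = False


class ToyVgfBackend:
    """Toy delegate showing the init -> execute -> destroy lifecycle."""

    def is_available(self) -> bool:
        # A real backend would probe for a suitable Vulkan driver here.
        return True

    def init(self, processed: Dict[str, List[str]]) -> ToyVgfRepr:
        # One-time setup: parse the payload and build long-lived state.
        return ToyVgfRepr(inputs=list(processed["inputs"]),
                          outputs=list(processed["outputs"]))

    def execute(self, handle: ToyVgfRepr,
                args: Dict[str, List[int]]) -> Dict[str, List[int]]:
        if handle.released:
            raise RuntimeError("handle was already destroyed")
        missing = [name for name in handle.inputs if name not in args]
        if missing:
            raise KeyError(f"missing inputs: {missing}")
        # Placeholder for the GPU dispatch: element-wise sum of all inputs.
        columns = zip(*(args[name] for name in handle.inputs))
        summed = [sum(column) for column in columns]
        return {name: list(summed) for name in handle.outputs}

    def destroy(self, handle: ToyVgfRepr) -> None:
        # Release whatever init() created; the handle must not be reused.
        handle.released = True


backend = ToyVgfBackend()
handle = backend.init({"inputs": ["a", "b"], "outputs": ["out"]})
result = backend.execute(handle, {"a": [1, 2, 3], "b": [10, 20, 30]})
backend.destroy(handle)
```

The split matters because setup work is paid once in `init`, while `execute` only binds inputs, dispatches, and reads back outputs.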

It re-uses the build setup (headers, volk, etc.) and vulkan_executor_runner, and has been tested on Linux only. Testing covered the simple S32 add kernel from the aot_arm_compiler, and a quantized and a non-quantized mv2.

It depends on a number of components which are not yet released, and the script for these is not included, as our third party dependencies are still evolving.

Details:

  • Minor build fix for the Vulkan runtime.
  • Bump Vulkan and volk headers to get the tensor and graph extensions.
  • First version of VGFBackend, dispatching on a Vulkan layer driver.
  • Will process the examples/models mv2 model and constants.

Change-Id: I1f278cb98872ae8c0675c72995f0249c038d07d8

Testing

This change currently requires internal dependencies while a few pieces are upstreamed. The following is reproducible for those with full access to the ML SDK for Vulkan (https://github.com/arm/ai-ml-sdk-model-converter).

# test models
python3 -m examples.arm.aot_arm_compiler -t vgf --delegate --model_name="add" -i ./out_add -o out_add.pte
python3 -m examples.arm.aot_arm_compiler -t vgf --delegate --model_name="mv2" -i ./out_mv2 -o out_mv2.pte

# quantized test models
python3 -m examples.arm.aot_arm_compiler -t vgf --delegate --quantize --model_name=add -i ./out_add_quant -o out_add_quant.pte
python3 -m examples.arm.aot_arm_compiler --model_name=mv2 --target=vgf --quantize --delegate -i ./out_mv2_quant -o out_mv2_quant.pte

# commands to execute them using the vulkan executor runner
./cmake-out/backends/vulkan/vulkan_executor_runner -model_path out_add.pte
./cmake-out/backends/vulkan/vulkan_executor_runner -model_path out_mv2.pte
./cmake-out/backends/vulkan/vulkan_executor_runner -model_path out_add_quant.pte
./cmake-out/backends/vulkan/vulkan_executor_runner -model_path out_mv2_quant.pte
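
The eight commands above follow one pattern (model name, optional `--quantize`, matching output names), so they can be generated rather than typed. The following is a minimal dry-run sketch that only builds and prints the command lines; the helper names `export_cmd`, `run_cmd` and `command_matrix` are hypothetical, while the flags and paths are copied from the commands above.

```python
import shlex
from typing import List, Sequence, Tuple

# Runner path as used in the commands above.
RUNNER = "./cmake-out/backends/vulkan/vulkan_executor_runner"


def _stem(model: str, quantize: bool) -> str:
    # Output naming used above: out_add, out_add_quant, ...
    return f"out_{model}_quant" if quantize else f"out_{model}"


def export_cmd(model: str, quantize: bool = False) -> List[str]:
    """Build the aot_arm_compiler invocation for one model."""
    stem = _stem(model, quantize)
    cmd = ["python3", "-m", "examples.arm.aot_arm_compiler",
           "-t", "vgf", "--delegate"]
    if quantize:
        cmd.append("--quantize")
    cmd += [f"--model_name={model}", "-i", f"./{stem}", "-o", f"{stem}.pte"]
    return cmd


def run_cmd(model: str, quantize: bool = False) -> List[str]:
    """Build the vulkan_executor_runner invocation for one exported model."""
    return [RUNNER, "-model_path", f"{_stem(model, quantize)}.pte"]


def command_matrix(
    models: Sequence[str] = ("add", "mv2"),
) -> List[Tuple[List[str], List[str]]]:
    """Return (export, run) pairs for every model/quantization combination."""
    return [(export_cmd(m, q), run_cmd(m, q))
            for q in (False, True) for m in models]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for export, run in command_matrix():
        print(shlex.join(export))
        print(shlex.join(run))
```

To actually execute a command, each list could be passed to `subprocess.run(cmd, check=True)`, which assumes the unreleased dependencies mentioned above are already installed.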

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

@robell robell added the "release notes: arm" label Jul 14, 2025

pytorch-bot bot commented Jul 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12426

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Unrelated Failure

As of commit 043be0b with merge base de0554d (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the "CLA Signed" label Jul 14, 2025
@robell robell added the "partner: arm" and "ciflow/trunk" labels Jul 14, 2025

@ArmRyan ArmRyan left a comment

Approved internally

@robell robell requested review from SS-JIA and removed request for SS-JIA July 14, 2025 14:22

robell commented Jul 14, 2025

The only failing case is an HTTP 429 error from HF. Unless we want to kick the tests off again, this just needs a review for the Vulkan backend build changes and the introduction of the EXECUTORCH_BUILD_VGF option. The VGF backend code has been reviewed internally, but comments are welcome of course. Builds/tests will be added once the final dependencies are upstream, which I hope will be within a few weeks.

@Sebastian-Larsson

@digantdesai Do you mind taking a look since this touches some files outside of the Arm backend?


@digantdesai digantdesai left a comment

Thanks @robell !

if(EXECUTORCH_BUILD_VULKAN)
add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/backends/vulkan)
endif()
if(EXECUTORCH_BUILD_VGF)

Add a readme under backends/arm?

We should do that! I will add one in a separate patch; I will have some updates for this soon.

@digantdesai

Please rebase and land once the CI is green (MMLU could be unrelated).

This is a first version of a VGF runtime with support for simple VGF
files containing inputs and outputs (no weights) and will prepare the
appropriate Vulkan structures and dispatch the workload following the
normal backend delegate interfaces. It's intended to be extended to
take advantage of the existing Vulkan delegate by replacing the basic
object creation, and by re-using the VgfRepr in the appropriate way in
either a "direct" Arm backend for testing and simple deployment, or
integrated with the Vulkan backend to have good memory, sync and
performance interop with existing Vulkan delegate operators.

It re-uses the build-setup (headers, volk, etc) and
vulkan_executor_runner and has been tested on linux only. This was on
the simple S32 add kernel from the aot_arm_compiler, and a quantized
and non-quantized mv2.

It depends on a number of components which are not yet released, and
the script for these is not included, as our third party dependencies
are still evolving.

Details:
 * Minor build fix for vulkan runtime.
 * Bump vulkan and volk headers to get tensor and graph extensions
 * First version of VGFBackend, dispatching on a vulkan layer driver
 * Will process the examples/models mv2 model and constants

Signed-off-by: Rob Elliott <robert.elliott@arm.com>
Change-Id: I1f278cb98872ae8c0675c72995f0249c038d07d8

robell commented Jul 28, 2025

I don't believe the failures are related to my changes.

The llama runner failure is present on trunk intermittently: b8fe100

The arm-backend test is for a path I've not modified, and its failure is also present intermittently on trunk: https://github.com/pytorch/executorch/actions/runs/16562716934/job/46835516691

The test-phi-3-mini failure is also present on trunk.


@Sebastian-Larsson Sebastian-Larsson left a comment

Unrelated CI failures. Approved

@Sebastian-Larsson Sebastian-Larsson merged commit 1c72e0e into pytorch:main Jul 28, 2025
196 of 199 checks passed