@digantdesai commented Sep 2, 2025

Summary

  • Enable the QP8 dqlinear paths at runtime.
  • Enable the XNNPACK Arm SME2 ISA build flag (XNNPACK_ENABLE_ARM_SME is already ON).

Benchmark and SME2 data

  • Informal benchmarking on an M4 using
    executor_runner --model_path vit_xnnpack_q8.pte --num_executions 100
    with and without the SME2 flag shows that the SME2 build is consistently faster.
  • Verified through instrumentation on the M4 that the Kleidi SME2 kernel is used.
  • LLM packing performance still needs to be validated with QP8.
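
For anyone who wants to repeat the with/without comparison above, here is a minimal sketch of a timing harness. It is not part of this PR. The helper names (time_command, speedup, runner_cmd, compare_builds) and the two build directories in the usage note are hypothetical; only the executor_runner flags and the model file name come from the description above. It measures whole-process wall-clock time, so model load and process startup are included in every sample.

```python
import statistics
import subprocess
import time
from typing import Sequence


def runner_cmd(binary: str,
               model: str = "vit_xnnpack_q8.pte",
               executions: int = 100) -> list[str]:
    """Build the executor_runner command line used in the PR description."""
    return [binary, "--model_path", model, "--num_executions", str(executions)]


def time_command(cmd: Sequence[str], repeats: int = 5) -> list[float]:
    """Run cmd `repeats` times; return wall-clock seconds for each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the runner exits with a non-zero status.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def speedup(baseline: Sequence[float], candidate: Sequence[float]) -> float:
    """Median baseline time divided by median candidate time (>1 is faster)."""
    return statistics.median(baseline) / statistics.median(candidate)


def compare_builds(baseline_bin: str, candidate_bin: str,
                   repeats: int = 5) -> float:
    """Time both binaries on the same model and return the speedup ratio."""
    base = time_command(runner_cmd(baseline_bin), repeats)
    cand = time_command(runner_cmd(candidate_bin), repeats)
    return speedup(base, cand)
```

Assuming two builds in hypothetical directories, compare_builds("./build-no-sme2/executor_runner", "./build-sme2/executor_runner") returns a ratio above 1.0 when the SME2 build is faster. Medians are used so that a single slow run does not dominate the result.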

@meta-cla bot added the CLA Signed label Sep 2, 2025

pytorch-bot bot commented Sep 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13887

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 3 Unrelated Failures

As of commit b0942ee with merge base c393d17:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@mergennachin added the release notes: xnnpack label Sep 3, 2025

@mergennachin left a comment

Godspeed!

@digantdesai merged commit 932818c into main Sep 3, 2025
242 of 247 checks passed
@digantdesai deleted the kleidi_sme2 branch September 3, 2025 19:45
kirklandsign added a commit that referenced this pull request Sep 24, 2025
Labels: ciflow/trunk, CLA Signed, release notes: xnnpack