Skip to content

Conversation

@ArmRyan
Copy link
Collaborator

@ArmRyan ArmRyan commented Oct 22, 2025

These bindings should drastically increase serialization
performance, both speed wise and memory usage wise.

This patch also enables serialization of tosa flatbuffers
that are over 2GB in size.

  • Ensure each tosa operator has an attribute.
  • Replace TosaOp.Op() calls with Op enum
  • Convert tosa clamp data to numpy int8 to allow c++ byte serialization
  • Replace "import tosa_serializer.serializer" with "import tosa_serializer" for pybind
  • Remove serialize to json.
  • Remove Darwin support workaround in mlsdk dependencies
  • Update mlsdk to latest tag, but overwrite the tosa_mlir_translator to a version that supports offset buffers
  • Update tosa-tools to the monorepo
  • Update slice to clamp end and start dim. This avoids the object not having a number field from being too large or small.

Change-Id: Id03582b6b7b9eef296480f9ea3a4744a08b22603

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai

 These bindings should drastically increase serialization
 performance, both speed wise and memory usage wise.

 This patch also enables serialization of tosa flatbuffers
 that are over 2GB in size.

 - Ensure each tosa operator has an attribute.
 - Replace TosaOp.Op() calls with Op enum
 - Convert tosa clamp data to numpy int8 to allow c++ byte serialization
 - Replace "import tosa_serializer.serializer" with
   "import tosa_serializer" for pybind
 - Remove serialize to json.
 - Remove Darwin support workaround in mlsdk dependencies
 - Update mlsdk to latest tag, but overwrite the
   tosa_mlir_translator to a version that supports offset buffers
 - Update tosa-tools to the monorepo
 - Update slice to clamp end and start dim. This avoids the object
   not having a number field from being too large or small.

Change-Id: Id03582b6b7b9eef296480f9ea3a4744a08b22603
Signed-off-by: Ryan O'Shea <ryan.oshea3@arm.com>
@ArmRyan ArmRyan requested a review from digantdesai as a code owner October 22, 2025 14:02
@ArmRyan ArmRyan added partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm ciflow/trunk labels Oct 22, 2025
@pytorch-bot
Copy link

pytorch-bot bot commented Oct 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15356

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 2 Cancelled Jobs, 3 Unrelated Failures

As of commit f350ce4 with merge base 788ef2f (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 22, 2025
@ArmRyan ArmRyan added the release notes: arm Changes to the ARM backend delegate label Oct 22, 2025
@ArmRyan
Copy link
Collaborator Author

ArmRyan commented Oct 22, 2025

failures unrelated

Copy link
Collaborator

@robell robell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal review completed so we're happy with this.

@freddan80 freddan80 merged commit fdfeaa4 into pytorch:main Oct 23, 2025
290 of 313 checks passed
@digantdesai
Copy link
Contributor

This was a huge change (esp in terms of tosa serializer changes), it is breaking internal CI. Let me try to fwd fix. In the future if we are touching to many files, we should also pull internally and test (at least until we improve OSS testing for buck :().


pushd tosa-tools
git checkout 8468d041c50c6d806f3c1c18c66d7ef641e46580 # serialization lib pybindings
if [[ ! -d "reference_model" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess tosa/tosa-reference-model repo on gitlab.arm.com is dead (but its there) and long live tosa/tosa-tools?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah individual repos probably wont change anymore, but should stay online. tosa-tools is a monorepo and is the future of the tooling as far as I'm aware

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re. the tosa-serializer, IIUC we have both c++ (through pybind) and direct python APIs, i.e. tosa_serializer vs. _tosa_serializer why do we have two? Any differences? Are you going to deprecate the direct python ones over pybind ones for perf?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its mostly just a quick of the reference model atm, when you pip install reference model it installs old python api ( import serializer.tosa_serializer ), but if you pip install serializer you get c++ ( import tosa_serializer ). Im not sure what the tosa teams plans are, but i think the pure python api is likely going to be frozen where it is, it already lags behind the c++ anyway. For us in ET we have no plans to use the old python version anymore.

Theres not really any difference that I can see, the graphs should serialize to the same flatbuffer and have the same graph, I dont think either one has more or less features right now, but that could change

Copy link
Contributor

@digantdesai digantdesai Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I am having difficulty in buckify the C++ variant, and since this is breaking things internally, let me revert this for now, we can try to land this again later this week.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will likely impact quite a few follow on fixes which depend on this change if we have to revert this (and so a revert chain) - is there anything we can do to help get this through?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is, because of the travel etc. this is blocking a bunch of changes internally since we can't land this due to failures. Do you know which PRs are depending on this? Besides the import name change, this PR (on its own) seems quite independent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its mostly the large revert chain upstream and internally that we were hoping to avoid

fi

pushd tosa-tools
git checkout 8468d041c50c6d806f3c1c18c66d7ef641e46580 # serialization lib pybindings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any impact on BC esp on Vela 4.4.[12]? I see the CI is OK so should be low risk.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fine, tosa is supposed to be backwards compatible so unless we use new tosa features ( we dont atm ) then nothing should change for vela. The main changes are:
Pybind11 instead of pure python backend, so better perf when lowering. eg. Llama 3.2 1b in int8 goes from ~1h lowering on my laptop to like 3 minutes, and ram usage goes from over 30gb to ~9.
Support for >2GB flatbuffers, these modify the flatbuffer layout if it is >2GB, but there is a guard around ethos-u backend to try and block those from entering vela until its supported there

@ArmRyan
Copy link
Collaborator Author

ArmRyan commented Oct 24, 2025

This was a huge change (esp in terms of tosa serializer changes), it is breaking internal CI. Let me try to fwd fix. In the future if we are touching to many files, we should also pull internally and test (at least until we improve OSS testing for buck :().

It should just be as simple as a setup.sh run, front-end usage should be identical, its purely the internals of the backend that are api broken, but they shouldnt really be exposed so as long as the library exists there should be no change

GregoryComer added a commit to GregoryComer/executorch that referenced this pull request Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm release notes: arm Changes to the ARM backend delegate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants