Conversation

neuropilot-captain (Collaborator)

Summary

  1. Support weight sharing via the compile spec
  2. Add weight sharing support to the llama export script and runner
  3. Optimize llama performance


pytorch-bot bot commented Sep 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13941

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit 6b30094 with merge base b02db12:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Sep 4, 2025

github-actions bot commented Sep 4, 2025

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

cccclai (Contributor) commented Sep 4, 2025

@neuropilot-captain I think the PR needs to be rebased

cccclai (Contributor) commented Sep 5, 2025

There is still a lint error... can you fix it?

cccclai merged commit a90e907 into pytorch:main on Sep 8, 2025
113 of 116 checks passed
Comment on lines +156 to +157
mPlannedBuffers.push_back(std::make_unique<uint8_t[]>(buffer_size));
mPlannedSpans.push_back({mPlannedBuffers.back().get(), buffer_size});
nit: use emplace_back

ET_LOG(Debug, "Setting up planned buffer %zu, size %zu.", id, buffer_size);
planned_buffers.push_back(std::make_unique<uint8_t[]>(buffer_size));
planned_spans.push_back({planned_buffers.back().get(), buffer_size});
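
The nit is a small mechanical change; a sketch of what it would look like, assuming mPlannedBuffers is a std::vector<std::unique_ptr<uint8_t[]>> and mPlannedSpans holds a span type constructible from a pointer and a length (only the container names come from the diff above):

// In-place construction avoids the braced temporary on the span line;
// for the unique_ptr line the two spellings are equivalent.
mPlannedBuffers.emplace_back(std::make_unique<uint8_t[]>(buffer_size));
mPlannedSpans.emplace_back(mPlannedBuffers.back().get(), buffer_size);
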
auto modelInstance = new ModelInstance(modelPath);
use unique_ptr; this will leak if anything in here throws
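
A sketch of the suggested fix, assuming ModelInstance is constructible from modelPath as shown above; where ownership ultimately has to live is not visible in this excerpt, so the consumer call below is hypothetical:

#include <memory>

// RAII: the allocation is released automatically if anything later throws.
auto modelInstance = std::make_unique<ModelInstance>(modelPath);
// ... initialization that may throw ...
// If a downstream API must own the object, transfer ownership explicitly:
// takeOwnership(std::move(modelInstance));  // hypothetical consumer
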
