lucylq
Contributor

@lucylq lucylq commented Sep 15, 2025

Summary:
An error occurs when loading older PTE files whose extended header size is 24 bytes.

When we call 'from_bytes', we expect a header size of 32 bytes, now that the segment_data_size field has been added.

This is BC on the C++ side because we check against a minimum length.
Add the same minimum length to Python to make the change BC.

Differential Revision: D82492169
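
To make the idea concrete, here is a minimal sketch of a parser that accepts any header at least as long as the old layout. It is not the code from this PR: `HeaderSketch`, `parse_header`, and the `program_size` / `segment_base_offset` fields are assumptions for illustration. Only the `eh00` magic, the little-endian uint32 length field, the 24- and 32-byte sizes, and `segment_data_size` come from this thread.

```python
import struct
from dataclasses import dataclass

MAGIC = b"eh00"
MIN_LENGTH = 24      # older headers, without segment_data_size
CURRENT_LENGTH = 32  # newer headers, with segment_data_size


@dataclass
class HeaderSketch:
    """Hypothetical container; field names other than segment_data_size are assumed."""
    length: int
    program_size: int
    segment_base_offset: int
    segment_data_size: int  # reported as 0 when the field is absent


def parse_header(data: bytes) -> HeaderSketch:
    if len(data) < MIN_LENGTH:
        raise ValueError(f"need at least {MIN_LENGTH} bytes, got {len(data)}")
    # "<" means little-endian with no padding: 4 + 4 + 8 + 8 = 24 bytes.
    magic, length, program_size, segment_base_offset = struct.unpack_from(
        "<4sIQQ", data, 0
    )
    if magic != MAGIC:
        raise ValueError(f"bad magic {magic!r}")
    # Compare against a minimum, not an exact size, so old files still load.
    if length < MIN_LENGTH:
        raise ValueError(f"header length {length} is below {MIN_LENGTH}")
    segment_data_size = 0
    if length >= CURRENT_LENGTH:
        if len(data) < CURRENT_LENGTH:
            raise ValueError("header claims 32 bytes but the data is truncated")
        (segment_data_size,) = struct.unpack_from("<Q", data, MIN_LENGTH)
    return HeaderSketch(length, program_size, segment_base_offset, segment_data_size)
```

With this shape, a 24-byte header parses with `segment_data_size` left at 0, and a 32-byte header also yields the extra field.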

pytorch-bot bot commented Sep 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14320

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a9baded with merge base 4a4f5a0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Sep 15, 2025
@facebook-github-bot
Contributor

@lucylq has exported this pull request. If you are a Meta employee, you can view the originating diff in D82492169.

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

# Magic bytes
b"eh00"
# uint32_t header size (little endian)
# uint32_t header size (little endian) --> 32
Contributor

? what's this

Contributor

oh, it's just the number 32 in little endian, in hex

Contributor Author

The header size (0x20) is 32 bytes. This is the new header, with segment_data_size.

Below is the old one (0x18), 24 bytes.
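
The two hex values can be checked with Python's `struct` module. This snippet only illustrates the encoding of the size field and is not part of the PR.

```python
import struct

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
new_size_field = struct.pack("<I", 32)  # header with segment_data_size
old_size_field = struct.pack("<I", 24)  # header without it

print(new_size_field.hex(" "))  # 20 00 00 00
print(old_size_field.hex(" "))  # 18 00 00 00
```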


# To find the header, callers should provide at least this many bytes of
# the head of the serialized Program data.
NUM_HEAD_BYTES: ClassVar[int] = 64
Contributor

@JacobSzwejbka JacobSzwejbka Sep 15, 2025

Is 64 just a random number with some leeway for future expansion?

Contributor Author

This is in sync with the C++ kNumHeadBytes; I think the value 64 is arbitrary leeway.
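
As a rough sketch of how a caller might use that constant: read the first 64 bytes of the serialized program, then look for the magic inside them. `read_head` and `has_extended_header` are hypothetical helpers, and `HEADER_OFFSET = 8` is an assumption about where the extended header sits in the file. Only `NUM_HEAD_BYTES = 64` and the `eh00` magic appear in this thread. Rejecting shorter inputs is a choice made for the sketch.

```python
import io
from typing import BinaryIO

NUM_HEAD_BYTES = 64  # value quoted in the snippet under review
HEADER_OFFSET = 8    # assumption: header follows the first 8 bytes of the file
MAGIC = b"eh00"


def read_head(stream: BinaryIO) -> bytes:
    """Return the first NUM_HEAD_BYTES of the stream, or raise if it is shorter."""
    head = stream.read(NUM_HEAD_BYTES)
    if len(head) < NUM_HEAD_BYTES:
        raise ValueError(f"need {NUM_HEAD_BYTES} bytes, got {len(head)}")
    return head


def has_extended_header(head: bytes) -> bool:
    """Check for the magic at the assumed offset within the head bytes."""
    return head[HEADER_OFFSET:HEADER_OFFSET + len(MAGIC)] == MAGIC


# Synthetic data standing in for a serialized program.
fake_program = b"\x00" * HEADER_OFFSET + MAGIC + b"\x00" * 100
head = read_head(io.BytesIO(fake_program))
print(len(head), has_extended_header(head))
```

Since 64 bytes is larger than the assumed offset plus either header size (8 + 32 = 40), the same head buffer would cover both the old and the new layout with room to spare.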

lucylq added a commit to lucylq/executorch-1 that referenced this pull request Sep 15, 2025
Summary:
pytorch#14320

Error happening when we have older PTE files with extended header size 24. 

When we call 'from_bytes', we expect header size 32 after adding segment_data_size field in D81938296

This is BC on C++ side because we have a minimum length.
Add minimum length to python to make the change BC.

Reviewed By: JacobSzwejbka

Differential Revision: D82492169

@mergennachin mergennachin added this to the 1.0.0 milestone Sep 15, 2025
@mergennachin
Contributor

Hi @lucylq -- this is concerning.

Is the extended header a BC-breaking change for the C++ runtime? We shouldn't make any BC-breaking changes. In other words, older .PTE files should be runnable with the new runtime.

This PR seems to only mitigate the Python side -- it doesn't ensure that older PTE files are still loadable.

@lucylq
Contributor Author

lucylq commented Sep 16, 2025

> Hi @lucylq -- this is concerning.
>
> Is the extended header a BC-breaking change for the C++ runtime? We shouldn't make any BC-breaking changes. In other words, older .PTE files should be runnable with the new runtime.
>
> This PR seems to only mitigate the Python side -- it doesn't ensure that older PTE files are still loadable.

Hi @mnachin, the PRs that landed last week are not BC-breaking on the C++ side; we read segment_data_size only if it exists.

There isn't an explicit test in executorch, given that we generate our PTE files on the fly (so they have segment_data_size); however, there are tests from internal ET users that cover this. I can check in an older PTE file if we want to explicitly test this in C++ as well?

It was a breaking change for Python (which is my bad), which this PR fixes.

lucylq added a commit to lucylq/executorch-1 that referenced this pull request Sep 16, 2025
Summary:

pytorch#14320

Error happening when we have older PTE files with extended header size 24. 

When we call 'from_bytes', we expect header size 32 after adding segment_data_size field in D81938296

This is BC on C++ side because we have a minimum length.
Add minimum length to python to make the change BC.

Reviewed By: JacobSzwejbka, hyxu2006

Differential Revision: D82492169

@mergennachin
Contributor

> I can check in an older PTE file if we want to explicitly test this in C++ as well?

@lucylq @JacobSzwejbka

Yeah, go back to release/0.6, generate an old .PTE file, save it as something like "model_v_0_6.pte", and then load it in the new runtime.

We should also look into automating this soon.

@mergennachin
Contributor

Actually, we should start with release/0.4 -- that's when we released beta

Contributor

@mergennachin mergennachin left a comment

The Python side looks good nevertheless.

@facebook-github-bot facebook-github-bot merged commit d1e1bf8 into pytorch:main Sep 16, 2025
126 of 127 checks passed
@lucylq
Contributor Author

lucylq commented Sep 16, 2025

@pytorchbot cherry-pick --onto release/1.0 -c regression

pytorchbot pushed a commit that referenced this pull request Sep 16, 2025
Differential Revision: D82492169

Pull Request resolved: #14320

(cherry picked from commit d1e1bf8)
@pytorchbot
Collaborator

Cherry picking #14320

The cherry pick PR is at #14334 and it is recommended to link a regression cherry pick PR with an issue. The following tracker issues are updated:

StrycekSimon pushed a commit to nxp-upstream/executorch that referenced this pull request Sep 23, 2025
Differential Revision: D82492169

Pull Request resolved: pytorch#14320
Labels: CLA Signed, fb-exported, meta-exported