runner fix to mitigate the numerical issue (#19286) #19286
Open
billmguo wants to merge 1 commit into pytorch:main from
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19286
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 1 Pending, 4 Unrelated Failures as of commit 2f878b5 with merge base a0d6e9b.
NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Contributor
@billmguo has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103690468.
billmguo added a commit to billmguo/executorch that referenced this pull request on May 5, 2026
Force-pushed: 5000c26 to 6c0b59c (Compare)
billmguo added a commit to billmguo/executorch that referenced this pull request on May 5, 2026
Force-pushed: 6c0b59c to a45dfd3 (Compare)
YIWENX14 approved these changes on May 5, 2026
Contributor
YIWENX14 left a comment
LGTM. Please fix the lint error before committing.
Contributor
@pytorchbot label "release notes: none"
billmguo added a commit to billmguo/executorch that referenced this pull request on May 5, 2026
Force-pushed: a45dfd3 to ccf1c11 (Compare)
Force-pushed: ccf1c11 to 2f878b5 (Compare)
Summary:
Fix 1 — Dangling shared_ptr (2 files)
- runner/static_transformer_runner.h:33
- runner/experimental/static_transformer_runner.h:33
Changed module_(std::shared_ptr<Module>(module.get())) to module_(std::move(module)). The old code wrapped the raw pointer without releasing ownership, so the unique_ptr destructor would free the Module while the shared_ptr member still pointed to it.
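Below is a minimal, self-contained sketch of the ownership bug and the fix; the Module struct and Runner constructor here are simplified stand-ins for illustration, not the runner's actual declarations.

```cpp
#include <memory>

// Simplified stand-in for the real Module type; only ownership matters here.
struct Module {};

class Runner {
 public:
  // Fixed form: std::move transfers ownership into the shared_ptr member,
  // so the Module stays alive as long as module_ does.
  //
  // Buggy form, for comparison (do not use):
  //   : module_(std::shared_ptr<Module>(module.get())) {}
  // The unique_ptr parameter still owned the Module and freed it when the
  // constructor returned, leaving module_ dangling and setting up a double
  // free when module_ was destroyed later.
  explicit Runner(std::unique_ptr<Module> module)
      : module_(std::move(module)) {}

 private:
  std::shared_ptr<Module> module_;
};

int main() {
  Runner runner(std::make_unique<Module>());
  return 0;
}
```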
Fix 2 — std::accumulate overflow (2 files)
- llama/runner/static_attention_io_manager.h:58
- runner/experimental/static_attention_io_manager.h:59
Changed std::accumulate(..., 0) to std::accumulate(..., size_t(0)). The int initial value caused the entire accumulation to run in 32-bit signed arithmetic before the result was assigned to size_t.
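A small sketch of the failure mode, assuming the accumulation sums 64-bit size values into an element count (the sizes vector below is illustrative, not the runner's actual data):

```cpp
#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  // Values chosen so the sum exceeds the 32-bit signed range.
  std::vector<int64_t> sizes = {1500000000, 1500000000};

  // Buggy: the int literal 0 makes the accumulator an int, so the sum is
  // computed (and overflows) in 32-bit arithmetic before it is ever widened:
  //   size_t bad = std::accumulate(sizes.begin(), sizes.end(), 0);

  // Fixed: a size_t initial value keeps the whole accumulation in size_t.
  size_t total = std::accumulate(sizes.begin(), sizes.end(), size_t(0));
  std::cout << total << "\n";  // 3000000000, no truncation
  return 0;
}
```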
Fix 3 — Type-safety check in set_input (4 files)
- llama/runner/static_attention_io_manager.h — added include + size check
- runner/experimental/static_attention_io_manager.h — added include + size check
- runner/static_transformer_runner.h — added size check (include inherited)
- runner/experimental/static_transformer_runner.h — added size check (include inherited)
Added ET_CHECK_MSG(sizeof(T) == elementSize(inputMeta->scalar_type()), ...) before constructing the TensorImpl. This catches mismatches between the runner's compiled types (CacheT, MaskT, RopeT) and the model's actual tensor dtypes at load time, rather than silently reinterpreting data.
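The real guard uses ExecuTorch's ET_CHECK_MSG and elementSize() exactly as quoted above; the standalone sketch below shows the same idea with plain-C++ stand-ins (ScalarType, element_size, and bind_input are hypothetical names used only for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the model's dtype metadata; in the runner this
// comes from inputMeta->scalar_type() and the check uses ET_CHECK_MSG.
enum class ScalarType { Half, Float };

constexpr size_t element_size(ScalarType t) {
  return t == ScalarType::Half ? 2 : 4;
}

template <typename T>
const T* bind_input(const void* data, ScalarType model_dtype) {
  // Fail at load time if the runner's compiled element type (e.g. CacheT)
  // is not the same width as the model's actual dtype, instead of silently
  // reinterpreting the buffer with the wrong stride.
  assert(sizeof(T) == element_size(model_dtype) &&
         "runner element type does not match model tensor dtype");
  return static_cast<const T*>(data);
}

int main() {
  float buf[4] = {0.f, 1.f, 2.f, 3.f};
  bind_input<float>(buf, ScalarType::Float);    // OK: 4-byte T, 4-byte dtype
  // bind_input<float>(buf, ScalarType::Half);  // would trip the check (4 vs 2)
  return 0;
}
```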
Reviewed By: viveknayakatmeta
Differential Revision: D103690468