
Conversation

@psiddh (Contributor) commented Jan 22, 2026

Summary:

   Unlike the conv2d path, the linear path in ConvertToCortexMPass was not
   transposing weights, which was inconsistent with the C++ runtime: per
   CMSIS-NN, it expects weights in [in_features, out_features] format.

  Changes:
   - convert_to_cortex_m_pass.py: Transpose linear weights [out, in] -> [in, out]
   - operators.py: Update meta to use weights.shape[1] for output dimension
   - operators.py: Remove .T from ref impl (weights pre-transposed by pass)

  Fixes MV2 output shape mismatch: [1, 1280] -> [1, 1000]

  Verified with MV2 on Corstone-300/E8 with CMSIS-NN kernels.
  This fix ensures the AOT-compiled .pte file has correctly shaped output
  tensors for any model using quantized_linear (MV2, ResNet, MV3, etc.).
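As an illustration of the layout change, here is a minimal, hedged sketch; the helper names are hypothetical, not the actual ExecuTorch cortex_m code, but they show the transpose and the corrected output-shape computation described in the changes above:

```python
import torch

# Hypothetical sketch of the fix described above; these helpers are
# illustrative, not the real cortex_m pass/meta implementations.

def transpose_linear_weights(weights: torch.Tensor) -> torch.Tensor:
    # PyTorch nn.Linear stores weights as [out_features, in_features];
    # CMSIS-NN fully-connected kernels expect [in_features, out_features].
    return weights.T.contiguous()

def quantized_linear_out_shape(x: torch.Tensor, weights_t: torch.Tensor) -> tuple:
    # With pre-transposed weights, the output dimension is weights.shape[1],
    # which is why the meta function was updated away from shape[0].
    return (*x.shape[:-1], weights_t.shape[1])

w = torch.randn(1000, 1280)            # e.g. MV2 classifier: [out=1000, in=1280]
wt = transpose_linear_weights(w)       # [1280, 1000]
x = torch.randn(1, 1280)
print(quantized_linear_out_shape(x, wt))  # (1, 1000), matching the fixed output
```

Using shape[0] on the transposed weights would reproduce the [1, 1280] mismatch the PR fixes.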

Test plan

Run MV2 lowered to CMSIS-NN ops on the Alif E8 board

Copilot AI review requested due to automatic review settings January 22, 2026 16:51
@pytorch-bot bot commented Jan 22, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16782

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 22, 2026
@psiddh psiddh requested review from AdrianLundell and rascani and removed request for Copilot January 22, 2026 16:51
@psiddh psiddh changed the title Title: Fix quantized_linear output shape in meta function Jan 22, 2026
@github-actions bot commented

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI review requested due to automatic review settings January 22, 2026 20:32
Copilot AI left a comment

Pull request overview

Fixes shape inference for the Cortex-M quantized_linear fake/meta function.

Changes:

  • Updates quantized_linear_meta to compute output shape using a different weight dimension.


@psiddh psiddh removed the request for review from AdrianLundell January 22, 2026 21:20
@psiddh psiddh marked this pull request as draft January 22, 2026 21:20
@psiddh (Contributor, Author) commented Jan 22, 2026

Copilot raised valid points; converting to draft. I need to dig into this issue further.

@psiddh psiddh changed the title Fix quantized_linear output shape in meta function Fix quantized_linear output shape in both ref impl & meta function Jan 22, 2026
@psiddh psiddh requested a review from Copilot January 22, 2026 22:20
@psiddh psiddh requested a review from AdrianLundell January 22, 2026 22:28
Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


@psiddh psiddh marked this pull request as ready for review January 22, 2026 23:53
@psiddh psiddh marked this pull request as draft January 23, 2026 00:02
@psiddh psiddh changed the title Fix quantized_linear output shape in both ref impl & meta function [cortex_m] Fix linear weight layout: transpose in AOT pass, align meta/ref impl Jan 23, 2026
@psiddh psiddh requested a review from Copilot January 23, 2026 05:55
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment on lines 113 to 120

    kernel_sum_tensor = self._compute_kernel_sum(
        weights_tensor, bias_tensor, -input_zp, -weight_zp
    )

    # Transpose weights from PyTorch format [out_features, in_features]
    # to CMSIS-NN format [in_features, out_features]
    weights_transposed = weights_tensor.T.contiguous()
Copilot AI commented Jan 23, 2026

_compute_kernel_sum() already transposes weights internally (weights.T), and this block transposes weights_tensor again to materialize weights_transposed. This duplicates the layout conversion logic in two places and does an extra transpose at AOT time.

Consider refactoring so the transpose is computed once (e.g., compute weights_transposed first and pass it into _compute_kernel_sum, or have _compute_kernel_sum accept already-transposed weights).

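The single-transpose refactor suggested here could look roughly as follows. This is a hedged sketch: compute_kernel_sum is a simplified stand-in (the real _compute_kernel_sum also folds in zero-point corrections), shown only to illustrate transposing once and reusing the result:

```python
import torch

# Simplified stand-in for _compute_kernel_sum; illustrative only.
def compute_kernel_sum(weights_t: torch.Tensor, bias: torch.Tensor,
                       neg_input_zp: int) -> torch.Tensor:
    # weights_t is already [in_features, out_features], so no .T here:
    # summing over the input dimension yields one value per output channel.
    return weights_t.to(torch.int32).sum(dim=0) * neg_input_zp + bias

# Transpose once, then reuse the transposed tensor everywhere.
weights = torch.ones(3, 4, dtype=torch.int8)   # PyTorch layout [out=3, in=4]
weights_t = weights.T.contiguous()             # CMSIS-NN layout [in=4, out=3]
bias = torch.zeros(3, dtype=torch.int32)
kernel_sum = compute_kernel_sum(weights_t, bias, neg_input_zp=-2)
print(kernel_sum)  # tensor([-8, -8, -8], dtype=torch.int32)
```

Computing the transpose once at AOT time also avoids the redundant transpose inside the kernel-sum helper that the comment points out.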
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment on lines +116 to 124

    # Transpose weights once from PyTorch format [out_features, in_features]
    # to CMSIS-NN format [in_features, out_features]
    weights_transposed = weights_tensor.T.contiguous()
    # Pass already-transposed weights to kernel_sum computation
    kernel_sum_tensor = self._compute_kernel_sum(
-       weights_tensor, bias_tensor, -input_zp, -weight_zp
+       weights_transposed, bias_tensor, -input_zp, -weight_zp
    )
Copilot AI commented Jan 23, 2026

This pass now changes the expected weight layout for cortex_m.quantized_linear to [in_features, out_features]. Please add a regression test that inspects the post-pass graph/parameters (not just numerical output) to ensure weights are actually stored/transmitted in the transposed layout; otherwise the Python reference path can still pass while the CMSIS-NN runtime path regresses.

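A layout regression test of the kind requested above could be sketched like this. Hedged: check_weight_layout is a hypothetical helper, and a real test would extract the weight parameter from the post-pass graph rather than take it as an argument:

```python
import torch

def check_weight_layout(weight_before: torch.Tensor, weight_after: torch.Tensor,
                        in_features: int, out_features: int) -> None:
    # Pre-pass weights follow PyTorch's [out_features, in_features] convention...
    assert weight_before.shape == (out_features, in_features)
    # ...post-pass weights must use CMSIS-NN's [in_features, out_features] layout.
    assert weight_after.shape == (in_features, out_features)
    # Check values under the transpose, not just shapes, so a reshape
    # masquerading as a transpose is also caught.
    assert torch.equal(weight_after, weight_before.T)

w = torch.arange(12, dtype=torch.int8).reshape(3, 4)   # [out=3, in=4]
check_weight_layout(w, w.T.contiguous(), in_features=4, out_features=3)
```

Because the check compares values and not only shapes, it fails on a graph that merely reshapes the weights, which is exactly the regression the comment warns about.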
@psiddh (Contributor, Author) replied:

Good point, will do in a follow-up PR.

        kernel_sum_tensor,
    )

    weights_transposed_node = create_constant_placeholder(
A Collaborator commented:

Delete old weights here if they have no users left?

@psiddh (Contributor, Author) replied:

Good point, will thoroughly test and provide a fix in follow-up PR #16866.
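In plain torch.fx, erasing a user-less placeholder (the cleanup suggested above) can be sketched as follows; the real pass uses ExecuTorch's graph helpers, so this is only an assumption about the shape of the fix:

```python
import torch
from torch import fx

class M(torch.nn.Module):
    def forward(self, x, unused):
        # 'unused' stands in for the old weight input with no remaining users.
        return x + 1

gm = fx.symbolic_trace(M())
# Collect placeholder nodes that no longer have any users, erase them,
# then recompile so the generated forward() matches the cleaned graph.
for node in [n for n in gm.graph.nodes if n.op == "placeholder" and not n.users]:
    gm.graph.erase_node(node)
gm.recompile()
print([n.target for n in gm.graph.nodes if n.op == "placeholder"])  # ['x']
```

graph.erase_node refuses to delete a node that still has users, so the guard on node.users keeps this safe even if the old weights are still referenced somewhere.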

@AdrianLundell
Collaborator

Looks reasonable and nicely commented, which makes it easy to understand! Any ideas why this was not caught in the unit tests?

@rascani rascani marked this pull request as ready for review January 23, 2026 15:32
Copilot AI review requested due to automatic review settings January 25, 2026 08:37
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@psiddh (Contributor, Author) commented Jan 26, 2026

Looks reasonable and nicely commented which makes it easy to understand! Any ideas why this was not caught in the unittests?

From a quick glance, no tests actually seem to cover this particular use case. Will add test cases in a follow-up.

@psiddh psiddh merged commit 06f10b9 into pytorch:main Jan 26, 2026
149 checks passed
@AdrianLundell (Collaborator) commented:

Hey, CI is failing after this update: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=cortex&useRegexFilter=true. Can you have another look at this?

@psiddh (Contributor, Author) commented Jan 26, 2026

Hey, CI is failing after this update: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=cortex&useRegexFilter=true. Can you have another look at this?

Yes, I am looking into it and will push a forward fix.

psiddh pushed a commit to psiddh/executorch that referenced this pull request Jan 27, 2026
psiddh added a commit that referenced this pull request Jan 27, 2026
Revert "[cortex_m] Fix linear weight layout: transpose in AOT pass, align meta/ref impl (#16782)" (#16910)

This reverts commit 06f10b9.


Co-authored-by: Github Executorch <github_executorch@arm.com>