NXP backend: Add preprocessing pass to split multilayer GRU #13757
Conversation
@pytorchbot label "module: nxp" "release notes: nxp"
passes: list[PassType] = passes or [
    FuseBatchNormWithConvPass(),
    FuseBatchNormWithLinearPass(),
    SplitGRUBasedOnNumLayers(),
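For context, a minimal skeleton of what a pass registered in this list can look like, built on torch.fx's pass infrastructure. This is an illustrative sketch, not the actual `SplitGRUBasedOnNumLayers` implementation; the class name and the rewrite placeholder are assumptions.

```python
# Illustrative sketch only -- not the real SplitGRUBasedOnNumLayers.
# It shows the shape of an aten-dialect pass: find aten.gru.input nodes
# whose num_layers argument (index 4 in the op schema) is > 1, then
# rewrite them in place.
import torch
from torch.fx.passes.infra.pass_base import PassBase, PassResult


class SplitGRUSketch(PassBase):  # hypothetical name
    def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
        modified = False
        for node in list(graph_module.graph.nodes):
            if (
                node.op == "call_function"
                and node.target == torch.ops.aten.gru.input
                and isinstance(node.args[4], int)
                and node.args[4] > 1
            ):
                # ... rewrite `node` into a chain of single-layer
                # aten.gru.input nodes here ...
                modified = True
        if modified:
            graph_module.graph.lint()
            graph_module.recompile()
        return PassResult(graph_module, modified)
```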
Do we do it here in aten because we want to support quantization?
We do it in the aten dialect because it allows the pass to focus only on the transformation, without worrying about quantization. This keeps the implementation simpler, and if the quantization requirements of GRU (or the other ops) ever change, we only need to update the quantizer (otherwise we would also have to update this pass).
We do it here for two reasons:
- The Neutron NPU supports GRU as a primitive operation in Neutron IR, so for the Neutron NPU we do not need or want ExecuTorch to decompose it into primitive ops.
- The Neutron NPU supports GRU, but only single-layer. We therefore transform the multilayer GRU into a sequence of single-layer GRUs, perform the quantization, and obtain a graph that can be represented in Neutron IR, including the proper quantization parameters for the inputs and outputs of the individual GRU layers (a numerical sketch of this equivalence follows below).

Note: we will preserve the GRU op in the to_edge transformation (coming change).
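A quick way to see the equivalence the second point relies on is to rebuild a 2-layer `nn.GRU` as two chained single-layer GRUs and compare the results numerically. This is a standalone sketch with illustrative sizes, not code from the PR.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
multi = nn.GRU(input_size=8, hidden_size=16, num_layers=2)

# Two single-layer GRUs; layer 1 consumes layer 0's hidden-size output.
layer0 = nn.GRU(input_size=8, hidden_size=16, num_layers=1)
layer1 = nn.GRU(input_size=16, hidden_size=16, num_layers=1)

# Copy the per-layer weights of the multilayer GRU into the chain.
with torch.no_grad():
    for name in ("weight_ih", "weight_hh", "bias_ih", "bias_hh"):
        getattr(layer0, f"{name}_l0").copy_(getattr(multi, f"{name}_l0"))
        getattr(layer1, f"{name}_l0").copy_(getattr(multi, f"{name}_l1"))

x = torch.randn(5, 3, 8)    # (seq_len, batch, input_size)
h0 = torch.zeros(2, 3, 16)  # (num_layers, batch, hidden_size)

out_multi, h_multi = multi(x, h0)
out0, h_out0 = layer0(x, h0[0:1])    # each layer gets its slice of h0
out1, h_out1 = layer1(out0, h0[1:2])

assert torch.allclose(out_multi, out1, atol=1e-6)
assert torch.allclose(h_multi, torch.cat([h_out0, h_out1]), atol=1e-6)
```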
…ltiple operators. The `aten.gru.input` op has a `num_layers` parameter. For values > 1, it represents multiple `aten.gru.input` operators chained together. The introduced pass can split the original GRU into a chain of simpler GRU nodes.
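A small sketch of the bookkeeping such a split implies. For a unidirectional GRU with biases, the flat `params` list of `aten.gru.input` holds four tensors per layer in `nn.GRU`'s flat-weights order, and the initial hidden state stacks one slice per layer, so each single-layer node takes a slice of both. The helper name below is hypothetical; bidirectional and bias-free layouts differ.

```python
import torch

# Hypothetical helper: slice out the per-layer arguments for layer k.
# params layout per layer (biased, unidirectional):
#   [weight_ih, weight_hh, bias_ih, bias_hh]
def single_layer_args(hx: torch.Tensor, params: list, layer: int):
    hx_k = hx[layer : layer + 1]  # (1, batch, hidden_size)
    params_k = params[4 * layer : 4 * (layer + 1)]
    return hx_k, params_k
```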
Summary

This PR introduces a pre-processing pass on the aten dialect level, which splits `gru` nodes with `num_layers > 1` into an equivalent sequence of single-layer `gru` nodes.

Test plan

Unit tests are provided in `backends/nxp/tests/test_gru_splitting.py`. A hedged sketch of such a test follows below.
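As an illustration of what such a test can check, assuming torch.export keeps `aten.gru.input` at the aten level, as this PR's pass expects. This is not the PR's actual test code.

```python
# Sketch of a detection-level test: export a 2-layer GRU and confirm the
# aten graph contains one aten.gru.input node with num_layers == 2 --
# the node the splitting pass targets.
import torch


class TwoLayerGRU(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.gru = torch.nn.GRU(input_size=8, hidden_size=16, num_layers=2)

    def forward(self, x, h0):
        return self.gru(x, h0)


ep = torch.export.export(
    TwoLayerGRU(), (torch.randn(5, 3, 8), torch.zeros(2, 3, 16))
)
gru_nodes = [
    n
    for n in ep.graph.nodes
    if n.op == "call_function" and n.target == torch.ops.aten.gru.input
]
assert len(gru_nodes) == 1
assert gru_nodes[0].args[4] == 2  # num_layers, per the gru.input schema
```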
cc @robert-kalmar @roman-janik-nxp @StrycekSimon @jirioc