Conversation

@per
Collaborator

@per per commented Dec 9, 2024

Summary

Adds a folding pass to fold in q and dq nodes.

Test plan

Added test for the new pass
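
For context, the pass conceptually replaces the q/dq node pairs around a target operator with quantization parameters stored in the operator node's meta. Below is a minimal sketch of such a folding pass over a torch.fx graph; the `fold_qdq` name and the `input_qparams`/`output_qparams` meta keys follow the review snippets further down, but the pass structure itself is an illustrative assumption, not the implementation merged in this PR.

```python
# Illustrative sketch of a q/dq folding pass, not the exact pass in this PR.
# Assumes the PT2E quantized_decomposed ops are registered (the import below
# registers them in recent PyTorch versions).
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401
from torch.fx import GraphModule, Node

Q = torch.ops.quantized_decomposed.quantize_per_tensor.default
DQ = torch.ops.quantized_decomposed.dequantize_per_tensor.default


def fold_qdq(gm: GraphModule) -> GraphModule:
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target in (Q, DQ):
            continue
        # Fold dq producers: stash (scale, zp, qmin, qmax, dtype) per input
        # index in node.meta and rewire the op to consume the quantized tensor.
        input_qparams = {}
        for i, arg in enumerate(node.args):
            if isinstance(arg, Node) and arg.target == DQ:
                input_qparams[i] = arg.args[1:]
                node.replace_input_with(arg, arg.args[0])
        if input_qparams:
            node.meta["input_qparams"] = input_qparams
        # Fold q consumers: stash the output qparams and bypass the q node.
        for user in list(node.users):
            if user.op == "call_function" and user.target == Q:
                node.meta["output_qparams"] = {0: user.args[1:]}
                user.replace_all_uses_with(node)
                gm.graph.erase_node(user)
    gm.graph.eliminate_dead_code()  # drops the now-unused dq nodes
    gm.recompile()
    return gm
```

On the reviewer's chain example further down, a pass of this shape collapses q0->dq0->op1->q2->dq2->op2->q3->dq3 into q0->op1*->op2*->dq3, with the ops carrying their quantization parameters in meta.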

@pytorch-bot

pytorch-bot bot commented Dec 9, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7240

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 2a03d6f with merge base 3f7eb3b:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 9, 2024
@per per requested a review from digantdesai December 9, 2024 12:14
@per per added the partner: arm, ciflow/trunk, and topic: not user facing labels Dec 9, 2024
@per
Collaborator Author

per commented Dec 10, 2024

```python
)

output.shape = tosa_shape(output.shape, output.dim_order)
min_output = tosa_graph.addIntermediate(output.shape, ts.DType.INT32)
```
Contributor


Suggested change

```diff
-min_output = tosa_graph.addIntermediate(output.shape, ts.DType.INT32)
+max_output = tosa_graph.addIntermediate(output.shape, ts.DType.INT32)
```

Collaborator Author


Yepp!

Comment on lines 46 to 57
```python
x_scale = input_qparams[0].scale
x_zp = input_qparams[0].zp

y_scale = input_qparams[1].scale
y_zp = input_qparams[1].zp

assert (
    x_zp == y_zp
), "Different zp for inputs, MAX should be quantized with shared quantization!"
assert (
    x_scale == y_scale
), "Different scale for input, MAX should be quantized with shared quantization!"
```
Contributor


refactor this as a util to assert shared qconfigs across inputs?

Collaborator Author


Yes, will fix it up.
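
Sketched from the review suggestion, such a utility might look roughly like this (the `assert_shared_qparams` name and signature are hypothetical; `QuantArgs` is the parameter container used in the snippets below):

```python
def assert_shared_qparams(input_qparams: dict[int, "QuantArgs"]) -> None:
    # Hypothetical helper: verify all inputs share zero point and scale, as
    # required by ops (e.g. MAX) quantized with shared quantization.
    first = input_qparams[0]
    for qargs in input_qparams.values():
        assert qargs.zp == first.zp, (
            "Different zp for inputs, op should be quantized with shared quantization!"
        )
        assert qargs.scale == first.scale, (
            "Different scale for inputs, op should be quantized with shared quantization!"
        )
```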


```python
class SimpleQuantizeModel(torch.nn.Module):
    def forward(self, x):
        return x + x
```
Contributor


nit: maybe make it slightly more complicated, with >1 input tensors and >1 add nodes? Maybe something like max((x + x), (y + y))?

Also test a chain of nodes, i.e. q0->dq0->op1->q2->dq2->op2->q3->dq3 => q0->op1*->op2*->dq3

Collaborator Author


Ack.
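
Spelled out, the reviewer's suggested model could look something like this (the class name and the use of torch.maximum are illustrative):

```python
import torch


class TwoInputQuantizeModel(torch.nn.Module):
    # Two input tensors, two add nodes, and a maximum combining them,
    # per the review suggestion above.
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return torch.maximum(x + x, y + y)
```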

```python
dim_order = tensor.dim_order
tensor.shape = [tensor.shape[i] for i in dim_order]

qargs = list(cast(dict[int, QuantArgs], node.meta["input_qparams"]).values())
```
Contributor


Assert input_qparams in node.meta
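
i.e. something along these lines (the message text is illustrative):

```python
assert "input_qparams" in node.meta, "No input quantization parameters in node meta"
```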

"""
assert len(node.meta["output_qparams"]) == 1

qargs_out = cast(dict[int, QuantArgs], node.meta["output_qparams"])[0]
Contributor


Same here.

```python
return rescaled_nodes, min_scale


def insert_rescale_node_back_to_int8(
```
Contributor


Suggested change

```diff
-def insert_rescale_node_back_to_int8(
+def insert_rescale_node_to_int8(
```

per added 7 commits December 13, 2024 12:19
Reuse the logic from the node-visiting quantization handling, but
fetch the quantization parameters from the node meta values instead.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I9a7bbf6384284e60118756ec5661f6b11847aba7
Fold DQ/Q nodes into the target operators specified to the pass.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I8a09dc0b887dd5f3915ca157f578ecf51772a1a2
Uses the fold DQ/Q pass to encapsulate the quantization information within the node.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I3adbab7e2a23a0208a03bbc423b38c15221a4959
Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I9230209ed3d6cc0b5ec7a35512248648bb8380ee
Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I6154e13a5a6b75549862709d632ee6dd5c8b0e7f
Adds a helper function to retrieve QuantArgs from node.meta and clean up
the handling a bit by introducing the __eq__ operator for QuantArgs.

Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I519a9a286a36a278f40ffb6c679192a54d9f940d
Signed-off-by: Per Åstrand <per.astrand@arm.com>
Change-Id: I2d133f4347d9999c770e5337162c222368c212f2
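
Regarding the QuantArgs helper commit above, a rough sketch of what it describes, with field names taken from the review snippets (the actual merged definition may differ):

```python
from typing import cast

import torch
from torch.fx import Node


class QuantArgs:
    def __init__(self, scale: float, zp: int, qmin: int, qmax: int, dtype: torch.dtype):
        self.scale = scale
        self.zp = zp
        self.qmin = qmin
        self.qmax = qmax
        self.dtype = dtype

    def __eq__(self, other):
        # Equal when all quantization parameters match; this turns the
        # shared-quantization checks above into a plain == comparison.
        if not isinstance(other, QuantArgs):
            return NotImplemented
        return (self.scale, self.zp, self.qmin, self.qmax, self.dtype) == (
            other.scale, other.zp, other.qmin, other.qmax, other.dtype
        )


def get_input_qparams(node: Node) -> dict[int, QuantArgs]:
    # Helper to retrieve QuantArgs stashed in node.meta by the folding pass.
    assert "input_qparams" in node.meta, "No input quantization parameters in node meta"
    return cast(dict[int, QuantArgs], node.meta["input_qparams"])
```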
@per per force-pushed the quantization_folding branch from 4a46eec to 2a03d6f Compare December 13, 2024 12:06
@per
Collaborator Author

per commented Dec 13, 2024

The failing pull / unittest / macos / macos-job (pull_request) check seems to be unrelated (test_flamingo_vision_encoder).

@per per requested a review from digantdesai December 13, 2024 13:23
@per per merged commit 99d5b80 into pytorch:main Dec 16, 2024
105 of 106 checks passed
@per per deleted the quantization_folding branch December 16, 2024 08:45