Fix FoldTransposeIntoQuantInit Transformation #78

Merged: 12 commits into fastmachinelearning:main on Oct 23, 2023

Conversation

iksnagreb
Contributor

Intends to fix problems related to the FoldTransposeIntoQuantInit transformation. The transformation should only be applied if the Transpose node actually follows a so-called QuantInit, i.e., a Quant (or BipolarQuant) node whose inputs are all initializers. Currently, the transform is applied even if only some of the inputs have initializers. This breaks shape inference, as the remaining Quant node does not transpose its runtime inputs. This is fixed by making the test for a node being a QuantInit stricter.
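
For illustration, a minimal sketch of the stricter check (not the PR's literal code), assuming the qonnx ModelWrapper API and its get_initializer helper:

    def is_quant_init(node, model):
        # A Quant/BipolarQuant node only counts as a QuantInit if *every*
        # input is backed by an initializer, i.e., it has no runtime inputs
        return node.op_type in {"Quant", "BipolarQuant"} and all(
            model.get_initializer(name) is not None for name in node.input
        )

Only if this predicate holds can the Transpose be folded by transposing all of the node's initializers; otherwise the Transpose must be kept in the graph.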

I have tested this by running the unit tests for QONNX as well as those under tests/transformation over at FINN, and did not observe any issues so far. For more context, please see the following issues: #77, Xilinx/finn#878, Xilinx/finn#892.

maltanar and others added 3 commits September 22, 2023 19:54
Previously, the transform was seemingly applied to all Quant-Transpose
patterns, irrespective of whether *all* the inputs are actually
initializers. This should now be fixed by testing more strictly for the
node being a QuantInit.
@iksnagreb
Contributor Author

Hm, all tests fail? What is going on? It doesn't seem to be my fault?

@iksnagreb
Contributor Author

I have just added a new unit test validating both situations, i.e., keeping and removing the Transpose node, as well as running the InferShapes transformation directly after FoldTransposeIntoQuantInit.
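
Roughly, such a test could look like the following sketch. This is not the PR's actual test; in particular, the import path of FoldTransposeIntoQuantInit and the Quant node domain are assumptions on my part:

    import numpy as np
    from onnx import TensorProto, helper
    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.transformation.infer_shapes import InferShapes
    # assumed import path for the transformation under test
    from qonnx.transformation.quant_constant_folding import FoldTransposeIntoQuantInit

    def make_quant_transpose_model(runtime_input):
        # Quant(x, scale, zeropoint, bitwidth) followed by a Transpose
        quant = helper.make_node(
            "Quant", ["x", "scale", "zeropoint", "bitwidth"], ["q"],
            domain="qonnx.custom_op.general",
            narrow=0, signed=1, rounding_mode="ROUND",
        )
        transpose = helper.make_node("Transpose", ["q"], ["y"], perm=[0, 2, 3, 1])
        graph = helper.make_graph(
            [quant, transpose], "quant_transpose",
            [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 2, 3, 4])],
            [helper.make_tensor_value_info("y", TensorProto.FLOAT, None)],
        )
        model = ModelWrapper(helper.make_model(graph))
        if not runtime_input:
            # all-initializer case: the Quant node is a true QuantInit
            model.set_initializer("x", np.zeros((1, 2, 3, 4), dtype=np.float32))
        model.set_initializer("scale", np.asarray(1.0, dtype=np.float32))
        model.set_initializer("zeropoint", np.asarray(0.0, dtype=np.float32))
        model.set_initializer("bitwidth", np.asarray(8.0, dtype=np.float32))
        return model

    for runtime_input in (True, False):
        model = make_quant_transpose_model(runtime_input)
        model = model.transform(FoldTransposeIntoQuantInit())
        # shape inference must succeed in both situations
        model = model.transform(InferShapes())
        kept = any(n.op_type == "Transpose" for n in model.graph.node)
        # the Transpose survives iff the Quant node has a runtime input
        assert kept == runtime_input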

@iksnagreb
Contributor Author

Hm, again some unrelated tests fail. This time it is some HTTP Error 500: INTERNAL SERVER ERROR. However, all tests passed for the two commits in between, so this seems to be something external?

@maltanar
Collaborator

The previous CI failures were due to an onnxruntime bug that got fixed last week. main is now updated to use the most recent version, which takes care of that, and I have updated this PR accordingly.

The last CI failures seem to be due to some intermittent server failure, re-running was enough to make all tests pass.

Otherwise, the PR looks all good to me - thanks @iksnagreb! I'll only add one comment here about this bit before I hit merge:

                        # Skip transposing the initializer if the number of
                        # dimensions does not match
                        if perm is not None and len(perm) != tensor.ndim:
                            # Note: Soft skip ok or is this an error?
                            continue

The only cases I've previously seen where perm and the tensor disagree on the number of dimensions are when the tensor in question is a scalar (for instance, a global zeropoint value of 0). Those are safe to skip.

In theory, one could take advantage of shape broadcasting to create quantization parameters that are neither scalar nor match the number of dimensions of the target tensor, as this is currently mis-specified in the Quant node spec. I'll update the Quant node spec to permit only the scalar or ndim == tensor ndim cases.
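
To make the two shape cases concrete, a tiny illustration with numpy standing in for the initializer tensors:

    import numpy as np

    perm = [0, 2, 3, 1]               # permutation from the Transpose node (4D)
    scalar_zp = np.asarray(0.0)       # global zeropoint, ndim == 0
    bcast_scale = np.ones((8, 1, 1))  # broadcasts against a 4D tensor, ndim == 3

    # Both fall into the len(perm) != tensor.ndim branch quoted above and are
    # skipped; transposing them would raise, e.g.
    # np.transpose(scalar_zp, perm) -> AxisError
    for tensor in (scalar_zp, bcast_scale):
        assert len(perm) != tensor.ndim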

@maltanar merged commit c966b46 into fastmachinelearning:main on Oct 23, 2023
5 checks passed
@iksnagreb
Contributor Author

Thank you!
