[Frontend][PaddlePaddle] Update the export method of PaddlePaddle Softmax #16653
Conversation
Given this is improving coverage for operators to be offloaded to CMSIS-NN, does it make sense to add a test case in the CMSIS-NN namespace (https://github.com/apache/tvm/tree/main/tests/python/contrib/test_cmsisnn) that tests the proposed change, so that it doesn't regress in the future?
I suggest this is done in this same PR, so that we are not missing out on tests.
Hello Leandron (@leandron), I understand your concerns, but TVM currently does not support reading quantized models from Paddle, so the Softmax operator cannot yet be offloaded to CMSIS-NN. I am working on adding support for Paddle's quantized models to TVM, and I plan to add this test after TVM PR 16651 is successfully merged.
This PR optimizes the conversion mapping from the Paddle Softmax operator to TVM, replacing the original combination of multiple small operators with a single softmax operator, which is more reasonable.
This PR looks good to me.
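As a plain-Python sketch (illustrative only, not TVM code) of why the single fused operator and the previous multi-op decomposition compute the same values:

```python
import math

def softmax_fused(xs):
    # single softmax op, with the usual max-subtraction for stability
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_decomposed(xs):
    # the multi-op graph form: exp -> reduce-sum -> divide
    exps = [math.exp(x) for x in xs]  # exp op
    s = sum(exps)                     # reduce-sum op
    return [e / s for e in exps]      # divide op

# both forms sum to 1 and agree to within float rounding; only the
# graph structure differs, which is what the pattern matcher sees
```

Since CMSIS-NN pattern matching operates on graph structure rather than numerics, only the fused form can be recognized and offloaded.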
Sure, so let's get #16651 merged, then update this one with a test case, and then we merge this.
Hi, @leandron. TVM Pull Request 16651 has been successfully merged. What kind of test should I add next? (I see you've already added tests for Softmax.)
Generally speaking, I suggest adding tests that exercise the path being proposed here, that is, from PaddlePaddle to CMSIS-NN, including at least one softmax operator. Does that make sense?
Are you suggesting using a PaddlePaddle model containing softmax, specifying the runtime as CMSIS-NN, and then validating whether the output of CMSIS-NN matches that of the PaddlePaddle model? This approach might not be feasible at the moment. I've already highlighted the potential issues with this method in TVM Pull Request 16651. In short, in the current version of TVM, when a quantized PaddlePaddle model is converted to a TVM model, there are discrepancies in the computation results. I'm confident this isn't an issue with my porting efforts, because the same problem exists with ONNX models. You can review my detailed test code in TVM Pull Request 16651, where I convert the quantized Paddle model to a TVM model and specify the target as llvm running on CPU.
I'm suggesting that tests need to be added when new code is contributed. Deciding what degree of specific testing is practical for a given contribution is part of the contribution process. It is a known fact that TVM will have numerical discrepancies with various input frameworks, but that hasn't been a blocker to adding tests for the existing implementations of all the different frameworks. I don't see why PaddlePaddle should be an exception, and it is not clear, in the long term, why there is a rush to merge contributions without tests, as it increases the project's technical debt.
Sorry, English isn't my native language, so I might not have expressed my point clearly. The issue we're discussing isn't 'whether to add tests' but 'what level of error is acceptable in the tests.' There's an inherent error between TVM and the various frameworks, and without altering the core TVM code we can only tolerate a larger margin of error. This error isn't unique to the PaddlePaddle framework; it's also present with ONNX, so PaddlePaddle models are not a special case. The pull request supporting ONNX quantized models was merged earlier than the PR supporting PaddlePaddle quantized models. Since we allow for errors in ONNX quantized models in TVM, why can't we allow the same level of error for PaddlePaddle quantized models?
For example, with the same input, suppose there is a 5% error between the outputs of the TVM and Paddle models (using, for instance, cosine similarity as the measure of error). When determining whether the output of the TVM model is correct, is it acceptable to allow for this 5% error?
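A minimal sketch of such a tolerance check (the helper names are illustrative, not an existing TVM utility):

```python
import math

def cosine_similarity(a, b):
    # cosine similarity between two flattened output vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def outputs_match(tvm_out, paddle_out, min_similarity=0.95):
    # accept up to a 5% cosine-similarity discrepancy between runtimes
    return cosine_similarity(tvm_out, paddle_out) >= min_similarity
```

A test would then assert `outputs_match(tvm_out, paddle_out)` instead of exact element-wise equality.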
Sure, I understand your point. I agree with this approach, please add tests for the PaddlePaddle path offloading softmax to CMSIS-NN that contain an error margin for the expected numerical discrepancy. Please do that as part of this PR. |
I apologize for not expressing the meaning clearly. I will add corresponding tests in this PR. 😊 |
If that's achievable, I'd say we should add the tests, merge this PR, and in parallel try to understand why this discrepancy is as high as 5%. That understanding may involve the PaddlePaddle community as well, so it's a longer process and shouldn't block this particular PR.
I fully endorse your point of view. I also believe that investigating the reasons behind the errors is crucial, as it helps make TVM's code more robust. After merging the PR, I will create a separate issue to discuss this matter, as it concerns not only the PaddlePaddle community but also the ONNX community. I've conducted some tests regarding the error. I pruned the model down to a single quantized conv2d operator and tested it with input data of the same shape but different values. I found that the error doesn't occur for every input; in most cases it doesn't occur at all. I then analyzed each element of the output and found that when the error does occur, the majority of elements are the same, with only a small portion differing. Based on my work experience, I suspect this is likely due to improper handling of overflow during computation.
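The overflow hypothesis can be illustrated with a small sketch (plain Python, hypothetical helpers): if an intermediate value wraps around in int8 instead of being widened or saturated, only sums exceeding the int8 range are affected, which would explain why just a small portion of output elements differ:

```python
def wrap_int8(v):
    # two's-complement wraparound, as if the accumulator were only 8 bits
    return (v + 128) % 256 - 128

def clamp_int8(v):
    # saturating behavior, as with a wide accumulator plus requantization
    return max(-128, min(127, v))

# 100 + 100 wraps to -56 under 8-bit arithmetic but saturates to 127
# with clamping; values that stay inside [-128, 127] are unaffected,
# so most output elements would still match
```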
Chiming in here a bit: I think, overall, we would like to gradually move away from relying on e2e tests. In this case, there are usually two parts:
Moving forward, I think we should decouple T0 and T1 (likely into separate PRs). This would help us build more robust tests that run faster, keep unit-test cases closer to true unit tests, and locate errors more effectively.
Hello, @leandron. I found in cmsisnn.py that the scale of softmax must be 1/256 and the zero point must be -128. Why is that? According to the formula
By the way, in my testing of the Paddle model, the scale is 0.0078649195 (close to 1/127) and the zero point is 0.
These are some interesting documentation links:
Note the following restriction in the TFLite softmax:
In summary, the restriction is expected, and comes from the bit-accurate support to TFLite. |
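To sketch why that restriction makes sense (plain Python, illustrative): softmax outputs lie in [0, 1), and an int8 tensor with scale 1/256 and zero point -128 covers exactly that range:

```python
def dequantize(q, scale=1.0 / 256, zero_point=-128):
    # standard affine dequantization: real = (q - zero_point) * scale
    return (q - zero_point) * scale

# the int8 range [-128, 127] maps onto [0, 255/256], i.e. [0, 1),
# which is precisely the output range of softmax
lo = dequantize(-128)  # smallest representable value: 0.0
hi = dequantize(127)   # largest representable value: 255/256
```

Any other scale/zero-point pair would either waste part of the int8 range or fail to represent parts of [0, 1), which is why the bit-accurate TFLite kernels fix these parameters.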
Introduction
When a PaddlePaddle model wants to offload operators to CMSIS-NN, the graph must match qnn_softmax_pattern() in cmsisnn.py. Currently, when PaddlePaddle models are converted, Softmax is split into several different operators, so it cannot be offloaded to CMSIS-NN.
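As a rough illustration of what the pattern check amounts to (the op names and helper here are simplified assumptions, not the actual TVM pattern-matching API): the partitioner only offloads a region whose op sequence matches the expected quantized softmax pattern, so a decomposed softmax fails the match:

```python
# simplified stand-in for qnn_softmax_pattern(): expect a quantized
# softmax expressed as dequantize -> softmax -> quantize
EXPECTED_OPS = ["qnn.dequantize", "nn.softmax", "qnn.quantize"]

def matches_softmax_pattern(ops):
    # a region is offloadable only if its op sequence matches exactly
    return ops == EXPECTED_OPS

# a fused softmax region matches; the previously emitted
# exp/sum/divide decomposition does not, so it stayed on the host
```

Emitting a single softmax operator from the PaddlePaddle frontend is what allows this pattern to match and the operator to be offloaded.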