
Concatenation has inconsistent input quantization parameters #221

Closed
LynnL4 opened this issue May 26, 2023 · 6 comments
Labels: awaiting response, bug (Something isn't working), duplicate (This issue or pull request already exists)

Comments


LynnL4 commented May 26, 2023

Bug

After quantization, the quantization parameters of the inputs to the Concatenation op are inconsistent with each other.

Reproduce

https://raw.githubusercontent.com/LynnL4/Public/main/tinynn/yolodetector_q.py

Expectation

Concatenation uses consistent quantization parameters for all inputs and outputs.
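To illustrate why this matters, here is a minimal sketch (not tinynn's actual implementation; all names, scales, and zero-points below are made up for illustration) of the re-quantization step a runtime must perform when Concatenation inputs carry different scales and zero-points:

```python
import numpy as np

def requantize(q, in_scale, in_zp, out_scale, out_zp):
    """Map int8 values quantized with (in_scale, in_zp) onto (out_scale, out_zp)."""
    real = (q.astype(np.float64) - in_zp) * in_scale       # dequantize to real values
    q_new = np.round(real / out_scale) + out_zp            # quantize with the shared params
    return np.clip(q_new, -128, 127).astype(np.int8)

# Two concat inputs quantized with different (hypothetical) parameters:
a = np.array([-10, 0, 25], dtype=np.int8)    # scale 0.05, zero_point 3
b = np.array([7, 64, -100], dtype=np.int8)   # scale 0.02, zero_point -8

# Re-quantize both to one shared set of params before Concatenation:
shared_scale, shared_zp = 0.05, 0
out = np.concatenate([requantize(a, 0.05, 3, shared_scale, shared_zp),
                      requantize(b, 0.02, -8, shared_scale, shared_zp)])
```

If the converter instead emits Concatenation with mismatched input q params, a backend that assumes shared params will interpret the raw int8 values incorrectly.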


LynnL4 commented May 31, 2023

I tried inserting a quantization operator after the nodes whose quantization parameters differ, and this solved the problem I encountered. However, some details are still not handled.
LynnL4@6f62020
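A hedged sketch of that graph-rewrite idea (toy data structures for illustration only, not the commit above or tinynn's real IR): walk the graph, and whenever a Concatenation input's (scale, zero_point) differs from the output's, splice in a Quantize node that re-quantizes the input to the shared parameters.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    scale: float
    zero_point: int

@dataclass
class Node:
    op: str
    inputs: list
    outputs: list

def unify_concat_qparams(graph):
    """Toy pass: before each Concatenation whose inputs carry different
    quantization parameters, insert a Quantize (re-quantize) node so every
    input shares the output tensor's scale/zero-point."""
    new_nodes = []
    for node in graph:
        if node.op == "Concatenation":
            target = node.outputs[0]
            fixed_inputs = []
            for t in node.inputs:
                if (t.scale, t.zero_point) != (target.scale, target.zero_point):
                    requant = Tensor(t.name + "_requant", target.scale, target.zero_point)
                    new_nodes.append(Node("Quantize", [t], [requant]))
                    fixed_inputs.append(requant)
                else:
                    fixed_inputs.append(t)
            node.inputs = fixed_inputs
        new_nodes.append(node)
    return new_nodes
```

The "details not handled" likely include saturation when an input's real-value range exceeds the shared range, and avoiding redundant Quantize nodes when several consumers share one producer.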

peterjc123 added the bug label Jun 1, 2023
peterjc123 (Collaborator) commented:

Looks like a duplicate of #95 (comment)


peterjc123 commented Sep 6, 2023

[screenshot of the relevant part of the model graph]
I think this part of the model should not be quantized, not only because of this issue, but also because it contains ops that cannot be quantized, such as Pow.

peterjc123 added the awaiting response and duplicate labels Sep 6, 2023

LynnL4 commented Sep 6, 2023

If it is not quantized, it would be unfriendly to MCU-class devices such as the ESP32-S3. When the YOLOv5 project is exported to int8 TFLite, a quantization operator is also added in this case. However, it is important to ensure that the values of the input tensors are within the same magnitude to avoid losing a significant amount of precision. Tomorrow, I will try exporting YOLOv8 to TFLite to see what the situation is there. The YOLOv8 project uses onnx2tf.


peterjc123 commented Sep 6, 2023

> If not quantized, it would be unfriendly for devices like MCU, such as esp32s3.

Yes, but it seems the Pow op already has Quantize and Dequantize ops around it, which is not optimal. Even if #95 (comment) is resolved, we will add three more Quantize ops to perform re-quantization before the Concatenation op. Actually, I don't know how those ops perform with Float32 kernels on your hardware.

> However, it is important to ensure that the values of the input tensor are within the same magnitude to avoid losing a significant amount of precision.

Yes, I agree with you. Otherwise, we can just use the q params of the output of the Logistic op as the unified one.

> The YOLOv8 project uses onnx2tf.

I suppose they will add Quantize nodes before the Concatenation nodes, which may lead to the same result.
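One way to derive such a unified set of q params (a sketch under the assumption of per-tensor int8 quantization over [-128, 127]; this is not necessarily what tinynn or onnx2tf does) is to take the union of the real-value ranges covered by each input's parameters and fit a single scale/zero-point to it:

```python
def unified_qparams(qparams):
    """Given several int8 (scale, zero_point) pairs, return one pair whose
    representable real-value range covers the union of all input ranges."""
    lo = min((-128 - zp) * s for s, zp in qparams)   # smallest representable real value
    hi = max((127 - zp) * s for s, zp in qparams)    # largest representable real value
    scale = (hi - lo) / 255.0                        # 256 int8 levels span [lo, hi]
    zero_point = int(round(-128 - lo / scale))       # map real 0.0 near an integer code
    return scale, zero_point
```

Reusing the Logistic output's params directly, as suggested above, sidesteps this computation when that output's range already dominates the other inputs.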

peterjc123 (Collaborator) commented:

Let's continue the discussion in #95.
