
Inconsistency of CMSIS-NN Quantization Method (Q-format) with ARM Documentation #115

Open
42bPhD opened this issue Mar 4, 2024 · 4 comments

42bPhD commented Mar 4, 2024

Hello.

I am currently developing with the Q-format (Qm.n) quantization scheme. However, upon reviewing the revision history, I noticed that starting from version 4.1.0, the Q-format approach is no longer followed. My current approach aligns with the methods outlined in the following ARM documentation links:

While TensorFlow Lite for Microcontrollers employs zero-point and scale-factor quantization, which requires additional memory and floating-point operations, Q-format quantization appears better suited to Cortex-M processors given these constraints.

Could you kindly provide a clear explanation for the necessity of this change? The absence of discussion regarding its impact on speed and accuracy has left me somewhat perplexed. Any insight into the rationale behind this decision would be greatly appreciated, as it would aid in understanding the best practices for quantization within the context of TensorFlow Lite for Microcontrollers and CMSIS-NN.

Thank you for your time and consideration.

mansnils (Contributor) commented Mar 5, 2024

Hi @LEE-SEON-WOO,
You are referring to the legacy API. Development on that stopped a long time ago, long before CMSIS-NN v4.0.1. When CMSIS-NN got its own repository, those files were removed. CMSIS-NN is intended to be used with TFLM, as CMSIS-NN is a library and not a framework. The legacy API was not bit-exact with the TFLM reference kernels. The current API does not use any floating-point operations. Hope that makes it clearer.

42bPhD (Author) commented Mar 5, 2024

Hello @mansnils, thank you for your response. Unfortunately, my limited fluency in English may have caused some misunderstanding; I apologize for any confusion.

The primary reason I am raising this issue is that I believe the Q-format method of quantization could be effective on MCUs as well, and I am curious why it is not supported. For example, is it due to difficulties with accuracy or efficiency?

I understand that the quantization method provided by ARM takes the Q(m,n) form, while TFLM uses affine quantization based on the formula r = S·(q − Z) (where q is the quantized value, S the scale factor, Z the zero point, and r the real value). From my experience with ARM's method, it has several advantages. First, it runs faster because rescaling reduces to shift operations. Second, it requires less additional computation and smaller variables.
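The tradeoff described above can be sketched in C. This is an illustrative comparison only; `q7_mul` and `affine_mul` are hypothetical helper names, not CMSIS-NN APIs:

```c
#include <stdint.h>

/* Q-format (Qm.n): value = raw / 2^n. A multiply is just an integer
 * multiply followed by a right shift; no zero points are involved.
 * Illustrative only, not a CMSIS-NN function. */
static int8_t q7_mul(int8_t a, int8_t b, int n)
{
    return (int8_t)(((int32_t)a * b) >> n);
}

/* Affine (TFLM-style): real = scale * (q - zero_point). The zero
 * points must be subtracted/added per element, and the combined
 * scale is applied as an integer multiplier plus a shift. */
static int8_t affine_mul(int8_t qa, int32_t za, int8_t qb, int32_t zb,
                         int32_t mult, int shift, int32_t zout)
{
    int32_t acc = ((int32_t)qa - za) * ((int32_t)qb - zb);
    acc = (int32_t)(((int64_t)acc * mult) >> (31 - shift));  /* requantize */
    return (int8_t)(acc + zout);
}
```

For example, in Q3.4 format 1.5 is stored as 24 and 2.0 as 32, and `q7_mul(24, 32, 4)` yields 48, i.e. 3.0.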

As I only work with models suitable for MCUs, I am not sure how well this applies to larger models. Also, legacy APIs using the Q-format remain easily accessible in other open-source projects such as nnom.

Thank you.

@42bPhD 42bPhD closed this as completed Mar 5, 2024
mansnils (Contributor) commented Mar 5, 2024

Thanks for the link! Why can't NNOM use the new/existing CMSIS-NN API?
Please note that it also uses shift operations: the scales are converted to integer multipliers and shifts ahead of inference, before CMSIS-NN is called. Performance should also be better than with the legacy API, since development on it stopped a long time ago.

42bPhD (Author) commented Mar 6, 2024

Dear @mansnils,

Thank you for your response. Upon reviewing the CMSIS-NN documentation, I confirmed that the _s suffix indicates compatibility with TensorFlow Lite Micro. Additionally, when examining the algorithm, it is evident that functions such as MIN, MAX, and arm_nn_requantize are invoked. Compared to the previous _q7 algorithms, this appears to require more computation. Furthermore, I came across a statement made by the author of NNOM a while back and am writing to verify its authenticity. Here is the link for reference: link.

Best regards.
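For reference, the per-output pattern behind the MIN/MAX and requantize calls discussed in this thread can be sketched as follows. This is an illustrative simplification, not the actual arm_nn_requantize code, which uses a rounding doubling high multiply rather than a plain truncating shift:

```c
#include <stdint.h>

/* Requantize an int32 accumulator to int8: apply the integer
 * multiplier + shift, add the output zero point, then clamp to
 * the activation range. Simplified sketch; real kernels round
 * rather than truncate when shifting. */
static int8_t requantize_clamp(int32_t acc, int32_t mult, int shift,
                               int32_t out_zp,
                               int32_t act_min, int32_t act_max)
{
    int32_t v = (int32_t)(((int64_t)acc * mult) >> (31 - shift));
    v += out_zp;
    v = v > act_min ? v : act_min;   /* MAX(v, act_min) */
    v = v < act_max ? v : act_max;   /* MIN(v, act_max) */
    return (int8_t)v;
}
```

The MIN/MAX steps are simple compares fused into the output write, so the extra cost relative to a pure shift-based Q-format kernel is mainly the per-element zero-point handling and the wider multiply.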

@42bPhD 42bPhD reopened this Mar 6, 2024