
[Tensorflow] Question: PTQ and QAT #38

Closed
peiwenhuang27 opened this issue Oct 29, 2021 · 2 comments

Comments

@peiwenhuang27

Hi, may I ask some questions based on my understanding of the source code?

1. Conv2D

As far as I know, in post-training quantization Conv2D supports both the Conv2DBiasAddRelu and Conv2DBiasAddLeakyRelu patterns through FuseNodeStartWithConv2d.apply_conv_biasadd_relu_fusion(...). The key difference is that with Leaky ReLU, the quantized values cannot be passed directly to the next quantized Conv2D because of the positive-inputs constraint, so QuantizedConv2DWithBiasAndRelu first dequantizes to pass through Leaky ReLU and then quantizes again for the next QuantizedConv2DWithBiasAndRelu.

So, if I have a quantization-aware trained model with the Conv2DBiasAddLeakyRelu pattern, is it converted to a quantized model in the same manner? That is, regardless of the quantization method, in order to pass through Leaky ReLU, the predecessor node must first dequantize and the successor node must add a quantize input layer. Is that correct?
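To make sure I am describing the same thing, here is a rough sketch of the float round trip I have in mind, written with plain TensorFlow quantize/dequantize ops and made-up ranges (not the actual fusion code):

import tensorflow as tf

# Hypothetical float activations coming out of Conv2D + BiasAdd.
acts = tf.constant([[-1.5, 0.25, 3.0, -0.75]], dtype=tf.float32)

# Quantize, standing in for the output of a fused quantized Conv2D.
q_acts, q_min, q_max = tf.quantization.quantize(acts, -4.0, 4.0, tf.qint8)

# Leaky ReLU is not part of the fused op, so the graph falls back to float:
# dequantize -> LeakyReLU -> quantize again for the next quantized Conv2D.
deq = tf.quantization.dequantize(q_acts, q_min, q_max)
lrelu = tf.nn.leaky_relu(deq, alpha=0.2)
re_q, re_min, re_max = tf.quantization.quantize(lrelu, -4.0, 4.0, tf.qint8)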

2. LSTM

I noticed the following lines:

# FIXME We only quantize the MatMul op which second input node type is const. This is a
# workaround for RNN model like LTSM.
if weight_node.op != 'Const':
    self.output_graph = self.input_graph
    return []

Does this mean quantization for LSTM is currently not supported?
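For reference, this is roughly how I read that check, as a standalone sketch over a frozen GraphDef (the model path and the assumption that LSTM weights are produced by Enter/Identity ops are mine, not taken from the tool's code):

import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('frozen_lstm_model.pb', 'rb') as f:  # hypothetical path
    graph_def.ParseFromString(f.read())

node_by_name = {n.name: n for n in graph_def.node}

for node in graph_def.node:
    if node.op != 'MatMul':
        continue
    # The second input is the weight; in an LSTM cell it is often produced by
    # an Enter/Identity op inside the while loop rather than a plain Const,
    # so a check like the one quoted above would skip quantizing that MatMul.
    weight_node = node_by_name.get(node.input[1].split(':')[0])
    if weight_node is not None and weight_node.op != 'Const':
        print(node.name, 'weight op is', weight_node.op, '-> not quantized')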

Thanks!

deb-intel pushed a commit to deb-intel/lp-opt-tool that referenced this issue Nov 4, 2021
…ntel#38)

* Align the OneDnn rounding mode to tensorflow int32 bias conversion.

Signed-off-by: Zhang, Guoming <guoming.zhang@intel.com>

* Remove the redundant parentness.
@guomingz
Contributor

For the Conv2D question, I don't think there is a need to insert an additional dequantize/quantize pair before the next QuantizedConv2DWithBiasAndRelu, as this op supports s8 input.
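A tiny illustration of that point (the tensor values and ranges are made up for demonstration): a signed s8 (qint8) range can represent the negative Leaky ReLU outputs directly, so no float round trip is strictly required.

import tensorflow as tf

# Leaky ReLU output contains small negative values.
lrelu_out = tf.constant([[-0.3, 0.0, 1.2, 2.7]], dtype=tf.float32)

# A signed s8 (qint8) target range holds those negatives directly,
# so the next quantized Conv2D can consume them without dequantizing first.
q, q_min, q_max = tf.quantization.quantize(lrelu_out, -3.0, 3.0, tf.qint8)
print(q.numpy(), q_min.numpy(), q_max.numpy())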

For LSTM, I remember we have already supported LSTM models since the v1.6 release.

@ftian1
Contributor

ftian1 commented Jan 10, 2022

Closing it if there are no further questions.

ftian1 closed this as completed on Jan 10, 2022
VincyZhang pushed a commit that referenced this issue Feb 12, 2023
* fix example bugs

* fix language modeling issues

Co-authored-by: changwa1 <chang1.wang@intel.com>