
Issue on how to get started #85

Closed

danilopau opened this issue May 20, 2022 · 28 comments
@danilopau

Hello, on the main page there are instructions to follow. I put those in a notebook, but they don't work.
With pip install tensorflow, as you wrote, I got:

ValueError: Please install Intel® Optimizations for TensorFlow or MKL enabled TensorFlow from source code within version >=1.14.0 and <=2.8.0.
2022-05-20 14:25:56 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

Here https://www.intel.com/content/www/us/en/developer/articles/guide/optimization-for-tensorflow-installation-guide.html

it says to run
pip install intel-tensorflow==2.8.0
but then another error appears:
2022-05-20 14:31:58 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

Could you please be precise so that results are reproducible? The notebook is attached:
default_netcompressor.zip
Thanks

@ftian1
Contributor

ftian1 commented May 20, 2022

Hello. Please ensure the framework version you are using is one that INC supports.

From the error log "Please install Intel® Optimizations for TensorFlow or MKL enabled TensorFlow from source code within version >=1.14.0 and <=2.8.0", you did not install official TensorFlow with oneDNN enabled, nor Intel TensorFlow.

As for the second issue, could you please paste the whole log from after you installed intel-tensorflow 2.8.0?
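For a quick sanity check of the installed build before running INC, something like the following can help (a minimal sketch; IsMklEnabled is a TensorFlow-internal helper, not public API, so treat an import failure as inconclusive):

import tensorflow as tf

# The INC error above requires TensorFlow >=1.14.0 and <=2.8.0 with oneDNN/MKL enabled.
print(tf.__version__)

# Internal TF helper that reports whether this build was compiled with oneDNN/MKL;
# it is not public API and may move between releases.
from tensorflow.python.util import _pywrap_util_port
print(_pywrap_util_port.IsMklEnabled())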

@danilopau
Author

I am using your instructions from the main page https://github.com/intel/neural-compressor and they don't work as written.

The 1st issue disappeared with pip install intel-tensorflow==2.8.0.

For the second issue, the full log is below (by the way, you have the notebook attached here, so you can reproduce it):
default_netcompressor.zip

2022-05-20 14:31:52 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-20 14:31:53 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-20 14:31:53 [WARNING] Force convert framework model to neural_compressor model.
2022-05-20 14:31:53 [WARNING] Output tensor names should not be empty.
2022-05-20 14:31:53 [WARNING] Input tensor names should not be empty.
2022-05-20 14:31:53 [INFO] Generate a fake evaluation function.
2022-05-20 14:31:53 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 14:31:54 [INFO] ConvertLayoutOptimizer elapsed time: 10.68 ms
2022-05-20 14:31:54 [INFO] Pass GrapplerOptimizer elapsed time: 304.17 ms
2022-05-20 14:31:54 [INFO] Pass SwitchOptimizer elapsed time: 4.09 ms
2022-05-20 14:31:54 [INFO] Pass RemoveTrainingNodesOptimizer elapsed time: 3.5 ms
2022-05-20 14:31:54 [INFO] Pass SplitSharedInputOptimizer elapsed time: 2.0 ms
2022-05-20 14:31:54 [INFO] Pass GraphFoldConstantOptimizer elapsed time: 1.9 ms
2022-05-20 14:31:54 [INFO] Pass FuseColumnWiseMulOptimizer elapsed time: 3.59 ms
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/util.py:322: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2022-05-20 14:31:54 [WARNING] From /usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/util.py:322: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2022-05-20 14:31:54 [INFO] Pass StripUnusedNodesOptimizer elapsed time: 23.08 ms
2022-05-20 14:31:54 [INFO] Pass GraphCseOptimizer elapsed time: 3.5 ms
2022-05-20 14:31:54 [INFO] Pass FoldBatchNormNodesOptimizer elapsed time: 3.52 ms
2022-05-20 14:31:54 [INFO] Pass UpdateEnterOptimizer elapsed time: 1.58 ms
2022-05-20 14:31:54 [INFO] Pass ConvertLeakyReluOptimizer elapsed time: 3.19 ms
2022-05-20 14:31:54 [INFO] Pass ConvertAddToBiasAddOptimizer elapsed time: 3.43 ms
2022-05-20 14:31:54 [INFO] Pass FuseTransposeReshapeOptimizer elapsed time: 3.77 ms
2022-05-20 14:31:54 [INFO] Pass FuseConvWithMathOptimizer elapsed time: 3.53 ms
2022-05-20 14:31:54 [INFO] Pass ExpandDimsOptimizer elapsed time: 13.64 ms
2022-05-20 14:31:54 [INFO] Pass InjectDummyBiasAddOptimizer elapsed time: 4.6 ms
2022-05-20 14:31:54 [INFO] Pass MoveSqueezeAfterReluOptimizer elapsed time: 2.71 ms
2022-05-20 14:31:55 [INFO] Pass Pre Optimization elapsed time: 959.05 ms
2022-05-20 14:31:55 [INFO] Get FP32 model baseline.
2022-05-20 14:31:55 [INFO] Save tuning history to /content/nc_workspace/2022-05-20_14-31-52/./history.snapshot.
2022-05-20 14:31:55 [INFO] FP32 baseline is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-20 14:31:55 [CRITICAL] Please set environment variable TF_ENABLE_ONEDNN_OPTS=1 when Tensorflow 2.6.x installed.
2022-05-20 14:31:55 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 14:31:55 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 14:31:56 [INFO] Pass Quantization elapsed time: 563.2 ms
2022-05-20 14:31:57 [ERROR] Fail to quantize graph due to fileno.
2022-05-20 14:31:57 [WARNING] Fail to forward with batch size=1, set to 1 now.
2022-05-20 14:31:57 [CRITICAL] Please set environment variable TF_ENABLE_ONEDNN_OPTS=1 when Tensorflow 2.6.x installed.
2022-05-20 14:31:57 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 14:31:57 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 14:31:57 [INFO] Pass Quantization elapsed time: 341.39 ms
2022-05-20 14:31:58 [ERROR] Fail to quantize graph due to fileno.
2022-05-20 14:31:58 [ERROR] Unexpected exception UnsupportedOperation('fileno') happened during tuning.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tensorflow.py", line 514, in quantize
data_loader=data_loader).convert()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 240, in convert
model = self.quantize()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 544, in quantize
with CaptureOutputToFile(tmp_dump_file):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 374, in init
self.orig_stream_fileno = stream.fileno()
io.UnsupportedOperation: fileno

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/experimental/quantization.py", line 151, in execute
self.strategy.traverse()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/strategy/strategy.py", line 392, in traverse
tune_cfg, self.model, self.calib_dataloader, self.q_func)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 242, in fi
res = func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tensorflow.py", line 529, in quantize
data_loader=data_loader).convert()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 240, in convert
model = self.quantize()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 544, in quantize
with CaptureOutputToFile(tmp_dump_file):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 374, in init
self.orig_stream_fileno = stream.fileno()
io.UnsupportedOperation: fileno
2022-05-20 14:31:58 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

@danilopau
Author

danilopau commented May 20, 2022

Your instructions at https://github.com/intel/neural-compressor are not reproducible as they are.
I am not interested in the accuracy;
I just need the quantized ONNX output model.

@danilopau
Author

If I add 2 more cells:

An ONNX Example

!pip install onnx==1.9.0 onnxruntime==1.10.0 onnxruntime-extensions

Prepare fp32 model

!wget https://github.com/onnx/models/blob/main/vision/classification/resnet/model/resnet50-v1-12.onnx

from neural_compressor.experimental import Quantization, common
tf.compat.v1.disable_eager_execution()  # tf was imported in an earlier notebook cell
quantizer = Quantization()
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()

I got:

2022-05-20 15:01:37 [WARNING] Force convert framework model to neural_compressor model.

AssertionError Traceback (most recent call last)
in <module>()
2 tf.compat.v1.disable_eager_execution()
3 quantizer = Quantization()
----> 4 quantizer.model = './resnet50-v1-12.onnx'
5 dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
6 quantizer.calib_dataloader = common.DataLoader(dataset)

2 frames
/usr/local/lib/python3.7/dist-packages/neural_compressor/model/model.py in get_model_fwk_name(model)
203 if fwk_name != 'NA':
204 break
--> 205 assert fwk_name != 'NA', 'Framework is not detected correctly from model format.'
206
207 return fwk_name

AssertionError: Framework is not detected correctly from model format.

@danilopau
Author

Here is the Colab notebook:
default_netcompressor.zip

@ftian1
Contributor

ftian1 commented May 20, 2022

  1. The INC TensorFlow backend doesn't support Jupyter notebooks yet, because Jupyter captures the system io while INC needs access to the standard io streams for calibration.
  2. Please don't forget to set TF_ENABLE_ONEDNN_OPTS=1 if you are using official TensorFlow; Intel TensorFlow does not need this environment variable (see the sketch below this list).
  3. Please strictly follow the examples' README instructions to reproduce results; if you still have problems, please let us know.
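As a minimal sketch of point 2, assuming a plain Python script (the variable has to be set before TensorFlow is imported in order to take effect):

import os

# Enable oneDNN optimizations in the official TensorFlow build; intel-tensorflow
# ships with them enabled, so this is only needed for the stock pip package.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf  # imported only after the variable is set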

@ftian1
Contributor

ftian1 commented May 20, 2022

As for your ONNX model error, it fails when ONNX Runtime loads the model:

"onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /home/ftian/resnet50-v1-12.onnx failed:Protobuf parsing failed."

It looks like a compatibility issue between ONNX Runtime and this ONNX model; it's not an INC issue.
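One quick way to verify that the downloaded file is a valid ONNX protobuf at all (a minimal sketch; note that the /blob/ GitHub URL used earlier returns an HTML page rather than the raw model file, which yields exactly this protobuf parsing error):

import onnx

# onnx.load raises a protobuf decode error when the file is not a real ONNX
# model, e.g. when wget saved GitHub's HTML page instead of the model bytes.
model = onnx.load("./resnet50-v1-12.onnx")
onnx.checker.check_model(model)
print("model parsed OK")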

@danilopau
Author

It looks like a compatibility issue between ONNX Runtime and this ONNX model; it's not an INC issue.

So why do you point to these instructions on your main page if they are not accurately reproducible?

@ftian1
Contributor

ftian1 commented May 20, 2022

So why do you point to these instructions on your main page if they are not accurately reproducible?

I am totally confused; could you please let me know which instruction/example is not reproducible?

@danilopau
Author

You are confused because you did not see the notebook I shared twice, which contains the instructions from https://github.com/intel/neural-compressor.
As I said, a simple Colab notebook would be reproducible, while the instructions are not.

@danilopau
Author

What I need is:

  1. a simple Colab notebook,
    or
  2. a Docker image to install and run an fp32 network example, preferably from PyTorch.

It seems to me that neither of the two is available from the main page.

@ftian1
Contributor

ftian1 commented May 20, 2022

Danilo, I saw your notebook and mentioned the root cause in this thread. This is a known issue and was raised before in #35.

If you are saying the instructions of some examples are wrong or cannot be reproduced, please let me know the exact place.

The main page of INC is just used to tell users how to install INC. That's why I am confused by the statement that the "main page is not reproducible".

For detailed examples, please refer to the examples/ directory. Those are all reproducible in a Linux bash environment.

As for Jupyter notebooks, only TensorFlow models have this issue (a reproduction sketch follows below). I would suggest following the instructions in the examples/ directory for model downloading and quantization; you can run the same instructions in a Jupyter notebook without issue.
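To illustrate that limitation: Jupyter replaces sys.stdout with a stream that has no OS-level file descriptor, so fileno() raises exactly the io.UnsupportedOperation seen in the tracebacks above. A minimal stand-alone reproduction (io.StringIO stands in for Jupyter's wrapped stdout):

import io

buf = io.StringIO()  # like Jupyter's stdout, this stream has no file descriptor
try:
    buf.fileno()
except io.UnsupportedOperation as e:
    print("fileno unsupported:", e)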

@mengniwang95
Collaborator

@danilopau Hi, please try wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx to download the ONNX model file; the /raw/ URL serves the model bytes, while the earlier /blob/ URL returns an HTML page. Sorry for the link error; we will update it soon.

@danilopau
Author

@mengniwang95 thanks.
I tried the following sequence:

!python --version >> Python 3.7.13
pip install neural-compressor
pip install onnx==1.9.0 onnxruntime==1.10.0 onnxruntime-extensions
wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
from neural_compressor.experimental import Quantization, common
quantizer = Quantization()
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()

and the issue is:

2022-05-21 08:08:35 [INFO] NumExpr defaulting to 2 threads.
2022-05-21 08:08:36 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-21 08:08:36 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-21 08:08:36 [WARNING] Force convert framework model to neural_compressor model.

AssertionError Traceback (most recent call last)
in <module>()
4 quantizer = Quantization()
5 quantizer.model = './resnet50-v1-12.onnx'
----> 6 dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
7 quantizer.calib_dataloader = common.DataLoader(dataset)
8 quantizer.fit()

1 frames
/usr/local/lib/python3.7/dist-packages/neural_compressor/experimental/data/datasets/dataset.py in __init__(self, framework)
128 "mxnet", "onnxrt_qlinearops", "onnxrt_integerops",
129 "pytorch", "pytorch_ipex", "pytorch_fx", "engine"],
--> 130 "framework support tensorflow pytorch mxnet onnxrt engine"
131 self.datasets = framework_datasets[framework].datasets
132
132

AssertionError: framework support tensorflow pytorch mxnet onnxrt engine

@mengniwang95
Collaborator

mengniwang95 commented May 21, 2022

Hi @danilopau, for ONNX models we support 3 framework types: onnxrt_qlinearops, onnxrt_integerops and onnxrt_qdqops; they use different int8 ops.
Currently, model type detection can only confirm that your model is an ONNX model, so it sets the framework type to 'onnxruntime', and sorry, we don't register datasets for 'onnxruntime'. Thank you for your feedback; I will fix this bug soon.
As a quick fix, you can add 2 lines of code:

from neural_compressor import conf
conf.model.framework = 'onnxrt_qlinearops'
quantizer = Quantization(conf)

@danilopau
Author

@mengniwang95
Appreciated. I hope I have understood your advice correctly; see below.

python --version >> Python 3.7.13
pip install neural-compressor
pip install onnx==1.9.0 onnxruntime==1.10.0 onnxruntime-extensions
wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
from neural_compressor import conf
from neural_compressor.experimental import Quantization, common
conf.model.framework = 'onnxrt_qlinearops'
quantizer = Quantization(conf)
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()

Unfortunately I got another error.
Question: is there a way to get the quantized ONNX model even if max trials is reached or the target accuracy is not met,
so that I can debug my importer regardless of the accuracy, please?


env: TF_ENABLE_ONEDNN_OPTS=1
2022-05-20 15:22:56 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-20 15:22:57 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-20 15:22:57 [WARNING] Force convert framework model to neural_compressor model.
2022-05-20 15:22:57 [WARNING] Output tensor names should not be empty.
2022-05-20 15:22:57 [WARNING] Input tensor names should not be empty.
2022-05-20 15:22:57 [INFO] Generate a fake evaluation function.
2022-05-20 15:22:57 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 15:22:57 [INFO] ConvertLayoutOptimizer elapsed time: 0.43 ms
2022-05-20 15:22:57 [INFO] Pass GrapplerOptimizer elapsed time: 145.32 ms
2022-05-20 15:22:57 [INFO] Pass SwitchOptimizer elapsed time: 3.39 ms
2022-05-20 15:22:57 [INFO] Pass RemoveTrainingNodesOptimizer elapsed time: 1.88 ms
2022-05-20 15:22:57 [INFO] Pass SplitSharedInputOptimizer elapsed time: 2.03 ms
2022-05-20 15:22:57 [INFO] Pass GraphFoldConstantOptimizer elapsed time: 2.38 ms
2022-05-20 15:22:57 [INFO] Pass FuseColumnWiseMulOptimizer elapsed time: 3.59 ms
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/util.py:322: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2022-05-20 15:22:57 [WARNING] From /usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/util.py:322: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
2022-05-20 15:22:57 [INFO] Pass StripUnusedNodesOptimizer elapsed time: 13.03 ms
2022-05-20 15:22:57 [INFO] Pass GraphCseOptimizer elapsed time: 4.26 ms
2022-05-20 15:22:57 [INFO] Pass FoldBatchNormNodesOptimizer elapsed time: 4.09 ms
2022-05-20 15:22:57 [INFO] Pass UpdateEnterOptimizer elapsed time: 2.17 ms
2022-05-20 15:22:57 [INFO] Pass ConvertLeakyReluOptimizer elapsed time: 4.2 ms
2022-05-20 15:22:57 [INFO] Pass ConvertAddToBiasAddOptimizer elapsed time: 2.63 ms
2022-05-20 15:22:57 [INFO] Pass FuseTransposeReshapeOptimizer elapsed time: 4.31 ms
2022-05-20 15:22:58 [INFO] Pass FuseConvWithMathOptimizer elapsed time: 4.37 ms
2022-05-20 15:22:58 [INFO] Pass ExpandDimsOptimizer elapsed time: 3.97 ms
2022-05-20 15:22:58 [INFO] Pass InjectDummyBiasAddOptimizer elapsed time: 4.17 ms
2022-05-20 15:22:58 [INFO] Pass MoveSqueezeAfterReluOptimizer elapsed time: 2.09 ms
2022-05-20 15:22:58 [INFO] Pass Pre Optimization elapsed time: 431.36 ms
2022-05-20 15:22:58 [INFO] Get FP32 model baseline.
2022-05-20 15:22:58 [INFO] Save tuning history to /content/nc_workspace/2022-05-20_15-22-56/./history.snapshot.
2022-05-20 15:22:58 [INFO] FP32 baseline is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-20 15:22:58 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 15:22:58 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 15:22:59 [INFO] Pass Quantization elapsed time: 401.98 ms
2022-05-20 15:22:59 [ERROR] Fail to quantize graph due to fileno.
2022-05-20 15:22:59 [WARNING] Fail to forward with batch size=1, set to 1 now.
2022-05-20 15:22:59 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 15:22:59 [WARNING] Found possible input node names: ['input'], output node names: ['MobilenetV1/Predictions/Reshape_1'].
2022-05-20 15:23:00 [INFO] Pass Quantization elapsed time: 363.88 ms
2022-05-20 15:23:00 [ERROR] Fail to quantize graph due to fileno.
2022-05-20 15:23:00 [ERROR] Unexpected exception UnsupportedOperation('fileno') happened during tuning.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tensorflow.py", line 514, in quantize
data_loader=data_loader).convert()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 240, in convert
model = self.quantize()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 544, in quantize
with CaptureOutputToFile(tmp_dump_file):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 374, in init
self.orig_stream_fileno = stream.fileno()
io.UnsupportedOperation: fileno

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/experimental/quantization.py", line 151, in execute
self.strategy.traverse()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/strategy/strategy.py", line 392, in traverse
tune_cfg, self.model, self.calib_dataloader, self.q_func)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 242, in fi
res = func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tensorflow.py", line 529, in quantize
data_loader=data_loader).convert()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 240, in convert
model = self.quantize()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/tf_utils/graph_converter.py", line 544, in quantize
with CaptureOutputToFile(tmp_dump_file):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 374, in init
self.orig_stream_fileno = stream.fileno()
io.UnsupportedOperation: fileno
2022-05-20 15:23:00 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

@mengniwang95
Collaborator

I am a little confused; your log shows you are quantizing a TF model, not an ONNX model.
To get the quantized model regardless of the accuracy goal, you can add:

conf.tuning.exit_policy.performance_only = True

@danilopau
Author

@mengniwang95
I just reran the lines above without your latest advice, and I confirm the issue:
2022-05-21 10:20:21 [INFO] NumExpr defaulting to 2 threads.
2022-05-21 10:20:22 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-21 10:20:22 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-21 10:20:23 [WARNING] Force convert framework model to neural_compressor model.
2022-05-21 10:20:26 [INFO] Generate a fake evaluation function.
2022-05-21 10:20:28 [INFO] Get FP32 model baseline.
2022-05-21 10:20:28 [INFO] Save tuning history to /content/nc_workspace/2022-05-21_10-20-22/./history.snapshot.
2022-05-21 10:20:28 [INFO] FP32 baseline is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-21 10:20:31 [WARNING] Fail to forward with batch size=1, set to 1 now.
2022-05-21 10:20:33 [ERROR] Unexpected exception InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices\n index: 1 Got: 224 Expected: 3\n index: 3 Got: 3 Expected: 224\n Please fix either the inputs or the model.') happened during tuning.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 124, in quantize
quantize_config, tmp_iterations)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 334, in _get_quantize_params
quantize_params = augment.dump_calibration()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 358, in dump_calibration
node_output_names, output_dicts_list = self.get_intermediate_outputs()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 234, in get_intermediate_outputs
intermediate_outputs.append(session.run(None, ort_inputs))
File "/usr/local/lib/python3.7/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices
index: 1 Got: 224 Expected: 3
index: 3 Got: 3 Expected: 224
Please fix either the inputs or the model.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/experimental/quantization.py", line 151, in execute
self.strategy.traverse()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/strategy/strategy.py", line 392, in traverse
tune_cfg, self.model, self.calib_dataloader, self.q_func)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 242, in fi
res = func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 131, in quantize
quantize_config, calib_sampling_size)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 334, in _get_quantize_params
quantize_params = augment.dump_calibration()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 358, in dump_calibration
node_output_names, output_dicts_list = self.get_intermediate_outputs()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 234, in get_intermediate_outputs
intermediate_outputs.append(session.run(None, ort_inputs))
File "/usr/local/lib/python3.7/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices
index: 1 Got: 224 Expected: 3
index: 3 Got: 3 Expected: 224
Please fix either the inputs or the model.
2022-05-21 10:20:33 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

@danilopau
Author

And following your advice I ran:

from neural_compressor import conf
from neural_compressor.experimental import Quantization, common
conf.tuning.exit_policy.performance_only = True
conf.model.framework = 'onnxrt_qlinearops'
quantizer = Quantization(conf)
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()


2022-05-21 10:24:21 [INFO] NumExpr defaulting to 2 threads.
2022-05-21 10:24:22 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-21 10:24:22 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-21 10:24:22 [WARNING] Force convert framework model to neural_compressor model.
2022-05-21 10:24:25 [INFO] Generate a fake evaluation function.
2022-05-21 10:24:27 [INFO] Get FP32 model baseline.
2022-05-21 10:24:27 [INFO] Save tuning history to /content/nc_workspace/2022-05-21_10-24-22/./history.snapshot.
2022-05-21 10:24:27 [INFO] FP32 baseline is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-21 10:24:30 [WARNING] Fail to forward with batch size=1, set to 1 now.
2022-05-21 10:24:32 [ERROR] Unexpected exception InvalidArgument('[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices\n index: 1 Got: 224 Expected: 3\n index: 3 Got: 3 Expected: 224\n Please fix either the inputs or the model.') happened during tuning.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 124, in quantize
quantize_config, tmp_iterations)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 334, in _get_quantize_params
quantize_params = augment.dump_calibration()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 358, in dump_calibration
node_output_names, output_dicts_list = self.get_intermediate_outputs()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 234, in get_intermediate_outputs
intermediate_outputs.append(session.run(None, ort_inputs))
File "/usr/local/lib/python3.7/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices
index: 1 Got: 224 Expected: 3
index: 3 Got: 3 Expected: 224
Please fix either the inputs or the model.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/experimental/quantization.py", line 151, in execute
self.strategy.traverse()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/strategy/strategy.py", line 392, in traverse
tune_cfg, self.model, self.calib_dataloader, self.q_func)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/utils/utility.py", line 242, in fi
res = func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 131, in quantize
quantize_config, calib_sampling_size)
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/onnxrt.py", line 334, in _get_quantize_params
quantize_params = augment.dump_calibration()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 358, in dump_calibration
node_output_names, output_dicts_list = self.get_intermediate_outputs()
File "/usr/local/lib/python3.7/dist-packages/neural_compressor/adaptor/ox_utils/onnxrt_mid.py", line 234, in get_intermediate_outputs
intermediate_outputs.append(session.run(None, ort_inputs))
File "/usr/local/lib/python3.7/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: data for the following indices
index: 1 Got: 224 Expected: 3
index: 3 Got: 3 Expected: 224
Please fix either the inputs or the model.
2022-05-21 10:24:32 [ERROR] Specified timeout or max trials is reached! Not found any quantized model which meet accuracy goal. Exit.

@danilopau
Author

@mengniwang95

With the sequence below, under Python 3.7.13, I was able to generate the optimized, renamed and augmented ONNX models.
Unfortunately these are not quantized. How can I obtain an ONNX model with quantized standard ONNX layers, please?

pip install neural-compressor

An ONNX Example

!pip install onnx==1.9.0 onnxruntime==1.10.0 onnxruntime-extensions

Prepare fp32 model

!wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
from neural_compressor import conf
from neural_compressor.experimental import Quantization, common

conf.model.framework = 'onnxrt_qlinearops'
conf.tuning.exit_policy.performance_only = True
quantizer = Quantization(conf)
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 224, 224, 3))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()

@mengniwang95
Collaborator

@danilopau hi, this error is caused by your input shape; according to the model it should be (1, 3, 224, 224) (NCHW), not (1, 224, 224, 3).
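A quick way to confirm the expected input layout directly from the model file (a minimal sketch using the onnx package already installed above; the input name 'data' comes from the error log):

import onnx

# Print each graph input with its declared dimensions; for this ResNet the
# input 'data' is NCHW, i.e. (N, 3, 224, 224).
model = onnx.load("./resnet50-v1-12.onnx")
for inp in model.graph.input:
    dims = [d.dim_value or d.dim_param for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)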

@danilopau
Author

@mengniwang95
I did that, but the saved ONNX models (renamed, optimized, augmented) are still float. Where is the quantized ONNX model saved, please?
Python 3.7.13

An ONNX Example

!pip install onnx==1.9.0 onnxruntime==1.10.0 onnxruntime-extensions

Prepare fp32 model

!wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
from neural_compressor import conf
from neural_compressor.experimental import Quantization, common

conf.model.framework = 'onnxrt_qlinearops'
#conf.tuning.exit_policy.performance_only = True
quantizer = Quantization("./conf.yaml")
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 3, 224, 224))
quantizer.calib_dataloader = common.DataLoader(dataset)
quantizer.fit()

@danilopau
Author

The conf.yaml is:

version: 1.0

model:
  name: resnet50-v1-12
  framework: onnxrt_qlinearops

evaluation:
  accuracy:
    metric:
      topk: 1

tuning:
  accuracy_criterion:
    relative: 0.89

@mengniwang95
Collaborator

Did it raise any error info?
Also, you need to perform the save operation explicitly:

model = quantizer.fit()
model.save(output_path)  # e.g. output_path = './resnet_quantized.onnx'

@danilopau
Author

@mengniwang95 thanks again for your great support.
I did:

from neural_compressor import conf
from neural_compressor.experimental import Quantization, common

conf.model.framework = 'onnxrt_qlinearops'
#conf.tuning.exit_policy.performance_only = True
quantizer = Quantization("./conf.yaml")
quantizer.model = './resnet50-v1-12.onnx'
dataset = quantizer.dataset('dummy', shape=(1, 3, 224, 224))
quantizer.calib_dataloader = common.DataLoader(dataset)
model = quantizer.fit()
model.save("./resnet_quantized.onnx")  # weights are float unfortunately; there are no QConv nodes there

---- log is ---

2022-05-22 14:10:40 [INFO] NumExpr defaulting to 2 threads.
2022-05-22 14:10:43 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-05-22 14:10:43 [INFO] Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
2022-05-22 14:10:44 [WARNING] Force convert framework model to neural_compressor model.
2022-05-22 14:10:50 [INFO] Generate a fake evaluation function.
2022-05-22 14:10:53 [INFO] Get FP32 model baseline.
2022-05-22 14:10:53 [INFO] Save tuning history to /content/drive/MyDrive/Colab/neuralcompressor/onnx/nc_workspace/2022-05-22_14-10-43/./history.snapshot.
2022-05-22 14:10:53 [INFO] FP32 baseline is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-22 14:11:07 [INFO] |*Mixed Precision Statistics|
2022-05-22 14:11:07 [INFO] +----------------------+--------+-------+
2022-05-22 14:11:07 [INFO] | Op Type | Total | INT8 |
2022-05-22 14:11:07 [INFO] +----------------------+--------+-------+
2022-05-22 14:11:07 [INFO] | Conv | 53 | 53 |
2022-05-22 14:11:07 [INFO] | MatMul | 1 | 1 |
2022-05-22 14:11:07 [INFO] | MaxPool | 1 | 1 |
2022-05-22 14:11:07 [INFO] | GlobalAveragePool | 1 | 1 |
2022-05-22 14:11:07 [INFO] | Add | 17 | 17 |
2022-05-22 14:11:07 [INFO] | QuantizeLinear | 2 | 2 |
2022-05-22 14:11:07 [INFO] | DequantizeLinear | 2 | 2 |
2022-05-22 14:11:07 [INFO] +----------------------+--------+-------+
2022-05-22 14:11:08 [INFO] Pass quantize model elapsed time: 14732.17 ms
2022-05-22 14:11:08 [INFO] Tune 1 result is: [Accuracy (int8|fp32): 1.0000|1.0000, Duration (seconds) (int8|fp32): 0.0000|0.0000], Best tune result is: [Accuracy: 1.0000, Duration (seconds): 0.0000]
2022-05-22 14:11:08 [INFO] |Tune Result Statistics|
2022-05-22 14:11:08 [INFO] +--------------------+----------+---------------+------------------+
2022-05-22 14:11:08 [INFO] | Info Type | Baseline | Tune 1 result | Best tune result |
2022-05-22 14:11:08 [INFO] +--------------------+----------+---------------+------------------+
2022-05-22 14:11:08 [INFO] | Accuracy | 1.0000 | 1.0000 | 1.0000 |
2022-05-22 14:11:08 [INFO] | Duration (seconds) | 0.0000 | 0.0000 | 0.0000 |
2022-05-22 14:11:08 [INFO] +--------------------+----------+---------------+------------------+
2022-05-22 14:11:08 [INFO] Save tuning history to /content/drive/MyDrive/Colab/neuralcompressor/onnx/nc_workspace/2022-05-22_14-10-43/./history.snapshot.
2022-05-22 14:11:08 [INFO] Specified timeout or max trials is reached! Found a quantized model which meet accuracy goal. Exit.
2022-05-22 14:11:08 [INFO] Save deploy yaml to /content/drive/MyDrive/Colab/neuralcompressor/onnx/nc_workspace/2022-05-22_14-10-43/deploy.yaml
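For reference, a quick check of whether a saved model actually contains quantized operators (a minimal sketch; the path matches the save call above):

import onnx
from collections import Counter

# Count operator types in the saved graph; a quantized model should show
# QLinearConv / QLinearMatMul / QuantizeLinear nodes rather than float Conv.
model = onnx.load("./resnet_quantized.onnx")
print(Counter(node.op_type for node in model.graph.node))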

@mengniwang95
Collaborator

@danilopau Hi, sorry for my late reply. I tried in my local env with your code + yaml, and the saved model has int8 nodes like below:
[screenshot: quantized graph showing int8 nodes]

@danilopau
Author

@mengniwang95
Thanks again. Could you kindly point me to instructions on how to set up your local environment on Linux?

@mengniwang95
Collaborator

Actually the process is the same as what you set up in Colab; just install the needed Python packages.
