Description
Hi Team,
I am trying to quantize a BERT Seq2Seq model (an ONNX model) with Intel Neural Compressor.
The process gets stuck after the FP32 baseline step, and no quantization happens. I even relaxed the relative accuracy criterion up to 0.80 (80%). Please guide.
Stuck at this:
2022-11-09 06:56:20 [INFO] Get FP32 model baseline.
Accuracy is : 50.2
mean: 16.99 ms, std: 2.8 ms, min: 12.7 ms, max: 33.83 ms, median: 16.31 ms, 95p: 21.73 ms, 99p: 26.9 ms
2022-11-09 06:56:58 [INFO] Save tuning history to /home/jupyter/spell_projects/nc_workspace/2022-11-09_06-56-07/./history.snapshot.
2022-11-09 06:56:58 [INFO] FP32 baseline is: [Accuracy: 50.2000, Duration (seconds): 38.4878]
No update after this.
YAML Property File
version: 1.0

model:                                    # mandatory. used to specify model specific information.
  name: bert
  framework: onnxrt_integerops            # mandatory. possible values are tensorflow, mxnet, pytorch, pytorch_ipex, onnxrt_integerops and onnxrt_qlinearops.

quantization:
  approach: post_training_dynamic_quant   # optional. default value is post_training_static_quant.

tuning:
  accuracy_criterion:
    relative: 0.80                        # optional. the criterion defaults to relative; the other option is absolute. this setting allows up to 80% relative accuracy loss.
  exit_policy:
    timeout: 0                            # optional. tuning timeout (seconds). default value is 0, which means early stop. combine with the max_trials field to decide when to exit.
    max_trials: 10
  random_seed: 5271
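For reference, this is how I understand the relative criterion is applied (a minimal sketch of the arithmetic; the formula is my assumption about Neural Compressor's behaviour, not taken from its source):

# Assumed interpretation of accuracy_criterion.relative for a higher-is-better metric.
baseline_accuracy = 50.2                          # FP32 baseline from the log above
relative_loss_allowed = 0.80                      # value set in the YAML
acceptable_accuracy = baseline_accuracy * (1 - relative_loss_allowed)
print(acceptable_accuracy)                        # 10.04, so almost any candidate should pass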
import os

import onnx
import transformers
from neural_compressor.experimental import Quantization, common

def inc_quantize(MODEL_CONFIG):
    # Load the exported FP32 ONNX model.
    ONNX_MODEL_FILE = os.path.join(MODEL_CONFIG['ONNX_MODEL_PATH'], MODEL_CONFIG['ONNX_MODEL_FILE_NAME'])
    model = onnx.load(ONNX_MODEL_FILE)

    # Build the quantizer from the YAML config; override the relative accuracy criterion.
    quantizer = Quantization(MODEL_CONFIG['QUANT_CONFIG'])
    quantizer.cfg.tuning.accuracy_criterion.relative = 0.8
    quantizer.model = common.Model(model)
    quantizer.tokenizer = transformers.AutoTokenizer.from_pretrained(
        MODEL_CONFIG['BERT_MODEL_TYPE'])
    quantizer.eval_func = run_evaluation_benchmark
    q_model = quantizer()

    # Save the quantized model next to the FP32 one.
    QUANTIZED_SAVE_PATH = os.path.join(MODEL_CONFIG['ONNX_MODEL_PATH'], MODEL_CONFIG['ONNX_QUANTIZED_FILE_NAME'])
    q_model.save(QUANTIZED_SAVE_PATH)
    print("QUANTIZED MODEL Saved at: ", QUANTIZED_SAVE_PATH)