
ONNX Model Dynamic Quantisation Stuck #128

@NitinAggarwal1

Description


Hi Team,

I am trying to quantise a BERT Seq2Seq model (an ONNX model) with Intel Neural Compressor.

It gets stuck after the FP32 baseline stage and no quantisation happens. I have tried relaxing the relative accuracy criterion up to 80%. Please guide.

It is stuck at this output:

2022-11-09 06:56:20 [INFO] Get FP32 model baseline.
Accuracy is : 50.2
mean: 16.99 ms, std: 2.8 ms, min: 12.7 ms, max: 33.83 ms, median: 16.31 ms, 95p: 21.73 ms, 99p: 26.9 ms
2022-11-09 06:56:58 [INFO] Save tuning history to /home/jupyter/spell_projects/nc_workspace/2022-11-09_06-56-07/./history.snapshot.
2022-11-09 06:56:58 [INFO] FP32 baseline is: [Accuracy: 50.2000, Duration (seconds): 38.4878]

There is no update after this.
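For context on what the 0.80 setting in the config below means, here is a small sketch (my reading of the accuracy_criterion semantics, not Neural Compressor's actual code) of how a relative criterion of 0.80 would be applied to the 50.2 baseline from the log:

```python
def passes_relative_criterion(baseline_acc, quantized_acc, relative_loss):
    """Sketch of the relative accuracy check: a tuned model is accepted
    when its accuracy drop, measured relative to the FP32 baseline,
    stays within the configured fraction."""
    return (baseline_acc - quantized_acc) / baseline_acc <= relative_loss

# With the FP32 baseline of 50.2 from the log and relative: 0.80,
# roughly any accuracy above 50.2 * (1 - 0.80) ~= 10.0 is accepted.
print(passes_relative_criterion(50.2, 40.0, 0.80))   # well within tolerance
print(passes_relative_criterion(50.2, 5.0, 0.80))    # ~90% drop, rejected
```

So with this setting almost any quantised model should pass; a hang at this point is therefore unlikely to come from the criterion being too strict.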

YAML Property File

version: 1.0

model:                                      # mandatory. used to specify model specific information.
  name: bert
  framework: onnxrt_integerops              # mandatory. possible values are tensorflow, mxnet, pytorch, pytorch_ipex, onnxrt_integerops and onnxrt_qlinearops.

quantization:
  approach: post_training_dynamic_quant     # optional. default value is post_training_static_quant.

tuning:
  accuracy_criterion:
    relative: 0.80                          # optional. default criterion is relative; the other option is absolute. this example allows up to 80% relative accuracy loss.
  exit_policy:
    timeout: 0                              # optional. tuning timeout (seconds). default value is 0, which means early stop; combine with max_trials to decide when to exit.
    max_trials: 10
  random_seed: 5271
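Since this YAML is indentation-sensitive (and indentation is easily lost when pasting), it may help to see the nested structure the file must parse into. This is a sketch of the equivalent structure as a plain Python dict, following the Neural Compressor 1.x YAML layout used above:

```python
# Nested structure the YAML above must resolve to when loaded.
# Indentation in the file is what creates this nesting, so a
# flattened copy will not produce the sections the tuner expects.
config = {
    "version": 1.0,
    "model": {
        "name": "bert",
        "framework": "onnxrt_integerops",
    },
    "quantization": {
        "approach": "post_training_dynamic_quant",
    },
    "tuning": {
        "accuracy_criterion": {"relative": 0.80},
        "exit_policy": {"timeout": 0, "max_trials": 10},
        "random_seed": 5271,
    },
}

print(config["tuning"]["accuracy_criterion"]["relative"])  # 0.8
```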

import os

import onnx
import transformers
from neural_compressor.experimental import Quantization, common


def inc_quantize(MODEL_CONFIG):
    ONNX_MODEL_FILE = os.path.join(MODEL_CONFIG['ONNX_MODEL_PATH'],
                                   MODEL_CONFIG['ONNX_MODEL_FILE_NAME'])
    model = onnx.load(ONNX_MODEL_FILE)

    quantizer = Quantization(MODEL_CONFIG['QUANT_CONFIG'])
    # Programmatic override of the YAML accuracy criterion.
    quantizer.cfg.tuning.accuracy_criterion.relative = 0.8
    quantizer.model = common.Model(model)
    quantizer.tokenizer = transformers.AutoTokenizer.from_pretrained(
        MODEL_CONFIG['BERT_MODEL_TYPE'])
    quantizer.eval_func = run_evaluation_benchmark
    q_model = quantizer()

    QUANTIZED_SAVE_PATH = os.path.join(MODEL_CONFIG['ONNX_MODEL_PATH'],
                                       MODEL_CONFIG['ONNX_QUANTIZED_FILE_NAME'])
    q_model.save(QUANTIZED_SAVE_PATH)
    print("Quantized model saved at:", QUANTIZED_SAVE_PATH)
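One thing worth checking when tuning appears to hang is run_evaluation_benchmark itself: the tuner calls eval_func once per candidate model and expects a single scalar accuracy back, so a function that blocks or returns something non-scalar can stall the loop after the baseline. A minimal sketch of that contract, with the inference loop stubbed out (the dummy data is illustrative, not from the original post):

```python
def run_evaluation_benchmark(model):
    """Hypothetical eval_func sketch: Neural Compressor passes each
    candidate model in and expects one scalar accuracy back."""
    correct, total = 0, 0
    # In the real function this loop would run the ONNX model over the
    # evaluation set; the dummy data below just stands in for it.
    dummy_predictions = [1, 0, 1, 1]
    dummy_labels = [1, 0, 0, 1]
    for pred, label in zip(dummy_predictions, dummy_labels):
        correct += int(pred == label)
        total += 1
    return 100.0 * correct / total  # scalar, as the tuner requires

print(run_evaluation_benchmark(None))  # 75.0 with the dummy data
```

If the real evaluation takes minutes per run, the long silence after "FP32 baseline is" may simply be the first tuning trial evaluating; adding logging inside eval_func would make that visible.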
