
How to make PTQ calibration for a Hybrid Quantization model (int8 & fp16) #3978

Open

renshujiajia opened this issue Jul 3, 2024 · 3 comments

@renshujiajia
Description

What is the right way to calibrate a hybrid quantization model?
I built my TensorRT engine from an ONNX model with the code below, using the class Calibrator(trt.IInt8EntropyCalibrator2) as config.int8_calibrator.

My hybrid-quantized (INT8 & FP16) super-resolution model's inference results are biased towards magenta. I have already applied clipping; what could be the reason? Is there an issue with my calibration code, or could it be a poor distribution of the calibration dataset? I am certain my inference program itself is correct.
[image: super-resolution output showing the magenta color cast]

import tensorrt as trt

def build_engine_onnx(model_file, engine_file_path, min_shape, opt_shape, max_shape, calibration_stream):
    logger = trt.Logger(trt.Logger.INFO)
    builder = trt.Builder(logger)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 2 << 30)  # 2 GiB workspace
    config.set_flag(trt.BuilderFlag.FP16)
    config.set_flag(trt.BuilderFlag.INT8)

    # Enable strongly typed matching
    # config.set_flag(trt.BuilderFlag.GPU_FALLBACK)

    # Add calibrator
    calibrator = Calibrator(calibration_stream, 'calibration.cache')
    config.int8_calibrator = calibrator

    with open(model_file, 'rb') as model:
        if not parser.parse(model.read()):
            for error in range(parser.num_errors):
                print(parser.get_error(error))
            return None

    profile = builder.create_optimization_profile()
    input_name = network.get_input(0).name

    # Dynamic input tensor dimensions:
    # profile.set_shape(input_name, min_shape, opt_shape, max_shape)

    # Fixed input tensor dimensions: pin the network input to a single shape
    network.get_input(0).shape = opt_shape
    config.add_optimization_profile(profile)

    print(f"Building TensorRT engine from file {model_file}...")
    plan = builder.build_serialized_network(network, config)
    if plan is None:
        raise RuntimeError("Failed to build the TensorRT engine!")

    with open(engine_file_path, "wb") as f:
        f.write(bytearray(plan))
    return plan
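
The Calibrator class itself is not shown in the issue. For context, a minimal sketch of an IInt8EntropyCalibrator2 subclass could look like the following; the stream interface (batch_size, batch_nbytes, next_batch()) is assumed here for illustration and is not from the original post:

import os
import numpy as np
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context on import
import tensorrt as trt

class Calibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed batches to the builder during INT8 calibration."""

    def __init__(self, calibration_stream, cache_file):
        super().__init__()
        self.stream = calibration_stream
        self.cache_file = cache_file
        # One device buffer, sized for a single calibration batch (assumed attribute).
        self.d_input = cuda.mem_alloc(self.stream.batch_nbytes)

    def get_batch_size(self):
        return self.stream.batch_size

    def get_batch(self, names):
        batch = self.stream.next_batch()  # float32 ndarray, or None when exhausted
        if batch is None:
            return None  # tells TensorRT that calibration data is finished
        cuda.memcpy_htod(self.d_input, np.ascontiguousarray(batch))
        return [int(self.d_input)]

    def read_calibration_cache(self):
        # Reuse a previous calibration run if the cache file exists.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

Whatever preprocessing next_batch() applies must match the inference path exactly (channel order, normalization range); a calibration/inference mismatch there is a common cause of color casts like the magenta bias described above.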

Environment

TensorRT Version: 10.0.1

NVIDIA GPU: RTX4090

NVIDIA Driver Version: 12.0

CUDA Version: 12.0

CUDNN Version: 8.2.0

Operating System: Linux interactive11554 5.11.0-27-generic #29 SMP Wed Aug 11 15:58:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Python Version (if applicable): 3.8.19

@lix19937

lix19937 commented Jul 4, 2024

Try adding

profile.set_shape(input_name, opt_shape, opt_shape, opt_shape)  # for a fixed shape

before config.add_optimization_profile(profile).

Also check your preprocessing code, or try the min-max calibrator.
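
Applied to the build function above, the suggestion amounts to something like this sketch (builder, network, config, and opt_shape come from the surrounding function; the calibrator swap only changes the base class):

profile = builder.create_optimization_profile()
input_name = network.get_input(0).name
# min = opt = max pins the profile to one fixed shape.
profile.set_shape(input_name, opt_shape, opt_shape, opt_shape)
config.add_optimization_profile(profile)

# Min-max calibration instead of entropy calibration: change only the base class.
class Calibrator(trt.IInt8MinMaxCalibrator):
    ...  # same __init__ / get_batch / cache methods as the entropy version above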

@renshujiajia (Author)

Thanks a lot, I will try the min-max calibrator. But don't network.get_input(0).shape = opt_shape and profile.set_shape(input_name, opt_shape, opt_shape, opt_shape) serve the same purpose? The exported model's binding information is as follows:

 input id:  0    is input:  True      binding name:  input    shape:  (1, 3, 4320, 7680)      type:  DataType.FLOAT
 input id:  1    is input:  False     binding name:  output   shape:  (1, 3, 8640, 15360)     type:  DataType.FLOAT

@lix19937

lix19937 commented Jul 4, 2024

If you don't call profile.set_shape, your profile is empty. In fact, for a fixed-shape model you don't need to care about the optimization profile at all.

network.get_input(0).shape = opt_shape
and
profile.set_shape(input_name, opt_shape, opt_shape, opt_shape)
play different roles: the first fixes the dimensions in the network definition itself, while the second tells the builder which shape range the engine must support at runtime.
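
In code, using the binding information posted above, the two calls could look like this sketch (profile and config assumed in scope):

# Fixes the input dimensions in the network definition itself;
# after this the network has no dynamic dimensions left.
network.get_input(0).shape = (1, 3, 4320, 7680)

# Declares the min/opt/max shapes an optimization profile supports;
# a non-empty profile is required whenever any input dimension is dynamic (-1).
profile.set_shape("input", (1, 3, 4320, 7680),
                           (1, 3, 4320, 7680),
                           (1, 3, 4320, 7680))
config.add_optimization_profile(profile)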
