
Quantized TensorRT engine doesn't achieve an inference speed advantage over FP16 on A100 #3

@bigmover

Description


I replaced engine/unet.trt10.0.1.plan with the unet.trt10.0.1.6.post12.dev1.engine built by following "Build the TRT engine for the INT8 Quantized ONNX UNet", then reran the text-to-image demo:

python demo_txt2img.py "enchanted winter forest, soft diffuse light on a snow-filled day, serene nature scene, the forest is illuminated by the snow" --negative-prompt "normal quality, low quality, worst quality, low res, blurry, nsfw, nude" --scheduler Euler --denoising-steps 30 --seed 2946901

The INT8 engine runs no faster than the FP16 one.

Is there something I missed, or what mistake did I make? Any reply will be appreciated.
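To rule out end-to-end overhead (tokenization, scheduler, VAE) dominating the comparison, it can help to time each UNet engine's inference in isolation. Below is a minimal, engine-agnostic timing sketch; the `fp16_engine` / `int8_engine` names and their `infer` call in the usage comment are hypothetical placeholders, not the demo's actual API:

```python
import time
import statistics

def benchmark(run_fn, warmup=5, iters=30):
    """Time a callable: warm up first (CUDA init, caches), then
    return the median latency in milliseconds over `iters` runs."""
    for _ in range(warmup):
        run_fn()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times_ms)

# Usage (hypothetical engine objects): wrap each engine's execute
# call with identical inputs and compare the medians directly.
# fp16_ms = benchmark(lambda: fp16_engine.infer(inputs))
# int8_ms = benchmark(lambda: int8_engine.infer(inputs))
# print(f"fp16: {fp16_ms:.2f} ms  int8: {int8_ms:.2f} ms")
```

If the per-engine medians are also equal, the issue is likely in the engine build itself (e.g. layers falling back to FP16/FP32) rather than in the pipeline around it.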

fp16: [screenshot of demo output and timing]

int8: [screenshot of demo output and timing]
