ONNX converted Whisper model takes more than twice the VRAM of the torch version #1869
Closed
Labels: bug
Who can help?
@michaelbenayoun

Reproduction (minimal, reproducible, runnable)
I wrote an easily reproducible example using either ORT or torch:

With torch I get model_size: 6.23 GB, max_size: 6.91 GB.
With ONNX I get model_size: 11.26 GB, max_size: 11.99 GB.
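The original script is not shown above, so here is a minimal sketch of how such a comparison could be set up. The model ID (`openai/whisper-large-v2`) and the measurement points are assumptions, not taken from the issue. Note that ONNX Runtime allocates GPU memory outside of torch's CUDA allocator, so its usage has to be read from the driver (via `pynvml` here) rather than from `torch.cuda` statistics.

```python
# Hedged sketch (not the reporter's original script): compare GPU memory used
# by the PyTorch and ONNX Runtime (optimum) versions of a Whisper model.

def bytes_to_gb(n: int) -> float:
    """Convert a byte count to gigabytes, rounded to 2 decimals."""
    return round(n / 1024**3, 2)

def measure_torch(model_id: str = "openai/whisper-large-v2"):
    import torch
    from transformers import WhisperForConditionalGeneration

    torch.cuda.reset_peak_memory_stats()
    model = WhisperForConditionalGeneration.from_pretrained(model_id).to("cuda")
    model_size = torch.cuda.memory_allocated()    # weights resident on the GPU
    # ... run generation here to exercise encoder + decoder ...
    max_size = torch.cuda.max_memory_allocated()  # peak, including activations
    return bytes_to_gb(model_size), bytes_to_gb(max_size)

def measure_onnx(model_id: str = "openai/whisper-large-v2"):
    # ORT memory is invisible to the torch allocator; query the device instead.
    import pynvml
    from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    before = pynvml.nvmlDeviceGetMemoryInfo(handle).used
    model = ORTModelForSpeechSeq2Seq.from_pretrained(
        model_id, export=True, provider="CUDAExecutionProvider"
    )
    # ... run generation here, then re-read memory for the peak ...
    after = pynvml.nvmlDeviceGetMemoryInfo(handle).used
    return bytes_to_gb(after - before)
```

Comparing the two numbers this way measures both the resident weights and any extra buffers ORT pre-allocates, which is where the roughly 2x gap reported above would show up.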
Expected behavior
The VRAM requirements using either ORT or torch should be comparable.