The dynamic compilation does not handle composite shapes with multiple dimensions very effectively #127162
Labels
module: dynamic shapes
oncall: pt2
triaged
🐛 Describe the bug
Hello, I have encountered a performance issue and would greatly appreciate any assistance. Thank you.
After compiling the UNet model with
unet = torch.compile(unet, dynamic=True)
it can generate videos with multiple shapes. However, generation is significantly slower than when testing with only one shape. I use a total of 9 shapes for testing (the input is 5-dimensional, NCDHW, but only the D, H, and W dimensions change). During the warmup phase, the model generates the largest shape. Once all 9 shapes have been generated, the model runs slower when returning to the largest shape than it did during the eager phase.
Is the performance related to the cache size? My torch dynamo settings are listed below.
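To help narrow this down, here is a minimal sketch of how the scenario above could be reproduced on a small scale and checked for recompilations. It is not the original model: `TinyVideoNet`, `counting_backend`, and the 9 concrete shapes are hypothetical stand-ins, and the custom backend only counts how often Dynamo hands over a new graph instead of generating optimized kernels.

```python
import torch
import torch._dynamo
import torch.nn as nn

compile_count = 0  # number of graphs Dynamo asked the backend to compile


def counting_backend(gm, example_inputs):
    # Hypothetical debugging backend: count the compilation and run the
    # captured graph as-is, without any code generation.
    global compile_count
    compile_count += 1
    return gm.forward


class TinyVideoNet(nn.Module):
    # Hypothetical stand-in for the UNet; it takes NCDHW input.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(4, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))


torch._dynamo.reset()
model = TinyVideoNet().eval()
compiled = torch.compile(model, backend=counting_backend, dynamic=True)

# 9 assumed (D, H, W) combinations; N and C stay fixed.
shapes = [(d, h, w)
          for d in (8, 16, 24)
          for (h, w) in ((32, 48), (40, 56), (48, 64))]
largest = max(shapes, key=lambda s: s[0] * s[1] * s[2])

# Warm up on the largest shape, run all 9, then return to the largest.
schedule = [largest] + shapes + [largest]
results = []
with torch.no_grad():
    for d, h, w in schedule:
        x = torch.randn(1, 4, d, h, w)
        results.append((tuple(x.shape), tuple(compiled(x).shape)))

print("graphs compiled:", compile_count)
print("cache_size_limit:", torch._dynamo.config.cache_size_limit)
```

If the printed compile count grows with the number of shapes, the slowdown is more likely to come from recompilation than from the dynamic kernels themselves. Running the real model with the environment variable `TORCH_LOGS=recompiles` should also log the guard that triggered each recompilation.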
Error logs
No response
Minified repro
No response
Versions
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 11 (bullseye) (x86_64)
GCC version: (Debian 10.2.1-6) 10.2.1 20210110
Clang version: Could not collect
CMake version: version 3.18.4
Libc version: glibc-2.31
Python version: 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (64-bit runtime)
Python platform: Linux-5.15.120.bsk.2-amd64-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 12.2.140
cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang