This repository was archived by the owner on Aug 1, 2025. It is now read-only.

[inductor] Triton Auto Tuning - Long Compilation Time #1807

@anijain2305

Repro -
rm -rf /tmp/torchinductor_$USER; python benchmarks/dynamo/timm_models.py --training --performance --device cuda --inductor --float32 --only=coat_lite_mini

With autotuning skipped, compilation takes 25 seconds, versus 360 seconds with autotuning enabled.
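To make the comparison easier to reproduce, here is a minimal sketch of a wall-clock timer for an arbitrary command, plus the ratio implied by the two numbers above. The helper names `time_command` and `slowdown` are hypothetical, not part of the benchmark suite. The issue does not say how autotuning was skipped, so no flag for that is shown here. Timing the whole benchmark command this way also includes model execution, not only compilation, so treat it as a rough upper bound.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


def slowdown(with_autotune_s, without_autotune_s):
    """Ratio of the autotuned compile time to the non-autotuned one."""
    return with_autotune_s / without_autotune_s


if __name__ == "__main__":
    # Numbers reported in this issue: 360 s with autotuning, 25 s without.
    print(f"reported slowdown: {slowdown(360, 25):.1f}x")

    # Smoke test of the timer on a trivial command; replace the argv list
    # with the repro command above to time an actual run.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"trivial command took {elapsed:.2f}s")
```

Clearing the `/tmp/torchinductor_$USER` cache before each timed run, as the repro command does, keeps the two measurements comparable.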

Questions

cc @jansel @ngimel @zdevito
