TIMM inception_v3 model fails accuracy test in MaxAutotune mode for training #132922

@shunting314

🐛 Describe the bug

I ran this command on my dev server but could not reproduce the failure:

TORCHINDUCTOR_MAX_AUTOTUNE=1 time python benchmarks/dynamo/timm_models.py --backend inductor --amp --accuracy --only inception_v3 --training

But the error message from the dashboard shows:

2024-08-04T18:30:23.2222123Z SingleProcess AUTOTUNE benchmarking takes 2.2663 seconds and 0.0012 seconds precompiling
2024-08-04T18:32:23.0572573Z W0804 18:32:23.056000 27683 torch/_logging/_internal.py:1048] [6/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
2024-08-04T18:33:30.7004917Z E0804 18:33:30.699000 27683 torch/_dynamo/utils.py:1554] RMSE (res-fp64): 0.53290, (ref-fp64): 0.12166 and shape=torch.Size([]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.040000
2024-08-04T18:33:30.7081883Z fail_accuracy

Judging from the dashboard log, the error happens for a scalar tensor (shape=torch.Size([]))! Using a larger multiplier should resolve it.
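To make the multiplier suggestion concrete, here is a minimal sketch that plugs the numbers from the log line above into the pass condition. It assumes the check has the form `res_error <= multiplier * ref_error + tol / 10.0`, which is my reading of the RMSE comparison in `torch/_dynamo/utils.py` and may differ between versions. The helper names `passes_rmse_check` and `min_passing_multiplier` are hypothetical and exist only for this illustration.

```python
# Values copied from the failing log line in this issue.
RES_ERROR = 0.53290  # RMSE (res-fp64)
REF_ERROR = 0.12166  # RMSE (ref-fp64)
TOL = 0.04           # tol
MULTIPLIER = 3.0     # multiplier


def passes_rmse_check(res_error, ref_error, multiplier, tol):
    """Assumed form of the accuracy check (hypothetical helper, not a torch API)."""
    return res_error <= multiplier * ref_error + tol / 10.0


def min_passing_multiplier(res_error, ref_error, tol):
    """Smallest multiplier that satisfies the assumed inequality."""
    return (res_error - tol / 10.0) / ref_error


if __name__ == "__main__":
    # Threshold with the logged multiplier: 3.0 * 0.12166 + 0.004
    threshold = MULTIPLIER * REF_ERROR + TOL / 10.0
    print(f"threshold at multiplier {MULTIPLIER}: {threshold:.5f}")
    print("passes:", passes_rmse_check(RES_ERROR, REF_ERROR, MULTIPLIER, TOL))
    print(f"minimum passing multiplier: "
          f"{min_passing_multiplier(RES_ERROR, REF_ERROR, TOL):.3f}")
```

Under that assumption, the logged multiplier of 3.0 leaves the threshold below the measured error, and the multiplier would need to be somewhat above 4.3 for this particular run to pass.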

Note: the model passes the accuracy test in the default/CG configurations.

Error logs

.

Minified repro

No response

Versions

.

cc @ezyang @chauhang @penguinwu

Metadata

Labels

oncall: pt2

pt2-pass-rate-regression: Track regression of PT2 dashboard pass rate

triaged: This issue has been looked at a team member, and triaged and prioritized into an appropriate module
