The int8wo benchmarks take a long time in CI runs and get stuck on auto-tune. This needs further investigation to understand the cause. Reproduced in: https://github.com/pytorch/ao/pull/3347