Revert d5ca53c (pytorch#46097). The changes affect only ROCm. This reverts a workaround for a compiler performance issue; the workaround is no longer needed.
Benchmark command, with results listed in pairs (OLD: before this change, NEW: after):
`python -m pt.cat_test --tag_filter all --device cuda`
```
OLD Forward Execution Time (us) : 48.833
NEW Forward Execution Time (us) : 8.318
OLD Forward Execution Time (us) : 54.508
NEW Forward Execution Time (us) : 23.824
OLD Forward Execution Time (us) : 52.117
NEW Forward Execution Time (us) : 14.942
OLD Forward Execution Time (us) : 98.790
NEW Forward Execution Time (us) : 74.334
OLD Forward Execution Time (us) : 102.063
NEW Forward Execution Time (us) : 76.008
OLD Forward Execution Time (us) : 167.786
NEW Forward Execution Time (us) : 123.679
OLD Forward Execution Time (us) : 98.320
NEW Forward Execution Time (us) : 67.436
OLD Forward Execution Time (us) : 91.484
NEW Forward Execution Time (us) : 59.230
OLD Forward Execution Time (us) : 109.569
NEW Forward Execution Time (us) : 76.557
OLD Forward Execution Time (us) : 106.603
NEW Forward Execution Time (us) : 87.635
OLD Forward Execution Time (us) : 106.693
NEW Forward Execution Time (us) : 88.902
OLD Forward Execution Time (us) : 110.881
NEW Forward Execution Time (us) : 94.361
OLD Forward Execution Time (us) : 122.925
NEW Forward Execution Time (us) : 123.046
OLD Forward Execution Time (us) : 272.442
NEW Forward Execution Time (us) : 271.932
OLD Forward Execution Time (us) : 457.329
NEW Forward Execution Time (us) : 456.767
OLD Forward Execution Time (us) : 117.688
NEW Forward Execution Time (us) : 87.133
OLD Forward Execution Time (us) : 873.764
NEW Forward Execution Time (us) : 865.075
OLD Forward Execution Time (us) : 1746.831
NEW Forward Execution Time (us) : 1730.252
OLD Forward Execution Time (us) : 2619.303
NEW Forward Execution Time (us) : 2598.717
OLD Forward Execution Time (us) : 52.063
NEW Forward Execution Time (us) : 7.904
OLD Forward Execution Time (us) : 52.275
NEW Forward Execution Time (us) : 8.118
OLD Forward Execution Time (us) : 51.896
NEW Forward Execution Time (us) : 7.938
OLD Forward Execution Time (us) : 51.745
NEW Forward Execution Time (us) : 7.922
OLD Forward Execution Time (us) : 52.575
NEW Forward Execution Time (us) : 13.299
OLD Forward Execution Time (us) : 52.090
NEW Forward Execution Time (us) : 8.015
```
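The per-case speedup can be read off the log by dividing each OLD time by the NEW time that follows it. Below is a minimal sketch of that arithmetic, assuming the log keeps the exact `OLD ... / NEW ...` line format shown above. The helper name `speedups` is hypothetical and not part of the benchmark suite, and the sample lines are two pairs copied from the log.

```python
import re

# Matches lines such as "OLD Forward Execution Time (us) : 48.833".
# Assumption: the log format is exactly the one shown in this message.
LINE_RE = re.compile(r"^(OLD|NEW) Forward Execution Time \(us\) : ([0-9.]+)$")


def speedups(log_text):
    """Return (old_us, new_us, old_us / new_us) for each consecutive OLD/NEW pair."""
    results = []
    pending_old = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip fences, blank lines, and anything else
        label, value = match.group(1), float(match.group(2))
        if label == "OLD":
            pending_old = value
        elif pending_old is not None:
            results.append((pending_old, value, pending_old / value))
            pending_old = None
    return results


if __name__ == "__main__":
    # Two pairs copied from the benchmark log: one large win, one neutral case.
    sample = """\
OLD Forward Execution Time (us) : 48.833
NEW Forward Execution Time (us) : 8.318
OLD Forward Execution Time (us) : 122.925
NEW Forward Execution Time (us) : 123.046
"""
    for old_us, new_us, ratio in speedups(sample):
        print(f"{old_us:9.3f} us -> {new_us:9.3f} us  ({ratio:.2f}x)")
```

A ratio above 1 means the NEW build is faster for that case; a ratio near 1 means the revert made no measurable difference.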
Pull Request resolved: pytorch#74129
Approved by: https://github.com/ngimel