🚨 Get rid of most Apex references #45723
Model CI Report: ❌ 1 new failed test from this PR 😭
cc @vasqu should be ready for review/merge!
vasqu left a comment:
Can you add a 🚨 to the title? While I doubt it would affect many users, it is still a breaking change.
Done!
We still have some references to Apex in the library. Apex was the only way to get mixed precision + fused ops with PyTorch for a while, but all of that has been folded into the main library now, so it's probably time to deprecate all of that.
The main places Apex appears are:

- `RMSNorm`: T5 and T5-derived models check whether Apex is available and, if so, replace their `RMSNorm` layers with `apex.FusedRMSNorm`. This PR drops that check and just uses the base classes in all cases.
- The fused Apex `AdamW` in `trainer_optimizer.py`: we drop this too, since the built-in torch `AdamW` now has a `fused=True` kwarg.

Fixes #45704
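For reference, the unfused `RMSNorm` that these models fall back to is essentially the following. This is a minimal sketch of the RMSNorm computation (scale by the root mean square, no mean subtraction, no bias), not the exact transformers implementation; the class name `T5LayerNorm` follows the naming used in the T5 modeling code.

```python
import torch
from torch import nn


class T5LayerNorm(nn.Module):
    """RMSNorm as used by T5: divide by the root mean square of the
    hidden states, then scale by a learned weight. Unlike standard
    LayerNorm there is no mean subtraction and no bias term."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Accumulate the mean of squares in fp32 for numerical stability
        # (this matters under mixed precision), then cast back at the end.
        hs32 = hidden_states.to(torch.float32)
        variance = hs32.pow(2).mean(-1, keepdim=True)
        normed = hs32 * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * normed.to(hidden_states.dtype)
```

With the weight at its initial value of 1, each output vector has an RMS of approximately 1, which is the invariant the fused Apex kernel computed as well.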
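On the optimizer side, the built-in replacement looks like the sketch below. Note that `fused=True` requires the parameters to live on a supported accelerator (e.g. CUDA), so this example gates it on device availability; the model and hyperparameters here are illustrative, not taken from the trainer code.

```python
import torch

model = torch.nn.Linear(16, 4)

# torch.optim.AdamW has a fused implementation behind the `fused=True`
# kwarg, covering what apex's fused AdamW used to provide. The fused
# path requires a supported accelerator, so fall back to the default
# (unfused) path on CPU.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-3,
    weight_decay=0.01,
    fused=torch.cuda.is_available(),
)

# One training step to show the usual loop is unchanged:
loss = model(torch.randn(8, 16)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Since the call signature is identical apart from the extra kwarg, dropping the Apex optimizer does not change the training loop at all.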