🚨 Get rid of most Apex references #45723

Merged
Rocketknight1 merged 2 commits into main from no_more_apex
May 1, 2026

Conversation

@Rocketknight1
Member

@Rocketknight1 Rocketknight1 commented Apr 30, 2026

We still have some references to Apex in the library. For a while, Apex was the only way to get mixed precision and fused ops with PyTorch, but all of that has since been folded into PyTorch itself, so it's probably time to deprecate the Apex code paths.

The main place Apex appears is RMSNorm: T5 and T5-derived models check whether Apex is available and, if so, replace their RMSNorm layers with apex.FusedRMSNorm. This PR drops that check and uses the base classes in all cases.
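For readers unfamiliar with the layer in question, here is a minimal sketch of a T5-style RMSNorm in plain PyTorch. It is an illustration of the technique, not the library's actual class: the name `SimpleRMSNorm` and its argument names are assumptions made for this example.

```python
import torch
from torch import nn


class SimpleRMSNorm(nn.Module):
    """Hypothetical T5-style RMSNorm: learned scale only, no mean subtraction, no bias."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Compute the mean of squares in float32 for stability under fp16/bf16 inputs.
        variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True)
        normed = hidden_states * torch.rsqrt(variance + self.eps)
        # Cast back to the parameter dtype before applying the learned scale.
        return self.weight * normed.to(self.weight.dtype)


norm = SimpleRMSNorm(hidden_size=2)
out = norm(torch.tensor([[3.0, 4.0]]))
```

Because a layer like this needs nothing beyond core PyTorch ops, it behaves the same whether or not Apex is installed, which is the property the PR relies on.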

We also drop the fused Apex AdamW from trainer_optimizer.py, since the built-in torch AdamW now has a fused=True kwarg.
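As a rough sketch of the replacement, the built-in optimizer can be asked for its fused implementation directly. The helper `make_adamw` and the hyperparameter values below are hypothetical, and the example only enables `fused` when CUDA is available, which is a conservative assumption rather than a requirement stated in this PR.

```python
import torch
from torch import nn


def make_adamw(model: nn.Module, lr: float = 1e-3, weight_decay: float = 0.01):
    """Hypothetical helper: build torch.optim.AdamW, fused when parameters live on CUDA."""
    on_cuda = all(p.is_cuda for p in model.parameters())
    return torch.optim.AdamW(
        model.parameters(),
        lr=lr,
        weight_decay=weight_decay,
        fused=on_cuda,  # fall back to the default implementation off-GPU
    )


device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(8, 2).to(device)
optimizer = make_adamw(model)

# One training step to show the optimizer is a drop-in replacement.
before = model.weight.detach().clone()
loss = model(torch.randn(4, 8, device=device)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

No Apex import is involved; the only difference from a regular AdamW setup is the `fused` keyword.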

Fixes #45704

@Rocketknight1
Member Author

run-slow: kosmos2_5, longt5, mt5, pix2struct, pop2piano, switch_transformers, t5, udop, umt5

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/kosmos2_5", "models/longt5", "models/mt5", "models/pix2struct", "models/pop2piano", "models/switch_transformers", "models/t5", "models/udop", "models/umt5"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | 36c84aa1 | workflow commit (merge commit) |
| PR | ee54fd59 | branch commit (from PR) |
| main | d2b4101d | base commit (on main) |

Model CI Report

1 new failed tests from this PR 😭

  • switch_transformers:
    tests/models/switch_transformers/test_modeling_switch_transformers.py::SwitchTransformerModelIntegrationTests::test_small_logits (✅ ⟹ ❌)

@Rocketknight1 Rocketknight1 marked this pull request as ready for review April 30, 2026 16:04
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: longt5, pix2struct, pop2piano, t5

@Rocketknight1
Member Author

run-slow: longt5, pix2struct, pop2piano, t5

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/longt5", "models/pix2struct", "models/pop2piano", "models/t5"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | 068c1351 | workflow commit (merge commit) |
| PR | 1a6e46ba | branch commit (from PR) |
| main | a752ba7a | base commit (on main) |

✅ No failing test specific to this PR 🎉 👏 !

@Rocketknight1
Member Author

cc @vasqu should be ready for review/merge!

Contributor

@vasqu vasqu left a comment

Can you add a 🚨 to the title? While I doubt it would affect many, it is still breaking.

@Rocketknight1 Rocketknight1 changed the title from "Get rid of most Apex references" to "🚨 Get rid of most Apex references" May 1, 2026
@Rocketknight1
Member Author

Done!

@Rocketknight1 Rocketknight1 added this pull request to the merge queue May 1, 2026
Merged via the queue into main with commit 807d9d7 May 1, 2026
30 checks passed
@Rocketknight1 Rocketknight1 deleted the no_more_apex branch May 1, 2026 12:05
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

T5 silently uses apex.FusedRMSNorm which has a memory leak (NVIDIA/apex#1999)

3 participants