[Inductor] Remove bf16 fallback for atomic_add #167380
Conversation
[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167380
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit b3e3ede with merge base 1727a71. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
test/inductor/test_torchinductor.py (Outdated)
| "tl.atomic_add" in code[0], | ||
| "bf16 should generate tl.atomic_add", | ||
| ) | ||
| torch.testing.assert_close( |
This will always pass since result is an alias to output?
Thanks for the catch. I now create a new expected tensor instead of reusing output.
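For context, a minimal standalone sketch of the aliasing pitfall and the fix (illustrative names and shapes, not the PR's actual test; it needs a CUDA device, and the bf16 atomic path additionally needs SM >= 90):

```python
import torch

def check_bf16_scatter_add():
    src = torch.randn(64, dtype=torch.bfloat16, device="cuda")
    idx = torch.randint(0, 8, (64,), device="cuda")
    out = torch.zeros(8, dtype=torch.bfloat16, device="cuda")

    compiled = torch.compile(lambda o, i, s: o.index_add_(0, i, s))
    result = compiled(out, idx, src)  # in-place op: `result` is the already-mutated `out`

    # Vacuous check: `result` is (an alias of) the mutated `out`, so this cannot fail.
    # torch.testing.assert_close(result, out)

    # Correct check: build an independent expected tensor in eager mode.
    expected = torch.zeros(8, dtype=torch.bfloat16, device="cuda")
    expected.index_add_(0, idx, src)
    torch.testing.assert_close(result, expected)
```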
```python
and dtype == torch.bfloat16
and torch.cuda.is_available()
and torch.cuda.get_device_capability() >= (9, 0)
and config.bfloat16_atomic_adds_enabled
```
is this config still needed?
Yeah, the config is not needed anymore. I removed it, along with this test case:
pytorch/test/inductor/test_cuda_repro.py
Lines 2260 to 2283 in e8d411e
```python
@skipCUDAIf(
    not SM90OrLater, "uses bfloat16 atomic add instrs which requires SM >= 90"
)
@unittest.skipIf(
    config.is_fbcode(),
    "bfloat16 atomic add is supported in fbcode, so we won't fallback",
)
def test_index_add_fallback(self):
    def f(x, y):
        return torch.index_select(x, 0, y)

    x = torch.randn(
        2000, 384, dtype=torch.bfloat16, device="cuda", requires_grad=True
    )
    y = torch.ones(713268, dtype=torch.int64, device="cuda")
    x_ref = x.clone().detach().requires_grad_(True)
    y_ref = y.clone().detach()
    out, (_, bw_code) = run_fw_bw_and_get_code(lambda: torch.compile(f)(x, y))
    fc = FileCheck()
    fc.check("aten.index_add")
    fc.run(bw_code)
    self.assertEqual(f(x_ref, y_ref), out)
```
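With the fallback removed, the expectation flips: on SM >= 90 the backward of this same pattern should now hit the Triton atomic path rather than aten.index_add. A hedged sketch of how that could be checked, reusing run_fw_bw_and_get_code (import path, shapes, and indices are assumptions, not taken from the PR):

```python
import torch
from torch._inductor.utils import run_fw_bw_and_get_code  # assumed import path

def f(x, y):
    return torch.index_select(x, 0, y)

x = torch.randn(2000, 384, dtype=torch.bfloat16, device="cuda", requires_grad=True)
y = torch.randint(0, 2000, (10000,), dtype=torch.int64, device="cuda")

out, (_, bw_code) = run_fw_bw_and_get_code(lambda: torch.compile(f)(x, y))
# With this PR, the backward on SM >= 90 is expected to emit tl.atomic_add
# directly instead of falling back to aten.index_add.
assert "tl.atomic_add" in bw_code
```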
torch/_inductor/utils.py (Outdated)
```diff
         return False
     else:
-        return dtype in OrderedSet([torch.int64, torch.bool, torch.bfloat16])
+        return dtype in OrderedSet([torch.int64, torch.bool])
```
Do we still need the atomic_add fallback for bfloat16 if compute_capability < 90?
Yes, we still need the fallback there. I've added a check so we fall back when sm < (9, 0).
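Roughly, the updated check could look like the sketch below (hypothetical helper name; the actual code in torch/_inductor/utils.py may be organized differently):

```python
import torch
from torch.utils._ordered_set import OrderedSet  # assumed import path

def needs_atomic_add_fallback(dtype: torch.dtype) -> bool:
    # int64 and bool accumulation still cannot be expressed as tl.atomic_add.
    if dtype in OrderedSet([torch.int64, torch.bool]):
        return True
    # bf16 atomic adds are only emitted on CUDA devices with SM >= 90;
    # older GPUs keep the fallback.
    if dtype == torch.bfloat16:
        return not (
            torch.cuda.is_available()
            and torch.cuda.get_device_capability() >= (9, 0)
        )
    return False
```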
Hi, why are we removing this? This is used internally.
We are addressing issue #97016. Previously, atomic_add did not support bf16, so a fallback was implemented. Triton now supports bf16 atomic_add, so we are removing this fallback.
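For reference, a minimal standalone Triton sketch of a bf16 scatter-add via tl.atomic_add, the capability this PR relies on (not the Inductor-generated kernel; assumes a recent Triton and an SM >= 90 GPU):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def scatter_add_kernel(out_ptr, idx_ptr, val_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    idx = tl.load(idx_ptr + offs, mask=mask, other=0)
    val = tl.load(val_ptr + offs, mask=mask, other=0.0)
    # bf16 atomic add: this is the dtype Inductor previously had to fall back for.
    tl.atomic_add(out_ptr + idx, val, mask=mask)

# Illustrative launch: accumulate 1024 bf16 values into 16 output slots.
out = torch.zeros(16, dtype=torch.bfloat16, device="cuda")
idx = torch.randint(0, 16, (1024,), device="cuda")
val = torch.randn(1024, dtype=torch.bfloat16, device="cuda")
scatter_add_kernel[(triton.cdiv(1024, 256),)](out, idx, val, 1024, BLOCK=256)
```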
Okay, sounds good. You may want to ensure the config is not used internally before removing it; otherwise this may cause errors when landing.
I did an internal code search for the config.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
Fixes: #97016
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos