[nativert] Downcast triton double arguments to floats #166620

minjang · 2025-10-30T07:35:47Z

This diff addresses a limitation in the Sigmoid + Triton interaction where float arguments are not passed correctly. NativeRT passes float arguments as double, while Triton kernels read them as float, resulting in wrong values.

Limitations in (de)serialization

In triton, float arguments to a kernel are encoded as "fp32" (code):

        elif isinstance(arg, float):
            return ("fp32", None)

However, torch export serde appears to use double (code) because Thrift only has the double type:

union Argument {
  10: bool as_none;
  20: TensorArgument as_tensor;
  30: list<TensorArgument> as_tensors;
  50: i64 as_int;
  70: list<i64> as_ints;
  80: double as_float;   // actually double
...

The TritonKernel constructor loads attributes from a node, where Constant is the variant type that holds attribute values. That variant only has double (code):

using Constant = std::variant<
    None,
    int64_t,
    std::vector<int64_t>,
    double,    // triton float is loaded as double

As a result, NativeRT passes arguments that were originally Triton floats to Triton kernels as double. However, all of the Triton backends (nvidia, amd and cpu) read them as float because the signature still says fp32.
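The mismatch can be reproduced outside of Triton with a few lines of Python. This is only a sketch of the failure mode, assuming a little-endian layout and a launcher that hands the kernel an 8-byte double while the kernel reads just the first 4 bytes as fp32. The helper names are hypothetical and are not part of NativeRT or Triton.

```python
import struct


def pack_as_double(value: float) -> bytes:
    """Hypothetical stand-in for a runtime that stores the argument as an 8-byte double."""
    return struct.pack("<d", value)


def read_as_fp32(buffer: bytes) -> float:
    """Hypothetical stand-in for a kernel that reads the argument as a 4-byte float."""
    return struct.unpack("<f", buffer[:4])[0]


arg = 1.5
misread = read_as_fp32(pack_as_double(arg))
print(f"passed {arg}, kernel sees {misread}")
```

Under these assumptions the low 4 bytes of the double encoding of 1.5 are all zero, so the reader sees 0.0 instead of 1.5.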

D84423898 was the workaround until now: wrapping float arguments in tensors.

The Fix

Fixing the Thrift definition isn't viable because Thrift only supports the double type. It would also be possible to fix this on the Triton side by downcasting from double to float, but that would require changing every backend.

Instead, I think this diff is the most effective approach: when building TritonKernel, downcast the values to float right after loading the double arguments.
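The idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in this diff: `Constant` loosely mirrors the first entries of the variant quoted above, and `downcast_to_fp32` and `prepare_kernel_args` are hypothetical names. The round trip through `struct` rounds an in-range double to the nearest fp32 value.

```python
import struct
from typing import List, Union

# Loosely mirrors the first entries of the Constant variant (assumption).
Constant = Union[None, int, List[int], float]


def downcast_to_fp32(value: float) -> float:
    """Round an in-range double to the nearest fp32-representable value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def prepare_kernel_args(attrs: List[Constant]) -> List[Constant]:
    """Downcast only floating-point attributes; ints, lists and None pass through."""
    return [downcast_to_fp32(a) if isinstance(a, float) else a for a in attrs]


args = prepare_kernel_args([None, 3, [1, 2], 0.1])
print(args)
```

Doing the conversion once, at the point where arguments are loaded, keeps the kernel signature (`fp32`) and the stored value in agreement without touching any backend launcher.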

Test Plan:

buck test fbcode//mode/opt-amd-gpu fbcode//caffe2/test:test_export --

Differential Revision: D85747160

pytorch-bot · 2025-10-30T07:35:51Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166620

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 578ce43 with merge base 3206677:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2025-10-30T07:35:57Z

@minjang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85747160.

minjang · 2025-10-30T17:27:15Z

@pytorchbot label "topic: not user facing"

XueningXu

LGTM!

Test Plan:

buck test fbcode//mode/opt-amd-gpu fbcode//sigmoid/inference/test:test_passes
buck test fbcode//mode/opt-amd-gpu fbcode//caffe2/test:test_export --

Reviewed By: XueningXu

Differential Revision: D85747160

facebook-github-bot · 2025-10-31T03:38:04Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2025-10-31T03:39:54Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2025-10-31T03:40:03Z

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator! Please try reimporting/rexporting the PR!

Details for Dev Infra team

Raised by workflow job

XueningXu · 2025-10-31T03:44:52Z

@pytorchbot merge

pytorchmergebot · 2025-10-31T03:46:37Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Test Plan:

buck test fbcode//mode/opt-amd-gpu fbcode//caffe2/test:test_export --

Differential Revision: D85747160

Pull Request resolved: #166620

Approved by: https://github.com/XueningXu

meta-codesync bot added fb-exported meta-exported labels Oct 30, 2025

pytorch-bot bot added the topic: not user facing topic category label Oct 30, 2025

minjang requested a review from XueningXu October 30, 2025 17:33

XueningXu approved these changes Oct 30, 2025


pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 30, 2025

minjang force-pushed the export-D85747160 branch from 3c5e332 to 91a84b5 Compare October 30, 2025 17:40

minjang force-pushed the export-D85747160 branch from 91a84b5 to 216f9b0 Compare October 30, 2025 23:08

minjang force-pushed the export-D85747160 branch from 216f9b0 to 8befb05 Compare October 30, 2025 23:57

minjang force-pushed the export-D85747160 branch from 8befb05 to 578ce43 Compare October 31, 2025 01:32

pytorchmergebot added the merging label Oct 31, 2025

pytorchmergebot removed the merging label Oct 31, 2025

pytorchmergebot added the merging label Oct 31, 2025

pytorchmergebot added the Merged label Oct 31, 2025

pytorchmergebot closed this in 85b035c Oct 31, 2025

pytorchmergebot removed the merging label Oct 31, 2025
