
Conversation

adabeyta
Contributor

@adabeyta adabeyta commented Sep 5, 2025

Fixes: #162049

Summary

The max_dim and min_dim functions incorrectly used torch.finfo()
for all dtypes, causing a TypeError for integer tensors.
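
For context, torch.finfo() only accepts floating-point dtypes; torch.iinfo() is its integer counterpart. A quick illustration:

import torch

torch.finfo(torch.float32).min  # -3.4028234663852886e+38
torch.iinfo(torch.int32).min    # -2147483648
torch.finfo(torch.int32)        # TypeError: torch.finfo() requires a floating point input type.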

Changes

  • Use torch.iinfo() for integer dtypes instead of torch.finfo().
  • Add CPU test: test_jagged_max_min_dtypes covering int8, int16, int32, int64, uint8, float16, bfloat16, float32, and float64 (a sketch of such a test follows this list).
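
As a rough sketch of what such a dtype-parametrized check could look like (the actual test uses PyTorch's device-type test framework, so the function name and structure here are illustrative only):

import torch

def check_jagged_max_min(dtype, device="cpu"):
    # Build a jagged nested tensor with rows of lengths 2, 3, 4.
    nt = torch.nested.nested_tensor(
        [torch.arange(n, device=device).to(dtype) for n in (2, 3, 4)],
        layout=torch.jagged,
    )
    max_out = nt.max(dim=1).values  # reduce over the ragged dim
    min_out = nt.min(dim=1).values
    assert torch.equal(max_out, torch.tensor([1, 2, 3], dtype=dtype, device=device))
    assert torch.equal(min_out, torch.tensor([0, 0, 0], dtype=dtype, device=device))

for dt in (torch.int8, torch.int16, torch.int32, torch.int64, torch.uint8,
           torch.float16, torch.bfloat16, torch.float32, torch.float64):
    check_jagged_max_min(dt)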

Testing

Before Fix:

python -m pytest test/test_nestedtensor.py -k "test_jagged_max_min_dtypes" -v

Output:

FAILED [0.0006s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_bfloat16 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0006s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float16 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0006s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float32 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0006s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float64 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0006s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int16 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0005s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int32 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0005s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int64 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0004s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int8 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'
FAILED [0.0004s] test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_uint8 - TypeError: torch.finfo() requires a floating point input type. Use torch.iinfo to handle 'torch.finfo'

After Fix:

python -m pytest test/test_nestedtensor.py -k "test_jagged_max_min_dtypes" -v

Output:

Running 9 items in this shard

test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_bfloat16 PASSED [0.0086s] [ 11%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float16 PASSED [0.0011s] [ 22%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float32 PASSED [0.0011s] [ 33%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_float64 PASSED [0.0011s] [ 44%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int16 PASSED [0.0009s] [ 55%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int32 PASSED [0.0010s] [ 66%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int64 PASSED [0.0010s] [ 77%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_int8 PASSED [0.0010s] [ 88%]
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCPU::test_jagged_max_min_dtypes_cpu_uint8 PASSED [0.0011s] [100%]


pytorch-bot bot commented Sep 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162273

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 5bc1829 with merge base e1bd5b6:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@adabeyta
Contributor Author

adabeyta commented Sep 5, 2025

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the "topic: not user facing" label Sep 5, 2025
@Callidior

Thanks for taking the time to work on this!
I think there are some more affected functions: amin_default, amax_default, argmin_default, argmax_default.
Does it make sense to fix these as well in the scope of this PR?

@adabeyta
Contributor Author

Thanks for taking the time to work on this! I think there are some more affected functions: amin_default, amax_default, argmin_default, argmax_default. Does it make sense to fix these as well in the scope of this PR?

Thanks @Callidior for the suggestion. Since these additional functions (amin/amax/argmin/argmax) are directly affected, I went ahead and fixed them in this PR as well and added some unit tests for verification.

@Skylion007 Thank you for taking a look and please let me know if you see anything that needs attention.

@isuruf
Collaborator

isuruf commented Sep 19, 2025

@adabeyta the tests for amin fail now

@jbschlosser jbschlosser self-requested a review September 22, 2025 17:55
@jbschlosser jbschlosser added the "topic: improvements" and "release notes: nested tensor" labels and removed the "topic: not user facing" label Sep 22, 2025
@adabeyta
Contributor Author

@adabeyta the tests for amin fail now

@isuruf Thanks for catching that! After some investigation, I found it was an overflow issue affecting only the int64 cases.

What was happening

The int64 tests were failing because of how the padding value is handled internally. When the maximum int64 value (2**63 - 1) is used as padding, it passes through a float64 conversion in the C++ code. A float64 has only a 53-bit significand, so 2**63 - 1 rounds up to 2**63, which is out of range for int64; converting back overflows to the minimum int64 value instead of the maximum. This broke all the min operations.
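
The precision loss is easy to see in plain Python:

int64_max = 2**63 - 1              # 9223372036854775807
as_double = float(int64_max)       # float64 has a 53-bit significand, so this rounds
print(as_double == 2.0**63)        # True: rounded up to 2**63...
print(int(as_double) > int64_max)  # True: ...which is outside the int64 range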

Fix

I've updated the code to use padding values for int64 that survive this round-trip: (1 << 53) - 1 and its negation, the bounds of the range in which every integer is exactly representable in IEEE 754 double precision.

Applied this fix to all 6 affected functions (min, max, amin, amax, argmin, argmax), and all tests now pass, including the int64 ones that were failing before.

Let me know if you'd like any other changes or have questions about the approach! @jbschlosser

@isuruf
Collaborator

isuruf commented Sep 30, 2025

Since it needs a double, can we use float('inf') and float('-inf') as max and min?

Comment on lines 22 to 23
_INT64_SAFE_MAX_FLOAT64 = (1 << 53) - 1
_INT64_SAFE_MIN_FLOAT64 = -_INT64_SAFE_MAX_FLOAT64
Contributor

curious if float('inf') / float('-inf') work instead as suggested by @isuruf?

Contributor Author

Thanks for the feedback @jbschlosser and @isuruf

I've updated the PR to address the suggestions. I factored out the dtype logic into a _get_padding_value() helper function.

Regarding float('inf')/float('-inf'):
I tested this approach but it causes overflow errors when the padding values need to be converted back to int64 type. The test failures showed:

RuntimeError: value cannot be converted to type int64_t without overflow

I've kept the safe int64 values (1 << 53) - 1 and -((1 << 53) - 1) for now. Let me know if you have any other suggestions to try in their place.
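
For reference, a minimal Python sketch of the helper described above; the exact signature of _get_padding_value() isn't shown in this thread, so the argument names here are hypothetical:

import torch

# Bounds of the range in which every integer survives a round-trip through
# an IEEE 754 double (53-bit significand).
_INT64_SAFE_MAX_FLOAT64 = (1 << 53) - 1
_INT64_SAFE_MIN_FLOAT64 = -_INT64_SAFE_MAX_FLOAT64

def _get_padding_value(dtype, for_max_reduction):
    # Identity-like padding: the smallest value for max-style reductions,
    # the largest for min-style ones.
    if dtype == torch.int64:
        # int64 extremes are corrupted by the float64 conversion in the
        # C++ padding path, so use float64-safe stand-ins instead.
        return _INT64_SAFE_MIN_FLOAT64 if for_max_reduction else _INT64_SAFE_MAX_FLOAT64
    info = torch.finfo(dtype) if dtype.is_floating_point else torch.iinfo(dtype)
    return info.min if for_max_reduction else info.max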

Collaborator

The test failures showed:

RuntimeError: value cannot be converted to type int64_t without overflow

We could check for infinite values before

Tensor padded = values.new_full(padded_shape, padding_value);
and change the value to the maximum or minimum value of the dtype.
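
That idea, sketched in Python for clarity (the actual check would live in the C++ padding path; the function name here is hypothetical):

import math
import torch

def clamp_padding_for_dtype(padding_value, dtype):
    # If the requested padding is +/-inf but the target dtype is integral,
    # substitute the dtype's own extreme before calling new_full().
    if math.isinf(padding_value) and not dtype.is_floating_point:
        info = torch.iinfo(dtype)
        return info.max if padding_value > 0 else info.min
    return padding_value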

Contributor

Hm, I think it's a bit inefficient to check all values, so I'm good with the current approach, even if it's non-ideal. Thanks for looking into it!

Collaborator

Not all values, just the padding_value, which is a double.

Collaborator

My concern is that if you do

import torch
x = torch.nested.nested_tensor(
    [torch.arange(0, n) - 2**60 for n in (10, 20, 30)],
    layout=torch.jagged,
)
print(x.max(dim=1).values)

now you get wrong results because of this workaround: every element is around -2**60, which is below the padding value -((1 << 53) - 1), so the reduction picks up the padding instead of the true max.

Contributor

Okay, I agree this is a problem: there's a large range of big int64 values for which the results will be wrong, not just a single edge-case value. I think we need a bit more exploration @adabeyta

Collaborator

@adabeyta would you be able to explore that option?

Contributor

I guess in a new PR, as the merge went through :p

@adabeyta adabeyta requested a review from jbschlosser October 1, 2025 23:31
@jbschlosser
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the "ciflow/trunk" label Oct 2, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@jbschlosser
Contributor

jbschlosser commented Oct 2, 2025

@pytorchbot merge cancel


pytorch-bot bot commented Oct 2, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'cancel' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'cherry-pick')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

Try @pytorchbot --help for more info.
