Conversation

mikaylagawarecki
Contributor

@mikaylagawarecki commented May 22, 2025

Added a `torch.hash_tensor` reduction function with a `mode` argument that defaults to reduction with xor.

  • The hash is always uint64.
  • Integers are cast to uint64 before the xor_sum reduction.
  • Floats are upcast to double and then bitcast to uint64 before the xor_sum reduction.

We don't provide an ordering-aware hash function yet because we don't have an easy way to write a fast CUDA kernel for it.
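A minimal usage sketch based on the description above (it assumes a PyTorch build that includes this PR; the final assertion follows from xor being commutative):

```python
import torch

x = torch.randn(1024)

h = torch.hash_tensor(x)  # scalar hash of the whole tensor
print(h.dtype)            # torch.uint64, per the description above

# xor is commutative, so the default mode is order-insensitive:
# any permutation of the same elements hashes to the same value.
assert torch.hash_tensor(x.flip(0)) == torch.hash_tensor(x)
```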

Stack from ghstack (oldest at bottom):


pytorch-bot bot commented May 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154149

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 1 Unrelated Failure

As of commit 5c8132c with merge base f168cf4:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot added the module: cpu label May 22, 2025
mikaylagawarecki added a commit that referenced this pull request May 22, 2025
ghstack-source-id: 00cf75d
Pull Request resolved: #154149
@mikaylagawarecki added the module: python frontend and ciflow/trunk labels and removed the module: cpu label May 22, 2025
Contributor

Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


Caused by:

@pytorch-bot added the ciflow/inductor, module: cpu, module: dynamo, and module: inductor labels May 27, 2025
mikaylagawarecki added a commit that referenced this pull request May 27, 2025
ghstack-source-id: 0da1f03
Pull Request resolved: #154149
@mikaylagawarecki changed the title from "Add basic xor_sum op" to "Add basic torch.hash_tensor op" Jun 26, 2025
mikaylagawarecki added a commit that referenced this pull request Jun 26, 2025
ghstack-source-id: 12a1271
Pull Request resolved: #154149
@mikaylagawarecki added the release notes: python_frontend label Jun 26, 2025
@mikaylagawarecki requested a review from albanD July 15, 2025
Added a `torch.hash_tensor` reduction function with a `mode` argument that defaults to multiply/shift via range, then reduction with xor.

The tensor is always viewed as int64 (applying padding as necessary) before the reduction, so the result is always int64.

I chose int64 rather than uint64 because, e.g., `a * x + b` is not implemented for uint64 in torch on CUDA.
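Roughly what that default does, sketched in plain PyTorch (the mixing constants and exact scheme below are made up for illustration, not taken from the kernel):

```python
import torch
from functools import reduce
from operator import xor

def mix_then_xor(t: torch.Tensor) -> int:
    # View the element bits as int64 (floats via double); padding is omitted here.
    bits = t.to(torch.float64).view(torch.int64) if t.is_floating_point() else t.to(torch.int64)
    # Affine mix a * x + b per element; int64 arithmetic wraps mod 2**64.
    a, b = 6364136223846793005, 1442695040888963407  # hypothetical constants
    mixed = bits * a + b
    # Reduce with xor; the mask only presents the result as a uint64 value.
    return reduce(xor, mixed.flatten().tolist(), 0) & 0xFFFFFFFFFFFFFFFF
```

Using int64 for the bits matches the note above: the `a * x + b` step has no uint64 implementation on CUDA.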




mikaylagawarecki added a commit that referenced this pull request Jul 21, 2025
ghstack-source-id: 1ed6070
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 22, 2025
ghstack-source-id: 25eda40
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 22, 2025
ghstack-source-id: 4385e53
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 22, 2025
ghstack-source-id: 8da7498
Pull Request resolved: #154149
Added a `torch.hash_tensor` reduction function with a `mode` argument that defaults to reduction with xor.

- The hash is always uint64.
- Integers are cast to uint64 before the xor_sum reduction.
- Floats are upcast to double and then bitcast to uint64 before the xor_sum reduction.
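A reference model of those three rules in plain PyTorch (a sketch only; it models the uint64 bits with int64, which is better supported, and masks at the end):

```python
import torch
from functools import reduce
from operator import xor

def ref_hash_tensor(t: torch.Tensor) -> int:
    if t.is_floating_point():
        # Upcast to double, then reinterpret the IEEE-754 bits.
        bits = t.to(torch.float64).view(torch.int64)
    else:
        # Stands in for the cast to uint64.
        bits = t.to(torch.int64)
    # xor_sum reduction over all elements, presented as a uint64 value.
    return reduce(xor, bits.flatten().tolist(), 0) & 0xFFFFFFFFFFFFFFFF
```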





mikaylagawarecki added a commit that referenced this pull request Jul 22, 2025
ghstack-source-id: 6669fda
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 22, 2025
ghstack-source-id: 7726e76
Pull Request resolved: #154149
Comment on lines +184 to +185
// return a double, otherwise uint64_t will be cast to double
// when accumulating and the result will be wrong
Collaborator
Oh, why?

Contributor Author

@mikaylagawarecki commented Jul 23, 2025

Let's say we have inputs:

a = 3.14159, a_bits = 4614256650576692846
b = 1.61803, b_bits = 4609965778477721196
a_bits ^ b_bits = 9219082337818812418

If we returned the result as int64_t, the next time it is used as an input it would be value-converted to the double nearest 9219082337818812418, whose bit pattern is 4890905006165143848, which would corrupt the xor reduction here.
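The corruption is easy to reproduce in plain Python, where `struct` does the bit reinterpretation (the numbers below are the ones from the example above):

```python
import struct

def double_bits(x: float) -> int:
    # Reinterpret a double's raw IEEE-754 bits as an unsigned 64-bit int.
    return struct.unpack("<Q", struct.pack("<d", x))[0]

a_bits = double_bits(3.14159)  # 4614256650576692846
b_bits = double_bits(1.61803)  # 4609965778477721196
acc = a_bits ^ b_bits          # 9219082337818812418

# Round-tripping the accumulator through a double value-converts (rounds)
# the integer, so its bit pattern no longer matches the xor result.
assert double_bits(float(acc)) != acc  # 4890905006165143848 != 9219082337818812418
```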

mikaylagawarecki added a commit that referenced this pull request Jul 23, 2025
ghstack-source-id: 906195d
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 23, 2025
ghstack-source-id: fadf8f5
Pull Request resolved: #154149
mikaylagawarecki added a commit that referenced this pull request Jul 23, 2025
ghstack-source-id: 9ac76da
Pull Request resolved: #154149
Collaborator

@albanD left a comment

Sounds good to me!

@mikaylagawarecki
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

yangw-dev pushed a commit that referenced this pull request Aug 1, 2025

Pull Request resolved: #154149
Approved by: https://github.com/albanD
@github-actions deleted the gh/mikaylagawarecki/313/head branch August 24, 2025 02:19