Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix hash(::BigInt) on 32 bit systems #50076

Merged
merged 5 commits into from Jun 6, 2023

Conversation

oscardssmith
Copy link
Member

Theoretically fixes #50075 (CI will let us know for sure).

@oscardssmith oscardssmith added kind:bugfix This change fixes an existing bug domain:bignums BigInt and BigFloat domain:hashing backport 1.6 Change should be backported to release-1.6 backport 1.9 Change should be backported to release-1.9 labels Jun 5, 2023
@oscardssmith oscardssmith changed the title fix hash(BigInt) on 32 bit systems fix hash(::BigInt) on 32 bit systems Jun 5, 2023
@LilithHafner
Copy link
Member

Wow you're fast!

@oscardssmith
Copy link
Member Author

you're assuming that the fix is actually correct :) I am mostly just guessing.

@LilithHafner
Copy link
Member

Hmm, yeah. On 32-bit systems the pointer in a BigInt is a UInt32 pointer, so creating a UInt64 pointer and unsafe_loading from it may be broken when x.alloc == 1.

@oscardssmith
Copy link
Member Author

Yeah. I don't think this is workable. Let's eval it out then.

@oscardssmith
Copy link
Member Author

grr. The collision test fails. Any thoughts?

@LilithHafner
Copy link
Member

The (length(types) - 1) ^ 2 upper bound in the collision test is cute, but afaict, the comment explaining why it works is wrong (or maybe its too clever for me cc @vtjnash #38031). My unmerged approach when adding BigFloat tests is to replace it with a hardcoded bound, which is also not good.

The best approach would be to adjust BigInt and BigFloat hashing to reduce collisions, but we shouldn't hold these bugfixes for that.

@rfourquet
Copy link
Member

Was this broken only since 1.9? Do we know what is broken in the current code for 32 bits systems?

@LilithHafner
Copy link
Member

The following is merely speculation:

Was this broken only since 1.9?

Unlikely

Do we know what is broken in the current code for 32 bits systems?

I believe Int64 gets hashed in one go while BigInt gets hashed one UInt32 chunk at a time, so for numbers in the range typemax(Int32) < x < typemax(Int64), BigInts and Int64s are hashed differently.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jun 6, 2023

The hash range used to fold back on itself for certain types, so it is not a comment about the required guarantee but simply a statement about why it was not zero anymore and why the resulting collisions are still rare enough (and not e.g. clustered)

@oscardssmith
Copy link
Member Author

This has been broken since at least #38031 (it goes further back, but git blame is hard to track then).

@jmkuhn
Copy link
Contributor

jmkuhn commented Jun 6, 2023

This has been broken since at least #38031 (it goes further back, but git blame is hard to track then).

Testing a few old 32 bit arm builds that I have, it worked in v1.4.2 and was broken by v1.5.3.

@KristofferC KristofferC mentioned this pull request Jun 6, 2023
36 tasks
@oscardssmith oscardssmith merged commit c3ea5dc into JuliaLang:master Jun 6, 2023
6 checks passed
@oscardssmith oscardssmith deleted the 32-bit-BigInt-hash branch June 6, 2023 17:26
KristofferC pushed a commit that referenced this pull request Jun 27, 2023
* don't define hash(::BigInt) on 32 bit systems

(cherry picked from commit c3ea5dc)
KristofferC pushed a commit that referenced this pull request Jun 27, 2023
* don't define hash(::BigInt) on 32 bit systems

(cherry picked from commit c3ea5dc)
KristofferC pushed a commit that referenced this pull request Jun 27, 2023
* don't define hash(::BigInt) on 32 bit systems

(cherry picked from commit c3ea5dc)
@KristofferC KristofferC removed the backport 1.9 Change should be backported to release-1.9 label Jul 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.6 Change should be backported to release-1.6 domain:bignums BigInt and BigFloat domain:hashing kind:bugfix This change fixes an existing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hash(::BigInt) broken in 1.9 on 32-bit systems
6 participants