Skip to content

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Sep 29, 2025

Now that hashing has 3 interfaces (pointer (unsafe), array (indexable), iterable) in decreasing levels of typical optimization and performance, use those instead of making custom implementations for specific types. This automatically opts all AbstractString into fast hashing if they've correctly defined the codeunit string interface.

@oscardssmith
Copy link
Member

Won't this be a massive slowdown as it goes from hashing word at a time to byte at a time?

…rface

Now that hashing has 3 interfaces (pointer (unsafe), array (indexable),
iterable) in decreasing levels of typical optimization and performance,
use those instead of making custom implementations for specific types.
This automatically opts all AbstractString into fast hashing if they've
correctly defined the `codeunit` string interface.
@vtjnash vtjnash force-pushed the jn/rapidhash-all-the-things branch from d254cdc to 0e4af2d Compare September 29, 2025 20:19
@vtjnash vtjnash merged commit afb44f5 into master Sep 30, 2025
7 checks passed
@vtjnash
Copy link
Member Author

vtjnash commented Sep 30, 2025

No, the hash function is unchanged. The original PR adding these functions contains the benchmarks as well

@vtjnash vtjnash deleted the jn/rapidhash-all-the-things branch September 30, 2025 13:50
vtjnash added a commit that referenced this pull request Sep 30, 2025
This became unsound to use even though it was preserved "to avoid
breakage" in v1.13, since continuing to use it would give incorrect hash
results, which could result in corrupt dictionaries.

Since #59691, these broken `hash` methods can now simply be deleted as
they no longer provide any value.

It is hard to say whether this is technically breaking or not as a
change. It causes packages to go from giving subtly wrong answers (the
worst kind of wrong) to crashing in v1.13, until the offending incorrect
methods are deleted.
@adienes
Copy link
Member

adienes commented Sep 30, 2025

it is about 10x slower for big BitIntegers

julia> using BenchmarkTools, BitIntegers

# PR
julia> @btime hash(x) setup=x=rand(UInt1024)
  401.455 ns (0 allocations: 0 bytes)

# yesterday
julia> @btime hash(x) setup=x=rand(UInt1024)
  47.439 ns (0 allocations: 0 bytes)

# 1.12
julia> @btime hash(x) setup=x=rand(UInt1024)
  102.354 ns (0 allocations: 0 bytes)

xal-0 pushed a commit to xal-0/julia that referenced this pull request Sep 30, 2025
…rface (JuliaLang#59691)

Now that hashing has 3 interfaces (pointer (unsafe), array (indexable),
iterable) in decreasing levels of typical optimization and performance,
use those instead of making custom implementations for specific types.
This automatically opts all AbstractString into fast hashing if they've
correctly defined the `codeunit` string interface.
@adienes adienes added the hashing label Oct 2, 2025
vtjnash added a commit that referenced this pull request Oct 3, 2025
This became unsound to use even though it was preserved "to avoid
breakage" in v1.13, since continuing to use it would give incorrect hash
results, which could result in corrupt dictionaries.

Since #59691, these broken `hash` methods can now simply be deleted as
they no longer provide any value.

It is hard to say whether this is technically breaking or not as a
change. It causes packages to go from giving subtly wrong answers (the
worst kind of wrong) to crashing in v1.13, until the offending incorrect
methods are deleted.
vtjnash added a commit that referenced this pull request Oct 4, 2025
This became unsound to use even though it was preserved "to avoid
breakage" in v1.13, since continuing to use it would give incorrect hash
results, which could result in corrupt dictionaries and unsound
programs.

Since #59691, these broken `hash` methods in the ecosystem can now
simply be deleted as they no longer provide any value.

It is hard to say whether this is technically breaking or not as a
change. It causes packages to go from giving subtly wrong answers (the
worst kind of wrong) to crashing in v1.13, until the offending incorrect
methods are deleted.

n.b. this is expected to break several packages (notably among them,
JuliaInterpreter, which has several tests specifically just for the
existence of this global), which will require some ecosystem updates.
The update should just be to delete the offending method (they is now
redundant, undesirable, and have been giving unsafely buggy answers),
but we need to do some work to identify those places and release new
versions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants