Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support hashtables with more than 2G elements #26

Closed
Bodigrim opened this issue Mar 13, 2024 · 4 comments
Closed

Support hashtables with more than 2G elements #26

Bodigrim opened this issue Mar 13, 2024 · 4 comments
Assignees

Comments

@Bodigrim
Copy link
Contributor

Some users run Haskell on machines with over 1TB memory, it's not impossible to have over 2G elements of hashtable.

primes = UI.fromList [
3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631,
761, 919, 1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143,
14591, 17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363,
156437, 187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897,
1162687, 1395263, 1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471,
7199369, 8639249, 10367101, 12440537, 14928671, 17914409, 21497293, 25796759, 30956117,
37147349, 44576837, 53492207, 64190669, 77028803, 92434613, 110921543, 133105859, 159727031,
191672443, 230006941, 276008387, 331210079, 397452101, 476942527, 572331049, 686797261,
824156741, 988988137, 1186785773, 1424142949, 1708971541, 2050765853 ]

What's the principle behind the choice of prime numbers here? How can it be extended further, to cover entire Int64 range?

@klapaucius
Copy link
Owner

the principle is:

[3, 7 ... primeNumber_n, primeNumber_n+k > 1.2*primeNumber_n ...

so it can easily be extended given a sequence of prime numbers

@klapaucius
Copy link
Owner

Something like this

Prelude Math.NumberTheory.Primes Data.List> takeWhile (\x -> x < fromIntegral(maxBound :: Int)) . map unPrime $ iterate (\p -> nextPrime (ceiling $ 1.2 * fromIntegral (unPrime p))) (nextPrime 2050765853)
[2050765853,2460919037,2953102909,3543723499,4252468211,5102961869,6123554263,7348265123,8817918149,10581501797,12697802207,15237362659,18284835221,21941802293,26330162767,31596195347,37915434433,45498521329,54598225597,65517870827,78621445013,94345734019,113214880859,135857857043,163029428461,195635314181,234762377023,281714852449,338057822959,405669387577,486803265103,584163918127,700996701773,841196042177,1009435250623,1211322300767,1453586761009,1744304113229,2093164935877,2511797923093,3014157507727,3616989009287,4340386811153,5208464173387,6250157008087,7500188409719,9000226091663,10800271310027,12960325572053,15552390686477,18662868823783,22395442588543,26874531106253,32249437327559,38699324793071,46439189751707,55727027702123,66872433242599,80246919891143,96296303869429,115555564643329,138666677571997,166400013086401,199680015703691,239616018844471,287539222613371,345047067136049,414056480563273,496867776675943,596241332011217,715489598413487,858587518096247,1030305021715513,1236366026058641,1483639231270373,1780367077524493,2136440493029393,2563728591635279,3076474309962373,3691769171954869,4430123006345843,5316147607615027,6379377129138049,7655252554965709,9186303065958853,11023563679150639,13228276414980799,15873931697977019,19048718037572429,22858461645086941,27430153974104333,32916184768925261,39499421722710373,47399306067252449,56879167280702959,68255000736843607,81906000884212361,98287201061054833,117944641273265887,141533569527919211,169840283433503041,203808340120203719,244570008144244481,293484009773093401,352180811727712129,422616974073254567,507140368887905497,608568442665486697,730282131198584087,876338557438300843,1051606268925960971,1261927522711153157,1514313027253383839,1817175632704060721,2180610759244872743,2616732911093847187,3140079493312616483,3768095391975139873,4521714470370167831,5426057364444201011,6511268837333041219,7813522604799648773]

but probably fewer than that because each element is a bunch of words

@klapaucius klapaucius self-assigned this Mar 13, 2024
@Bodigrim
Copy link
Contributor Author

the principle is:

[3, 7 ... primeNumber_n, primeNumber_n+k > 1.2*primeNumber_n ...

so it can easily be extended given a sequence of prime numbers

Thanks for the explanation! So fundamentally there is enough freedom to select a prime around +20%.

I have an idea to choose primes such that division by them (which is ubiquitous throughout all routines) can be replaced by wide multiplication and right shift, but have not worked out all details yet.

@Bodigrim
Copy link
Contributor Author

Fixed by #27.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants