
Question about the hashtable.Map growThreshold #79

Closed
woodliu opened this issue Apr 11, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@woodliu

woodliu commented Apr 11, 2024

A growThreshold is defined here: if the total number of nodes in the map table rises above this value, the table needs to be resized.
growThreshold is defined as growThreshold := float64(tableLen) * bucketSize * loadFactor, where tableLen (len(t.buckets)) is the number of bucket chains (t.buckets[i] is a chain), not the total number of nodes. So we can't compare the chain count with the node count like this:
if t.sumSize() > int64(growThreshold)

Am I right?
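The check being discussed can be sketched as follows. This is a minimal illustration, not the actual hashtable code; the constant values and the `needsGrow` name are assumptions. The key point is that multiplying `tableLen` by `bucketSize` converts a bucket count into a capacity in nodes, which makes the comparison with the node count dimensionally sound:

```go
package main

import "fmt"

// Hypothetical constants mirroring the ones discussed in the issue;
// the real values live in the hashtable package.
const (
	bucketSize = 3    // node slots per bucket
	loadFactor = 0.75 // fill ratio that triggers a resize
)

// needsGrow reproduces the check in question: tableLen is the number
// of buckets (len(t.buckets)), size is the total number of stored
// nodes (t.sumSize()). tableLen * bucketSize is the table's capacity
// in nodes, so comparing it (scaled by loadFactor) against the node
// count is valid.
func needsGrow(tableLen int, size int64) bool {
	growThreshold := float64(tableLen) * bucketSize * loadFactor
	return size > int64(growThreshold)
}

func main() {
	// 128 buckets * 3 slots * 0.75 = 288 nodes before a resize triggers.
	fmt.Println(needsGrow(128, 288)) // false
	fmt.Println(needsGrow(128, 289)) // true
}
```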

@woodliu woodliu added the enhancement New feature or request label Apr 11, 2024
@maypok86
Owner

maypok86 commented Apr 12, 2024

Yes, we cannot compare them the same way we would with, for example, slices. It's not really about the growth of the hash table, but about the growth of the number of buckets. Why is this necessary, given that items could simply keep being added to a chain indefinitely? Because in that case lookup time complexity degrades to O(n), while we want to support lookups and insertions in amortized O(1).

This check is necessary to detect that there are too many items in the hash table for the current number of buckets. When that happens, you need to increase the number of buckets and recompute the items' hashes, which preserves the O(1) time complexity. Also, when rehashing, a new seed is selected and the items' hashes are recomputed, which means an attacker would need to do something nearly impossible to carry out a hash collision attack.

@woodliu
Author

woodliu commented Apr 15, 2024


I see, so this is a strategy to lower the time complexity by increasing the number of buckets and decreasing the chain length when there are many nodes.
This is smart!

@woodliu woodliu closed this as completed Apr 15, 2024