Bigcache short description #122

Closed
mangalaman93 opened this issue Feb 27, 2019 · 8 comments
@mangalaman93

We are working on an article about the current state of concurrent cache implementations in Go. We are planning to include bigcache with the following short description. Please let me know if my understanding is correct, or if you would like to modify it.

BigCache divides the data into shards based on the hash of the key. Each shard
contains a map and a ring buffer. Whenever a new element is set, it is appended
to the ring buffer of the corresponding shard and its offset into the buffer is
stored in the map. If the same element is set more than once, the previous
entries in the buffer are marked invalid. If the buffer is too small, it is
expanded until the maximum capacity is reached.

The map maps uint32 to uint32. Each key is hashed into a uint32 that becomes
the key of the map. The value of the map is the offset into the buffer where
the value is stored, along with metadata. If there are hash collisions,
BigCache ignores the previous key and stores the current one in the map.
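
For readers skimming the description above, here is a minimal sketch in Go of the map-plus-buffer idea. It is not BigCache's actual code; the shard type, the entry layout, and the helper names are illustrative, and the ring-buffer wrap-around and eviction are omitted.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// shard holds a hash->offset index and a flat byte buffer standing in for
// the ring buffer (wrap-around and eviction are omitted in this sketch).
type shard struct {
	index map[uint32]uint32 // hashed key -> offset into buf
	buf   []byte
}

func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// set appends the value to the buffer and records its offset in the index.
// If the hash is already present (same key set again, or a hash collision),
// the previous offset is simply overwritten and the old bytes are left behind.
func (s *shard) set(key string, value []byte) {
	offset := uint32(len(s.buf))
	var length [4]byte // entry layout: 4-byte length prefix, then the value
	binary.LittleEndian.PutUint32(length[:], uint32(len(value)))
	s.buf = append(s.buf, length[:]...)
	s.buf = append(s.buf, value...)
	s.index[hashKey(key)] = offset
}

// get looks up the offset for the hashed key and reads the value back.
func (s *shard) get(key string) ([]byte, bool) {
	offset, ok := s.index[hashKey(key)]
	if !ok {
		return nil, false
	}
	n := binary.LittleEndian.Uint32(s.buf[offset : offset+4])
	return s.buf[offset+4 : offset+4+n], true
}

func main() {
	s := &shard{index: make(map[uint32]uint32)}
	s.set("user:1", []byte("alice"))
	s.set("user:1", []byte("bob")) // old entry stays in buf; the index now points at the new one
	v, _ := s.get("user:1")
	fmt.Println(string(v)) // bob
}
```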
@cristaloleg
Collaborator

LGTM, but pinging original authors to clarify this @druminski @janisz @mxplusb

@janisz
Collaborator

janisz commented Feb 27, 2019

LGTM 👍

@siennathesane
Collaborator

LGTM 👍

@larytet

larytet commented Aug 5, 2019

I would add more information regarding collisions. I had to dig into the code to understand how collisions are handled. I think I understand the trade-offs of handling collisions, but I am not sure every application can ignore hash function collisions. Probably most applications can.

I think it is not hard to handle collisions of the hash function (different key, same hash of the key). For example, if a call to Set() detects a collision, I can increment the hash of the key and try the insert again. I can retry a few times before returning a failure from Set() or resizing the map. In Get() I run a similar trial process: compare the requested and found keys until there is a match.
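
A rough sketch of that probing idea, purely illustrative and not part of BigCache (the probingIndex type and the maxProbes bound are made up for this example):

```go
package probing

import (
	"errors"
	"hash/fnv"
)

const maxProbes = 4 // a small bound keeps both Set and Get O(1)

type entry struct {
	key   string
	value []byte
}

// probingIndex resolves collisions by bumping the hash and retrying a
// bounded number of times, as described above.
type probingIndex struct {
	slots map[uint32]entry // hash -> stored key and value
}

func sum32(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// Set probes up to maxProbes slots; a slot is reusable if it is empty or
// already holds the same key. After that it gives up (or could resize).
func (p *probingIndex) Set(key string, value []byte) error {
	h := sum32(key)
	for i := uint32(0); i < maxProbes; i++ {
		e, occupied := p.slots[h+i]
		if !occupied || e.key == key {
			p.slots[h+i] = entry{key: key, value: value}
			return nil
		}
	}
	return errors.New("too many collisions")
}

// Get runs the same probe sequence and compares the stored key with the
// requested one until there is a match.
func (p *probingIndex) Get(key string) ([]byte, bool) {
	h := sum32(key)
	for i := uint32(0); i < maxProbes; i++ {
		if e, occupied := p.slots[h+i]; occupied && e.key == key {
			return e.value, true
		}
	}
	return nil, false
}
```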

@cristaloleg
Collaborator

@janisz ? :)

@janisz
Collaborator

janisz commented Aug 6, 2019

@larytet what is the question? I don't get it.

We do not handle collisions at all, see:

func TestHashCollision(t *testing.T) {

Your approach is valid and popular in hashmap implementations (see wiki). Let's imagine we get the same hash for every entry (the worst-case scenario): then the complexity is O(n), where n is the number of existing entries, because we need to iterate over the whole map to find the matching element. We could improve this by using a tree instead of a map, but then we need a way to compare elements.
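
As an illustration of that behaviour, a hasher that maps every key to the same value forces a collision through the public API. This is only a sketch: the constHasher type is made up for the example, the import path is the github.com/allegro/bigcache module as it was at the time of this thread, and exactly what Get returns for the clobbered key depends on the version.

```go
package main

import (
	"fmt"
	"time"

	"github.com/allegro/bigcache"
)

// constHasher maps every key to the same 64-bit hash, so every Set collides.
type constHasher struct{}

func (constHasher) Sum64(string) uint64 { return 42 }

func main() {
	config := bigcache.DefaultConfig(10 * time.Minute)
	config.Hasher = constHasher{}

	cache, err := bigcache.NewBigCache(config)
	if err != nil {
		panic(err)
	}

	cache.Set("alice", []byte("value-for-alice"))
	cache.Set("bob", []byte("value-for-bob")) // same hash: replaces the entry for "alice"

	// "alice" is no longer retrievable as originally stored; depending on the
	// version this returns "bob"'s bytes or an entry-not-found error.
	v, err := cache.Get("alice")
	fmt.Println(string(v), err)
}
```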

Does it answer your question?

@larytet

larytet commented Aug 6, 2019

@larytet what is the question? I don't get it.

We do not handle collisions at all, see:

Does it answer your question?

My comment was about the "Bigcache short description" (the README?). I suggested mentioning that the implementation ignores hash collisions.

I find the lack of any collision handling surprising. It is not hard to provide a mechanism that significantly reduces the probability of collisions. The complexity remains O(1) if the linear probing is limited to a small number of slots.

In my naivety I expected Set() to fail if a different key with the same hash is already stored in the cache. This is not the case. For some applications this will be a show stopper.

janisz added a commit to janisz/bigcache that referenced this issue Aug 7, 2019
@janisz
Collaborator

janisz commented Aug 7, 2019

@larytet Oh, thanks for the clarification. Let's create a new task to fix it. I updated the readme for now: #151

flisky pushed a commit to flisky/bigcache that referenced this issue May 7, 2020