Potential race condition with insert-wait-get #57

Open
blind-oracle opened this issue Oct 16, 2023 · 2 comments
Comments

@blind-oracle

blind-oracle commented Oct 16, 2023

Hi folks, I'm not sure this is a bug, but I can't find another explanation.

We have a cache based on stretto (sync API) in our service with simple semantics:

  • try_insert_with_ttl() then wait()
  • get() to fetch the value

I have a test that does a simple insert/wait/get sequence to check that a given entry exists in the cache. In our CI/CD (Bazel) this test sometimes fails: get() reports that the key is missing. The problem is that I cannot reproduce this locally, where it has a 100% success rate even if I run it thousands of times.

I am creating the cache with a large enough max_cost and using a TTL of 3600s to make sure the entry won't be evicted.

I'd be grateful for any hint on how to debug this; maybe I'm doing something wrong. But my usage seems consistent with the code in https://github.com/al8n/stretto/blob/main/examples/sync_example.rs

@al8n
Owner

al8n commented Oct 17, 2023

Hi, I failed to reproduce it on my machine. It may happen because the current implementation first pushes a new entry to a write buffer and only then adds it to the map (if the write buffer is full, some inserts are dropped outright). Can you share your test code to help me reproduce the problem?
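
To illustrate the write-buffer behavior described above, here is a minimal toy model using only the Rust standard library. It is not stretto's implementation: `ToyCache`, `Msg`, and `demo` are hypothetical names, and the bounded channel merely stands in for the write buffer. The sketch stalls the background writer on purpose so the buffer fills up, which shows how an insert can be dropped even though a later wait() returns normally.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Messages sent to the background writer thread (hypothetical toy type).
enum Msg {
    Insert(String, String),
    /// Barrier: the writer signals on this channel once it reaches it.
    Wait(SyncSender<()>),
}

/// Toy cache: writes pass through a bounded buffer before reaching the map.
struct ToyCache {
    map: Arc<Mutex<HashMap<String, String>>>,
    tx: SyncSender<Msg>,
}

impl ToyCache {
    fn new(buffer_size: usize) -> Self {
        let map = Arc::new(Mutex::new(HashMap::new()));
        let (tx, rx) = sync_channel::<Msg>(buffer_size);
        let writer_map = Arc::clone(&map);
        // Background writer: applies buffered inserts in FIFO order.
        thread::spawn(move || {
            for msg in rx {
                match msg {
                    Msg::Insert(k, v) => {
                        writer_map.lock().unwrap().insert(k, v);
                    }
                    Msg::Wait(done) => {
                        let _ = done.send(());
                    }
                }
            }
        });
        ToyCache { map, tx }
    }

    /// Returns false if the buffer was full and the write was dropped.
    fn insert(&self, key: &str, value: &str) -> bool {
        self.tx
            .try_send(Msg::Insert(key.to_string(), value.to_string()))
            .is_ok()
    }

    /// Blocks until every write queued before this call has been applied.
    fn wait(&self) {
        let (done_tx, done_rx) = sync_channel(1);
        self.tx.send(Msg::Wait(done_tx)).unwrap();
        done_rx.recv().unwrap();
    }

    fn get(&self, key: &str) -> Option<String> {
        self.map.lock().unwrap().get(key).cloned()
    }
}

/// Returns (inserts accepted into the buffer, keys found after wait).
fn demo() -> (usize, usize) {
    let cache = ToyCache::new(2);
    let mut results = Vec::new();
    {
        // Stall the writer by holding the map lock so the buffer fills up.
        let _guard = cache.map.lock().unwrap();
        for i in 0..10 {
            let key = format!("k{}", i);
            let accepted = cache.insert(&key, "v");
            results.push((key, accepted));
        }
    }
    cache.wait();
    let queued = results.iter().filter(|(_, ok)| *ok).count();
    let hits = results
        .iter()
        .filter(|(k, _)| cache.get(k).is_some())
        .count();
    (queued, hits)
}

fn main() {
    let (queued, hits) = demo();
    println!("queued {} of 10 inserts, found {} after wait", queued, hits);
}
```

In this model only the inserts that the buffer accepted are visible after wait(); the rejected ones are never applied. If the real cache behaves similarly, checking the boolean returned by the insert call in the failing test would show whether the entry was dropped before it ever reached the map.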

@blind-oracle
Author

@al8n Thanks for the effort. My code probably won't help, as I can't reproduce it either when running locally outside Bazel. It's hard for me to tell how these environments differ; they should be the same (and we have thousands of other tests which run fine). But there must be some subtle difference that causes this...

Effectively I'm doing the simple thing I described initially: insert the value with some key, then immediately (well, after wait()) check whether it's there. And this sometimes gives me a cache miss. When the value is inserted the cache is empty, just created, so the insert shouldn't be dropped.

Maybe there's some initialization phase after the Cache object is created using CacheBuilder.finalize() (threads are spawned, etc.)? Though that does not explain why it does not fail locally.

I've now switched the cache to use stretto's async API and will check whether that causes the same issue...
