
Prevent timing issues causing inconsistencies between concurrent hash table and policy data structures #348

Merged — 19 commits into main from fix-race-in-handle-upseart on Dec 27, 2023

Conversation

tatsuya6502 (Member) commented on Nov 19, 2023

Description

Fixes #345.

This pull request (PR) prevents timing issues in writes that caused inconsistencies between the cache's internal data structures, such as the concurrent hash table (CHT) and the access-order queue.

Example

The following new test case reproduces this issue in v0.12.0 and v0.12.1:

  1. Create a Cache with max_capacity = 2 and time_to_idle = 1 second.
  2. Insert key a.
  3. Insert key b.
  4. Insert key c.
    • → When pending tasks are processed (at step 6), c will be evicted as the cache is full.
  5. Pause for 2 seconds.
    • → When pending tasks are processed (at step 6), a and b should expire.
  6. Insert key c again.
    • → The pending tasks up to step 4 should be processed implicitly, because more than 0.3 seconds have passed since the Cache was created.
  7. Remove key c.
  8. Pause for 2 seconds.
  9. Call run_pending_tasks.
    • → The pending tasks from step 6 and 7 should be processed.
  10. Get the entry_count.

Expected

  • At step 7, the value inserted at step 6 should be returned.
  • At step 10, 0 should be returned.

Actual

  • At step 7, no value was returned.
  • At step 10, 1 was returned.

Cause

When processing pending tasks, Cache did not check whether more than one pending task exists for the same key.

In the above example, key c had two pending tasks at step 6: the first from step 4 (when the cache was full) and the second from step 6 (when the cache had enough room).

At step 6, the pending tasks up to step 4 are processed:

  1. Write operation log for inserting key a is processed:
    • Key a is admitted and added to the access-order queue (the LRU deque).
  2. Write operation log for inserting key b is processed:
    • Key b is admitted and added to the LRU.
  3. Write operation log for inserting key c (at step 4) is processed:
    • Key c is evicted because the Cache is full, and it is removed from the concurrent hash table (CHT).
    • This removes the key-value pair of c (from the second insertion at step 6) from the CHT (wrong).
  4. Key a is considered expired:
    • It is removed from both the CHT and LRU.
  5. Key b is considered expired:
    • It is removed from both the CHT and LRU.

Then, at step 9, the pending tasks from steps 6 and 7 are processed:

  1. Write operation log for inserting key c (at step 6) is processed:
    • Key c is admitted and added to the LRU.
    • (But there is no key-value pair for c in the CHT.)
  2. There is no write operation log for removing key c (at step 7).
    • This is because key-value for c did not exist at step 7.
  3. Key c is considered expired:
    • Key c is supposed to be removed from both the CHT and LRU, but it will not be, because it does not exist in the CHT.
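The failure mode above can be modeled with a toy sketch in plain Rust (std only). The names here are hypothetical, not moka's real types: a HashMap stands in for the concurrent hash table, and applying a stale eviction log removes whatever value the key currently holds, including one written by a later insert.

```rust
use std::collections::HashMap;

// Hypothetical write-operation log entries (not moka's real types).
enum WriteOp {
    Insert(&'static str),
    Evict(&'static str),
}

// The buggy behavior: an eviction log removes the key from the map
// unconditionally, even if a newer insert has since written a fresh value.
fn apply_logs(cht: &mut HashMap<&'static str, u32>, logs: &[WriteOp]) {
    for op in logs {
        match op {
            WriteOp::Insert(_) => { /* policy bookkeeping only */ }
            WriteOp::Evict(k) => {
                // No check that this log corresponds to the key's latest write.
                cht.remove(k);
            }
        }
    }
}

fn main() {
    let mut cht = HashMap::new();

    // Step 4: first insert of "c" writes to the map and queues a log;
    // the policy later decides to evict it because the cache is full.
    cht.insert("c", 1);
    let logs = vec![WriteOp::Insert("c"), WriteOp::Evict("c")];

    // Step 6: second insert of "c" lands before the stale logs are applied.
    cht.insert("c", 2);

    // Applying the stale Evict("c") also deletes the value from step 6.
    apply_logs(&mut cht, &logs);
    assert!(cht.get("c").is_none()); // the new value is gone (the bug)
}
```

The generation counters introduced by this PR close exactly this gap: they let the log processor tell a stale eviction apart from one that targets the key's latest write.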

This issue was already present in v0.11.x and older versions, but it was much less likely to occur there: those versions used background threads to periodically process pending tasks, so the timing window was much smaller than in v0.12.0 and v0.12.1.

Fixes

When processing write operation logs in pending tasks, check whether more than one log exists for the same key:

  • To achieve this, add two u16 counters to each key in the CHT to track the following generations (versions):
    • entry_gen: Incremented on every insertion, update, and removal of the key.
      • The write operation log carries the entry_gen value after the increment.
    • policy_gen: Updated every time a write operation log is processed for the key.
  • When processing a write operation log, check whether the entry_gen in the log equals the current entry_gen of the key.
    • If yes, and the key is not admitted, remove the key from the CHT.
    • If no, and the key is not admitted, do nothing for the key. (This solves the problem in the above example)
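The generation check can be sketched as follows (std-only Rust; EntryInfo and the function names are hypothetical stand-ins modeled on the PR's description, not moka's actual internals):

```rust
use std::sync::atomic::{AtomicU16, Ordering};

// Hypothetical per-key metadata, modeled on the PR's description.
struct EntryInfo {
    entry_gen: AtomicU16,  // incremented on every insert, update, and removal
    policy_gen: AtomicU16, // updated each time a write log for the key is applied
}

impl EntryInfo {
    fn new() -> Self {
        Self { entry_gen: AtomicU16::new(0), policy_gen: AtomicU16::new(0) }
    }

    // Called on insert/update/removal; returns the generation value the
    // write operation log should carry.
    fn bump_entry_gen(&self) -> u16 {
        self.entry_gen.fetch_add(1, Ordering::AcqRel) + 1
    }

    // True while the key still has unprocessed write logs.
    fn is_dirty(&self) -> bool {
        self.entry_gen.load(Ordering::Acquire) != self.policy_gen.load(Ordering::Acquire)
    }
}

// When the policy rejects the write carried by a log, remove the key from the
// CHT only if the log is for the key's latest write; a stale log must leave
// the newer value alone. Returns true if the key should be removed.
fn should_remove_on_reject(log_gen: u16, info: &EntryInfo) -> bool {
    let latest = log_gen == info.entry_gen.load(Ordering::Acquire);
    info.policy_gen.store(log_gen, Ordering::Release);
    latest
}

fn main() {
    let info = EntryInfo::new();
    let gen_from_step4 = info.bump_entry_gen(); // first insert of "c"
    let _gen_from_step6 = info.bump_entry_gen(); // second insert of "c"

    // Processing the stale log from step 4: do NOT remove the key.
    assert!(!should_remove_on_reject(gen_from_step4, &info));
    // The key is still dirty: the log from step 6 is yet to be applied.
    assert!(info.is_dirty());
}
```

With this check, the stale log from step 4 in the example no longer deletes the value written at step 6.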

After processing write operation logs, Cache will process other pending tasks, such as evicting expired entries. This PR also includes various small improvements to these steps by checking whether each key still has pending write operation logs. This is done by calling the is_dirty method, which returns true when entry_gen != policy_gen.
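The eviction-side checks can be sketched the same way (hypothetical names, plain u16 fields for brevity): expired entries that are still dirty are skipped, because their pending write logs may still change them.

```rust
// Hypothetical sketch of the is_dirty check added to the eviction paths
// (names are illustrative, not moka's internals).
struct Entry {
    key: &'static str,
    entry_gen: u16,  // bumped on every write to the key
    policy_gen: u16, // bumped when the key's write log is applied
    expired: bool,
}

impl Entry {
    fn is_dirty(&self) -> bool {
        self.entry_gen != self.policy_gen
    }
}

// Collect keys that are safe to evict: expired AND with no pending write logs.
fn evictable(entries: &[Entry]) -> Vec<&'static str> {
    entries
        .iter()
        .filter(|e| e.expired && !e.is_dirty())
        .map(|e| e.key)
        .collect()
}

fn main() {
    let entries = [
        Entry { key: "a", entry_gen: 1, policy_gen: 1, expired: true },  // evictable
        Entry { key: "c", entry_gen: 2, policy_gen: 1, expired: true },  // dirty: skip
        Entry { key: "b", entry_gen: 1, policy_gen: 1, expired: false }, // live: skip
    ];
    assert_eq!(evictable(&entries), vec!["a"]);
}
```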

Checklist

Add a check that the entry_gen in the write operation log equals the current entry_gen of the key:

  • future::base_cache::Internal struct
    • handle_upsert method
  • sync::base_cache::Internal struct
    • handle_upsert method

Add is_dirty checks to the following internal methods:

  • future::base_cache::Internal struct
    • evict_expired_entries_using_timers method
    • remove_expired_ao method
    • remove_expired_wo method
    • invalidate_entries method
    • evict_lru_entries method
  • sync::base_cache::Internal struct
    • evict_expired_entries_using_timers method
    • remove_expired_ao method
    • remove_expired_wo method
    • invalidate_entries method
    • evict_lru_entries method

Commit ("…data structures caused by timing issues"):

  • Replace the boolean `is_dirty` with two atomic u16 counters, `entry_gen` and `policy_gen`, to track the number of times a cached entry has been updated and the number of times its write log has been applied to the policy data structures.
  • Update an internal `handle_upsert` method to check the `entry_gen` counter when removing an entry from the hash table.

Commit ("…data structures caused by timing issues"):

  • Update the test case for checking the byte size of `EntryInfo`.
@tatsuya6502 tatsuya6502 self-assigned this Nov 19, 2023
@tatsuya6502 tatsuya6502 added the bug Something isn't working label Nov 19, 2023
@tatsuya6502 tatsuya6502 added this to the v0.12.2 milestone Nov 19, 2023
Commit ("…data structures caused by timing issues"):

  • Address some corner cases.
  • Add and update test cases.
Review comment on src/future/cache.rs (outdated, resolved).
Commit ("…data structures caused by timing issues"):

  • Fix a test for Linux 32-bit platforms.
@tatsuya6502 tatsuya6502 marked this pull request as ready for review November 19, 2023 15:39
Commit ("…data structures caused by timing issues"):

  • Update some source code comments.
tatsuya6502 (Member, Author) commented:

To clarify the remaining tasks, I added a tasks section with check boxes to the description of this PR.

tatsuya6502 (Member, Author) commented:

@peter-scholtens

FYI, I will be unable to work on Moka and other products for a couple of weeks from now. I will be in the hospital for medical treatment. I will have some chances to check GitHub Issues and Discussions from there, but will not actively respond to them.

Sorry for the inconvenience. I will resume work on this PR when I return from the hospital and everything is settled.

peter-scholtens (Contributor) commented:

Take your time, health is the most important thing in life.

Commit ("…data structures caused by timing issues"):

  • Update internal eviction methods to utilize the `is_dirty()` method.
Commit ("…data structures caused by timing issues"):

  • Fix compile errors when only the `sync` feature is enabled by making some internal methods available for the feature.
Commit ("…data structures caused by timing issues"):

  • Increment the entry generation when invalidating an entry.
Commit ("…data structures caused by timing issues"):

  • Update an internal `admit` method to skip a victim entry if it `is_dirty`.
tatsuya6502 (Member, Author) commented:

Updated the description of this issue.

Commit ("…data structures caused by timing issues"):

  • Brush up the change log and source code comments.
Commit ("…data structures caused by timing issues"):

  • Found a corner-case issue in the `handle_upsert` method and fixed it:
    • Fix the `handle_upsert` method by checking whether the `EntryInfo`s from the `WriteOp` and the concurrent hash table are the same.
    • Add a test case.
tatsuya6502 (Member, Author) left a review comment:

Merging.

@tatsuya6502 tatsuya6502 merged commit 35f4f11 into main Dec 27, 2023
38 checks passed
@tatsuya6502 tatsuya6502 deleted the fix-race-in-handle-upseart branch December 27, 2023 02:41
Swatinem (Contributor) commented:

FYI, I believe this fixed a long standing issue we were observing where the cache hit rate would slowly degrade over time after ~1 week of operation.

Updating to 0.12.2, the hit rate has been stable without any degradation for 2+ weeks:

(screenshot: cache hit rate over time)

tatsuya6502 (Member, Author) commented:

Thank you for reporting. That is great news!

I didn't realize it until you mentioned it, but they should be related. The entries affected by this bug are counted in the cache usage even though they have no value (thus causing cache misses). Since they will never be evicted, the hit rate gradually decreases as the cache fills up with these broken entries.

Labels: bug (Something isn't working)
Projects: None yet
Development: Successfully merging this pull request may close these issues:
  • Entry count never reaches zero even when calling run_pending_tasks() in a loop [moka future 0.12]
3 participants