Drop evicted entries immediately #169

Merged on Jul 24, 2022 (14 commits)

@tatsuya6502 (Member) commented Jul 21, 2022

Changes

This PR makes the sync and future caches drop the value part of evicted entries immediately. It does so by calling the `flush` method of `crossbeam_epoch::Guard` when necessary.

There is no change in the public API.

Background

`moka::cht` uses crossbeam-epoch's `defer_unchecked` method, which takes a closure as its argument, to drop evicted entries. However, to improve throughput, crossbeam-epoch stashes these deferred functions in thread-local storage until enough of them accumulate (62 per thread). They are not executed until they are pushed to the global queue.

So, if a client has N threads calling Moka's write methods (`insert`, `get_with`, `invalidate`, etc.), crossbeam-epoch can delay dropping up to N * (62 - 1) evicted or invalidated entries.
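As a quick illustration of that bound, here is a minimal sketch of the arithmetic. The constant and function names are hypothetical; only the figure 62 comes from the description above.

```rust
/// Number of deferred functions buffered per thread before they are pushed
/// to the global queue, as quoted in this PR description.
const LOCAL_BUFFER_SIZE: usize = 62;

/// Upper bound on the number of entries whose drop can be delayed when
/// `writer_threads` threads each hold an almost-full thread-local buffer.
fn max_delayed_drops(writer_threads: usize) -> usize {
    writer_threads * (LOCAL_BUFFER_SIZE - 1)
}
```

For example, with 8 writer threads the bound is 8 * 61 = 488 entries.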

This PR ensures that these deferred functions will be executed as soon as possible by calling the flush method of crossbeam_epoch::Guard in a timely manner.
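The buffering behavior described above can be sketched with a toy model that uses only the standard library. `DeferredBag` and `dropped_after` are hypothetical names, and this is not crossbeam-epoch's implementation: the real library pushes a full buffer to a global queue and runs it later, whereas this sketch simply runs the closures once the buffer is full or is flushed.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy stand-in for a thread-local buffer of deferred functions.
/// Hypothetical type, not part of crossbeam-epoch.
struct DeferredBag {
    capacity: usize,
    pending: Vec<Box<dyn FnOnce()>>,
}

impl DeferredBag {
    fn new(capacity: usize) -> Self {
        Self { capacity, pending: Vec::new() }
    }

    /// Stash a deferred function; run the stashed ones only when the buffer is full.
    fn defer(&mut self, f: impl FnOnce() + 'static) {
        self.pending.push(Box::new(f));
        if self.pending.len() >= self.capacity {
            self.flush();
        }
    }

    /// Run all stashed functions now, however few there are.
    fn flush(&mut self) {
        for f in self.pending.drain(..) {
            f();
        }
    }
}

/// Defers `n` simulated drops and returns how many have actually run,
/// optionally flushing after every deferral.
fn dropped_after(n: usize, flush_each_time: bool) -> usize {
    let dropped = Rc::new(Cell::new(0usize));
    let mut bag = DeferredBag::new(62);
    for _ in 0..n {
        let counter = Rc::clone(&dropped);
        bag.defer(move || counter.set(counter.get() + 1));
        if flush_each_time {
            bag.flush();
        }
    }
    dropped.get()
}
```

In this model, deferring 61 drops without flushing runs none of them, while flushing after each deferral runs all 61 right away.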

- Call the `flush` method of `crossbeam_epoch::Guard` when necessary.
- Temporarily add a crate feature `flush` to enable and disable calling the `flush`.
@tatsuya6502 self-assigned this Jul 21, 2022
@tatsuya6502 added the `enhancement` (New feature or request) label Jul 21, 2022
@tatsuya6502 added this to the v0.9.3 milestone Jul 21, 2022
- Compile the test program only when `sync` feature is enabled.
- Use `AtomicU32` instead of `AtomicU64` in the test program because some target
  platforms do not support `AtomicU64`.
Avoid compile errors in the test program when `sync` feature is disabled.
Call the `flush` method of `crossbeam_epoch::Guard` when necessary.
clippy 0.1.63 (efd358333ac 2022-07-16)
- Add unit tests.
- Update the entry_lifecycle example.
Remove the temporary crate feature `flush` to make the calls to the `flush`
method of `crossbeam_epoch::Guard` always enabled.
Remove the temporary crate feature `flush`. (Forgot to check Cargo.toml in)
Remove a temporary example program.
Attempt to stabilize a test `drop_value_immediately_after_eviction` for
`sync::SegmentedCache`.
Attempt to stabilize a test `drop_value_immediately_after_eviction` for
`sync::SegmentedCache`.
Attempt to stabilize a test `drop_value_immediately_after_eviction` for
`future::Cache` on QEMU user mode emulators.
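
The test code itself is not shown in this conversation. As a rough sketch of how a drop-counting test value might look, assuming a shared `AtomicU32` counter as mentioned in the commit messages above (`Tracked` and `count_drops` are hypothetical names):

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Hypothetical test value that records its own drop in a shared counter.
/// `AtomicU32` is used because some targets do not support `AtomicU64`.
struct Tracked {
    drops: Arc<AtomicU32>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Creates `n` tracked values, drops them, and returns the observed drop count.
fn count_drops(n: u32) -> u32 {
    let drops = Arc::new(AtomicU32::new(0));
    let values: Vec<Tracked> = (0..n)
        .map(|_| Tracked { drops: Arc::clone(&drops) })
        .collect();
    // A cache test would evict or invalidate entries here instead.
    drop(values);
    drops.load(Ordering::SeqCst)
}
```

A test in this style can assert on the counter right after an eviction to check that the value was dropped without delay.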
@tatsuya6502 (Member, Author)

I ran some performance tests using mokabench with ARC-S3 workload.

As expected, this change added some performance overhead:

  1. For "Sync Segmented" tests, where all processor cores were saturated (100% CPU utilization), the benchmarks took ~10% to ~25% longer than before this change:
     a. ~25% longer when the cache hit ratio was ~10%.
     b. ~10% longer when the cache hit ratio was ~65%.
  2. For "Sync Cache" and "Async Cache" tests, where CPU utilization was ~60%, the duration did not increase.

I think the overhead will be acceptable for real-world workloads. Those workloads will be much lighter than both 1 and 2, so the overhead will be much smaller.

Here is a summary of when flush will be called:

|   | When `flush` will be called | Per entry overhead |
|---|---|---|
| A | `get_with` or `try_get_with` on a non-existing key | High |
| B | `insert` over an existing key | High |
| C | `invalidate` an existing key | High |
| D | `invalidate_all` | Low (called only once after processing many entries) |
| E | Batch eviction of many entries | Low (same as above) |
| F | Hash table expanding or shrinking | Low (same as above) |

The above benchmark exercises all `flush` cases except B and D. Benchmark 1-a runs case A more often than 1-b because of its low hit ratio, which is why 1-a had a larger overhead than 1-b.
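
To make the difference between the "High" and "Low" rows concrete, here is a minimal sketch that only counts flush calls. `FlushCounter` is a hypothetical stand-in for calls to `Guard::flush`, not Moka's or crossbeam-epoch's code.

```rust
/// Hypothetical counter standing in for calls to `Guard::flush`.
#[derive(Default)]
struct FlushCounter {
    calls: usize,
}

impl FlushCounter {
    fn flush(&mut self) {
        self.calls += 1;
    }
}

/// Cases A to C: every write on a single key is followed by its own flush.
fn flushes_for_single_key_writes(entries: usize) -> usize {
    let mut counter = FlushCounter::default();
    for _ in 0..entries {
        // ... replace or remove one entry, deferring the drop of its value ...
        counter.flush();
    }
    counter.calls
}

/// Cases D to F: many entries are processed, then one flush covers them all.
fn flushes_for_batch(entries: usize) -> usize {
    let mut counter = FlushCounter::default();
    for _ in 0..entries {
        // ... process one entry, deferring the drop of its value ...
    }
    counter.flush();
    counter.calls
}
```

For 1,000 entries this gives 1,000 flushes in the per-key cases and a single flush in the batch cases, which matches the per entry overhead column in the table.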

@tatsuya6502 tatsuya6502 marked this pull request as ready for review July 24, 2022 07:17
@tatsuya6502 tatsuya6502 left a comment

For some reason, this PR did not pick up one last commit: b2c244d

Merging anyway.
