Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HyperClockCache Rollback bug in #10801 #10843

Closed
wants to merge 3 commits into from

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Oct 21, 2022

Summary: In #10801 in ClockHandleTable::Evict, we saved a reference to the hash value (const UniqueId64x2& hashed_key) instead of saving the hash value itself before marking the handle as empty and thus free for use by other threads. This could lead to Rollback seeing the wrong hash value for updating the displacements after an entry is removed.

The fix is (like other places) to copy the hash value before it's released. (We could Rollback while we own the entry, but that creates more dependences between atomic updates, because in that case, based on the code, the Rollback writes would have to happen before or after the entry is released by marking empty. By doing the relaxed Rollback after marking empty, there's more opportunity for re-ordering / ILP.)

Intended follow-up: refactoring for better code sharing in clock_cache.cc

Test Plan: watch for clean crash test, TSAN

Summary: There were some cases where we saved a reference to the hash
value instead of saving the hash value itself. This fixes by generally
doing Rollback as soon as an entry is no longer visible to Lookups,
so we don't need to copy the value for later passing to Rollback.

Test Plan: watch for clean crash test, TSAN
@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@siying
Copy link
Contributor

siying commented Oct 21, 2022

I don't think I'm knowledge enough to review it, without reading all the code. @anand1976 are you the person who is most familiar with the code except @pdillinger ?

@siying
Copy link
Contributor

siying commented Oct 21, 2022

Summary: In Evict, we saved a reference to the hash value instead of saving the hash value itself. This fixes by generally doing Rollback as soon as an entry is no longer visible to Lookups, so we don't need to copy the value for later passing to Rollback.

Can you explain "we saved a reference to the hash value instead of saving the hash value itself"? Do you mean std::array<uint64_t, 2> is actually a reference?

@siying
Copy link
Contributor

siying commented Oct 21, 2022

Oh "const UniqueId64x2&" is used in one of the places. Still didn't get why it would cause a data race. Can you explain more in the summary?

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@anand1976 anand1976 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Oct 26, 2022
Summary: For clean-up and in preparation for some other anticipated
changes, including
* A new dynamically-scaling variant of HyperClockCache
* SecondaryCache support for HyperClockCache

This change does some refactoring for current and future code sharing
and reusability. (Including follow-up on facebook#10843)

* TBD whether new variant will be a HyperClockCache or use some other
name, so namespace is just clock_cache for the family of structures.
* A number of helper functions introduced and used.
* Pre-emptively split ClockHandle (shared among lock-free clock cache
variants) and HandleImpl (specific to a kind of Table), and introduce
template to plug new Table implementation into ClockCacheShard.

* Mostly using helper functions. Some things like `Rollback()` and
`FreeDataMarkEmpty()` were not combined because `Rollback()` is
Table-specific while `FreeDataMarkEmpty()` can be used with different
table implementations.
* Performance testing indicated that despite more opportunities for
parallelism, making a local copy of handle data for processing after
marking an entry empty was slower than doing that processing before
marking the entry empty (but after marking it "under construction"),
thus avoiding a few words of copying data. At least for now, this
answers the "TODO? Delay freeing?" questions (no).

Test Plan: minor test updates in refactoring; no functionality change

Same setup as facebook#10801:

Before: `readrandom [AVG 81 runs] : 627992 (± 5124) ops/sec`
After: `readrandom [AVG 81 runs] : 637512 (± 4866) ops/sec`

I've been getting some inconsistent results on restarts like the system
is not being fair to the two processes, so I'm not sure there's such a
real difference.
facebook-github-bot pushed a commit that referenced this pull request Nov 3, 2022
Summary:
For clean-up and in preparation for some other anticipated changes, including
* A new dynamically-scaling variant of HyperClockCache
* SecondaryCache support for HyperClockCache

This change does some refactoring for current and future code sharing and reusability. (Including follow-up on #10843)

## clock_cache.h
* TBD whether new variant will be a HyperClockCache or use some other name, so namespace is just clock_cache for the family of structures.
* A number of helper functions introduced and used.
* Pre-emptively split ClockHandle (shared among lock-free clock cache variants) and HandleImpl (specific to a kind of Table), and introduce template to plug new Table implementation into ClockCacheShard.

## clock_cache.cc
* Mostly using helper functions. Some things like `Rollback()` and `FreeDataMarkEmpty()` were not combined because `Rollback()` is Table-specific while `FreeDataMarkEmpty()` can be used with different table implementations.
* Performance testing indicated that despite more opportunities for parallelism, making a local copy of handle data for processing after marking an entry empty was slower than doing that processing before marking the entry empty (but after marking it "under construction"), thus avoiding a few words of copying data. At least for now, this answers the "TODO? Delay freeing?" questions (no).

Pull Request resolved: #10887

Test Plan:
fixed a unit testing gap; other minor test updates for refactoring

No functionality change

## Performance
Same setup as #10801:

Before: `readrandom [AVG 81 runs] : 627992 (± 5124) ops/sec`
After: `readrandom [AVG 81 runs] : 637512 (± 4866) ops/sec`

I've been getting some inconsistent results on restarts like the system is not being fair to the two processes, so I'm not sure there's such a real difference.

Reviewed By: anand1976

Differential Revision: D40959240

Pulled By: pdillinger

fbshipit-source-id: 0a8f3646b3bdb5bc7aaad60b26790b0779189949
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants