Skip to content

Continue migration of HCC impl to BitFields#14027

Closed
pdillinger wants to merge 7 commits intofacebook:mainfrom
pdillinger:hcc_bitfields2
Closed

Continue migration of HCC impl to BitFields#14027
pdillinger wants to merge 7 commits intofacebook:mainfrom
pdillinger:hcc_bitfields2

Conversation

@pdillinger
Copy link
Contributor

Summary: Continuing work from #13965. Here I'm migrating the "next with shift" kind of bit field and for that I've added an API for atomic additive transformations that can be combined into a single atomic update for multiple fields. (I implemented more features than needed, just in case they are needed someday and to demonstrate what is possible.)

Test Plan: BitFields unit test updated/added, existing HCC tests

Summary: Continuing work from facebook#13965. Here I'm migrating the "next with
shift" kind of bit field and for that I've added an API for atomic
additive transformations that can be combined into a single atomic
update for multiple fields. (I implemented more features than needed,
just in case they are needed and to demonstrate what is possible.)

Test Plan: BitFields unit test updated/added, existing HCC tests
@pdillinger pdillinger requested a review from xingbowang October 4, 2025 03:35
@meta-cla meta-cla bot added the CLA Signed label Oct 4, 2025
@meta-codesync
Copy link

meta-codesync bot commented Oct 4, 2025

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D83895094.

Copy link
Contributor

@xingbowang xingbowang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks great. I think the refactor made the code much more readable. Thank you for improving this.

Comment on lines +821 to +826
bool IsLockedNotEnd() const {
// TODO: can the compiler optimize the naive version?
constexpr U kEndFlag = U{1} << EndFlag::kBitOffset;
constexpr U kLockedFlag = U{1} << LockedFlag::kBitOffset;
return (underlying & (kEndFlag | kLockedFlag)) == kLockedFlag;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this same to
return IsLocked() && !IsEnd();
If so, it might be more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang is able to optimize the more readable version but GCC is not

Comment on lines +168 to +170
// Assert no under/overflow (unless the field is at the top bits of the
// representation in U, which is allowed because it doesn't lead to
// leakage into other fields)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little concerned about the special case handling for the top bits allowed overflow. Is there a way to separate them? E.g. one explicitly disallow all over/under flow. The other allow over/under flow with assertion of bit fields at top or bottom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No special handling is needed in my code here to allow under/overflow on a field including the top-most bit. Because this library is explicitly for use in cases attempting extreme performance optimization, I prefer to keep as much optimization potential open to the user, at least when it keeps my implementation simpler. Users should be accustomed to the semantic of integer fields quietly dropping overflow (C/C++), so this is arguably more standard than fields for which over/underflow is forbidden.

Comment on lines +412 to +423
void Apply(AddTransform<BitFieldsT> transform, BitFieldsT* before = nullptr,
BitFieldsT* after = nullptr) {
U before_val =
Base::v_.fetch_add(transform.to_add, std::memory_order_acq_rel);
transform.AssertPreconditions(before_val);
if (before) {
before->underlying = before_val;
}
if (after) {
after->underlying = before_val + transform.to_add;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only difference between this and ApplyRelaxed is the memory order, maybe dedup it with mem order as argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Comment on lines +264 to +266
// that adding the value will not overflow the field (though that is
// technically allowed if this field extends to the highest bits of the
// representation). Can be combined with other additive transforms for other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it does overflow, what will happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will corrupt another field (or fields), assuming there are additional fields in higher bits.

Comment on lines +587 to +588
auto transform3 = Field1::PlusTransformPromiseNoOverflow(10000U) +
Field4::MinusTransformPromiseNoUnderflow(3U);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test to assert death when overflow/underflow do get triggered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Nov 27, 2025
Summary: This change builds on facebook#14027 and facebook#13965 to complete migration
of the HyperClockCache implementation to using the hygienic BitFields API.
No semantic change in the implementation details is intended, just
greatly improving readability and safety of the code while maintaining
the same performance.

In more detail,
* Refactor the main metadata atomic for each slot in an HCC table into
SlotMeta using BitFields.
* Extended BitFields APIs with some additional features, and renamed
  BlahTransform classes to BlahTransformer to resolve potential naming
  conflicts with member functions to create them.

Test Plan: for correctness, mostly existing tests. but also added tests
for new BitFields features. I especially ran local TSAN whitebox crash
test extensively which caught a couple of refactoring errors.

For performance, I verified with release builds of cache_bench, using
default options, that there was no noticeable/consistent difference
after all these HCC migrations vs. backing them out. That test was with
GCC 11 and -O2, which is a reasonable baseline for expected compiler
optimizations.
@meta-codesync
Copy link

meta-codesync bot commented Dec 1, 2025

@pdillinger merged this pull request in 4951494.

pdillinger added a commit to pdillinger/rocksdb that referenced this pull request Dec 2, 2025
Summary: This change builds on facebook#14027 and facebook#13965 to complete migration
of the HyperClockCache implementation to using the hygienic BitFields API.
No semantic change in the implementation details is intended, just
greatly improving readability and safety of the code while maintaining
the same performance.

In more detail,
* Refactor the main metadata atomic for each slot in an HCC table into
SlotMeta using BitFields.
* Extended BitFields APIs with some additional features, and renamed
  BlahTransform classes to BlahTransformer to resolve potential naming
  conflicts with member functions to create them.

Test Plan: for correctness, mostly existing tests. but also added tests
for new BitFields features. I especially ran local TSAN whitebox crash
test extensively which caught a couple of refactoring errors.

For performance, I verified with release builds of cache_bench, using
default options, that there was no noticeable/consistent difference
after all these HCC migrations vs. backing them out. That test was with
GCC 11 and -O2, which is a reasonable baseline for expected compiler
optimizations.
meta-codesync bot pushed a commit that referenced this pull request Dec 30, 2025
Summary:
This change builds on #14027 and #13965 to complete migration
of the HyperClockCache implementation to using the hygienic BitFields API.
No semantic change in the implementation details is intended, just
greatly improving readability and safety of the code while maintaining
the same performance.

In more detail,
* Refactor the main metadata atomic for each slot in an HCC table into
SlotMeta using BitFields.
* Extended BitFields APIs with some additional features, and renamed
  BlahTransform classes to BlahTransformer to resolve potential naming
  conflicts with member functions to create them.

Pull Request resolved: #14154

Test Plan:
for correctness, mostly existing tests. but also added tests
for new BitFields features. I especially ran local TSAN whitebox crash
test extensively which caught a couple of refactoring errors.

For performance, I verified with release builds of cache_bench, using
default options, that there was no noticeable/consistent difference
after all these HCC migrations vs. backing them out. That test was with
GCC 11 and -O2, which is a reasonable baseline for expected compiler
optimizations.

Reviewed By: xingbowang

Differential Revision: D87960540

Pulled By: pdillinger

fbshipit-source-id: e0257b7fea8a5c7709daef18911959201ce4e0f3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants