Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce custom hash table data structures. #3940

Open
wants to merge 4 commits into
base: trunk
Choose a base branch
from

Conversation

chandlerc
Copy link
Contributor

The hash table design is heavily based on Abseil's "Swiss Tables" design. It uses an array of bytes storing metadata about each entry and an array of entries where each is a pair of key and value. The metadata byte consists of 7-bits of hash of the key (distinct from the bits used to index the table), and one bit indicating the presence of a special entry -- either empty or deleted.

TODO: document the design and PR more fully

@chandlerc chandlerc requested a review from josh11b May 7, 2024 10:21
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to give feedback on map.h early, before you update set.h

common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/hashing.h Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, definitely useful to get map.h tidied up before I port it to set.h. I think I've responded to all the comments there.

common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/hashing.h Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending what comments I have so far.

common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the feedback so far, PTAL.

common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/map.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished a pass through raw_hashtable.h

common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL!

Beyond inline replies, also was able to simplify the specialization strategy for the table type and improved some other comments spotted as I was working.

Also, think this is far enough along in review I'm moving it out of draft.

common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL!

Beyond inline replies, also was able to simplify the specialization strategy for the table type and improved some other comments spotted as I was working.

Also, think this is far enough along in review I'm moving it out of draft.

@chandlerc chandlerc marked this pull request as ready for review May 24, 2024 01:42
@chandlerc chandlerc requested a review from josh11b May 24, 2024 01:42
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
The hash table design is heavily based on Abseil's ["Swiss
Tables"][swiss-tables] design. It uses an array of bytes storing
metadata about each entry and an array of entries where each is a pair
of key and value. The metadata byte consists of 7-bits of hash of the
key (distinct from the bits used to index the table), and one bit
indicating the presence of a special entry -- either empty or deleted.

[swiss-tables]: https://abseil.io/about/design/swisstables

See the comments in `raw_hashtable.h` for a detailed overview of the
design.

Co-authored-by: josh11b <15258583+josh11b@users.noreply.github.com>
@chandlerc
Copy link
Contributor Author

So, I'm most of the way through adding really nice support for handling stateful keys like indices into vectors and such. It turned out to be necessary to even move the toolchain over to these data structures as it also lets us use something other than operator== for equality, and we need that for APFloat keys. There are a few interesting things that emerged:

  • I've ended up removing unnecessary functionality that also makes the stateful case unimplementable.

  • I've found and cleaned up a number of innocuous inconsistencies in the code.

  • I've been able to implement the test you asked for where we control the hashing and force every key to collide. I can implement even more of these if desired.

Wanted to both let you know @josh11b about my progress there, and also ask whether you'd like me to merge this into this code review or keep it as a follow-up. I can easily manage either way.

@josh11b
Copy link
Contributor

josh11b commented May 28, 2024

So, I'm most of the way through adding really nice support for handling stateful keys like indices into vectors and such. It turned out to be necessary to even move the toolchain over to these data structures as it also lets us use something other than operator== for equality, and we need that for APFloat keys. There are a few interesting things that emerged:

  • I've ended up removing unnecessary functionality that also makes the stateful case unimplementable.
  • I've found and cleaned up a number of innocuous inconsistencies in the code.
  • I've been able to implement the test you asked for where we control the hashing and force every key to collide. I can implement even more of these if desired.

Wanted to both let you know @josh11b about my progress there, and also ask whether you'd like me to merge this into this code review or keep it as a follow-up. I can easily manage either way.

I'm fine either way. If it simplifies the files haven't reviewed yet, it would be a benefit.

This allows customizing both the hash and equality comparison. It also
does so in a way that enables stateful key contexts such as indices into
a specific vector. This works by requiring the context to be packaged on
every API accepting a key. When the context is stateless and can be
default constructed, this works via a default argument. When it is
stateful, the argument must be provided explicitly.

This allows us to add a stress test that forces colliding hashes in
egregious ways, as well as allowing the staeful hashing. There are also
a number of basic cleanups to code that were uncovered while working on
this.
@chandlerc
Copy link
Contributor Author

So, I'm most of the way through adding really nice support for handling stateful keys like indices into vectors and such. It turned out to be necessary to even move the toolchain over to these data structures as it also lets us use something other than operator== for equality, and we need that for APFloat keys. There are a few interesting things that emerged:

  • I've ended up removing unnecessary functionality that also makes the stateful case unimplementable.
  • I've found and cleaned up a number of innocuous inconsistencies in the code.
  • I've been able to implement the test you asked for where we control the hashing and force every key to collide. I can implement even more of these if desired.

Wanted to both let you know @josh11b about my progress there, and also ask whether you'd like me to merge this into this code review or keep it as a follow-up. I can easily manage either way.

I'm fine either way. If it simplifies the files haven't reviewed yet, it would be a benefit.

After discussion, it seems a bit borderline with pros and cons, but on the whole probably simplifies the review. Added and pushed. I did make it a clean follow-up commit so that it is easy to see the diff: 807de9e (#3940)

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
// trailing zero bits, AArch64 only supports counting the leading zero
// bits and requires a bit-reverse to count the trailing zero bits. Doing
// the byte-reverse approach essentially combines moving the high bit into
// the low bit and the reverse necessary for counting the zero bits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say how valuable this optimization is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some context on why it is important. Let me know if you want me to try to repeat the measurements to compute the exact % difference I can see on the M1.

common/raw_hashtable_metadata_group.h Show resolved Hide resolved
Comment on lines 212 to 215
template <int N>
auto Test() const -> bool {
return mask_ & (static_cast<MaskT>(1) << N);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this function for? Seems like it should do something different when ByteEncoding is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was stale code from a past experiment, sorry. Removed.

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable.h Outdated Show resolved Hide resolved
EntryT* local_entries = this->entries();
const uint8_t* local_arg_metadata = arg.metadata();
const EntryT* local_arg_entries = arg.entries();
memcpy(local_metadata, local_arg_metadata, local_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say that we are preserving which slot each entry is in, along with all the tombstones, instead of rehashing just the present values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 419 to 428
KeyContextT key_context = KeyContextT())
-> std::pair<EntryT*, bool>;

// Looks up the entry in the hashtable, and if found destroys the entry and
// returns `true`. If not found, returns `false`.
//
// Does not release any memory, just leaves a tombstone behind so this entry
// cannot be found and the slot can in theory be re-used.
template <typename LookupKeyT>
auto EraseImpl(LookupKeyT lookup_key, KeyContextT key_context = KeyContextT())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason the raw versions of these operations should have default values? Seems like that would just invited accidents where the key context isn't passed through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, I forgot to remove them. I had them initially to test in phases as I introduced things, and meant to go back. Thanks for catching this.

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
Comment on lines 342 to 343
auto MatchEmpty() const -> MatchIndex;
auto MatchDeleted() const -> MatchIndex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these return MatchRange too, like Match() and MatchPresent()? Or do they only return the first match? If the latter case, please add comments to document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed a bit, no, this is an intentionally weaker API. It opens some doors to different implementation strategies. I've tried to add comments here (and up in BitIndex and BitIndexRange) to help clarify what's going on here.

common/raw_hashtable_metadata_group.h Show resolved Hide resolved
Comment on lines 395 to 396
// They directly verify their results to help establish baseline
// functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't that only be in debug mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, commented.

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
static auto CompareEqual(MetadataGroup lhs, MetadataGroup rhs) -> bool;

// Directly verify the provided masks that will be used to build a
// `MatchIndex` or `MatchGroup`. These either `CHECK`-fail or return true.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain what these functions are verifying and/or what the parameters are?

MatchMask index_mask,
llvm::function_ref<auto(uint8_t byte)->bool> byte_match) const -> bool {
for (ssize_t byte_index : llvm::seq<ssize_t>(0, Size)) {
if constexpr (Size > 8) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Size > 8 significant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's where we switch from byte-encoding of the index to bit-encoding.

return UseSIMD ? simd_result : portable_result;
}

inline auto MetadataGroup::VerifyIndexMask(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need to review the Verify* functions again once I know what they are supposed to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You suggested comments above are correct.


inline auto MetadataGroup::PortableClearDeleted() -> void {
for (uint64_t& byte_int : byte_ints) {
byte_int &= (~LSBs | byte_int >> 7);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic enough to merit an explanation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It took a long explanation.


auto ClearDeleted() -> void;

auto Match(uint8_t present_byte) const -> MatchRange;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the high bit be set in present_byte?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, added a detailed comment to help there. (We check this in the implementation.)

Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished a pass on raw_hashtable_metadata_group.h

// present matches which only require testing 7 bits and have a particular
// layout.

// Set the high bit of every byte to `1`. The match byte always has this bit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "match" byte. Do you mean each byte of "pattern"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote the comment a bunch to clarify.

// MSBs of each byte. We structure this as a multiply and an add because we
// know that the add cannot carry, and this way it can be lowered using
// combined multiply-add instructions if available.
uint64_t pattern = LSBs * present_byte + MSBs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be called the "query"? or "broadcast_query"? or "query_repeated"? I'm not sure why this is called "pattern".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 it's the pattern we use for the parallel search.

I can rename though... query seems a bit wrong. broadcast ok?

// Set the high bit of every byte to `1`. The match byte always has this bit
// set as well, which ensures the xor below, in addition to zeroing the byte
// that matches, also clears the high bit of every byte.
uint64_t mask = byte_ints[0] | MSBs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what sense is this a "mask"? I would associate that term with something used to & with something else. It looks like this ultimately becomes "matches", so perhaps that would be a better name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I have a slightly broader definition of "mask" as more "a collection of bits with significant positions, rather than a numeric quantity or other 'integer'".

The reason for this being called "mask" is because it eventually becomes the mask input to the MatchRange. I'm not sure it ever becomes "matches"?

I can call this match_mask I guess, but it didn't seem like adding that prefix was very helpful in the context of this large function.

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
Comment on lines 836 to 838
auto match_byte_vec = vdup_n_u8(present_byte | PresentMask);
auto match_byte_cmp_vec = vceq_u8(byte_vec, match_byte_vec);
uint64_t mask = vreinterpret_u64_u8(match_byte_cmp_vec)[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment what these do for people who don't know all the SIMD functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather avoid duplicating intrinsic documentation here. I can try to add some links to the documentation, although I worry about them becoming stale.

auto match_byte_vec = vdup_n_u8(present_byte | PresentMask);
auto match_byte_cmp_vec = vceq_u8(byte_vec, match_byte_vec);
uint64_t mask = vreinterpret_u64_u8(match_byte_cmp_vec)[0];
// Mask in scalar to fold with testing for zero.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand what you meant by this, other than there is some advantage to switching to scalar operations over SIMD for this step.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrote more details.

#elif CARBON_X86_SIMD_SUPPORT
// We arrange the byte vector for present bytes so that we can directly
// extract it as a mask.
result = MatchRange(_mm_movemask_epi8(byte_vec));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this operation does, and so why you don't need to & MSBs like the other code paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This produces a bit-encoded mask instead of a byte-encoded mask.

This is also probably the source of me calling the backing integer storage for BitIndex and BitIndexRange a mask of some kind -- that's how x86's SIMD instructions refer to them.

// Mask in scalar to fold with testing for zero.
result = MatchRange(mask & MSBs);
#elif CARBON_X86_SIMD_SUPPORT
result = X86SIMDMatch(present_byte | PresentMask);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this is a separate function rather than inlined here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it has subtly different semantics from what Match needs. I wanted to keep SIMDMatch and PortableMatch exactly the same contract as Match. But for PortableMatch, we specifically don't want to allow Empty or Deleted. But on X86, we have a common implementation we want to use for all three cases.

return result;
}

inline auto MetadataGroup::SIMDMatchEmpty() const -> MatchIndex {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add and update comments on these last functions too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've propagated some, but not sure if I got everything you were looking for here.

// An index encoded as low zero bits ending in (at least) one set high bit. The
// index can be extracted by counting the low zero bits. It's presence can be
// tested directly however by checking for any zero bits. The underlying type to
// be used is provided as `MaskT` which must be an unsigned integer type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think moving away from "Mask" terminology for this type and these values would be clearer. I think these are much more like "match (bit) sets."

Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect many of the comments about set.h also apply to map.h

auto Lookup(LookupKeyT lookup_key,
KeyContextT key_context = KeyContextT()) const -> LookupResult;

// Run the provided callback for every key in the set.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the signature of this callback? Less critical here than for MapView.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a constraint that checks the signature -- is that enough? It's not the most readable, but seems a bit more semantically unambiguous.

auto operator[](LookupKeyT lookup_key) const
-> ValueT* requires(std::default_initializable<KeyContextT>);

// Run the provided callback for every key and value in the map.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write the signature for the callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a constraint that checks the signature -- is that enough? It's not the most readable, but seems a bit more semantically unambiguous.

common/set.h Outdated
template <typename CallbackT>
void ForEach(CallbackT callback);

// Count the probed keys. This routine is purely informational and for use in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Count the probed keys" doesn't provide a lot of value beyond the name of the function "CountProbedKeys". It would be more valuable to say this function is slow or to give additional clarity about what is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to improve the comment.

// by using `MapView<const T, const V>`, and we enable conversions to more-const
// views. This mirrors the semantics of views like `std::span`.
template <typename InputKeyT, typename InputValueT,
typename InputKeyContextT = DefaultKeyContext>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should document this in this file and in set.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, done!

common/set.h Outdated Show resolved Hide resolved
common/set.h Outdated
// This data structure optimizes heavily for small key types that are cheap to
// move and even copy. Using types with large keys or expensive to copy keys may
// create surprising performance bottlenecks. A `std::string` key should be fine
// with largely small strings, but if some or many strings are large heap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"largely" seems strange to describe something small

Suggested change
// with largely small strings, but if some or many strings are large heap
// with generally small strings, but if some or many strings are large heap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read through set_test.cpp and started on map_test.cpp

common/set_test.cpp Outdated Show resolved Hide resolved
Comment on lines +44 to +52
auto MakeElements(RangeT&& range, RangeTs&&... ranges) {
std::vector<typename RangeT::value_type> elements;
auto add_range = [&elements](RangeT&& r) {
for (const auto&& e : r) {
elements.push_back(e);
}
};
add_range(std::forward<RangeT>(range));
(add_range(std::forward<RangeT>(ranges)), ...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like your C++ variadics code has been influenced by Carbon's variadics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=]

ExpectSetElementsAre(other_s2, MakeElements(llvm::seq(1, 32)));
}

TYPED_TEST(SetTest, Move) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test done? I don't see anything testing move.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, forgot to finish it. Done now.


static constexpr uint8_t PresentMask = 0b1000'0000;

static constexpr bool FastByteClear = Size == 8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write a comment to explain the situation with what makes this case fast.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, this is a weird case. Wrote it up. Hopefully it makes some sense, but may also need some fine tuning.

static auto Load(const uint8_t* metadata, ssize_t index) -> MetadataGroup;
auto Store(uint8_t* metadata, ssize_t index) const -> void;

auto ClearByte(ssize_t byte_index) -> void;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain the requirement that FastByteClear is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now looked at everything except for files with names containing "benchmark"

common/map_test.cpp Show resolved Hide resolved
Comment on lines 217 to 220
EXPECT_TRUE(mv.Contains(1));
EXPECT_TRUE(cmv.Contains(2));
EXPECT_TRUE(cmv2.Contains(3));
EXPECT_TRUE(cmv3.Contains(4));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also validate that these keys have the expected values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

EXPECT_FALSE(m.Update(2, 201).is_inserted());
EXPECT_FALSE(m.Update(4, 401).is_inserted());

// Now fill up the first control group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean? Is "control group" defined anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment -- the original term for the metadata group was "control group", dating back to the original SwissTable design. Updated the comment.

Comment on lines +472 to +475
// Fill the map through a couple of growth steps, verifying at each step. Note
// that because this is a collision test, we synthesize actively harmful
// hashes in terms of collisions and so this test is essentially quadratic. We
// need to keep it relatively small.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested to see this test do some erasing and reinserting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done. I've only done this for Map because its a lot of code and doesn't seem to add value duplicated. We can also test everything a bit more nicely in map because we can track updates to keys with new values.

Comment on lines 482 to 485
for (int j : llvm::seq(1, i)) {
SCOPED_TRACE(llvm::formatv("Assert key: {0}", j).str());
ASSERT_EQ(j * 100, *m[j]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this instead of ExpectMapElementsAre() like you do below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, this is code copied around a bit, and I think originally predated the helper. Sure, switched to just use the helper here.

Map<ssize_t, int, 0, IndexKeyContext<TestData>> m;

EXPECT_FALSE(m.Contains(42, key_context));
EXPECT_TRUE(m.Insert(1, 100, key_context).is_inserted());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using * 100 for both the key and value here -- perhaps choose a different value or different value for keys?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let's go with 100000 for keys. Mostly want them to be easy to seee.

Co-authored-by: josh11b <15258583+josh11b@users.noreply.github.com>
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
common/raw_hashtable_metadata_group.h Outdated Show resolved Hide resolved
Copy link
Contributor Author

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've responded to all of the comments. Tried to make all the improvements I could based on them, and huge thanks for all the suggested comments and help crafting better documentation throughout!

auto operator*() -> ssize_t& {
CARBON_DCHECK(mask_ != 0) << "Cannot get an index from a zero mask!";
__builtin_assume(mask_ != 0);
index_ = BitIndexT(mask_).index();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as discussed) we return it on the next line, and we have to return a reference as this is an iterator, so we need to store the index in this iterator to provide the target of that reference. I'll add a comment.

ExpectSetElementsAre(other_s2, MakeElements(llvm::seq(1, 32)));
}

TYPED_TEST(SetTest, Move) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, forgot to finish it. Done now.

Comment on lines +44 to +52
auto MakeElements(RangeT&& range, RangeTs&&... ranges) {
std::vector<typename RangeT::value_type> elements;
auto add_range = [&elements](RangeT&& r) {
for (const auto&& e : r) {
elements.push_back(e);
}
};
add_range(std::forward<RangeT>(range));
(add_range(std::forward<RangeT>(ranges)), ...);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=]

auto Lookup(LookupKeyT lookup_key,
KeyContextT key_context = KeyContextT()) const -> LookupResult;

// Run the provided callback for every key in the set.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a constraint that checks the signature -- is that enough? It's not the most readable, but seems a bit more semantically unambiguous.

// by using `MapView<const T, const V>`, and we enable conversions to more-const
// views. This mirrors the semantics of views like `std::span`.
template <typename InputKeyT, typename InputValueT,
typename InputKeyContextT = DefaultKeyContext>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, done!

// Mask in scalar to fold with testing for zero.
result = MatchRange(mask & MSBs);
#elif CARBON_X86_SIMD_SUPPORT
result = X86SIMDMatch(present_byte | PresentMask);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it has subtly different semantics from what Match needs. I wanted to keep SIMDMatch and PortableMatch exactly the same contract as Match. But for PortableMatch, we specifically don't want to allow Empty or Deleted. But on X86, we have a common implementation we want to use for all three cases.

#elif CARBON_X86_SIMD_SUPPORT
// We arrange the byte vector for present bytes so that we can directly
// extract it as a mask.
result = MatchRange(_mm_movemask_epi8(byte_vec));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This produces a bit-encoded mask instead of a byte-encoded mask.

This is also probably the source of me calling the backing integer storage for BitIndex and BitIndexRange a mask of some kind -- that's how x86's SIMD instructions refer to them.

return result;
}

inline auto MetadataGroup::SIMDMatchEmpty() const -> MatchIndex {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've propagated some, but not sure if I got everything you were looking for here.

common/set.h Outdated
template <typename CallbackT>
void ForEach(CallbackT callback);

// Count the probed keys. This routine is purely informational and for use in
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to improve the comment.

common/set.h Outdated
// This data structure optimizes heavily for small key types that are cheap to
// move and even copy. Using types with large keys or expensive to copy keys may
// create surprising performance bottlenecks. A `std::string` key should be fine
// with largely small strings, but if some or many strings are large heap
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants