-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
container: Introduce chunked_hash_map #17182
Conversation
Naming: highly recommend |
46fd9b4
to
7b1794d
Compare
|
Failure seems to be a dupe of: #17205 |
this is neat! |
Given the intent is to use as widely as possible maybe the name of “fmap” is better. For commonly used classes, small tokens is better. |
7b1794d
to
f36fba5
Compare
While I generally agree with this idea (e.g.: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, a question and FYI the cover letter needs updating from the new name
f36fba5
to
637d715
Compare
637d715
to
de57e0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, I think we need to fix the oss build
cmake/dependencies.cmake
Outdated
fetch_dep(unordered_dense | ||
REPO https://github.com/redpanda-data/unordered_dense | ||
TAG 9338f301522a965309ecec58ce61f54a52fb5c22 | ||
SOURCE_DIR redpanda_build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this directory doesn't exist in the repo copy paste error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you need to add this to MakeAvailable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah brilliant, yeah I was still experimenting with this (== guessing).
de57e0e
to
660c5ad
Compare
660c5ad
to
7ac5c05
Compare
7ac5c05
to
88a5f5c
Compare
@@ -1,7 +1,7 @@ | |||
find_package(Roaring REQUIRED) | |||
v_cc_library( | |||
NAME container | |||
DEPS Roaring::roaring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FAILED: lib/libv_v_io.so
: && /vectorized/llvm/bin/clang++ -fPIC -fPIC -march=westmere -fuse-ld=lld -Wno-unused-lambda-capture -Wno-unused-command-line-argument -stdlib=libc++ -fsanitize=undefined -fsanitize=address -fstack-protector-strong -fno-plt -fstack-clash-protection -O0 -g -gdwarf-4 -gsplit-dwarf -Wl,--build-id=sha1 -L/vectorized/lib -L/vectorized/lib64 -stdlib=libc++ -fsanitize=undefined -fsanitize=address -Wl,-z,relro,-z,now -shared -Wl,-soname,libv_v_io.so -o lib/libv_v_io.so src/v/io/CMakeFiles/v_io.dir/persistence.cc.o src/v/io/CMakeFiles/v_io.dir/page.cc.o src/v/io/CMakeFiles/v_io.dir/page_set.cc.o src/v/io/CMakeFiles/v_io.dir/clang-tidy-helper.cc.o src/v/io/CMakeFiles/v_io.dir/logger.cc.o src/v/io/CMakeFiles/v_io.dir/io_queue.cc.o src/v/io/CMakeFiles/v_io.dir/scheduler.cc.o -Wl,-rpath,/vectorized/lib64 /vectorized/lib64/libseastar.so /vectorized/lib64/libboost_program_options.so.1.82.0 /vectorized/lib64/libboost_thread.so.1.82.0 /vectorized/lib64/libboost_atomic.so.1.82.0 /vectorized/lib64/libboost_chrono.so.1.82.0 /vectorized/lib64/libboost_date_time.so.1.82.0 /vectorized/lib64/libboost_container.so.1.82.0 /vectorized/lib64/libboost_exception.a /vectorized/lib64/libcares.so /vectorized/lib64/libcryptopp.so /vectorized/lib64/libfmtd.so.8.1.1 -Wl,--as-needed /vectorized/lib64/liblz4.so -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr /vectorized/lib64/libabsl_cord.so.2308.0.0 /vectorized/lib64/libabsl_cordz_info.so.2308.0.0 /vectorized/lib64/libabsl_cord_internal.so.2308.0.0 /vectorized/lib64/libabsl_cordz_functions.so.2308.0.0 /vectorized/lib64/libabsl_cordz_handle.so.2308.0.0 /vectorized/lib64/libabsl_crc_cord_state.so.2308.0.0 /vectorized/lib64/libabsl_crc32c.so.2308.0.0 /vectorized/lib64/libabsl_crc_internal.so.2308.0.0 /vectorized/lib64/libabsl_crc_cpu_detect.so.2308.0.0 /vectorized/lib64/libabsl_str_format_internal.so.2308.0.0 /vectorized/lib64/libabsl_raw_hash_set.so.2308.0.0 /vectorized/lib64/libabsl_hash.so.2308.0.0 /vectorized/lib64/libabsl_bad_optional_access.so.2308.0.0 /vectorized/lib64/libabsl_city.so.2308.0.0 /vectorized/lib64/libabsl_bad_variant_access.so.2308.0.0 /vectorized/lib64/libabsl_low_level_hash.so.2308.0.0 /vectorized/lib64/libabsl_hashtablez_sampler.so.2308.0.0 /vectorized/lib64/libabsl_exponential_biased.so.2308.0.0 /vectorized/lib64/libabsl_synchronization.so.2308.0.0 /vectorized/lib64/libabsl_stacktrace.so.2308.0.0 /vectorized/lib64/libabsl_graphcycles_internal.so.2308.0.0 /vectorized/lib64/libabsl_kernel_timeout_internal.so.2308.0.0 /vectorized/lib64/libabsl_time.so.2308.0.0 /vectorized/lib64/libabsl_civil_time.so.2308.0.0 /vectorized/lib64/libabsl_time_zone.so.2308.0.0 /vectorized/lib64/libabsl_symbolize.so.2308.0.0 /vectorized/lib64/libabsl_strings.so.2308.0.0 /vectorized/lib64/libabsl_string_view.so.2308.0.0 /vectorized/lib64/libabsl_throw_delegate.so.2308.0.0 /vectorized/lib64/libabsl_strings_internal.so.2308.0.0 /vectorized/lib64/libabsl_int128.so.2308.0.0 /vectorized/lib64/libabsl_debugging_internal.so.2308.0.0 /vectorized/lib64/libabsl_malloc_internal.so.2308.0.0 /vectorized/lib64/libabsl_demangle_internal.so.2308.0.0 /vectorized/lib64/libabsl_base.so.2308.0.0 /vectorized/lib64/libabsl_raw_logging_internal.so.2308.0.0 /vectorized/lib64/libabsl_log_severity.so.2308.0.0 /vectorized/lib64/libabsl_spinlock_wait.so.2308.0.0 /vectorized/lib64/libroaring.so -lunordered_dense && :
ld.lld: error: unable to find library -lunordered_dense
internal build is failing because cluster
doesn't depend on v::container
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it try to link against unordered_dense though, it's a header-only library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh because you put it in deps, it will try and link against it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe trying adding unordered_dense::unordered_dense
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that will only work for the OSS build. There is no cmake target for this in the superbuild so cmake tries to link against it, I think thats why the :: alias pattern emerged. I am sort of out of my depth here, I am not a CMake wizard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts: either need a conditional add library for oss or make a dummy target in vtools
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for those pointers, so I also have no idea what I am doing here but with: https://github.com/redpanda-data/vtools/compare/stephan/find-unordered-dense?expand=1
and
diff --git a/src/v/container/CMakeLists.txt b/src/v/container/CMakeLists.txt
index 7b260ee289..c524beffdf 100644
--- a/src/v/container/CMakeLists.txt
+++ b/src/v/container/CMakeLists.txt
@@ -1,7 +1,11 @@
find_package(Roaring REQUIRED)
+find_package(unordered_dense REQUIRED)
+
v_cc_library(
NAME container
- DEPS Roaring::roaring unordered_dense
+ DEPS
+ Roaring::roaring
+ unordered_dense::unordered_dense
)
add_subdirectory(tests)
it seems to work.
@dotnwat is this the right way to go?
We really need to get away from the dual build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks ok.
When it's not a header-only library we can usually get away with a little alias hack in the open-source build like this https://github.com/redpanda-data/redpanda/blob/dev/cmake/dependencies.cmake#L135-L137.
But given it's header-only I think FindUnorderedDense.cmake is the most reasonable thing. And that looks fine.
We really need to get away from the dual build.
100%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Noah, here it is: https://github.com/redpanda-data/vtools/pull/2594
Introduces `chunked_hash_map` which is a type alias for the segmented version of https://github.com/martinus/unordered_dense/ using our `chunked_vector` as a backing array. This allows us to use a hashmap in scenarios where the map may grow large (i.e.: scale with partition/topic counts) similar to `chunked_vector` and avoids us having to fall back to the suboptimal `absl::btree_map`. Below are some numbers from our leader balanacer benchmark comparing the different map types: | test | iterations | median | mad | min | max | allocs | tasks | inst | | - | - | - | - | - | - | - | - | - | | lb.random_generator_movement_absl_node | 149 | 3.120ms | 97.624ns | 3.120ms | 3.122ms | 0.000 | 0.000 | 44663355.3 | | lb.random_generator_movement_absl_flat | 152 | 3.090ms | 342.546ns | 3.089ms | 3.090ms | 0.000 | 0.000 | 44846754.1 | | lb.random_generator_movement_absl_btree | 84 | 8.103ms | 884.595ns | 8.102ms | 8.105ms | 0.000 | 0.000 | 57163339.9 | | lb.random_generator_movement_std_map | 117 | 5.028ms | 676.256ns | 5.027ms | 5.030ms | 0.000 | 0.000 | 53678485.9 | | lb.random_generator_movement_unordered_dense | 156 | 2.850ms | 576.237ns | 2.849ms | 2.851ms | 0.000 | 0.000 | 43115077.3 | | lb.random_generator_movement_unordered_dense_segmented | 149 | 3.111ms | 360.074ns | 3.111ms | 3.115ms | 0.000 | 0.000 | 46250454.0 | | lb.random_generator_movement_unordered_dense_segmented_frag | 151 | 3.063ms | 388.795ns | 3.063ms | 3.065ms | 0.000 | 0.000 | 45697581.1 | | lb.random_generator_movement_unordered_dense_segmented_frag_absl_hash | 139 | 3.643ms | 444.705ns | 3.642ms | 3.643ms | 0.000 | 0.000 | 46476708.7 | This patch further adds some missing typedefs to `fragmented_vector` that are needed for it to work as the backing array for `unordered_dense`.
Use a conditional hash adapter that supports `AbslHashValue` if the class provides it.
At large partition counts (50k+) the `absl::node_hash_map` backing index array would cause oversized allocs otherwise. Switch to `chunked_hash_map`.
88a5f5c
to
d6ba468
Compare
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46672#018e6d93-1d0c-40bc-b583-87cf7d32e2fa ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46672#018e6d93-1d15-4eea-85f0-0401b81138a4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public build passes: https://github.com/redpanda-data/redpanda/actions/runs/8419599090
@StephanDollberg let's update the advice on the wiki with this: https://redpandadata.atlassian.net/wiki/spaces/CORE/pages/318275653/Memory+Management+in+Redpanda |
/backport v23.3.x |
Introduces
chunked_hash_map
which is a type alias for the segmentedversion of https://github.com/martinus/unordered_dense/ using our
chunked_vector
as a backing array.This allows us to use a hashmap in scenarios where the map may grow
large (i.e.: scale with partition/topic counts) similar to
chunked_vector
and avoids us having to fall back to the suboptimal
absl::btree_map
.Below are some numbers from our leader balanacer benchmark comparing the
different map types:
Backports Required
Release Notes