-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc++] Insert new nodes at the beginning of equal range in std::unordered_multimap #104702
base: main
Are you sure you want to change the base?
[libc++] Insert new nodes at the beginning of equal range in std::unordered_multimap #104702
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write If you have received no comments on your PR for a week, you can request a review If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
…ltimap - Changed the insertion and emplacement logic in `std::unordered_multimap` to insert new nodes at the beginning of an equal range instead of the back. - This change addresses a performance issue where inserting at the back degraded performance to that of a linked list for large numbers of elements with equal keys. - The new approach improves performance significantly, fixing a bug reported in 2013 (Bug 16747 in LLVM Bugzilla) that was previously marked as WONTFIX. - Note: This change may break code that relies on the order of elements within each equal range, although such order is not guaranteed by the standard. - This new behavior mimics the insertion logic of the `libstdc++` implementation. - Added a benchmark to measure the performance improvements of the new insertion logic.
eb58296
to
7187d6c
Compare
Benchmark ResultsEnvironment:
Summary: Results before: Results after: |
|
@llvm/pr-subscribers-libcxx Author: Arvid Jonasson (arvidjonasson) ChangesSummary of Changes
Rationale for ChangeThe original bug report from 2013 was marked as WONTFIX, but I believe this change now warrants reconsideration. This fix is crucial because it adheres to the principle of least astonishment: users reasonably expect This issue was highlighted by a performance degradation encountered in an open-source library with tens of thousands of stars on GitHub. Suggesting workarounds, such as using Related Issues
Full diff: https://github.com/llvm/llvm-project/pull/104702.diff 6 Files Affected:
diff --git a/libcxx/include/__hash_table b/libcxx/include/__hash_table
index d5fbc92a3dfc4e..d4c18599c92509 100644
--- a/libcxx/include/__hash_table
+++ b/libcxx/include/__hash_table
@@ -1398,7 +1398,8 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_unique(__node_pointer __
// Prepare the container for an insertion of the value __cp_val with the hash
// __cp_hash. This does a lookup into the container to see if __cp_value is
// already present, and performs a rehash if necessary. Returns a pointer to the
-// last occurrence of __cp_val in the map.
+// node before the insertion point, or nullptr if the insertion point is the
+// first node in the bucket.
//
// Note that this function does forward exceptions if key_eq() throws, and never
// mutates __value or actually inserts into the map.
@@ -1414,28 +1415,20 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_multi_prepare(size_t __c
size_t __chash = std::__constrain_hash(__cp_hash, __bc);
__next_pointer __pn = __bucket_list_[__chash];
if (__pn != nullptr) {
- for (bool __found = false;
- __pn->__next_ != nullptr && std::__constrain_hash(__pn->__next_->__hash(), __bc) == __chash;
+ for (; __pn->__next_ != nullptr && std::__constrain_hash(__pn->__next_->__hash(), __bc) == __chash;
__pn = __pn->__next_) {
- // __found key_eq() action
- // false false loop
- // true true loop
- // false true set __found to true
- // true false break
- if (__found !=
- (__pn->__next_->__hash() == __cp_hash && key_eq()(__pn->__next_->__upcast()->__get_value(), __cp_val))) {
- if (!__found)
- __found = true;
- else
- break;
- }
+ // Iterate until either:
+ // __pn->__next_ is nullptr, or
+ // __pn->__next_ is in a different bucket, or
+ // __pn->__next_ has the same key as __cp_val
+ if (__pn->__next_->__hash() == __cp_hash && key_eq()(__pn->__next_->__upcast()->__get_value(), __cp_val))
+ break;
}
}
return __pn;
}
-// Insert the node __cp into the container after __pn (which is the last node in
-// the bucket that compares equal to __cp). Rehashing, and checking for
+// Insert the node __cp into the container after __pn. Rehashing, and checking for
// uniqueness has already been performed (in __node_insert_multi_prepare), so
// all we need to do is update the bucket and size(). Assumes that __cp->__hash
// is up-to-date.
diff --git a/libcxx/test/benchmarks/CMakeLists.txt b/libcxx/test/benchmarks/CMakeLists.txt
index 616cf0ff8d2374..3d51a3599a1f8a 100644
--- a/libcxx/test/benchmarks/CMakeLists.txt
+++ b/libcxx/test/benchmarks/CMakeLists.txt
@@ -168,6 +168,7 @@ set(BENCHMARK_TESTS
stringstream.bench.cpp
system_error.bench.cpp
to_chars.bench.cpp
+ unordered_multimap_operations.bench.cpp
unordered_set_operations.bench.cpp
util_smartptr.bench.cpp
variant_visit_1.bench.cpp
diff --git a/libcxx/test/benchmarks/ContainerBenchmarks.h b/libcxx/test/benchmarks/ContainerBenchmarks.h
index 744505b439985b..bcfe8c9f9e0f16 100644
--- a/libcxx/test/benchmarks/ContainerBenchmarks.h
+++ b/libcxx/test/benchmarks/ContainerBenchmarks.h
@@ -99,7 +99,21 @@ void BM_InsertValue(benchmark::State& st, Container c, GenInputs gen) {
while (st.KeepRunning()) {
c.clear();
for (auto it = in.begin(); it != end; ++it) {
- benchmark::DoNotOptimize(&(*c.insert(*it).first));
+ benchmark::DoNotOptimize(c.insert(*it));
+ }
+ benchmark::ClobberMemory();
+ }
+}
+
+template <class Container, class... GenInputs>
+void BM_EmplaceValue(benchmark::State& st, Container c, GenInputs... gen) {
+ auto num = st.range(0);
+ auto args = std::make_tuple(gen(num)...);
+ while (st.KeepRunning()) {
+ c.clear();
+ for (decltype(num) i = 0; i < num; ++i) {
+ benchmark::DoNotOptimize(
+ std::apply([&c, i](auto&&... args) -> decltype(auto) { return c.emplace(args[i]...); }, args));
}
benchmark::ClobberMemory();
}
diff --git a/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp b/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp
new file mode 100644
index 00000000000000..0d921896901c0f
--- /dev/null
+++ b/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp
@@ -0,0 +1,260 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <cstdint>
+#include <cstdlib>
+#include <functional>
+#include <unordered_map>
+#include <vector>
+#include <utility>
+
+#include "benchmark/benchmark.h"
+
+#include "ContainerBenchmarks.h"
+#include "GenerateInput.h"
+#include "test_macros.h"
+
+using namespace ContainerBenchmarks;
+
+constexpr size_t TestNumInputs = 1024;
+
+template <class GenInputs>
+inline auto getRandomWithMaxCardinality(GenInputs genInputs, size_t maxCardinality) {
+ return [genInputs = std::forward<GenInputs>(genInputs), maxCardinality](size_t N) {
+ auto possibleInputs = genInputs(maxCardinality);
+ decltype(possibleInputs) inputs;
+ size_t minIdx = 0, maxIdx = possibleInputs.size() - 1;
+ for (size_t i = 0; i < N; ++i) {
+ inputs.push_back(possibleInputs[getRandomInteger(minIdx, maxIdx)]);
+ }
+ return inputs;
+ };
+}
+
+template <class GenInputsFirst, class GenInputsSecond>
+inline auto genRandomPairInputs(GenInputsFirst genFirst, GenInputsSecond genSecond) {
+ return [genFirst = std::forward< GenInputsFirst >(genFirst),
+ genSecond = std::forward< GenInputsSecond >(genSecond)](size_t N) {
+ auto first = genFirst(N);
+ auto second = genSecond(N);
+ std::vector<std::pair<typename decltype(first)::value_type, typename decltype(second)::value_type>> inputs;
+ inputs.reserve(N);
+ for (size_t i = 0; i < N; ++i) {
+ inputs.emplace_back(std::move(first[i]), std::move(second[i]));
+ }
+ return inputs;
+ };
+}
+
+//----------------------------------------------------------------------------//
+// BM_InsertValue
+// ---------------------------------------------------------------------------//
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomIntegerInputs<uint32_t>, getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_sorted,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getSortedIntegerInputs<uint32_t>, getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+// String //
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_string,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomStringInputs, getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+// Prefixed String //
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_prefixed_string,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getPrefixedRandomStringInputs, getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+// Random with max cardinalities [512, 128, 32, 8, 1] //
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_max_cardinality_512,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 512),
+ getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_max_cardinality_128,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 128),
+ getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_max_cardinality_32,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 32),
+ getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_max_cardinality_8,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 8),
+ getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+ unordered_multimap_uint32_max_cardinality_1,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 1),
+ getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+ BM_InsertValue,
+ unordered_multimap_string_max_cardinality_512,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 512), getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+ BM_InsertValue,
+ unordered_multimap_string_max_cardinality_128,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 128), getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+ BM_InsertValue,
+ unordered_multimap_string_max_cardinality_32,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 32), getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+ BM_InsertValue,
+ unordered_multimap_string_max_cardinality_8,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 8), getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+ BM_InsertValue,
+ unordered_multimap_string_max_cardinality_1,
+ std::unordered_multimap<std::string, uint32_t>{},
+ genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 1), getRandomIntegerInputs<uint32_t>))
+ ->Arg(TestNumInputs);
+
+//----------------------------------------------------------------------------//
+// BM_EmplaceValue
+// ---------------------------------------------------------------------------//
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomIntegerInputs<uint32_t>,
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_sorted,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getSortedIntegerInputs<uint32_t>,
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+// String //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomStringInputs,
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+// Prefixed String //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_prefixed_string,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getPrefixedRandomStringInputs,
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+// Random with max cardinalities [512, 128, 32, 8] //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_max_cardinality_512,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 512),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_max_cardinality_128,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 128),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_max_cardinality_32,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 32),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_max_cardinality_8,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 8),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_uint32_max_cardinality_1,
+ std::unordered_multimap<uint32_t, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 1),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string_max_cardinality_512,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomStringInputs, 512),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string_max_cardinality_128,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomStringInputs, 128),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string_max_cardinality_32,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomStringInputs, 32),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string_max_cardinality_8,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomStringInputs, 8),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+ unordered_multimap_string_max_cardinality_1,
+ std::unordered_multimap<std::string, uint32_t>{},
+ getRandomWithMaxCardinality(getRandomStringInputs, 1),
+ getRandomIntegerInputs<uint32_t>)
+ ->Arg(TestNumInputs);
+
+BENCHMARK_MAIN();
diff --git a/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp b/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
index 2a6e3f60ca6b57..ec31c2be1fa747 100644
--- a/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
+++ b/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
@@ -43,20 +43,12 @@ int main(int, char**)
assert(c.size() == 2);
assert(r->first == 3);
assert(r->second == Emplaceable(5, 6));
- LIBCPP_ASSERT(r == std::next(c.begin()));
r = c.emplace_hint(r, std::piecewise_construct, std::forward_as_tuple(3),
std::forward_as_tuple(6, 7));
assert(c.size() == 3);
assert(r->first == 3);
assert(r->second == Emplaceable(6, 7));
- LIBCPP_ASSERT(r == std::next(c.begin()));
- r = c.begin();
- assert(r->first == 3);
- LIBCPP_ASSERT(r->second == Emplaceable());
- r = std::next(r, 2);
- assert(r->first == 3);
- LIBCPP_ASSERT(r->second == Emplaceable(5, 6));
}
{
typedef std::unordered_multimap<int, Emplaceable, std::hash<int>, std::equal_to<int>,
@@ -74,20 +66,12 @@ int main(int, char**)
assert(c.size() == 2);
assert(r->first == 3);
assert(r->second == Emplaceable(5, 6));
- LIBCPP_ASSERT(r == std::next(c.begin()));
r = c.emplace_hint(r, std::piecewise_construct, std::forward_as_tuple(3),
std::forward_as_tuple(6, 7));
assert(c.size() == 3);
assert(r->first == 3);
assert(r->second == Emplaceable(6, 7));
- LIBCPP_ASSERT(r == std::next(c.begin()));
- r = c.begin();
- assert(r->first == 3);
- LIBCPP_ASSERT(r->second == Emplaceable());
- r = std::next(r, 2);
- assert(r->first == 3);
- LIBCPP_ASSERT(r->second == Emplaceable(5, 6));
}
return 0;
diff --git a/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h b/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
index 11088796e53869..326c03d2e5db24 100644
--- a/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
+++ b/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
@@ -754,7 +754,7 @@ void test_string(TestFunction check, ExceptionTest check_exception) {
template <class CharT, class TestFunction, class ExceptionTest>
void test_status(TestFunction check, ExceptionTest check_exception) {
- std::unordered_multimap<status, status> input{{status::foobar, status::foo}, {status::foobar, status::bar}};
+ std::multimap<status, status> input{{status::foobar, status::foo}, {status::foobar, status::bar}};
check(SV("{0xaa55: 0xaaaa, 0xaa55: 0x5555}"), SV("{}"), input);
check(SV("{0xaa55: 0xaaaa, 0xaa55: 0x5555}^42"), SV("{}^42"), input);
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the change is reasonable, but this seems like a change that has major potential for breaking a bunch of people. For that reason it might be a good idea to address #102303 before. If we find that it results in major breakage, we should probably add a flag (at least temporarily) to opt out of the optimization.
| @@ -1398,7 +1398,8 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_unique(__node_pointer __ | |||
| // Prepare the container for an insertion of the value __cp_val with the hash | |||
| // __cp_hash. This does a lookup into the container to see if __cp_value is | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not attached: We should add a release note for this.
Summary of Changes
Insertion and Emplacement Logic Update:
std::unordered_multimapto insert new nodes at the beginning of an equal range rather than at the back.Performance Improvement:
Compatibility Note:
Behavior Consistency:
libstdc++implementation.Benchmark and Testing:
std::unordered_multimap.Rationale for Change
The original bug report from 2013 was marked as WONTFIX, but I believe this change now warrants reconsideration. This fix is crucial because it adheres to the principle of least astonishment: users reasonably expect
std::unordered_multimapto efficiently handle duplicate keys. Without this improvement, operations that should take a split second could instead degrade into taking minutes or even hours under certain conditions.This issue was highlighted by a performance degradation encountered in an open-source library with tens of thousands of stars on GitHub. Suggesting workarounds, such as using
ummap.emplace_hint(ummap.find(key), key, value), is not ideal and could erode trust in the long-term reliability of the library.Related Issues