Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] Insert new nodes at the beginning of equal range in std::unordered_multimap #104702

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

arvidjonasson
Copy link

@arvidjonasson arvidjonasson commented Aug 18, 2024

Summary of Changes

  • Insertion and Emplacement Logic Update:

    • Modified the insertion and emplacement logic in std::unordered_multimap to insert new nodes at the beginning of an equal range rather than at the back.
    • This change addresses a significant performance issue where inserting at the back degraded performance to that of a linked list for large numbers of elements with equal keys.
  • Performance Improvement:

    • The new approach significantly improves performance by preventing the performance degradation observed in cases with many equal keys. This fix addresses Bug 16747, which was originally reported in 2013 and marked as WONTFIX.
  • Compatibility Note:

    • This change may affect code that relies on the order of elements within each equal range, although such order is not guaranteed by the standard.
  • Behavior Consistency:

    • The new insertion logic aligns with the behavior of the libstdc++ implementation.
  • Benchmark and Testing:

    • Added a benchmark to measure the performance improvements from the new insertion logic.
    • Fixed two tests that relied on the previous order of elements in std::unordered_multimap.

Rationale for Change

The original bug report from 2013 was marked as WONTFIX, but I believe this change now warrants reconsideration. This fix is crucial because it adheres to the principle of least astonishment: users reasonably expect std::unordered_multimap to efficiently handle duplicate keys. Without this improvement, operations that should take a split second could instead degrade into taking minutes or even hours under certain conditions.

This issue was highlighted by a performance degradation encountered in an open-source library with tens of thousands of stars on GitHub. Suggesting workarounds, such as using ummap.emplace_hint(ummap.find(key), key, value), is not ideal and could erode trust in the long-term reliability of the library.

Related Issues

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

…ltimap

- Changed the insertion and emplacement logic in `std::unordered_multimap` to insert new nodes at the beginning of an equal range instead of the back.
- This change addresses a performance issue where inserting at the back degraded performance to that of a linked list for large numbers of elements with equal keys.
- The new approach improves performance significantly, fixing a bug reported in 2013 (Bug 16747 in LLVM Bugzilla) that was previously marked as WONTFIX.
- Note: This change may break code that relies on the order of elements within each equal range, although such order is not guaranteed by the standard.
- This new behavior mimics the insertion logic of the `libstdc++` implementation.
- Added a benchmark to measure the performance improvements of the new insertion logic.
@arvidjonasson arvidjonasson force-pushed the insert-new-nodes-at-beginning-of-equal-range branch from eb58296 to 7187d6c Compare August 18, 2024 14:54
Quuxplusone added a commit to Quuxplusone/blog that referenced this pull request Aug 18, 2024
@arvidjonasson arvidjonasson changed the title Insert new nodes at the beginning of equal range in std::unordered_multimap [libc++]: Insert new nodes at the beginning of equal range in std::unordered_multimap Aug 18, 2024
@arvidjonasson arvidjonasson changed the title [libc++]: Insert new nodes at the beginning of equal range in std::unordered_multimap [libc++] Insert new nodes at the beginning of equal range in std::unordered_multimap Aug 18, 2024
@arvidjonasson
Copy link
Author

arvidjonasson commented Aug 18, 2024

Benchmark Results

Environment:

  • System: MacBook Air M2, 8GB RAM
  • Benchmark Conditions: Charger plugged in, battery saver off
  • Cache: 128 KB L1, 16 MB shared L2

Summary:
The previous insertion logic exhibited significant performance degradation when handling low cardinality keys, especially with string types. The new insertion logic shows marked improvement, particularly when dealing with a large number of duplicate keys.

Results before:

Running ./build/libcxx/test/benchmarks/unordered_multimap_operations.bench.out
-------------------------------------------------------------------------------------------------------------
Benchmark                                                                   Time             CPU   Iterations
-------------------------------------------------------------------------------------------------------------
BM_InsertValue/unordered_multimap_uint32/1024                           62331 ns        62270 ns        11240
BM_InsertValue/unordered_multimap_uint32_sorted/1024                    47805 ns        47654 ns        14820
BM_InsertValue/unordered_multimap_string/1024                          270417 ns       270342 ns         2556
BM_InsertValue/unordered_multimap_prefixed_string/1024                 272385 ns       272335 ns         2623
BM_InsertValue/unordered_multimap_uint32_max_cardinality_512/1024       71090 ns        70982 ns         9978
BM_InsertValue/unordered_multimap_uint32_max_cardinality_128/1024       76065 ns        75991 ns         9182
BM_InsertValue/unordered_multimap_uint32_max_cardinality_32/1024        89788 ns        89516 ns         7699
BM_InsertValue/unordered_multimap_uint32_max_cardinality_8/1024        138602 ns       138576 ns         4701
BM_InsertValue/unordered_multimap_uint32_max_cardinality_1/1024        590632 ns       589968 ns         1202
BM_InsertValue/unordered_multimap_string_max_cardinality_512/1024      312396 ns       312361 ns         2244
BM_InsertValue/unordered_multimap_string_max_cardinality_128/1024      404650 ns       404602 ns         1734
BM_InsertValue/unordered_multimap_string_max_cardinality_32/1024       762362 ns       762264 ns          914
BM_InsertValue/unordered_multimap_string_max_cardinality_8/1024       2218712 ns      2218479 ns          313
BM_InsertValue/unordered_multimap_string_max_cardinality_1/1024      15270609 ns     15268222 ns           45
BM_EmplaceValue/unordered_multimap_uint32/1024                          58471 ns        58467 ns        11884
BM_EmplaceValue/unordered_multimap_uint32_sorted/1024                   45723 ns        45719 ns        15412
BM_EmplaceValue/unordered_multimap_string/1024                         261983 ns       261916 ns         2677
BM_EmplaceValue/unordered_multimap_prefixed_string/1024                262311 ns       262241 ns         2664
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_512/1024      68535 ns        68527 ns        10128
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_128/1024      73849 ns        73842 ns         9490
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_32/1024       87869 ns        87859 ns         7903
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_8/1024       137971 ns       137957 ns         5066
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_1/1024       581757 ns       581625 ns         1196
BM_EmplaceValue/unordered_multimap_string_max_cardinality_512/1024     310528 ns       310482 ns         2276
BM_EmplaceValue/unordered_multimap_string_max_cardinality_128/1024     405231 ns       405164 ns         1746
BM_EmplaceValue/unordered_multimap_string_max_cardinality_32/1024      765931 ns       765796 ns          913
BM_EmplaceValue/unordered_multimap_string_max_cardinality_8/1024      2229371 ns      2229125 ns          313
BM_EmplaceValue/unordered_multimap_string_max_cardinality_1/1024     15392159 ns     15390261 ns           46

Results after:

Running ./build/libcxx/test/benchmarks/unordered_multimap_operations.bench.out
-------------------------------------------------------------------------------------------------------------
Benchmark                                                                   Time             CPU   Iterations
-------------------------------------------------------------------------------------------------------------
BM_InsertValue/unordered_multimap_uint32/1024                           58530 ns        58522 ns        12253
BM_InsertValue/unordered_multimap_uint32_sorted/1024                    46019 ns        46015 ns        15126
BM_InsertValue/unordered_multimap_string/1024                          262279 ns       262230 ns         2695
BM_InsertValue/unordered_multimap_prefixed_string/1024                 260191 ns       260157 ns         2688
BM_InsertValue/unordered_multimap_uint32_max_cardinality_512/1024       63376 ns        63363 ns        11320
BM_InsertValue/unordered_multimap_uint32_max_cardinality_128/1024       59583 ns        59577 ns        11942
BM_InsertValue/unordered_multimap_uint32_max_cardinality_32/1024        57061 ns        57052 ns        12315
BM_InsertValue/unordered_multimap_uint32_max_cardinality_8/1024         56225 ns        56217 ns        12407
BM_InsertValue/unordered_multimap_uint32_max_cardinality_1/1024         45790 ns        45779 ns        15304
BM_InsertValue/unordered_multimap_string_max_cardinality_512/1024      294407 ns       294365 ns         2402
BM_InsertValue/unordered_multimap_string_max_cardinality_128/1024      303496 ns       303460 ns         2346
BM_InsertValue/unordered_multimap_string_max_cardinality_32/1024       292438 ns       292404 ns         2395
BM_InsertValue/unordered_multimap_string_max_cardinality_8/1024        287342 ns       287327 ns         2423
BM_InsertValue/unordered_multimap_string_max_cardinality_1/1024        262533 ns       262502 ns         2648
BM_EmplaceValue/unordered_multimap_uint32/1024                          58538 ns        58532 ns        11972
BM_EmplaceValue/unordered_multimap_uint32_sorted/1024                   45809 ns        45804 ns        15193
BM_EmplaceValue/unordered_multimap_string/1024                         263236 ns       263203 ns         2691
BM_EmplaceValue/unordered_multimap_prefixed_string/1024                262738 ns       262697 ns         2674
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_512/1024      62269 ns        62261 ns        11204
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_128/1024      59043 ns        59036 ns        11972
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_32/1024       56576 ns        56563 ns        12206
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_8/1024        56446 ns        56443 ns        12290
BM_EmplaceValue/unordered_multimap_uint32_max_cardinality_1/1024        45799 ns        45795 ns        15219
BM_EmplaceValue/unordered_multimap_string_max_cardinality_512/1024     293243 ns       293192 ns         2394
BM_EmplaceValue/unordered_multimap_string_max_cardinality_128/1024     298157 ns       298116 ns         2354
BM_EmplaceValue/unordered_multimap_string_max_cardinality_32/1024      292008 ns       291967 ns         2360
BM_EmplaceValue/unordered_multimap_string_max_cardinality_8/1024       287732 ns       287706 ns         2442
BM_EmplaceValue/unordered_multimap_string_max_cardinality_1/1024       262941 ns       262902 ns         2655

@arvidjonasson arvidjonasson marked this pull request as ready for review August 18, 2024 23:49
@arvidjonasson arvidjonasson requested a review from a team as a code owner August 18, 2024 23:49
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Aug 18, 2024
@llvmbot
Copy link

llvmbot commented Aug 18, 2024

@llvm/pr-subscribers-libcxx

Author: Arvid Jonasson (arvidjonasson)

Changes

Summary of Changes

  • Insertion and Emplacement Logic Update:

    • Modified the insertion and emplacement logic in std::unordered_multimap to insert new nodes at the beginning of an equal range rather than at the back.
    • This change addresses a significant performance issue where inserting at the back degraded performance to that of a linked list for large numbers of elements with equal keys.
  • Performance Improvement:

    • The new approach significantly improves performance by preventing the performance degradation observed in cases with many equal keys. This fix addresses Bug 16747, which was originally reported in 2013 and marked as WONTFIX.
  • Compatibility Note:

    • This change may affect code that relies on the order of elements within each equal range, although such order is not guaranteed by the standard.
  • Behavior Consistency:

    • The new insertion logic aligns with the behavior of the libstdc++ implementation.
  • Benchmark and Testing:

    • Added a benchmark to measure the performance improvements from the new insertion logic.
    • Fixed two tests that relied on the previous order of elements in std::unordered_multimap.

Rationale for Change

The original bug report from 2013 was marked as WONTFIX, but I believe this change now warrants reconsideration. This fix is crucial because it adheres to the principle of least astonishment: users reasonably expect std::unordered_multimap to efficiently handle duplicate keys. Without this improvement, operations that should take a split second could instead degrade into taking minutes or even hours under certain conditions.

This issue was highlighted by a performance degradation encountered in an open-source library with tens of thousands of stars on GitHub. Suggesting workarounds, such as using ummap.emplace_hint(ummap.find(key), key, value), is not ideal and could erode trust in the long-term reliability of the library.

Related Issues

  • #17121
  • #21649

Full diff: https://github.com/llvm/llvm-project/pull/104702.diff

6 Files Affected:

  • (modified) libcxx/include/__hash_table (+10-17)
  • (modified) libcxx/test/benchmarks/CMakeLists.txt (+1)
  • (modified) libcxx/test/benchmarks/ContainerBenchmarks.h (+15-1)
  • (added) libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp (+260)
  • (modified) libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp (-16)
  • (modified) libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h (+1-1)
diff --git a/libcxx/include/__hash_table b/libcxx/include/__hash_table
index d5fbc92a3dfc4e..d4c18599c92509 100644
--- a/libcxx/include/__hash_table
+++ b/libcxx/include/__hash_table
@@ -1398,7 +1398,8 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_unique(__node_pointer __
 // Prepare the container for an insertion of the value __cp_val with the hash
 // __cp_hash. This does a lookup into the container to see if __cp_value is
 // already present, and performs a rehash if necessary. Returns a pointer to the
-// last occurrence of __cp_val in the map.
+// node before the insertion point, or nullptr if the insertion point is the
+// first node in the bucket.
 //
 // Note that this function does forward exceptions if key_eq() throws, and never
 // mutates __value or actually inserts into the map.
@@ -1414,28 +1415,20 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_multi_prepare(size_t __c
   size_t __chash      = std::__constrain_hash(__cp_hash, __bc);
   __next_pointer __pn = __bucket_list_[__chash];
   if (__pn != nullptr) {
-    for (bool __found = false;
-         __pn->__next_ != nullptr && std::__constrain_hash(__pn->__next_->__hash(), __bc) == __chash;
+    for (; __pn->__next_ != nullptr && std::__constrain_hash(__pn->__next_->__hash(), __bc) == __chash;
          __pn = __pn->__next_) {
-      //      __found    key_eq()     action
-      //      false       false       loop
-      //      true        true        loop
-      //      false       true        set __found to true
-      //      true        false       break
-      if (__found !=
-          (__pn->__next_->__hash() == __cp_hash && key_eq()(__pn->__next_->__upcast()->__get_value(), __cp_val))) {
-        if (!__found)
-          __found = true;
-        else
-          break;
-      }
+      // Iterate until either:
+      //   __pn->__next_ is nullptr, or
+      //   __pn->__next_ is in a different bucket, or
+      //   __pn->__next_ has the same key as __cp_val
+      if (__pn->__next_->__hash() == __cp_hash && key_eq()(__pn->__next_->__upcast()->__get_value(), __cp_val))
+        break;
     }
   }
   return __pn;
 }
 
-// Insert the node __cp into the container after __pn (which is the last node in
-// the bucket that compares equal to __cp). Rehashing, and checking for
+// Insert the node __cp into the container after __pn. Rehashing, and checking for
 // uniqueness has already been performed (in __node_insert_multi_prepare), so
 // all we need to do is update the bucket and size(). Assumes that __cp->__hash
 // is up-to-date.
diff --git a/libcxx/test/benchmarks/CMakeLists.txt b/libcxx/test/benchmarks/CMakeLists.txt
index 616cf0ff8d2374..3d51a3599a1f8a 100644
--- a/libcxx/test/benchmarks/CMakeLists.txt
+++ b/libcxx/test/benchmarks/CMakeLists.txt
@@ -168,6 +168,7 @@ set(BENCHMARK_TESTS
     stringstream.bench.cpp
     system_error.bench.cpp
     to_chars.bench.cpp
+    unordered_multimap_operations.bench.cpp
     unordered_set_operations.bench.cpp
     util_smartptr.bench.cpp
     variant_visit_1.bench.cpp
diff --git a/libcxx/test/benchmarks/ContainerBenchmarks.h b/libcxx/test/benchmarks/ContainerBenchmarks.h
index 744505b439985b..bcfe8c9f9e0f16 100644
--- a/libcxx/test/benchmarks/ContainerBenchmarks.h
+++ b/libcxx/test/benchmarks/ContainerBenchmarks.h
@@ -99,7 +99,21 @@ void BM_InsertValue(benchmark::State& st, Container c, GenInputs gen) {
   while (st.KeepRunning()) {
     c.clear();
     for (auto it = in.begin(); it != end; ++it) {
-      benchmark::DoNotOptimize(&(*c.insert(*it).first));
+      benchmark::DoNotOptimize(c.insert(*it));
+    }
+    benchmark::ClobberMemory();
+  }
+}
+
+template <class Container, class... GenInputs>
+void BM_EmplaceValue(benchmark::State& st, Container c, GenInputs... gen) {
+  auto num  = st.range(0);
+  auto args = std::make_tuple(gen(num)...);
+  while (st.KeepRunning()) {
+    c.clear();
+    for (decltype(num) i = 0; i < num; ++i) {
+      benchmark::DoNotOptimize(
+          std::apply([&c, i](auto&&... args) -> decltype(auto) { return c.emplace(args[i]...); }, args));
     }
     benchmark::ClobberMemory();
   }
diff --git a/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp b/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp
new file mode 100644
index 00000000000000..0d921896901c0f
--- /dev/null
+++ b/libcxx/test/benchmarks/unordered_multimap_operations.bench.cpp
@@ -0,0 +1,260 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <cstdint>
+#include <cstdlib>
+#include <functional>
+#include <unordered_map>
+#include <vector>
+#include <utility>
+
+#include "benchmark/benchmark.h"
+
+#include "ContainerBenchmarks.h"
+#include "GenerateInput.h"
+#include "test_macros.h"
+
+using namespace ContainerBenchmarks;
+
+constexpr size_t TestNumInputs = 1024;
+
+template <class GenInputs>
+inline auto getRandomWithMaxCardinality(GenInputs genInputs, size_t maxCardinality) {
+  return [genInputs = std::forward<GenInputs>(genInputs), maxCardinality](size_t N) {
+    auto possibleInputs = genInputs(maxCardinality);
+    decltype(possibleInputs) inputs;
+    size_t minIdx = 0, maxIdx = possibleInputs.size() - 1;
+    for (size_t i = 0; i < N; ++i) {
+      inputs.push_back(possibleInputs[getRandomInteger(minIdx, maxIdx)]);
+    }
+    return inputs;
+  };
+}
+
+template <class GenInputsFirst, class GenInputsSecond>
+inline auto genRandomPairInputs(GenInputsFirst genFirst, GenInputsSecond genSecond) {
+  return [genFirst  = std::forward< GenInputsFirst >(genFirst),
+          genSecond = std::forward< GenInputsSecond >(genSecond)](size_t N) {
+    auto first  = genFirst(N);
+    auto second = genSecond(N);
+    std::vector<std::pair<typename decltype(first)::value_type, typename decltype(second)::value_type>> inputs;
+    inputs.reserve(N);
+    for (size_t i = 0; i < N; ++i) {
+      inputs.emplace_back(std::move(first[i]), std::move(second[i]));
+    }
+    return inputs;
+  };
+}
+
+//----------------------------------------------------------------------------//
+//                       BM_InsertValue
+// ---------------------------------------------------------------------------//
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomIntegerInputs<uint32_t>, getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_sorted,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getSortedIntegerInputs<uint32_t>, getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+// String //
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_string,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  genRandomPairInputs(getRandomStringInputs, getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+// Prefixed String //
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_prefixed_string,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  genRandomPairInputs(getPrefixedRandomStringInputs, getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+// Random with max cardinalities [512, 128, 32, 8, 1] //
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_max_cardinality_512,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 512),
+                                      getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_max_cardinality_128,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 128),
+                                      getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_max_cardinality_32,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 32),
+                                      getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_max_cardinality_8,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 8),
+                                      getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_InsertValue,
+                  unordered_multimap_uint32_max_cardinality_1,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  genRandomPairInputs(getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 1),
+                                      getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+    BM_InsertValue,
+    unordered_multimap_string_max_cardinality_512,
+    std::unordered_multimap<std::string, uint32_t>{},
+    genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 512), getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+    BM_InsertValue,
+    unordered_multimap_string_max_cardinality_128,
+    std::unordered_multimap<std::string, uint32_t>{},
+    genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 128), getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+    BM_InsertValue,
+    unordered_multimap_string_max_cardinality_32,
+    std::unordered_multimap<std::string, uint32_t>{},
+    genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 32), getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+    BM_InsertValue,
+    unordered_multimap_string_max_cardinality_8,
+    std::unordered_multimap<std::string, uint32_t>{},
+    genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 8), getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(
+    BM_InsertValue,
+    unordered_multimap_string_max_cardinality_1,
+    std::unordered_multimap<std::string, uint32_t>{},
+    genRandomPairInputs(getRandomWithMaxCardinality(getRandomStringInputs, 1), getRandomIntegerInputs<uint32_t>))
+    ->Arg(TestNumInputs);
+
+//----------------------------------------------------------------------------//
+//                       BM_EmplaceValue
+// ---------------------------------------------------------------------------//
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomIntegerInputs<uint32_t>,
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_sorted,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getSortedIntegerInputs<uint32_t>,
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+// String //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomStringInputs,
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+// Prefixed String //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_prefixed_string,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getPrefixedRandomStringInputs,
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+// Random with max cardinalities [512, 128, 32, 8] //
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_max_cardinality_512,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 512),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_max_cardinality_128,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 128),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_max_cardinality_32,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 32),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_max_cardinality_8,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 8),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_uint32_max_cardinality_1,
+                  std::unordered_multimap<uint32_t, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomIntegerInputs<uint32_t>, 1),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string_max_cardinality_512,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomStringInputs, 512),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string_max_cardinality_128,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomStringInputs, 128),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string_max_cardinality_32,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomStringInputs, 32),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string_max_cardinality_8,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomStringInputs, 8),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_CAPTURE(BM_EmplaceValue,
+                  unordered_multimap_string_max_cardinality_1,
+                  std::unordered_multimap<std::string, uint32_t>{},
+                  getRandomWithMaxCardinality(getRandomStringInputs, 1),
+                  getRandomIntegerInputs<uint32_t>)
+    ->Arg(TestNumInputs);
+
+BENCHMARK_MAIN();
diff --git a/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp b/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
index 2a6e3f60ca6b57..ec31c2be1fa747 100644
--- a/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
+++ b/libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/emplace_hint.pass.cpp
@@ -43,20 +43,12 @@ int main(int, char**)
         assert(c.size() == 2);
         assert(r->first == 3);
         assert(r->second == Emplaceable(5, 6));
-        LIBCPP_ASSERT(r == std::next(c.begin()));
 
         r = c.emplace_hint(r, std::piecewise_construct, std::forward_as_tuple(3),
                                                         std::forward_as_tuple(6, 7));
         assert(c.size() == 3);
         assert(r->first == 3);
         assert(r->second == Emplaceable(6, 7));
-        LIBCPP_ASSERT(r == std::next(c.begin()));
-        r = c.begin();
-        assert(r->first == 3);
-        LIBCPP_ASSERT(r->second == Emplaceable());
-        r = std::next(r, 2);
-        assert(r->first == 3);
-        LIBCPP_ASSERT(r->second == Emplaceable(5, 6));
     }
     {
         typedef std::unordered_multimap<int, Emplaceable, std::hash<int>, std::equal_to<int>,
@@ -74,20 +66,12 @@ int main(int, char**)
         assert(c.size() == 2);
         assert(r->first == 3);
         assert(r->second == Emplaceable(5, 6));
-        LIBCPP_ASSERT(r == std::next(c.begin()));
 
         r = c.emplace_hint(r, std::piecewise_construct, std::forward_as_tuple(3),
                                                         std::forward_as_tuple(6, 7));
         assert(c.size() == 3);
         assert(r->first == 3);
         assert(r->second == Emplaceable(6, 7));
-        LIBCPP_ASSERT(r == std::next(c.begin()));
-        r = c.begin();
-        assert(r->first == 3);
-        LIBCPP_ASSERT(r->second == Emplaceable());
-        r = std::next(r, 2);
-        assert(r->first == 3);
-        LIBCPP_ASSERT(r->second == Emplaceable(5, 6));
     }
 
   return 0;
diff --git a/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h b/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
index 11088796e53869..326c03d2e5db24 100644
--- a/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
+++ b/libcxx/test/std/utilities/format/format.range/format.range.fmtmap/format.functions.tests.h
@@ -754,7 +754,7 @@ void test_string(TestFunction check, ExceptionTest check_exception) {
 
 template <class CharT, class TestFunction, class ExceptionTest>
 void test_status(TestFunction check, ExceptionTest check_exception) {
-  std::unordered_multimap<status, status> input{{status::foobar, status::foo}, {status::foobar, status::bar}};
+  std::multimap<status, status> input{{status::foobar, status::foo}, {status::foobar, status::bar}};
 
   check(SV("{0xaa55: 0xaaaa, 0xaa55: 0x5555}"), SV("{}"), input);
   check(SV("{0xaa55: 0xaaaa, 0xaa55: 0x5555}^42"), SV("{}^42"), input);

Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change is reasonable, but this seems like a change that has major potential for breaking a bunch of people. For that reason it might be a good idea to address #102303 before. If we find that it results in major breakage, we should probably add a flag (at least temporarily) to opt out of the optimization.

@@ -1398,7 +1398,8 @@ __hash_table<_Tp, _Hash, _Equal, _Alloc>::__node_insert_unique(__node_pointer __
// Prepare the container for an insertion of the value __cp_val with the hash
// __cp_hash. This does a lookup into the container to see if __cp_value is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not attached: We should add a release note for this.

@arvidjonasson
Copy link
Author

I'll look into #102303. However, I think it might be wise to initially make the fix for this issue opt-in, without waiting for #102303. Once #102303 is resolved, we can then consider making the fix opt-out. What are your thoughts on this approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants