huixie90
Member

@huixie90 huixie90 commented Oct 19, 2023

  • [libc++][test] add more benchmarks for stop_token (this commit has an in-progress PR)
  • [libc++] use mutex in the stop_token

Benchmark results:
Old is origin/main.
New is the code in this patch (which uses a mutex in the implementation of stop_token).

My platform is macOS with an old Intel CPU (Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz).

Note that the most realistic test case is BM_stop_token_async_reg_unreg_callback (which was provided by Lewis Baker).
However, on that test case the mutex version is about 30% slower (Time column) on my platform.

Benchmark                                                                     Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------
BM_stop_token_single_thread_polling_stop_requested/1024                    -0.0012         -0.0023         23441         23413         23251         23198
BM_stop_token_single_thread_polling_stop_requested/2048                    +0.0765         +0.0990         50684         54561         48083         52842
BM_stop_token_single_thread_polling_stop_requested/4096                    -0.0343         -0.0081         96385         93076         92942         92186
BM_stop_token_single_thread_polling_stop_requested/8192                    -0.0094         +0.0012        187784        186021        184064        184276
BM_stop_token_single_thread_polling_stop_requested/16384                   +0.1298         +0.1141        373690        422209        367815        409787
BM_stop_token_single_thread_polling_stop_requested/32768                   +0.0254         +0.0156        751648        770709        733097        744565
BM_stop_token_single_thread_polling_stop_requested/65536                   -0.1033         +0.0261       1728804       1550261       1464467       1502668
BM_stop_token_single_thread_polling_stop_requested/131072                  +0.1729         +0.1277       3040908       3566805       2959195       3337039
BM_stop_token_single_thread_polling_stop_requested/262144                  -0.0038         +0.0042       6132277       6108979       5962607       5987738
BM_stop_token_single_thread_polling_stop_requested/524288                  +0.1145         +0.1212      12227363      13627524      11867559      13306034
BM_stop_token_single_thread_polling_stop_requested/1048576                 -0.0410         -0.0121      24950396      23927586      23859300      23570419
BM_stop_token_single_thread_polling_stop_requested/2097152                 +0.0938         +0.1147      48411963      52950600      47142867      52551400
BM_stop_token_single_thread_polling_stop_requested/4194304                 -0.0569         +0.0046     100648615      94923150      93629000      94062429
BM_stop_token_single_thread_polling_stop_requested/8388608                 -0.0606         -0.0015     203746515     191404329     188938500     188654500
BM_stop_token_single_thread_polling_stop_requested/16777216                -0.0465         -0.0125     398377111     379865855     381103000     376323500
BM_stop_token_multi_thread_polling_stop_requested/1024                     -0.0000         -0.0152      10022095      10021933          9053          8915
BM_stop_token_multi_thread_polling_stop_requested/2048                     +0.0000         +0.0045      10022271      10022285          9206          9247
BM_stop_token_multi_thread_polling_stop_requested/4096                     -0.0074         -0.0326      10098053      10023575          9720          9403
BM_stop_token_multi_thread_polling_stop_requested/8192                     +0.0000         +0.0229      10022785      10023275          9552          9771
BM_stop_token_multi_thread_polling_stop_requested/16384                    -0.0001         -0.1178      10022176      10021112          9260          8169
BM_stop_token_multi_thread_polling_stop_requested/32768                    -0.0069         -0.1561      10091077      10021860         10804          9117
BM_stop_token_multi_thread_polling_stop_requested/65536                    -0.0009         -0.0634      10030966      10022186          9947          9316
BM_stop_token_multi_thread_polling_stop_requested/131072                   -0.0009         +0.0602      10031877      10022752          9043          9587
BM_stop_token_multi_thread_polling_stop_requested/262144                   +0.0127         +0.0310      10023011      10150196          9761         10064
BM_stop_token_multi_thread_polling_stop_requested/524288                   -0.0059         -0.0247      10243258      10182981         10051          9803
BM_stop_token_multi_thread_polling_stop_requested/1048576                  +0.0119         -0.0645      20023195      20260835         16056         15021
BM_stop_token_multi_thread_polling_stop_requested/2097152                  +0.0078         +0.0004      38143460      38442644         27510         27520
BM_stop_token_multi_thread_polling_stop_requested/4194304                  +0.1859         +0.1750      68374502      81086410         48510         57000
BM_stop_token_multi_thread_polling_stop_requested/8388608                  +0.2084         +0.1145     128822315     155665247         84690         94390
BM_stop_token_multi_thread_polling_stop_requested/16777216                 +0.0272         +0.1679     256288538     263270466        150100        175300
BM_stop_token_single_thread_reg_unreg_callback/1024                        -0.3340         -0.3734        391317        260604        387914        243075
BM_stop_token_single_thread_reg_unreg_callback/2048                        -0.3844         -0.3899        751618        462709        742515        453025
BM_stop_token_single_thread_reg_unreg_callback/4096                        -0.3854         -0.3912       1482209        910927       1448161        881601
BM_stop_token_single_thread_reg_unreg_callback/8192                        -0.3990         -0.3975       2967106       1783284       2883508       1737339
BM_stop_token_single_thread_reg_unreg_callback/16384                       -0.4122         -0.4149       5901638       3469093       5784545       3384376
BM_stop_token_single_thread_reg_unreg_callback/32768                       -0.4112         -0.4119      11557329       6805108      11433328       6723990
BM_stop_token_single_thread_reg_unreg_callback/65536                       -0.4233         -0.4185      23626270      13624736      23064032      13412019
BM_stop_token_single_thread_reg_unreg_callback/131072                      -0.4142         -0.4150      47619537      27897418      46265067      27066615
BM_stop_token_single_thread_reg_unreg_callback/262144                      -0.4096         -0.4138      92522626      54627747      91492750      53636077
BM_stop_token_single_thread_reg_unreg_callback/524288                      -0.4151         -0.4191     188534994     110266830     185798250     107923333
BM_stop_token_single_thread_reg_unreg_callback/1048576                     -0.4091         -0.4140     368334133     217655504     364605000     213648333
BM_stop_token_single_thread_reg_unreg_callback/2097152                     -0.4136         -0.4151     754131916     442198239     736178000     430615000
BM_stop_token_single_thread_reg_unreg_callback/4194304                     -0.4216         -0.4182    1526152018     882728129    1479928000     860986000
BM_stop_token_single_thread_reg_unreg_callback/8388608                     -0.4065         -0.4172    3039331041    1803756065    2967543000    1729398000
BM_stop_token_single_thread_reg_unreg_callback/16777216                    -0.4290         -0.4206    6073153399    3468023265    5922712000    3431697000
BM_stop_token_async_reg_unreg_callback/1024                                +0.3326         -0.2332       7501646       9996707          9437          7236
BM_stop_token_async_reg_unreg_callback/2048                                +0.1611         -0.1852       8624101      10013095          8971          7310
BM_stop_token_async_reg_unreg_callback/4096                                +0.1167         -0.2250       9058483      10115628          9356          7251
BM_stop_token_async_reg_unreg_callback/8192                                +0.0183         -0.4677       9830867      10011011         13147          6998
BM_stop_token_async_reg_unreg_callback/16384                               +0.2157         -0.5530      16460254      20010154         28254         12630
BM_stop_token_async_reg_unreg_callback/32768                               +0.4153         -0.1983      28446237      40259523         33190         26610
BM_stop_token_async_reg_unreg_callback/65536                               +0.3619         -0.3632      51359055      69947472         71510         45540
BM_stop_token_async_reg_unreg_callback/131072                              +0.3368         -0.4223      98274913     131374363        141940         82000
BM_stop_token_async_reg_unreg_callback/262144                              +0.3220         -0.4234     192115122     253977793        267760        154400
BM_stop_token_async_reg_unreg_callback/524288                              +0.3718         -0.3172     373407673     512234425        445100        303900
BM_stop_token_async_reg_unreg_callback/1048576                             +0.4333         -0.3859     742966141    1064906436       1024000        628800
BM_stop_token_async_reg_unreg_callback/2097152                             +0.3477         -0.4415    1510280878    2035443237       2087000       1165600
BM_stop_token_async_reg_unreg_callback/4194304                             +0.3801         -0.4078    3017343359    4164301772       4134000       2448000
BM_stop_token_async_reg_unreg_callback/8388608                             +0.3525         -0.4697    6003461062    8119399125       8665000       4595000
BM_stop_token_async_reg_unreg_callback/16777216                            +0.3998         -0.3143   11974818159   16762767199      15884000      10892000
OVERALL_GEOMEAN                                                            -0.0529         -0.2108             0             0             0             0
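
For reading the table: the first two columns appear to be the relative change from Old to New, that is (new - old) / old, so a negative value means the mutex version is faster. The helper below is a hypothetical sketch (the name `relative_change` is not part of the patch or of the benchmark tooling) that reproduces those columns from the raw numbers.

```cpp
#include <cmath>

// Hypothetical helper, not part of the patch: relative change from an old
// measurement to a new one. Negative means the new value is smaller (faster).
inline double relative_change(double old_value, double new_value) {
  return (new_value - old_value) / std::fabs(old_value);
}
```

For example, `relative_change(391317, 260604)` is about -0.334, which matches the Time column of BM_stop_token_single_thread_reg_unreg_callback/1024, and `relative_change(7501646, 9996707)` is about +0.333, which matches BM_stop_token_async_reg_unreg_callback/1024.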

However, on macOS with an arm64 chip, the results look quite the opposite:

Comparing ../../../build/atomic.json to ../../../build/mutex.json
Benchmark                                                                     Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------
BM_stop_token_single_thread_polling_stop_requested/1024                    -0.0041         -0.0041         11395         11348         11393         11346
BM_stop_token_single_thread_polling_stop_requested/2048                    +0.0420         +0.0410         22460         23405         22457         23377
BM_stop_token_single_thread_polling_stop_requested/4096                    +0.0015         +0.0015         44708         44773         44702         44768
BM_stop_token_single_thread_polling_stop_requested/8192                    +0.0053         +0.0052         90053         90532         90044         90515
BM_stop_token_single_thread_polling_stop_requested/16384                   -0.0003         -0.0003        181192        181143        181166        181119
BM_stop_token_single_thread_polling_stop_requested/32768                   +0.0154         +0.0161        360188        365732        359870        365681
BM_stop_token_single_thread_polling_stop_requested/65536                   +0.0036         +0.0035        722259        724854        722148        724664
BM_stop_token_single_thread_polling_stop_requested/131072                  +0.0002         +0.0002       1443620       1443862       1443435       1443679
BM_stop_token_single_thread_polling_stop_requested/262144                  +0.0103         +0.0102       2917277       2947247       2916906       2946669
BM_stop_token_single_thread_polling_stop_requested/524288                  +0.0061         +0.0060       5855807       5891619       5854835       5889992
BM_stop_token_single_thread_polling_stop_requested/1048576                 -0.0025         -0.0025      11563476      11534010      11561350      11531950
BM_stop_token_single_thread_polling_stop_requested/2097152                 -0.0103         -0.0107      23339057      23097783      23333967      23083900
BM_stop_token_single_thread_polling_stop_requested/4194304                 -0.0188         -0.0188      46495239      45621008      46490067      45615533
BM_stop_token_single_thread_polling_stop_requested/8388608                 -0.0126         -0.0126      93130738      91959323      93117857      91944750
BM_stop_token_single_thread_polling_stop_requested/16777216                -0.0023         -0.0023     186845010     186413594     186816250     186384250
BM_stop_token_multi_thread_polling_stop_requested/1024                     -0.0177         -0.0488       9357993       9192674          8611          8191
BM_stop_token_multi_thread_polling_stop_requested/2048                     -0.2016         +0.4240       8403250       6708806          9291         13230
BM_stop_token_multi_thread_polling_stop_requested/4096                     +0.0566         -0.2998       9490963      10027911         11316          7924
BM_stop_token_multi_thread_polling_stop_requested/8192                     -0.0758         +0.7459       9909932       9158930          8588         14994
BM_stop_token_multi_thread_polling_stop_requested/16384                    +0.0868         -0.7077       9220371      10020579         27158          7939
BM_stop_token_multi_thread_polling_stop_requested/32768                    +0.0005         -0.0020      10029109      10034418          7830          7814
BM_stop_token_multi_thread_polling_stop_requested/65536                    -0.0041         +0.9394      10026671       9985758          7900         15321
BM_stop_token_multi_thread_polling_stop_requested/131072                   +0.0068         -0.6114       9966698      10034179         20278          7881
BM_stop_token_multi_thread_polling_stop_requested/262144                   -0.0008         +0.1381      10036839      10028716          8390          9549
BM_stop_token_multi_thread_polling_stop_requested/524288                   +0.0033         -0.0085      12172003      12211579          9485          9404
BM_stop_token_multi_thread_polling_stop_requested/1048576                  +0.0039         +0.3853      20347035      20426881         13606         18848
BM_stop_token_multi_thread_polling_stop_requested/2097152                  -0.0248         +0.8056      36933272      36018430         20830         37610
BM_stop_token_multi_thread_polling_stop_requested/4194304                  +0.0281         -0.2110      65121058      66953304         88150         69550
BM_stop_token_multi_thread_polling_stop_requested/8388608                  -0.0200         -0.0926     129479845     126884548        183410        166420
BM_stop_token_multi_thread_polling_stop_requested/16777216                 +0.0102         -0.4571     250539048     253101935        281830        153000
BM_stop_token_single_thread_reg_unreg_callback/1024                        -0.4949         -0.4950        226302        114305        226275        114269
BM_stop_token_single_thread_reg_unreg_callback/2048                        -0.4907         -0.4907        450184        229296        450132        229255
BM_stop_token_single_thread_reg_unreg_callback/4096                        -0.4928         -0.4929        902801        457891        902739        457796
BM_stop_token_single_thread_reg_unreg_callback/8192                        -0.4940         -0.4940       1808106        914974       1807909        914832
BM_stop_token_single_thread_reg_unreg_callback/16384                       -0.4912         -0.4912       3601722       1832608       3601052       1832372
BM_stop_token_single_thread_reg_unreg_callback/32768                       -0.4930         -0.4929       7260665       3681317       7259458       3680917
BM_stop_token_single_thread_reg_unreg_callback/65536                       -0.4925         -0.4925      14447661       7331859      14445979       7331137
BM_stop_token_single_thread_reg_unreg_callback/131072                      -0.4898         -0.4898      28842642      14716291      28838542      14713979
BM_stop_token_single_thread_reg_unreg_callback/262144                      -0.4915         -0.4916      57749441      29363903      57742917      29358833
BM_stop_token_single_thread_reg_unreg_callback/524288                      -0.4928         -0.4928     115710729      58691028     115697167      58683167
BM_stop_token_single_thread_reg_unreg_callback/1048576                     -0.4912         -0.4912     230488028     117281687     230457000     117266667
BM_stop_token_single_thread_reg_unreg_callback/2097152                     -0.4949         -0.4949     464038104     234392180     463979500     234336667
BM_stop_token_single_thread_reg_unreg_callback/4194304                     -0.4907         -0.4906     925273292     471281145     925071000     471233000
BM_stop_token_single_thread_reg_unreg_callback/8388608                     -0.4950         -0.4950    1865684416     942252250    1865576000     942148000
BM_stop_token_single_thread_reg_unreg_callback/16777216                    -0.4893         -0.4893    3691127792    1885143750    3690851000    1884935000
BM_stop_token_async_reg_unreg_callback/1024                                -0.9329         +0.5698       8354276        560804        115726        181670
BM_stop_token_async_reg_unreg_callback/2048                                -0.8821         +2.1070       9335119       1100545        110450        343163
BM_stop_token_async_reg_unreg_callback/4096                                -0.7802         +6.2397      10008783       2199476        101531        735051
BM_stop_token_async_reg_unreg_callback/8192                                -0.5992        +14.0063      10806708       4330797         89393       1341461
BM_stop_token_async_reg_unreg_callback/16384                               -0.3230        +25.6680      18673120      12642008        201863       5383290
BM_stop_token_async_reg_unreg_callback/32768                               -0.0126        +28.0929      29725453      29350143        341910       9947143
BM_stop_token_async_reg_unreg_callback/65536                               +0.2886        +24.8093      49578377      63888916        706060      18222933
BM_stop_token_async_reg_unreg_callback/131072                              +0.0088        +21.5273      94271864      95104150       1241720      27972591
BM_stop_token_async_reg_unreg_callback/262144                              -0.1699        +19.4190     183031090     151936208       2418360      49380412
BM_stop_token_async_reg_unreg_callback/524288                              -0.4062        +20.9616     368969808     219097508       3596100      78976000
BM_stop_token_async_reg_unreg_callback/1048576                             -0.2458        +15.0834     735211542     554506694      10945000     176032667
BM_stop_token_async_reg_unreg_callback/2097152                             -0.5581        +11.5708    1422213758     628487667      17395300     218672000
BM_stop_token_async_reg_unreg_callback/4194304                             -0.4473         +9.9163    2940221000    1625006979      39179000     427690000
BM_stop_token_async_reg_unreg_callback/8388608                             -0.4736        +12.5049    5805061209    3055751291      66142000     893240000
BM_stop_token_async_reg_unreg_callback/16777216                            -0.4556        +15.6139   11251438833    6125025541     110539000    1836481000
OVERALL_GEOMEAN                                                            -0.2997         +0.5803             0             0             0             0
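
In the tables above, the Time and CPU columns diverge widely for the multi-threaded benchmarks and can even move in opposite directions. The tables do not state how each column is measured; assuming Time is wall-clock time and CPU is processor time charged to the measuring side, a thread that mostly waits accumulates wall time but very little CPU time. The sketch below illustrates that difference with a hypothetical helper (`time_a_sleep` and `timings` are illustration-only names); note that `std::clock` reports process-wide processor time on POSIX systems, which is only an approximation of what a benchmark harness records.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Illustration-only type: elapsed wall-clock time and processor time, in milliseconds.
struct timings {
  double wall_ms;
  double cpu_ms;
};

// Hypothetical helper, not part of the patch: time a sleep with both clocks.
// The sleeping thread is descheduled, so wall time grows while CPU time barely moves.
inline timings time_a_sleep(std::chrono::milliseconds duration) {
  const auto wall_start        = std::chrono::steady_clock::now();
  const std::clock_t cpu_start = std::clock();

  std::this_thread::sleep_for(duration);

  const std::clock_t cpu_end = std::clock();
  const auto wall_end        = std::chrono::steady_clock::now();

  return {std::chrono::duration<double, std::milli>(wall_end - wall_start).count(),
          1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC};
}
```

This is why a change that makes waiting threads block instead of spin can shift the two columns differently; whether that explains the numbers above is not confirmed by this PR.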

@huixie90 huixie90 requested a review from a team as a code owner October 19, 2023 12:41
@llvmbot llvmbot added the libc++ label Oct 19, 2023
@llvmbot
Member

llvmbot commented Oct 19, 2023

@llvm/pr-subscribers-libcxx

Author: Hui (huixie90)

Changes
  • [libc++][test] add more benchmarks for stop_token
  • [libc++] use mutex in the stop_token
  • remove unused

Full diff: https://github.com/llvm/llvm-project/pull/69600.diff

2 Files Affected:

  • (modified) libcxx/benchmarks/stop_token.bench.cpp (+78-3)
  • (modified) libcxx/include/__stop_token/stop_state.h (+60-46)
diff --git a/libcxx/benchmarks/stop_token.bench.cpp b/libcxx/benchmarks/stop_token.bench.cpp
index 293d55ed82a08cf..e059a1166af16bd 100644
--- a/libcxx/benchmarks/stop_token.bench.cpp
+++ b/libcxx/benchmarks/stop_token.bench.cpp
@@ -14,6 +14,81 @@
 
 using namespace std::chrono_literals;
 
+// We have a single thread created by std::jthread consuming the stop_token:
+// polling for stop_requested.
+void BM_stop_token_single_thread_polling_stop_requested(benchmark::State& state) {
+  auto thread_func = [&](std::stop_token st, std::atomic<std::uint64_t>* loop_count) {
+    while (!st.stop_requested()) {
+      // doing some work
+      loop_count->fetch_add(1, std::memory_order_relaxed);
+    }
+  };
+
+  std::atomic<std::uint64_t> loop_count(0);
+  std::uint64_t total_loop_test_param = state.range(0);
+
+  auto thread = support::make_test_jthread(thread_func, &loop_count);
+
+  for (auto _ : state) {
+    auto start_total = loop_count.load(std::memory_order_relaxed);
+
+    while (loop_count.load(std::memory_order_relaxed) - start_total < total_loop_test_param) {
+      std::this_thread::yield();
+    }
+  }
+}
+
+BENCHMARK(BM_stop_token_single_thread_polling_stop_requested)->RangeMultiplier(2)->Range(1 << 10, 1 << 24);
+
+// We have multiple threads polling for stop_requested of the same stop_token.
+void BM_stop_token_multi_thread_polling_stop_requested(benchmark::State& state) {
+  std::atomic<bool> start{false};
+
+  auto thread_func = [&start](std::atomic<std::uint64_t>* loop_count, std::stop_token st) {
+    start.wait(false);
+    while (!st.stop_requested()) {
+      // doing some work
+      loop_count->fetch_add(1, std::memory_order_relaxed);
+    }
+  };
+
+  constexpr size_t thread_count = 20;
+
+  std::uint64_t total_loop_test_param = state.range(0);
+
+  std::vector<std::atomic<std::uint64_t>> loop_counts(thread_count);
+  std::stop_source ss;
+  std::vector<std::jthread> threads;
+  threads.reserve(thread_count);
+
+  for (size_t i = 0; i < thread_count; ++i) {
+    threads.emplace_back(support::make_test_jthread(thread_func, &loop_counts[i], ss.get_token()));
+  }
+
+  auto get_total_loop = [&loop_counts] {
+    std::uint64_t total = 0;
+    for (const auto& loop_count : loop_counts) {
+      total += loop_count.load(std::memory_order_relaxed);
+    }
+    return total;
+  };
+
+  start = true;
+  start.notify_all();
+
+  for (auto _ : state) {
+    auto start_total = get_total_loop();
+
+    while (get_total_loop() - start_total < total_loop_test_param) {
+      std::this_thread::yield();
+    }
+  }
+
+  ss.request_stop();
+}
+
+BENCHMARK(BM_stop_token_multi_thread_polling_stop_requested)->RangeMultiplier(2)->Range(1 << 10, 1 << 24);
+
 // We have a single thread created by std::jthread consuming the stop_token:
 // registering/deregistering callbacks, one at a time.
 void BM_stop_token_single_thread_reg_unreg_callback(benchmark::State& state) {
@@ -59,11 +134,11 @@ void BM_stop_token_async_reg_unreg_callback(benchmark::State& state) {
   std::atomic<bool> start{false};
 
   std::uint64_t total_reg_test_param = state.range(0);
+  std::vector<std::atomic<std::uint64_t>> reg_counts(thread_count);
 
   std::stop_source ss;
   std::vector<std::jthread> threads;
   threads.reserve(thread_count);
-  std::vector<std::atomic<std::uint64_t>> reg_counts(thread_count);
 
   auto thread_func = [&start](std::atomic<std::uint64_t>* count, std::stop_token st) {
     std::vector<std::optional<std::stop_callback<dummy_stop_callback>>> cbs(concurrent_request_count);
@@ -84,8 +159,8 @@ void BM_stop_token_async_reg_unreg_callback(benchmark::State& state) {
 
   auto get_total_reg = [&] {
     std::uint64_t total = 0;
-    for (const auto& reg_counts : reg_counts) {
-      total += reg_counts.load(std::memory_order_relaxed);
+    for (const auto& reg_count : reg_counts) {
+      total += reg_count.load(std::memory_order_relaxed);
     }
     return total;
   };
diff --git a/libcxx/include/__stop_token/stop_state.h b/libcxx/include/__stop_token/stop_state.h
index 462aa73952b84f9..f3fca6554b378a7 100644
--- a/libcxx/include/__stop_token/stop_state.h
+++ b/libcxx/include/__stop_token/stop_state.h
@@ -12,7 +12,7 @@
 
 #include <__availability>
 #include <__config>
-#include <__stop_token/atomic_unique_lock.h>
+#include <__mutex/mutex.h>
 #include <__stop_token/intrusive_list_view.h>
 #include <__thread/id.h>
 #include <atomic>
@@ -37,10 +37,51 @@ struct __stop_callback_base : __intrusive_node_base<__stop_callback_base> {
   bool* __destroyed_        = nullptr;
 };
 
+// stop_token needs to lock with noexcept. mutex::lock can throw.
+// wrap it with a while loop and catch all exceptions
+class __nothrow_mutex_lock {
+  std::mutex& __mutex_;
+  bool __is_locked_;
+
+public:
+  _LIBCPP_HIDE_FROM_ABI explicit __nothrow_mutex_lock(std::mutex& __mutex) noexcept
+      : __mutex_(__mutex), __is_locked_(true) {
+    __lock();
+  }
+
+  __nothrow_mutex_lock(const __nothrow_mutex_lock&)            = delete;
+  __nothrow_mutex_lock(__nothrow_mutex_lock&&)                 = delete;
+  __nothrow_mutex_lock& operator=(const __nothrow_mutex_lock&) = delete;
+  __nothrow_mutex_lock& operator=(__nothrow_mutex_lock&&)      = delete;
+
+  _LIBCPP_HIDE_FROM_ABI ~__nothrow_mutex_lock() {
+    if (__is_locked_) {
+      __unlock();
+    }
+  }
+
+  _LIBCPP_HIDE_FROM_ABI bool __owns_lock() const noexcept { return __is_locked_; }
+
+  _LIBCPP_HIDE_FROM_ABI void __lock() noexcept {
+    while (true) {
+      try {
+        __mutex_.lock();
+        break;
+      } catch (...) {
+      }
+    }
+    __is_locked_ = true;
+  }
+
+  _LIBCPP_HIDE_FROM_ABI void __unlock() noexcept {
+    __mutex_.unlock(); // throws nothing
+    __is_locked_ = false;
+  }
+};
+
 class __stop_state {
   static constexpr uint32_t __stop_requested_bit        = 1;
-  static constexpr uint32_t __callback_list_locked_bit  = 1 << 1;
-  static constexpr uint32_t __stop_source_counter_shift = 2;
+  static constexpr uint32_t __stop_source_counter_shift = 1;
 
   // The "stop_source counter" is not used for lifetime reference counting.
   // When the number of stop_source reaches 0, the remaining stop_tokens's
@@ -49,9 +90,10 @@ class __stop_state {
   // The "callback list locked" bit implements the atomic_unique_lock to
   // guard the operations on the callback list
   //
-  //       31 - 2          |  1                   |    0           |
-  //  stop_source counter  | callback list locked | stop_requested |
+  //       31 - 1          |    0           |
+  //  stop_source counter  | stop_requested |
   atomic<uint32_t> __state_ = 0;
+  std::mutex __mutex_;
 
   // Reference count for stop_token + stop_callback + stop_source
   // When the counter reaches zero, the state is destroyed
@@ -59,7 +101,7 @@ class __stop_state {
   atomic<uint32_t> __ref_count_ = 0;
 
   using __state_t            = uint32_t;
-  using __callback_list_lock = __atomic_unique_lock<__state_t, __callback_list_locked_bit>;
+  using __callback_list_lock = __nothrow_mutex_lock;
   using __callback_list      = __intrusive_list_view<__stop_callback_base>;
 
   __callback_list __callback_list_;
@@ -101,8 +143,9 @@ class __stop_state {
   }
 
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __request_stop() noexcept {
-    auto __cb_list_lock = __try_lock_for_request_stop();
-    if (!__cb_list_lock.__owns_lock()) {
+    __callback_list_lock __cb_list_lock(__mutex_);
+    auto __old = __state_.fetch_or(__stop_requested_bit, std::memory_order_release);
+    if ((__old & __stop_requested_bit) == __stop_requested_bit) {
       return false;
     }
     __requesting_thread_ = this_thread::get_id();
@@ -138,20 +181,15 @@ class __stop_state {
   }
 
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __add_callback(__stop_callback_base* __cb) noexcept {
-    // If it is already stop_requested. Do not try to request it again.
-    const auto __give_up_trying_to_lock_condition = [__cb](__state_t __state) {
-      if ((__state & __stop_requested_bit) != 0) {
-        // already stop requested, synchronously run the callback and no need to lock the list again
-        __cb->__invoke();
-        return true;
-      }
-      // no stop source. no need to lock the list to add the callback as it can never be invoked
-      return (__state >> __stop_source_counter_shift) == 0;
-    };
-
-    __callback_list_lock __cb_list_lock(__state_, __give_up_trying_to_lock_condition);
+    __callback_list_lock __cb_list_lock(__mutex_);
+    auto __state = __state_.load(std::memory_order_acquire);
+    if ((__state & __stop_requested_bit) != 0) {
+      // already stop requested, synchronously run the callback and no need to lock the list again
+      __cb->__invoke();
+      return false;
+    }
 
-    if (!__cb_list_lock.__owns_lock()) {
+    if ((__state >> __stop_source_counter_shift) == 0) {
       return false;
     }
 
@@ -165,7 +203,7 @@ class __stop_state {
 
   // called by the destructor of stop_callback
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void __remove_callback(__stop_callback_base* __cb) noexcept {
-    __callback_list_lock __cb_list_lock(__state_);
+    __callback_list_lock __cb_list_lock(__mutex_);
 
     // under below condition, the request_stop call just popped __cb from the list and could execute it now
     bool __potentially_executing_now = __cb->__prev_ == nullptr && !__callback_list_.__is_head(__cb);
@@ -191,30 +229,6 @@ class __stop_state {
     }
   }
 
-private:
-  _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI __callback_list_lock __try_lock_for_request_stop() noexcept {
-    // If it is already stop_requested, do not try to request stop or lock the list again.
-    const auto __lock_fail_condition = [](__state_t __state) { return (__state & __stop_requested_bit) != 0; };
-
-    // set locked and requested bit at the same time
-    const auto __after_lock_state = [](__state_t __state) {
-      return __state | __callback_list_locked_bit | __stop_requested_bit;
-    };
-
-    // acq because [thread.stoptoken.intro] Registration of a callback synchronizes with the invocation of that
-    //     callback. We are going to invoke the callback after getting the lock, acquire so that we can see the
-    //     registration of a callback (and other writes that happens-before the add_callback)
-    //     Note: the rel (unlock) in the add_callback syncs with this acq
-    // rel because [thread.stoptoken.intro] A call to request_stop that returns true synchronizes with a call
-    //     to stop_requested on an associated stop_token or stop_source object that returns true.
-    //     We need to make sure that all writes (including user code) before request_stop will be made visible
-    //     to the threads that waiting for `stop_requested == true`
-    //     Note: this rel syncs with the acq in `stop_requested`
-    const auto __locked_ordering = std::memory_order_acq_rel;
-
-    return __callback_list_lock(__state_, __lock_fail_condition, __after_lock_state, __locked_ordering);
-  }
-
   template <class _Tp>
   friend struct __intrusive_shared_ptr_traits;
 };

@huixie90 huixie90 force-pushed the stop_token_mutex_lock branch from 08fe05a to 5860b7a Compare December 2, 2023 16:20
@ldionne
Member

ldionne commented Jul 31, 2024

Closing since we decided not to do this.

@ldionne ldionne closed this Jul 31, 2024
Labels
libc++, performance
4 participants