Add blocking #32

polytypic · 2023-02-28T11:42:45Z

This PR adds blocking support to kcas and basically turns kcas into a proper software transactional memory (STM) implementation.

For blocking, this PR introduces an experimental domain local await mechanism. It is a first-order interface inspired by the proposed "rendezvous" mechanism in a lockfree PR, with support for cancellation. Ultimately, some sort of unified interface for suspend functionality should be provided by some other library, possibly the Stdlib.

For the blocking support this PR also introduces a dedicated Retry.Later exception for retrying transactions and higher-order single location updates. Previously the Stdlib.Exit exception was used for that purpose, but I feel that having a dedicated exception makes the intent clearer and frees Stdlib.Exit for other purposes such as aborting a transaction.
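The role of a dedicated retry exception can be illustrated with a minimal, self-contained sketch. This is a toy model, not the kcas implementation: the top-level `Later`, `later`, and `unless` only mirror the names used in this PR, and the `run` driver with its `~wait` parameter is hypothetical. The point it shows is that `Later` is caught and leads to a retry, while `Exit` passes through and can be used to abort.

```ocaml
(* Toy model, not the kcas implementation: a dedicated exception asks for a
   retry, while any other exception (such as Exit) aborts the operation. *)
exception Later

let later () = raise Later
let unless condition = if not condition then later ()

(* [run ~wait body] is a hypothetical driver.  It re-runs [body] whenever it
   raises [Later], calling [wait] in between.  In a real implementation
   [wait] would suspend until an examined location changes. *)
let rec run ~wait body =
  match body () with
  | result -> result
  | exception Later ->
      wait ();
      run ~wait body
```

Because only `Later` is intercepted, a body that raises `Exit` leaves `run` immediately with that exception.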

This PR adds a Promise module based on the Promise module of Eio to the kcas_data package. I also drafted some other Eio style primitives, but decided to move the development of those to another PR.

Perhaps a good place to start review is to read the couple of sections added to the README:

The blocking support works by adding a list of awaiters to locations. When a location is modified, the awaiters are resumed. This adds a bit of overhead to all operations, as locations take an extra word of memory and those words also need to be accessed on every write operation. Based on the benchmarks, the overhead results in the previous implementation being roughly 1.05x faster (in some cases the overhead is less and in some cases more). After a blocking operation is resumed, the implementation eagerly removes any awaiters it attached to locations to avoid space leaks.
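As a rough illustration of the idea, here is a single-threaded toy model of a location that carries a list of awaiters. It is a sketch under simplifying assumptions (plain mutable fields instead of atomics, callbacks instead of suspended fibers), and the names `make`, `add_awaiter`, `remove_awaiter`, and `set` are hypothetical, not the library's internals.

```ocaml
(* Toy, single-threaded model; the real implementation uses atomics. *)
type awaiter = unit -> unit

(* The extra [awaiters] field is the "extra word" each location carries. *)
type 'a loc = { mutable value : 'a; mutable awaiters : awaiter list }

let make value = { value; awaiters = [] }

let add_awaiter loc awaiter = loc.awaiters <- awaiter :: loc.awaiters

(* Eagerly detach an awaiter so that it does not leak once it is no longer
   needed.  Awaiters are compared by physical identity. *)
let remove_awaiter loc awaiter =
  loc.awaiters <- List.filter (fun a -> a != awaiter) loc.awaiters

(* Every write has to look at the awaiters and resume them. *)
let set loc value =
  let awaiters = loc.awaiters in
  loc.value <- value;
  loc.awaiters <- [];
  List.iter (fun resume -> resume ()) awaiters
```

Even in this toy model every `set` reads and clears the awaiter list, which is where the small constant overhead on writes comes from.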

One internal change made in this PR is that uses of Atomic.get and Atomic.set that do not need fences for correctness now go through the functions fenceless_get and fenceless_set, which are, for now, just aliases for Atomic.get and Atomic.set. I have another PR #46 that changes those to actually use fenceless operations and introduces some additional optimizations.

This PR also changes the description of the library to reflect the new capabilities by calling kcas "Software transactional memory based on lock-free multi-word compare-and-set". I believe multi-word compare-and-set is less familiar technical jargon to potential users.

polytypic · 2023-04-20T07:56:41Z

kcas-this is with blocking support and kcas-main is without blocking support. As expected, blocking support adds some overhead.

Benchmark 1: kcas-main/_build/default/test/benchmark.exe 1 10000
  Time (mean ± σ):       3.5 ms ±   0.0 ms    [User: 2.6 ms, System: 0.6 ms]
  Range (min … max):     3.5 ms …   4.0 ms    826 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 1 10000
  Time (mean ± σ):       3.6 ms ±   0.0 ms    [User: 2.7 ms, System: 0.6 ms]
  Range (min … max):     3.5 ms …   3.9 ms    822 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 1 10000' ran
    1.01 ± 0.01 times faster than 'kcas-this/_build/default/test/benchmark.exe 1 10000'


Benchmark 1: kcas-main/_build/default/test/benchmark.exe 2 10000
  Time (mean ± σ):       7.7 ms ±   0.0 ms    [User: 6.8 ms, System: 0.6 ms]
  Range (min … max):     7.7 ms …   8.0 ms    386 runs
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 2 10000
  Time (mean ± σ):       8.0 ms ±   0.0 ms    [User: 7.1 ms, System: 0.6 ms]
  Range (min … max):     8.0 ms …   8.3 ms    372 runs
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 2 10000' ran
    1.04 ± 0.01 times faster than 'kcas-this/_build/default/test/benchmark.exe 2 10000'


Benchmark 1: kcas-main/_build/default/test/benchmark.exe 4 10000
  Time (mean ± σ):      12.9 ms ±   0.1 ms    [User: 11.9 ms, System: 0.7 ms]
  Range (min … max):    12.8 ms …  13.2 ms    230 runs
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 4 10000
  Time (mean ± σ):      13.4 ms ±   0.1 ms    [User: 12.5 ms, System: 0.7 ms]
  Range (min … max):    13.3 ms …  13.8 ms    223 runs
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 4 10000' ran
    1.04 ± 0.01 times faster than 'kcas-this/_build/default/test/benchmark.exe 4 10000'


Benchmark 1: kcas-main/_build/default/test/xt_benchmark.exe 1 10000
  Time (mean ± σ):       7.0 ms ±   0.1 ms    [User: 6.1 ms, System: 0.6 ms]
  Range (min … max):     6.8 ms …   7.3 ms    422 runs
 
Benchmark 2: kcas-this/_build/default/test/xt_benchmark.exe 1 10000
  Time (mean ± σ):       7.3 ms ±   0.1 ms    [User: 6.4 ms, System: 0.7 ms]
  Range (min … max):     7.2 ms …   7.9 ms    407 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  'kcas-main/_build/default/test/xt_benchmark.exe 1 10000' ran
    1.05 ± 0.01 times faster than 'kcas-this/_build/default/test/xt_benchmark.exe 1 10000'


Benchmark 1: kcas-main/_build/default/test/xt_benchmark.exe 2 10000
  Time (mean ± σ):      12.7 ms ±   0.1 ms    [User: 11.7 ms, System: 0.7 ms]
  Range (min … max):    12.5 ms …  13.2 ms    237 runs
 
Benchmark 2: kcas-this/_build/default/test/xt_benchmark.exe 2 10000
  Time (mean ± σ):      13.0 ms ±   0.1 ms    [User: 12.0 ms, System: 0.7 ms]
  Range (min … max):    12.8 ms …  13.3 ms    227 runs
 
Summary
  'kcas-main/_build/default/test/xt_benchmark.exe 2 10000' ran
    1.02 ± 0.01 times faster than 'kcas-this/_build/default/test/xt_benchmark.exe 2 10000'


Benchmark 1: kcas-main/_build/default/test/xt_benchmark.exe 4 10000
  Time (mean ± σ):      22.7 ms ±   0.2 ms    [User: 21.7 ms, System: 0.7 ms]
  Range (min … max):    22.4 ms …  23.7 ms    133 runs
 
Benchmark 2: kcas-this/_build/default/test/xt_benchmark.exe 4 10000
  Time (mean ± σ):      23.4 ms ±   0.4 ms    [User: 22.3 ms, System: 0.8 ms]
  Range (min … max):    23.0 ms …  25.5 ms    129 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  'kcas-main/_build/default/test/xt_benchmark.exe 4 10000' ran
    1.03 ± 0.02 times faster than 'kcas-this/_build/default/test/xt_benchmark.exe 4 10000'


Benchmark 1: kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 100000
  Time (mean ± σ):      20.9 ms ±   1.9 ms    [User: 35.0 ms, System: 1.6 ms]
  Range (min … max):    19.1 ms …  31.5 ms    139 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 100000
  Time (mean ± σ):      20.7 ms ±   1.3 ms    [User: 34.1 ms, System: 1.6 ms]
  Range (min … max):    19.6 ms …  29.5 ms    151 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  'kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 100000' ran
    1.01 ± 0.11 times faster than 'kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 100000'


Benchmark 1: kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 200000
  Time (mean ± σ):      39.6 ms ±   2.1 ms    [User: 68.7 ms, System: 2.2 ms]
  Range (min … max):    36.0 ms …  41.4 ms    73 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 200000
  Time (mean ± σ):      38.8 ms ±   2.0 ms    [User: 66.3 ms, System: 2.2 ms]
  Range (min … max):    37.2 ms …  42.7 ms    70 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (42.3 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.
 
Summary
  'kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 200000' ran
    1.02 ± 0.08 times faster than 'kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 200000'


Benchmark 1: kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 400000
  Time (mean ± σ):      76.1 ms ±   4.9 ms    [User: 134.8 ms, System: 3.1 ms]
  Range (min … max):    69.8 ms …  85.3 ms    42 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 400000
  Time (mean ± σ):      76.7 ms ±   5.2 ms    [User: 133.9 ms, System: 3.3 ms]
  Range (min … max):    72.4 ms …  91.4 ms    34 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (86.5 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.
 
Summary
  'kcas-main/_build/default/test/xt_parallel_cmp_bench.exe 400000' ran
    1.01 ± 0.09 times faster than 'kcas-this/_build/default/test/xt_parallel_cmp_bench.exe 400000'


Benchmark 1: kcas-main/_build/default/test/benchmark.exe 1 200000
  Time (mean ± σ):      35.3 ms ±   0.3 ms    [User: 34.3 ms, System: 0.7 ms]
  Range (min … max):    34.5 ms …  35.8 ms    85 runs
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 1 200000
  Time (mean ± σ):      36.5 ms ±   0.2 ms    [User: 35.5 ms, System: 0.7 ms]
  Range (min … max):    36.0 ms …  36.8 ms    82 runs
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 1 200000' ran
    1.03 ± 0.01 times faster than 'kcas-this/_build/default/test/benchmark.exe 1 200000'


Benchmark 1: kcas-main/_build/default/test/benchmark.exe 2 200000
  Time (mean ± σ):     120.9 ms ±   0.3 ms    [User: 119.6 ms, System: 1.0 ms]
  Range (min … max):   120.5 ms … 121.6 ms    24 runs
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 2 200000
  Time (mean ± σ):     127.1 ms ±   0.1 ms    [User: 125.7 ms, System: 1.0 ms]
  Range (min … max):   126.8 ms … 127.3 ms    23 runs
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 2 200000' ran
    1.05 ± 0.00 times faster than 'kcas-this/_build/default/test/benchmark.exe 2 200000'


Benchmark 1: kcas-main/_build/default/test/benchmark.exe 4 200000
  Time (mean ± σ):     224.9 ms ±   0.3 ms    [User: 223.4 ms, System: 1.2 ms]
  Range (min … max):   224.5 ms … 225.5 ms    13 runs
 
Benchmark 2: kcas-this/_build/default/test/benchmark.exe 4 200000
  Time (mean ± σ):     235.3 ms ±   0.3 ms    [User: 233.8 ms, System: 1.2 ms]
  Range (min … max):   234.7 ms … 235.8 ms    12 runs
 
Summary
  'kcas-main/_build/default/test/benchmark.exe 4 200000' ran
    1.05 ± 0.00 times faster than 'kcas-this/_build/default/test/benchmark.exe 4 200000'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90
  Time (mean ± σ):      43.2 ms ±   0.1 ms    [User: 42.0 ms, System: 0.9 ms]
  Range (min … max):    42.9 ms …  43.6 ms    68 runs
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90
  Time (mean ± σ):      44.1 ms ±   0.2 ms    [User: 42.9 ms, System: 0.9 ms]
  Range (min … max):    43.9 ms …  44.7 ms    67 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90' ran
    1.02 ± 0.00 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90
  Time (mean ± σ):      27.3 ms ±   0.2 ms    [User: 50.3 ms, System: 1.2 ms]
  Range (min … max):    27.1 ms …  29.0 ms    108 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90
  Time (mean ± σ):      29.5 ms ±   0.2 ms    [User: 54.7 ms, System: 1.3 ms]
  Range (min … max):    29.3 ms …  30.5 ms    100 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90' ran
    1.08 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90
  Time (mean ± σ):      17.3 ms ±   0.3 ms    [User: 57.1 ms, System: 2.0 ms]
  Range (min … max):    16.7 ms …  18.8 ms    159 runs
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90
  Time (mean ± σ):      17.5 ms ±   0.2 ms    [User: 58.0 ms, System: 2.1 ms]
  Range (min … max):    17.1 ms …  18.6 ms    174 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90' ran
    1.01 ± 0.02 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10
  Time (mean ± σ):     255.8 ms ±   0.3 ms    [User: 254.3 ms, System: 1.2 ms]
  Range (min … max):   255.4 ms … 256.3 ms    11 runs
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10
  Time (mean ± σ):     261.1 ms ±   0.3 ms    [User: 259.5 ms, System: 1.2 ms]
  Range (min … max):   260.7 ms … 261.6 ms    11 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10' ran
    1.02 ± 0.00 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10
  Time (mean ± σ):     157.1 ms ±   0.2 ms    [User: 308.9 ms, System: 1.6 ms]
  Range (min … max):   156.8 ms … 157.4 ms    19 runs
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10
  Time (mean ± σ):     177.3 ms ±   0.2 ms    [User: 349.0 ms, System: 1.7 ms]
  Range (min … max):   176.9 ms … 177.5 ms    16 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10' ran
    1.13 ± 0.00 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10'


Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10
  Time (mean ± σ):      90.0 ms ±   0.6 ms    [User: 344.9 ms, System: 2.7 ms]
  Range (min … max):    88.9 ms …  91.2 ms    33 runs
 
Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10
  Time (mean ± σ):      91.4 ms ±   0.4 ms    [User: 350.6 ms, System: 2.8 ms]
  Range (min … max):    90.5 ms …  92.3 ms    32 runs
 
Summary
  'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10' ran
    1.02 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10'

polytypic · 2023-04-20T08:06:22Z

README.md

@@ -1504,7 +1602,7 @@ experiment where we abort the transaction in case we observe that the values of

 ```ocaml
 # with_updater @@ fun () ->
-    for _ = 1 to 10_000 do
+    for _ = 1 to 100_000 do


This is inherently non-deterministic and I noticed that it occasionally failed on CI. I haven't noticed failures after increasing the number of attempts, but it can of course still happen.

polytypic · 2023-04-20T08:09:47Z

src/kcas.ml

+*)
+(**)
+let fenceless_get = Atomic.get
+let fenceless_set = Atomic.set


I have another PR #46 that changes these aliases to actually perform fenceless operations. Having these as fenceless should be safe (because the fences would be redundant), seems to significantly improve performance on ARM (Apple M1), and is also completely internal to the library, so I personally feel that we should just make the optimization.

polytypic · 2023-04-20T08:11:31Z

src/kcas.ml

+      state : 'a state;
+      lt : cass;
+      gt : cass;
+      mutable awaiters : awaiter list;


This is mutable so that during the determine phase the awaiters can be efficiently copied from the updated locations to be resumed during the release phase.

polytypic · 2023-04-20T08:14:57Z

src/kcas.ml

+                awaiter
+            then add_awaiters awaiter casn gt
+            else stop
+        | CASN _ as stop -> stop)


add_awaiters signals that some location had already been updated by returning the CASN _ descriptor of that location. This way the subsequent call of remove_awaiters (to prevent space leaks) can stop early at that same descriptor.
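The early-stop idea can be sketched on a plain list of locations. This is only an illustration: the actual code walks a tree of CASN descriptors (the `lt` and `gt` fields above), whereas the toy `loc` record, the integer awaiter ids, and the `option` result below are assumptions made to keep the example small.

```ocaml
(* Toy model: [changed] stands for "this location was already updated". *)
type loc = { mutable changed : bool; mutable awaiters : int list }

(* Attach [awaiter] to locations in order.  Stop at the first location that
   has already changed and return it; [None] means all were attached. *)
let rec add_awaiters awaiter = function
  | [] -> None
  | loc :: rest ->
      if loc.changed then Some loc
      else begin
        loc.awaiters <- awaiter :: loc.awaiters;
        add_awaiters awaiter rest
      end

(* Detach [awaiter] in the same order, stopping at [stop], because nothing
   was attached at or after that point. *)
let rec remove_awaiters awaiter stop = function
  | [] -> ()
  | loc :: rest ->
      let at_stop = match stop with Some s -> s == loc | None -> false in
      if not at_stop then begin
        loc.awaiters <- List.filter (fun a -> a <> awaiter) loc.awaiters;
        remove_awaiters awaiter stop rest
      end
```

Returning the stop point from `add_awaiters` means the cleanup pass never visits locations that were never touched.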

polytypic · 2023-04-20T08:18:58Z

src/kcas_data/promise.ml

@@ -0,0 +1,57 @@
+open Kcas
+
+type 'a internal = 'a Magic_option.t Loc.t


The Magic_option avoids a level of indirection. I decided to go with this optimization to reduce space usage and because promise is already using some other OCaml magic (as does the Eio Promise implementation).
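For readers unfamiliar with the trick, the following is one way such an unboxed option could be written. How the real `Magic_option` is implemented is not shown in this thread, so the module below is a hypothetical sketch: it represents `none` as a unique heap block and `some x` as `x` itself, which saves the allocation and indirection of `Some x`.

```ocaml
(* Hypothetical sketch of an option without the [Some] box.  It relies on
   [Obj] and is therefore unsafe outside of its abstract interface. *)
module Magic_option : sig
  type 'a t

  val none : 'a t
  val some : 'a -> 'a t
  val is_none : 'a t -> bool
  val get_exn : 'a t -> 'a
end = struct
  type 'a t = Obj.t

  (* A freshly allocated block that no user value can be physically equal
     to (unless [none] itself is wrapped with [some], a known caveat). *)
  let none = Obj.repr (ref ())
  let some x = Obj.repr x
  let is_none x = x == none
  let get_exn x = if is_none x then raise Not_found else Obj.obj x
end
```

The ordinary `'a option Loc.t` would behave the same way from the outside, at the cost of one extra allocation per stored value.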

lyrm

For this review, I:

- read and tested the examples in the README
- read the documentation
- had a look at the added code, but I cannot say I fully understand the kcas algorithm
- made some tests/tries on my own
- had a quick look at the kcas_data changes

What I did not do:

- try the changes in the kcas_data data structures
- try/read the Kcas_data.Promise implementation

Overall, there is not a lot to say except that the documentation is (as usual) great and easy to read and, obviously, this seems like a great addition to kcas and kcas_data!

lyrm · 2023-04-25T12:39:40Z

src/kcas.mli

+  (** Exception that may be raised to signal that the operation should be
+      retried, at some point in the future, after the examined shared memory
+      location or locations have changed.
+
+      {b NOTE}: It is important to understand that "{i after}" may effectively
+      mean "{i immediately}", because it may be the case that the examined
+      shared memory locations have already changed. *)
+
+  val later : unit -> 'a
+  (** [later ()] is equivalent to [raise Later]. *)


Maybe give more information about where this exception is caught (at least a pointer to the corresponding functions)?

Yes.

BTW, it could be a nice addition to odoc to automatically attach a list of references to items. The list of operations like get_as, update, and commit that refer to Later would then be seen directly when looking at Later.

lyrm · 2023-04-25T12:45:56Z

src/kcas.mli

+  (** In [lock_free] mode the algorithm makes sure that at least one domain will
+      be able to make progress by performing read-only operations as read-write
+      operations. *)


Which is a more expensive operation, right? It may be good to emphasize this with something like

at least one domain will be able to make progress at the cost of performing read-only operations as read-write operations.

I will adjust the text. The cost is actually non-trivial, but generally speaking mutating locations is more expensive unless avoiding the mutation leads to starvation, which is kind of the difference between the modes. (The commit mechanism is designed to avoid starvation due to the obstruction-free mode by switching to the lock-free mode in case interference is detected.)

lyrm · 2023-04-25T12:56:09Z

README.md

+    let x =
+      x
+      |> Loc.get_as @@ fun x ->
+         Retry.unless (x <> 0);
+         x


Not convinced this is more readable than

let x = Loc.get_as (fun x -> Retry.unless (x<>0); x) x

Hmm... I guess it is better to avoid unnecessary cute uses of operators in examples.

lyrm · 2023-04-25T13:18:01Z

src/kcas.mli

+  val to_blocking : xt:'x t -> (xt:'x t -> 'a option) -> 'a
+  (** [to_blocking ~xt tx] converts the non-blocking transaction [tx] to a
+      blocking transaction by retrying on [None]. *)
+
+  val to_nonblocking : xt:'x t -> (xt:'x t -> 'a) -> 'a option
+  (** [to_nonblocking ~xt tx] converts the blocking transaction [tx] to a
+      non-blocking transaction by returning [None] on retry. *)


Seems like simple but useful functions 👍
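The relationship between the two conversions can be shown with a self-contained toy version. This is a sketch, not the library code: the real functions take an explicit transaction log `~xt`, which is replaced here by a plain `unit` argument, and the local `Later` exception stands in for `Retry.Later`.

```ocaml
(* Stand-in for the library's retry exception. *)
exception Later

(* A non-blocking operation signals "not ready" with [None]; the blocking
   variant turns that into a retry request. *)
let to_blocking tx = match tx () with Some value -> value | None -> raise Later

(* A blocking operation signals "not ready" by asking for a retry; the
   non-blocking variant turns that request into [None]. *)
let to_nonblocking tx =
  match tx () with value -> Some value | exception Later -> None
```

In this model the two functions are inverses of each other: `to_nonblocking (fun () -> to_blocking tx)` gives back the result of `tx ()`.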

lyrm · 2023-04-25T13:20:13Z

src/kcas.mli

-  val attempt : ?mode:Mode.t -> 'a tx -> 'a
-  (** [attempt tx] attempts to atomically perform the transaction over shared
-      memory locations recorded by calling [tx] with a fresh explicit
-      transaction log.  If used in {!Mode.obstruction_free} may raise
-      {!Mode.Interference}.  Otherwise either raises [Exit] on failure to commit
-      the transaction or returns the result of the transaction.  The default for
-      [attempt] is {!Mode.lock_free}. *)


Why remove this function? Is it because commit can be aborted by an exception?

Good question! I usually try to comment on changes like this, but I apparently missed that.

I decided to remove attempt for two reasons:

1. attempt cannot implement the blocking mechanism by itself. Instead, it would potentially raise the Later exception for the user to handle, which would mean that there should probably be some additional support for awaiting changes to multiple locations.

2. I initially provided attempt as a means for users to write their own version of commit (e.g. with additions like timeouts). However, one should be able to achieve pretty much everything via commit already (including timeouts, e.g. by setting up a location that is written at timeout), and I don't see many use cases for attempt. I never used it myself except in tests.

So, rather than add even more (potentially unused) things to the API, I decided that it is better to just remove attempt. Something like it can be later added back if there are real use cases for it.

I am interested to see how you add a timeout (this also seems like it could be a good example).
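One possible shape of the "location that is written at timeout" idea is sketched below as a single-threaded toy. It is not the example from the README: plain `ref` cells stand in for shared memory locations, the `commit` driver and its `~wait` parameter are hypothetical, and the timer is simulated by the `wait` callback instead of a real clock.

```ocaml
exception Later
exception Timeout

(* The transaction examines two "locations": the timeout flag and the slot
   it actually wants to read.  A change to either one can wake it up. *)
let take_or_timeout ~timed_out ~slot () =
  if !timed_out then raise Timeout
  else match !slot with Some value -> value | None -> raise Later

(* Hypothetical stand-in for commit: retry on [Later].  [wait] models
   blocking until one of the examined locations has been written. *)
let rec commit ~wait tx =
  match tx () with
  | value -> value
  | exception Later ->
      wait ();
      commit ~wait tx
```

If the timer writes `timed_out` first, the retried transaction raises `Timeout`; if a producer fills `slot` first, the same transaction returns the value.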

lyrm · 2023-04-25T14:52:52Z

src/kcas.ml

-  let modify ?backoff loc f = update ?backoff loc f |> ignore [@@inline]
+    match f before with
+    | after ->
+        if before == after then before


resume_awaiters is not called here because the value has not changed, right?

Yes, this is an optimization to avoid unnecessarily updating locations. It wasn't implemented before, but with the addition of awaiters it potentially avoids waking up awaiters unnecessarily (as nothing logically changed), which could avoid a lot of unnecessary computation.
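The effect of the `before == after` check can be seen in a small toy model. This is a single-threaded sketch with plain mutable fields and callback awaiters, so the `loc` type and the `update` function below are simplifications, not the code under review.

```ocaml
type 'a loc = { mutable value : 'a; mutable awaiters : (unit -> unit) list }

(* [update loc f] returns the previous value.  When [f] returns a value
   physically equal to the old one, nothing is written and no awaiter is
   resumed. *)
let update loc f =
  let before = loc.value in
  let after = f before in
  if before == after then before
  else begin
    let awaiters = loc.awaiters in
    loc.value <- after;
    loc.awaiters <- [];
    List.iter (fun resume -> resume ()) awaiters;
    before
  end
```

With an identity update the awaiter list is left untouched, so a waiting operation is not resumed only to discover that nothing changed.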

lyrm · 2023-04-25T15:11:34Z

src/kcas.ml

+            resume_awaiters before state'.awaiters
+          else update_no_alloc await (Backoff.once backoff) loc state f
+    | exception Retry.Later ->
+        let state = new_state (Obj.magic ()) in


I am not extremely familiar with the use of Obj.magic, but why not use before here instead?

Hmm... Both state.before and state.after will be overwritten, so using before would work here also. Sure, why not.

polytypic · 2023-04-26T09:33:01Z

For this review [...]

Thanks for the review!

My plan now is to move the Domain_local_await thing to a separate repository and release it as a separate package on opam and adjust this PR (and related PRs) to use that package.

As has been discussed elsewhere, the idea with Domain_local_await is to provide a mechanism that works today and should also work in the future to implement blocking in kcas and potentially other places like lockfree. Specifically, I want to be able to publish a new version of kcas that potential users can install, today, from opam and then that version of kcas should just work out of the box, allowing users to use blocking and to communicate and synchronize between domains, systhreads, Eio fibers, Domainslib fibers, and anything else that supports Domain_local_await.

However, in the future we may end up using some different means to provide blocking support. This is not a problem for me and should not be a problem in general and we can then change kcas to use such a future mechanism when it is available. At that point we can deprecate Domain_local_await and remove support for it from schedulers.

It should also be mentioned that Domain_local_await is something that 99% of users should not need to know about at all. It is just an internal mechanism that allows blocking support. So, if and when we decide what the final interface for such blocking support is, most users should not even notice that it changed (they just upgrade their packages and at some point the Domain_local_await package is no longer used at all).

kayceesrk · 2023-04-26T10:49:56Z

Sounds good to me.

99% of users should not need to know about at all.

The 1% will be the implementations of concurrency libraries such as Eio and Domainslib, and any blocking synchronization structures such as promises, kcas, rendezvous channels, etc. That said, by going ahead with the proposed interface for DLA, I'm hoping that we will gain experience by doing it rather than trying to come up with the perfect scheme now.

polytypic linked an issue Feb 28, 2023 that may be closed by this pull request

Extend the Tx mechanism to support non-busy wait or blocking #25

Closed

polytypic force-pushed the add-blocking branch 5 times, most recently from 9e6ea54 to 8ba2122 Compare March 7, 2023 09:19

polytypic force-pushed the add-blocking branch from 8ba2122 to 1a64dc7 Compare March 7, 2023 22:24

polytypic force-pushed the add-blocking branch 6 times, most recently from 00fe8f8 to 1d03f71 Compare March 22, 2023 07:40

This was referenced Mar 23, 2023

Rendezvous and demonstration on Spsc_queue API ocaml-multicore/saturn#68

Open

Review and Commentary deepali2806/unified_interface#1

Open

polytypic force-pushed the main branch 2 times, most recently from 1f32020 to 790267c Compare April 5, 2023 09:52

polytypic force-pushed the add-blocking branch 4 times, most recently from c2c780c to 3d8521b Compare April 8, 2023 23:28

polytypic changed the title ~~WIP: Add blocking~~ Add blocking Apr 9, 2023

polytypic force-pushed the add-blocking branch 8 times, most recently from 3e9287e to 0dce076 Compare April 11, 2023 20:15

polytypic force-pushed the add-blocking branch 6 times, most recently from 9c4951f to a13b8d7 Compare April 20, 2023 07:20

polytypic marked this pull request as ready for review April 20, 2023 07:57

polytypic requested a review from a team April 20, 2023 08:01

polytypic commented Apr 20, 2023

polytypic force-pushed the add-blocking branch from a13b8d7 to 6426050 Compare April 20, 2023 21:10

This was referenced Apr 21, 2023

Add domain local await support ocaml-multicore/domainslib#107

Merged

Using fenceless get and set operations where it is safe #21

Closed

Optimizations #46

Merged

polytypic force-pushed the add-blocking branch 2 times, most recently from d1d972b to 25c9d9e Compare April 24, 2023 16:07

polytypic mentioned this pull request Apr 24, 2023

Add support for domain local await ocaml-multicore/eio#494

Merged

lyrm reviewed Apr 25, 2023

polytypic force-pushed the add-blocking branch from ef4e40d to 1f921a5 Compare April 27, 2023 07:58

polytypic added 2 commits April 28, 2023 09:23

Add Queue.take_all and Stack.pop_all

7b223ee

Add blocking

1514eb1

polytypic force-pushed the add-blocking branch from 1f921a5 to 1514eb1 Compare April 28, 2023 06:23

polytypic merged commit 877fe54 into main Apr 28, 2023
1 of 2 checks passed

polytypic deleted the add-blocking branch April 28, 2023 06:29
