add Dynarray to the stdlib. #11563

Closed
wants to merge 29 commits into from

c-cube
Contributor

@c-cube c-cube commented Sep 25, 2022

Overview

This is a (work in progress) PR to add dynamic arrays ("vectors") to the stdlib. The module name is Dyn_array, which, as some people pointed out, is more correct than vector. For now the implementation is pure OCaml. I discussed with @Octachron about ways to implement some filling functions in C, but I now think it might not be worth it after he pointed out some design constraints newly imposed by multicore.

A lot of the API mimics Array, when it does not change the length of the dynamic array.

Rationale

In OCaml code that veers into the imperative side, lists are not enough. A lot of performance-oriented languages, such as Rust, C++, Zig, etc. use dynamic arrays as a very common data structure for accumulating items in a given order; as an integer-indexed map; and as a stack. OCaml has the Hashtbl structure, which is extremely useful, but so far it has lacked the dynamic array.

Design

A dyn array is a pair

{
  mutable arr: 'a array;
  mutable size: int;
}

It is initialized with the empty array (see discussion of alternative designs below). The first element pushed is used to fill unused slots. At a given point in time, the array might look like:

{ size=n;
  arr= [| a0; a1; a2; …; a_{n-1}; filler; filler; filler; filler |];
}

The filler element is normally the very first element that was pushed into the vector.

The resize factor is 1.5 and is not configurable. This is in the interest of simplicity and should be good enough. It's at least used in some C++ STL implementations, to my knowledge. Compared to the traditional ×2 it wastes a bit less space, at the expense of slightly more frequent resizings.
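
To make the layout above concrete, here is a minimal sketch of push under this design. It is not the PR's code: the names create, next_capacity, push and get are illustrative assumptions, and only the record shape, the first-element filler and the ×1.5 factor come from the description.

```ocaml
(* Hypothetical sketch of the design described above, not the PR's code. *)
type 'a t = {
  mutable arr : 'a array;
  mutable size : int;
}

(* Starts with the empty array, so no dummy element is needed. *)
let create () = { arr = [||]; size = 0 }

(* Next capacity: grow by roughly 1.5x, and always by at least one slot. *)
let next_capacity cap = max (cap + 1) (cap + cap / 2)

let push v x =
  let cap = Array.length v.arr in
  if v.size = cap then begin
    (* The filler is the first element if there is one, else [x] itself. *)
    let filler = if v.size = 0 then x else v.arr.(0) in
    let arr' = Array.make (next_capacity cap) filler in
    Array.blit v.arr 0 arr' 0 v.size;
    v.arr <- arr'
  end;
  v.arr.(v.size) <- x;
  v.size <- v.size + 1

(* Bounds are checked against [size], not against the capacity. *)
let get v i =
  if i < 0 || i >= v.size then invalid_arg "get: index out of bounds";
  v.arr.(i)
```

Every slot at index size or above holds either the filler or a stale element, all of them valid values of type 'a, so even a racy read cannot observe an invalid value.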

Alternative designs that this does not implement

There are a few design choice points related to the specifics of resizable arrays in OCaml; here are some we considered but dropped.

dummy element

Some of the existing libraries (res, vector) require a dummy element at initialization. While this proposal does so for ensure_capacity_with, I believe that requiring a dummy element to create an empty vector is inconvenient and not good enough. Sometimes there are types that are complicated to create, or just quite large, and coming up with a dummy value is a massive pain. I've experienced that in the context of SAT solvers (which can use a lot of dynamic arrays), and switched to the current design with no performance issue.

unused slots

With @Octachron we discussed a design where dummy values (potentially invalid, although valid to the GC) could be used to fill the array. That's what is done in containers. However, due to the contract that multicore OCaml must not observe invalid values, even in the presence of data races, this design is unsound; a race condition between push and pop, or get and pop, might be able to observe the (invalid) filler value.

customizable resizing policies

The strategy for resizing the vector when it's too full (or too empty, when we downsize) can be implemented in many ways. For example res and batteries parametrize over it.

Here I picked a single strategy and stuck to it. Rust and C++ do not offer this capability either. The upside is that vectors are simpler. If there is evidence of alternative strategies being critical for performance, this PR could be amended to include a resizer strategy.

The current strategy also tries to reset the underlying array when the dyn array becomes empty, so it doesn't hold on to the dummy element anymore. Some functions, like shrink_to_size, can also be used to cut off the unused portion of the array to free the dummy element, and to eliminate the memory overhead.

Related discussions

Recent comments about the lack of this data structure:

Prior attempts:

@johnwhitington
Contributor

Bikeshedding:

Dynarray is fine: it's a consonant followed by a vowel, and the whole thing is pronounceable, so no need for the underscore.

@dbuenzli
Contributor

Personally I'm not very fond of the idea, for the following reasons:

  1. APIs returning arrays will now have to choose between returning an array, a bigarray or a dynarray which will invariably make one or other consumer of the API sad.

  2. I think it would be interesting to eventually consider the addition of persistent vectors (something like RRB vectors or one of their variants), especially with multicore around the corner, and because I suspect they would make an excellent and extremely versatile Unicode text data structure without having to add one to the stdlib, allowing one to process different sequences of units of text embedded in each other at whatever granularity one wishes using a uniform API (e.g. sequences of paragraphs which are sequences of words which are sequences of grapheme clusters which are sequences of scalar values). However that would also add one more option to 1, at which point the landscape becomes seriously messy.

That being said, I have had to implement a form of Dynarray more than once, so it's not as if I think nothing is missing. But my needs were invariably along the lines of Buffer, with the ability to access elements though (e.g. for implementing heaps).

This leads me to the following suggestions:

  1. Why don't we rather provide an Array.Buffer.t type that easily allows to construct arrays without having to specify a size upfront. This would be more indicative that the type to use in general is array (if that's what people think should be of course…)

  2. Or even simpler why don't we simply add:

    val grow : int -> 'a -> 'a array -> 'a array

@gasche
Member

gasche commented Sep 25, 2022

As stated earlier, I think that extensible/dynamic arrays are an important data structure and would warmly welcome their addition to the standard library.

It's a difficult API to design because of this requirement to choose a value to put in the "empty" slots of the array -- in particular when growing the underlying array or popping values from the array. The problem is that the lifetime of this "filler" value gets extended to the lifetime of the array, which may be a memory leak in some scenarios.

I see three approaches that make sense:

  1. Not worry about this memory leak, and just keep user-provided values in the empty slots. This should be documented carefully. Leak-conscious users can work around it: when they want to pop but don't want the value in the array to be retained, they can explicitly write a non-leaky value in this position first. (A fill_and_pop helper could be provided for this advanced use-case.)
  2. Use a very explicit API that forces users to be mindful of leaks, by requiring the user to provide a "filler" value on array creation (or, alternatively, on both push and pop). @c-cube makes the point above that this is inconvenient in some cases; if you want to write polymorphic functions that internally use dynamic arrays, you will probably have to ask your caller for a filler value, leaking this implementation concern into your own API.
  3. Try to be smart by using a clever Obj.magic-unsafe value as a filler. The simple approach of using Obj.magic 0 requires using a synchronization barrier on pop for multicore safety. Another approach using a fresh/unique block for the module requires extra equality checks. This is conceptually a bit ugly, would have to be benchmarked carefully, and it ties our hands by using a specification that relies on unsafe code.

The current PR implements a variant of (1) where we add an extra write after pop, removing the main source of leakage. It can still leak memory: if you write to the first slot of the array, the previous value in this slot may remain alive indefinitely. (But this behavior only occurs with elements at position 0, no others.) One could see this as a good compromise, or as the worst of two worlds: we have a less-efficient pop (than with (1)) but we still leak (some) memory.

If you want to go that route, I would be tempted to suggest a set_filler function that lets users explicitly set a filler value on an array. If no filler has been set by the user, we always use the first element when we need a filler (as in the current implementation). This way, in the corner case where users want to guarantee the absence of leaks, they can do it explicitly. (But then, why do something clever on pop when no filler has been set, instead of just leaving the element in place?)

In any case, I think that the module should carefully document the way it may (or may not) retain values provided by the user, and ideally there should exist a (documented) way to use the module to guarantee the absence of leaks.

@c-cube
Contributor Author

c-cube commented Sep 25, 2022

@johnwhitington I think it can make sense. I'll wait for agreement on it before doing the renaming though. :)

@gasche thank you for the summary! I think it's accurate.

I'm not convinced anymore that 3) can work in a domain-safe way without terrible performance (basically, a full mutex to protect push/pop/…, because the slot has to be filled/erased at the same time as the length is adjusted; or read synchronisation as well, meaning get would become much slower than the equivalent Array.get). There might be a way I haven't thought of, of course.

The suggestion to have an optional dummy element is interesting. If the type was {mutable arr: 'a array; mutable size: int; mutable filler: 'a option} I could see it working; only if filler = Some f do we do additional work on pop/truncate.
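
A minimal sketch of what the optional-filler variant could look like, assuming the record shape given above. The names of_array, set_filler and pop are illustrative assumptions, not functions from the PR.

```ocaml
(* Hypothetical sketch of the optional-filler idea; names are assumptions. *)
type 'a t = {
  mutable arr : 'a array;
  mutable size : int;
  mutable filler : 'a option;
}

let of_array a = { arr = Array.copy a; size = Array.length a; filler = None }

let set_filler v f = v.filler <- Some f

let pop v =
  if v.size = 0 then invalid_arg "pop: empty";
  let i = v.size - 1 in
  let x = v.arr.(i) in
  v.size <- i;
  (match v.filler with
   | Some f -> v.arr.(i) <- f   (* extra write only when a filler was set *)
   | None -> ());               (* otherwise the popped value stays in its slot *)
  x
```

With no filler set, pop costs the same as in approach (1) and the popped value remains reachable from the backing array; after set_filler, each pop pays one extra write and the slot no longer retains the value.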

@dbuenzli

  1. I've rarely written code that returns an array, I must say. But otherwise, bigarray is out of the question since it can only store a small set of types. I wouldn't necessarily expect a lot of code to return an explicit Dyn_array.t; it can be more flexible to return a 'a Seq.t, 'a Iter.t, or something like that instead. In general I use a lot of dynamic arrays in the internals of some complex algorithm: as a stack, as a trail for SAT solving, or… well anything in a SAT solver really. It's also a good accumulator for the result of complex fold/iter/loops, since push is amortized O(1).
  2. I think persistent arrays/vectors would be extremely useful too. They're very common in Clojure and Scala, for a start. However, they're no substitute for a dynamic array in imperative code. So I agree this would be useful, but not really relevant to the current PR.

As for your suggestions:

  1. If Array.Buffer.t's API is as poor as the current Buffer.t, there really is no point. Dyn arrays are not just useful to build arrays, they can also act as longer-lived structures with push/pop/get/set/append/truncate/… that are really versatile.
  2. grow is not as useful because you can't implement push with it. The whole point of dynamic arrays is that each push is O(1) amortized and has a small chance of allocating/growing. If you grow the array by 1 at each push you get quadratic behavior.
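
The quadratic-versus-amortized argument can be checked with a small simulation. This is only an illustrative sketch: copies_for, grow_by_one and grow_by_half are hypothetical names, and the model counts nothing but the elements copied on each reallocation.

```ocaml
(* Count how many element copies are needed to push [n] items when the
   capacity grows according to [next]. *)
let copies_for ~next n =
  let cap = ref 0 and size = ref 0 and copies = ref 0 in
  for _i = 1 to n do
    if !size = !cap then begin
      (* the array is full: existing elements move to the new array *)
      copies := !copies + !size;
      cap := next !cap
    end;
    incr size
  done;
  !copies

(* Growing by one slot on each push: every push reallocates. *)
let grow_by_one cap = cap + 1

(* Geometric growth by roughly 1.5x, as in the proposal. *)
let grow_by_half cap = max (cap + 1) (cap + cap / 2)
```

For n = 1000, growing by one slot copies 0 + 1 + … + 999 = 499500 elements in total, while the geometric policy stays below 3n copies, which is what makes each push O(1) amortized.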

I think I write more imperative OCaml than you do, which is simply indicative of different styles. In the more imperative side, the lack of dynamic arrays has always been clear to me. I've had them in containers for at least 8 years by now :)

@gasche
Member

gasche commented Sep 25, 2022

@Octachron and myself discussed ways to do (3) last week. I'm not sure whether you synchronized with him since :-) I'm also not sure whether we should try any harder to use unsafe code in the stdlib; it's dubious and it gives people excuses to write ugly code on their own. ("But Stdlib.Queue does something similar!")

(* a fresh block, physically distinct from any user value *)
let dummy : 'a = Obj.magic (ref ())

let check x =
  if (x == dummy) then invalid_arg "Dynarray: empty value"

let pop v =
  let new_size = v.size - 1 in
  v.size <- new_size;
  let x = v.arr.(new_size) in
  (* fill_ is assumed to overwrite slot new_size with dummy *)
  fill_ v.arr new_size dummy;
  check x; x

let get v i =
  let x = v.arr.(i) in
  check x; x

We would need to do benchmarks to know whether this approach is actually worth it.

@c-cube
Contributor Author

c-cube commented Sep 26, 2022

Benchmarking could be good, but I'm never too happy about additional overhead on get. It's an operation that, imho, should ideally be a few instructions long.

Also I'm worried it might be even slightly uglier because of the case of float, just like in containers. You need to have a dummy for floats too (probably a nan?) and test against one of the two dummies based on the tag of the array.

@gasche
Member

gasche commented Sep 26, 2022

Would you then be willing to update the PR to a version with an optional filler element? (This could be provided at creation time and/or set later.) I don't know if you want to keep your current logic of picking a filler in the other cases, or gain the speed benefits of just leaking elements.

@chambart
Contributor

let dummy : 'a = Obj.magic (ref ())

let check x =
  if (x == dummy) then invalid_arg "Dynarray: empty value"

let get v i =
  let x = v.arr.(i) in
  check x; x

This might have some problems with marshaling: if you unmarshal a dynarray, your check functions won't work.

@xavierleroy
Contributor

This might have some problems with marshaling: if you unmarshal a dynarray, your check functions won't work.

The dummy object could be part of the dynarray:

type 'a t = {
  mutable size : int;
  mutable arr : 'a array;
  filler: Obj.t
}

This said, I'm not convinced we need to go to such lengths to avoid potential memory leaks. Using the first element of the array as the filler is already pretty safe, at least when the dynarray is used as a stack.
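
The marshaling concern and the per-dynarray filler can be illustrated with a small sketch. It assumes the record above; of_list_with_slack, is_filler and roundtrip are hypothetical helpers, and the code is only meant for non-float element types (flat float arrays would need separate treatment).

```ocaml
(* Hypothetical sketch: the sentinel lives in the record, so it is
   marshaled together with the slots that point to it. *)
type 'a t = {
  mutable size : int;
  mutable arr : 'a array;
  filler : Obj.t;   (* unique block, stored in every unused slot *)
}

(* Build a dynarray holding [elems] plus [extra] unused slots.
   Assumption: 'a is not float. *)
let of_list_with_slack elems extra : 'a t =
  let filler = Obj.repr (ref ()) in
  let n = List.length elems in
  let arr = Array.make (n + extra) (Obj.magic filler) in
  List.iteri (fun i x -> arr.(i) <- x) elems;
  { size = n; arr; filler }

(* Physical comparison against this dynarray's own sentinel. *)
let is_filler v x = Obj.repr x == v.filler

(* Marshal preserves sharing inside a single value by default. *)
let roundtrip (v : 'a t) : 'a t =
  Marshal.from_string (Marshal.to_string v []) 0
```

After a round trip the copy has a fresh sentinel block, and its unused slots still point to that same block, so the check keeps working; a comparison against the sentinel of the original value (or against a module-level dummy) no longer matches.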

@xavierleroy
Contributor

Dynarray is fine: it's a consonant followed by a vowel, and the whole thing is pronounceable, so no need for the underscore.

Fully agreed. I find underscores in module names hard to read, so let's avoid them when possible. A precedent: Hashtbl (and not Hash_tbl).

@chambart
Contributor

In general I see the (3) solution as implementing the dynarray using an option array as storage, and using a magic version of the option array. This is 'type safe' in the multicore sense as long as you don't implement get functions with an unsafe get.

If you want some reference for an efficient option array that I think handles all the strange cases, you can have a look at https://github.com/chambart/ocaml-nullable-array/blob/master/lib/nullable_array.ml
But if we were to go that way, I would rather suggest adding a few more tricks to the GC to allow a simpler and more efficient version of that (which I think are quite simple and not expensive, but I didn't check recently).

If this were the preferred choice, I would suggest to use 'a option array for this PR and patch it later when a proper version is available.
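
As a rough illustration of the 'a option array fallback, here is a minimal sketch. The function names and error messages are assumptions; this is not code from the PR or from nullable_array.

```ocaml
(* Hypothetical sketch: a dynarray backed by an option array. Empty slots
   hold None, so no filler value and no unsafe code are needed. *)
type 'a t = {
  mutable arr : 'a option array;
  mutable size : int;
}

let create () = { arr = [||]; size = 0 }

let push v x =
  let cap = Array.length v.arr in
  if v.size = cap then begin
    let arr' = Array.make (max (cap + 1) (cap + cap / 2)) None in
    Array.blit v.arr 0 arr' 0 v.size;
    v.arr <- arr'
  end;
  v.arr.(v.size) <- Some x;
  v.size <- v.size + 1

let get v i =
  if i < 0 || i >= v.size then invalid_arg "get: index out of bounds";
  match v.arr.(i) with
  | Some x -> x
  | None -> invalid_arg "get: empty slot (concurrent modification?)"

let pop v =
  if v.size = 0 then invalid_arg "pop: empty";
  let i = v.size - 1 in
  let x = get v i in
  v.size <- i;
  v.arr.(i) <- None;   (* nothing is retained in the freed slot *)
  x
```

The cost is one extra allocation per push (the Some box) and one indirection per get, which is the overhead the hand-unboxed "magic" version tries to remove.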

@gadmm
Contributor

gadmm commented Sep 26, 2022

  • You are using unsafe_get and unsafe_set internally so there should be an argument in the code for why this is memory-safe in terms of the OCaml memory model (this is not just needed for Obj.magic). But I think the current version is unsafe (for instance the current get implementation). The OCaml memory model makes programs memory-safe essentially by: 1) handling deallocation through the GC, 2) storing block width and tag immutably within the block itself, 3) always using checked get and set operations on these blocks, 4) some fancy stuff for publication safety irrelevant here. In other words by explicitly excluding C++-style vectors. There is no magic. So I do not believe that one can efficiently implement them the way you are trying here. It is worth looking at the implementation of ArrayList in Java, a language which has the same approach to memory safety as OCaml 5. It looks like they still had to use checked get and set on the underlying array.

  • Note that the version with unsafe_get/set (if you adapted one that already existed) was probably already memory-unsafe in the presence of systhreads, depending on where polling points end up. (But memory safety seemed to be less of a concern with systhreads in OCaml 4 at least for the stdlib.)

  • On the other hand, you can remove some checks in the iterator, e.g. in the result of to_seq, if like in Java you forbid concurrent writes by raising a ConcurrentModificationException or something whenever you detect something is wrong with the iterator. With this assumption you can simply iterate on the underlying array, up to the last element from when to_seq was called, check the array length only once, and access array elements with unsafe_get. (It is hopeless to try to do anything meaningful in the presence of concurrent modifications.)

  • Regarding the absence of leaks, I think that the priority is to have a correct implementation, one that does not leak. You have also shared your own experience that asking for a dummy element is painful for the user (and it does look like a kludge). But also, in terms of early race detection, this is moving a check to the user (the check that Gabriel mentions). I quite like @gasche's Obj.magic (ref ()) approach (the additional check is also useful with other approaches for data-race detection, it is just that this approach lets you do it reliably). Now he writes:

    This is conceptually a bit ugly, would have to be benchmarked carefully, and it ties our hands by using a specification that relies on unsafe code.

    Not at all! This is simply a micro-optimisation of wrapping every cell in an option type. This is a bit similar to using C NULL as a special value distinct from all values (as suggested some time ago by Leo White), and which as a compile-time constant would result in better-generated code (but which requires support in the runtime, which would not cost anything but which is currently absent). Note that this is what Java's ArrayList does (using NULL and thus a null check)¹.

    So one should first try the obvious solution (already proposed at Added dynamic arrays #9122) of wrapping cells in an option type, and then observe that there are necessary performance gains by unboxing the option by hand. Regarding the broader question of what limits OCaml here, it looks to me like the question fits nicely within Stephen and Leo's work on unboxing presented at the ML workshop. If one believes that the option type captures all notions of nullability, then why not use option here.

  • As for what to do with "flat" float dynarrays, there are several possible choices but I would just like to point out that there is no room for a dummy value; using a signalling NaN instead of checking for dummy is the closest you can get to fail early in case of a data race.

Of course I am not a fan of the Java concurrency model, but given that this is the general direction chosen by OCaml, and even if it makes dynamic arrays slower than expected, trying to hack in a different model from the one advertised with OCaml multicore could look disingenuous. Unless one decides that such data structures made for imperative programming are not made for OCaml, compared to other structures (e.g. what Daniel says).

¹: of course in Java this is worse because it suffers from conflating the various notions of NULL (Option vs. unshared, uninitialized value); there is no reason to have the same confusion in OCaml so lectures about how bad NULL is in Java are off-topic for this PR. However, lectures about how bad the Java concurrency model is are welcome!

@gadmm
Contributor

gadmm commented Sep 26, 2022

Our messages crossed with @chambart, and I see that his example at https://github.com/chambart/ocaml-nullable-array/blob/master/lib/nullable_array.ml also mentions the solution of making the runtime understand NULL as a special value.

@gadmm
Contributor

gadmm commented Sep 26, 2022

@chambart

But if we were to go that way, I would rather suggest to add a few more tricks to the GC to allow a simpler and more efficient version of that (Which I think are quite simple and not expensive, but I didn't check recently).

As per my work presented at the OCaml workshop, adding a null check during marking would not result in an observable effect on performance, even for the prefetching marking loop.

@c-cube
Contributor Author

c-cube commented Sep 26, 2022

  • You are using unsafe_get and unsafe_set internally so there should be an argument in the code for why this is memory-safe in terms of the OCaml memory model (this is not just needed for Obj.magic). But I think the current version is unsafe (for instance the current get implementation).

I think you're correct, in case another thread modifies the underlying array while the size check is ongoing (e.g. by calling pop). I'll use regular access.

  • Note that the version with unsafe_get/set (if you adapted one that already existed) was probably already memory-unsafe in the presence of systhreads, depending on where polling points end up. (But memory safety seemed to be less of a concern with systhreads in OCaml 4 at least for the stdlib.)

Maybe. There was not a clear promise of memory safety in case of race condition, I believe :). This has never been intended to be thread-safe, nor should it be; I suppose the change with OCaml 5 is that now we have to provide a basic promise of memory safety even for very wrong™ code.

  • On the other hand, you can remove some checks in the iterator, e.g. in the result of to_seq, if like in Java you forbid concurrent writes by raising a ConcurrentModificationException or something whenever you detect something is wrong with the iterator. With this assumption you can simply iterate on the underlying array, up to the last element from when to_seq was called, check the array length only once, and access array elements with unsafe_get. (It is hopeless to try to do anything meaningful in the presence of concurrent modifications.)

Does that mean every modification needs to do additional work so its changes can be detected by an existing iterator? Perhaps additional storage, too? If so, I think it's better to just have a tiny overhead on the iteration itself.

  • Regarding the absence of leaks, I think that the priority is to have a correct implementation, one that does not leak. You have also shared your own experience that asking for a dummy element is painful for the user (and it does look like a kludge). But also, in terms of early race detection, this is moving a check to the user (the check that Gabriel mentions). I quite like @gasche's Obj.magic (ref ()) approach (the additional check is also useful with other approaches for data-race detection, it is just that this approach lets you do it reliably). Now he writes:

    This is conceptually a bit ugly, would have to be benchmarked carefully, and it ties our hands by using a specification that relies on unsafe code.

    So one should first try the obvious solution (already proposed at Added dynamic arrays #9122, https://github.com/ocaml/ocaml/pull/9122) of wrapping cells in an option type, and then observe that there are necessary performance gains by unboxing the option by hand.

I think @chambart 's idea is nice, if I understood it correctly: reserve a slot at the beginning for the sentinel value (or perhaps in the record around it, which works the same wrt marshal), and compare to it for detecting invalid memory. So I'm open to moving to option (3), possibly reusing @chambart's code, if it has reasonable chances to be accepted despite the low level magic. On the other hand, using option everywhere is not good enough even for a first version. And (2) is a buggy API :).

  • As for what to do with "flat" float dynarrays, there are several possible choices but I would just like to point out that there is no room for a dummy value; using a signalling NaN instead of checking for dummy is the closest you can get to fail early in case of a data race.

I don't know what the maintainers think, but banning floats would also be an option in my view. Floats belong in bigarrays in general, so a specialized resizable bigarray might be an idea for another PR.

@gadmm
Contributor

gadmm commented Sep 26, 2022

Does that mean every modification needs to do additional work so its changes can be detected by an existing iterator?

It is a "may" not a "must", so no extra work. Instead you allow yourself to raise an exception if you meet an unexpected sentinel value (instead of stopping the iteration silently, say). I have not looked deeply at ArrayList to see what extra work they do, it might be worth having a look.

I like @chambart's approach avoiding the flat float array representation.

For the rest of your replies to my comments, sounds good to me.

@c-cube c-cube changed the title from "add Dyn_array to the stdlib." to "add Dynarray to the stdlib." on Sep 27, 2022
@chambart
Contributor

After some discussion with @Ekdohibs we thought of another way of creating a NULL that has all the good properties and does not require any change to the GC. I'll leave you with this teaser while we produce an example.

Contributor

@dbuenzli dbuenzli left a comment

I know it's still a draft but here's a first round of comments on the interface.

val shrink_capacity : 'a t -> unit
(** Shrink internal array to fit the size of the array. This can be useful
to make sure there is no memory wasted on a long-held array. *)
Contributor

I think this name is confusing. Didn't think long about it but make_tight, fit_capacity, tighten_capacity, make_capacity_tight perhaps ?

Contributor Author

Possibly, fit_capacity sounds good I think.
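
For reference, a minimal sketch of what such a function does on the two-field record from the PR description. The body is an assumption for illustration, not the PR's implementation.

```ocaml
(* Hypothetical sketch: trim the backing array to exactly [size] slots. *)
type 'a t = {
  mutable arr : 'a array;
  mutable size : int;
}

let fit_capacity v =
  if v.size = 0 then
    v.arr <- [||]                        (* also drops any filler element *)
  else if v.size < Array.length v.arr then
    v.arr <- Array.sub v.arr 0 v.size    (* copy only the live prefix *)
```

After the call the capacity equals the size, so the next push has to reallocate.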

Contributor

@gadmm gadmm left a comment

The new version is memory-safe by virtue of not using any unsafe operation. However I believe one can squeeze a bit more out of it.

@@ -217,36 +211,36 @@ let shrink_capacity v : unit =
 let iter k v =
   let n = v.size in
   for i = 0 to n-1 do
-    k (Array.unsafe_get v.arr i)
+    k (Array.get v.arr i)
Contributor

I was rather considering something safely using unsafe_get as follows:

let iter k v =
  let n = v.size in
  let arr = v.arr in
  if n > Array.length arr then raise ConcurrentModification ;
  for i = 0 to n-1 do
    let x = Array.unsafe_get arr i in
    if is_null x then raise ConcurrentModification ;
    k x
  done

(where is_null is always false in your version but should be checked in a version with null checks.)

Idem for other iterators.

Contributor Author

@c-cube c-cube Sep 28, 2022

I think it's very reasonable, but I'll re-add them after we get a good implementation for null, so that I can really think through each case.

When the time comes, should ConcurrentModification be added to Dynarray, or directly as one of the main exceptions in Stdlib? One could imagine Hashtbl starting to use it too, if it's helpful.


 let[@inline] unsafe_get v i =
-  Array.unsafe_get v.arr i
+  Array.get v.arr i

Contributor

Dynarray.unsafe_get/set are now removed but they make sense in controlled situations (however they are a bit more tricky to use than Array.unsafe_get/set due to races). I do not have strong opinions on their inclusion/removal.

Contributor Author

I think they'd probably be extremely dangerous given the OCaml 5 semantics. However, if the contract is that unsafe_get doesn't promise anything wrt memory safety, I suppose the old versions could just work?

Member

Could we have an underlying_array : 'a dynarray -> 'a array operator instead? This is safe, and it lets people use unsafe_get on the resulting array. (For unsafe_set it's unclear that users can reason at all on whether the dynarray will update its backing array in the meantime, so probably not a good idea.)

Contributor Author

Do you mean underlying_array would copy? Otherwise, since the array is partially uninitialized/filled with junk, this seems dangerous to me.

Also, if we do that, we should reopen #10982 ? 🙄

Member

Right, unsafe_to_array then I guess. (We already have to_array that copies in the API, no need for that.)

Re. #10982, I wondered about this as well. I think it's a matter of how confident we are that the implementation will remain a single continuous bytes/array, or may move to another design later. My intuition is that you want to document the fact that dynamic arrays are backed by a single array, and that Dynarray.get is morally just as efficient as Array.get. Then having a function that breaks this "abstraction" is probably okay. In contrast, for buffers it's much less clear that we wouldn't want to move to a chunked structure later.

But I'm not insisting on unsafe_to_array, I just thought of it as a preferable alternative to unsafe_get.

Contributor Author

A tricky part here is that unsafe_to_array might return an array with nullable slots. It's all brand new :). For #10982 it was fine because bytes are always valid…

@gadmm
Contributor

gadmm commented Sep 28, 2022

I looked at Java's approach and it is indeed interesting (for this approach to thread-safety): essentially increment a modification_count field (non-atomically as best-effort but very cheap). Then check inside the iterator that the count does not change unexpectedly. This also protects from concurrent modifications from the same code (i.e. iterator invalidation, which can be very hard to debug in memory-safe languages because it does not lead to a crash).

This raises an interesting question regarding the memory model and how to leverage its guarantees in practice (@kayceesrk !).

  • I think these kinds of checks are necessary to leverage the guarantees: essentially, if a race is not detected early by such a check, then a data race could make a data structure logically invalid (e.g. the size not matching the contents of the array), and this invalidity could manifest itself much later in the program and in a very different form (not "bounded in time or space", at least not in spirit).
  • I wonder if such checks are useful to leverage the guarantees: can the local DRF guarantee let us deduce something useful from such early data-race checks?

Note that you pay for the additional checks, but you also gain by making some optimisations to iterators valid. Essentially you could argue that the optimisation I propose above for iter is invalid because one could still try to make sense of code that modifies the array while it is being iterated, but I argue that it is better to make it impossible to write such accidentally-correct code.

@c-cube
Contributor Author

c-cube commented Oct 24, 2022

Now that #11583 is closed, should I try and adapt @chambart's original nullable array module (if he gives permission)? Still seems to me like it's the best option in terms of performance and memory behavior…

@gadmm
Contributor

gadmm commented Oct 24, 2022

One take-away of the discussion at #11583 is that you can use any atom with tag <> 0 for a null value, and it should work without modification to the runtime IIUC. That's how I understand the other PR being closed.

Coming back to what Java does, I think using a modification_count field to detect races and iterator invalidations deserves to be discussed, because otherwise it is a breaking API change, and so far I do not see anything else being proposed to leverage the promises of the memory model. This is very reasonable within this approach to thread-safety.

@gasche
Member

gasche commented Oct 24, 2022

Coming back to what Java does, I think using a modification_count field to detect races and iterator invalidations deserves to be discussed, because otherwise it is a breaking API change, and so far I do not see anything else being proposed to leverage the promises of the memory model. This is very reasonable within this approach to thread-safety.

Sure, why not? We don't currently try to do any runtime race checking for mutable datastructures, but that's no excuse for not doing the right thing. Do you have a pointer to what that looks like? Can we expect the performance cost to be negligible? (Ideally we would only add memory writes on resize events, not get/set operations.)

@gadmm
Contributor

gadmm commented Oct 24, 2022

Can we expect the performance cost to be negligible?

I expect it to be negligible (uncontended non-atomic integer increment). You guess correctly that only the changes to the structure count, not changes to the contents.

The equivalent Java type is called ArrayList: https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/ArrayList.java. You can grep for modCount. It is documented here: https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/AbstractList.java#L604-L630.
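
To make the idea concrete, here is a minimal sketch of a modification counter in OCaml. It is not the PR's code: the record layout, the `mod_count` field, and the `Concurrent_modification` exception are hypothetical names chosen for illustration, loosely modelled on Java's `modCount`. Only structural changes (`push`) bump the counter; `set` leaves it alone, and `iter` performs a best-effort check after each callback.

```ocaml
(* Hypothetical sketch, not the PR's implementation. *)
type 'a t = {
  mutable arr : 'a array;
  mutable size : int;
  mutable mod_count : int;  (* bumped on structural changes only *)
}

(* Hypothetical exception name, mirroring Java's
   ConcurrentModificationException. *)
exception Concurrent_modification

let create () = { arr = [||]; size = 0; mod_count = 0 }

let push t x =
  if t.size = Array.length t.arr then begin
    (* Grow by roughly 1.5x, with an assumed minimum capacity of 4;
       the pushed element doubles as filler for unused slots. *)
    let new_cap = max 4 (1 + t.size + t.size lsr 1) in
    let new_arr = Array.make new_cap x in
    Array.blit t.arr 0 new_arr 0 t.size;
    t.arr <- new_arr
  end;
  t.arr.(t.size) <- x;
  t.size <- t.size + 1;
  (* Plain, non-atomic increment: best-effort detection only. *)
  t.mod_count <- t.mod_count + 1

let set t i x =
  if i < 0 || i >= t.size then invalid_arg "set: index out of bounds";
  (* Changing the contents does not touch the counter. *)
  t.arr.(i) <- x

let iter f t =
  let expected = t.mod_count in
  for i = 0 to t.size - 1 do
    f t.arr.(i);
    (* Fail early if the structure changed while iterating. *)
    if t.mod_count <> expected then raise Concurrent_modification
  done
```

With this sketch, calling `push` from inside the callback passed to `iter` raises `Concurrent_modification` on the first element, while calling `set` does not.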

@c-cube
Contributor Author

c-cube commented Oct 24, 2022

I'm not against the modification counter, I'll look into it. The only sad part is that with both a fake "null" value, and a modification counter, that's a whole 2 words (more if unmarshalled) wasted per dynarray.

@gadmm
Contributor

gadmm commented Oct 24, 2022

That's the nice thing with Obj.new_block 1 0, you do not need to take care of sharing by hand since the runtime implements atoms as unique static data.
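
A small sketch of what this looks like, under the assumption stated in the comment above that the runtime hands back the same preallocated atom for every size-0 block with a given tag. The names `null` and `is_null` are hypothetical, and this relies on `Obj`, so it is an illustration of the trick rather than a recommendation.

```ocaml
(* Assumption: for size 0, Obj.new_block returns the runtime's shared
   atom for that tag, so no allocation or manual sharing is needed. *)
let null : Obj.t = Obj.new_block 1 0

(* Physical equality against the atom; never true for ordinary OCaml
   values as long as [null] does not leak out of the module. *)
let is_null (x : 'a) : bool = Obj.repr x == null
```

Under that assumption, a second call to `Obj.new_block 1 0` is physically equal to `null`, whereas the empty array `[||]` (the atom with tag 0) is not.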

@c-cube
Contributor Author

c-cube commented Oct 24, 2022 via email

@Gbury
Contributor

Gbury commented Oct 24, 2022

The tag in that case is not really important, as long as it is not 0. Currently, as far as I know, only the atom (block of size 0) with tag 0 is used, for empty arrays; the other atoms are unused (but still available). The "problem" with that solution is that if/when someone else tries to use a similar trick to implement some kind of nullable option type, then that person should use a tag different from the one you'll use for dynarrays.

@c-cube
Contributor Author

c-cube commented Oct 24, 2022

So is that really different than implementing #11583 using, say, let null = Obj.new_block 42 0 ?

@Gbury
Contributor

Gbury commented Oct 24, 2022

It's similar indeed, with the small difference that it is easier for someone else to use a similar trick and get unfortunate collisions if multiple implementations choose the same tag for their internal "null" value.

@gadmm
Contributor

gadmm commented Oct 24, 2022

and also <= 243. Sorry if I confused you, I thought 1 would be more natural than 42.

If you use it I guess it will be nice to document it somewhere in the runtime.

And the problem is not re-using the same tag, it is leaking the (non-)value to the outside world. It is fine for everyone to use the atom with the same tag if they keep it to themselves; in particular the person implementing the other option_array should simply not insert null inside a Dynarray (the idea is that null is not a valid value for any type, so the contract is respected).

@Ekdohibs
Contributor

If such a solution is implemented, it could be good to specify that tags lower (or higher) than some value are reserved by the compiler for future extensions, so that we can ensure no one will conflict with it. Even if Obj is not supposed to be used by outside code, it would still be slightly better to have this documented.

@gadmm
Contributor

gadmm commented Oct 24, 2022

For what it's worth, occurrences of atoms in the opam repository are with tags 0, String_tag, Double_array_tag, Abstract_tag and Object_tag (all four > 243). Other uses are marshalling-like (i.e. garbage-in garbage-out), such as some marshalling code inside Coq.

@xavierleroy
Contributor

xavierleroy commented Oct 25, 2022

I would much prefer a pure OCaml solution, i.e. without any use of Obj, using a ref () as dummy value and storing it inside the dynarray record. "Premature optimization is the root of all evil" (Knuth), and doubly so when it involves Obj.

@gadmm
Contributor

gadmm commented Oct 25, 2022

How is this not using Obj.magic and how is this less premature?

@xavierleroy
Contributor

OK, it's still using Obj.magic, sue me. It makes me less nervous than Obj.magic (Obj.new_block 42 0), however, because the latter makes more assumptions about data representations.

@nojb nojb added the stdlib label Dec 30, 2022
assert (not (array_is_empty_ a));
let new_array = Array.make newcapacity filler in
Array.blit a.arr 0 new_array 0 a.size;
fill_with_junk_ new_array a.size (newcapacity-a.size) ~filler;

This call to fill_with_junk_ seems redundant.
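
The reasoning is that `Array.make` already initialises every slot with `filler`, so after the blit the tail of the new array holds the filler without any further work. A minimal standalone sketch (the function name `grow` and its parameters are hypothetical, not taken from the diff):

```ocaml
(* Hypothetical helper: copy the first [size] elements of [arr] into a
   fresh array of length [newcapacity]. *)
let grow ~filler arr size newcapacity =
  (* Every slot, including the tail, starts out as [filler]. *)
  let new_array = Array.make newcapacity filler in
  (* Only the live prefix needs to be overwritten. *)
  Array.blit arr 0 new_array 0 size;
  new_array
```

For instance, `grow ~filler:0 [|1; 2; 3|] 3 6` yields `[|1; 2; 3; 0; 0; 0|]`.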


val of_array : 'a array -> 'a t
(** [of_array a] returns a array corresponding to the array [a].
Operates in [O(n)] time by making a copy. *)

The two occurrences of "array" are confusing. Also, I find the mention of the time complexity oddly redundant with the "copy". So, I suggest something along the lines of "[of_array a] returns a dynamic array whose content is a copy of the array [a]."

done
]} *)

val fit_capacity : 'a t -> unit

Since there is no prior art in OCaml's standard library yet, it might be worth using the same function name as in some other language rather than inventing yet another new name users will have to remember. Java uses trim_to_size, while C++ and Rust use shrink_to_fit.

provides the good behavior of amortized O(1) number of allocations
without wasting too much memory in the worst case. *)
let[@inline] next_grow_ n =
min Sys.max_array_length (1 + n + n lsr 1)

If the user has called fit_capacity on a very small array, some degenerate behavior will happen next, since this function will return a grown size smaller than the minimal size (i.e., 4). This can be avoided by prepending max 4, or by changing the formula a bit, e.g., n + (n + 5) lsr 1.

By the way, it is folklore that, with a non-compacting garbage collector, the optimal growth ratio is the golden ratio. So, choosing 1.5 in practice is less arbitrary than it looks.
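
The two suggestions can be sketched side by side as follows. The function names and the `min_capacity` constant are hypothetical; the value 4 is taken from the comment above, and the first variant is one reading of "prepending max 4".

```ocaml
(* Assumed minimal capacity, as mentioned in the review comment. *)
let min_capacity = 4

(* Variant 1: keep the formula from the diff, clamped from below. *)
let next_grow_clamped n =
  min Sys.max_array_length (max min_capacity (1 + n + n lsr 1))

(* Variant 2: the adjusted formula from the review comment.
   Note that [lsr] binds tighter than [+]. *)
let next_grow_alt n =
  min Sys.max_array_length (n + (n + 5) lsr 1)
```

Both variants grow by roughly 1.5x for large `n`; they differ only in how small capacities are handled.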

@gasche gasche mentioned this pull request Jan 11, 2023
@gasche
Member

gasche commented Jan 11, 2023

I rebased this PR on top of trunk and wrote follow-up commits to turn it into a boxed implementation and side-step completely the question of which Obj magic we should be using: see #11882. (I think we could revisit Obj later, after some years enjoying the use of a pure-OCaml Dynarray module in the standard library.)
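
For readers unfamiliar with the term, a boxed representation could look roughly like the sketch below. This is not the code from #11882: it assumes `'a option` as the slot type purely for illustration, and all names are hypothetical. The point is that `None` serves as the empty marker, so no `Obj` and no filler element are needed, at the cost of one extra allocation per element.

```ocaml
(* Hypothetical boxed dynarray, pure OCaml. *)
type 'a t = {
  mutable arr : 'a option array;  (* None marks an unused slot *)
  mutable size : int;
}

let create () = { arr = [||]; size = 0 }

let push t x =
  if t.size = Array.length t.arr then begin
    (* Grow by roughly 1.5x with an assumed minimum capacity of 4. *)
    let new_cap = max 4 (1 + t.size + t.size lsr 1) in
    let new_arr = Array.make new_cap None in
    Array.blit t.arr 0 new_arr 0 t.size;
    t.arr <- new_arr
  end;
  t.arr.(t.size) <- Some x;
  t.size <- t.size + 1

let get t i =
  if i < 0 || i >= t.size then invalid_arg "get: index out of bounds";
  match t.arr.(i) with
  | Some x -> x
  | None -> assert false  (* invariant: slots below [size] are filled *)

let pop t =
  if t.size = 0 then None
  else begin
    let i = t.size - 1 in
    let x = t.arr.(i) in
    t.arr.(i) <- None;  (* drop the reference so the GC can reclaim it *)
    t.size <- i;
    x
  end
```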

@c-cube
Contributor Author

c-cube commented Jan 18, 2023

closing since @gasche picked up the torch.

@c-cube c-cube closed this Jan 18, 2023
@gasche
Member

gasche commented Jan 18, 2023

(I'm still curious about your "impact of boxing" benchmarks measurements ;-)
