Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add create_int_array #7

Closed
wants to merge 1 commit into from

Conversation

Sudha247
Copy link
Contributor

This patch adds a primitive Task.create_int_array n which returns an uninitialised integer array of length n.

Stock OCaml and in-effect Multicore OCaml have Array.create_float which returns a fresh float array. There is no such primitive (yet) for integer arrays. The Array.init primitive walks through the array sequentially to initialise values in the array, which may be a bottleneck in Multicore OCaml programs.

The idea is to use this primitive to allocate an uninitialised array and subsequently do the initialisation in parallel. An easy way to do this is by using the parallel_for primitive. One such example is provided in test/par_init.ml.

@kayceesrk
Copy link
Contributor

Wonderful. Thanks for including the API docs and example! Will merge once the CI turns green.

@stedolan
Copy link

I don't think this function quite works. Int arrays get scanned by the GC, and the GC will crash if it follows an uninitialised pointer.

@kayceesrk
Copy link
Contributor

The tag remains Double_array_tag: https://github.com/ocaml-multicore/domainslib/pull/7/files#diff-19d2ab33c85af4134534e6b10a0cb202R15. It should be GC safe?

@stedolan
Copy link

Oh, I see. That's unsafe in a different way: if you call any of the generic-array functions (e.g. Array.get), they'll think it's a double array and box the result.

@kayceesrk
Copy link
Contributor

kayceesrk commented Apr 29, 2020

Ok. I didn't consider that issue. I see two potential solutions (with their own issues):

(1) Allocate the float array and change the tag to Abstract_tag with Obj.set_tag. Of course, Obj.set_tag is deprecated since 4.09. But here's a solution that can undeprecate set_tag: darken the object whose tag is being set. Since the unsafety arises from overwriting marking by a GC thread, darkening should take care of this. The downside is that if the object becomes unreachable immediately, it will be GCed only in the next cycle. Is this sound? @stedolan

(2) Use Obj.new_block to allocate the array with Abstract_tag. But it looks like Obj.new_block unconditionally initializes the object, which is unnecessary since caml_alloc conditionally initializes objects if necessary. These lines can safely be removed: https://github.com/ocaml/ocaml/blame/6923fd159087049250c5df74baaccd3ef7c4473b/runtime/obj.c#L94-L95

@stedolan
Copy link

stedolan commented May 4, 2020

But here's a solution that can undeprecate set_tag: darken the object whose tag is being set. Since the unsafety arises from overwriting marking by a GC thread, darkening should take care of this

I don't think this works either. There's still a race, and if the GC thread wins it'll clobber the set_tag (I agree that this makes it not segfault though, but a set_tag that only sometimes works isn't good)

I think a possible solution here is to use strings instead of arrays as the backing for the intarray type. There's already runtime support for allocating uninitialised strings, and there are no GC worries since the GC never scans strings.

I've prototyped something along these lines here: https://gist.github.com/stedolan/f0cb0f5ab24f283d398e48763db1ae62

@kayceesrk
Copy link
Contributor

I don't think this works either. There's still a race, and if the GC thread wins it'll clobber the set_tag (I agree that this makes it not segfault though, but a set_tag that only sometimes works isn't good)

Ok. I missed that case. Concurrency is hard.

@Sudha247 Stephen's solution is the best way forward as it does not mess about with Obj at all. We should get this into domainslib, possibly as a separate Array module.

@Sudha247
Copy link
Contributor Author

Sudha247 commented May 5, 2020

Closing this in favour of Stephen's solution. I will make a separate PR for that.

I have a question though, on @stedolan's code. The primitives caml_bytes_set64u and caml_bytes_get64u don't appear to be part of 4.06.1, meaning this doesn't work with 4.06.1+multicore. But this does work well with parallel_minor_gc_4_10. So, do we just specify in the API docs that this wouldn't work with 4.06.1+multicore?

@Sudha247 Sudha247 closed this May 5, 2020
@stedolan
Copy link

stedolan commented May 5, 2020

I'm surprised it doesn't work in 4.06.1 - I think these primitives were added in 4.01 by ocaml/ocaml#5771

@Sudha247
Copy link
Contributor Author

Sudha247 commented May 5, 2020

I might be incorrect; but I got this error message when I tried to build this on 4.06.1+multicore

Error: Unknown builtin primitive "%caml_bytes_set64u"

I did a git grep on the source and did not find anything matching this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants