New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinguish root and heap values in Lambda.Initialization #673

Merged
merged 1 commit into from Dec 27, 2016

Conversation

Projects
None yet
7 participants
@let-def
Contributor

let-def commented Jul 7, 2016

Flambda introduced a distinction between assignment and initialization in Psetfield.
What was really meant was "initialization of roots", as these only require a cheap store compared to a call to caml_modify.

However, a more general notion of field initialization would be less "error-prone" and still useful for the middle-end.

This patch propose to distinguish In_heap and Root initializations.
In_heap being equivalent to caml_initialize, Root being a simple store.

The backend doesn't have any use for In_heap (it is not produced by the frontend anyway).
The intention is that the middle-end could benefit from the distinction between arbitrary mutations and simple delayed initialization on otherwise pure values. (We briefly discussed that with @chambart and @mshinwell).

My latest port of TRMC also makes use of it.

Other remarks

Initialization In_heap is compiled to a call to caml_initialize.

The primitive didn't check if the field was in the minor heap or not. That made it many times slower (8x on a simple loop benchmark doing alloc + init on my machine) than caml_modify.
Not adding the field to the remembered set if in minor heap seems correct though.

The last patch inline this check... It is provided more for information and I plan to remove before merging. I saw a negligible benefit in my benchmark (I guess my Intel CPU did some smartness seeing many calls caml_initialize). Slightly better was hand tuned code using %r15...

@lpw25

This comment has been minimized.

Show comment
Hide comment
@lpw25

lpw25 Jul 7, 2016

Contributor

I haven't looked at the code, but this sounds like the same distinction that is used in multicore OCaml as well. (In fact I mistook the "initialization" case in Flambda for the one in multicore OCaml in one of my other patches, which lead to a hard to track down bug).

Contributor

lpw25 commented Jul 7, 2016

I haven't looked at the code, but this sounds like the same distinction that is used in multicore OCaml as well. (In fact I mistook the "initialization" case in Flambda for the one in multicore OCaml in one of my other patches, which lead to a hard to track down bug).

@mshinwell

This comment has been minimized.

Show comment
Hide comment
@mshinwell

mshinwell Jul 8, 2016

Contributor

I am in favour of this change, but don't currently have time to review the patch.

Contributor

mshinwell commented Jul 8, 2016

I am in favour of this change, but don't currently have time to review the patch.

Show outdated Hide outdated bytecomp/lambda.mli
split [Initialization] into two cases, depending on whether the place
being initialized is in the heap or not. *)
| Initialization
| Initialization of initialization_place

This comment has been minimized.

@alainfrisch

alainfrisch Jul 9, 2016

Contributor

Instead of adding an argument, you could also split the Initialization constructor in two (e.g. Init_in_heap / Init_root). Currently, a single pattern matching (in selectgen) benefits from the fact that these two cases are treated in a uniform way, which suggests that the two "sub-cases" are as different as they are from the other case.

@alainfrisch

alainfrisch Jul 9, 2016

Contributor

Instead of adding an argument, you could also split the Initialization constructor in two (e.g. Init_in_heap / Init_root). Currently, a single pattern matching (in selectgen) benefits from the fact that these two cases are treated in a uniform way, which suggests that the two "sub-cases" are as different as they are from the other case.

@let-def

This comment has been minimized.

Show comment
Hide comment
@let-def

let-def Jul 14, 2016

Contributor

@alainfrisch: updated, I used Root_initialization and Heap_initialization to keep the "unabbreviated name" style.

Contributor

let-def commented Jul 14, 2016

@alainfrisch: updated, I used Root_initialization and Heap_initialization to keep the "unabbreviated name" style.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Dec 4, 2016

Contributor

@lpw25 @mshinwell @chambart @alainfrisch or any one else: please review and comment.

My own quick review is that this is good code but in the initial merge I'd rather not have the fast path of caml_initialize being inlined. (Just because it's an orthogonal concern, and we should have a separate discussion about the merits and demerits of inlining bits and pieces of runtime functions.)

Contributor

xavierleroy commented Dec 4, 2016

@lpw25 @mshinwell @chambart @alainfrisch or any one else: please review and comment.

My own quick review is that this is good code but in the initial merge I'd rather not have the fast path of caml_initialize being inlined. (Just because it's an orthogonal concern, and we should have a separate discussion about the merits and demerits of inlining bits and pieces of runtime functions.)

Show outdated Hide outdated byterun/memory.c
@@ -586,7 +586,7 @@ CAMLexport CAMLweakdef void caml_initialize (value *fp, value val)
{
CAMLassert(Is_in_heap(fp));
*fp = val;
if (Is_block (val) && Is_young (val)) {
if (!Is_young((value)fp) && Is_block (val) && Is_young (val)) {

This comment has been minimized.

@stedolan

stedolan Dec 7, 2016

Contributor

I'm unsure what this change is for.

Currently, caml_initialize only works on the major heap, and it is an error to use it on minor heap objects. You seem to be relaxing this restriction with the new check (which I think is a good idea, multicore already works that way).

However, you're careful to not use the newly relaxed caml_initialize (inlining the minor heap check in cmmgen.ml), and you haven't changed the Is_in_heap assertion to Is_in_heap_or_young, so I don't know what your intent is.

@stedolan

stedolan Dec 7, 2016

Contributor

I'm unsure what this change is for.

Currently, caml_initialize only works on the major heap, and it is an error to use it on minor heap objects. You seem to be relaxing this restriction with the new check (which I think is a good idea, multicore already works that way).

However, you're careful to not use the newly relaxed caml_initialize (inlining the minor heap check in cmmgen.ml), and you haven't changed the Is_in_heap assertion to Is_in_heap_or_young, so I don't know what your intent is.

@stedolan

This comment has been minimized.

Show comment
Hide comment
@stedolan

stedolan Dec 7, 2016

Contributor

You also need to change make_alloc in cmmgen.ml, which uses addr_array_set to initialise objects that are too large for the major heap, mislabelling some initialisations as assignments.

Contributor

stedolan commented Dec 7, 2016

You also need to change make_alloc in cmmgen.ml, which uses addr_array_set to initialise objects that are too large for the major heap, mislabelling some initialisations as assignments.

@mshinwell

This comment has been minimized.

Show comment
Hide comment
@mshinwell

mshinwell Dec 9, 2016

Contributor

@let-def Can you address Stephen and Xavier's comments and then I will review this as well?

Contributor

mshinwell commented Dec 9, 2016

@let-def Can you address Stephen and Xavier's comments and then I will review this as well?

@let-def

This comment has been minimized.

Show comment
Hide comment
@let-def

let-def Dec 13, 2016

Contributor

@stedolan Thanks, I missed the make_alloc occurrence.

The intent is indeed to allow caml_initialize to be used with minor heap, I just forgot to update the assertion.
I think I saw existing code doing that, I will maybe check with some code search tools (I know how to use github search, I don't know with if there is any tool with opam universe).

Contributor

let-def commented Dec 13, 2016

@stedolan Thanks, I missed the make_alloc occurrence.

The intent is indeed to allow caml_initialize to be used with minor heap, I just forgot to update the assertion.
I think I saw existing code doing that, I will maybe check with some code search tools (I know how to use github search, I don't know with if there is any tool with opam universe).

@let-def

This comment has been minimized.

Show comment
Hide comment
@let-def

let-def Dec 13, 2016

Contributor

@mshinwell I removed the inlining and adressed Stephen issues. I think you can start :)

Contributor

let-def commented Dec 13, 2016

@mshinwell I removed the inlining and adressed Stephen issues. I think you can start :)

@mshinwell

This comment has been minimized.

Show comment
Hide comment
@mshinwell

mshinwell Dec 27, 2016

Contributor

OK.
Please resolve conflicts.

Contributor

mshinwell commented Dec 27, 2016

OK.
Please resolve conflicts.

@mshinwell mshinwell added the approved label Dec 27, 2016

@mshinwell mshinwell added this to the 4.05.0 milestone Dec 27, 2016

add Lambda.initialization_place
Lambda.Initialization is now parameterized by the place where the
mutated value is expected to be.

The previous semantics of [Initialization] are now obtained by
[Initialization Root].
The new [Initialization In_heap] behaves like [caml_initialize]
C-primitive.
@let-def

This comment has been minimized.

Show comment
Hide comment
@let-def

let-def Dec 27, 2016

Contributor

Rebased.

Contributor

let-def commented Dec 27, 2016

Rebased.

@mshinwell mshinwell merged commit d1eecfc into ocaml:trunk Dec 27, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@stedolan

This comment has been minimized.

Show comment
Hide comment
@stedolan

stedolan Dec 29, 2016

Contributor

I don't think the new caml_initialize is correct. It's been generalised to accept young objects as well, but the check to add objects to the ref_table is still:

Is_block (val) && Is_young (val)

I think it should check whether fp is young as well, to prevent young fp being added to the ref_table, since the minor GC assumes that the ref_table contains only the addresses of fields in the major heap.

Contributor

stedolan commented Dec 29, 2016

I don't think the new caml_initialize is correct. It's been generalised to accept young objects as well, but the check to add objects to the ref_table is still:

Is_block (val) && Is_young (val)

I think it should check whether fp is young as well, to prevent young fp being added to the ref_table, since the minor GC assumes that the ref_table contains only the addresses of fields in the major heap.

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Dec 29, 2016

Member

I agree with @stedolan. The minor GC, as it currently works, might not break because of this, but we still don't want the ref_table pointing into the minor heap.

Member

damiendoligez commented Dec 29, 2016

I agree with @stedolan. The minor GC, as it currently works, might not break because of this, but we still don't want the ref_table pointing into the minor heap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment