Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement Obj.reachable_word using a hash table to detect sharing #9678

Merged
merged 5 commits into from
Jun 23, 2020

Conversation

xavierleroy
Copy link
Contributor

The previous implementation (caml_obj_reachable_words in runtime/obj.c) was modifying the mark bits of block headers to detect sharing. This is not compatible with Multicore OCaml.

The marshaler used to play the same trick. In #9353 it was modified to use a hash table instead of mark bits to track blocks already seen.

Likewise, this PR reimplements caml_obj_reachable_words to use a hash table.

The implementation reuses the hash table used to detect sharing during marshaling and other bits of the marshaler, hence the function caml_obj_reachable_words was moved to runtime/extern.c.

A change of semantics is required in no-naked-pointers mode. Currently, Obj.reachable_word does not count data that is statically allocated outside the OCaml heap, e.g. ocamlopt-generated structured constants. In no-naked-pointers mode and later in Multicore OCaml we want to get rid of the page table, and will have no way to distinguish statically-allocated data from heap-allocated data, and we must count both.

The above is the first commit of this PR, 50d51c4.

Now there is a funny test that checks that Obj.reachable_word on a structured constant returns 0 in native-code (because out-of-heap static allocation) and the actual size in bytecode. This test fails in no-naked-pointers mode, because the actual size is returned in both case. Hence, commit a31cbc1 splits the funny test into its own file, and conditionalizes the test on the fact that naked pointers are supported.

To support this, commit bb13bb5 adds a naked_pointers action to ocamltest.

To enable that, commit 3e92cae teaches configure to put a NAKED_POINTERS={true,false} in Makefile.config. (Before, the "naked pointer" info was only in runtime/caml/m.h)

Phew.

Special mention to @alainfrisch (original author of Obj.reachable_words) and to @shindere (for the configure and testing aspects).

mlsize_t sz = Wosize_hd(hd);
/* Infix pointer: go back to containing closure */
if (tag == Infix_tag) {
v = v - Infix_offset_hd(hd);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if v was already visited here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to. I chose to enter the closure itself in the hash table, but not its infix pointers. Sharing will be detected and recorded after we go back to the containing closure.

Copy link
Contributor

@hhugo hhugo Jun 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that makes sense.

@xavierleroy
Copy link
Contributor Author

This PR needs to be reviewed.

Copy link
Contributor

@jhjourdan jhjourdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be better to reorder the commits of this PR, to make sure that tests succeed even in intermediary commits, for easier bisecting.

The documentation of Obj.rachable_words need to be updated to the new behavior.

Apart from that, this looks good to me.

}
/* Pop one more item to traverse, if any */
if (sp == extern_stack) break;
v = *((sp->v)++);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
v = *((sp->v)++);
v = *(sp->v++);

Spurious parentheses (just because I like nitpicking).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was patterned after line 831. I like the extra parentheses because otherwise I tend to read this as sp->(v++), which makes no sense of course.

We have NAKED_POINTERS=false if configured with --disable-naked-pointers,
NAKED_POINTERS=true otherwise.
xavierleroy added a commit to xavierleroy/ocaml that referenced this pull request Jun 22, 2020
@xavierleroy
Copy link
Contributor Author

It should be better to reorder the commits of this PR, to make sure that tests succeed even in intermediary commits, for easier bisecting.

Right. Done!

The documentation of Obj.rachable_words need to be updated to the new behavior.

Done.

I'll wait some more to give @shindere a chance to comment on the changes to configure and ocamltest.

@shindere
Copy link
Contributor

shindere commented Jun 22, 2020 via email

@shindere
Copy link
Contributor

shindere commented Jun 22, 2020 via email

@xavierleroy
Copy link
Contributor Author

Re: the "naked_pointers" ocamltest action, I think it would be better to
add it to ocaml_actions rather than to the builtin _actions module
because it is specific to the OCaml compiler. Think of builtin_actions
as actions that could be relevant to otehr contexts than the one of the
OCaml compiler.

I was unaware of this distinction, but I just followed what was already done. The following actions are in builtin_actions:

hasinstrumentedruntime
hasunix
libunix
libwin32unix
hassysthreads
hasstr
windows
not_windows
bsd
not_bsd
macos
arch32
arch64
arch_arm
arch_arm64
arch_power
function_sections
has_symlink

The naked_pointers test is conceptually very similar to hasinstrumentedruntime or function_sections: it's a characteristic of the runtime system.

Also, regarding the text associated with this action, could it be made
clearer?

Again I copied function_sections which says "Target supports function sections". But I'm happy to put "Runtime system supports naked pointers", that's more precise indeed.

It succeeds if the runtime system supports naked pointers,
and fails otherwise (e.g. if configured with --disable-naked-pointers).
Part of the test makes sense only if the runtime system supports
naked pointers and has a page table to distinguish major heap pointers
from out-of-heap pointers.

This part is split off in a new test, lib-obj/reachable_words_np.ml,
conditionalized on "naked_pointers".
The previous implementation (caml_obj_reachable_words in runtime/obj.c)
was using the mark bits of block headers to detect sharing.

This is not compatible with Multicore OCaml.

This commit reimplements caml_obj_reachable_words to detect sharing
with a hash table of addresses already seen.

The implementation reuses the hash table used to detect sharing
during marshaling (commit 67ada54), and other bits of the marshaler,
hence the function caml_obj_reachable_words was moved to runtime/extern.c.

In no-naked-pointers mode, to anticipate the disappearance of the page
table, statically-allocated blocks cannot be treated specially and will
be counted towards the size.  This change of semantics is mentioned
in the documentation for Obj.reachable_words.
@shindere
Copy link
Contributor

shindere commented Jun 22, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants