Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

classify_addr.h: explain the no_naked_pointers value model #9684

Conversation

gasche
Copy link
Member

@gasche gasche commented Jun 15, 2020

We are planning to support two configurations (with or without
naked pointers) in the runtime for at least a couple years, so I think
it is useful to explain their value models and corresponding runtime
functions/macros. This should help runtime developers reason about the
code written to support both modes.

Most of the actual content in this PR comes from a comment by @xavierleroy
in #9683 (comment)

(Xavier and I have different sensibilities about how much explanations should be given about internal details, so this PR may turn out to be rejected as too explanatory, which is okay. That said, I think that the comments in the middle of a bunch of macro definitions is a not-too-plublic place to have internal comments.)

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. The wording of the Is_in_static_data case is not quite right, see below.

Comment on lines 28 to 31
- a pointer outside the heap to a well-structured block;
an interesting subset of those are the structured constants
statically-allocated by OCaml code or the OCaml runtime
(Is_in_static_data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit pick: if the well-structured block is not in a page registered as "static data", i.e. if Is_in_static_data returns false, it is treated as a naked pointer. For example, generic comparison will not traverse it. So the 4th case could be
"a pointer to a well-structured block, part of a structured constant statically allocated by ocamlopt (Is_in_static_data)".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, being a naked pointer or not is an extensional property: if you point to a valid OCaml block, you are not naked, naked pointers are those into garbage. (This is the way I understand what some maintainers write when they say that, to avoid naked pointers, one can sometimes use Caml_out_of_heap_header for blocks allocated outside the heap.)

This is why I rephrased your message to say that there are well-structured outside-the-heap blocks and naked pointers, and that Is_in_static_data is a well-defined subset of the well-structured blocks. If a pointer falls outside the "value area", then we don't know whether it is naked or not, and the current implementation must avoid following it (because it may be).

You seem to use a different interpretation of the term "naked pointer", all pointers that go outside the value area. I find it confusing. Are we sure we want to go this way?

(In both interpretations, code pointers are naked pointers.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rephrased the comment to clarify these nuances in a way that I hope you will be happy with. This part (there are other minor changes) now reads as follows:

   - a pointer to a constant block statically-allocated by OCaml code
     or the OCaml runtime (Is_in_static_data)
   - a "foreign" pointer, which is none of the above; the destination
     of those pointers may be a well-formed OCaml blocks, but it may
     also be a naked pointer.

[...] The runtime
conservatively assumes that all foreign pointers may be naked
pointers, and uses the page table to not dereference/follow them.

@xavierleroy
Copy link
Contributor

Oh, and by the way, I'm fine with implementation comments that explain the implementation. It's when it comes to the user manual that I'd rather omit implementation details.

We are planning to support two configurations (with or without
naked pointers) in the runtime for at least a couple years, so I think
it is useful to explain their value models and corresponding runtime
functions/macros. This should help runtime developers reason about the
code written to support both modes.

Most of the actual content in this PR comes from a comment by Xavier
Leroy in
  ocaml#9683 (comment)
@gasche gasche force-pushed the document-address-macros-for-no-naked-pointers-code branch from 4263c95 to 4ebb7e8 Compare June 16, 2020 08:30
Typo + wording
Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the new wording better, thank you.

I fixed a typo directly.

@sadiqj
Copy link
Contributor

sadiqj commented Jun 16, 2020

Currently the major GC asserts that a block outside the major heap must have a black header: https://github.com/ocaml/ocaml/blame/trunk/runtime/major_gc.c#L167 . Is this restriction worth including in the documentation?

As there's a requirement in the major GC that the header is black, it means the contents of the block won't be scanned during marking. Right now wording on the documentation allows for blocks outside the major heap that can be scanned, I think that might be worth strengthening. Something like:

(Well-formed blocks outside the heap will not be scanned by the GC, one can use the Caml_out_of_heap_header from mlvalues.h to ensure this.)

@xavierleroy xavierleroy merged commit 58b3efd into ocaml:trunk Jun 16, 2020
@gasche
Copy link
Member Author

gasche commented Jun 16, 2020

@sadiqj that's a good question, thanks. I'm not sure whether this is something we want to mandate as part of the value model, or more a debug check for the current implementation of no-naked-pointers.

@xavierleroy
Copy link
Contributor

I'm not sure whether this [out-of-heap blocks have black headers] is something we want to mandate as part of the value model

I was not sure either, but @damiendoligez wants it very much, in particular for future work on the compactor, and also because it makes it safe to have out-of-heap blocks in read-only parts of memory. (A white header would be turned into a black header at the first marking of the out-of-heap block.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants