Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'noinit' support to initializers #8792

Open
bradcray opened this issue Mar 12, 2018 · 13 comments
Open

Add 'noinit' support to initializers #8792

bradcray opened this issue Mar 12, 2018 · 13 comments
Labels

Comments

@bradcray
Copy link
Member

bradcray commented Mar 12, 2018

As a Chapel Programmer, I want to be able to use noinit to squash default initialization of certain types, particularly large arrays, and to be able to define my own initializers that respond to noinit such that fields within them can be left uninitialized as well because sometimes the speed of avoiding initialization trumps the safety benefits that initialization brings.

Acceptance criteria:
noinit is supported for arrays and users can write their own initializers that respond to noinit

For reference, CHIP 10 and CHIP 12 have a bit of information on the vision for this feature, though I think we've also worked up some more detailed examples at times that don't seem to be reflected there.

@bradcray bradcray added the Epic label Mar 12, 2018
@benharsh
Copy link
Member

benharsh commented May 14, 2018

A handful of existing noinit tests fail under --force-initializers:

  • expressions/lydia/noinit/recordHasClassField
  • expressions/lydia/noinit/recordHasRecordField
  • expressions/lydia/noinit/tupleWithRecord
  • expressions/lydia/noinit/usedWithRecord
  • expressions/lydia/noinit/recordHasComplicatedRecordField
  • expressions/lydia/noinit/recordHasTypeField
  • expressions/lydia/noinit/recordIncorrectAssignment
  • functions/lydia/userDefinedDefaultFunction/redefine-param-record
  • functions/lydia/userDefinedDefaultFunction/redefine-type-record

@bradcray
Copy link
Member Author

I think I'd be OK with futurizing noinit tests while switching to initializers given that (IIRC), we never had it working for arrays to begin with which is the important case, and will likely need/want to reimplement it from scratch in an initializers world.

@gbtitus
Copy link
Member

gbtitus commented Dec 13, 2018

While designing this, we should keep in mind that if we're using a comm layer that can allocate memory and need to register that memory for best performance (currently just ugni), we want the registration to follow the first-touch so that we get the desired NUMA locality. As of this writing the first touch is always our default initialization, so we can do the registration call right after that. If in the future we don't do the default initialization then we'll want to delay the registration call until after whatever does the first touch. For use cases where the declaration includes a user-written initial value that obviates the need for default init it seems like it ought to be straightforward for the compiler to insert (however indirectly) the registration call. But for cases where the first touch is done elsewhere we'll need to think about what to do. Note that there's no functional effect from not registering, but when program performance is driven by the cost of remote references the difference between registering and not can be quite dramatic.

@mppf
Copy link
Member

mppf commented Jul 12, 2020

I think we have recently been thinking a record author would write proc noinit() to implement support for noinit, rather than having a param argument on the default initializer.

An interesting alternative (that might make sense to the compiler but maybe not to humans) would be to express the noinit as copy-initialization from a special type, e.g. proc SomeType.init=(rhs: noinit) ).

Besides the exact naming for the initializer run on var x: MyRecord = noinit;, I have the following questions:

Supposing a record is initialized with var x: MyRecord = noinit; and that runs MyRecord.noinit(). Should a MyRecord.postinit() function be run immediately after MyRecord.noinit()? Or should records initialized in this manner always skip postinit ?

What happens with var x: MyRecord = noinit; when proc MyRecord.noinit() does not exist? I can see these options:

  1. Produce a compilation error
  2. Default-initialize x
  3. Leave x completely uninitialized

What does var y: int = noinit do? In some ways this is similar to the above case (because int isn't a record, in a strict sense it doesn't (yet) have initializers, so it can't have a proc noinit()).

  1. Produce a compilation error
  2. Default-initialize y
  3. Leave y completely uninitialized

What happens when a no-inited variable reaches end of scope? Here I understand the options to be:

  1. It could be that the compiler doesn't add a deinit call for no-inited variables at all
  2. It could be that a type supporting no-init tracks at runtime - generally with a field - to what degree the object is inited/deinitialized and writes a proc deinit that operates accordingly.

@mppf
Copy link
Member

mppf commented Jul 12, 2020

What happens when a no-inited variable reaches end of scope? ... It could be that the compiler doesn't add a deinit call for no-inited variables at all

The problem that I see with this is that the compiler could easily lose track of a variable being noinited when it is passed by in intent. Consider for example

{
  var x: MyRecord = noinit;
  acceptInIntent(x); // copy-elision applies
}
proc acceptInIntent(in arg: MyRecord) {
   // compiler runs arg.deinit() here at the end of this function,
   // but that might not make sense... e.g. MyRecord might have some
   // key fields not set.
}

This kind of pattern currently comes up in compiler-generated initializers for classes and records.

@mppf
Copy link
Member

mppf commented Jul 12, 2020

What happens with var x: MyRecord = noinit; when proc MyRecord.noinit() does not exist?
What does var y: int = noinit do?

I think that case is generally better served by split-init. However, issue #15808 describes a case where one might wish to do something like split-init in a case where it is not currently implemented. If we do not change the split-init rule for on blocks, the situation will be that the programmer will need to opt in in some way to moveing the variable across locales. While noinit could be a way to express that, it seems to me that doing so would have noinit solve two problems rather than one. I would prefer instead that noinit be focused on the case of data structures where the data structure author writes code to support not initializing some elements. (i.e. cases similar to the array case).

This causes me to lean away from the option of leaving these values uninitialized.

@mppf
Copy link
Member

mppf commented Jul 12, 2020

I understand that we have been reserving noinit for this, but since (in my understanding anyway) noinit is really about allowing a type to do less initialization (e.g. allocate array elements but do not initialize them) - why not just expect record authors to provide initializers that do that? See also #15703 (comment)

In particular, suppose a record has a buffer that it manages and the record author would like to provide a feature where the buffer is allocated but the elements are not initialized. We have been thinking that the record author would write this:

record Container {
  type eltType;
  var buffer: [1..n] eltType;
  proc noinit(type eltType) {
    this.eltType = eltType;
    this.buffer = noinit;
  }
}

and then the user could write

var c : Container(int) = noinit;

But, is that really better than having the record author write this:

record Container {
  type eltType;
  var buffer;
  proc init(type eltType, param initElts=true) {
    this.eltType = eltType;
    this.buffer = {1..n}.buildArray(int, initElts=initElts);
  }
}

and then the user could write

var c = new Container(int, initElts=false);

?

(Either way, as I understand it, the user of the container will have to do other relatively involved things: explicitly move any non-pod elements into the storage & indicate to the data structure when all elements are initialized).

I don't really see why this feature needs language support. I think it could be implemented at the library level.

@lydia-duncan
Copy link
Member

But, is that really better than having the record author write this

This may be unreasonable on my part, but I have a different reaction to exposing aspects of ordinary array initialization to the user/record author versus exposing noinit-specific created methods.

@mppf
Copy link
Member

mppf commented Jul 13, 2020

This may be unreasonable on my part, but I have a different reaction to exposing aspects of ordinary array initialization to the user/record author versus exposing noinit-specific created methods.

I'm not trying to propose that D.buildArray specifically be the interface. Merely trying to say that I don't currently see why we need a keyword to implement the noinit concept. Either way, for arrays, I think we will need some special methods that will be used along with noinit. (e.g elementInitializationComplete() in #15703 (comment) ). Could the process of constructing a noinit array also just be done by calling a special method?

@bradcray
Copy link
Member Author

An interesting alternative (that might make sense to the compiler but maybe not to humans) would be to express the noinit as copy-initialization from a special type, e.g. proc SomeType.init=(rhs: noinit) ).

I think we (@benharsh and I) discussed this when we were introducing init=, but didn't get excited about it for whatever reason (I think some forms we kicked around included init=() (no arguments), init=(noinit), etc. I think our general sense was that either form was essentially introducing a special-case and that proc noinit() had the advantage of seeming more straightforward / self-explanatory (to someone accustomed to using it on an array of reals, say).

Supposing a record is initialized with var x: MyRecord = noinit; and that runs MyRecord.noinit(). Should a MyRecord.postinit() function be run immediately after MyRecord.noinit()? Or should records initialized in this manner always skip postinit ?

That's an interesting question. I'd expect them to skip .postinit() by default ("since I'm not initializing it, I'm not running the thing that should happen once its been initialized"). I also think postinit() is less interesting for records than for class hierarchies—in that if there's something I wanted to do in postinit(), I could do it directly within init() / noinit() for a record, but not always for a [parent] class. So I don't think there's loss of capability by adopting this behavior.

What happens with var x: MyRecord = noinit; when proc MyRecord.noinit() does not exist? I can see these options:

I would expect:

Leave x completely uninitialized

What does var y: int = noinit do? In some ways this is similar to the above case (because int isn't a record, in a strict sense it doesn't (yet) have initializers, so it can't have a proc noinit()).

I would expect:

Leave y completely uninitialized

What happens when a no-inited variable reaches end of scope? Here I understand the options to be:

I would expect:

It could be that the compiler doesn't add a deinit call for no-inited variables at all

@mppf
Copy link
Member

mppf commented Jul 15, 2020

What happens when a no-inited variable reaches end of scope? Here I understand the options to be:

I would expect:

It could be that the compiler doesn't add a deinit call for no-inited variables at all

Did you see #8792 (comment) ? I think the problem shown in that comment makes this strategy unworkable. The reason is that a noinited variable cannot be passed by in intent until it is fully initialized - due to the way deinitialization works - even if the in intent will "move" the variable. For this reason, the prototype in PR #16064 uses the other option - of tracking at runtime with a field whether the elements need deinitialization. This seems to be needed at runtime in any case for arrays to perform resizing per your other comment .

I would expect:

Leave x completely uninitialized

Do you know of any use cases where this is useful? At the moment I feel I understand the purpose of no-init for arrays but am less confident that no-initializing a stack variable or field is useful (as compared to split initialization).

It could help with an explicit way to write #15808 - but similarly to #15703 (comment) we have the problem that the way of eventually setting it differs depending on whether or not x had proc noinit() run or if it was completely uninitialized. (Noinit -> initialize with =; but uninitialized -> initialize with moveInitialize).

If we consider the case where somebody might want to opt-in to something like a split init pattern across an on statement - if noinit does that, they would write e.g.:

{
  var A:[D] eltType = noinit;
  on AnotherLocale {
    forall i in D {
      moveInitialize(A[i], something());
    }
  }
}

But what if they wanted this code to be generic on the type? Meaning they don't know it's an array? Then would they write:

{
  var A: SomeType = noinit;
  on AnotherLocale {
    moveInitialize(A, something());
  }
}

and expect moveInitialize to move initialize the elements of the array rather than completely replacing the LHS? Does that imply that we will need not only proc noinit() but also something like proc moveinit() ?

Or, should noinit be a low-level feature that just behaves differently for each type?

@mppf
Copy link
Member

mppf commented Jul 16, 2020

After a discussion with @bradcray the plan for now is to allow noinit as a feature only for arrays of POD types. Discussion for things like moveInitialize will be put off into a separate issue. These features might still be worthwhile but are not specifically connected to this effort any longer since = will do for initializing a POD type.

@mppf
Copy link
Member

mppf commented Jul 30, 2020

PR #16064 helps with array no-init. I am currently thinking that array no-init should leave the array memory totally uninitialized and if we want to have records that also support some kind of no-initialization of storage that they should do so with a param initElts or similar argument to their initializer. We are adding the = noinit syntax for arrays because it is particularly useful for array declarations. I don't think that implies that noinit needs to be (or can be) a general feature that all records are expected to support in a uniform way.

mppf added a commit that referenced this issue Jul 30, 2020
Adjust domain map implementations to no longer require pragma

For issue #15673.
Also for issues #15703 and #8792.

This PR:
 * adjusts the domain map interface to include
   `dsiElementDeinitializationComplete`
 * removes `pragma "no auto destroy"` from domain map implementations
 * adds a user-facing capability to `noinit` arrays of POD elements

The goal of this PR is to resolve issue #15673 by adjusting the domain
map implementations to no longer require the use of a `pragma "no auto
destroy"`. Originally I was expecting to do that with a `noinit`
syntactic construction but it turns out that `noinit` is not strictly
necessary in this case since the `buildArray(..., initElts=false)`
pattern already covers it. Either way, the array in question has memory
allocated for elements but the elements are totally uninitialized.

In particular, to remove the need for `pragma "no auto destroy"`, this PR
adds a `deinitElts:bool` field to DefaultRectangular to indicate if the
elements should be deinitialized when the array is destroyed. It adjusts
`dsiElementInitializationComplete` to set `deinitElts` to `true`. It adds
`dsiElementDeinitializationComplete` to set `deinitElts` to `false` (and
also allow similar behavior for other array types). It adjusts
`dsiDestroyArr` for to check `this.deinitElts`.

Even though `noinit` is not used in the distribution implementations,
this PR implements noinit for arrays of POD elements. For example,

``` chapel
var A:[1..10] int = noinit; // allocates space for elements but does not initialize them
for i in 1..10 {
  A[i] = i; // `=` is sufficient for initializing trivially copyable types
}
```

Users should also indicate when (if ever) the array's elements are
completely initialized so that the memory can be registered to support
communication. That is the subject of issue #16173.

In order to easily test no-init of arrays, this PR adds a flag
`--allow-noinit-array-not-pod` that will allow noinit of arrays of
non-POD elements. It uses this in several tests along with a version of a
low-level move module added to the test system. Issue #16172 discusses
the design of a user-facing low-level move module.

Reviewed by @vasslitvinov - thanks!

- [x] full local futures testing
- [x] primers pass with verify/valgrind and do not leak
mppf added a commit that referenced this issue Oct 1, 2020
Add noinit technote

Follow-up to PR #16064.

The 1.23 release includes a very basic and restricted version of `noinit`
because we were able to agree on the behavior of that case.

The `noinit` feature is discussed in #15703 and #8792. The future work is
discussed in #16172, #16173.

Reviewed by @lydia-duncan - thanks!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants