User-defined implicit initialization hooks #252

haxscramper · 2020-09-11T11:09:46Z

User-defined implicit initialization

This RFC mostly reiterates ideas from #48, #126, #233

Add support for user-defined implicit initialization hook with following prototype:

proc `=init`(x: var T)

Is this needed?

Existing proposals

There have been several RFCs related to default initialization/implicit construction for user-defined types:

#48 - proposal for a standardized object construction/initialization
#126 - Support default values for object properties
#233 - User defined default values

Existing compiler warnings

The Nim compiler already provides two warnings directly related to default initialization and three more related to initialization in general, for a total of five initialization-related diagnostics. This suggests there is at least some interest in correct initialization behavior.

UnsafeSetLen - "setLen can potentially expand the sequence, but the element type '$1' doesn't have a valid default value"
UnsafeDefault - "The '$1' type doesn't have a valid default value"
ProveInit - "Cannot prove that '$1' is initialized. This will become a compile time error in the future."
ProveField - "cannot prove that field '$1' is accessible"
ProveIndex - "cannot prove index '$1' is valid"

`{.requiresinit.}`

A separate pragma, {.requiresinit.}, completely prevents implicit default initialization. It is used infrequently (only 126 times in 1340 packages; approximately 90% of the packages I checked do not use it even once).

It is not possible to contain the effects of requiresinit: once added, it affects all code that uses a type with annotated fields. It also affects templates that rely on type Res = typeof((var it {.inject.}; op)) to determine the type of an expression (right now almost none of the *It templates can deal with these types).

import sequtils

type
  Req = object
    f {.requiresinit.}: int

  Nreq = object
    f: int

template test(templ, expr): untyped =
  echo compiles(@[Nreq()].templ(expr)), " ", compiles(@[Req()].templ(expr))

test mapIt, 0
test filterIt, true
test allIt, true
test anyIt, true

true false
true false
true false
true false

Why this is needed?

Broken type system

As mentioned in these comments by @timotheecour, a large portion of type safety guarantees is invalidated: enums with an offset and ranges can't really guarantee anything unless explicitly created with initT. Any kind of value that has a non-zero default requires special attention; it is now your responsibility to make sure this -1-as-default-value is actually used. {.requiresinit.} is a solution, but as already mentioned it propagates through the whole codebase, requiring far-reaching modifications.

NOTE: I personally think that {.requiresinit.} is a great way to explicitly declare requirements and enforce them via compiler diagnostics. The only drawback is that it is really viral and has to be worked around in some cases (typeof pattern can just be written as var tmp: ref InType; var it {.inject.} = tmp[]; op).

`=destroy` confusion

It is possible to have specific destruction hook, bound to particular type and you can write initT proc for user-defined constructor, but when it comes to default initialization everything is just filled with zero and that's it. It is also possible to completely forbid implicit initialization, but not configure it. I find it rather confusing and counter-intuitive.

A large number of popular imperative/OOP programming languages provide a way to customize default values. Out of all the languages mentioned in "Nim for X programmers" on the wiki, only C lacks this feature.

D, Erlang, Typescript constructor keyword
Python
Java, Kotlin, C#, C++
Partially Rust - it is still necessary to explicitly call Default::default()

Other concerns

RFC #126 (Support default values for object properties) suggests implementing default value initialization in form of

type MyObj = object
    id: int = generateUniqueId()
    x: int = 42
    stuff: seq[string] = @[]

This can be implemented using a macro (see forum thread), so it is not necessary to add it to the language core. If one wishes, they can use a macro to automatically declare the `=init` hook. This is already possible for explicit initialization via initT procs, but default initialization is not currently configurable.

Possible implementation behavior

Similar to how `=destroy` is handled

var x: T; stmts
------------- (default variable initialization)
var x: T
`=init`(x)
stmts

If type does not have user-defined `=init` then no injection shall happen. If any of the fields have initialization declared then default initialization in form of

proc `=init`(obj: var T) =
  `=init`(obj.fieldWithInit)

is implicitly declared recursively. If a field has a range or enum type for which low(Enum).int != 0 or low(range[..]) != 0, then `=init` is implicitly declared too.
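The rewrite rule above can be emulated by hand in current Nim. The following is only a sketch: `initHook` is a hypothetical, ordinary overloaded proc standing in for the proposed `=init` hook (which the compiler does not provide), and `Counter` and `Timer` are illustrative types that do not appear in the RFC.

```nim
type
  Counter = object
    step: int          # meaningful default is 1, but zero-initialization gives 0
  Timer = object
    counter: Counter   # field whose type has an init hook
    label: string      # field without a hook: stays zero-initialized

# Hypothetical stand-in for the user-defined `=init`(x: var Counter).
proc initHook(c: var Counter) =
  c.step = 1

# What the proposal would declare implicitly for Timer:
# forward initialization to every field that has a hook.
proc initHook(t: var Timer) =
  initHook(t.counter)

# Hand-written form of the expansion of `var t: Timer; stmts`.
var t: Timer
initHook(t)

doAssert t.counter.step == 1
doAssert t.label == ""
```

Under the proposal the two `initHook(t)`-style calls would be injected by the compiler, so user code would only contain `var t: Timer`.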

Object construction syntax. If field is not initialized by user explicitly and field type has `=init` declared field should be implicitly initialized. If forced explicit initialization is necessary then {.requiresinit.} can be used on object field.

let obj = Object()
------------- (default field initialization)
let obj = block:
  var obj = Object()
  `=init`(obj.uninitializedFieldWithInit)
  obj

NOTE: {.requiresinit.} already uses similar logic: if a field's type cannot be default-initialized, then no object containing a field of this type can be default-initialized either.


haxscramper · 2020-09-12T18:38:56Z

This is not an addition to RFC - just some ideas that might potentially be useful.

It is not uncommon to see procedure implementation pattern where result is not explicitly initialized and immediately used to append, set field value etc. It is fine most of the time, but when type definition switches to ref (e.g. it was just Type = object and now it is Type = ref object) this can lead to annoying debugging where you have to figure out all places where implicit initialization happened. This is a rare use case but happens sometimes. `=init` could potentially make this a non-issue and further diminish distinction between ref and non-ref types, which is in line with already supported (experimental) automatic dereferencing.

If ref variable really has to be nil it might be better to explicitly initialize it as = nil, otherwise treat it as regular variable and use implicit initialization hook.
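A small sketch of the pitfall described above, using illustrative types (`Buf`, `BufRef`) and procs that are not part of the RFC. With a plain object the implicit `result` is usable immediately; after switching to a ref object the same pattern needs an explicit `new`.

```nim
type
  Buf = object
    items: seq[int]
  BufRef = ref object
    items: seq[int]

proc makeBuf(): Buf =
  # `result` is zero-initialized, so appending works right away.
  result.items.add 1

proc makeBufRef(): BufRef =
  # `result` starts out as nil for a ref type; without this `new`
  # the next line would dereference nil at runtime.
  new(result)
  result.items.add 1

var r: BufRef
doAssert r.isNil                    # implicit initialization of a ref gives nil
doAssert makeBuf().items == @[1]
doAssert makeBufRef().items == @[1]
```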

More on 'broken type system': objects that have a non-trivial initial state (i.e. not just zero-filled memory) are more fragile when implicit initialization is not configurable. You must take care to use dedicated constructors all the time, even in situations like var obj: Obj.

Another (mostly theoretical) idea is that it might be possible to automatically add finalizers for ref objects if they are created using `=init` regardless of GC algorithm used. Something like

proc `=init`(v: var ref T) = new(v, final)

Varriount · 2020-09-16T10:11:45Z

How would exceptions be handled?

haxscramper · 2020-09-16T10:23:30Z

If you mean exceptions in the `=init` hook, the answer is that I don't think any specific handling is necessary. Value initialization should happen in the same scope as object construction, immediately after the var declaration, which means we either get a correctly initialized object (if no exception is raised) or an exception is raised. There are no half-initialized objects, if that is what you mean.

Although I'm not sure if I understand what exact scenario you have in mind - if you could elaborate on your question I might provide better answer if possible.

mratsim · 2020-09-16T10:56:21Z

Just like we require =destroy to be plain object, we can enforce =init to not throw. And in C++ AFAIK it's undefined behavior to throw in a constructor.

proc `=init`(v: var ref T) {.raises: [].}

Araq · 2020-09-16T11:04:15Z

> And in C++ AFAIK it's undefined behavior to throw in a constructor.

Pretty sure it's supported and partially constructed objects are deconstructed properly. Looks super expensive to implement (like everything else in C++ I guess).

Araq · 2020-09-16T11:28:17Z

-1 from me. First of all construction is very different from destruction, constructors take parameters in most languages and the problem is worse when "size hints" optimizations enter the picture: A size hint should be attached to an object, not to an object's type.

Furthermore the mechanism will soon be misused to avoid the initT, newT idiom even though it's strictly less flexible than custom constructor procs, see the "factory" pattern and how C++ got make_shared and make_unique even though C++ does have very good support for constructors, there is a lesson to be learned here.

The route forward IMHO is to allow default values inside object declarations with the restriction that the value has to be a compile-time value. For multiple reasons:

It prevents id: int = generateUniqueId(), which is spooky action at a distance. Side effects should not be hidden.
The implicit construction cannot fail with an exception or otherwise.
It is easier to implement.
The interaction with .requiresInit remains obvious.

haxscramper · 2020-09-16T16:55:18Z

Main point is - with constexpr as default values there is no way to execute code when implicit initialization happens. Yes, in overwhelming majority of use cases constexpr is more than enough, but this route completely closes way for non-trivial logic in implicit initialization which might be necessary in some cases.

It is possible to place additional restrictions on =init procedure prototype, such as .raises[] and .noSideEffect., although latter one makes =init almost indistinguishable from constexpr fields.

> even though it's strictly less flexible than custom constructor procs

I would argue that =init being explicitly less flexible is a good thing since it prevents misuse.

> mechanism will soon be misused to avoid the initT, newT idiom

Again, since there is no support for parameters in =init I don't see how it would affect existing idioms in most cases. In addition, initT just looks better, is more logical, and is used more often. I don't think =init will "soon" be misused to avoid initT.

> First of all construction is very different from destruction, constructors take parameters in most languages

Again - this is not about explicit constructors - we already have them (initT and newT) and they fit quite nicely into the language. This is only about being able to configure implicit object instantiation.

> It prevents id: int = generateUniqueId(), which is spooky action at a distance. Side effects should not be hidden.

I'm sorry, but I don't follow how this would prevent it. If you mean let id: int = ... then no init call should be generated in this case since explicit initialization happened. In the case of a unique id for each object instance, this would only help, since

type
  Obj = object
    id: int

proc initObj(): Obj = Obj(id: generateUniqueId())

proc `=init`(obj: var Obj) = obj = initObj()

allows including Obj in different structures without worrying about correct implicit initialization of all subfields.

If this is the 'misuse' you were talking about, I think it is necessary to have some way to configure this behavior and cut the chain of "if A includes B, I must initialize A correctly using initB", which currently stretches from the initial type to infinity. With initT, responsibility for correct initialization is pushed to all potential users of a type, again and again, potentially breaking adjacent layers of abstraction: each user of Obj must be aware that it is important to construct it using initT and is responsible for making this knowledge available to the next abstraction layer (via documentation or .requiresinit.). constexpr default values partially mitigate this issue, but by definition (compile-time evaluation) they fail to address cases like generateUniqueId().
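To make the chain of responsibility concrete, here is a minimal sketch in current Nim. `generateUniqueId` is given a trivial counter-based body purely for illustration, and `Wrapper` and `initWrapper` are hypothetical names that do not appear in the thread.

```nim
var nextId = 0

# Illustrative implementation; the thread never defines this proc.
proc generateUniqueId(): int =
  inc nextId
  result = nextId

type
  Obj = object
    id: int
  Wrapper = object   # hypothetical type that embeds Obj
    obj: Obj

proc initObj(): Obj = Obj(id: generateUniqueId())

# Every type that embeds Obj needs its own constructor that remembers
# to call initObj, and so does every type that embeds Wrapper, and so on.
proc initWrapper(): Wrapper = Wrapper(obj: initObj())

var implicit: Wrapper          # implicit initialization bypasses initObj
let explicit = initWrapper()

doAssert implicit.obj.id == 0  # no unique id was assigned
doAssert explicit.obj.id == 1
```

The `implicit` variable compiles without any diagnostic, which is the situation the comment argues a configurable implicit initialization would avoid.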

This is basically the same as .requiresinit. - yes, it is possible to use, I would argue it is a great tool even, but you are creating responsibility for all potential users.

This problem is quite nicely illustrated by mapIt's inability to deal with .requiresinit. types - even though it uses it to get expression type it is still not possible. Using initT is not possible because it would require passing constructor proc to parameter everywhere necessary. Constexpr types could solve the issue, but as already mentioned this is too restrictive solution.

Araq · 2020-09-16T17:34:03Z

> Again, since there is no support for parameters in =init I don't see how it would affect existing idioms in most cases. In addition, initT just looks better, is more logical, and is used more often. I don't think =init will "soon" be misused to avoid initT.

Ok, this wasn't clear to me before, thanks!

But then your proposal is mostly a different syntax for field = defaultValue (my context here is an object declaration). Syntax aside, there is one difference: you seek to allow arbitrary expressions, while I really want to restrict it to constant expressions. If we start with the restrictive version, we can always make it less strict in later versions. The same is not so easy for the reverse case: allow everything and soon enough somebody will rely on it.

> This problem is quite nicely illustrated by mapIt's inability to deal with .requiresinit. types - even though it uses it to get expression type it is still not possible.

I think that's a problem that can be solved by special casing typeof even further.

haxscramper · 2020-09-16T17:51:51Z

Yes, exactly. I think that arbitrary expressions might be necessary in some cases, but I agree that it is not possible to make things less strict so starting with constexpr and potentially expanding into =init is a good solution.

mapIt is relatively easy to work around: you can use a (not really pretty) hack, typeof((var tmp: ref InType; var it {.inject.} = tmp[]; op)), which is not valid at runtime but works fine in most cases.

Araq · 2020-09-22T05:09:31Z

So can we agree on supporting it in this way:

type
  StartWith1 = object
    x: int = 1

?

haxscramper · 2020-09-22T05:15:30Z

Yes. It covers main concerns about type guarantees invalidation (which is really important) and other complex cases of default initialization would be nice to support, but not right now at least.
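For reference, a short sketch of how the agreed syntax behaves, assuming a compiler that includes the default-field support from nim-lang/Nim#20480 (Nim 2.0 or later is assumed here). The constant `base` and the extra fields `y` and `z` are added for illustration only.

```nim
const base = 10

type
  StartWith1 = object
    x: int = 1          # literal default, as in the example above
    y: int = base * 2   # a compile-time constant expression
    z: int              # no default given: zero-initialized as before

let a = StartWith1()       # object construction applies the defaults
let b = StartWith1(x: 5)   # explicitly given values take precedence

doAssert a.x == 1 and a.y == 20 and a.z == 0
doAssert b.x == 5 and b.y == 20
```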

timotheecour · 2020-10-07T18:23:33Z

@Araq it's not entirely clear what proposal led to "Accepted RFC", is it the following:

let a3 = 3
type 
  Foo = object
    x1: int = 1 # ok
    x2 = 2 # ok, type inference allowed in initializer
    x3: int = a3 # CT error, field initializer must be const

?
if so, then +1

note 1:

it would currently prevent initializers that are ref/ptr/pointer:

type Bar = ref object
  b0: int

type Foo = object
  b = Bar(b0: 1) # error: initializer is a ref and can't be const

EDIT: this restriction could be lifted by allowing const ref objects, by accepting nim-lang/Nim#15528
(see also #126 (comment))

note 2:

this caveat applies:

type Foo = object
  x1: cstring = "abc"
var a = Foo()
a.x1[0] = 'A' # SIGBUS

[EDIT] note 3

see #126 (comment) for a more detailed proposal that also covers:

`var a: T` # always equivalent to `var a = default(T)`
# `default(T)` is defined recursively in the obvious way, taking into account default initializers for object types, e.g. see the example provided there

Araq · 2020-10-07T21:14:35Z

> it's not entirely clear what proposal led to "Accepted RFC", is it the following: ...

Yes.

> this caveat applies:

well var a = Foo(x1: "abc") has the same problem, nothing changed.

pmetras · 2020-10-28T22:08:11Z

The problem I see with @Araq syntax

type
  StartWith1 = object
    x: int = 1

is that it works only for object initialization. You can't use it with other types like

type
  Ranged = range[10 .. 20]    # I would like to have 10 as default
  BoolTrueDefault = bool      # This type of bool should default to 'true'
  Constraint[T] = object
    c: T                      # When implementation is delegated to another client module,
                              # default initialization should be too.

haxscramper · 2020-10-28T22:25:42Z

The first two types here are not distinct, so they should follow regular initialization logic (e.g. false for bool and 0 for range[10 .. 20]). The range example does show, though, how easy it is to break all guarantees with zeroed default values (var r: range[10 .. 20]; echo r gives 0).

Default initialization of distinct types is also an important case to consider, but I just can't see how this can be added in type definition syntax. In objects value for fld: type = val was just explicitly prohibited so it is easy to just relax the syntax checking, but for distinct types there is just no place. So for cases like type Hp = distinct range[0 .. 100] you need to have `=init`(hp: var Hp) = hp = Hp(100) or something.

Constraint[T] is just a chain of responsibility "who needs to initialize what" and I don't think it can be solved without =init or constructor procs.

timotheecour · 2020-11-25T01:41:55Z

> The problem I see with @Araq syntax

> Default initialization of distinct types is also an important case to consider, but I just can't see how this can be added in type definition syntax.

That's a separate problem; see my proposal for this here: #290 (=default type bound operator to override default and var initialization).
Can we close this in favor of #126 (Support default values for object properties), given that the title and description of #126 are much closer to the actual accepted RFC (#252 (comment))?
(#126 just needs to be modified to mention that initializers must be evaluatable at CT instead of RT), and then mark the other one as accepted.
Alternatively, #126 should be closed to avoid duplication.

metagn · 2022-01-16T22:36:37Z

> construction is very different from destruction

In my opinion it makes sense if the goal is to prevent invalid states. Destruction turns a value from a valid state to an invalid state, initialization turns it from an invalid state to a valid state. Optimizations like noinit would prevent a call to =init. Maybe there's a way to turn "zero values" into a compile time construct, while turning "runtime memory zeroing" into a type-bound operation. JS codegen already special cases "runtime memory zeroing" for each mappable type. Not sure how these constructs would interact.

ajusa · 2022-04-19T02:43:27Z

Would it also be possible for this RFC to support tuples? I didn't see an example using them yet:

type
  StartWith1 = tuple
    x: int = 1
  StartWith2 = tuple[y: string = "2"]

konsumlamm · 2022-04-22T16:23:54Z

> Would it also be possible for this RFC to support tuples?

Tuples are different than objects in that every tuple with the same fields is the same type. So defining a custom initialization hook for a tuple type would affect all tuples of that type, no matter where they're defined. Your example seems more related to #126, as it doesn't involve defining a =init hook.

haxscramper changed the title ~~User-defined implicit initialization~~ User-defined implicit initialization hooks Sep 11, 2020

Araq added the Accepted RFC label Oct 7, 2020

timotheecour mentioned this issue Oct 13, 2020

Support default values for object properties #126

Closed

This was referenced Nov 25, 2020

Object Default Values #253

Closed

=default type bound operator to override default and var initialization #290

Open

mratsim mentioned this issue Jan 10, 2021

defModule macro to define neural nets without inline C++ SciNim/flambeau#5

Open

c-blake mentioned this issue Jan 26, 2021

Commas for variable sections #320

Closed

timotheecour mentioned this issue Feb 5, 2021

document default(T) nim-lang/Nim#16942

Merged

haxscramper mentioned this issue May 23, 2021

Definite assignment analysis and out parameters #378

Closed

konsumlamm mentioned this issue Jun 27, 2021

[Concepts] Consider concept declarations for name resolution #392

Open

haxscramper mentioned this issue Aug 29, 2021

Object init shorthand #418

Open

haxscramper mentioned this issue Oct 7, 2021

C++ move semantics, copying, destructors, memory management haxscramper/hcparse#13

Open

iacore mentioned this issue Oct 19, 2021

Cannot instantiate generic type with not nil parameter type nim-lang/Nim#19014

Closed

quantimnot mentioned this issue Nov 17, 2021

version-2-0 branch and CI #436

Closed

mratsim mentioned this issue Dec 22, 2021

[Miracl] Default init to infinity point status-im/nim-blscurve#122

Open

ringabout self-assigned this Jan 16, 2022

ringabout mentioned this issue Jan 17, 2022

Roadmap for Nim #437

Closed

33 tasks

ringabout mentioned this issue Aug 16, 2022

add default field support for object in ARC/ORC nim-lang/Nim#20220

Closed

19 tasks

ringabout mentioned this issue Oct 2, 2022

add default field support for object in ARC/ORC nim-lang/Nim#20480

Merged

17 tasks

Araq closed this as completed in nim-lang/Nim#20480 Oct 4, 2022
