Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New rules for type inference #149

Closed
krux02 opened this issue May 21, 2019 · 1 comment
Closed

New rules for type inference #149

krux02 opened this issue May 21, 2019 · 1 comment
Labels

Comments

@krux02
Copy link
Contributor

krux02 commented May 21, 2019

This example shows a problem with the current type inference algorithm.

  # this one doesn't work
  var x : set[char] = (echo "a"; {})
  # where this one does work
  var y : set[char] = {}

In the compiler the expression (echo "a"; {}) is semantically
checked and the inferred type is set[empty]. And then when the
assignment is checked, the type of the expression on the right hand
side is patched as an afterthought. This is the code in the compiler
that does it.

  proc fitNodePostMatch(c: PContext, formal: PType, arg: PNode): PNode =
    result = arg
    let x = result.skipConv
    if x.kind in {nkPar, nkTupleConstr, nkCurly} and formal.kind != tyUntyped:
      changeType(c, x, formal, check=true)
    else:
      result = skipHiddenSubConv(result)

Basically this only works for very simple expressions. According to
Araq this function does: "uoh, two types here that we need to combine
somehow into a real type". I see this as a problematic hack in many
ways. First of all it is a special case that only takes the simple
case into consideration x = {} where x is fully typed. Then the
second problem is, changes the type of the AST an a mutable
transformation as an afterthought (mutable changes might lead to
bugs). That is not a clean way to solve this problem, and there is a
better way to do it.

My proposal is the following:

Get rid of fidNode and fitNodePostMatch entirely. These procedures
don't work recursively on the AST, and making them traverse
recursively will introduce yet another AST traversal. This is costly
maybe incorrect and I don't think it is worth it. So I think the types
can be inferred in the first sem checking phase.

Currently when an expression like var x : set[char] = (echo "a"; {})
is checked, we know that we expect the right hand side to be of type
set[char], but this expected type is not passed to
semExprWithType. So I suggest to add another parameter to
semExprWithType, the expected type. The expected type may always be
nil, but when it isn't it should be forwarded properly. With this
additional parameter the sem-checker will eventually do a sem-check on
{} with with an expected type. This means that the expression {}
instantly get the correct type, it won't be inferred as an empty node
anymore.

This expected type can then be used recursively to have expected types
in other contexts, such as tuple constructors, arrays and sequences.
Here is an example on how the expected type should be used recursively
in other contexts (pseudo syntax):

  var xxx : byte = (echo("a"); 123)

  # The expected type in every stmtListExpr before the last is `void`.
  # The last expression in a stmtListExpr is expected to be of the same
  # type as the whole expression.

  semExprWithType( {. (echo("a"); 123) .}, expectedType = {.byte.} )
  | semExprWithType( {. echo("a") .}, expectedType = {.void.} )
  | | semExprWithType( {. "a" .}, expectedType = nil ) -> ("a", string)
  | `-> (echo("a"), void)
  | semExprWithType( {. 123 .}, expectedType = {.byte.} ) -> (123'u8, byte)
  `-> ((echo("a"); 123), byte)

  var myarray: array[3, uint8] = [1,2,3]

  # When an array is expected, the elements of a array literal should be
  # expected to be of the arrays element type.

  semExprWithType( {. [1,2,3] .}, expectedType = {. array[3, uint8] .} )
  | semExprWithType( {. 1 .}, expectedType = {. uint8 .} ) -> (1'u8, uint8)
  | semExprWithType( {. 2 .}, expectedType = {. uint8 .} ) -> (2'u8, uint8)
  | semExprWithType( {. 3 .}, expectedType = {. uint8 .} ) -> (3'u8, uint8)
  `-> ( [1'u8, 2'u8, 3'u8],  array[3, uint8] )

  # When a tuple is expected, the elements of a tuple literal should be
  # expected to be of their corresponding type in the tuple literal.

  var mytuple: (uint8,string) = (123,"abc")
  semExprWithType( {. (123,"abc") .}, expectedType = {. (uint8,string) .} )
  | semExprWithType( {. 123 .}, expectedType = {. uint8 .} ) -> (123'u8, uint8)
  | semExprWithType( {. "abc" .}, expectedType = {. string .} ) -> ("abc", string)
  `-> ( (123'u8, "abc"), (uint8, string) )

  # The following example already works today, but how it works would change.

  type BaseObject = ref object of RootObj
  type ObjA = ref object of BaseObject
  type ObjB = ref object of BaseObject
  var x: seq[BaseObject] = @[ObjA(), ObjB()]

  semExprWithType( {. @[ObjA(), ObjB()] .}, expectedType = {. seq[BaseObject] .} )
  | semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
  | semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
  `-> (@[ObjA(), ObjB()], BaseObject), seq[BaseObject])

  # This is an example on how error messages could be improved. The bug
  # in the program is `ObjC` inherits from `RootObj` instead of `BaseObject`.
  # The current error message is ``type mismatch: got <seq[ref RootObj]> but expected 'seq[BaseObject]'``
  # The error message does not specify which element is wrong.

  type ObjC = ref object of RootObj
  var x: seq[BaseObject] = @[ObjA(), ObjB(), ObjC()]

  semExprWithType( {. @[ObjA(), ObjB(), ObjC()] .}, expectedType = {. seq[BaseObject] .} )
  | semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
  | semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
  | semExprWithType( {. ObjC() .}, expectedType = {. BaseObject .} ) -> error("type mismatch: got <ObjC> but expected 'BaseObject'")

  # This error is much more specific as it can point out the exact type
  # mismatch and the position in the array where the type mismatch
  # happens.

Tuples like the example above might also be detected as correct code,
because the sem checker will be able to know that in the first element
of the tuple constructor an uint8 is expected, not a string literal.

steps that need to be done before this can be implemented

  1. enforceVoidContext is currently tyTyped. This needs to be
    changed to tyVoid consistently though the compiler.
  2. The void type in the compiler is currently represented as nil. I
    this is problematic as typ.kind cannot be accessed safely. I
    think the nil type can be used better as "no type expectation",
    or when seen in the AST: "type not yet inferred".

I hope that the expression flags efWantStmt and efDetermineType
can be superseded by this and therefore removed. As efWantStmt is
represented as efWantValue.

This could potentially improve the usability of pure enums, because
in contexts where a concrete enum type is expected, the owner of that
enum identifier is clear.

type
  MyEnum {.pure.} = enum
    valueA, valueB, valueC, valueD, amb

  OtherEnum {.pure.} = enum
    valueX, valueY, valueZ, amb

let mySet: set[MyEnum] = {valueX, amb}

amb is declared in both MyEnum and OtherEnum, but it is clear that
MyEnum.amb is meant and not OtherEnum.amb, because the expected
type will be MyEnum.

This RFC supersedes the following issues:
nim-lang/Nim#11109

metagn added a commit to metagn/Nim that referenced this issue Jul 26, 2022
metagn added a commit to metagn/Nim that referenced this issue Jul 31, 2022
metagn added a commit to metagn/Nim that referenced this issue Aug 5, 2022
metagn added a commit to metagn/Nim that referenced this issue Aug 9, 2022
metagn added a commit to metagn/Nim that referenced this issue Aug 15, 2022
metagn added a commit to metagn/Nim that referenced this issue Aug 15, 2022
Araq added a commit to nim-lang/Nim that referenced this issue Aug 24, 2022
* micro implementation of rfc 149

refs nim-lang/RFCs#149

* number/array/seq literals, more statements

* try fix number literal alias issue

* renew expectedType with if/case/try branch types

* fix (nerf) index type handling and float typed int

* use typeAllowed

* tweaks + const test (tested locally) [skip ci]

* fill out more of the checklist

* more literals, change @ order, type conversions

Not copying the full call tree before the typedesc call check
in `semIndirectOp` is also a small performance improvement.

* disable self-conversion warning

* revert type conversions (maybe separate op later)

* deal with CI for now (seems unrelated), try enums

* workaround CI different way

* proper fix

* again

* see sizes

* lol

* overload selection, simplify int literal -> float

* range, new @ solution, try use fitNode for nil

* use new magic

* try fix ranges, new magic, deal with #20193

* add documentation, support templates

Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
capocasa pushed a commit to capocasa/Nim that referenced this issue Mar 31, 2023
* micro implementation of rfc 149

refs nim-lang/RFCs#149

* number/array/seq literals, more statements

* try fix number literal alias issue

* renew expectedType with if/case/try branch types

* fix (nerf) index type handling and float typed int

* use typeAllowed

* tweaks + const test (tested locally) [skip ci]

* fill out more of the checklist

* more literals, change @ order, type conversions

Not copying the full call tree before the typedesc call check
in `semIndirectOp` is also a small performance improvement.

* disable self-conversion warning

* revert type conversions (maybe separate op later)

* deal with CI for now (seems unrelated), try enums

* workaround CI different way

* proper fix

* again

* see sizes

* lol

* overload selection, simplify int literal -> float

* range, new @ solution, try use fitNode for nil

* use new magic

* try fix ranges, new magic, deal with nim-lang#20193

* add documentation, support templates

Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
narimiran pushed a commit to nim-lang/Nim that referenced this issue May 11, 2023
* micro implementation of rfc 149

refs nim-lang/RFCs#149

* number/array/seq literals, more statements

* try fix number literal alias issue

* renew expectedType with if/case/try branch types

* fix (nerf) index type handling and float typed int

* use typeAllowed

* tweaks + const test (tested locally) [skip ci]

* fill out more of the checklist

* more literals, change @ order, type conversions

Not copying the full call tree before the typedesc call check
in `semIndirectOp` is also a small performance improvement.

* disable self-conversion warning

* revert type conversions (maybe separate op later)

* deal with CI for now (seems unrelated), try enums

* workaround CI different way

* proper fix

* again

* see sizes

* lol

* overload selection, simplify int literal -> float

* range, new @ solution, try use fitNode for nil

* use new magic

* try fix ranges, new magic, deal with #20193

* add documentation, support templates

Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
(cherry picked from commit 0014b9c)
@github-actions
Copy link

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 31, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant