Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Join var binds in add_type_pat_union/3 with least upper bound not greatest lower bound #512

Merged
merged 44 commits into from
Mar 2, 2023

Conversation

erszcz
Copy link
Collaborator

@erszcz erszcz commented Jan 26, 2023

This is similar to @xxdavid's #505, but for var binds. I'm not convinced by the comment in the code that var binds should be merged by an intersection (aka GLB). @zuiderkwast, I think it's your note from 958459e - are you sure of it? This PR fixes at least:

ebin/typechecker.beam: The pattern {integer, _, Arity} on line 1849 at column 50 doesn't have the type none()

and likely more similar self-check errors.

I also think it might fix #433, which reported list comprehension generators having type none(). I remember troubleshooting this previously, but with no good idea for a fix back then - I think this might be the proper fix, but it needs confirmation.

The tradeoff, however, is that this generates A LOT of new self-check errors - but they all seem sensible! For example, they're related to us not matching out the any atom in places where it's found in various nodes of type(). We just assume the last of such node elements to be a list of types, but don't make sure that's the case and the type checker is not convinced with such an unbacked assumption. Example node on which it might happen: {type, anno(), tuple, any | [type()]}.

ToDo:

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 26, 2023

I confirm that this solves the list comprehension generators being reported as none() errors - 893f361.

@zuiderkwast
Copy link
Collaborator

This is similar to @xxdavid's #505, but for var binds. I'm not convinced by the comment in the code that var binds should be merged by an intersection (aka GLB). @zuiderkwast, I think it's your note from 958459e - are you sure of it?

No, I'm not sure. I was probably wrong. I don't remember this anymore.

Can you give the simplest example possible where add_type_pat_union gets called?

This PR fixes at least:

ebin/typechecker.beam: The pattern {integer, _, Arity} on line 1849 at column 50 doesn't have the type none()

and likely more similar self-check errors.

That's good. I don't understand why we get that error for the code on line 1849. Can you explain? For reference, it's {interger, _, Arity} in this code:

do_type_check_expr(Env, {'fun', P, {function, M, F, A}}) ->
    case {get_atom(Env, M), get_atom(Env, F), A} of
        {{atom, _, Module}, {atom, _, Function}, {integer, _, Arity}} ->

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 27, 2023

This is similar to @xxdavid's #505, but for var binds. I'm not convinced by the comment in the code that var binds should be merged by an intersection (aka GLB). @zuiderkwast, I think it's your note from 958459e - are you sure of it?

No, I'm not sure. I was probably wrong. I don't remember this anymore.

No worries 👍

Can you give the simplest example possible where add_type_pat_union gets called?

This PR fixes at least:

ebin/typechecker.beam: The pattern {integer, _, Arity} on line 1849 at column 50 doesn't have the type none()

and likely more similar self-check errors.

That's good. I don't understand why we get that error for the code on line 1849. Can you explain? For reference, it's {interger, _, Arity} in this code:

do_type_check_expr(Env, {'fun', P, {function, M, F, A}}) ->
    case {get_atom(Env, M), get_atom(Env, F), A} of
        {{atom, _, Module}, {atom, _, Function}, {integer, _, Arity}} ->

add_type_pat_union/3 is called when add_type_pat tries to check a pattern against a type, but that type is a union. It then fans out to each of the union members (via add_type_pat on the member) and upon return has to merge the results somehow. Here we're considering how to merge them.

On to the example. The quoted snippet, given we merge with GLB, results in Arity :: none(), as the algorithm attempts to glb(arity(), af_integer() | af_variable(), ...) based on this type:

-type af_remote_fun() ::
        {'fun', anno(), {'function', module(), function_name(), arity()}}
      | {'fun', anno(), {'function',
                         af_atom() | af_variable(),
                         af_atom() | af_variable(),
                         af_integer() | af_variable()}}.

This leads nowhere, since arity() is an integer range, whereas af_integer() | af_variable() are abstract forms.

If we use LUB, on the other hand, we get Arity :: arity() | af_integer() | af_variable(), which is a non-empty type that can be inhabited.

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 27, 2023

For the record, a simple example of add_type_pat_union/3 would be:

-module(simple).

-spec f({a, 1} | {a, 2}) -> 1 | 2.
f(U) ->
    {a, A} = U,
    A.

Trace (with this PR):

$ ./bin/gradualizer simple.erl
{trace,<0.88.0>,call,
    {typechecker,add_type_pat_union,
        [{tuple,{5,5},[{atom,{5,6},a},{var,{5,9},'A'}]},
         {type,0,union,
             [{type,0,tuple,[{atom,0,a},{integer,0,1}]},
              {type,0,tuple,[{atom,0,a},{integer,0,2}]}]},
         {venv,
             #{'U' =>
                   {type,0,union,
                       [{type,0,tuple,[{atom,0,a},{integer,0,1}]},
                        {type,0,tuple,[{atom,0,a},{integer,0,2}]}]}}}]}}
{trace,<0.88.0>,return_from,
       {typechecker,add_type_pat_union,3},
       {{type,0,union,
              [{type,0,tuple,[{atom,0,a},{integer,0,1}]},
               {type,0,tuple,[{atom,0,a},{integer,0,2}]}]},
        {type,0,union,
              [{type,0,tuple,[{atom,0,a},{integer,0,1}]},
               {type,0,tuple,[{atom,0,a},{integer,0,2}]}]},
        {venv,#{'A' => {type,0,range,[{integer,0,1},{integer,0,2}]},
                'U' =>
                    {type,0,union,
                          [{type,0,tuple,[{atom,0,a},{integer,0,1}]},
                           {type,0,tuple,[{atom,0,a},{integer,0,2}]}]}}},
        {constraints,#{},#{},#{}}}}
{trace,<0.88.0>,return_to,{typechecker,add_types_pats,6}}

@zuiderkwast
Copy link
Collaborator

zuiderkwast commented Jan 28, 2023

Thank you! I'm convinced. ☺️

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 28, 2023

For the record, the change to LUB leads to some conflicts. For example:

1> Env = gradualizer:env("-type type() :: gradualizer_type:abstract_type().", []).
{env,#{},#{},#{},
     #{module => undefined,records => #{},
       types =>
           #{{type,0} =>
                 {[],
                  {remote_type,0,
                               [{atom,1,gradualizer_type},{atom,1,abstract_type},[]]}}}},
     false,false,true,[],30,none,false}
2> typechecker:subtype(gradualizer:type("{user_type, erl_anno:anno(), atom(), [type()]}"), gradualizer:type("type()"), Env).
{true,{constraints,#{},#{},#{}}}
3> typechecker:subtype(gradualizer:type("{type, erl_anno:anno(), atom(), [type()]}"), gradualizer:type("type()"), Env).
{true,{constraints,#{},#{},#{}}}
4> typechecker:subtype(gradualizer:type("{type | user_type, erl_anno:anno(), atom(), [type()]}"), gradualizer:type("type()"), Env).
false

Even though {user_type, anno(), atom(), [type()]} and {type, anno(), atom(), [type()]} are subtypes of type(), {type | user_type, anno(), atom(), [type()]} is not. The last one might be returned by add_type_pat_union if we use merging with LUB. I'll add a known-problem for this case. I think it might be solved be expanding types with nested unions to a cartesian product of types (without unions) when calculating subtype.

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 28, 2023

I've been thinking about it a bit more and I'm not sure if we should allow {type | user_type, anno(), atom(), [type()]} :: {type, anno(), atom(), [type()]} | {user_type, anno(), atom(), [type()]}. Or, in general, if:

     ?_assert(subtype(?t( {a1, b}           ), ?t( {a1, b} | {a2, b} ))),
     ?_assert(subtype(?t( {a2, b}           ), ?t( {a1, b} | {a2, b} ))),

hold, make

     ?_assert(subtype(?t( {a1 | a2, b}      ), ?t( {a1, b} | {a2, b} ))),

also hold.

It resonates to me with this remark about MLstruct:

One limitation is that we can't have "smart" record unions like in TypeScript: in MLstruct { tag: 0; value: Str } | { tag: 1; value: Int } and { tag: 0 | 1; value: Str | Int } are equivalent.

We want the TypeScript, not the MLstruct, behaviour in Erlang. We cannot afford the equivalence since Erlang tagged tuples are akin to algebraic data type constructors and are used for pattern matching. @zuiderkwast WDYT?

@erszcz erszcz force-pushed the join-var-binds-with-lub-not-glb branch 2 times, most recently from eb47aae to 999679a Compare January 31, 2023 08:54
@erszcz
Copy link
Collaborator Author

erszcz commented Jan 31, 2023

75eec9b shows an interesting problem. 999679a distills it into a test case. It resembles 95de84f, which manifested when registering constraints on type vars passed as call arguments - the solution of registering a union upper bound was a bit of a hack.

Having fixed the union merge issue in add_type_pat_union, we now see a similar issue in regular non-constraint related code, too:

     1	-module(call_intersection_function_with_union_arg_should_pass).
     2
     3	-export([f/1]).
     4
     5	-type t() :: t1 | t2.
     6
     7	-spec f(t()) -> ok.
     8	f(T) ->
     9	    helper(T).
    10
    11	-spec helper(t1) -> ok;
    12	            (t2) -> ok.
    13	helper(t1) -> ok;
    14	helper(t2) -> ok.

It's especially well visible with #491:

test/known_problems/should_pass/call_intersection_function_with_union_arg_should_pass.erl:
    helper/1 call arguments on line 9 at column 5 don't match the function type:
fun((t1) -> ok)
fun((t2) -> ok)
Inferred argument types:
t1 | t2

f/1 obviously handles t1 | t2 - it's the union of its clauses' argument types. Our code checking intersection function calls cannot handle union arguments, though.

@xxdavid
Copy link
Collaborator

xxdavid commented Jan 31, 2023

I don't understand everything in this PR but I'd like to note that aside from what has been mentioned, it also fixes this issue I just came across:

-spec f(#{a := [integer()]}) -> empty | non_empty.
f(#{a := []}) -> empty;
f(#{a := _List}) -> non_empty.

-spec g(#{a := b | c}) -> integer().
g(#{a := b}) -> 1;
g(#{a := c}) -> 2.

The second clause for both f and g raises the The clause on line … at column 1 cannot be reached error on master but not on this branch.

@erszcz
Copy link
Collaborator Author

erszcz commented Jan 31, 2023

Thanks, @xxdavid, that's good feedback 👍

@erszcz erszcz force-pushed the join-var-binds-with-lub-not-glb branch 2 times, most recently from 82cbacc to 8c23d6c Compare February 2, 2023 19:13
@erszcz
Copy link
Collaborator Author

erszcz commented Feb 2, 2023

There's some light at the end of the tunnel - test/known_problems/should_pass/call_intersection_function_with_union_arg_should_pass.erl passes with the latest intersection-typed function call fixes.

@erszcz erszcz force-pushed the join-var-binds-with-lub-not-glb branch from 8c23d6c to 0c6362d Compare March 1, 2023 22:18
@erszcz
Copy link
Collaborator Author

erszcz commented Mar 1, 2023

Down to 302 lines of self-check errors. Only 202 to go 😅

erszcz added 26 commits March 2, 2023 18:09
Apparently, ! is missing from the definition of binary_op() in OTP's erl_parse, too.
However, it almost certainly should be there as the relevant AST nodes are built with ?mkop2() macro
as all the other binops.
@erszcz erszcz force-pushed the join-var-binds-with-lub-not-glb branch from e7a011d to 6cd7eb7 Compare March 2, 2023 17:09
@erszcz erszcz merged commit d2ea320 into josefs:master Mar 2, 2023
@erszcz erszcz deleted the join-var-binds-with-lub-not-glb branch March 2, 2023 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

List comprehensions: variable is expected to have type [any()] but it has type none()
3 participants