Allow local redefinition of built-in types #6335

bjorng · 2022-09-29T08:16:36Z

If a module defines a type and a new built-in type is added with the same name, Dialyzer will allow the redefinition of that particular release for one release only. That makes it difficult to write code that is supposed to work for multiple releases of OTP.

For example, the OTP 24 release introduced the nonempty_binary type. Modules that had their own definition would continue to work in OTP 24 but not in OTP 25.

Always allow the name of a built-in type to be reused locally, in the same way that a definition of a function with the same name as a BIF will take precedence over the BIF. Starting from OTP 25, the built-in types belong to the erlang module, so it will always be possible to get the built-in definition if needed by prefixing the name with erlang:.

Closes #6132

github-actions · 2022-09-29T08:17:45Z

CT Test Results

      4 files   388 suites 58m 14s ⏱️
2 782 tests 2 734 ✔️ 48 💤 0 ❌
7 016 runs 6 965 ✔️ 51 💤 0 ❌

Results for commit b08f467.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

kikofernandez · 2022-10-10T07:52:30Z

lib/dialyzer/src/dialyzer_utils.erl


 get_core_location([L | _As]) when is_integer(L) -> L;
 get_core_location([{L, C} | _As]) when is_integer(L), is_integer(C) -> {L, C};
 get_core_location([_ | As]) -> get_core_location(As);
 get_core_location([]) -> undefined.

+massage_forms([{type, Loc, [{Name, Type0, Args}]} | T], Defs0) ->
+  Type = massage_type(Type0, Defs0),
+  Defs = sets:add_element({Name, length(Args)}, Defs0),


Maybe equivalent but in this line the sets:add_element({Name, length(Args)}, Defs0) takes the Name as argument. However, your massage_type function checks if sets:is_element({Type, length(Args0)}, Defs) where the Type I guess is not the same as the Name. Unless Name and Type are exactly equal, I believe you are creating a set of Names with their arity but your massage_type looks for Types and their arity.

Yes, my naming is inconsistent. I have pushed a fixup commit that renames Type to Name in massage_type/2.

kikofernandez · 2022-10-10T07:54:39Z

lib/dialyzer/src/dialyzer_utils.erl

+massage_forms([{type, Loc, [{Name, Type0, Args}]} | T], Defs0) ->
+  Type = massage_type(Type0, Defs0),
+  Defs = sets:add_element({Name, length(Args)}, Defs0),
+  [{type, Loc, [{Name, Type, Args}]} | massage_forms(T, Defs)];


The function is not tail recursive but I guess that we do not care because we assume that the list is not going to be big enough to make an impact?

body-recursive functions that just build a list generally are the same, or even more efficient than tail recursive with a reverse at the end.

See: https://ferd.ca/erlang-s-tail-recursion-is-not-a-silver-bullet.html

Thanks @michalmuskala , I also saw this other tip from Erlang
https://www.erlang.org/doc/efficiency_guide/listhandling#recursive-list-functions

kikofernandez · 2022-10-10T08:00:21Z

lib/stdlib/src/erl_lint.erl

@@ -436,12 +436,8 @@ format_error({undefined_type, {TypeName, Arity}}) ->
    io_lib:format("type ~tw~s undefined", [TypeName, gen_type_paren(Arity)]);
 format_error({unused_type, {TypeName, Arity}}) ->
    io_lib:format("type ~tw~s is unused", [TypeName, gen_type_paren(Arity)]);
-format_error({new_builtin_type, {TypeName, Arity}}) ->


The removal of these two cases (new_builtin_type and builtin_type) mean that if we ever get them, the linter will crash. Is this the expected behaviour?

Is there any guarantees that new_builtin_type and builtin_type cannot appear as arguments to the format_error?

format_error/1 is only supposed to be called with a term returned from call to one of the functions in erl_lint. That means that when invoked from the compiler, format_error/1 will never crash.

Not having a default clause makes sure that if we have forgotten to add a new clause to format_error/1, the bug will be found quickly.

lib/stdlib/src/erl_lint.erl

kikofernandez · 2022-10-10T08:19:48Z

lib/stdlib/src/erl_lint.erl

+    TypePair = {TypeName, length(Args)},
+    case is_map_key(TypeName, Types) of
+        true ->
+            TypePair = {TypeName, length(Args)},


This has been define in line 3052 :)

The code look ok but it is almost repeating the same pattern in all cases.
I wonder if there is a better way to write this code without repeating the same structure 3 times.
Maybe it can be refactored to use a function that assigns {TypeName, 0} or {TypeName, length(Args) depending on whether Args is any or not.

The third clause (the one with the redundant definition) is wrong and redundant. It handles annotated types (e.g. A :: atom()), remote types (e.g. gb_sets:set()), and integer and atom literals. Those need no special handling, so the third clause can be removed.

The first and second clauses can indeed be combined to avoid repeating code.

I have pushed a fixup commit.

kikofernandez · 2022-10-10T11:50:31Z

lib/stdlib/src/erl_lint.erl

+check_type_1(Types, SeenVars, St) ->
+    check_type_2(Types, SeenVars, St).
+
+check_type_2({ann_type, _A, [_Var, Type]}, SeenVars, St) ->


Is there a better name for the function than check_type_1 and check_type_2?

Any suggestions for better names?

The number suffixes indicates that check_type_1 and check_type_2 are helpers for check_type/1, and that they are never meant to be called directly from other parts of erl_lint.

michalmuskala · 2022-10-10T14:09:55Z

This has an unfortunate side effect of allowing "spooky action at a distance" with header files.

See for example:

% foo.hrl
-record(person, {name :: string()}).

% foo.erl
-include("foo.hrl").
-type string() :: atom().

foo() ->
    #person{name = "Name"} % surprise, invalid (!)

bjorng · 2022-10-11T04:03:49Z

This has an unfortunate side effect of allowing "spooky action at a distance" with header files.

Yes, it does. Not sure if that is a problem in practice, or how we could avoid it.

kikofernandez · 2022-10-11T06:09:34Z

to avoid the problem stated by @michalmuskala , can we consider the header files to use the default erlang module for their types? I think we do not want to allow dynamic scoping, correct?

bjorng · 2022-10-11T08:23:00Z

to avoid the problem stated by @michalmuskala , can we consider the header files to use the default erlang module for their types?

I don't think we should make the semantics dependent on whether the definition is in a header filer or not. That could also defeat the purpose of allowing redefinitions if the types that need to be redefined are in a header file. (It's no longer considered best practice to define types in header files, but we still have such code in Erlang/OTP.)

michalmuskala · 2022-10-11T10:29:21Z

I agree with @bjorng - making semantics different between header files & modules wouldn't be desirable.

TBH, I don't think there's much we can do here, it's the semantics of headers & records that are the actual problem, not really this feature. I don't think we can do much, other than warn people not to do it. You could argue the same is already possible with functions & default values:

%foo.hrl
-record(processor, {pid = self(), ...}).

%mod1.erl
-include("foo.hrl").
processor() -> #processor{}.

%mod2.erl
-include("foo.hrl").
-compile({no_auto_import, [self/0]}).

processor() -> #processor{}.

self() -> 'self'.

If a module defines a type and a new built-in type is added with the same name, Dialyzer will allow the redefinition of that particular release for one release only. That makes it difficult to write code that is supposed to work for multiple releases of OTP. For example, the OTP 24 release introduced the `nonempty_binary` type. Modules that had their own definition would continue to work in OTP 24 but not in OTP 25. Always allow the name of a built-in type to be reused locally, in the same way that a definition of a function with the same name as a BIF will take precedence over the BIF. Starting from OTP 25, the built-in types belong to the erlang module, so it will always be possible to get the built-in definition if needed by prefixing the name with `erlang:`. Closes erlang#6132

Summary: - why? `bultin-in` dynamic() in OTP 26 and erlang/otp#6335 Reviewed By: alanz Differential Revision: D45967322 fbshipit-source-id: f6e5635b4a7e3d1b19e47266e977dd4f168f0434

bjorng added team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI labels Sep 29, 2022

bjorng requested a review from kikofernandez September 29, 2022 08:16

bjorng self-assigned this Sep 29, 2022

kikofernandez reviewed Oct 10, 2022

View reviewed changes

bjorng force-pushed the bjorn/dialyzer/local-type-redefinition/GH-6132 branch from b08f467 to 4d08fc9 Compare October 12, 2022 09:44

bjorng merged commit 5b279e3 into erlang:master Oct 12, 2022

bjorng deleted the bjorn/dialyzer/local-type-redefinition/GH-6132 branch October 12, 2022 09:51

ilya-klyuchnikov mentioned this pull request Mar 14, 2023

EEP 61: Built-in dynamic type #6993

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow local redefinition of built-in types #6335

Allow local redefinition of built-in types #6335

bjorng commented Sep 29, 2022

github-actions bot commented Sep 29, 2022 •

edited

Loading

kikofernandez Oct 10, 2022

bjorng Oct 11, 2022

kikofernandez Oct 10, 2022

michalmuskala Oct 10, 2022 •

edited

Loading

kikofernandez Oct 10, 2022

kikofernandez Oct 10, 2022

bjorng Oct 11, 2022

kikofernandez Oct 10, 2022

kikofernandez Oct 10, 2022

bjorng Oct 11, 2022

kikofernandez Oct 10, 2022

bjorng Oct 11, 2022

michalmuskala commented Oct 10, 2022

bjorng commented Oct 11, 2022

kikofernandez commented Oct 11, 2022

bjorng commented Oct 11, 2022

michalmuskala commented Oct 11, 2022

Allow local redefinition of built-in types #6335

Allow local redefinition of built-in types #6335

Conversation

bjorng commented Sep 29, 2022

github-actions bot commented Sep 29, 2022 • edited Loading

CT Test Results

Artifacts

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

michalmuskala Oct 10, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

michalmuskala commented Oct 10, 2022

bjorng commented Oct 11, 2022

kikofernandez commented Oct 11, 2022

bjorng commented Oct 11, 2022

michalmuskala commented Oct 11, 2022

github-actions bot commented Sep 29, 2022 •

edited

Loading

michalmuskala Oct 10, 2022 •

edited

Loading