-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow local redefinition of built-in types #6335
Allow local redefinition of built-in types #6335
Conversation
CT Test Results 4 files 388 suites 58m 14s ⏱️ Results for commit b08f467. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
|
||
get_core_location([L | _As]) when is_integer(L) -> L; | ||
get_core_location([{L, C} | _As]) when is_integer(L), is_integer(C) -> {L, C}; | ||
get_core_location([_ | As]) -> get_core_location(As); | ||
get_core_location([]) -> undefined. | ||
|
||
massage_forms([{type, Loc, [{Name, Type0, Args}]} | T], Defs0) -> | ||
Type = massage_type(Type0, Defs0), | ||
Defs = sets:add_element({Name, length(Args)}, Defs0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe equivalent but in this line the sets:add_element({Name, length(Args)}, Defs0)
takes the Name
as argument. However, your massage_type
function checks if sets:is_element({Type, length(Args0)}, Defs)
where the Type
I guess is not the same as the Name
. Unless Name
and Type
are exactly equal, I believe you are creating a set of Names
with their arity but your massage_type
looks for Types
and their arity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my naming is inconsistent. I have pushed a fixup commit that renames Type
to Name
in massage_type/2
.
massage_forms([{type, Loc, [{Name, Type0, Args}]} | T], Defs0) -> | ||
Type = massage_type(Type0, Defs0), | ||
Defs = sets:add_element({Name, length(Args)}, Defs0), | ||
[{type, Loc, [{Name, Type, Args}]} | massage_forms(T, Defs)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function is not tail recursive but I guess that we do not care because we assume that the list is not going to be big enough to make an impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
body-recursive functions that just build a list generally are the same, or even more efficient than tail recursive with a reverse at the end.
See: https://ferd.ca/erlang-s-tail-recursion-is-not-a-silver-bullet.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @michalmuskala , I also saw this other tip from Erlang
https://www.erlang.org/doc/efficiency_guide/listhandling#recursive-list-functions
@@ -436,12 +436,8 @@ format_error({undefined_type, {TypeName, Arity}}) -> | |||
io_lib:format("type ~tw~s undefined", [TypeName, gen_type_paren(Arity)]); | |||
format_error({unused_type, {TypeName, Arity}}) -> | |||
io_lib:format("type ~tw~s is unused", [TypeName, gen_type_paren(Arity)]); | |||
format_error({new_builtin_type, {TypeName, Arity}}) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
The removal of these two cases (
new_builtin_type
andbuiltin_type
) mean that if we ever get them, the linter will crash. Is this the expected behaviour? -
Is there any guarantees that
new_builtin_type
andbuiltin_type
cannot appear as arguments to theformat_error
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
format_error/1
is only supposed to be called with a term returned from call to one of the functions in erl_lint
. That means that when invoked from the compiler, format_error/1
will never crash.
Not having a default clause makes sure that if we have forgotten to add a new clause to format_error/1
, the bug will be found quickly.
lib/stdlib/src/erl_lint.erl
Outdated
TypePair = {TypeName, length(Args)}, | ||
case is_map_key(TypeName, Types) of | ||
true -> | ||
TypePair = {TypeName, length(Args)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been define in line 3052 :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code look ok but it is almost repeating the same pattern in all cases.
I wonder if there is a better way to write this code without repeating the same structure 3 times.
Maybe it can be refactored to use a function that assigns {TypeName, 0}
or {TypeName, length(Args)
depending on whether Args
is any
or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The third clause (the one with the redundant definition) is wrong and redundant. It handles annotated types (e.g. A :: atom()
), remote types (e.g. gb_sets:set()
), and integer and atom literals. Those need no special handling, so the third clause can be removed.
The first and second clauses can indeed be combined to avoid repeating code.
I have pushed a fixup commit.
check_type_1(Types, SeenVars, St) -> | ||
check_type_2(Types, SeenVars, St). | ||
|
||
check_type_2({ann_type, _A, [_Var, Type]}, SeenVars, St) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better name for the function than check_type_1
and check_type_2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions for better names?
The number suffixes indicates that check_type_1
and check_type_2
are helpers for check_type/1
, and that they are never meant to be called directly from other parts of erl_lint
.
This has an unfortunate side effect of allowing "spooky action at a distance" with header files. See for example: % foo.hrl
-record(person, {name :: string()}).
% foo.erl
-include("foo.hrl").
-type string() :: atom().
foo() ->
#person{name = "Name"} % surprise, invalid (!) |
Yes, it does. Not sure if that is a problem in practice, or how we could avoid it. |
to avoid the problem stated by @michalmuskala , can we consider the header files to use the default |
I don't think we should make the semantics dependent on whether the definition is in a header filer or not. That could also defeat the purpose of allowing redefinitions if the types that need to be redefined are in a header file. (It's no longer considered best practice to define types in header files, but we still have such code in Erlang/OTP.) |
I agree with @bjorng - making semantics different between header files & modules wouldn't be desirable. TBH, I don't think there's much we can do here, it's the semantics of headers & records that are the actual problem, not really this feature. I don't think we can do much, other than warn people not to do it. You could argue the same is already possible with functions & default values: %foo.hrl
-record(processor, {pid = self(), ...}).
%mod1.erl
-include("foo.hrl").
processor() -> #processor{}.
%mod2.erl
-include("foo.hrl").
-compile({no_auto_import, [self/0]}).
processor() -> #processor{}.
self() -> 'self'. |
If a module defines a type and a new built-in type is added with the same name, Dialyzer will allow the redefinition of that particular release for one release only. That makes it difficult to write code that is supposed to work for multiple releases of OTP. For example, the OTP 24 release introduced the `nonempty_binary` type. Modules that had their own definition would continue to work in OTP 24 but not in OTP 25. Always allow the name of a built-in type to be reused locally, in the same way that a definition of a function with the same name as a BIF will take precedence over the BIF. Starting from OTP 25, the built-in types belong to the erlang module, so it will always be possible to get the built-in definition if needed by prefixing the name with `erlang:`. Closes erlang#6132
b08f467
to
4d08fc9
Compare
Summary: - why? `bultin-in` dynamic() in OTP 26 and erlang/otp#6335 Reviewed By: alanz Differential Revision: D45967322 fbshipit-source-id: f6e5635b4a7e3d1b19e47266e977dd4f168f0434
If a module defines a type and a new built-in type is added with the same name, Dialyzer will allow the redefinition of that particular release for one release only. That makes it difficult to write code that is supposed to work for multiple releases of OTP.
For example, the OTP 24 release introduced the
nonempty_binary
type. Modules that had their own definition would continue to work in OTP 24 but not in OTP 25.Always allow the name of a built-in type to be reused locally, in the same way that a definition of a function with the same name as a BIF will take precedence over the BIF. Starting from OTP 25, the built-in types belong to the erlang module, so it will always be possible to get the built-in definition if needed by prefixing the name with
erlang:
.Closes #6132