[Dialyzer] False positive: strange interaction with `string()`, `list()` and overloaded functions #7624

albsch · 2023-09-04T22:45:02Z

This is a minimized function from the Gradualizer test suite:

-spec g(list(tag)) -> list(string());
       (tag) -> string().
g([]) -> [];
g([tag | _]) -> [g(tag)];
g(tag) -> "".

The first part of the specification describes the list processing, the second part the tag to string conversion. This is a valid function with a valid spec.

Dialyzer complains:

should_pass.erl:5:2: Invalid type specification for function should_pass:g/1. The success typing is 
          ('tag' | maybe_improper_list()) -> [[[[any()]]]]

The return type looks really wild, why is Dialyzer inferring a 4-times nested list?

Reproducible build with Erlang 24-26:
https://github.com/albsch/dialyzer-test

Expected behavior
Dialyzer should not complain, the type spec is valid.

The text was updated successfully, but these errors were encountered:

TD5 · 2023-09-08T09:43:14Z

From Dialyzer 5.1's release notes (my emphasis), here's one aspect which I think muddies the water a bit:

Dialyzer's overloaded domain warning is now disabled by default, and can be enabled with the flag -Woverlapping_contract.

Dialyzer used to issue a warning for overloaded domains stating "such contracts are currently unsupported and are simply ignored".

These contracts are not "ignored" but rather, Dialyzer takes the union of the overloaded domains. This means that we lose the dependency from each corresponding input to output type. Because of this, the warning is really about not being able to establish a dependency between the input and output types of each respective overloaded function specification.

With this in mind, splitting the function to avoid an overload specification seems to satisfy Dialyzer:

-module(should_pass).

-export([g/1, h/1]).

-spec g(list(tag)) -> list(string()).
g([]) -> []; 
g([tag | _]) -> [h(tag)].                                                                                                                                                       

-spec h(tag) -> string().
h(tag) -> "".

And, indeed, changing the spec to naively simplify the overloading to:

-spec g(list(tag) | tag) -> list(string()) | string().

or

-spec g(list(tag) | tag) -> list(string()); (list(tag) | tag) -> string().

seems to make Dialyzer happy too.

I suspect this means the issue arrises due to how "Dialyzer takes the union of the overloaded domains" when it comes to refining the inferred success type using the specification given in the source file.

The return type looks really wild, why is Dialyzer inferring a 4-times nested list?

After "simplifying" the spec/inferred type, I imagine Dialyzer's reasons about string() (defined as https://www.erlang.org/doc/man/unicode#type-chardata), which notably allows nested and improper lists, and Dialyzer limits the depth it explores recursive types, which presumably explains the arbitrary quadruply-nested lists.

Dialyzer should not complain, the type spec is valid.

I think this is a reasonable claim, but pragmatically, I think using spec overloading is a "here be dragons" scenario.

For the solution inside Dialyzer, I think we need to revisit the soundness of the handling of overloading or the warning given for certain overloaded specs.

I did also try to enable the old -Woverlapping_contract warning explicitly, but it doesn't seem to be available in the CLI: https://github.com/erlang/otp/blob/master/lib/dialyzer/src/dialyzer_cl_parse.erl#L146

bjorng · 2023-09-13T08:00:38Z

The return type looks really wild, why is Dialyzer inferring a 4-times nested list?

The return type [[[[any()]]]] is weird and at first glance it might seem wrong. It is correct because any of the nested lists can be empty. (This took me a while to figure out.) Dialyzer infers this type because it sees a recursive call to g/1, and g/1 can return either [] or a non-empty list. Dialyzer does not understand that the call g(tag) only can return [].

After "simplifying" the spec/inferred type, I imagine Dialyzer's reasons about string() (defined as https://www.erlang.org/doc/man/unicode#type-chardata), which notably allows nested and improper lists, and Dialyzer limits the depth it explores recursive types, which presumably explains the arbitrary quadruply-nested lists.

The string type is not defined in unicode module; it is defined like this in the erlang module:

otp/erts/preloaded/src/erlang.erl

Line 122 in 6674ed5

-type string() :: [char()].

Now for why Dialyzer complains about the spec. Using OTP 26, Dialyzer gives a little more information about what Dialyzer considers wrong:

t.erl:13:2: Invalid type specification for function t:g/1.
 The success typing is t:g
          ('tag' | maybe_improper_list()) -> [[[[any()]]]]
 But the spec is t:g
          (['tag']) -> [string()];
         ('tag') -> string()
The return types do not overlap

As long as there is at least some overlap between spec and code, Dialyzer will allow it. However, lists are treated specially in that there must be an overlap not only when the lists are empty. So for the following example:

-spec h(list()) -> string().
h([H|T]) when is_atom(H) ->
    [H|h(T)];
h([]) ->
    [].

Dialyzer will complain:

t.erl:20:2: Invalid type specification for function t:h/1.
 The success typing is t:h
          ([atom()]) -> [atom()]
 But the spec is t:h
          ([any()]) -> string()
 The return types do not overlap

because the spec and the code only overlap for the empty list. I find it reasonable that there should be warning for this spec and code.

Therefore, I think that Dialyzer is correct in complaining about g/1. Perhaps we should make the warning a little bit clearer by saying "The return types do not overlap except when the lists are empty" or something similar.

albsch · 2023-09-22T13:38:19Z

Thank you for looking into this!

I find it reasonable that there should be warning for this spec and code.

Is this then considered a warning (that ideally can be disabled via a flag, like the warning for overlapping domains), or something that Dialyzer thinks will definitely lead to a runtime crash? Since that list edge case will never lead to a runtime crash and Dialyzer complains anyway, which I thought Dialyzer would (should?) never do.

So for the following example: h/1 Dialyzer will complain:

It's OK to complain about h/1 because h/1 definitely does not return just string() values when inputting a list() value. This will lead to a runtime crash if h/1 is used anywhere where string() values are expected. Dialyzer behaves for h/1 as expected.

bjorng · 2023-09-25T05:11:38Z

Is this then considered a warning (that ideally can be disabled via a flag, like the warning for overlapping domains), or something that Dialyzer thinks will definitely lead to a runtime crash? Since that list edge case will never lead to a runtime crash and Dialyzer complains anyway, which I thought Dialyzer would (should?) never do.

Dialyzer does not say that it will lead to a runtime crash. It is a warning that the combination of spec and code does not make much sense.

albsch added the bug Issue is reported as a bug label Sep 4, 2023

IngelaAndin added team:VM Assigned to OTP team VM team:PS Assigned to OTP team PS labels Sep 8, 2023

rickard-green assigned bjorng, kikofernandez and jhogberg Sep 11, 2023

IngelaAndin removed the team:PS Assigned to OTP team PS label Sep 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Dialyzer] False positive: strange interaction with `string()`, `list()` and overloaded functions #7624

[Dialyzer] False positive: strange interaction with `string()`, `list()` and overloaded functions #7624

albsch commented Sep 4, 2023

TD5 commented Sep 8, 2023 •

edited

Loading

bjorng commented Sep 13, 2023

albsch commented Sep 22, 2023 •

edited

Loading

bjorng commented Sep 25, 2023

[Dialyzer] False positive: strange interaction with string(), list() and overloaded functions #7624

[Dialyzer] False positive: strange interaction with string(), list() and overloaded functions #7624

Comments

albsch commented Sep 4, 2023

TD5 commented Sep 8, 2023 • edited Loading

bjorng commented Sep 13, 2023

albsch commented Sep 22, 2023 • edited Loading

bjorng commented Sep 25, 2023

[Dialyzer] False positive: strange interaction with `string()`, `list()` and overloaded functions #7624

[Dialyzer] False positive: strange interaction with `string()`, `list()` and overloaded functions #7624

TD5 commented Sep 8, 2023 •

edited

Loading

albsch commented Sep 22, 2023 •

edited

Loading