Fix handling undefined records and record fields by gomoripeti · Pull Request #147 · josefs/Gradualizer

gomoripeti · 2019-01-15T19:51:01Z

In general Gradualizer should be run on code that compiles without
errors. (i.e. there should be no undefined records or record fields)
However if such a case is encounter crash with an error tuple (that is a
bit nicer than badkey, and much nicer than
function_clause, handle_type_error({error, {record_field_not_found, ...)

I kept one exception: undefined remote records (and fixed the error
formatting). This is a bit different as it is fetched from
gradluaizer_db. The compiler does not check entities in remote modules
(except behaviours), but it will catch this anyway when the remote
module itself is compiled. Maybe this should be an error as well.

codecov-io · 2019-01-15T19:56:03Z

Codecov Report

Merging #147 into master will increase coverage by 0.19%.
The diff coverage is 96.55%.

@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
+ Coverage   81.69%   81.88%   +0.19%     
==========================================
  Files           9        9              
  Lines        2054     2059       +5     
==========================================
+ Hits         1678     1686       +8     
+ Misses        376      373       -3

Impacted Files	Coverage Δ
src/typechecker.erl	`82.41% <96.55%> (+0.23%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ee4396...93bf0ba. Read the comment docs.

zuiderkwast · 2019-01-17T14:27:47Z

+        #{RecName := Fields} ->
+            Fields;
+        _NotFound ->
+            error({undefined_record, RecName})


Why not throw({undef, record, RecName})? Error is for unexpected cases in the program IMO, e.g. for invalid input to a function, so a badarg error perhaps. But I think a throw with a formatted error message is more user friendly.

I used error/1 and not throw/1 to emphasize that this is invalid input to this function. This is the responsibility of the compiler and gradualizer should only be given forms which don't have undefined records or fields. (I used undefined_record and undefined_field because these are the atoms the compiler also returns as errors in such cases)

Ok, but the compiler returns those errors, it doesn't raise them as errors, right? There is a difference.... The word "error" is overloaded....

yes, yes, there are two things. I looked at the compiler only to choose an atom.
Independent from the compiler, I chose to raise in order to emphasize (that this is not the responsibility of Gradualizer). But I can change this.

josefs · 2019-01-21T09:49:03Z

In general Gradualizer should be run on code that compiles without
errors. (i.e. there should be no undefined records or record fields)

I disagree. The way that programmers work in typed languages such as Haskell is, they start by running the program through the type checker until the type checker stops complaining. Only after that do they compile it.

I think that we should make as few assumptions as possible about how and when the Gradualizer is run. If we can give informative error messages in all kinds of situations then Gradualizer will be a much more usable tool.

gomoripeti · 2019-01-21T16:10:53Z

I thought that the type checker is part of the Haskell compiler...

Anyway, I think in practice people have powerful IDE's nowadays with on-the-fly compilation (+unit testing and what not) as they type. We can also plug into this chain on-the-fly gradualization (as long as they are fast enough the order does not matter) in the end the programmer gets real-time warnings about any of the on-the-fly steps. So it has no advantage to run Gradualizer before compilation.
Another thing is that Elixir users can only check beam files - so that part of the community will for sure run Gradualizer after compilation.
From a practical side of developing Gradualizer, making no assumptions means duplicating a lot (is it a lot?) of work that is already present in the compiler - instead of focusing on features that for sure the compiler wouldn't do.
What we could do is when the AST is read from an erl source file (not beam) instead of epp we could use compile:file(Erl, ['P', binary]) which runs not just the preprocessor but also parse_transforms (think of lager) and the linter, before returning the abstract forms?

I could argue that taking the source file instead of beam has no advantage - but it has, we can gain richer AST eg. by including column info.
Another argument against my point is if we would like to run Gradualizer as a parse_transform (don't know if that will ever make sense), that is definitely run before the linter (and would face undefined records etc.)

so I have a few arguments pro and con.

zuiderkwast · 2019-01-21T17:03:55Z

Another thing is that Elixir users can only check beam files - so that part of the community will for sure run Gradualizer after compilation.

Are you sure? I thought there's a way to only generate the AST and then type check it. We have exposed a gradualizer:type_check_forms/2 for this.

What we could do is when the AST is read from an erl source file (not beam) instead of epp we could use compile:file(Erl, ['P', binary]) which runs not just the preprocessor but also parse_transforms (think of lager) and the linter, before returning the abstract forms?

That's great! Let's do this! 👏 🍰 😄

We can even get the column information if we first parse the source code and then run the compiler checks using compile:forms(Forms, ['P', binary]).

gomoripeti · 2019-01-21T17:32:13Z

Are you sure? I thought there's a way to only generate the AST and then type check it. We have exposed a gradualizer:type_check_forms/2 for this.

I'm not familiar with the Elixir compiler in-depth, but there are all kinds of inter-dependencies between Elixir modules (require, use) and one Elixir module can contain multiple modules (resulting in multiple beam files) so it does not look straight-forward to me (I also don't see in the Elixir docs how to pass Erlang compiler options on).

We can even get the column information...

I also thought of this :)

josefs · 2019-01-22T09:39:21Z

I thought that the type checker is part of the Haskell compiler...

Indeed. I'm talking about them separately for the sake of this discussion as the typechecker and compiler are different for Erlang.

Anyway, I think in practice people have powerful IDE's nowadays with on-the-fly compilation (+unit testing and what not) as they type. We can also plug into this chain on-the-fly gradualization (as long as they are fast enough the order does not matter) in the end the programmer gets real-time warnings about any of the on-the-fly steps. So it has no advantage to run Gradualizer before compilation.

I envision enabling on-the-fly graudalization as part of their IDEs. Especially if the Gradualizer is faster than the compiler. But also because it will provide a lot more helpful feedback than the compiler.

But I get your point about the problem of duplicating checks. I didn't mean to suggest that we duplicate all the checks that the compiler does. What I mean is that when we find ourselves in a situation where the Gradualizer finds something that it not right with the input program, it should report that in a way that is helpful to the programmer and not simply crash. A concrete example is records; if the programmer tries to use a record that is not defined it is better if Gradualizer reports this in a friendly way instead of crashing.

What we could do is when the AST is read from an erl source file (not beam) instead of epp we could use compile:file(Erl, ['P', binary]) which runs not just the preprocessor but also parse_transforms (think of lager) and the linter, before returning the abstract forms?

That's great! Let's do this! 👏 🍰 😄

That's indeed an interesting idea. Do you have any sense if we might take a performance hit?

gomoripeti · 2019-01-22T12:33:49Z

off-topic: the Erlang compiler is getting better and better in type inference:
erlang/otp#2091
erlang/otp#2100
(it's just good to see development around types on multiple threads around Erlang tool-chain)

gomoripeti · 2019-01-26T12:04:17Z

So we agree that we shouldn't check if a record or field is defined if we don't need that info (just to please the user - as I implemented in a previous commit)

I think for Erlang programmers ("let it crash") a nice crash report is helpful and readable. To make it even more helpful we would need to pass around the location info. And I think the error thrown in these cases should be distinct from type_errors (and pattern_errors).

In practice Gradualizer is mostly run on code that compiles without errors. (i.e. there should be no undefined records or record fields) However we don't make such assumptions and if an undefined record or record field is encountered it is still reported in a user friendly way. (Let's note though that Gradualizer won't make extra efforts to detect all undefined records or record fields if they are not needed for type checking)

CodeCov complains that there are too many uncovered changed lines. (Undefined recods cannot be tested in should_fail modules, because compilation fails before eunit could be run on them.)

This is rather a known_problem. I realised that it's a bit more complicated to implement, so for now only the crash is addressed with a big TODO for later.

gomoripeti · 2019-01-26T14:54:38Z

+
+%% wildcard in pattern matching
+-spec h(#rec{}) -> boolean().
+h(#rec{apa = 1, _ = true}) ->


this is something the compiler allows, but I dont think it's quite useful or used. Should we allow this at all or rather throw a warning (similar to lazy-andalso) discouraging the usage?

It's quite much used in some code that I've seen, so I think we should allow it.

E.g. when writing patterns for mnesia. I wonder how we could support that, but that's for the future...

do you see it in record creation or record patterns? do you see it with any other value than the underscore atom '_' or 'undefined'? it's totally fine with creating match-specs for ets/mnesia. (ets:fun2ms(fun(#rec{apa = 1, _ = undefined}) -> ... is also converted to record creation and not a pattern by the ms_transform parse trans). Given that could you give a real-world example of using wildcard in record pattern?

I'm fine with supporting this...also I realise this is the wrong PR, I just found this case when reviewing my code changes.

I've never seen it in a pattern. Only in expressions. I've seen it with _ = false in a record where all fields are booleans.

zuiderkwast

Looks good to me.

josefs · 2019-01-26T18:45:25Z

So we agree that we shouldn't check if a record or field is defined if we don't need that info (just to please the user - as I implemented in a previous commit)

Yes.

I think for Erlang programmers ("let it crash") a nice crash report is helpful and readable. To make it even more helpful we would need to pass around the location info. And I think the error thrown in these cases should be distinct from type_errors (and pattern_errors).

Erlang's "let is crash" slogan is useful for writing fault tolerant distributed systems. But it need to be amended with, "and handle the crash gracefully by other means". Erlang has supervision trees and other nice mechanisms to handle that for a distributed system.
I don't see how it's helpful for the programmer if we apply the "let it crash" slogan to the Gradualizer. But I do agree that it makes sense to report a different kind of error for things like unused records, compared to type errors.

josefs

Great work!

josefs · 2019-01-26T18:47:57Z

    all_type(Tys, Ty, AOut, [Cs|Css], TEnv).

-get_record_fields(RecName, Anno, #tenv{records = REnv}) ->
+get_maybe_remote_record_fields(RecName, Anno, TEnv) ->


Nit: I think 'get_possibly_remote_record_field' would be a better name.

gomoripeti requested review from UlfNorell, Zalastax, josefs and zuiderkwast January 15, 2019 19:51

gomoripeti mentioned this pull request Jan 15, 2019

WIP: Improve type checking records #141

Closed

zuiderkwast reviewed Jan 17, 2019

View reviewed changes

gomoripeti added 3 commits January 26, 2019 13:47

Add unit tests for undefined record and field

648339d

CodeCov complains that there are too many uncovered changed lines. (Undefined recods cannot be tested in should_fail modules, because compilation fails before eunit could be run on them.)

Don't crash on record field wildcard in patterns

93bf0ba

This is rather a known_problem. I realised that it's a bit more complicated to implement, so for now only the crash is addressed with a big TODO for later.

gomoripeti force-pushed the undefined_records branch from c949074 to 93bf0ba Compare January 26, 2019 13:24

gomoripeti commented Jan 26, 2019

View reviewed changes

zuiderkwast approved these changes Jan 26, 2019

View reviewed changes

josefs approved these changes Jan 26, 2019

View reviewed changes

josefs merged commit 5095fa6 into josefs:master Jan 27, 2019

gomoripeti deleted the undefined_records branch January 27, 2019 20:57

gomoripeti mentioned this pull request Mar 29, 2019

Refactor operator patterns #174

Merged

Conversation

gomoripeti commented Jan 15, 2019

Uh oh!

codecov-io commented Jan 15, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

josefs commented Jan 21, 2019

Uh oh!

gomoripeti commented Jan 21, 2019

Uh oh!

zuiderkwast commented Jan 21, 2019

Uh oh!

gomoripeti commented Jan 21, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

josefs commented Jan 22, 2019

Uh oh!

gomoripeti commented Jan 22, 2019

Uh oh!

gomoripeti commented Jan 26, 2019

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zuiderkwast left a comment

Choose a reason for hiding this comment

Uh oh!

josefs commented Jan 26, 2019

Uh oh!

josefs left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov-io commented Jan 15, 2019 •

edited

Loading

gomoripeti commented Jan 21, 2019 •

edited

Loading