Fix handling undefined records and record fields #147
Codecov Report
@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
+ Coverage   81.69%   81.88%   +0.19%
==========================================
  Files           9        9
  Lines        2054     2059       +5
==========================================
+ Hits         1678     1686       +8
+ Misses        376      373       -3
    #{RecName := Fields} ->
        Fields;
    _NotFound ->
        error({undefined_record, RecName})
Why not throw({undef, record, RecName})? Error is for unexpected cases in the program IMO, e.g. for invalid input to a function, so a badarg error perhaps. But I think a throw with a formatted error message is more user friendly.
I used error/1 and not throw/1 to emphasize that this is invalid input to this function. This is the responsibility of the compiler and gradualizer should only be given forms which don't have undefined records or fields. (I used undefined_record and undefined_field because these are the atoms the compiler also returns as errors in such cases)
Ok, but the compiler returns those errors, it doesn't raise them as errors, right? There is a difference.... The word "error" is overloaded....
yes, yes, there are two things. I looked at the compiler only to choose an atom.
Independent from the compiler, I chose to raise in order to emphasize (that this is not the responsibility of Gradualizer). But I can change this.
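For readers following the error/1 versus throw/1 question, here is a minimal sketch of how the two choices differ for a caller. The module name raise_demo and the functions lookup_error/2, lookup_throw/2 and classify/1 are hypothetical and not part of Gradualizer; only the map lookup and the two raised terms are taken from the snippets discussed above.

```erlang
-module(raise_demo).
-export([lookup_error/2, lookup_throw/2, classify/1]).

%% Hypothetical helper: look up a record's fields in a map of records and
%% raise an exception of class `error` when the record is missing.
lookup_error(RecName, REnv) ->
    case REnv of
        #{RecName := Fields} -> Fields;
        _NotFound -> error({undefined_record, RecName})
    end.

%% Same lookup, but raising an exception of class `throw` instead.
lookup_throw(RecName, REnv) ->
    case REnv of
        #{RecName := Fields} -> Fields;
        _NotFound -> throw({undef, record, RecName})
    end.

%% Run a zero-arity fun and return either {ok, Value} or the exception
%% class together with its reason.
classify(Fun) ->
    try Fun() of
        Value -> {ok, Value}
    catch
        Class:Reason -> {Class, Reason}
    end.
```

With an empty record map, classify(fun() -> raise_demo:lookup_error(rec, #{}) end) returns {error, {undefined_record, rec}}, while the throw variant returns {throw, {undef, record, rec}}. A caller that only catches throw:_ would therefore let the first one propagate as a crash.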
I disagree. Programmers working in typed languages such as Haskell start by running the program through the type checker until the type checker stops complaining. Only after that do they compile it. I think that we should make as few assumptions as possible about how and when the Gradualizer is run. If we can give informative error messages in all kinds of situations then Gradualizer will be a much more usable tool.
I thought that the type checker is part of the Haskell compiler... Anyway, I think in practice people have powerful IDEs nowadays with on-the-fly compilation (plus unit testing and whatnot) as they type. We could also plug on-the-fly gradualization into this chain (as long as the steps are fast enough, the order does not matter); in the end the programmer gets real-time warnings from every on-the-fly step. So there is no advantage to running Gradualizer before compilation. I could argue that taking the source file instead of the beam has no advantage, but it does: we can get a richer AST, e.g. by including column info. So I have a few arguments pro and con.
Are you sure? I thought there's a way to only generate the AST and then type check it. We have exposed a
That's great! Let's do this! 👏 🍰 😄 We can even get the column information if we first parse the source code and then run the compiler checks using
I'm not familiar with the Elixir compiler in-depth, but there are all kinds of inter-dependencies between Elixir modules (
I also thought of this :)
Indeed. I'm talking about them separately for the sake of this discussion as the typechecker and compiler are different for Erlang.
I envision people enabling on-the-fly gradualization as part of their IDEs, especially if the Gradualizer is faster than the compiler, but also because it will provide a lot more helpful feedback than the compiler. But I get your point about the problem of duplicating checks. I didn't mean to suggest that we duplicate all the checks that the compiler does. What I mean is that when the Gradualizer finds something that is not right with the input program, it should report that in a way that is helpful to the programmer and not simply crash. A concrete example is records: if the programmer tries to use a record that is not defined, it is better if Gradualizer reports this in a friendly way instead of crashing.
That's indeed an interesting idea. Do you have any sense if we might take a performance hit?
off-topic: the Erlang compiler is getting better and better in type inference:
So we agree that we shouldn't check whether a record or field is defined if we don't need that info (just to please the user, as I implemented in a previous commit). I think for Erlang programmers ("let it crash") a nice crash report is helpful and readable. To make it even more helpful we would need to pass around the location info. And I think the error thrown in these cases should be distinct from type_errors (and pattern_errors).
In practice Gradualizer is mostly run on code that compiles without errors (i.e. there should be no undefined records or record fields). However, we don't make such assumptions, and if an undefined record or record field is encountered it is still reported in a user-friendly way. (Note, though, that Gradualizer won't make extra efforts to detect all undefined records or record fields if they are not needed for type checking.)
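As a rough illustration of "reported in a user-friendly way", the sketch below catches a dedicated undefined-record exception and turns it into a readable message. Everything here is an assumption made for illustration: the module friendly_report_demo, the functions check/2 and format_error/1, the use of throw/1 and the message wording are hypothetical and do not reflect the actual code in this PR.

```erlang
-module(friendly_report_demo).
-export([check/2, format_error/1]).

%% Hypothetical stand-in for a type-checking step that needs the fields
%% of a record. The term raised is distinct from any type error.
fields_of(RecName, REnv) ->
    case REnv of
        #{RecName := Fields} -> Fields;
        _NotFound -> throw({undefined_record, RecName})
    end.

%% Turn the exception into an {error, Message} result instead of crashing.
check(RecName, REnv) ->
    try fields_of(RecName, REnv) of
        Fields -> {ok, Fields}
    catch
        throw:{undefined_record, Name} ->
            {error, format_error({undefined_record, Name})}
    end.

%% Hypothetical message format; the real wording may differ.
format_error({undefined_record, RecName}) ->
    lists:flatten(io_lib:format("Undefined record ~p", [RecName])).
```

The lookup only happens when the checker actually needs the record's fields, which matches the note that no extra effort is made to detect every undefined record.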
CodeCov complains that there are too many uncovered changed lines. (Undefined records cannot be tested in should_fail modules, because compilation fails before eunit could be run on them.)
This is rather a known_problem. I realised that it's a bit more complicated to implement, so for now only the crash is addressed with a big TODO for later.
c949074 to 93bf0ba
    %% wildcard in pattern matching
    -spec h(#rec{}) -> boolean().
    h(#rec{apa = 1, _ = true}) ->
This is something the compiler allows, but I don't think it's particularly useful or widely used. Should we allow this at all, or rather emit a warning (similar to lazy-andalso) discouraging the usage?
It's used quite a lot in some code that I've seen, so I think we should allow it.
E.g. when writing patterns for mnesia. I wonder how we could support that, but that's for the future...
Do you see it in record creation or in record patterns? Do you see it with any value other than the underscore atom '_' or 'undefined'? It's totally fine for creating match-specs for ets/mnesia. (ets:fun2ms(fun(#rec{apa = 1, _ = undefined}) -> ... is also converted to record creation, and not a pattern, by the ms_transform parse transform.) Given that, could you give a real-world example of using a wildcard in a record pattern?
I'm fine with supporting this... Also, I realise this is the wrong PR; I just found this case when reviewing my code changes.
I've never seen it in a pattern. Only in expressions. I've seen it with _ = false in a record where all fields are booleans.
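To make the two uses of the record wildcard concrete, here is a small sketch contrasting it in an expression and in a pattern. The module name rec_wildcard_demo, the record definition and make/0 are assumptions for illustration; only h/1 mirrors the test case quoted above.

```erlang
-module(rec_wildcard_demo).
-export([make/0, h/1]).

%% Hypothetical record; the field names other than apa are made up.
-record(rec, {apa, bepa, cepa}).

%% In an expression, `_ = true` initialises every field that is not
%% listed explicitly, so bepa and cepa both become true.
make() ->
    #rec{apa = 1, _ = true}.

%% In a pattern, `_ = true` requires every field that is not listed
%% explicitly to match true.
-spec h(#rec{}) -> boolean().
h(#rec{apa = 1, _ = true}) ->
    true;
h(#rec{}) ->
    false.
```

Here h(make()) returns true, whereas a record with apa = 1 and any other field set to something other than true falls through to the second clause.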
Yes.
Erlang's "let it crash" slogan is useful for writing fault-tolerant distributed systems. But it needs to be amended with "and handle the crash gracefully by other means". Erlang has supervision trees and other nice mechanisms to handle that for a distributed system.
    all_type(Tys, Ty, AOut, [Cs|Css], TEnv).

    get_record_fields(RecName, Anno, #tenv{records = REnv}) ->
    get_maybe_remote_record_fields(RecName, Anno, TEnv) ->
Nit: I think 'get_possibly_remote_record_field' would be a better name.
In general Gradualizer should be run on code that compiles without
errors (i.e. there should be no undefined records or record fields).
However, if such a case is encountered, crash with an error tuple (that is a
bit nicer than badkey, and much nicer than function_clause,
handle_type_error({error, {record_field_not_found, ...).

I kept one exception: undefined remote records (and fixed the error
formatting). This is a bit different as it is fetched from
gradualizer_db. The compiler does not check entities in remote modules
(except behaviours), but it will catch this anyway when the remote
module itself is compiled. Maybe this should be an error as well.