author | date-accepted | ticket-url | implemented |
---|---|---|---|
Alejandro Serrano Mena |
This proposal is discussed at this pull request.
The syntax of do blocks in Haskell requires a final expression in order to even parse, and lists cannot start with a comma. GHC currently reports both situations as parse errors. This means that half-baked do blocks, or a comment that replaces the first element of a list, stop the pipeline, and no feedback can be gathered about name resolution or typing. This scenario arises quite often during interactive development. Our proposal is to treat those unfinished elements as having holes in the missing places.
The syntax of do blocks in Haskell requires a final expression in order to be syntactically correct. However, during development, it is quite common to have "half-baked" do blocks in which that final expression is missing.
do putStr "what is your name?"
   name <- getLine
   -- this do block is syntactically wrong
The last statement in a 'do' block must be an expression
  name <- getLine
Since the error happens in the parsing phase, it is impossible to get any feedback from the following phases (name resolution, typing) until that error is fixed. This problem manifests even more when using interactive development tools, as pointed out by Neil Mitchell.
Another instance of this problem is that list literals do not accept an initial comma. Suppose we are developing code which uses a list, following a common code style within the Haskell community.
thing [ a
      , b
      , c
      ]
Then we turn the first element into a comment, as follows.
thing [ -- a
We have the same problem as with do blocks: this missing item makes the code syntactically wrong, ruling out any further analysis by the compiler.
A third instance is having a type signature without the corresponding definition. This also happens when writing code, as many people write the type signature first and then the implementation.
f :: Int -> Int
The type signature for ‘f’ lacks an accompanying binding
Instead of flagging a parse error, the compiler allows such "unfinished" sequences of items to pass the parsing phase. Conceptually, they are treated as having a (typed) hole wherever the item is missing, but the error reports the location of the entire block instead. This is enough to allow the compiler to continue until the typing phase, which is great for interactive development. These holes can be deferred until runtime if -fdefer-type-errors is also enabled.
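The runtime behaviour described above can already be approximated by writing the hole by hand. The following is a minimal sketch, not part of the proposal itself: it assumes a GHC version in which -fdefer-type-errors also defers typed holes, and the names unfinished, reachedHole and _end are chosen only for illustration.

```haskell
{-# OPTIONS_GHC -fdefer-type-errors #-}
module Main where

import Control.Exception (SomeException, try)

-- Hand-written stand-in for what the proposal would insert automatically:
-- the missing final expression is replaced by a typed hole named '_end'.
-- With deferred type errors, GHC reports the hole as a warning.
unfinished :: IO ()
unfinished = do
  putStrLn "before the hole"
  _end

-- Runs the unfinished block and reports whether an exception was raised,
-- which is what is expected once execution reaches the deferred hole.
reachedHole :: IO Bool
reachedHole = do
  result <- try unfinished :: IO (Either SomeException ())
  pure (either (const True) (const False) result)

main :: IO ()
main = do
  hit <- reachedHole
  putStrLn (if hit then "the hole was reached at runtime"
                   else "no hole was reached")
```

The statements before the hole still run; only reaching the hole itself fails, which is the behaviour an interactive tool would rely on.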
Considering the syntax in the corresponding section of the Haskell 2010 Report, we make the following change:
lexp → do { stmts }
- stmts → stmt1 … stmtn exp [;]
+ stmts → stmt1 … stmtn [exp] [;]
stmt → ...
The translation section is updated with the rule:
+ do { stmt } = do { stmt ; _end } (where '_end' is a fresh hole)
As described below in the implementation plan, GHC already allows unfinished do blocks in its syntax.
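The translation rule can be sketched on a toy statement type. This is only an illustration under stated assumptions: Stmt, Hole and completeDo are hypothetical names, not GHC's actual AST or API, and treating an empty block as unfinished is a choice made here rather than something the proposal states.

```haskell
module Main where

-- Toy statement type standing in for GHC's real AST (hypothetical names).
data Stmt
  = Bind String String   -- pattern <- expression, both kept as plain text
  | Body String          -- a bare expression statement
  | Hole String          -- a typed hole such as _end
  deriving (Eq, Show)

-- Append a hole when the block does not end in an expression.
-- An empty block is treated as unfinished as well (an assumption).
completeDo :: [Stmt] -> [Stmt]
completeDo stmts
  | endsInBody stmts = stmts
  | otherwise        = stmts ++ [Hole "_end"]
  where
    endsInBody [] = False
    endsInBody xs = case last xs of
      Body _ -> True
      _      -> False

main :: IO ()
main = do
  -- Unfinished block: a hole is appended after the bind.
  print (completeDo [Body "putStr \"what is your name?\"", Bind "name" "getLine"])
  -- Finished block: returned unchanged.
  print (completeDo [Body "return ()"])
```

A real implementation would also have to generate a fresh hole name and attach the source span of the whole block, as the error message in this proposal suggests.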
Considering the syntax for list literals of the Haskell 2010 Report, we make the following change:
aexp → [ exp1 , … , expk ]
+ | [ , exp1, … , expk ]
The translation section is updated with the rule:
+ [ , e1, …, ek ] = [ _elt , e1, …, ek ]
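The extra production and its translation can be sketched with a small hand-written parser over the tokens between the brackets. All names here (Token, Elem, parseItems) are hypothetical and unrelated to GHC's parser; the empty list [] is left out because it is covered by a separate production.

```haskell
module Main where

-- Tokens found between '[' and ']' (toy representation).
data Token = TComma | TExp String
  deriving (Eq, Show)

-- A list element is either a real expression or the hole '_elt'.
data Elem = Elem String | EltHole
  deriving (Eq, Show)

-- New rule: a leading comma contributes a hole in the first position.
parseItems :: [Token] -> Maybe [Elem]
parseItems (TComma : rest) = (EltHole :) <$> items rest
parseItems toks            = items toks

-- Existing rule: one or more expressions separated by commas.
items :: [Token] -> Maybe [Elem]
items [TExp e]                 = Just [Elem e]
items (TExp e : TComma : rest) = (Elem e :) <$> items rest
items _                        = Nothing

main :: IO ()
main =
  -- Corresponds to: [ -- a
  --                 , b
  --                 , c ]
  print (parseItems [TComma, TExp "b", TComma, TExp "c"])
```

Only the initial position is relaxed in this sketch; a doubled comma in the middle of the list is still rejected, matching the single new alternative in the grammar.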
For those cases in which there is a type signature but no binding, we also want to replace the missing binding with a single implementation consisting of a hole. This enables the compiler to continue. However, we do not want to introduce yet another message, since the existing one already points to the problem correctly.
The first example in the motivation section would produce the following error message instead of the current one:
• Found unfinished do block
(the last statement must be an expression)
with inferred type :: IO b
at <interactive>:(2,1)-(4,7)
Where: ‘b’ is a rigid type variable bound by
the inferred type of it :: IO b
at <interactive>:(10,1)-(12,7)
• In an equation for ...
• Relevant bindings include
name :: String (bound at <interactive>:3:4)
The effect of this proposal is that the programmer may get more useful feedback than GHC gives now in this kind of scenario. Furthermore, tools for interactive development may switch on this flag by default, as they do now with others such as -fdefer-type-errors.
The maintenance cost seems quite low, given that the change is quite local.
One drawback is that tools for interactive development sometimes hard-code the shape of the error messages. If we change the message, some of these tools may break.
The main alternative is to keep the status quo. Some people (including myself) have learnt that whenever you start a do block you must immediately write undefined or a hole afterwards, in order not to break the interactive development cycle. However, it feels odd that this workaround is needed, given that GHC can detect the problem with a lot of precision.
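For reference, the status quo workaround looks roughly like this. It is a minimal sketch: askName is a hypothetical name, and undefined stands in for code that has not been written yet.

```haskell
module Main where

-- Workaround under the status quo: close the unfinished block by hand.
askName :: IO String
askName = do
  putStr "what is your name?"
  name <- getLine
  -- Placeholder for the code that is not written yet. It keeps the block
  -- syntactically valid, so name resolution and type checking still run.
  -- Running the block up to this point would raise an exception.
  undefined

main :: IO ()
main = putStrLn "askName type checks even though it is unfinished"
```

The placeholder has to be remembered and later removed by the programmer, which is the manual step this proposal aims to make unnecessary.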
Another usual suspect among the errors which are signalled as parse errors, but occur often during interactive development, is the unfinished binding in a where block.
f = g 3
  where g =
Would it be possible to turn this into a hole too? My fear is that the syntax of Haskell, with its complicated layout rules, may require too much lookahead for this to work. Other than that, it would be great if we could obtain a message such as:
• Found missing implementation of 'g'
with inferred type :: Num a => a -> b
It turns out that GHC already checks that do blocks end with an expression in a separate phase! We can read the following in the Parser.y file (although ParseUtils no longer exists).
-- The last Stmt should be an expression, but that's hard to enforce
-- here, because we need too much lookahead if we see do { e ; }
-- So we use BodyStmts throughout, and switch the last one over
-- in ParseUtils.checkDo instead
This suggests that the implementation should be quite straightforward.
The syntax of list expressions is a bit more convoluted, since we have many different kinds (literals, comprehensions, sequences). But conceptually a new production rule:
lexps :: { forall b. DisambECP b => PV [Located b] }
: ...
| ',' lexps
should be enough to make this work.