Skip to content

Latest commit

 

History

History
142 lines (91 loc) · 4.27 KB

overview.mkd

File metadata and controls

142 lines (91 loc) · 4.27 KB
title
Rich errors messages in GHC

This document explores a variety of solutions for incorporating rich error message support into GHC. There are several motivations for this work:

  • Prettier REPL output (e.g. highlight all mentions of id in green and, all types in purple, and all kinds in bold typeface).
  • Place the "squiggly red lines" support for error reporting provided by most Haskell IDEs on a more solid foundation
  • Enable automated refactoring tasks (e.g. instead of GHC saying "Use GADTs or TypeFamilies to permit this", the IDE should be able to prompt the user to do so in an automated manner)
  • Allow GHC to perform various expensive analyses (e.g. computing hole fits) to be performed asynchronously or on demand

Pretty-printer with point annotations

See the proposal.

Pros

  • Relatively low implementation and maintenance effort
  • Allows us to realize any number of design points on the spectrum from plain text errors (the status quo) to an explicit ADT-of-errors (discussed below)
  • Easily allows errors to be built up compositionally

Cons

  • Ambiguity regarding how to analyze documents
  • Still very presentation-centric

Pretty-printer with scoped annotations

This is the approach used by Idris (as described in A pretty-printer that says what it means).

Pros

  • Very low implementation and maintenance effort
  • Proven to be useful in Idris's rich REPL

Cons

  • Very presentation-centric
  • Not terribly conducive to analysis by external tools

Classic ADT-of-errors approach

Here we define a type defining all of the errors which can possibly be produced by GHC. To avoid separating the type definition and its rendering from the point where we emit it, we define a type for each error alongside the point where it's produced (e.g. in TcErrors). This avoids some of the maintenance overhead we worry this approach might otherwise incur.

-- We define a sum type of all errors GHC can throw
module GhcErrors where

import TcErrors 
import ... -- other modules which can produce errors

data GhcError = TcError' TcError
              | ...
              | GenericError SDoc
                -- ^ enables us to gradually port the existing SDoc errors to
                -- the new scheme.


-- Where we throw the error we define its structure and textual presentation
module TcErrors where

import {-# SOURCE #-} GhcErrors

data TcError = TyVarEqError' Error1
             | ...

instance Outputable TcError where
    ppr (Error1' e) = ppr e
    ...

...

-- In an attempt to minimize the maintenance overhead incurred by this change,
-- we define the individual errors (e.g. type variable equality error) as
-- separate types rather than as constructors of TcError
-- such that we can place their definitions and Outputable instances next to
-- the place where they are produced.
data TyVarEqError = TyVarEqError ...

instance Outputable TyVarEqError where ...

mkTyVarEqErr :: ... -> TcM GhcError
mkTyVarEqErr = do
  ...
  return $ TcError' $ TyVarEqError' $ TyVarEqError { ... }

Pros

  • It's clear what errors can be produced and what information they carry

Cons

  • Import cycles. Either GhcErrors or TcErrors et al. will require an hs-boot file due to the cyclic imports.
  • Maintenance overhead: a lot of words are spent enumerating the types of errors that GHC can throw (e.g. defining the GhcError, TcError, and TyVarEqError types).

A more dynamic approach

I suspect the import cycles above are nearly a deal-breaker. To avoid them we can go for a slightly more dynamic approach, taking inspiration from our extensible exceptions mechanism:

module GhcErrors where

class (Typeable a, Outputable a) => IsError a

data GhcError where
  GhcError :: forall a. (IsError a) => a -> GhcError


--
module TcErrors where

data Error1 = Error1 ...

instance Outputable Error1 where ...
instance IsError Error1

mkError1 :: ... -> TcM GhcError

Pros

  • We eliminate much of the maintenance overhead associated with defining the error sum types.

Cons

  • It's slightly less obvious what
  • Consumption requires dynamic type comparisons.