@natefaubion released this May 30, 2019

Grammar/Parser Changes

0.13 is a very exciting release for me (@natefaubion). For the past few months I've been working on a complete rewrite of the existing parser. The old parser has served us very well, but it has grown organically over the years, which means it's developed some unsightly limbs! Throughout the process I've tried to iron out a lot of dark corner cases in the language grammar, and I hope this release will set us on a firm foundation so we can start to specify what "PureScript the Language" actually is. This release is definitely breaking, but I think you'll find the changes are modest. I also hope that this release will open up a lot of opportunities for syntactic tooling, either using the existing parser or using alternative parsers (which are now possible).

Breaking

There are a number of breaking changes, but I think you'll find that most code will continue to parse fine. We've tested the parser against the existing ecosystem and several large production applications at Awake, Lumi, and SlamData. The migration burden was either non-existent or only involved a few changes.

  • The only whitespace now allowed in code is ASCII space and line endings. Since you must use indentation to format PureScript code (unlike Haskell), we felt it was best to be more restrictive in what you can write instead of allowing potentially confusing behavior (implicit tab-width, zero-width spaces, etc.). You can still use Unicode whitespace within string literals.
  • The only escapes accepted in string literals are \n, \r, \t, \', \", \\, \x[0-9a-fA-F]{1,6} (Unicode hex escape), and \[\r\n ]+\ (gap escapes). We had inherited a vast zoo of escape codes from the Parsec Haskell Language parser. We decided to minimize what we support, and only add things back if there is significant demand.
  • Octal and binary literals have been removed (hex remains).
  • \ is no longer a valid operator. It conflicts with lambda syntax.
  • @ is no longer a valid operator. It conflicts with named binder syntax.
  • forall is no longer a valid identifier for expressions. We wanted a consistent rule for type identifiers and expression identifiers.
  • The precedence of constructors with arguments in binders has changed (a@Foo b must now be written a@(Foo b)).
  • The precedence of kind annotations has changed (a :: Type -> Type b :: Type must now be written (a :: Type -> Type) (b :: Type)).
  • The precedence of type annotations has changed (:: now has the lowest precedence, rather than sitting between operators and function application).
  • Various edge cases with indentation/layout. Again, most code should work fine, but there were some cases where the old parser let you write code that violated the offside rule.
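
To make the precedence and literal changes above concrete, here is a minimal sketch of a module written in the forms the new parser accepts. The module name, the Foo and Pair types, and all value names are hypothetical and chosen only for illustration; the comments restate the rules from the list rather than anything checked against the compiler.

```purescript
module Migration where

import Prelude

data Foo = Foo Int

-- Named binder with a constructor that takes arguments: the parentheses
-- are now required. The release notes give `a@Foo b` as the old form.
keepZero :: Foo -> Foo
keepZero a@(Foo 0) = a
keepZero (Foo n) = Foo (n + 1)

-- Each kind-annotated type variable gets its own parentheses.
data Pair (f :: Type -> Type) (a :: Type) = Pair (f a) (f a)

-- `::` now has the lowest precedence, so annotating a single operand
-- needs parentheses around that operand.
annotatedOperand :: Int
annotatedOperand = (1 :: Int) + 2

-- Without parentheses the annotation applies to the whole expression `1 + 2`.
annotatedWhole :: Int
annotatedWhole = 1 + 2 :: Int

-- Hex literals remain available; values formerly written in octal or
-- binary need to be rewritten in hex or decimal.
mask :: Int
mask = 0xFF
```

Code that already parenthesised these forms should need no changes.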

Fixes

  • Many fixes around parse error locations. The new parser should yield much more precise error locations, especially for large expressions (like in HTML DSLs).
  • Reported source spans no longer include whitespace and comments.
  • Reported source span for the last token in a file is now correct.

Enhancements

  • where is still only sugar for let (it does not introduce bindings over guards), but it is now usable in case branches in the same manner as declarations.
  • _ is now allowed in numeric literals, and is an ignored character (i.e. 1_000_000 == 1000000).
  • Raw string literals (triple quotes) can now contain trailing quotes (i.e. """hello "world"""" == "hello \"world\"").
  • Kind annotations are now allowed in forall contexts (#3576 @colinwahl).
  • The new parser is much faster and can avoid parsing module bodies when initially sorting modules. We also do more work in parallel during the initialization phase of purs compile. This means that time to start compiling is faster, and incremental builds are faster. In my testing, a no-op call to purs compile on the Awake codebase went from ~10s to ~3s.
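
The syntactic enhancements in this list can be sketched in one small module. This is an illustrative example only: the module name and every value name are hypothetical, and the indentation used for where inside a case branch is one layout that should satisfy the offside rule, not the only one.

```purescript
module Enhancements where

import Prelude

-- `where` attached to individual case branches, as with declarations.
describe :: Int -> String
describe n = case n of
  0 -> zero
    where
    zero = "zero"
  _ -> other
    where
    other = "nonzero: " <> show n

-- Underscores in numeric literals are ignored.
million :: Int
million = 1_000_000

-- A raw string literal that ends in a quote character:
-- the value is hello "world" including both inner quotes.
quoted :: String
quoted = """hello "world""""

-- A kind annotation on a type variable bound by `forall`.
keep :: forall (a :: Type). a -> a
keep x = x
```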

Other Changes

Breaking

  • Fix sharing in function composition inlining (#3439 @natefaubion). This is really a bugfix, but it has the potential to break code. Previously, you could write recursive point-free compositions that the compiler inadvertently eta-expanded into working code by eliminating sharing. We've changed the optimization to respect strict evaluation semantics, which can cause existing code to stack overflow. This generally arises in instance definitions. Unfortunately, we don't have a way to disallow the problematic code at this time.
  • Fail compilation when a module imports itself (#3586 @hdgarrood).
  • Disallow re-exporting class and type with the same name (#3648 @joneshf).
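
The composition change is easiest to see with a constrained instance over a recursive type. The sketch below is an assumption-laden illustration, not code from the compiler's test suite: the Size class, the Rose type, and the children helper mentioned in the comment are all hypothetical, and sum comes from Data.Foldable in the foldable-traversable package. The point-free form appears only in a comment because, as the description above suggests, it would build the instance dictionary recursively and eagerly; the code that is actually defined uses explicit arguments, which keeps the recursive reference under a lambda.

```purescript
module Sharing where

import Prelude
import Data.Foldable (sum)

class Size a where
  size :: a -> Int

instance sizeInt :: Size Int where
  size _ = 1

instance sizeArray :: Size a => Size (Array a) where
  size xs = sum (map size xs)

data Rose a = Rose a (Array (Rose a))

-- A point-free member such as
--
--   size = add 1 <<< size <<< children
--
-- (with a hypothetical `children :: Rose a -> Array (Rose a)`) refers to
-- `size` at type `Array (Rose a)` outside of any lambda. Under strict
-- evaluation that reference is resolved while the `Rose` instance is still
-- being constructed, which is the kind of code that can now overflow the stack.
--
-- Writing the arguments explicitly defers the recursive lookup until call time.
instance sizeRose :: Size a => Size (Rose a) where
  size (Rose x kids) = size x + size kids
```

If an instance starts overflowing the stack after upgrading, eta-expanding its point-free members in this way is a reasonable first thing to try.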

Enhancements

  • Better illegal whitespace errors (#3627 @hdgarrood).
  • Only display class members that are not exported from the module when throwing a TransitiveExportError for a class (#3612 @colinwahl).
  • Tweaks to type pretty printing (#3610 @garyb).
  • Unify matching constraints (#3620 @garyb).
  • Improve error message on ModuleNotFound error for Prim modules (#3637 @ealmansi).

Docs

  • Make markdown format behave like html. Remove --docgen opt. Separate directories for html and markdown docs (#3641 @ealmansi).
  • Make html the default output format (#3643 @ealmansi).
  • Write ctags and etags to filesystem instead of stdout (#3644 @ealmansi).
  • Add --output option for purs docs (#3647 @hdgarrood).
  • Use externs files when producing docs (#3645 @hdgarrood). docs is now a codegen target for purs compile where documentation is persisted as a docs.json file in the output directory.

Internal
