Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Polymorphic expression type #3615

Open
adnelson opened this issue Apr 29, 2019 · 5 comments
Open

Proposal: Polymorphic expression type #3615

adnelson opened this issue Apr 29, 2019 · 5 comments

Comments

@adnelson
Copy link
Contributor

adnelson commented Apr 29, 2019

Polymorphic expression type

This is a proposal to modify the current expression type, and other related types in Language.PureScript.AST, to be polymorphic.

Why

The current Expr is a monomorphic type which adds an argument of type SourceSpan to some, but not all, of its constructors. This is used to convey where the expression was parsed from in the source code. Other related types, however, such as Type and CoreFn.Expr, instead use a polymorphic type variable to pass through source information or any other annotation information.

There are a few advantages to being polymorphic:

  • When parsing a file it makes sense to have source information; however for programmatically generated expressions such as those during desugaring or type checking, a source annotation doesn't always make sense (as evidenced for example by the usage of nullSourceSpan for generated expressions in Entailment.hs). An annotation type of Maybe SourceSpan seems more appropriate than creating meaningless null source spans.

  • Making it clear where the annotations should live. There's at least one issue that I know of related to this, and presumably others.

  • Enabling a fully type-annotated AST. The reason I actually started looking into this is because I wanted to inspect type information during code generation; however, the annotation for a CoreFn.Expr only contains a Maybe SourceType. If the annotation type were polymorphic, then the typechecker could return e.g. an Expr SourceType which would guarantee that any expression would have its type be available alongside.

  • Using phantom types to represent different stages of compilation the code. This could be used for example to allow the type checker to guarantee that expressions are processed in a particular order.

    -- A desugared expression maintains a pointer to its source expression
    newtype Desugared = Desugared (Expr (Expr SourceType))
    
    -- desugar converts parsed expressions into desugared expressions
    desugar :: Expr SourceType -> Either Error Desugared

Work thus far

I got curious what this change would involve and spent some time refactoring the code in Declarations.hs and some related modules to see what would go into this type of change. My work so far is on this branch: https://github.com/purescript/purescript/compare/master...adnelson:annotated_expression?expand=1

What's done so far:

  • Renamed the current Expr to AbstractExpr a, and removed any SourceSpan arguments from constructors.
  • Created a new type AnnExpr a for an annotated expression, defined as data AnnExpr a = AnnExpr { eAnn :: a, eExpr :: AbstractExpr a }.
  • Module, Declaration, Binder and a few other types now become polymorphic as well, with SourceSpan arguments either removed or replaced with polymorphic variables. In most cases these have been given new names with a ' to be distinct from the original names.
  • Defined a bunch of type synonyms for existing types. For example, type Expr = AnnExpr SourceSpan, type Declaration = Declaration' SourceSpan, etc, to reflect the original semantics. As with all other naming decisions, I have no strong opinions.
  • Declarations type checks
  • Traversals type checks
  • Desugaring type checks

Work remaining

The majority of remaining work, that I can anticipate anyway, is adapting code which uses Expr to use either AbstractExpr or AnnExpr as the case may be. However, there are still quite a few modules left to update, and a lot of them contain complex code and/or places where it might make more sense to refactor, so I frankly might need help to push this across the finish line, if it's something the community sees as valuable. In any case I wanted to create this issue before going that much further, to avoid unnecessary work and get feedback before modifying a lot of code where I have no idea what I'm doing 😅

@adnelson
Copy link
Contributor Author

By the way, I'm happy to move this to discourse if that's preferred. Wasn't sure.

@natefaubion
Copy link
Contributor

I think this is desirable, and something we do in other data types (CoreFn, Kind, Type). However, I would use the naming scheme in those modules for consistency. We've avoided doing it for Expr just because it's so much work and very difficult to execute without tons of conflicts. Something to be aware of is Eq and Ord instances as well.

@adnelson
Copy link
Contributor Author

adnelson commented May 5, 2019

I would use the naming scheme in those modules for consistency.

So as I stated above, I'm completely unopinionated about naming schemes and am happy to go with whatever. But there is a difference in that in the changes I've been working on there are two interlinked expression types (annotated and unannotated), rather than the one for e.g. CoreFn.Expr. So, maybe I'm not quite understanding what you mean when you say to use the same naming scheme?

Something to be aware of is Eq and Ord instances as well.

Can you expand on this? The current Expr doesn't have either of these instances so I didn't add it. It wouldn't be difficult to do so but it seems like an orthogonal issue

@natefaubion
Copy link
Contributor

So, maybe I'm not quite understanding what you mean when you say to use the same naming scheme?

I just mean we have things like Type a withSourceType = Type SourceAnn and Kind a with SourceKind = Kind SourceAnn.

The current Expr doesn't have either of these instances so I didn't add it.

If it doesn't have these instance than no worries! We do a lot of deriving, and I know Type and Kind had them so we had to write some dodgy manual instances.

@adnelson
Copy link
Contributor Author

adnelson commented May 8, 2019

Having a type SourceExpr = Expr SourceSpan would be pretty cool. I went with Expr to try to avoid renaming things, but it's perhaps pointless because there's so much that needs to be changed anyway...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants