This repository was archived by the owner on Apr 1, 2025. It is now read-only.

@robrix (Contributor) commented Jun 20, 2019

This PR adds a generic mechanism for s-expression serialization of precise ASTs, e.g. those generated by @aymannadeem’s TH derivation of datatypes from tree-sitter grammars: tree-sitter/haskell-tree-sitter#144.

It does not yet hook this up or use it anywhere internally, however; that will happen sometime after the above PR is merged and a new version of the tree-sitter cabal package is pushed to Hackage.

import GHC.Generics

serializeSExpression :: ToSExpression t => t -> Builder
serializeSExpression t = toSExpression t 0 <> "\n"

@robrix (Contributor, Author):

I’ve used the same name for the class as in Serializing.SExpression because I intend that this module should eventually replace the former (and in the meantime, module namespacing is well and good).

This implementation does not support Options to show only the annotations of nodes, which we use in Serializing.SExpression for ts-parse, which I think we should drop as part of the migration to precise ASTs.



class ToSExpression t where
  toSExpression :: t -> Int -> Builder

@robrix (Contributor, Author):

This might actually be a good case for Uniplate or similar, but as I don’t have experience with the syb family of approaches I’ve opted instead to implement it using GHC.Generics.

Contributor:

TIL about Uniplate (well, yesterday, so YIL?). I’m not familiar with the advantages of using it to reduce boilerplate over GHC.Generics in this case!

@robrix (Contributor, Author):

Me neither 😅

@patrickt (Contributor):

Uniplate would probably make this code a lot shorter, but it would also be slower, rely on RTTI, and would necessitate some shenanigans to handle the Text case. We should start getting familiar with Uniplate just so we don’t have to write an advanced overlap every single time we need a generic function, but since this is here and working I’m a big 👍
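For comparison, here is a minimal sketch of the runtime-type-driven style mentioned above. It uses `Data.Data` from base (the syb-style foundation) rather than Uniplate itself, so it says nothing about Uniplate's own API. The types `Identifier` and `Call` and the function `sexpr` are hypothetical, and `String` stands in for `Text` so that the example needs nothing beyond base. The `cast` is the sort of special-casing the Text case would call for.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ, showConstr, toConstr)
import Data.Typeable (cast)

-- Hypothetical AST types, for illustration only (not from the PR).
newtype Identifier = Identifier String deriving (Data)
data Call = Call Identifier Identifier deriving (Data)

-- | Render any 'Data' value as a one-line s-expression.
-- Strings are detected at runtime via 'cast' and shown as literals;
-- every other value is rendered from its constructor name and children.
sexpr :: Data a => a -> String
sexpr x = case (cast x :: Maybe String) of
  Just s  -> show s
  Nothing -> "(" ++ unwords (showConstr (toConstr x) : gmapQ sexpr x) ++ ")"

main :: IO ()
main = putStrLn (sexpr (Call (Identifier "f") (Identifier "x")))
```

The whole traversal fits in one function, but the string check happens at runtime on every node, whereas the GHC.Generics version resolves it at compile time.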


type family ToSExpressionStrategy t :: Strategy where
  ToSExpressionStrategy Text = 'Show
  ToSExpressionStrategy _    = 'Generic

@robrix (Contributor, Author):

Our old friend advanced overlap (cf https://wiki.haskell.org/GHC/AdvancedOverlap, Oleg Kiselyov, Simon Peyton Jones) allows us to specialize the instances s.t.:

  • Text fields are shown,
  • everything else is handled via the generic GToSExpression class, and
  • we don’t require instances of ToSExpressionWithStrategy (below) for every possible AST type (which would be agonizing).

This means that specializations of the behaviour for specific AST types will have to be listed in here—which makes this specific approach suitable only for cases where we specifically do not intend to specialize the behaviour for any particular language’s AST. That is, if we sometimes wanted to customize the behaviour, we should take a different (tho perhaps similar) approach where we do explicitly define specialized ToSExpressionWithStrategy instances for each datatype in each AST.

It may be possible to do something clever e.g. with -XDerivingVia or type families that would allow you to customize the behaviour piecemeal without necessarily requiring per-AST-datatype instances, but this sufficed for this particular case so I haven’t explored it further.
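The strategy-selection pattern described in this comment can be sketched end to end as follows. This is a simplified reconstruction under stated assumptions, not the PR's actual code: it emits `String` instead of `Builder`, uses `String` where the PR uses `Text`, omits the `Int` indentation argument, and renders everything on one line. The example types `Identifier` and `Call` are hypothetical.

```haskell
{-# LANGUAGE DataKinds, DeriveGeneric, FlexibleContexts, FlexibleInstances,
             KindSignatures, MultiParamTypeClasses, ScopedTypeVariables,
             TypeFamilies, TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.Generics

-- | The two ways a field can be rendered (names follow the PR).
data Strategy = Generic | Show

-- | Closed type family: equations are tried top to bottom, so the
-- specific case wins and everything else falls through to 'Generic.
-- Assumption: String stands in for the PR's Text.
type family ToSExpressionStrategy (t :: Type) :: Strategy where
  ToSExpressionStrategy String = 'Show
  ToSExpressionStrategy _      = 'Generic

class ToSExpression t where
  toSExpression :: t -> String

-- | The single catch-all instance: look up the strategy for t and
-- delegate to the instance indexed by that strategy.
instance ToSExpressionWithStrategy (ToSExpressionStrategy t) t => ToSExpression t where
  toSExpression = toSExpressionWithStrategy (Proxy :: Proxy (ToSExpressionStrategy t))

class ToSExpressionWithStrategy (strategy :: Strategy) t where
  toSExpressionWithStrategy :: proxy strategy -> t -> String

instance Show t => ToSExpressionWithStrategy 'Show t where
  toSExpressionWithStrategy _ = show

instance (Generic t, GToSExpression (Rep t)) => ToSExpressionWithStrategy 'Generic t where
  toSExpressionWithStrategy _ t = "(" ++ unwords (gtoSExpression (from t)) ++ ")"

-- | Walks the generic representation, collecting one string per item.
class GToSExpression f where
  gtoSExpression :: f p -> [String]

instance GToSExpression f => GToSExpression (M1 D d f) where
  gtoSExpression = gtoSExpression . unM1

instance (GToSExpression f, GToSExpression g) => GToSExpression (f :+: g) where
  gtoSExpression (L1 l) = gtoSExpression l
  gtoSExpression (R1 r) = gtoSExpression r

-- Constructor name first, then its fields.
instance (Constructor c, GToSExpression f) => GToSExpression (M1 C c f) where
  gtoSExpression m = conName m : gtoSExpression (unM1 m)

instance (GToSExpression f, GToSExpression g) => GToSExpression (f :*: g) where
  gtoSExpression (l :*: r) = gtoSExpression l ++ gtoSExpression r

instance GToSExpression U1 where
  gtoSExpression _ = []

instance GToSExpression f => GToSExpression (M1 S s f) where
  gtoSExpression = gtoSExpression . unM1

-- Each field re-enters the top-level class, which re-selects a strategy.
instance ToSExpression k => GToSExpression (K1 R k) where
  gtoSExpression k = [toSExpression (unK1 k)]

-- Hypothetical AST types, for illustration only.
newtype Identifier = Identifier String deriving (Generic)
data Call = Call Identifier Identifier deriving (Generic)

main :: IO ()
main = putStrLn (toSExpression (Call (Identifier "f") (Identifier "x")))
```

Note that neither `Identifier` nor `Call` needs its own instance: deriving `Generic` is enough, and the only place to special-case a type is the type family.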

  toSExpressionWithStrategy :: proxy strategy -> t -> Int -> Builder

instance Show t => ToSExpressionWithStrategy 'Show t where
  toSExpressionWithStrategy _ t _ = stringUtf8 (show t)

@robrix (Contributor, Author):

ToSExpressionStrategy selects this instance for Text, which means that, unlike the Serializing.SExpression implementation, this one is able to show the contents of non-subterm fields (like the Text in a leaf node like an identifier). IMO that makes this implementation much nicer for the actual inspection of an AST, since you get to see which identifier is being used, and not just the presence of some identifier.

  gtoSExpression _ _ = []

instance GToSExpression f => GToSExpression (M1 S s f) where
  gtoSExpression = gtoSExpression . unM1 -- FIXME: show the selector name, if any

@robrix (Contributor, Author):

This is probably more of a TODO, but anyway, yes, we could show field labels pretty conveniently too, which would be nice.
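One way field labels could be read off is via the `Selector` class from GHC.Generics, whose `selName` returns the record label, or the empty string for a positional field. The following is a standalone sketch rather than a patch to the PR: the class `GFields`, the helper `fields`, the `label: value` output format, and the example types are all hypothetical, and sum types are left out for brevity.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances, TypeOperators #-}

import GHC.Generics

-- | Collects one rendered string per field of a single-constructor type.
class GFields f where
  gfields :: f p -> [String]

instance GFields f => GFields (M1 D d f) where
  gfields = gfields . unM1

instance GFields f => GFields (M1 C c f) where
  gfields = gfields . unM1

instance (GFields f, GFields g) => GFields (f :*: g) where
  gfields (l :*: r) = gfields l ++ gfields r

instance GFields U1 where
  gfields _ = []

-- | At each field, prefix the shown value with its label when it has one.
instance (Selector s, Show k) => GFields (M1 S s (K1 R k)) where
  gfields m = [label ++ show (unK1 (unM1 m))]
    where
      label = case selName m of
        ""  -> ""            -- positional field: no label to show
        sel -> sel ++ ": "

fields :: (Generic t, GFields (Rep t)) => t -> [String]
fields = gfields . from

-- Hypothetical example types.
data Parameter = Parameter { paramName :: String, paramArity :: Int }
  deriving (Generic)

data Pair = Pair Int Int deriving (Generic)

main :: IO ()
main = do
  mapM_ putStrLn (fields (Parameter "x" 2))
  mapM_ putStrLn (fields (Pair 1 2))
```

In the PR's setting the same `selName` call would sit in the `M1 S` instance flagged by the FIXME, with the field rendered through `toSExpression` instead of `show`.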

@robrix requested a review from a team on June 20, 2019, 15:11

@patrickt (Contributor) left a comment:

Awesome work! Can’t wait to put this into action!




@robrix merged commit 6c31189 into master on Jun 20, 2019
@robrix deleted the serialize-precise-ast-as-s-expressions branch on June 20, 2019, 18:08