Serialize precise AST as s-expressions #171
This reverts commit 50813fd.
import GHC.Generics

serializeSExpression :: ToSExpression t => t -> Builder
serializeSExpression t = toSExpression t 0 <> "\n"
I’ve used the same name for the class as in Serializing.SExpression because I intend that this module should eventually replace the former (and in the meantime, module namespacing is well and good).
This implementation does not support Options to show only the annotations of nodes, which we use in Serializing.SExpression for ts-parse, which I think we should drop as part of the migration to precise ASTs.
class ToSExpression t where
  toSExpression :: t -> Int -> Builder
This might actually be a good case for Uniplate or similar, but as I don’t have experience with the syb family of approaches I’ve opted instead to implement it using GHC.Generics.
TIL about Uniplate (well, yesterday, YIL?). I’m not familiar with its advantages over GHC.Generics for reducing boilerplate in this case!
Me neither 😅
Uniplate would probably make this code a lot shorter, but it would also be slower, rely on RTTI, and would necessitate some shenanigans to handle the Text case. We should start getting familiar with Uniplate just so we don’t have to write an advanced overlap every single time we need a generic function, but since this is here and working I’m a big 👍
type family ToSExpressionStrategy t :: Strategy where
  ToSExpressionStrategy Text = 'Show
  ToSExpressionStrategy _ = 'Generic
Our old friend advanced overlap (cf https://wiki.haskell.org/GHC/AdvancedOverlap, Oleg Kiselyov, Simon Peyton Jones) allows us to specialize the instances s.t.:
- Text fields are shown,
- everything else is handled via the generic GToSExpression class, and
- we don’t require instances of ToSExpressionWithStrategy (below) for every possible AST type (which would be agonizing).
This means that specializations of the behaviour for specific AST types will have to be listed in here—which makes this specific approach suitable only for cases where we specifically do not intend to specialize the behaviour for any particular language’s AST. That is, if we sometimes wanted to customize the behaviour, we should take a different (tho perhaps similar) approach where we do explicitly define specialized ToSExpressionWithStrategy instances for each datatype in each AST.
It may be possible to do something clever e.g. with -XDerivingVia or type families that would allow you to customize the behaviour piecemeal without necessarily requiring per-AST-datatype instances, but this sufficed for this particular case so I haven’t explored it further.
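For anyone unfamiliar with the pattern, here is a minimal, self-contained sketch of the advanced-overlap setup described above. It is not the code in this PR: it assumes `data Strategy = Generic | Show`, renders to `String` instead of `Builder`, uses `String` in place of `Text` so that it only needs `base`, ignores the depth argument, and the `Expr` type is purely hypothetical.

```haskell
{-# LANGUAGE DataKinds, DeriveGeneric, FlexibleContexts, FlexibleInstances,
             KindSignatures, MultiParamTypeClasses, ScopedTypeVariables,
             TypeFamilies, TypeOperators, UndecidableInstances #-}

import Data.Proxy (Proxy (..))
import GHC.Generics

-- Assumed definition; the promoted constructors are written 'Generic and 'Show.
data Strategy = Generic | Show

-- Closed type family choosing a strategy per type (String stands in for Text).
type family ToSExpressionStrategy t :: Strategy where
  ToSExpressionStrategy String = 'Show
  ToSExpressionStrategy _      = 'Generic

class ToSExpression t where
  toSExpression :: t -> Int -> String

-- The single catch-all instance: dispatch on the strategy computed for t.
instance ToSExpressionWithStrategy (ToSExpressionStrategy t) t => ToSExpression t where
  toSExpression = toSExpressionWithStrategy (Proxy :: Proxy (ToSExpressionStrategy t))

class ToSExpressionWithStrategy (strategy :: Strategy) t where
  toSExpressionWithStrategy :: proxy strategy -> t -> Int -> String

-- Leaves are rendered with show.
instance Show t => ToSExpressionWithStrategy 'Show t where
  toSExpressionWithStrategy _ t _ = show t

-- Everything else goes through its generic representation.
instance (Generic t, GToSExpression (Rep t)) => ToSExpressionWithStrategy 'Generic t where
  toSExpressionWithStrategy _ t n = unwords (gtoSExpression (from t) n)

class GToSExpression f where
  gtoSExpression :: f p -> Int -> [String]

instance GToSExpression f => GToSExpression (M1 D d f) where
  gtoSExpression = gtoSExpression . unM1

-- A constructor becomes "(Name child ...)"; depth is threaded but unused here.
instance (Constructor c, GToSExpression f) => GToSExpression (M1 C c f) where
  gtoSExpression m n =
    ["(" ++ unwords (conName m : gtoSExpression (unM1 m) (n + 1)) ++ ")"]

instance GToSExpression f => GToSExpression (M1 S s f) where
  gtoSExpression = gtoSExpression . unM1

instance (GToSExpression f, GToSExpression g) => GToSExpression (f :+: g) where
  gtoSExpression (L1 l) = gtoSExpression l
  gtoSExpression (R1 r) = gtoSExpression r

instance (GToSExpression f, GToSExpression g) => GToSExpression (f :*: g) where
  gtoSExpression (l :*: r) n = gtoSExpression l n ++ gtoSExpression r n

instance GToSExpression U1 where
  gtoSExpression _ _ = []

-- Fields recurse through the top-level class, so the strategy is re-selected.
instance ToSExpression t => GToSExpression (K1 R t) where
  gtoSExpression k n = [toSExpression (unK1 k) n]

-- Hypothetical AST type, for illustration only.
data Expr
  = Identifier String
  | Call Expr Expr
  deriving (Generic)
```

With these definitions, `toSExpression (Call (Identifier "f") (Identifier "x")) 0` evaluates to `(Call (Identifier "f") (Identifier "x"))`. Note that `Expr` needs nothing beyond `deriving (Generic)`, which is the point of the third bullet above.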
  toSExpressionWithStrategy :: proxy strategy -> t -> Int -> Builder

instance Show t => ToSExpressionWithStrategy 'Show t where
  toSExpressionWithStrategy _ t _ = stringUtf8 (show t)
ToSExpressionStrategy selects this instance for Text, which means that, unlike the Serializing.SExpression implementation, this one is able to show the contents of non-subterm fields (like the Text in a leaf node like an identifier). IMO that makes this implementation much nicer for the actual inspection of an AST, since you get to see which identifier is being used, and not just the presence of some identifier.
  gtoSExpression _ _ = []

instance GToSExpression f => GToSExpression (M1 S s f) where
  gtoSExpression = gtoSExpression . unM1 -- FIXME: show the selector name, if any
This is probably more of a TODO, but anyway, yes, we could show field labels pretty conveniently too, which would be nice.
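As a rough idea of what that could look like, here is a standalone sketch that uses `selName` from GHC.Generics to prefix each field with its label. It is separate from the classes in this PR: `GFields`, `fieldsOf`, and the `Identifier` record with its `name` and `line` fields are all hypothetical, and leaves are rendered with `show`.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances, TypeOperators #-}

import GHC.Generics

-- Hypothetical helper class: collect one rendered string per field.
class GFields f where
  gfields :: f p -> [String]

instance GFields f => GFields (M1 D d f) where
  gfields = gfields . unM1

instance GFields f => GFields (M1 C c f) where
  gfields = gfields . unM1

-- selName is "" for positional fields, so only record fields get a label.
instance (Selector s, GFields f) => GFields (M1 S s f) where
  gfields m = case selName m of
    ""    -> gfields (unM1 m)
    label -> map ((label ++ ": ") ++) (gfields (unM1 m))

instance (GFields f, GFields g) => GFields (f :*: g) where
  gfields (l :*: r) = gfields l ++ gfields r

instance GFields U1 where
  gfields _ = []

instance Show t => GFields (K1 R t) where
  gfields k = [show (unK1 k)]

fieldsOf :: (Generic t, GFields (Rep t)) => t -> [String]
fieldsOf = gfields . from

-- Hypothetical record type, for illustration only.
data Identifier = Identifier { name :: String, line :: Int }
  deriving (Generic)
```

Here `fieldsOf (Identifier "x" 3)` yields `["name: \"x\"", "line: 3"]`. The same `Selector s` constraint could presumably be added to the `M1 S s f` instance above to resolve the FIXME.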
patrickt left a comment
Awesome work! Can’t wait to put this into action!
This PR adds a generic mechanism for s-expression serialization of precise ASTs, e.g. those generated by @aymannadeem’s TH derivation of datatypes from tree-sitter grammars: tree-sitter/haskell-tree-sitter#144.
It does not actually hook this up internally anywhere and use it, however; that will happen sometime after the above PR is merged and a new version of the tree-sitter cabal package is pushed to hackage.