New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Term
and Type
agnostic to the type of variables, in prep for supporting rich function layou
#26
Conversation
Make `Term` and `Type` agnostic to the type of variables, in prep for supporting rich function layou
For that compiler I wrote, I did this separately for bindings and references. Unless I am missing something, It seems that much of the information that this is intended to keep track of would be stored redundantly with each reference. For the same reasons it is better to store information in the AST than a "side table", wouldn't it be better to store this information just once at the binding site? |
You mean storing the auxiliary information where the variable is bound? My
|
Regarding the redundancy, my thought was that it's easy to share the values
|
Yeah the ABT will help. Yeah memory sharing wouldn't help unless one had some sort of memoization scheme right? |
So, I played with this a bit. You could do the following: data Var = Var { freshId :: Int, name :: Text }
data ABT f v r
= Var Var
| Cycle r
| Abs Var v r
| Tm (f r) deriving (Functor, Foldable, Traversable)
data Term f v a = Term { freeVars :: Set v, annotation :: a, out :: ABT f v (Term f v a) } So at the point where variables are introduced (the What I don't like about this is that now every function that inspects or manipulates a term needs to keep a local stack of this auxiliary info, to maintain that mapping between I'm sure you could address this by writing a zipper type for ABT terms. The zipper keeps track of the association between variable references and their aux info. But now you have to use zipper traversal and extraction rather than just pattern matching, and it's more complicated for very little benefit that I can see. Regarding the redundancy, it's not an issue, and we don't need any fancy memoization either.
Regarding this last point, we can easily write the pure functions: -- move all symbol annotations to where they are introduced
share :: Ord v
=> Term f (Symbol v) a
-> Term f (Symbol ()) (a, Maybe v)
-- push symbol annotations down into the corresponding variable references
unshare :: Ord v
=> Term f (Symbol ()) (a, Maybe v)
-> Maybe (Term f (Symbol v) a) (We probably need some extra info about What we actually serialize is the result of And this is also all only really relevant if that auxiliary info is really large... which is not an issue yet. Anyway, that is my current thinking. |
What I was thinking was something like: data ABT f binding reference r
= Var reference
| Cycle r
| Abs binding r
| Tm (f r) deriving (Functor, Foldable, Traversable)
data Term f f2 b v a = Term { freeVars :: f2 v, annotation :: a, out :: ABT f b v (Term f v a) } For example, de Bruijn indices is So the bookeeping or lack thereof is wholly dependent on the parameters. We can still write share and unshare too -- indeed doing so may be easier than keeping an environment in some cases.
Don't get me wrong, the hashing is elegant, but avoiding communicating with the server when caching with the hash as a key is a memoization scheme of sorts. My concern is on the server---or wherever the hashes are recursively substituted with their subtrees---that it is easy to just pattern match on the same shared memory and then replace it with something different when doing a functional update---without reflecting on a sort of reference (be it hash or adress) there is nothing (selectively) enforcing that the sharing is preserved in future functional updates. |
Okay, I see what you are going for there. My feeling is that it's overkill - I don't really have a need to abstract along that dimension. (Like I don't care to abstract over whether de bruijn indices or ABTs are used). And it makes the type more complicated - you will certainly be forced to define some typeclasses to write code which is generic in
True. I guess my point is that we are already doing that anyway (for other efficiency reasons), so there's no additional complexity to maintaining that sharing there.
Ah. I don't think this will be an issue, because all code other than the term / type rendering logic is or will be parametric in the choice of variable type. Just like a function By the way, even though I am rejecting your suggestions :) I appreciate that you are making them. If you would like to work on something that would definitely be accepted, check out #25 and #24. :) I also have another project I still need to write up which involves implementing some succinct data structures. I think it will be really interesting also. |
This was a pretty tedious but straightforward change. Rather than hardcoding
Symbol
as the variable type used byABT
, it is now a parameter, bounded in most places with the typeclassVar
(inUnison.Var
). So we haveTerm v
with variables inv
,Type v
with variables inv
, etc.With
v
being the oldSymbol
type, we have the old behavior. But,v
could add mixfix operators, or the generalizedDoc
-based layout I am going to be implementing next. This also makes it easy to fiddle with different precedence schemes if we want. The only code that is aware of the specificv
will be the absolute top-level, which fixesv
to some concrete type.Something I considered but rejected was storing richer symbol layout outside of the syntax tree. This is really error prone - variables are getting moved around really often when editing, and keeping this richer information in sync is a nightmare. Better to store it in the tree directly, and just use parametricity to avoid coupling yourself to the details of that extra info.