Implementation update on lexical lookup via quasis/injectiles, etc #410
This is probably something I should publish as a blog post on strangelyconsistent. For now, there's time only to write it up here as a self-closing issue.
Who knows, maybe there's some interested reader out there for whom this is useful. The main use is for me to info-dump what I feel is a first complete, workable solution to hygiene in 007 and Perl 6. (As a consequence, I won't pull any punches. If the below sounds like complete gobbledygook, assume that's me, not you. But maybe see the examples at the end.)
I should also link to this gist because it contains the germs of the thinking that's outlined here. Think of the gist as the exploration part and this issue as its final distillation.
In order to explain the solution, let's introduce a non-established term: unique variables.
In general, each variable in the source code is declared somewhere, and is then used a number of times:
Without any further information, we don't actually know whether the variable
But there are also a few well-known cases where a variable declaration corresponds exactly to one single location — let's call such a variable (both its declaration and its usages) a unique variable. Here are some examples of variables that turn out to be unique:
In all of these cases, we can start to think about optimizations where we do the lookup at compile time, and replace the runtime lookup with just the unique location. (If we can infer that all we're ever doing is reading from that location, we can further optimize reading from the location down to just its resulting constant value.)
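To make the optimization concrete, here is a minimal Python sketch (not 007's actual implementation). The names `Location`, `Frame`, `bump_by_lookup` and `bump_by_location` are all made up for illustration. It contrasts a runtime lookup that walks the scope chain with a direct access to a location that was resolved once, ahead of time:

```python
class Location:
    """A mutable cell holding one variable's value (hypothetical model)."""
    def __init__(self, value=None):
        self.value = value


class Frame:
    """One runtime scope: maps names to locations, with a link to the outer scope."""
    def __init__(self, outer=None):
        self.slots = {}
        self.outer = outer

    def declare(self, name, value=None):
        loc = Location(value)
        self.slots[name] = loc
        return loc

    def lookup(self, name):
        # Runtime lookup: walk outward until the name is found.
        frame = self
        while frame is not None:
            if name in frame.slots:
                return frame.slots[name]
            frame = frame.outer
        raise NameError(name)


global_frame = Frame()
# A global variable has exactly one location, so it can be resolved once,
# "at compile time", and remembered.
counter_loc = global_frame.declare("counter", 0)


def bump_by_lookup(frame):
    # Unoptimized: find the variable by name on every access.
    frame.lookup("counter").value += 1


def bump_by_location():
    # Optimized: go straight to the unique location, no name lookup at all.
    counter_loc.value += 1


inner = Frame(outer=Frame(outer=global_frame))
bump_by_lookup(inner)
bump_by_location()
print(counter_loc.value)
```

Both functions end up mutating the same cell; the second one just skips the walk. That shortcut is only safe because the variable is unique. For a variable with many locations (say, a parameter of a recursive function) there would be no single `counter_loc` to remember.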
Why are macro variables unique?
The last point in the preceding list is the odd one out, so it bears pausing for a bit and motivating that one. Macros are routines just like functions, so on the face of it, they should sort into the top list, not the bottom one.
While it is true that a macro can be called multiple times, and that each such call will generate a fresh location for each of its variables, code generated in the macro will only ever see a fixed variable. Whenever the macro gets called again, it's another quasi being called; one does not simply declare the same variable in the same macro call twice. It might be related to the fact that the macro runs at compile time, so what's compile time for the injectile is actually runtime for the macro, even though that's... the same time.
I can't stress enough how this "happy coincidence" feels both necessary and sufficient in some way. In the sense that it's been really tricky to see how to make macro hygiene realistic... but the fact that quasis in macros only ever see unique variables in their surrounding macros means that, by a cosmic coincidence, all the runtime lookups that would have been "detached" (in the terminology of the gist) can instead be optimized away.
I dunno, YMMV, but to me it seems like quite a wonderful generalization. Who knew global variables and macro "closure" variables had this trait in common?
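A rough way to picture the claim about macro variables, as a Python analogy rather than real 007 semantics: `counter_macro` below is a hypothetical stand-in for a macro, and the inner function stands in for the code its quasi produces. Each call to the "macro" creates a fresh location, but the code produced by that particular call only ever sees that one location.

```python
class Location:
    """A mutable cell holding one variable's value (hypothetical model)."""
    def __init__(self, value):
        self.value = value


def counter_macro():
    """Stand-in for a macro: runs once per expansion."""
    count = Location(0)  # fresh location for this particular macro call

    def injectile():
        # Stand-in for the code a quasi produces. It is tied to `count`
        # as a fixed location, not looked up by name where it lands.
        count.value += 1
        return count.value

    return injectile


first = counter_macro()   # first expansion
second = counter_macro()  # second expansion, with its own location
results = [first(), first(), second()]
print(results)
```

From the point of view of `first`, there is exactly one `count`. The fact that `second` has a different one never becomes visible to it, which is the sense in which the variable is unique for the generated code.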
It gets cuter. Let's assume that it's possible both to uniquify a variable (to turn all its runtime lookups into global location accesses), and to de-uniquify it (to turn all its location accesses back into runtime lookups). Enough information needs to be stored in the location object itself to be able to restore it that way.
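As a sketch of what that round trip could look like, assuming a toy AST with two node types, `Lookup` (by name) and `Direct` (by location): all the names here are hypothetical. The only point is that the location remembers its variable's name, so the transformation can be undone.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Location:
    name: str             # remembered so a direct access can become a lookup again
    value: object = None


@dataclass(frozen=True)
class Lookup:
    """AST node: find the variable by name at runtime."""
    name: str


@dataclass(frozen=True)
class Direct:
    """AST node: access one fixed location, no name lookup."""
    location: Location


def uniquify(node, scope):
    """Turn a by-name lookup into a direct access (scope: name -> Location)."""
    return Direct(scope[node.name])


def deuniquify(node):
    """Turn a direct access back into a by-name lookup."""
    return Lookup(node.location.name)


def evaluate(node, scope):
    if isinstance(node, Direct):
        return node.location.value
    return scope[node.name].value


scope = {"x": Location("x", 42)}
original = Lookup("x")
frozen = uniquify(original, scope)

# The uniquified node works even where `x` is not in scope at all.
print(evaluate(frozen, {}))
# Round trip: de-uniquifying gives back the original lookup node.
print(deuniquify(frozen) == original)
```

In this toy version de-uniquifying needs only the name. A real implementation would presumably need to store more than that in the location.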
Two points in time are of interest during macro expansion:
I think a good metaphor here is the freeze-drying of foodstuffs. We freeze-dry the food in order to transport it long distances or store it for a long time. When the food arrives at its destination, we can rehydrate it, returning it to its original fresh state.
In the case of the quasi, there are some variables that we know won't be lexically available from the mainline code. The two "lookup sites" involved form an inverted Y shape:
All the variables in the left branch of the Y shape will be unavailable from the point of view of (2), because they are no longer part of the sequence of
Why? Because while uniquification is a "necessary evil" and something we need to do to get hygiene at all, non-unique variables are still preferable and more in line with the end user's intuition/expectations.
Hesitant addendum: with those last two examples, we might be able to get our cake and have it, too, getting