Statement/expression mismatch and quasi scoping #7

Closed
masak opened this issue Jan 7, 2015 · 4 comments

masak commented Jan 7, 2015

A bit of a design issue more than an implementation one.

The thing to keep in mind when reading this is that in 007, statements can contain expressions, but expressions cannot really contain statements. That's to say, of course a block term will hold statements in it (Update: block terms don't exist anymore) (Update-update: but sub terms do), but from the point of view of the expression, that block is opaque. More to the point, there's none of Perl 6's do keyword in 007. (do turns the subsequent statement into something that can be used as an expression.)

A macro call is by necessity (part of) an expression (Update: this is too limiting an assumption). But sometimes we want to insert something that isn't just an expression: a whole statement, a sequence of statements, a block, a subroutine declaration, a macro declaration, etc.

In fact, I bet the case where we insert statement-level stuff rather than expression-level stuff is the common case.

I had a solution for this. It can be summarized in this table:

       \...INTO    a whole expr     part of an expr
        \
INSERTING\-----------------------------------------
          |
an expr   |         fine            fine
          |
a stmt    |         stutter         error

Inserting an expression into an expression is fine. Even inserting an expression into part of an expression is fine. (Mainly because this is what expressions do all day.) It even works out pretty well with precedence and stuff by default. Better than C's macros, anyway.

If a statement of basically any kind ends up at the top level of an expression, then that's an error condition in the AST and we have to do something. It's in this case that we "stutter" and simply peel away the outer Q::Statement::Expr, letting the inner statement break free of it as if from a chrysalis.

(We have to do a similar peeling-away process for the first two cases, too, because the result from the macro will be a block with a statement with the expression we want to insert, and that block and that statement have to go away.)

It's only the last case, trying to insert a statement-y thing in the middle of an expression, that should be an error.
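
The table can be sketched as two small functions, one per column. This is only an illustrative Python model, not 007's implementation: `Expr`, `Stmt`, `ExprStatement` (standing in for Q::Statement::Expr), `insert_at_top` and `insert_inside` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Expr:
    """Stand-in for any expression node."""
    text: str


@dataclass
class Stmt:
    """Stand-in for any non-expression statement node."""
    text: str


@dataclass
class ExprStatement:
    """Models Q::Statement::Expr: a statement wrapping one expression."""
    expr: object


def insert_at_top(wrapper: ExprStatement, fragment):
    """The macro call was the whole expression held by `wrapper`."""
    if isinstance(fragment, Stmt):
        # "Stutter": drop the wrapper and let the statement stand alone.
        return fragment
    # An expression simply takes the place of the macro call.
    return ExprStatement(fragment)


def insert_inside(fragment):
    """The macro call was only part of a larger expression."""
    if isinstance(fragment, Stmt):
        raise TypeError("cannot insert a statement into part of an expression")
    return fragment
```

Only the statement-into-subexpression combination raises; the other three cells of the table return a usable node.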


Now all of the above was fine, and is doable. But here's the snag, and the reason I'm writing this down.

macro foo() {
    my x = 5;
    return quasi { x };
}

my x = 19;
say(foo());

In the fullness of time — and hopefully not too soon — the above 007 program should output "5", because of how scoping and macro insertion works.

But by what I said above, all that would be inserted into the say(...) expression would be the expression x, a textual identifier without any context associated with it. The result would be that "19" would be printed. (Or, worse, if I hadn't declared that outer x, a "variable undeclared" error at parse-time (hopefully) or runtime (in the worst case).)
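
To make the snag concrete, here is a minimal Python model of the lookup. The `Frame` class and the variable names are assumptions made for illustration; they are not how 007 represents scopes.

```python
class Frame:
    """A scope: some variables plus a link to the enclosing (outer) scope."""

    def __init__(self, variables, outer=None):
        self.variables = variables
        self.outer = outer

    def lookup(self, name):
        frame = self
        while frame is not None:
            if name in frame.variables:
                return frame.variables[name]
            frame = frame.outer
        raise NameError(f"variable {name!r} undeclared")


mainline = Frame({"x": 19})                     # my x = 19;
macro_frame = Frame({"x": 5}, outer=mainline)   # my x = 5; inside foo

spliced_identifier = "x"                        # all that the quasi contributes

# Naive splice: the bare identifier is resolved where it lands.
naive_result = mainline.lookup(spliced_identifier)

# What we want: the identifier remembers the frame of its quasi.
wanted_result = macro_frame.lookup(spliced_identifier)
```

Here `naive_result` is 19 and `wanted_result` is 5. With the outer `x` removed, the naive lookup raises instead, which corresponds to the "variable undeclared" case.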

Perl 6 solves this in the following way: what's inserted is always a block, which gets run as it gets evaluated. Blocks, blocks, blocks. Even in the middle of an expression. The beauty of this is that the block can be forced to have the appropriate OUTER (the macro), and the variable lookup would work out. (As has been noted elsewhere, this makes the unhygienic lookup more problematic. But let's not worry about that now.)

Now,

  • In 007, just putting a block there wouldn't do the right thing. Blocks don't self-evaluate.
  • So we insert a call to the block. Now the problem is that 007 blocks never actually return any value. (They are only used for their side effects.)
  • So we insert a call to a subroutine instead. Now the problem is that we have to define the subroutine, and since subroutines are always named, we have to gensym the name.

I think that last one actually works. But it feels very wrong. I thought we would be able to do macro insertion with proper scoping with the primitives that we have, and without resorting to generating a lot of extra stuff just to make everything fit together.
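
A rough Python sketch of the gensym workaround, to show how much machinery it drags in. Python closures play the part of the subroutine's captured scope here, and `gensym`, `expand_with_gensym` and the `__quasi_sub_` prefix are invented for this example.

```python
import itertools

_counter = itertools.count(1)


def gensym(prefix="__quasi_sub_"):
    """Return a name that cannot clash with earlier generated names."""
    return f"{prefix}{next(_counter)}"


def expand_with_gensym(call_site_env, body):
    """Define `body` under a fresh name at the call site; return that name.

    `body` is a zero-argument callable that closes over the macro's variables.
    """
    name = gensym()
    call_site_env[name] = body
    return name


def macro_foo():
    x = 5               # the macro's own x
    return lambda: x    # stands in for quasi { x }


call_site = {"x": 19}                         # my x = 19;
name = expand_with_gensym(call_site, macro_foo())
result = call_site[name]()                    # say(foo()) becomes say(<name>())
```

`result` is 5 and the call site's own `x` is untouched, but the expansion had to add a named definition to the call site just to carry one expression.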

Just some random thoughts before going to bed. I suspect we're missing a primitive somewhere, or something. In the best case, the whole thing can be solved by allowing blocks to return the value of the last statement. (I think it can.) But that opens up a smallish can of worms that I had kept closed by doing it the way we have it right now (blocks always evaluate to None):

  • If blocks return values, then we have two notions of returning. There's the block return, and the return return from subs (and macros). Slightly higher conceptual burden all around.
  • If blocks return values, should conditionals and loops also return values? I kinda hope not, and letting them do so feels like the start of an edifice that leads to a total blending-together of statements and expressions. The pinnacle of which would be to sigh and add the do keyword. On the other hand, if we said that conditionals and loops always evaluate to None, then that's an exception in the language.
  • The notion of "last statement" is complicated. Last statement to be evaluated? Last textually?
  • Suddenly we have to care about the value of statements that we didn't have to care about before, just in case they're last in a block. What's the value of a my statement? Of a constant statement? Of a sub declaration? Again this takes us down the slippery slope of mixing up statements and expressions.

masak commented Jan 8, 2015

I forgot to emphasize that the "make blocks return values" solution is one that will work, even with all the listed drawbacks.

The only alternative solution I've been able to come up with is to say "scopes shouldn't be rooted in blocks the way they are now". That is, an inserted AST fragment can have its very own scope without having its own block.

(Note that this incidentally solves the COMPILING:: problem, too.)

I don't know which approach would be best. They seem to each have their drawbacks.


masak commented Jan 9, 2015

A third solution, and the one I dislike the least up until now:

Introduce a new Q type, called Q::Expr::Block. Unlike all the other Q types, an object of this type does not map to a particular stretch of text in the source. It cannot be written in textual source, and the parser does not generate it. Instead, it is responsible for inserted code, the result of e.g. a macro call.

More properties of Q::Expr::Block:

  • It carries a Val::Block with the outer-frame being the frame that the quasi (or whatever) was in.
  • It can occur as a term in an expression.
  • When evaluated, it makes a special call to this Val::Block that returns the value of the last statement evaluated. Details pending about the mechanism, but it's not intrinsically hard.

By way of silly example, this code:

macro squee() {
    my x = 5;
    return quasi {
        say("whoa");
        x = x + 1;
        x;
    }
}

for [1, 2] {
    say("hi " ~ str(squee()));
}

Would result in the for loop being macro-injected like this:

for [1, 2] {
    say("hi " ~ str(⦃
        say("whoa");
        x = x + 1;
        x;
    ⦄));
}

And when run, this would print: whoa␤hi 6␤whoa␤hi 7.

Note well that:

  • The ⦃...⦄ braces above are not syntax. They are not a valid 007 program in source form. They are a representation of what, conceptually, the compiler would substitute the macro call for.
  • At the border of those special braces, variable lookup would go "...huh", and then follow the outer to find the right x inside the macro frame. Thus, the ⦃⦄ also represent a discontinuity in the ordinary outer chain, caused by code being forcibly spliced as part of the macro call.
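
The behaviour described above can be modelled in a few lines of Python. This is a sketch under assumptions, not the 007 implementation: `Frame`, `ExprBlock` (standing in for Q::Expr::Block), the statement functions and the `output` list are all made up for the example.

```python
class Frame:
    """A scope: some variables plus a link to the enclosing (outer) scope."""

    def __init__(self, variables, outer=None):
        self.variables = variables
        self.outer = outer

    def _owner(self, name):
        frame = self
        while frame is not None:
            if name in frame.variables:
                return frame
            frame = frame.outer
        raise NameError(f"variable {name!r} undeclared")

    def lookup(self, name):
        return self._owner(name).variables[name]

    def assign(self, name, value):
        self._owner(name).variables[name] = value


class ExprBlock:
    """Statements plus the frame the quasi was in (models Q::Expr::Block)."""

    def __init__(self, statements, outer_frame):
        self.statements = statements
        self.outer_frame = outer_frame

    def evaluate(self):
        # The outer chain jumps to the macro frame, not to the call site.
        frame = Frame({}, outer=self.outer_frame)
        value = None
        for statement in self.statements:
            value = statement(frame)
        return value  # value of the last statement evaluated


output = []


def say(text):
    output.append(str(text))


def stmt_say_whoa(frame):          # say("whoa");
    say("whoa")


def stmt_increment(frame):         # x = x + 1;
    frame.assign("x", frame.lookup("x") + 1)


def stmt_x(frame):                 # x;
    return frame.lookup("x")


mainline = Frame({})
macro_frame = Frame({"x": 5}, outer=mainline)   # my x = 5; inside squee
spliced = ExprBlock([stmt_say_whoa, stmt_increment, stmt_x], macro_frame)

for _ in [1, 2]:                                # for [1, 2] { ... }
    say("hi " + str(spliced.evaluate()))
```

After the loop, `output` holds `whoa`, `hi 6`, `whoa`, `hi 7`, matching the expected printout: both iterations share the single `x` that lives in the macro frame.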

Other advantages, touching on what's been kvetched above:

  • The mismatch between statements and expressions doesn't smart so much.
  • No stuttering mechanism is needed.
  • It is no longer an error to insert a whole statement into a subexpression.
  • Literal blocks in code still wouldn't self-evaluate (unless they're statement blocks). That's good.
  • First-class blocks still don't return a value. Also good, although less clear-cut. I might give in and make them start returning values eventually. (Edit: fixed — we outlawed them. See Get rid of blocks as values #11.)
  • As far as I can see, this solves the COMPILING:: problem, too.

All of the benefits, none of the nasty side effects!

masak pushed a commit that referenced this issue Jan 18, 2015
This necessitates the stuttering of an expression into a statement. Thanks to
the tentative conclusion in #7, I expect
this stuttering to be temporary.

Also change the first test in t/syntax/quasi.t to expect something slightly
different from (the AST returned from) a quasi. I'm not sure we'll keep that
output at all, but for now it's easier just to change the test to expect what
it actually gets.
masak pushed a commit that referenced this issue Jan 19, 2015
This necessitates the stuttering of an expression into a statement. Thanks to
the tentative conclusion in #7, I expect
this stuttering to be temporary.

Also change the first test in t/syntax/quasi.t to expect something slightly
different from (the AST returned from) a quasi. I'm not sure we'll keep that
output at all, but for now it's easier just to change the test to expect what
it actually gets.
masak pushed a commit that referenced this issue Nov 2, 2015
The former is now the strange weird thing generated as a result of
evaluating a quasi. Thanks for the naming suggestion, #7!

I first tried making Q::Expr::Block a subrole of both Q::Block and
Q::Expr, but that ended in hilarity, tragedy, and horror. First I
got a number of diamond-type conflicts with methods from Q, and
after uselessly fixing these, I got this wonderful error:

    No concretization found for Q

So for better or worse, Q::Expr::Block is currently a Q::Block but
not a Q::Expr.

masak commented Dec 4, 2015

> • The notion of "last statement" is complicated. Last statement to be evaluated? Last textually?
> • Suddenly we have to care about the value of statements that we didn't have to care about before, just in case they're last in a block. What's the value of a my statement? Of a constant statement? Of a sub declaration? Again this takes us down the slippery slope of mixing up statements and expressions.

I've been thinking about this recently. I think we want the very simplest semantics possible, which to me seems to be this:

  • Last statement textually. Evaluation order doesn't matter. If control flow is interrupted before we get to the last statement, then the value of the whole spliced block is moot anyway.
  • Anything that's a non-expression statement is defined to have the value None.
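
A small Python sketch of these two rules, assuming statements come in just two flavours. `ExprStatement`, `OtherStatement` and `value_of_spliced_block` are hypothetical names used only here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExprStatement:
    """An expression statement; `compute` yields its value."""
    compute: Callable[[], object]


@dataclass
class OtherStatement:
    """Any non-expression statement (my, constant, sub, if, for, ...)."""
    run: Callable[[], None]


def value_of_spliced_block(statements):
    """Run all statements; return the value of the textually last one."""
    result = None
    last_index = len(statements) - 1
    for index, statement in enumerate(statements):
        if isinstance(statement, ExprStatement):
            value = statement.compute()
            if index == last_index:
                result = value
        else:
            statement.run()  # non-expression statements are worth None
    return result
```

A block ending in an expression statement yields that expression's value. A block ending in, say, a `my` declaration yields None, even if an expression statement ran earlier.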


masak commented Oct 4, 2016

70d2b9c begins to address this issue. I'd say by the time we can actually return the value of the last expression statement out of a Q::Expr::StatementListAdapter, we can resolve this issue happily.

masak pushed a commit that referenced this issue Mar 4, 2017
wip
Try doing this:

    $ bin/007 --backend=i13n examples/name.007

Should get this output:

        # static frame #1
        # call A from line zero: frame 2
        # static frame #3
        # call B from line zero: frame 4
     1. macro name(expr) {
            # static frame #5
            # call C from line zero: frame 6
     2.     if expr ~~ Q::Postfix::Property {
     3.         expr = expr.property;
     4.     }
            # static frame #7
            # call D from line zero: frame 8
     5.     if expr !~~ Q::Identifier {
     6.         throw new Exception {
     7.             message: "Cannot turn a " ~ type(expr) ~ " into a name"
     8.         };
     9.     }
            # static frame #9
            # call E from line zero: frame 10
            # call F from line zero: frame 11
    10.     return quasi { expr.name };
    11. }
    12.
    13. my info = {
    14.     foo: "Bond",
    15.     bar: {
    16.         baz: "James Bond"
    17.     },
    18. };
    19.
    20. say(name(info));           # info
    21. say(name(info.foo));       # foo
    22. say(name(info.bar.baz));   # baz
masak closed this as completed in 626dfc1 Mar 23, 2017