Implement quasi unquotes #30

Open
masak opened this issue Oct 10, 2015 · 68 comments

Comments

@masak
Owner

masak commented Oct 10, 2015

Hacker News wants unquotes. We happily oblige.

Unquotes in expressions

Whenever the parser is ready to parse a term, it should also expect an unquote.

quasi { say("Mr Bond!") }
quasi { say({{{greeting_ast}}}) }

Technically, I don't see why we shouldn't expect the same for operators. But we get into the interesting issue of what syntactic category it is.

Screw it, I'm tired of theorizing. Let's just steal the colon for this.

quasi { 2 + 2 }
quasi { 2 {{{infix: my_op}}} 2 }

quasi { -14 }
quasi { {{{prefix: my_op}}}14 }

quasi { array_potter[5] }
quasi { array_potter{{{postfix: my_op}}} }

Backporting this solution to terms, you could mark up a quasi as term if you want, but it's the default so you don't have to:

quasi { say("Mr Bond!") }
quasi { say({{{term: greeting_ast}}}) }

At the time of evaluating the quasi (usually macro application time), we'll have the type of the unquoted Qtree. The runtime dies if you try to stick a square Qtree into a round unquote.

But the parser can sometimes reject things early on, too. For example, this shouldn't even parse:

quasi { sub {{{prefix: op}}}(n) { } }

(That slot doesn't hold an operator, it holds an identifier.)
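The run-time check described above ("a square Qtree into a round unquote") can be sketched in Python. This is only an illustration under assumptions: the class names `Term`, `Infix`, `Prefix`, the `CATEGORIES` table, and the functions `fits` and `fill_unquote` are hypothetical and not taken from the 007 source.

```python
from dataclasses import dataclass


class Q:
    """Base class for the (hypothetical) Qtree nodes in this sketch."""


@dataclass
class Term(Q):
    value: object


@dataclass
class Infix(Q):
    symbol: str


@dataclass
class Prefix(Q):
    symbol: str


# Syntactic category named in the unquote -> node type that slot accepts.
CATEGORIES = {"term": Term, "infix": Infix, "prefix": Prefix}


def fits(category: str, node: Q) -> bool:
    """True if `node` has the type the category's slot expects."""
    return isinstance(node, CATEGORIES[category])


def fill_unquote(category: str, node: Q) -> Q:
    """Return `node` if it fits the slot; otherwise fail, as the runtime would."""
    if not fits(category, node):
        raise TypeError(
            f"cannot splice {type(node).__name__} into a {category} unquote"
        )
    return node
```

The check runs when the quasi is evaluated, because only then is the type of the unquoted value known. The parse-time rejection mentioned above is a separate, earlier filter.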

Unquotes for identifiers

007 currently has 5 major places in the grammar where it expects an identifier:

  • Usages
    • Terms in expressions (but that's handled by the previous section)
    • is <ident> traits
  • Declarations
    • my and constant
    • sub and macro
    • parameters in parameter lists (in subs and pointy blocks)

The traits one is kind of uninteresting right now, because we have four trait types. Someone who really wanted to play around with dynamic traits could write a case expression over those four. So let's skip traits — might reconsider this if we user-expose the traits more.

The three declaration cases are the really interesting ones. Notice that each of those has a side effect: introducing whatever name you give it into the surrounding lexical scope. (Handling that correctly is likely part of the #5 thing with Qtree strengthening.)

I would be fine with the {{{identifier: id}}} unquote accepting both Q::Identifier nodes and Str values. Q::Identifier is basically the only node type where I think this automatic coercion would make sense.
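A minimal sketch of that coercion, in Python. The `Identifier` class and `coerce_identifier` function are hypothetical stand-ins for `Q::Identifier` and whatever the runtime would actually do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Stand-in for a Q::Identifier node (hypothetical shape)."""
    name: str


def coerce_identifier(value):
    """Accept an Identifier node or a plain string; reject everything else."""
    if isinstance(value, Identifier):
        return value
    if isinstance(value, str):
        return Identifier(value)
    raise TypeError(f"expected Identifier or str, got {type(value).__name__}")
```

Restricting the coercion to this one node type keeps every other unquote strict: a string never silently becomes, say, a literal node.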

Unquotes in other places

These are the remaining things I can think of where an unquote would make sense:

  • Sequence of statements (Q::Statements)
  • Parameter list (Q::Parameters)
  • Argument list
  • Standalone block (Q::Block)
  • Trait (Q::Trait) — note: we should probably have a Q::Traits container, just like we do for statements and parameters
  • Individual statement (Q::Statement) — not sure what this one'll give us over sequence of statements, but it's possible it might be useful
  • Lambda — interesting; this one is in the grammar but not reified among the Q nodes
  • "Expr block" — ditto
  • Unquote (Q::Unquote) — mind explosion — note that this is only allowed inside a Q::Quasi
@masak
Owner Author

masak commented Oct 11, 2015

After some deliberation on #6macros, I've decided to make the syntax {{{my_op: Q::Infix}}} instead. And generally I think we ought to try conflating grammatical categories and Q types.

Conceptually, the typing is a run-time check. With #33 we can probably do better than that, and detect impossible things statically. The important thing is that the parser is no longer confused.

masak pushed a commit that referenced this issue Oct 11, 2015
A quasi can be full of holes -- so-called "unquotes". An unquote
contains an expression that needs to evaluate to a Qtree, which
then gets spliced directly into the code in the quasi. It's
quite similar to qq string interpolation, but with code instead
of a string. The unquote is a way of "escaping" out of the literal
code and into the land of Qtrees.

As an example, these all mean the same thing:

    quasi { say("Mr Bond!") }
    quasi { say({{{Q::Literal::Str("Mr Bond!")}}}) }
    my s = Q::Literal::Str("Mr Bond!"); quasi { say({{{s}}}) }
    my t = quasi { "Mr Bond!" }; quasi { say({{{t}}}) }

(The last one doesn't work yet, I think. It should, though.)

The quasis are replaced by actual Qtrees at *evaluation*, when the
runtime "hits" the quasi by evaluating an expression in which it
is a part. At that point, we run down the Qtree structure of the
quasi block, cloning everything we find (except trivial leaf nodes,
which are safe to re-use), and replacing every unquote we find with
its expression (hopefully a Qtree). That's what the new .interpolate
method is for.

Addresses #30. Still a lot to do.
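The interpolation walk described in the commit message can be sketched roughly as follows, in Python. The node classes and the use of a callable for the unquote's expression are assumptions made for illustration; the real `.interpolate` method lives on 007's Q types and may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A generic Qtree node: a kind plus child nodes (hypothetical shape)."""
    kind: str
    children: list = field(default_factory=list)


@dataclass(frozen=True)
class Leaf:
    """An immutable leaf such as a literal; safe to share between trees."""
    value: object


@dataclass
class Unquote:
    """A hole; `expr` is a callable standing in for the unquote's expression."""
    expr: object


def interpolate(tree):
    """Return a copy of `tree` with every Unquote replaced by its value."""
    if isinstance(tree, Unquote):
        result = tree.expr()
        if not isinstance(result, (Node, Leaf)):
            raise TypeError("unquote did not evaluate to a Qtree")
        return result
    if isinstance(tree, Leaf):
        return tree  # trivial leaf: re-used, not cloned
    return Node(tree.kind, [interpolate(child) for child in tree.children])
```

Because the quasi's own tree is never mutated, evaluating the same quasi twice yields two independent results.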
vendethiel pushed a commit to vendethiel/007 that referenced this issue Oct 20, 2015
@masak
Owner Author

masak commented Oct 23, 2015

I've thought some more about it and I think that the syntax should be {{{my_op @ Q::Infix}}} instead of with the colon. Yes, Ven++ was an influence here.

But what finally brought me around is that we'll likely end up with a syntax like quasi @ Q::Trait { ... }, and there it feels like @ matches better. (And the two should definitely be the same symbol, and we could read it as "quasi as trait".)

We still kind of get the strange consistency with the colon even when it isn't a colon. And it's kind of nifty that it's actually another symbol, too. People could go "ah, it's like types, but for parsing".

@masak masak mentioned this issue Nov 6, 2015
2 tasks
@masak
Owner Author

masak commented Dec 28, 2015

I started keeping a "quasi checklist", a list of the individual parsing modes a quasi term should be able to handle. Might as well keep it here in the issue, rather than offline.

  • Q::ArgumentList
  • Q::Block
  • Q::CompUnit
  • Q::Expr
  • Q::Identifier
  • Q::Infix
  • Q::Literal
  • Q::Literal::Int
  • Q::Literal::None
  • Q::Literal::Str
  • Q::Parameter
  • Q::ParameterList
  • Q::Postfix
  • Q::Prefix
  • Q::Property
  • Q::PropertyList
  • Q::Statement
  • Q::StatementList
  • Q::Term
  • Q::Term::Array
  • Q::Term::Object
  • Q::Term::Quasi
  • Q::Trait
  • Q::TraitList
  • Q::Unquote

Note that this is a checklist for quasis, not for unquotes. Will need a similar checklist for unquotes, which I deferred until after this for reasons I don't remember now.

@masak
Owner Author

masak commented Jan 6, 2016

Alright, now that that's done, let's keep a checklist for the unquote forms, too. Seemed to work pretty well.

  • Q::ArgumentList
  • Q::Block
  • Q::CompUnit
  • Q::Expr
  • Q::Identifier
  • Q::Infix
  • Q::Literal
  • Q::Literal::Int
  • Q::Literal::None
  • Q::Literal::Str
  • Q::Parameter
  • Q::ParameterList
  • Q::Postfix
  • Q::Prefix
  • Q::Property
  • Q::PropertyList
  • Q::Statement
  • Q::StatementList
  • Q::Term
  • Q::Term::Array
  • Q::Term::Object
  • Q::Term::Quasi
  • Q::Trait
  • Q::TraitList
  • Q::Unquote

@masak
Owner Author

masak commented Jan 6, 2016

I would be fine with the {{{identifier: id}}} unquote accepting both Q::Identifier nodes and Str values. Q::Identifier is basically the only node type where I think this automatic coercion would make sense.

(Syntax is now {{{Q.Identifier @ id}}}, replacing the earlier {{{id @ Q::Identifier}}}.)

Let's be conservative and wait with this until we see a clear use case for it. I think if we decide to go down this road, the really useful part would be the ability to auto-concatenate identifiers and strings into bigger identifiers, like so:

gensym_{{{q @ Q::Identifier}}}

That could be really nice — but let's scope that to be outside of this issue.
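As a rough picture of what such auto-concatenation might do, here is a Python sketch. `Identifier` and `splice_name` are hypothetical names, and the rule that only strings and identifier nodes may take part is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Stand-in for a Q::Identifier node (hypothetical shape)."""
    name: str


def splice_name(*parts) -> Identifier:
    """Join literal text and Identifier nodes into one bigger Identifier."""
    pieces = []
    for part in parts:
        if isinstance(part, Identifier):
            pieces.append(part.name)
        elif isinstance(part, str):
            pieces.append(part)
        else:
            raise TypeError(
                f"cannot build an identifier from {type(part).__name__}"
            )
    return Identifier("".join(pieces))
```

Under these assumptions, `splice_name("gensym_", Identifier("q"))` plays the role of `gensym_{{{q @ Q::Identifier}}}`.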

@masak
Owner Author

masak commented Apr 26, 2016

I'll just note that this issue is a little bit stalled because it's hit some conceptual difficulties.

When you're about to parse an expression, you might hit a prefix op or a term. Therefore, any of these is fine as an expression:

1 + 2
-42
{{{ast @ Q::Term}}} + 2
{{{ast @ Q::Literal::Int}}} + 2
{{{ast @ Q::Prefix}}} 42

The problem comes because we don't know which "parser mode" we should have been in until after the @ when we see the Q type. We want to preserve one-pass parsing, so that we don't re-trigger effects from parsing the expression before the @. (Let's say it wasn't ast but a sub or a macro with declarations and stuff inside.)

Perhaps it would be possible to "fixup" the parser state once we see the Q type. The parser assumed it was expecting a term (say), but now that we see Q::Prefix, we tweak it to instead expect a prefix. I have no clue how that'd be made to work.

Or, maybe the {{{expr @ qtype}}} syntax is inherently disadvantageous and should be changed. Something like qtype @ {{{expr}}} or @ qtype {{{expr}}} would work fine.

@masak
Owner Author

masak commented Jun 13, 2016

Nothing has happened on this front. I'm defaulting to suggesting qtype @ unquote(expr) as a way forward.

@vendethiel
Collaborator

Not sure how that allows one-pass parsing, though ?

@vendethiel
Collaborator

vendethiel commented Jun 13, 2016

But I think that, inherently, having splicing in a syntaxful language means you can't keep one-pass parsing. Or you're going to have to patch holes by yourself.

One way that could look like would be...

token if { 'if' <cond> '{' <block> '}' }

token cond:splice { '{{{ Q::Cond @'  <var> '}}}' }
token cond:expr { <expr> }

token block:splice { '{{{ Q::Block @' <var> '}}}' }
token block:simple { '{' <expr> + %% ';' '}' }

That looks awful – because it is.

As we've already pointed out numerous times, this is not an issue at all in Lisp(s), because we're already done parsing when we do splicing.

(foo ,@(mapcar #'process xs))
;; literally parsed as (at least in CL)
(foo (unquote-splicing (mapcar #'process xs)))

So, it's "free" in the sense that it's not in any way special syntax. I think we've established that quite some time ago, but I think it's still worth pointing out.
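To make the "not special syntax" point concrete, here is a toy reader in Python. It is a sketch, not any real Lisp's reader: it follows the Scheme convention of expanding the `,@` prefix into an ordinary `unquote-splicing` list, and the names `tokenize` and `read` are local to this example.

```python
import re

# One token: the two-character prefix ,@ first, then single-character
# punctuation, then a run of symbol characters.
TOKEN = re.compile(r"\s*(,@|[()'`,]|[^\s()'`,]+)")

PREFIXES = {
    "'": "quote",
    "`": "quasiquote",
    ",": "unquote",
    ",@": "unquote-splicing",
}


def tokenize(text):
    return TOKEN.findall(text)


def read(tokens):
    """Read one form from the front of `tokens` (a list, consumed in place)."""
    token = tokens.pop(0)
    if token in PREFIXES:
        # A prefix is sugar for a two-element list; nothing else changes.
        return [PREFIXES[token], read(tokens)]
    if token == "(":
        form = []
        while tokens[0] != ")":
            form.append(read(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return form
    return token
```

After reading, the unquote is just another list in the tree, so there is no parser state to repair when the splice happens later.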


Anyway, I'm not sure how a solution featuring "one-pass parsing" would make sense, let alone look like. Maybe we could get away with only a lot of backtracking? Even then, I'm not sure.

I'm not arguing we should pre-process the code, and put in the AST text in place of the {{{ }}} just to parse it. That'd defeat the purpose of hygienic macros (mostly).
But I think we should act "as if" we just parsed a block, and in the action, just use the given AST fragment (<var> in my example).


I've thought about figuring out a syntax lisp-like, in that you could also use it for day-to-day use, like Lisp can use quote/quasiquote/unquote for arrays...
But as I've just said, we are not homoiconic – Elixir is not either, and they have macros, but their syntax is far less flexible ... Though it's worth it to know what they're doing, AFAIK they only allow to splice in expressions – you can't write 3 unquote(plus) 4, which makes it considerably easier for them.

@masak
Owner Author

masak commented Jun 13, 2016

Not sure how that allows one-pass parsing, though ?

Well, hm, we'd know with a one-token lookahead... but that doesn't sound all that alluring now that I say it out loud.

Notably, the expression parser would still need to go "oh, that's a Q::Mumble::Handwave, that's a TTIAR. but is there a @ after it, then it's possibly OK!"

@masak
Owner Author

masak commented Jun 13, 2016

Ok, new suggestion that really front-loads the unquote signal: unquote(qtype @ expr)

@vendethiel
Collaborator

vendethiel commented Jun 13, 2016

that doesn't really simplify anything over {{{, does it? (...except it doesn't look as ugly)

@masak
Owner Author

masak commented Jun 13, 2016

But I think that, inherently, having splicing in a syntaxful language means you can't keep one-pass parsing. Or you're going to have to patch holes by yourself.

That's an interesting assertion that I don't immediately agree with. 😄

One way that could look like would be...

token if { 'if' <cond> '{' <block> '}' }

token cond:splice { '{{{ Q::Cond @'  <var> '}}}' }
token cond:expr { <expr> }

token block:splice { '{{{ Q::Block @' <var> '}}}' }
token block:simple { '{' <expr> + %% ';' '}' }

Yes, that's the worst-case solution to what we need to do with the grammar.

As we've already pointed out numerous times, this is not an issue at all in Lisp(s), because we're already done parsing when we do splicing.

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

So, it's "free" in the sense that it's not in any way special syntax. I think we've established that quite some time ago, but I think it's still worth pointing out.

Given that what we're implementing is a Perl and not a Lisp, are there any obvious benefits that carry over cleanly that I seem to be stubbornly ignoring? That question is often on my mind. 😄

@masak
Owner Author

masak commented Jun 13, 2016

that doesn't really simplify anything over {{{, does it? (...except it doesn't look as ugly)

Oh, but it does, because unquote(qtype @ expr) introduces its three components in this order:

  • unquote (so we know we're out of the normal parse)
  • qtype (optional) (so we can quickly fail parses where we know the qtype is incompatible — example: having a qtype of Q::Expr where we're expecting a Q::Statement is fine, because of expression statements; vice versa is not OK because 007 doesn't treat statements as expressions)
  • expr

Compare this to {{{expr @ qtype}}} which:

  • front-loads the {{{ signal (good)
  • reads the whole expression, an unbounded amount of tokens
  • finds @ qtype and may need to reject the whole parse because of the above mismatch (bad)
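The difference between the two orders can be sketched with a toy token-list parser in Python. Everything here is an assumption for illustration: the flat token lists, the `ACCEPTS` table (which encodes only the Q::Expr/Q::Statement rule from the list above), and the function names. Real expressions would of course need a real expression parser.

```python
# Declared qtypes that may fill a slot expecting a given qtype. Q::Expr may
# stand in for Q::Statement (expression statements), but not the reverse.
ACCEPTS = {
    "Q::Statement": {"Q::Statement", "Q::Expr"},
    "Q::Expr": {"Q::Expr"},
}


class ParseError(Exception):
    def __init__(self, message, consumed):
        super().__init__(message)
        self.consumed = consumed  # tokens read before the rejection


def check(qtype, expected, consumed):
    if qtype not in ACCEPTS[expected]:
        raise ParseError(f"{qtype} cannot fill a {expected} slot", consumed)


def parse_front_loaded(tokens, expected):
    """Parse: unquote ( qtype @ expr... )  with the qtype optional."""
    if tokens[:2] != ["unquote", "("]:
        raise ParseError("not an unquote", 0)
    pos = 2
    qtype = expected
    if pos + 1 < len(tokens) and tokens[pos + 1] == "@":
        qtype = tokens[pos]
        pos += 2
    check(qtype, expected, consumed=pos)  # reject before reading the expression
    end = tokens.index(")", pos)
    return qtype, tokens[pos:end]


def parse_trailing(tokens, expected):
    """Parse: {{{ expr... @ qtype }}}  where the qtype arrives last."""
    if tokens[:1] != ["{{{"]:
        raise ParseError("not an unquote", 0)
    at = tokens.index("@")  # has to scan past the whole expression first
    qtype = tokens[at + 1]
    check(qtype, expected, consumed=at + 2)
    return qtype, tokens[1:at]


def consumed_before_rejection(parser, tokens, expected):
    """Return how many tokens `parser` read before rejecting, or None."""
    try:
        parser(tokens, expected)
    except ParseError as error:
        return error.consumed
    return None
```

With the front-loaded form, a mismatch is caught after a fixed number of tokens. With the trailing form, the cost of a rejection grows with the length of the expression.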

@vendethiel
Collaborator

vendethiel commented Jun 13, 2016

Sorry, I meant "versus the possible {{{qtype @ expr}}} form".

@masak
Owner Author

masak commented Jun 13, 2016

Oh yeah, that one's equivalent as far as the above argument is concerned.

Any reason I should prefer {{{qtype @ expr}}} to unquote(qtype @ expr) and not use this opportunity to ditch the universally hated {{{ }}} syntax? 😄

@vendethiel
Collaborator

Absolutely none! Except maybe that it doesn't stand out as much – which is fine for Lisp, since it has no syntax, but might not be for Perl 6/007.

I thought {{{qtype @ expr}}} was one of the variations in your previous comment, but apparently it wasn't, so I stand corrected.

@masak
Owner Author

masak commented Jun 13, 2016

Here's one reason I've fallen out of love with {{{ ... }}}: S06 talks a lot about custom delimiters for the quasi block, so you could have quasi < ... >, quasi ⟦ ... ⟧, etc. Nicely enough, you then use the delimiters (tripled) for the unquotes: <<< ... >>>, ⟦⟦⟦ ... ⟧⟧⟧. Cool.

But then the final ingredient which would make all that machinery make sense is missing: you can't nest those things. Specifically, you can't unquote "several levels up" in analogy with how you'd do flow control with loop labels:

quasi <
    quasi ⟦
        ⟦⟦⟦ ... ⟧⟧⟧;    # I'd expect this to escape out of the inner quasi
        <<< ... >>>;    # I'd expect this to escape out of the outer quasi
    ⟧;
>;

TimToady assures me that this doesn't work, shouldn't work, and will never work. The reason, as far as I understand it, has to do with proper nesting, which is something we value in the parser more than the above conjectural feature. The way the parser works is that it never violates proper nesting, and so you can't unquote < ... > if there's a ⟦ ... ⟧ layer in-between.

Given all that, I just don't see why anyone would choose to use a non-standard delimiter, except possibly to make their code look funny. And — given that I considered the above conjectural feature to be the chief argument for the {{{ ... }}} syntax — I don't see why we need to stick to the {{{ ... }}} syntax.

@masak
Owner Author

masak commented Jun 13, 2016

Absolutely none! Except maybe that it doesn't stand out as much – which is fine for Lisp, since it has no syntax, but might not be for Perl 6/007.

Yes, that's perhaps one of the biggest unknowns with the unquote(qtype @ expr) proposal. Given masak's Deplorable Truth About Templating Syntax, I guess the empirical experiment here is "what happens if we make our placeholder syntax not be ugly and stand out so much?".

Let's find out! 😄

@vendethiel
Collaborator

vendethiel commented Jun 13, 2016

You needn't convince me about nesting – I argued about it quite a lot already! So, yeah, I don't remember believing in that S06 bit... but anyway, whatever you choose is fine.

We could try translating these:

From Graham's "On Lisp" (another one is in the same SO question):

(defmacro =defun (name parms &body body)
  (let ((f (intern (concatenate 'string
                                "=" (symbol-name name)))))
    `(progn
       (defmacro ,name ,parms
         `(,',f *cont* ,,@parms))
       (defun ,f (*cont* ,@parms) ,@body))))

@vendethiel
Collaborator

vendethiel commented Jun 13, 2016

macro cont-defun(Q::Identifier $name, Q::ParamList $parms, Q::Block $body) {
  my $new-name = $name.scope.declare("cont-$name.bare()");
  quasi {
    macro unquote(Q::Identifier @ $name)(unquote(Q::ParamList @ $parms)) { # should the ParamList unquote be bare, without parentheses?
      quasi {  # an unquote inside of another one
        my $f = unquote(Q::Identifier @ unquote(Q::Identifier @ $new-name)); # ... just to make this more readable
        $f(unquote(Q::ParamList @ unquote(Q::ParamList @ $parms))); # same q. for ParamList
      };
    }; # end of inner macro
    sub unquote(Q::Identifier @ $new-name)(unquote(Q::ParamList @ $parms))
      unquote-unhygienic(Q::Block @ $body); # mmh... needs that "unhygienic" so that `$parms` are visible.
  };
}

Whew.

@vendethiel
Collaborator

Seems like the github live update earlier had me miss one comment...

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

That's not exactly what I mean – but I was unclear, as usual. I meant that there's no "parser state" to be recovered after an unquote. You don't have to "insert" anything, to parse in a different way – it's not even really splicing at parse-time, it's a function call in the source code.

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

Not quite, exactly because of that same "parser state". In Lisp, it's all function calls. Doesn't matter if you're splicing an operator or whatever else.

Given that what we're implementing is a Perl and not a Lisp, are there any obvious benefits that carry over cleanly that I seem to be stubbornly ignoring? That question is often on my mind. 😄

I'm not 100% sure what you're referring to here?

@masak
Owner Author

masak commented Jun 13, 2016

I'm not 100% sure what you're referring to here?

Just making sure it's not the case that the Lisp advantages you like to highlight are blindingly obviously mappable onto the problems we're wrestling with in 007 (and Perl 6).

The very-early stages of Perl 6 macros consisted of "yuh, of course we're gonna have Lisp macros (and not dirty C macros, at least not just)". Later: "oh, huh, Perl is not Lisp, Perl is Perl". (And all the other languages that are not Lisp are also not Lisp, and all have slightly different readings of what "macro" means.)

With all that said, I really, really want to learn from Lisp and soak up all the experience with macros that's obviously there. That's why I keep trying to read Lisp code, and to grok what the macros in there are doing.

@raiph

raiph commented Jun 26, 2016

Completely rewritten

Hopefully you'll forgive some bikeshed coloring suggestions.

On less confusing and prettier quasi syntax

Claim: the keyword-and-block quasi { ... } syntax is confusing.

  • The thing that looks like a block isn't actually a block. It's a quote that happens to use the same delimiters as blocks. This is very misleading, imo.
  • The word "quasi" has a popular meaning that only quasi relates to quasi-quoting. The popular meaning is too general to be helpful.
  • The technical term quasi quoting means "a linguistic device in formal languages that facilitates rigorous and terse formulation of general rules about linguistic expressions while properly observing the use–mention distinction". That's on target but still too general to be helpful, imo, for someone who just wants to know what the construct they're looking at does, in layman's terms. Imo it's entirely appropriate to call it a quasi-quoting feature but not to use the word 'quasi' in the language. After exploring a bit, I've concluded I don't know of a word that would be better.

So...

Wikipedia:

Quasi-quotation is sometimes denoted using the ... double square brackets, ⟦ ⟧, ("Oxford brackets") instead of ordinary quotation marks.

So, strawman proposal, replace the quasi {} construct^1 with a circumfix operator such that:

⟦⟦ 42 + 99 ⟧⟧

quasi quotes the enclosed code.^1

(Or more brackets, with matching open/close count eg ⟦⟦⟦ say pi; say now ⟧⟧⟧^2 etc.)

For the Texas equivalent, use at least three pipes:

||| 42 + 99 |||
# or more pipes if desired: 
||||| say pi; say now |||||

On less confusing and prettier unquote syntax

Claim: the {{{ ... }}} syntax is confusing.

  • I'm not arguing it's ugly. I'm not arguing it isn't.
  • It's confusing. Because braces are associated with blocks.

So...

Wikipedia:

Quasi-quotation is sometimes denoted using the symbols ⌜ and ⌝ (unicode U+231C, U+231D) ... instead of ordinary quotation marks.

So, strawman proposal, use bracketing that includes a reversal of the ⌜ and ⌝ symbols:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

For the Texas equivalent use:

say $onething, ||^ $ast-arg ^||, $anotherthing;

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

@masak
Copy link
Owner Author

masak commented Jun 30, 2016

Hopefully you'll forgive some bikeshed coloring suggestions.

Consider yourself forgiven.

And please forgive me in turn for answering somewhat selectively, where I feel I have something to say. Feel free to take my silence on the rest as implicit agreement. Maybe.

  • The thing that looks like a block isn't actually a block. It's a quote that happens to use the same delimiters as blocks. This is very misleading, imo.

I'll 50% agree on this one. It's a block more often than people actually mean block. That is, in

quasi { x + 5 }

someone clearly meant to produce the expression x + 5, not the block { x + 5 }. The braces here are only a kind of delimiter. Our oldest open 007 issue is a little about that, actually.

On the flip side, though, it is a block in the sense that it holds in variable declarations.

quasi {
    my y = "oh, James!";
    say(y);
}
# no `y` variable here, nor outside where the macro/quasi gets expanded

This is part of "hygiene". I expect people are going to like that and be pleasantly un-surprised by that, because that's how braces/blocks behave.
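The "holds in variable declarations" behaviour can be pictured with a small scope-chain sketch in Python. The `Scope` class and `run_quasi_block` are hypothetical, and modelling statements as callables is a simplification; 007's actual hygiene machinery is more involved than this.

```python
class Scope:
    """A lexical scope with an optional parent (hypothetical model)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, name, value):
        self.names[name] = value

    def declared_here(self, name):
        return name in self.names

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(name)


def run_quasi_block(defining_scope, statements):
    """Run `statements` in a fresh scope nested inside the quasi's own scope."""
    inner = Scope(parent=defining_scope)
    for statement in statements:
        statement(inner)
    return inner
```

Declarations made by the statements land in `inner`, so neither the scope where the quasi was written nor the scope where the macro is expanded ever sees `y`.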

  • The word "quasi" has a popular meaning that only quasi relates to quasi-quoting. The popular meaning is too general to be helpful.

Ok, but is it worse than using enum for enumerations or subset for subtypes or given for switch statements? I don't think so.

In some ways, it's too good a term to pass up, even when the match isn't squeaky-Lisp perfect.

  • The technical term quasi quoting means "a linguistic device in formal languages that facilitates rigorous and terse formulation of general rules about linguistic expressions while properly observing the use–mention distinction".

Would it help if I reformulated the above as "a neat way to build code, some of which comes in as parameters"? Because that's what it says, 'cept with more stilted language.

Quasi-quotation is sometimes denoted using the ... double square brackets, ⟦ ⟧, ("Oxford brackets") instead of ordinary quotation marks.

Yes, in logic. That whole article you're quoting refers to Quine's notion of quasiquotation in formal languages, that is, in mathematical logic. The Lisp-y backquote and the comma would have much bigger claims to quasiquotation syntax than ⟦ ⟧. Those are quite decent candidates in Lisp because they're "free for use" in Lisp. In Perl 6, the backquote has been reserved for DSLs, and the comma is used all over the place.

So, strawman proposal, replace the quasi {} construct^1 with a circumfix operator such that:

⟦⟦ 42 + 99 ⟧⟧

quasi quotes the enclosed code.

Not saying I endorse it, but what in your syntax would be the corresponding way to express quasi @ Q::ParameterList { x, y, z } ?

(Or more brackets, with matching open/close count eg ⟦⟦⟦ say pi; say now ⟧⟧⟧^2 etc.)

Unmatched footnote.

What's your use case for having more brackets? Do you conceive of something like ⟦⟦⟦ ... ⟦⟦ ... ⟧⟧ ... ⟧⟧⟧ ? When would that be needed, in your opinion? (See this comment in the same thread for some discussion about nested quasis and unquotes.)

For the Texas equivalent, use at least three pipes:

||| 42 + 99 |||
# or more pipes if desired: 
||||| say pi; say now |||||

Still not assuming I endorse the rest, this is not going to fly because the Texas equivalent of a matching pair of braces can not be a non-matching symbol. (Not by some hardcoded rule specified somewhere, granted, just... not.)

Claim: the {{{ ... }}} syntax is confusing.

  • I'm not arguing it's ugly. I'm not arguing it isn't.

Ok, then let me help you with that one: it's ugly. 😄

Question is more like, is that a (necessary) feature, or something we should fix?

  • It's confusing. Because braces are associated with blocks.

Again, blocks convey scoping and hygiene, something that isn't all that far-fetched with unquotes either.

Quasi-quotation is sometimes denoted using the symbols ⌜ and ⌝ (unicode U+231C, U+231D) ... instead of ordinary quotation marks.

You're proposing using a quasi-quotation symbol (from mathematical logic) to stand in for unquotes? You don't consider that a bit confusing?

So, strawman proposal, use bracketing that includes a reversal of the ⌜ and ⌝ symbols:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

I think that falls under "you think it's cute today". More specifically/constructively: we could have played around with outwards-facing brackets for ranges, like some mathematical textbooks do:

[1, 2]    # closed
]1, 2[    # open
[1, 2[    # half-closed, half-open
]1, 2]    # half-open, half-closed

But we didn't — nor a competing form with ranges like [1, 2) — because it completely messes up parsing. I think a ⟦⌝ ... ⌜⟧ construct should be avoided for much the same reasons.

Also, what in your syntax would be the corresponding way to express {{{ ast_arg @ Q::ParameterList }}} or — using the newer as-yet unimplemented syntax — unquote(Q::ParameterList @ ast_arg)?

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

One good thing about this footnote: I finally understood why the heck you proposed toast as a keyword back in your advent post comment. I guess what I'm saying is I personally didn't find "toast" as a term very transparent. 😄

@raiph

raiph commented Jul 15, 2016

[quasi is hygienic because it has closure semantics]
[this motivates use of { ... } syntax]

I consider these good arguments for using { and } as the delimiters of a quasi.

However, I was thinking that you intend to also support quasis that:

  • represent a block
  • allow unhygiene

Afaict supporting these, even if they're a much less common use case, would be good arguments for not using { and } as the delimiters of a quasi.

[is "quasi"] worse than using enum for enumerations or subset for subtypes or given for switch statements?

Yes, very much so according to google, wikipedia, and my English language sensibilities.^1

Would it help if I reformulated the above as "a neat way to build code, some of which comes in as parameters"?

The reformulation isn't helpful to me because I understand quasi quotation.

But yes, I think that reformulation is much closer to what mere mortals will get.

It also happens to coincide with the code suggestion that I cover at the end of this comment.

The Lisp-y backquote and the comma would have much bigger claims to quasiquotation syntax than ⟦ ⟧.

Sure. But:

In Perl 6, the backquote has been reserved for DSLs, and the comma is used all over the place.

So that nixes backquote and comma.

what in your syntax would be the corresponding way to express quasi @ Q::ParameterList { x, y, z } ?

I don't know. I'll respond to that in another comment.

What's your use case for having more brackets?

It's pretty weak.^2 Let's move on.

||| 42 + 99 |||

is not going to fly ... (Not by some hardcoded rule specified somewhere, granted, just... not.)

Fair enough. That was one of a couple of especially wild suggestions. :)

You're proposing using a quasi-quotation symbol (from mathematical logic) to stand in for unquotes?

Well, as I said, I'm strawman proposing. This is part of exploring what I see as problems with planned Perl 6 syntax. It didn't feel right to simply complain without strawman proposing something else.

But, yes, I see the mathematical usage to be so near identical in concept to quasi-quoting in programming languages that use of the same syntax seemed attractive and I see unquotes as basically the inverse of quasi-quoting.

That said:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

I think that falls under "you think it's cute today". More specifically/constructively: we could have played around with outwards-facing brackets for ranges, like some mathematical textbooks do ...
But we didn't ... I think a ⟦⌝ ... ⌜⟧ construct should be avoided for much the same reasons.

Fair enough. This was the other especially wild suggestion. :)

Also, what in your syntax would be the corresponding way to express {{{ ast_arg @ Q::ParameterList }}} or — using the newer as-yet unimplemented syntax — unquote(Q::ParameterList @ ast_arg)?

I'll defer a response to that for a later comment.

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

I guess what I'm saying is I personally didn't find "toast" as a term very transparent. 😄

Indeed. That was part of my point.

I still think I'd find it easier to explain my mostly tongue-in-cheek suggestion "toast" to a mere mortal than "quasi". (To a small degree it was not entirely tongue-in-cheek; I thought it could perhaps be spelled to-AST or somesuch.)

Anyhow, I've since come up with another strawman proposal:

code [[ x + 5 ]]

That is, the keyword code followed by any amount of whitespace, followed by [[ without whitespace, code, and then matching ]].

Of course, the [[ could be misunderstood by a human as the opening of a literal array in an array, in a manner similar to { being misunderstood by a human as being the opening of a block or hash. But I like that it allows for ⟦ ⟧ as a Unicode equivalent. (I was surprised you complained that ⟦ ⟧ is for math quasi quotes, not programming quasi quotes, and have not yet let go of the possibility that you will have a change of heart so that you see this correspondence as a positive.)

And for unquotes I strawman propose:

code [[
    x + {{ y }}
]]

If a user actually means to write {{ ... inside a code [[ ... ]] to mean the start of a hash in a hash or a hash in a block or whatever they must disambiguate by adding whitespace between the braces.

If you really don't like the above, please consider waiting a week or two before replying. :)


^1

Data: For me, a Google search for "enumeration" yields 100+ million matches. "commonly used in mathematics and computer science to refer to a [ordered] listing of all of the elements of a set.". Several mainstream languages use enum as their enumeration declarator keyword. (I recall you referring to confusion if we use "enum" to refer to both the overall declaration and individual elements, and I concur. But that's a different issue.)

Conclusion? "enumeration" is a common English word. It has exactly the right meaning for a programming language enumeration. enum is a common choice as a programming language enumeration declarator. Using the same keyword in Perl 6 makes sense.

Data: For me, a Google search for "subset" yields 40+ million matches. "a part of a larger group of related things. ... a set of which all the elements are contained in another set." In Perl 6 a subset is a particular kind of subtype, one that is guaranteed to only be a subtype by virtue of being a subset of the allowed values (so a subset never changes any operations).

Conclusion? "subset" is a common English word. It has exactly the right meaning for a subtype that's a subset. Using the keyword subset makes sense.

(Likewise "given" is also a common English word whose meaning is well suited to the use made of it in Perls.)

Data: For me, a Google search for "quasi" yields 200+ million matches. "seemingly; apparently but not really. ... being partly or almost.". So, a popular word, used more than enumeration and subset combined, but with a meaning that isn't at all suggestive of what a quasi quote is when it's used without the word "quote". And the #1 result when I search for the highly specific phrase "quasi-quotation" is the wikipedia page on "quasi-quotation" which all but ignores programming language use of the concept.

Conclusion? "quasi" is a common English word. It does not remotely hint at what it might mean in Perl 6. Even the relatively obscure term "quasi quotation" is a poor fit. I was hugely surprised by your suggestion that the keyword choices "enum" and "subset" are of about the same quality as "quasi". Hopefully the foregoing has opened your mind about this bit of bikeshedding. You may be in love with "quasi", and it makes sense in 007, but I would like to think that Perl 6 macros will aim at avoiding unnecessarily weird color schemes.

^2 Assuming a Texas variant of quasi quoting existed, and that it was several [ and matching ], it might entail confusion if the contained expression also used the same bracket characters (human confusion, not parser confusion) so it might be nice to allow users to use as many multiple brackets as they liked.

masak pushed a commit that referenced this issue Jul 17, 2016
This also starts making use of an optional parameter to the
`unquote` rule, passing along the expected type in the context
so that we can fail earlier, inside of the rule instead of
outside it.

This will go some way to getting out of the quagmire described in
#30 (comment) --
it was enough for Q::ArgumentList, for example, which clashes with
term position -- but I'll be quick to note that it doesn't
address the problem at its root. Notably, when we do hit the type,
at the end of the unquote, we've already had who-knows-how-many
side effects along the way from the stuff inside that we parsed.

In other words, something like `unquote(qtype @ expr)` would
still be beneficial, maybe even necessary, in order to finish up
issue #30.
@masak
Owner Author

masak commented Sep 25, 2017

Since Q::type { ... } would occur in term position, that'd collide with the syntax for regular terms: Q::type. Of course we could special-case that syntax by looking ahead for a block after the Q::type thing. We've been down that path before — the syntax for new Q::Identifier { ... } used to be Q::Identifier { ... } in 007. See #147. Nowadays I consider keywords like new and quasi to be sensible "guards" in the grammar that serve to disambiguate the parsing rule early so that there's less ambiguity.

Putting Qtypes in the Q:: hierarchy is just a convention. Someone could do class A::B::C::DefyingConvention extends Q and it'd be a Qtype. So any identifier we parsed we'd have to do the lookahead.
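
To make the trade-off concrete, here is a minimal Python sketch. It is not 007's grammar; token lists stand in for parser input, and both function names are hypothetical. With a keyword guard the rule is chosen from the first token alone, whereas without one every identifier needs a peek at the token that follows it.

```python
# Toy sketch: token lists stand in for real parser input.
# Both function names are hypothetical, not part of 007.

def classify_with_guard(tokens):
    """The `quasi` keyword settles the rule from the first token alone."""
    return "quasi" if tokens[0] == "quasi" else "term"


def classify_with_lookahead(tokens):
    """Without a keyword, every identifier needs a peek at the next token."""
    next_token = tokens[1] if len(tokens) > 1 else None
    return "quasi" if next_token == "{" else "term"


guarded = classify_with_guard(["quasi", "{", "x", "}"])
unguarded = classify_with_lookahead(["A::B::C::DefyingConvention", "{", "x", "}"])
plain = classify_with_lookahead(["Q::Infix"])
```

The lookahead variant has to run for any identifier at all, since nothing about the name itself says whether it is a Qtype.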

Lastly, though I seem to be in a minority in this thread to do so, I kind of like the keyword quasi and its connotations. (:wink:) It stays for now.

@masak
Owner Author

masak commented Sep 25, 2017

Oh! Here's another thing I wanted to mention. Besides the slogan

Quasi quotes are not 007, they're a template.

that I mentioned above, I would also like to champion the slogan

Quasiquotation is a slang.

The two features that I usually associate with slangs fit well with quasiquotes:

  • They return a value out to their parent environment.
  • They allow mixing in "blocks" of the parent environment.

But there's a third one too that I hadn't kept at the front of my attention but that seems blindingly true with quasis:

  • They have different runtime semantics; essentially a different "runloop".

In the case of quasis, the runloop is simply a sub that evaluates to a Qtree.

I don't know if it means that all slangs are templates... I guess time will tell.

@raiph

raiph commented Sep 30, 2017

FYI, I just began reading A Guide To Parsing; in it, Federico explains how "In a way parsing can be considered the inverse of templating".

@masak
Owner Author

masak commented Oct 2, 2017

Well, yes: parsing extrudes a (possibly nested) model from flat text, and templating produces flat text from a (possibly nested) model.

But also, in subtle ways, no... which is why Federico writes "in a way", I guess. 😛 For one thing, spooling a flat structure from a deep model is algorithmically very different from deriving a deep model from something flat. There's a fair amount of guessing/backtracking when you do the latter which is missing from the former.
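
A small Python sketch of the two directions, using a toy model of nested lists rendered as S-expressions (the model and both function names are assumptions for illustration only). Rendering is a plain walk of the tree; parsing has to track a position and decide at each token whether a nested node starts or ends.

```python
# Hypothetical toy model: nested lists of strings, rendered as S-expressions.

def render(model):
    """Templating direction: walk the nested model and emit flat text."""
    if isinstance(model, list):
        return "(" + " ".join(render(child) for child in model) + ")"
    return model


def parse(text):
    """Parsing direction: recover the nested model from flat text."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_node():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            children = []
            while tokens[pos] != ")":
                children.append(parse_node())
            pos += 1  # consume the closing ")"
            return children
        return token

    return parse_node()


model = ["say", ["+", "2", "2"]]
text = render(model)
round_trip = parse(text)
```

This grammar is simple enough that one token of lookahead suffices; richer grammars are where the guessing and backtracking come in.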

The one point that stands out from skimming the document is that Perl 6 (and 007) are languages such that scannerless parsing is the only option. That's because "where the lexer ends and the parser begins" becomes such an overriding issue that scannerless becomes the only reasonable option. And that's because "you have to know which language you're in" in order to even do lexing properly — in other words, lexing is tied up with and dependent on the parsing up to that point. Not all languages are like that, but it feels reasonable (maybe inevitable) that a syntax-extensible language with macros and slangs be like that.
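
As a rough illustration of lexing that depends on parser state, here is a Python sketch of a tiny scanner. It is an assumption-laden toy, not how 007 or Perl 6 tokenize: it classifies `-` as prefix or infix purely from whether a term or an operator is expected at that point.

```python
def tokenize(source):
    """Scan `source`, letting the expected syntactic position pick the token kind."""
    tokens = []
    expecting_term = True  # at the start of an expression we expect a term
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(source) and source[j].isdigit():
                j += 1
            tokens.append(("int", source[i:j]))
            expecting_term = False  # after a term, an operator may follow
            i = j
        elif ch == "-":
            # Same character, different token, depending on parser state.
            tokens.append(("prefix" if expecting_term else "infix", "-"))
            expecting_term = True
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens


tokens = tokenize("-14 - -2")
```

A separate lexer pass with no access to `expecting_term` could not tell the three minus signs apart.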

I liked the text (having skimmed it). Thanks for throwing it my way. 😄

@masak
Owner Author

masak commented Oct 2, 2017

And also, yes, quasis (being templates) are the "inverse" of parsing in the sense that when you stuff a string into an EVAL call in Perl 6, the compiler has to parse that string as a new compunit... but when you put something in a quasi, it's being parsed (to the extent possible) along with the rest of the program, and so you've "saved" a parse step at runtime compared to EVAL.

masak mentioned this issue May 26, 2018
@raiph

raiph commented May 29, 2018

Brainstorm incoming. I tried to read the entire thread before commenting but some dude raiph hijacked the thread with bikeshedding and I lost heart wading thru it all. I hope I'm not about to repeat his mistake but we are all one after all so I apologize if there ain't that much difference between the two of us.

Also, I'm pretty sure the following will be completely invalid and/or stupid in multiple places, but I'm just going to spew it out rather than pause to worry about that because that can kill the flow and you were gracious about his stuff, so why shouldn't I just post what I've got right now?

#Strawman proposal 1

Macros are their own slang, analogous to the way rules are. The macro slang is almost identical to the main language except it introduces a couple critical tweaks. It's essentially a template language where most of a typical template uses exactly the same syntax as the main language.

Macros with unquotes must initially mention their unquotes (with possibly one exception which I'll get to) as either parameters of the macro in a parenthesized list at the start, or via variable declarations.

Each such mention of an unquote / parameter must use a prefix which I'll call a sigil. I'm thinking symbols like $, @, &, etc. -- I'm sure you get my drift. The use of unquotes must use the same identifier as the initial mention, including the sigil.

So, for example:

macro foo (infix &infix, term $foo) {
    my $a = 42;
    quasi { $a &infix $foo }
}

The infix, term etc. must be, or be subtypes of, the AST type in the main language that correspond to named grammatical categories in the macro slang (and also, quite plausibly, in the main language too).

#Strawman proposal 1.1

The default type for macro parameters depends on their sigil:

  • $ sigil ---> term;

  • & ---> callable (not committing to whether it's a listop, infix, or postfix);

  • @ ---> code -- a list of AST fragments;

  • % ---> ???

#Strawman proposal 2

Replace use of the word quasi with [[ ... ]]. This would be a useless doubling of the square brackets in the main language so is arguably available for Perlish repurposing. {...} declares a fragment of the main language, including the braces. These mean different things:

macro foo [[ my &b = { my $a = 42 }; my $a = 99 ]]
macro foo { my &b = [[ my $a = 42 ]]; my $a = 99 }
macro foo [[ my &b = [[ my $a = 42 ]]; my $a = 42 ]]
macro foo { my &b = { my $a = 42 }; [[ my $a = 42 ]] }

The unicode characters ⟦ ⟧ may be used as an alternative to [[ ]].

#Strawman proposal 3

Let the main language parse anonymous [[ ... ]] as IIMEs -- immediate inline macro expressions. And perhaps -> $foo [[ ... ]] as shorthand for macro ($foo) { ... [[ ... ]] }.

Am I completely crazy?

@masak
Owner Author

masak commented Jun 23, 2018

Hi, sorry for dropping the ball on this one. Will attempt a reply now.

@masak
Owner Author

masak commented Jun 23, 2018

Here's hoping I manage to write this without coming off as exceedingly nitpicky or grumpy. 😄 @raiph, the summary is that you are addressing issues which (I've come to understand) are important in your mind, but not so highly prioritized in mine. (I'm still glad that you're inventing in this space, don't get me wrong. Keep it coming!)

Because I know that summary is not nearly sufficient, let me list some specifics:

  • I'm kind of happy I have managed to keep sigils out of 007. I'm not hating on Perl's sigils as such, I think they're neat. It's more like they are the kind of extra convenience that 007 eschews in the name of simplicity. It gets that after Python, I guess.

  • TimToady also suggested a sigil for macro arguments, instead of unquotes. He suggested one sigil (¤) instead of three-four, but besides that, his suggestion reminds me of this proposal. I'll write now what I wrote then: it introduces sigils without addressing the challenges the parser actually has with unquotes. As such, it feels more like handwaving than a solution.

  • To be specific, this proposal hopes that the infix keyword in the parameter list of the macro will propagate to make the &infix variable parse correctly as an infix. (Correct me if I'm wrong.) I fail to see how this is better than the current status quo, providing that information to the parser locally where it needs it in the form of a typed/"categorified" unquote: {{{ Q::Infix @ infix }}}.

  • Any solution where information needs to be propagated along the AST in order for the parser to do its job makes the parser that much harder to write, because it deliberately mixes typechecking into parsing. Besides being technically harder, it just feels intuitively odd to me. In Java when you compile, first you get a bunch of syntax-related errors because the compilation unit doesn't parse cleanly. Once you've fixed those, you start getting type errors. In XML, first the parser needs the XML to be well-formed, and only then can we even talk about it being valid. A non-well-formed document is not even invalid; it's a category error to talk about its validity. Similarly (I feel), a computer program should be syntactically correct before we start analyzing its types. Sorry for repeating myself, but I feel this just as strongly as last time. I hope I at least expressed it more clearly this time around.

  • The proposal eradicates the distinction between unquoting/using and mentioning an AST fragment inside of a quasi. (It's been an uphill battle sometimes to defend this as a feature, not a bug. See examples/name.007 for a case of where we unabashedly mention rather than use. I'd hate to see that go.)

  • Introducing sigils for macros makes the solution inherently non-portable into Perl 6, since Perl 6 already uses sigils for indicating container type families.

  • As someone who still has no problem at all with the term "quasiquote" or the keyword quasi, the suggestion of [[ ]] and ⟦ ⟧ comes off as a rehash of previous suggestions. (Which I dismissed back then.) But let me give my main beef with it: double square brackets are not an unlikely-enough syntactical occurrence in macro-free programs. They happen when you want to write a 2D matrix, or a table of stuff, or a chess board... "We can't just waltz in and mess up the main language in the name of macros and quasis." Cf. masak's Deplorable Truth About Templating Syntax — quasis and unquotes are ugly because they need to stay far apart from the main language.

  • I don't see the double square brackets [[ ]] being chosen for any reason other than its (perceived) availability. The curly braces in quasi { }, on the other hand, are consistent with the fact that the quasi is a block and a scope. Braces delimit scopes in 007/Perl; brackets don't.

  • Similarly, using braces { } to mean "take us back outside of the quasi slang" is ruled out because braces are commonly used for all kinds of blocks inside the quasi'd code.

  • Not sure we need immediate inline macro expressions for anything, but the suggested syntax is inconsistent with a long tradition in Perl and other languages of being able to write out an abstraction without it being automatically applied.

That's it. Again, sorry if this comes off as just finding flaws — I hope I'm getting across that the above are not so much reactions from me as they are from the realities of syntax, parsing, compilation, and Perl. (And, likely as not, my own view of things is limited too and there are adequate responses/workarounds to what I consider problematic.)

@masak
Owner Author

masak commented Oct 3, 2018

Although I haven't much shifted my stance on whether "quasiquote" (and its current keyword/syntax) is easy to understand or optimal...

...I realized one thing the other day that sets 007 (and Perl 6) apart from Lisp: in 007 (and Perl 6) there's quasiquotation, but no quotation. (See here for a quick explanation of quotation and quasiquotation in Lisp.)

We don't have a construct, analogous to quasi, for quoting code. The closest we have is a quasi without any unquotes in it.

Again, this doesn't much change my stance as to the appropriateness of these terms in 007 and Perl 6... but it does explain to me parts of what people (maybe?) find distasteful about the term "quasiquote" in 007 and Perl 6.

@masak
Owner Author

masak commented Nov 13, 2020

So, strawman proposal, use bracketing that includes a reversal of the ⌜ and ⌝ symbols:

[...]

@raiph, I'm not sure I said it clearly back in 2016, so let me say it now: reversing the quotation symbols for unquotes strikes me as an idea that really has the heart in the right place — they're called "unquotes", after all, and my working understanding of what they do semantically is that they decrement the quoting depth — it's just that I don't believe the execution of the idea works.

Or, to be more precise, I can't think of one example in the wild where it has worked. The reason, I think, is simple: it runs against the grain of what parsers are typically able to deal with. Squint hard enough, and you see not just parser technology adapting to what language syntax needs, but also the converse — it's not that we couldn't produce a parser that handled [...)-style ranges, or ⌝...⌜-style unquotes, it's that it would break too much on the way. Maybe part of the "break too much" is in the parser itself, and part of it is in the developer unconsciously emulating the parser, or the aggregated ecosystem of developers collectively parsing the code.

@masak
Owner Author

masak commented Dec 24, 2020

...I realized one thing the other day that sets 007 (and Perl 6) apart from Lisp: in 007 (and Perl 6) there's quasiquotation, but no quotation. (See here for a quick explanation of quotation and quasiquotation in Lisp.)

Since I wrote the above, I've also come to realize that Scheme/Lisp quotation (and quasiquotation) is much worse for hygiene than Raku's/Alma's variant. (I've come to this realization after dipping into the two Kernel/vau papers, on which I hope to write several comments, soon.) I don't like to tease without providing details, so here is a brief summary: in Scheme/Lisp, lookup of a symbol happens entirely at runtime, whereas in Raku/Alma, lookup has a static part and a dynamic part; this is what leads on to unique variables and a stronger guarantee of hygiene.

Hope to write more on this soon.

@masak
Owner Author

masak commented Dec 24, 2020

@raiph I just found the term "code quotes" in this paper, and found it sympathetic. From what I can see, these are actual quasiquotes (in Java), but the name is friendlier, and reminded me of your code keyword proposal above.

I don't see that I ever addressed using code as a keyword. (Though this issue thread is notoriously big, so maybe I did but can't find it.) It's fine, I think; I'd say it has merits on par with quasi in terms of how well it describes intent. Maybe if we imagine a group of relative beginners, code will look a lot friendlier and more disarming to them; if we imagine a group of grizzled Lisp veterans, quasi will similarly look natural as a keyword, but it will scare the beginners. (Whether this last aspect is seen as a positive or a negative depends, I guess, on whether one wants to put macro technology in the hands of beginners.)

I think in a parallel universe we could have easily gone with code from the start. As it is, S06 decided on quasi as a keyword, and there doesn't seem to be a compelling reason to switch.

Of course, since Alma is meant to be able to extend any and all syntax, code is still not out of the question as a syntactic extension... 😁

@raiph

raiph commented Feb 11, 2021

Hi Carl, belated happy 2021. :)

I just noticed a bunch of comments in this repo in recent months that I'd previously missed in issues I'm mentioned in.

One thing I'm unclear about and need to resolve is whether including asides intended to be about the macros Raku eventually gets, not alma, is inappropriate here. In this comment I will presume they're OK if explicitly noted as such. PLMK if you'd rather I just dropped them altogether.


... it's not that we couldn't produce a parser that handled [...)-style ranges, or ⌝...⌜-style unquotes ...

You'd already said as much back in 2016, and I had replied with:

Fair enough. This was the other especially wild suggestion. :)

And now, in 2021, my thinking is... fair enough. :)


Scheme/Lisp quotation (and quasiquotation) is much worse for hygiene than Raku's/Alma's variant

Aiui the intent was that Raku's macros should default to hygiene so that sounds good.

Aiui the intent was that Raku's macros should also be able to be unhygienic. So there's that too. Though I'm sure that could reasonably be punted on for now / this decade.

Does quasi { foo } represent an ast that is a lambda whose body of code is foo? Or an ast that represents the compiler's evaluation of foo (either the symbol foo or a call of &foo)? And if it's the latter, does one write quasi {{ foo }} to get the former?

(And some more suggestions for Raku paint -- not alma. Maybe quasi [[ ... ]] / quasi ⟦ ... ⟧ would be an option for unhygienic quasis?)


I just found the term "code quotes" ... I'd say it has merits on par with quasi in terms of how well it describes intent.

I agree. Actually I'd say quasi is markedly better than code or codequotes.

One huge thing in favor of quasi is that you like it. Alma is primarily your and ven's adventure/story, and Raku's jnthn's, so you folk quite rightly get to choose the name of principal characters. And it really is very much a paint issue about a bikeshed, not a nuclear power plant ("Perl 6"!).

But quasi is also just better all round, than code. Imo. Here in 2021.

That said, as an incurably cheeky bit player I now strawman suggest yet another color that seems appealing as I write this sentence: template.

@masak
Owner Author

masak commented Feb 11, 2021

Hi Carl, belated happy 2021. :)

Happy 2021, and (a not belated) 新年快乐 in the year of the ox! Happy 牛 NIU year 😆

I just noticed a bunch of comments in this repo in recent months that I'd previously missed in issues I'm mentioned in.

Well, this thread is funny — it started out as a small checklist to do an ostensibly simple thing (to make HN happy, more than five years ago), but it quickly ran into a technical hurdle which I will now describe:

Let's say there's an unquote in a quasi: {{{ myAst }}}. Parsing this unquote isn't the problem, but let's say there's a plus symbol after it: {{{ myAst }}} + .... Now if myAst contained a term of some kind, we should parse that plus as infix:<+>. But if it contained, I dunno, a statement, then the plus should be a prefix:<+>. When in doubt, we fall back on our deeply held principles, and in this case Raku has one: always know what language you're in. In practical terms, it means we can't just leave the situation like this — something is missing so that the parser can always know what language it's in — because the value of myAst isn't available yet; it comes in "a later phase", when we macro-expand, and the parser needs to know now.

The resolution to that, which basically happened in and around this issue, is the {{{ Q::Term @ myAst }}} syntax. (And {{{ Q::Statement @ myAst }}}.) Because expressions are by a long margin the thing we're likely to pass as ASTs, we take that as the default; if you have a non-expression, you need to specify its grammatical class so that the parser still knows which language it's in after the unquote.
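
The idea can be sketched in a few lines of Python. This is only an illustration: the table and the function name are hypothetical, and the category names merely echo Q::Term and Q::Statement. The point is that the declared category, not the runtime value of myAst, tells the parser what to expect after the unquote.

```python
# Hypothetical table: what the parser expects right after an unquote
# of a given declared category.
EXPECT_AFTER = {
    "Q::Term": "operator",   # a term is followed by an infix or postfix
    "Q::Statement": "term",  # after a statement, a new statement starts
}


def plus_role_after_unquote(category="Q::Term"):
    """Decide how a `+` directly after an unquote must be parsed.

    The default mirrors the thread's choice of expressions as the
    default category for unquotes.
    """
    expecting = EXPECT_AFTER[category]
    return "infix:<+>" if expecting == "operator" else "prefix:<+>"


after_default = plus_role_after_unquote()
after_statement = plus_role_after_unquote("Q::Statement")
```

Nothing here needs the AST value itself, which is why the decision can be made at parse time rather than at macro expansion.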

I'm not married to this syntax which kind of introduces a new (pseudo-)operator @ which may or may not fly in Raku... but that part feels like a small detail compared to the victory of having a solution at all.

Unfortunately, just outlining the solution doesn't make it so; the implementation of this has stalled on exactly how to make that happen within the limits of a recursive-descent parser, and so I've been taking a yak-shaving tour towards exploring alternate parsers that would be up to the job (#293). (As I type this, I'm thinking "surely it can't be that tricky? All I need is to hardcode multiple rules for the different types of unquotes...". I might give it a go, if for no other reason than to remind myself why it was indeed that tricky.)

As this thread got stuck like a gazelle on the savanna, it became easy prey for discussions about quasiquoting, possible syntaxes, and philosophy/etymology. I would like to stress immediately that this has been one of the most valuable/enjoyable Alma issues for me, if not the most valuable/enjoyable. Simply because it has generated so many thoughts and ideas and bravely put them up for scrutiny — as well as involving basically all the people who have ever had an active interest in Alma. It's suitable, if unquotes are somehow at the core of what Alma is and wants to be, for this issue to also be the core of the repository's activity. I have not minded at all the fact that often the solutions proposed were for problems that I already considered solved, or not in dire need of solving. 😄 It is right to call some of the discussion "bikeshedding", yes, but that term to me was never all that negative; it's about finding smaller optima within larger ones.

One thing I'm unclear about and need to resolve is whether including asides intended to be about the macros Raku eventually gets, not alma, is inappropriate here. In this comment I will presume they're OK if explicitly noted as such. PLMK if you'd rather I just dropped them altogether.

They are not just OK, but encouraged. My stance on Raku these days is... complicated, but I still like and believe in the language. Although Alma (née 007) is not exclusively about supplying macros for Raku (née Perl 6), that charter is still in play. (Concretely, it's less about "yo Raku, you should use the @ syntax to indicate grammatical category in unquotes!" and more about "yo Raku, in our explorations we discovered that a mechanism is necessary for indicating grammatical category in unquotes!". If you see what I mean. Just like Perl 5 and Raku learn from each other in ways that are primarily non-syntax-based, so do Alma and Raku.)

Scheme/Lisp quotation (and quasiquotation) is much worse for hygiene than Raku's/Alma's variant

Aiui the intent was that Raku's macros should default to hygiene so that sounds good.

I still haven't dropped the other shoe on the above teaser, but it's happening slowly over in #302 (another @raiph thread, and strong contender to "most interesting discussion in the repo", at least I think so).

But saying things many times is probably good, so I'll try in a partial, most likely flawed way now: in Raku (and Alma), parsing goes as far as establishing the definition of a use there and then. That is, if you have

my $x = ...;

# ... (no intervening variable declaration)

... $x ...

The parser knows enough when it sees the use $x to say "yup, that's with 100% absolute certainty the variable that I declared back in my $x".

Not so Scheme. In Scheme, a symbol is just a symbol, and so the parser sees an x and goes "huh, an x". And that's it. No reasoning at that point about uses and definitions.

(I know, it's surprising! Scheme is known for its hygiene.)

The upshot is that, when time comes to expand a hygienic macro, what Scheme needs to do is a massively complicated (and costly) cover-all-bases alpha-rewriting of everything. John Shutt described this process as something he made sure he understood well enough for his thesis work, but then promptly forgot about soon after. The point is that, because the parser does the bare minimum, all of the hygiene work has to happen during macro expansion instead. In Raku/Alma, the parser quite naturally does more, and so the later hygiene work during macro expansion becomes less.
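
A minimal Python sketch of the "parser binds uses to definitions" half of this, under loud assumptions: the event list is a stand-in for a real parse, and the `name#n` identifiers are an invented way to give each declaration a unique identity. It is not Alma's actual implementation.

```python
import itertools


def resolve(events):
    """Bind each variable use to its declaration while 'parsing'."""
    counter = itertools.count(1)
    scopes = [{}]   # stack of scopes: name -> unique declaration id
    bindings = []   # one entry per use, in source order
    for event in events:
        kind = event[0]
        if kind == "enter":
            scopes.append({})
        elif kind == "leave":
            scopes.pop()
        elif kind == "declare":
            scopes[-1][event[1]] = f"{event[1]}#{next(counter)}"
        elif kind == "use":
            name = event[1]
            for scope in reversed(scopes):
                if name in scope:
                    bindings.append(scope[name])
                    break
            else:
                raise NameError(f"undeclared variable: {name}")
    return bindings


events = [
    ("declare", "x"),  # my x
    ("use", "x"),
    ("enter",),
    ("declare", "x"),  # an inner my x shadows the outer one
    ("use", "x"),
    ("leave",),
    ("use", "x"),
]
bindings = resolve(events)
```

Once every use carries the identity of its declaration, copying code elsewhere in the AST does not change what a name refers to; a symbol-only representation would have to reconstruct that later.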

Aiui the intent was that Raku's macros should also be able to be unhygienic. So there's that too. Though I'm sure that could reasonably be punted on for now / this decade.

"Unhygienic" could refer to two things.

  • The first one is accidental capture. It's what happens if names declared by the macro author affect code written by the macro user, or vice versa. If this problem didn't exist, or wasn't something we felt like solving, then macro expansion could be simple C-like textual/token expansion, and Alma as a language wouldn't have been needed. But the problem does exist, and we feel like solving it. We model this on a similar guarantee from Algol's block scope: what's declared in the block scope stays in the block scope. With macro expansion this is harder, because code gets physically copied into remote parts of the AST, and so lookups can become "illexical"; a use no longer finds its definition by climbing a bunch of scopes outwards. Now... I don't think we'll ever want to "be able to be unhygienic" in this sense, because (as the name implies) it's an accident, not a feature. I won't rule out that there could be a use case when people would actually want this, though. One weak bit of evidence is the stance espoused by Paul Graham and others, that Scheme's hygiene-by-default has a higher "total cost of ownership" than Common Lisp's gensym-when-you-need-to.

  • The second one is anaphora (@vendethiel's term, borrowed from Paul Graham in "On Lisp", and also used in Bel a bit): the willful insertion of a name into the macro user's code. (See #478, "Give for loops an implicit topic variable _", for a Raku-inspired example.) We do want this one, we know we do, and the only thing that's missing is basically an implemented mechanism for it. It'll probably be something like the conjectured parser.declare call in #159, "Have a mechanism to allow arbitrary (undeclared) identifiers in expressions". (That is, the macro code has a way to indicate in the quasi that a new variable is declared.) (But see below, I kind of realized something.)

So, not much need for punting, I think. Like, 90% of "unhygienic" isn't even a feature, and the remaining 10% we have a plan for and could prioritize as high as we want.
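
To illustrate accidental capture, here is a Python sketch of textual macro expansion for a hypothetical `swap` macro. The renaming variant uses a gensym-style fresh name, which is the Common Lisp approach mentioned above and not the mechanism Alma uses; all names are invented for the example.

```python
import itertools

_gensym_counter = itertools.count()


def gensym(base):
    """Return a name that user code is assumed never to spell out."""
    return f"__{base}_{next(_gensym_counter)}"


def expand_swap_naive(a, b):
    # Textual expansion: the macro's own `tmp` can collide with user names.
    return f"tmp = {a}\n{a} = {b}\n{b} = tmp\n"


def expand_swap_renamed(a, b):
    # Same expansion, but the temporary gets a fresh name.
    tmp = gensym("tmp")
    return f"{tmp} = {a}\n{a} = {b}\n{b} = {tmp}\n"


def run(code):
    # The macro user happens to have a variable called `tmp`.
    env = {"tmp": 1, "other": 2}
    exec(code, {}, env)
    return env["tmp"], env["other"]


naive_result = run(expand_swap_naive("tmp", "other"))
renamed_result = run(expand_swap_renamed("tmp", "other"))
```

In the naive expansion the user's `tmp` is captured by the macro's temporary, so the swap silently fails; with the fresh name the values are exchanged as intended.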

Does quasi { foo } represent an ast that is a lambda whose body of code is foo? Or an ast that represents the compiler's evaluation of foo (either the symbol foo or a call of &foo)? And if it's the latter, does one write quasi {{ foo }} to get the former?

Since this question follows directly on talks of hygiene, I'm assuming you feel it's making a connection to hygiene. But I... don't quite see it. I'll try to answer anyway. There are two ways to think about what a quasi expression represents. The first is that at a code level, it's more of a "template" or (if you will) a function waiting for some parameters that it'll use to build an AST. The second is that whenever you evaluate a quasi expression (at "runtime", which may well be at macro expansion time during compile time), the result you get is an AST. The template/function interpretation is "internal" to the implementation, and all we ever observe is the AST result.

In either case — regardless of which interpretation you pick as "the" interpretation — it's the case that foo written inside the quasi expression always refers to the foo that's in play at that point. (That's hygiene.) For it to be something else, you'd need to have anaphorically introduced some other name using parser.declare.

I just realized something. The above means that parser.declare probably isn't a suitable solution here. The reason is that it looks too much like a regular method call, but what we're doing here is introducing a name, including inside the quasi. We have two audiences when introducing the name: the mainline/injectile, where the name is to be actually used later in practice, but also the quasi itself, where the parser needs to know which language it's in, and so it needs to share the knowledge that a variable has been injected there. So the syntax for that can't just be parser.declare, it needs to be something more declaration-y and macro-y, like inject my foo = ...;.

(And some more suggestions for Raku paint -- not alma. Maybe quasi [[ ... ]] / quasi ⟦ ... ⟧ would be an option for unhygienic quasis?)

Well, first off, see all of the above answers.

Second off, I'm still trying to follow the initial design set out by S06. (Yes, seemingly years after everyone else stopped caring about the synopses. 😄) What it says about this is that you can choose your delimiters as you please for quasi, and it'll keep working the same. That is, {} delimiters are the default, but if you choose something else, you don't lose hygiene. (It doesn't say this explicitly, but I wouldn't say it leaves room for other interpretations either.)

Also, when we say hygiene is so nice we'd like it to be the default (in spite of Common Lisp people's contrary opinion), I'm wondering if dropping hygiene just by switching delimiters doesn't make it a little bit too easy to drop hygiene. At the very least, I'd like to see some other reason to want to drop hygiene (besides anaphora) before I go and make it super-easy.

One huge thing in favor of quasi is that you like it. Alma is primarily your and ven's adventure/story, and Raku's jnthn's, so you folk quite rightly get to choose the name of principal characters. And it really is very much a paint issue about a bikeshed, not a nuclear power plant ("Perl 6"!).

But quasi is also just better all round, than code. Imo. Here in 2021.

I do like it. And via S06, either Damian or Larry liked it many years ago, and I enjoy the feeling of respecting that, too. In fact, if you take a huge step back and imagine a whole host of activities where you keep some parts of a quotation literal and allow some parts to be interpolated — Python f-strings, ES6 template strings, SQL prepared statements, Lisp quasiquotes — I think a good term for that general activity might very well be "quasiquotation".

That said, as an incurably cheeky bit player I now strawman suggest yet another color that seems appealing as I write this sentence: template.

The problem with that is that it focuses on the mechanism and not on the result, which is usually a mistake when naming things in programming languages. Consider if instead of function JavaScript had called it function-definition. (Which is correct, but unhelpful.) Or instead of try we had code-that-might-throw. The thing that comes after the quasi keyword is a template, yes... but the value it generates is an AST.

Now, arguably, the term "quasi" commits exactly the same mistake, focusing on the mechanism and not the result. But while template seems to focus only on the mechanism — the templating itself, as it were — the connotations of quasi feel to me as if they are also about the result we get out. This is very vague and subjective, I know, but nonetheless a felt difference.

(A famous and widespread example of "keywords that talk about the mechanism" is switch. I seem to recall Apocalypse 12 calls this out, even, as it introduces the given construct. A recent accepted PEP calls its new switch statement keyword match.)

@masak
Owner Author

masak commented Sep 18, 2021

[...]

Of course, since Alma is meant to be able to extend any and all syntax, code is still not out of the question as a syntactic extension... 😁

I'd like to solicit a macro code that does this. It should be is parsed (according to one of the many iterations of is parsed floating around), and code { ... } should desugar to quasi { ... }.

Something tells me that that's not as easy as it might sound, because quasi is "special". But being able to implement code convincingly in Alma would be a nice milestone, I think.

@vendethiel
Collaborator

vendethiel commented Sep 18, 2021

Something tells me that that's not as easy as it might sound, because quasi is "special". But being able to implement code convincingly in Alma would be a nice milestone, I think.

I don't think that's possible in (most) Lisps, FWIW. You can't define something like ,, ,@ or ` by yourself.

@masak
Owner Author

masak commented Sep 18, 2021

I don't think that's possible in (most) Lisps, FWIW. You can't define something like ,, ,@ or ` by yourself.

Bel defines those in terms of Bel: syn \` and syn \, for the reader and mac bquote for the evaluator. All of them in userland.

5 participants