Implement quasi unquotes #30

Open
masak opened this issue Oct 10, 2015 · 60 comments

@masak
Owner

@masak masak commented Oct 10, 2015

Hacker News wants unquotes. We happily oblige.

Unquotes in expressions

Whenever the parser is ready to parse a term, it should also expect an unquote.

quasi { say("Mr Bond!") }
quasi { say({{{greeting_ast}}}) }

Technically, I don't see why we shouldn't expect the same for operators. But we get into the interesting issue of what syntactic category it is.

Screw it, I'm tired of theorizing. Let's just steal the colon for this.

quasi { 2 + 2 }
quasi { 2 {{{infix: my_op}}} 2 }

quasi { -14 }
quasi { {{{prefix: my_op}}}14 }

quasi { array_potter[5] }
quasi { array_potter{{{postfix: my_op}}} }

Backporting this solution to terms, you could mark up a quasi as term if you want, but it's the default so you don't have to:

quasi { say("Mr Bond!") }
quasi { say({{{term: greeting_ast}}}) }

At the time of evaluating the quasi (usually macro application time), we'll have the type of the unquoted Qtree. The runtime dies if you try to stick a square Qtree into a round unquote.

But the parser can sometimes reject things early on, too. For example, this shouldn't even parse:

quasi { sub {{{prefix: op}}}(n) { } }

(That slot doesn't hold an operator, it holds an identifier.)

Unquotes for identifiers

007 currently has 5 major places in the grammar where it expects an identifier:

  • Usages
    • Terms in expressions (but that's handled by the previous section)
    • is <ident> traits
  • Declarations
    • my and constant
    • sub and macro
    • parameters in parameter lists (in subs and pointy blocks)

The traits one is kind of uninteresting right now, because we have four trait types. Someone who really wanted to play around with dynamic traits could write a case expression over those four. So let's skip traits — might reconsider this if we user-expose the traits more.

The three declaration cases are the really interesting ones. Notice that each of those has a side effect: introducing whatever name you give it into the surrounding lexical scope. (Handling that correctly is likely part of the #5 thing with Qtree strengthening.)

I would be fine with the {{{identifier: id}}} unquote accepting both Q::Identifier nodes and Str values. Q::Identifier is basically the only node type where I think this automatic coercion would make sense.

Unquotes in other places

These are the remaining things I can think of where an unquote would make sense:

  • Sequence of statements (Q::Statements)
  • Parameter list (Q::Parameters)
  • Argument list
  • Standalone block (Q::Block)
  • Trait (Q::Trait) — note: we should probably have a Q::Traits container, just like we do for statements and parameters
  • Individual statement (Q::Statement) — not sure what this one'll give us over sequence of statements, but it's possible it might be useful
  • Lambda — interesting; this one is in the grammar but not reified among the Q nodes
  • "Expr block" — ditto
  • Unquote (Q::Unquote) — mind explosion — note that this is only allowed inside a Q::Quasi
@masak
Owner Author

@masak masak commented Oct 11, 2015

After some deliberation on #6macros, I've decided to make the syntax {{{my_op: Q::Infix}}} instead. And generally I think we ought to try conflating grammatical categories and Q types.

Conceptually, the typing is a run-time check. With #33 we can probably do better than that, and detect impossible things statically. The important thing is that the parser is no longer confused.

masak pushed a commit that referenced this issue Oct 11, 2015
Carl Masak
A quasi can be full of holes -- so-called "unquotes". An unquote
contains an expression that needs to evaluate to a Qtree, which
then gets spliced directly into the code in the quasi. It's
quite similar to qq string interpolation, but with code instead
of a string. The unquote is a way of "escaping" out of the literal
code and into the land of Qtrees.

As an example, these all mean the same thing:

    quasi { say("Mr Bond!") }
    quasi { say({{{Q::Literal::Str("Mr Bond!")}}}) }
    my s = Q::Literal::Str("Mr Bond!"); quasi { say({{{s}}}) }
    my t = quasi { "Mr Bond!" }; quasi { say({{{t}}}) }

(The last one doesn't work yet, I think. It should, though.)

The quasis are replaced by actual Qtrees at *evaluation*, when the
runtime "hits" the quasi by evaluating an expression in which it
is a part. At that point, we run down the Qtree structure of the
quasi block, cloning everything we find (except trivial leaf nodes,
which are safe to re-use), and replacing every unquote we find with
its expression (hopefully a Qtree). That's what the new .interpolate
method is for.

Addresses #30. Still a lot to do.
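
The interpolation step described in the commit message can be sketched roughly as follows. This is a minimal Python illustration, not 007's actual implementation: the `Node`, `Leaf`, and `Unquote` classes, the `"Call"` node type, and the standalone `interpolate` function are all hypothetical stand-ins for the real Qtree classes and the `.interpolate` method.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Node:
    """A generic Qtree node: a type name plus children (hypothetical shape)."""
    qtype: str
    children: List["Tree"] = field(default_factory=list)


@dataclass(frozen=True)
class Leaf:
    """A trivial leaf such as a literal; immutable, so it is safe to re-use."""
    qtype: str
    value: object


@dataclass
class Unquote:
    """A hole in the quasi; `expr` runs when the quasi itself is evaluated."""
    expr: Callable[[], object]


Tree = Union[Node, Leaf, Unquote]


def interpolate(tree: Tree) -> Tree:
    """Clone `tree`, replacing each unquote with the Qtree its expression yields."""
    if isinstance(tree, Unquote):
        result = tree.expr()
        if not isinstance(result, (Node, Leaf)):
            raise TypeError(f"unquote produced {type(result).__name__}, not a Qtree")
        return result
    if isinstance(tree, Leaf):
        return tree  # leaves are shared rather than cloned
    return Node(tree.qtype, [interpolate(child) for child in tree.children])


# Roughly: my s = Q::Literal::Str("Mr Bond!"); quasi { say({{{s}}}) }
s = Leaf("Q::Literal::Str", "Mr Bond!")
quasi = Node("Call", [Leaf("Q::Identifier", "say"), Unquote(lambda: s)])  # "Call" is a placeholder name
expanded = interpolate(quasi)
```

The quasi's own tree is left untouched, so the same quasi can be evaluated many times with different unquote results.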
vendethiel added a commit to vendethiel/007 that referenced this issue Oct 20, 2015
@masak
Owner Author

@masak masak commented Oct 23, 2015

I've thought some more about it and I think that the syntax should be {{{my_op @ Q::Infix}}} instead of with the colon. Yes, Ven++ was an influence here.

But what finally brought me around is that we'll likely end up with a syntax like quasi @ Q::Trait { ... }, and there it feels like @ matches better. (And the two should definitely be the same symbol, and we could read it as "quasi as trait".)

We still kind of get the strange consistency with the colon even when it isn't a colon. And it's kind of nifty that it's actually another symbol, too. People could go "ah, it's like types, but for parsing".

@masak masak mentioned this issue Nov 6, 2015
2 of 2 tasks complete
@masak
Owner Author

@masak masak commented Dec 28, 2015

I started keeping a "quasi checklist", a list of the individual parsing modes a quasi term should be able to handle. Might as well keep it here in the issue, rather than offline.

  • Q::ArgumentList
  • Q::Block
  • Q::CompUnit
  • Q::Expr
  • Q::Identifier
  • Q::Infix
  • Q::Literal
  • Q::Literal::Int
  • Q::Literal::None
  • Q::Literal::Str
  • Q::Parameter
  • Q::ParameterList
  • Q::Postfix
  • Q::Prefix
  • Q::Property
  • Q::PropertyList
  • Q::Statement
  • Q::StatementList
  • Q::Term
  • Q::Term::Array
  • Q::Term::Object
  • Q::Term::Quasi
  • Q::Trait
  • Q::TraitList
  • Q::Unquote

Note that this is a checklist for quasis, not for unquotes. Will need a similar checklist for unquotes, which I deferred until after this for reasons I don't remember now.

@masak
Owner Author

@masak masak commented Jan 6, 2016

Alright, now that that's done, let's keep a checklist for the unquote forms, too. Seemed to work pretty well.

  • Q::ArgumentList
  • Q::Block
  • Q::CompUnit
  • Q::Expr
  • Q::Identifier
  • Q::Infix
  • Q::Literal
  • Q::Literal::Int
  • Q::Literal::None
  • Q::Literal::Str
  • Q::Parameter
  • Q::ParameterList
  • Q::Postfix
  • Q::Prefix
  • Q::Property
  • Q::PropertyList
  • Q::Statement
  • Q::StatementList
  • Q::Term
  • Q::Term::Array
  • Q::Term::Object
  • Q::Term::Quasi
  • Q::Trait
  • Q::TraitList
  • Q::Unquote
@masak
Owner Author

@masak masak commented Jan 6, 2016

I would be fine with the {{{identifier: id}}} unquote accepting both Q::Identifier nodes and Str values. Q::Identifier is basically the only node type where I think this automatic coercion would make sense.

(Syntax is now {{{Q.Identifier @ id}}}, superseding {{{id @ Q::Identifier}}}.)

Let's be conservative and wait with this until we see a clear use case for it. I think if we decide to go down this road, the really useful part would be the ability to auto-concatenate identifiers and strings into bigger identifiers, like so:

gensym_{{{q @ Q::Identifier}}}

That could be really nice — but let's scope that to be outside of this issue.
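
To make the idea concrete, here is a small Python sketch of what such a coercing, concatenating identifier unquote might do. Everything in it is an assumption for illustration: the `Identifier` class stands in for Q::Identifier, `splice_identifier` is a made-up helper, and the identifier pattern is a guess rather than 007's actual rule.

```python
import re
from dataclasses import dataclass

# Assumed identifier shape; 007's real rule may differ.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


@dataclass(frozen=True)
class Identifier:
    """Stand-in for a Q::Identifier node."""
    name: str


def splice_identifier(prefix, value):
    """Build a bigger identifier from literal text plus an unquoted value.

    Accepts either an Identifier or a plain string, mirroring the proposed
    automatic coercion; anything else is rejected.
    """
    if isinstance(value, Identifier):
        suffix = value.name
    elif isinstance(value, str):
        suffix = value
    else:
        raise TypeError(f"cannot coerce {type(value).__name__} to an identifier")
    name = prefix + suffix
    if not IDENT_RE.fullmatch(name):
        raise ValueError(f"{name!r} is not a valid identifier")
    return Identifier(name)
```

Under these assumptions, `splice_identifier("gensym_", Identifier("q"))` and `splice_identifier("gensym_", "q")` both yield the identifier `gensym_q`.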

@masak
Owner Author

@masak masak commented Apr 26, 2016

I'll just note that this issue is a little bit stalled because it's hit some conceptual difficulties.

When you're about to parse an expression, you might hit a prefix op or a term. Therefore, any of these is fine as an expression:

1 + 2
-42
{{{ast @ Q::Term}}} + 2
{{{ast @ Q::Literal::Int}}} + 2
{{{ast @ Q::Prefix}}} 42

The problem comes because we don't know which "parser mode" we should have been in until after the @ when we see the Q type. We want to preserve one-pass parsing, so that we don't re-trigger effects from parsing the expression before the @. (Let's say it wasn't ast but a sub or a macro with declarations and stuff inside.)

Perhaps it would be possible to "fixup" the parser state once we see the Q type. The parser assumed it was expecting a term (say), but now that we see Q::Prefix, we tweak it to instead expect a prefix. I have no clue how that'd be made to work.

Or, maybe the {{{expr @ qtype}}} syntax is inherently disadvantageous and should be changed. Something like qtype @ {{{expr}}} or @ qtype {{{expr}}} would work fine.

@masak
Owner Author

@masak masak commented Jun 13, 2016

Nothing has happened on this front. I'm defaulting to suggesting qtype @ unquote(expr) as a way forward.

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

Not sure how that allows one-pass parsing, though ?

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

But I think that, inherently, having splicing in a syntaxful language means you can't keep one-pass parsing. Or you're going to have to patch holes by yourself.

One way that could look like would be...

token if { 'if' <cond> '{' <block> '}' }

token cond:splice { '{{{ Q::Cond @'  <var> '}}}' }
token cond:expr { <expr> }

token block:splice { '{{{ Q::Block @' <var> '}}}' }
token block:simple { '{' <expr> + %% ';' '}' }

That looks awful – because it is.

As we've already pointed out numerous times, this is not an issue at all in Lisp(s), because we're already done parsing when we do splicing.

(foo ,@(mapcar #'process xs))
;; literally parsed as (at least in CL)
(foo (unquote-splicing (mapcar #'process xs)))

So, it's "free" in the sense that it's not in any way special syntax. I think we've established that quite some time ago, but I think it's still worth pointing out.
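
The point about reader sugar can be illustrated with a toy S-expression reader. The Python sketch below is not a faithful Common Lisp reader (real implementations differ in how they represent backquote forms); it only shows that `,@` and friends can be expanded into ordinary nested lists while reading, with no parser mode to switch. The names `READER_MACROS`, `read`, and `parse` are invented for this example.

```python
import re

# Tokens: the two-character sugars first, then single-character ones, then atoms.
TOKEN_RE = re.compile(r",@|#'|[()'`,]|[^\s()'`,]+")

# Each piece of sugar is shorthand for a list headed by an ordinary symbol.
READER_MACROS = {
    "'": "quote",
    "`": "quasiquote",
    ",": "unquote",
    ",@": "unquote-splicing",
    "#'": "function",
}


def read(tokens):
    """Read one form from the front of `tokens`, consuming what it uses."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token in READER_MACROS:
        # Sugar wraps whatever single form follows it.
        return [READER_MACROS[token], read(tokens)]
    if token == "(":
        items = []
        while True:
            if not tokens:
                raise SyntaxError("missing )")
            if tokens[0] == ")":
                tokens.pop(0)
                return items
            items.append(read(tokens))
    if token == ")":
        raise SyntaxError("unexpected )")
    return token  # an atom, kept as a plain string


def parse(source):
    """Parse exactly one form from `source` into nested Python lists."""
    tokens = TOKEN_RE.findall(source)
    form = read(tokens)
    if tokens:
        raise SyntaxError("trailing tokens after form")
    return form
```

With this reader, `parse("(foo ,@(mapcar #'process xs))")` and `parse("(foo (unquote-splicing (mapcar (function process) xs)))")` produce the same nested list, which is the sense in which the unquote is "free".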


Anyway, I'm not sure how a solution featuring "one-pass parsing" would make sense, let alone look like. Maybe we could get away with only a lot of backtracking? Even then, I'm not sure.

I'm not arguing we should pre-process the code, and put in the AST text in place of the {{{ }}} just to parse it. That'd defeat the purpose of hygienic macros (mostly).
But I think we should act "as if" we just parsed a block, and in the action, just use the given AST fragment (<var> in my example).


I've thought about figuring out a syntax lisp-like, in that you could also use it for day-to-day use, like Lisp can use quote/quasiquote/unquote for arrays...
But as I've just said, we are not homoiconic – Elixir is not either, and they have macros, but their syntax is far less flexible ... Though it's worth it to know what they're doing, AFAIK they only allow to splice in expressions – you can't write 3 unquote(plus) 4, which makes it considerably easier for them.

@masak
Owner Author

@masak masak commented Jun 13, 2016

Not sure how that allows one-pass parsing, though ?

Well, hm, we'd know with a one-token lookahead... but that doesn't sound all that alluring now that I say it out loud.

Notably, the expression parser would still need to go "oh, that's a Q::Mumble::Handwave, that's a TTIAR. but is there a @ after it, then it's possibly OK!"

@masak
Owner Author

@masak masak commented Jun 13, 2016

Ok, new suggestion that really front-loads the unquote signal: unquote(qtype @ expr)

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

that doesn't really simplify anything over {{{, does it? (...except it doesn't look as ugly)

@masak
Owner Author

@masak masak commented Jun 13, 2016

But I think that, inherently, having splicing in a syntaxful language means you can't keep one-pass parsing. Or you're going to have to patch holes by yourself.

That's an interesting assertion that I don't immediately agree with. 😄

One way that could look like would be...

token if { 'if' <cond> '{' <block> '}' }

token cond:splice { '{{{ Q::Cond @'  <var> '}}}' }
token cond:expr { <expr> }

token block:splice { '{{{ Q::Block @' <var> '}}}' }
token block:simple { '{' <expr> + %% ';' '}' }

Yes, that's the worst-case solution to what we need to do with the grammar.

As we've already pointed out numerous times, this is not an issue at all in Lisp(s), because we're already done parsing when we do splicing.

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

So, it's "free" in the sense that it's not in any way special syntax. I think we've established that quite some time ago, but I think it's still worth pointing out.

Given that what we're implementing is a Perl and not a Lisp, are there any obvious benefits that carry over cleanly that I seem to be stubbornly ignoring? That question is often on my mind. 😄

@masak
Owner Author

@masak masak commented Jun 13, 2016

that doesn't really simplify anything over {{{, does it? (...except it doesn't look as ugly)

Oh, but it does, because unquote(qtype @ expr) introduces its three components in this order:

  • unquote (so we know we're out of the normal parse)
  • qtype (optional) (so we can quickly fail parses where we know the qtype is incompatible — example: having a qtype of Q::Expr where we're expecting a Q::Statement is fine, because of expression statements; vice versa is not OK because 007 doesn't treat statements as expressions)
  • expr

Compare this to {{{expr @ qtype}}} which:

  • front-loads the {{{ signal (good)
  • reads the whole expression, an unbounded amount of tokens
  • finds @ qtype and may need to reject the whole parse because of the above mismatch (bad)
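
The ordering argument above can be sketched in Python. This is an illustration only, under several assumptions: the token list is pre-split and contains exactly one unquote, the `PARENT` table is a guessed fragment of the Q type hierarchy, and `ALSO_ACCEPTS` encodes just the expression-statement rule mentioned above. None of these names come from 007 itself.

```python
class ParseError(Exception):
    """Raised on a failed parse; `consumed` is how many tokens were read first."""

    def __init__(self, message, consumed):
        super().__init__(message)
        self.consumed = consumed


# Guessed fragment of the Q type hierarchy (child -> parent).
PARENT = {
    "Q::Literal::Int": "Q::Literal",
    "Q::Literal": "Q::Term",
    "Q::Term": "Q::Expr",
}

# Extra acceptances beyond subtyping: an expression can serve as a statement.
ALSO_ACCEPTS = {"Q::Statement": {"Q::Expr"}}


def is_a(qtype, ancestor):
    """True if `qtype` equals `ancestor` or descends from it."""
    while qtype is not None:
        if qtype == ancestor:
            return True
        qtype = PARENT.get(qtype)
    return False


def fits(qtype, expected):
    """True if a Qtree of type `qtype` may fill a slot expecting `expected`."""
    if is_a(qtype, expected):
        return True
    return any(is_a(qtype, extra) for extra in ALSO_ACCEPTS.get(expected, ()))


def parse_unquote(tokens, expected):
    """Parse ["unquote", "(", qtype, "@", expr..., ")"]; the qtype part is optional.

    Returns (qtype, expr_tokens). A qtype mismatch is reported after three
    tokens, before any of the expression has been read.
    """
    if tokens[:2] != ["unquote", "("]:
        raise ParseError("not an unquote", consumed=0)
    if len(tokens) > 3 and tokens[3] == "@":
        qtype, body_start = tokens[2], 4
        if not fits(qtype, expected):
            raise ParseError(f"{qtype} cannot go where {expected} is expected", consumed=3)
    else:
        qtype, body_start = expected, 2  # no annotation: assume the slot's own type
    if tokens[-1] != ")":
        raise ParseError("missing )", consumed=len(tokens))
    return qtype, tokens[body_start:-1]
```

However long the expression is, the mismatch case fails with `consumed == 3`, whereas a trailing `@ qtype` would only be seen after the whole expression had been parsed.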
@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

Sorry, I meant "versus the possible {{{qtype @ expr}}} form".

@masak
Owner Author

@masak masak commented Jun 13, 2016

Oh yeah, that one's equivalent as far as the above argument is concerned.

Any reason I should prefer {{{qtype @ expr}}} to unquote(qtype @ expr) and not use this opportunity to ditch the universally hated {{{ }}} syntax? 😄

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

Absolutely none! Except maybe that it doesn't stand out as much – which is fine for Lisp, since it has no syntax, but might not be for Perl 6/007.

I thought {{{qtype @ expr}}} was one of the variations in your previous comment, but apparently it wasn't, so I stand corrected.

@masak
Owner Author

@masak masak commented Jun 13, 2016

Here's one reason I've fallen out of love with {{{ ... }}}: S06 talks a lot about custom delimiters for the quasi block, so you could have quasi < ... >, quasi ⟦ ... ⟧, etc. Nicely enough, you then use the delimiters (tripled) for the unquotes: <<< ... >>>, ⟦⟦⟦ ... ⟧⟧⟧. Cool.

But then the final ingredient which would make all that machinery make sense is missing: you can't nest those things. Specifically, you can't unquote "several levels up" in analogy with how you'd do flow control with loop labels:

quasi <
    quasi ⟦
        ⟦⟦⟦ ... ⟧⟧⟧;    # I'd expect this to escape out of the inner quasi
        <<< ... >>>;    # I'd expect this to escape out of the outer quasi
    ⟧;
>;

TimToady assures me that this doesn't work, shouldn't work, and will never work. The reason, as far as I understand it, has to do with proper nesting, which is something we value in the parser more than the above conjectural feature. The way the parser works is that it never violates proper nesting, and so you can't unquote < ... > if there's a ⟦ ... ⟧ layer in-between.

Given all that, I just don't see why anyone would choose to use a non-standard delimiter, except possibly to make their code look funny. And — given that I considered the above conjectural feature to be the chief argument for the {{{ ... }}} syntax — I don't see why we need to stick to the {{{ ... }}} syntax.

@masak
Owner Author

@masak masak commented Jun 13, 2016

Absolutely none! Except maybe that it doesn't stand out as much – which is fine for Lisp, since it has no syntax, but might not be for Perl 6/007.

Yes, that's perhaps one of the biggest unknowns with the unquote(qtype @ expr) proposal. Given masak's Deplorable Truth About Templating Syntax, I guess the empirical experiment here is "what happens if we make our placeholder syntax not be ugly and stand out so much?".

Let's find out! 😄

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

You needn't convince me about nesting – I argued about it quite a lot already! So, yeah, I don't remember believing in that S06 bit... but anyway, whatever you choose is fine.

We could try translating these:

From Graham's "On Lisp" (another one is in the same SO question):

(defmacro =defun (name parms &body body)
  (let ((f (intern (concatenate 'string
                                "=" (symbol-name name)))))
    `(progn
       (defmacro ,name ,parms
         `(,',f *cont* ,,@parms))
       (defun ,f (*cont* ,@parms) ,@body))))
@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

macro cont-defun(Q::Identifier $name, Q::ParamList $parms, Q::Block $body) {
  my $new-name = $name.scope.declare("cont-$name.bare()");
  quasi {
    macro unquote(Q::Identifier @ $name)(unquote(Q::ParamList @ $parms)) { # should the ParamList unquote be bare, without parentheses?
      quasi {  # an unquote inside of another one
        my $f = unquote(Q::Identifier @ unquote(Q::Identifier @ $new-name)); # ... just to make this more readable
        $f(unquote(Q::ParamList @ unquote(Q::ParamList @ $parms))); # same q. for ParamList
      };
    }; # end of inner macro
    sub unquote(Q::Identifier @ $new-name)(unquote(Q::ParamList @ $parms))
      unquote-unhygienic(Q::Block @ $body); # mmh... needs that "unhygienic" so that `$parms` are visible.
  };
}

Whew.

@vendethiel
Collaborator

@vendethiel vendethiel commented Jun 13, 2016

Seems like the github live update earlier had me miss one comment...

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

That's not exactly what I mean – but I was unclear, as usual. I meant that there's no "parser state" to be recovered after an unquote. You don't have to "insert" anything, to parse in a different way – it's not even really splicing at parse-time, it's a function call in the source code.

Hm, wait, isn't this trivially true for Perl 6 as well? Parsing happens first, and leads to an AST with a lot of Q::Quasi nodes in it, possibly with a lot of Q::Unquote nodes in them. Then (usually) a macro is invoked (still at parse-time, but at least later), and as part of this, things get spliced. Granted, we're not done with all the parsing, but we're done with the quasi we're splicing into. Or am I missing something?

Not quite, exactly because of that same "parser state". In Lisp, it's all function calls. Doesn't matter if you're splicing an operator or whatever else.

Given that what we're implementing is a Perl and not a Lisp, are there any obvious benefits that carry over cleanly that I seem to be stubbornly ignoring? That question is often on my mind. 😄

I'm not 100% sure what you're referring to here?

@masak
Owner Author

@masak masak commented Jun 13, 2016

I'm not 100% sure what you're referring to here?

Just making sure it's not the case that the Lisp advantages you like to highlight are blindingly obviously mappable onto the problems we're wrestling with in 007 (and Perl 6).

The very-early stages of Perl 6 macros consisted of "yuh, of course we're gonna have Lisp macros (and not dirty C macros, at least not just)". Later: "oh, huh, Perl is not Lisp, Perl is Perl". (And all the other languages that are not Lisp are also not Lisp, and all have slightly different readings of what "macro" means.)

With all that said, I really, really want to learn from Lisp and soak up all the experience with macros that's obviously there. That's why I keep trying to read Lisp code, and to grok what the macros in there are doing.

@raiph

@raiph raiph commented Jun 26, 2016

Completely rewritten

Hopefully you'll forgive some bikeshed coloring suggestions.

On less confusing and prettier quasi syntax

Claim: the keyword-and-block quasi { ... } syntax is confusing.

  • The thing that looks like a block isn't actually a block. It's a quote that happens to use the same delimiters as blocks. This is very misleading, imo.
  • The word "quasi" has a popular meaning that only quasi relates to quasi-quoting. The popular meaning is too general to be helpful.
  • The technical term quasi quoting means "a linguistic device in formal languages that facilitates rigorous and terse formulation of general rules about linguistic expressions while properly observing the use–mention distinction". That's on target but still too general to be helpful, imo, for someone who just wants to know what the construct they're looking at does, in layman's terms. Imo it's entirely appropriate to call it a quasi-quoting feature but not to use the word 'quasi' in the language. After exploring a bit, I've concluded I don't know of a word that would be better.

So...

Wikipedia:

Quasi-quotation is sometimes denoted using the ... double square brackets, ⟦ ⟧, ("Oxford brackets") instead of ordinary quotation marks.

So, strawman proposal, replace the quasi {} construct^1 with a circumfix operator such that:

⟦⟦ 42 + 99 ⟧⟧

quasi quotes the enclosed code.^1

(Or more brackets, with matching open/close count eg ⟦⟦⟦ say pi; say now ⟧⟧⟧^2 etc.)

For the Texas equivalent, use at least three pipes:

||| 42 + 99 |||
# or more pipes if desired: 
||||| say pi; say now |||||

On less confusing and prettier unquote syntax

Claim: the {{{ ... }}} syntax is confusing.

  • I'm not arguing it's ugly. I'm not arguing it isn't.
  • It's confusing. Because braces are associated with blocks.

So...

Wikipedia:

Quasi-quotation is sometimes denoted using the symbols ⌜ and ⌝ (unicode U+231C, U+231D) ... instead of ordinary quotation marks.

So, strawman proposal, use bracketing that includes a reversal of the ⌜ and ⌝ symbols:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

For the Texas equivalent use:

say $onething, ||^ $ast-arg ^||, $anotherthing;

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

@masak
Owner Author

@masak masak commented Jun 30, 2016

Hopefully you'll forgive some bikeshed coloring suggestions.

Consider yourself forgiven.

And please forgive me in turn for answering somewhat selectively, where I feel I have something to say. Feel free to take my silence on the rest as implicit agreement. Maybe.

  • The thing that looks like a block isn't actually a block. It's a quote that happens to use the same delimiters as blocks. This is very misleading, imo.

I'll 50% agree on this one. It's a block more often than people actually mean block. That is, in

quasi { x + 5 }

someone clearly meant to produce the expression x + 5, not the block { x + 5 }. The braces here are only a kind of delimiter. Our oldest open 007 issue is a little about that, actually.

On the flip side, though, it is a block in the sense that it holds in variable declarations.

quasi {
    my y = "oh, James!";
    say(y);
}
# no `y` variable here, nor outside where the macro/quasi gets expanded

This is part of "hygiene". I expect people are going to like that and be pleasantly un-surprised by that, because that's how braces/blocks behave.

  • The word "quasi" has a popular meaning that only quasi relates to quasi-quoting. The popular meaning is too general to be helpful.

Ok, but is it worse than using enum for enumerations or subset for subtypes or given for switch statements? I don't think so.

In some ways, it's too good a term to pass up, even when the match isn't squeaky-Lisp perfect.

  • The technical term quasi quoting means "a linguistic device in formal languages that facilitates rigorous and terse formulation of general rules about linguistic expressions while properly observing the use–mention distinction".

Would it help if I reformulated the above as "a neat way to build code, some of which comes in as parameters"? Because that's what it says, 'cept with more stilted language.

Quasi-quotation is sometimes denoted using the ... double square brackets, ⟦ ⟧, ("Oxford brackets") instead of ordinary quotation marks.

Yes, in logic. That whole article you're quoting refers to Quine's notion of quasiquotation in formal languages, that is, in mathematical logic. The Lisp-y backquote and the comma would have much bigger claims to quasiquotation syntax than ⟦ ⟧. Those are quite decent candidates in Lisp because they're "free for use" in Lisp. In Perl 6, the backquote has been reserved for DSLs, and the comma is used all over the place.

So, strawman proposal, replace the quasi {} construct^1 with a circumfix operator such that:

⟦⟦ 42 + 99 ⟧⟧

quasi quotes the enclosed code.

Not saying I endorse it, but what in your syntax would be the corresponding way to express quasi @ Q::ParameterList { x, y, z } ?

(Or more brackets, with matching open/close count eg ⟦⟦⟦ say pi; say now ⟧⟧⟧^2 etc.)

Unmatched footnote.

What's your use case for having more brackets? Do you conceive of something like ⟦⟦⟦ ... ⟦⟦ ... ⟧⟧ ... ⟧⟧⟧ ? When would that be needed, in your opinion? (See this comment in the same thread for some discussion about nested quasis and unquotes.)

For the Texas equivalent, use at least three pipes:

||| 42 + 99 |||
# or more pipes if desired: 
||||| say pi; say now |||||

Still not assuming I endorse the rest, this is not going to fly because the Texas equivalent of a matching pair of braces can not be a non-matching symbol. (Not by some hardcoded rule specified somewhere, granted, just... not.)

Claim: the {{{ ... }}} syntax is confusing.

  • I'm not arguing it's ugly. I'm not arguing it isn't.

Ok, then let me help you with that one: it's ugly. 😄

Question is more like, is that a (necessary) feature, or something we should fix?

  • It's confusing. Because braces are associated with blocks.

Again, blocks convey scoping and hygiene, something that isn't all that far-fetched with unquotes either.

Quasi-quotation is sometimes denoted using the symbols ⌜ and ⌝ (unicode U+231C, U+231D) ... instead of ordinary quotation marks.

You're proposing using a quasi-quotation symbol (from mathematical logic) to stand in for unquotes? You don't consider that a bit confusing?

So, strawman proposal, use bracketing that includes a reversal of the ⌜ and ⌝ symbols:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

I think that falls under "you think it's cute today". More specifically/constructively: we could have played around with outwards-facing brackets for ranges, like some mathematical textbooks do:

[1, 2]    # closed
]1, 2[    # open
[1, 2[    # half-closed, half-open
]1, 2]    # half-open, half-closed

But we didn't — nor a competing form with ranges like [1, 2) — because it completely messes up parsing. I think a ⟦⌝ ... ⌜⟧ construct should be avoided for much the same reasons.

Also, what in your syntax would be the corresponding way to express {{{ ast_arg @ Q::ParameterList }}} or — using the newer as-yet unimplemented syntax — unquote(Q::ParameterList @ ast_arg)?

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

One good thing about this footnote: I finally understood why the heck you proposed toast as a keyword back in your advent post comment. I guess what I'm saying is I personally didn't find "toast" as a term very transparent. 😄

@raiph

@raiph raiph commented Jul 15, 2016

[quasi is hygienic because it has closure semantics]
[this motivates use of { ... } syntax]

I consider these good arguments for using { and } as the delimiters of a quasi.

However, I was thinking that you intend to also support quasis that:

  • represent a block
  • allow unhygiene

Afaict supporting these, even if they're a much less common use case, would be good arguments for not using { and } as the delimiters of a quasi.

[is "quasi"] worse than using enum for enumerations or subset for subtypes or given for switch statements?

Yes, very much so according to google, wikipedia, and my English language sensibilities.^1

Would it help if I reformulated the above as "a neat way to build code, some of which comes in as parameters"?

The reformulation isn't helpful to me because I understand quasi quotation.

But yes, I think that reformulation is much closer to what mere mortals will get.

It also happens to coincide with the code suggestion that I cover at the end of this comment.

The Lisp-y backquote and the comma would have much bigger claims to quasiquotation syntax than ⟦ ⟧.

Sure. But:

In Perl 6, the backquote has been reserved for DSLs, and the comma is used all over the place.

So that nixes backquote and comma.

what in your syntax would be the corresponding way to express quasi @ Q::ParameterList { x, y, z } ?

I don't know. I'll respond to that in another comment.

What's your use case for having more brackets?

It's pretty weak.^2 Let's move on.

||| 42 + 99 |||

is not going to fly ... (Not by some hardcoded rule specified somewhere, granted, just... not.)

Fair enough. That was one of a couple of especially wild suggestions. :)

You're proposing using a quasi-quotation symbol (from mathematical logic) to stand in for unquotes?

Well, as I said, I'm strawman proposing. This is part of exploring what I see as problems with planned Perl 6 syntax. It didn't feel right to simply complain without strawman proposing something else.

But, yes, I see the mathematical usage to be so near identical in concept to quasi-quoting in programming languages that use of the same syntax seemed attractive and I see unquotes as basically the inverse of quasi-quoting.

That said:

say $onething, ⟦⌝ $ast-arg ⌜⟧, $anotherthing;

I think that falls under "you think it's cute today". More specifically/constructively: we could have played around with outwards-facing brackets for ranges, like some mathematical textbooks do ...
But we didn't ... I think a ⟦⌝ ... ⌜⟧ construct should be avoided for much the same reasons.

Fair enough. This was the other especially wild suggestion. :)

Also, what in your syntax would be the corresponding way to express {{{ ast_arg @ Q::ParameterList }}} or — using the newer as-yet unimplemented syntax — unquote(Q::ParameterList @ ast_arg)?

I'll defer a response to that for a later comment.

^1 If there must be a keyword, perhaps AST or to-AST or to-ast or toast rather than quasi?

I guess what I'm saying is I personally didn't find "toast" as a term very transparent. 😄

Indeed. That was part of my point.

I still think I'd find it easier to explain my mostly tongue-in-cheek suggestion "toast" to a mere mortal than "quasi". (To a small degree it was not entirely tongue-in-cheek; I thought it could perhaps be spelled to-AST or somesuch.)

Anyhow, I've since come up with another strawman proposal:

code [[ x + 5 ]]

That is, the keyword code followed by any amount of whitespace, followed by [[ without whitespace, code, and then matching ]].

Of course, the [[ could be misunderstood by a human as the opening of a literal array in an array, in a manner similar to { being misunderstood by a human as being the opening of a block or hash. But I like that it allows for ⟦ ⟧ as a Unicode equivalent. (I was surprised you complained that ⟦ ⟧ is for math quasi quotes, not programming quasi quotes, and have not yet let go of the possibility that you will have a change of heart so that you see this correspondence as a positive.)

And for unquotes I strawman propose:

code [[
    x + {{ y }}
]]

If a user actually means to write {{ ... inside a code [[ ... ]] to mean the start of a hash in a hash or a hash in a block or whatever they must disambiguate by adding whitespace between the braces.

If you really don't like the above, please consider waiting a week or two before replying. :)


^1

Data: For me a google for "enumeration" yields 100+ million matches. "commonly used in mathematics and computer science to refer to a [ordered] listing of all of the elements of a set.". Several mainstream languages use enum as their enumeration declarator keyword. (I recall you referring to confusion if we use "enum" to refer to both the overall declaration and individual elements, and I concur. But that's a different issue.)

Conclusion? "enumeration" is a common English word. It has exactly the right meaning for a programming language enumeration. enum is a common choice as a programming language enumeration declarator. Using the same keyword in Perl 6 makes sense.

Data: For me a google for "subset" yields 40+ million matches. "a part of a larger group of related things. ... a set of which all the elements are contained in another set." In Perl 6 a subset is a particular kind of subtype, one that is guaranteed to only be a subtype by virtue of being a subset of the allowed values (so a subset never changes any operations).

Conclusion? "subset" is a common English word. It has exactly the right meaning for a subtype that's a subset. Using the keyword subset makes sense.

(Likewise "given" is also a common English word whose meaning is well suited to the use made of it in Perls.)

Data: For me a google for "quasi" yields 200+ million matches. "seemingly; apparently but not really. ... being partly or almost.". So, a popular word, used more than enumeration and subset combined, but with a meaning that isn't at all suggestive of what a quasi quote is when it's used without the word "quote". And the #1 result when I search for the highly specific phrase "quasi-quotation" is the Wikipedia page on "quasi-quotation" which all but ignores programming language use of the concept.

Conclusion? "quasi" is a common English word. It does not remotely hint at what it might mean in Perl 6. Even the relatively obscure term "quasi quotation" is a poor fit. I was hugely surprised by your suggestion that the keyword choices "enum" and "subset" are of about the same quality as "quasi". Hopefully the foregoing has opened your mind about this bit of bikeshedding. You may be in love with "quasi", and it makes sense in 007, but I would like to think that Perl 6 macros will aim at avoiding unnecessarily weird color schemes.

^2 Assuming a Texas variant of quasi quoting existed, and that it was several [ and matching ], it might entail confusion if the contained expression also used the same bracket characters (human confusion, not parser confusion) so it might be nice to allow users to use as many multiple brackets as they liked.

masak pushed a commit that referenced this issue Jul 17, 2016
Carl Masak
This also starts making use of an optional parameter to the
`unquote` rule, passing along the expected type in the context
so that we can fail earlier, inside of the rule instead of
outside it.

This will go some way to getting out of the quagmire described in
#30 (comment) --
it was enough for Q::ArgumentList, for example, which clashes with
term position -- but I'll be quick to note that it doesn't
address the problem at its root. Notably, when we do hit the type,
at the end of the unquote, we've already had who-knows-how-many
side effects along the way from the stuff inside that we parsed.

In other words, something like `unquote(qtype @ expr)` would
still be beneficial, maybe even necessary, in order to finish up
issue #30.

@raiph raiph commented Feb 5, 2017

Let me lead by saying how much I appreciate your probing questions and comments. They mean a lot, and they do push me in the right direction; thank you.

Gracious words and you are of course welcome.

(Also, maybe write a comment here? A couple years ago I posted a link to the 007 project but I'm thinking it might work for you and for redditors if you posted a friendly, casual, intimate, macro-enthusiast-to-macro-enthusiast, fresh account of what's been on your mind over the last month or two (and then repeat this casual commenting on /r/programminglanguages every month or two in future).)

Some further comments and loose ends:

{{{qtype @ expr}}}
Q2b. If expr was always of a type known at compile time

There's a fairly evolved type checking story waiting in the wings for 007. It's laid out at #33

Duly read.

... if the program would compile-fail with type checking off, then it must also compile-fail with type checking on.

That makes sense out of context but I'm not understanding why you're mentioning it in this context.

What I was getting at was that if, say, expr was one of the macro's arguments, then you could know that its type is as declared in the macro's signature. And if not, you could, presumably, "know" it's at least a qtype. (If it turns out at run-time that it isn't, the fault lies with the macro code.)

Q3b. Do some/all qtypes map to selected ast generating action methods?

As far as I can tell, macros replace action methods.

Yeah. I'd gotten confused. Now I've (re)read the Three Types of Macro I'm less confused. :)

Parenthetically: If I could, I would get rid of None and move to Maybe types instead.

(Tangentially, have you read this wikipedia write up about Perl 6 types?)


I hope to get time in the next week or so to (re)read and give feedback on the q types doc and the three types of macro in their respective comments areas.

@masak masak commented Feb 5, 2017

(Also, maybe write a comment here? A couple years ago I posted a link to the 007 project but I'm thinking it might work for you and for redditors if you posted a friendly, casual, intimate, macro-enthusiast-to-macro-enthusiast, fresh account of what's been on your mind over the last month or two (and then repeat this casual commenting on /r/programminglanguages every month or two in future).)

Sounds like a great idea.

I'll try to find time for it. Generally, I find that taking the time to explain 007, including finding examples with good traction, benefits everybody. The problem, as usual, is finding the time.

... if the program would compile-fail with type checking off, then it must also compile-fail with type checking on.

That makes sense out of context but I'm not understanding why you're mentioning it in this context.

What I was getting at was that if, say, expr was one of the macro's arguments, then you could know that its type is as declared in the macro's signature. And if not, you could, presumably, "know" it's at least a qtype. (If it turns out at run-time that it isn't, the fault lies with the macro code.)

The issue is with making the parser suddenly succeed some previously failed programs based on type information. That's what I was getting at. It's not that it can't be done, it's just that that wouldn't make the type annotations "orthogonal to everything else".

(And that's before considering the (much more serious) timing issues I mentioned between parsing and type inference.)

Parenthetically: If I could, I would get rid of None and move to Maybe types instead.

(Tangentially, have you read this wikipedia write up about Perl 6 types?)

I hadn't. I'm surprised at the claim that Perl 6's type objects would be in some way equal to the Option type. To me type objects are much closer to a null reference, with all of the associated problems. I can see how there's a superficial resemblance, though.

The piece I feel is missing from Perl 6 to qualify — and it's missing because Perl 6 simply isn't that kind of language — is forcing you to consider undefined values during compile time. That is, if your type allows type objects (as most do), then you simply cannot handle the value without also taking the type object case into account. That's not Perl 6's approach at all; it gives the programmer the freedom to ignore the null case and/or the responsibility to remember it when it matters.

@raiph raiph commented Feb 6, 2017

Generally, I find that taking the time to explain 007, including finding examples with good traction, benefits everybody. The problem, as usual, is finding the time.

I hear that. I think you've misunderstood what I was suggesting and/or its possible consequences. This is well off topic so I'll follow up about this elsewhere than this github issue.

The issue is with making the parser suddenly succeed some previously failed programs based on type information.

OK. My "duly read" of #33 didn't mean I understood any of what I was reading. :) I've posted a question there in an attempt to catch up.

I'm surprised at the claim that Perl 6's type objects would be in some way equal to the Option type.

I saw hints that you wouldn't agree with it (which is why I mentioned it) but I'm surprised you're surprised. But this is way off topic so I'll follow up elsewhere.

@masak masak commented Sep 23, 2017

I recently switched on typechecking of all properties initialized in the objects we're creating in 007. It makes everything run a lot slower, but it's led to some interesting insights along the way.

I went through all of our types, including all our Qtypes, and wrote the allowed types for all their fields. The result is here. Most of these are extremely straightforward; the type of Q::Dict's propertylist is Q::PropertyList, for example. Some of these are optional, which in effect means that the value None is allowed in that field. I solved that by annotating these with (fake) union types; for example, the name field in Q::Term::Sub is typed as Q::Identifier | NoneType, because a sub term doesn't have to have a name. (I might come back and change that to being an :optional property on the field object instead, but the end result will be the same. We'll keep using None as "the absence of some actual value here"; that's what it's for.)

The most complicated one we have is the else property in Q::Statement::If, which is Q::Block | Q::Statement::If | NoneType. The three types represent, respectively, an else block, an elsif, and neither of those.
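To make the field typing concrete, here is a minimal sketch in Python rather than 007. It is not 007's actual implementation: the class names, the FIELDS table, and the reduced field sets are all assumptions made for illustration. Each class lists the allowed types per field, and an optional field simply includes NoneType in its union.

```python
NoneType = type(None)


class Q:
    """Base class for the (hypothetical) Qtypes in this sketch."""

    FIELDS = {}  # field name -> tuple of allowed types (a "fake" union)

    def __init__(self, **props):
        for name, allowed in self.FIELDS.items():
            value = props.get(name)  # a missing property counts as None
            if not isinstance(value, allowed):
                expected = " | ".join(t.__name__ for t in allowed)
                raise TypeError(
                    f"{type(self).__name__}.{name}: expected {expected}, "
                    f"got {type(value).__name__}"
                )
            setattr(self, name, value)


class QIdentifier(Q):
    FIELDS = {"name": (str,)}


class QBlock(Q):
    FIELDS = {}


class QTermSub(Q):
    # A sub term doesn't have to have a name, so None is allowed.
    FIELDS = {"name": (QIdentifier, NoneType), "block": (QBlock,)}


class QStatementIf(Q):
    pass


# Assigned after the class exists because the union refers to the class itself.
QStatementIf.FIELDS = {
    "block": (QBlock,),
    # else block, elsif, or neither of those
    "else_": (QBlock, QStatementIf, NoneType),
}

anonymous_sub = QTermSub(block=QBlock())
elsif_chain = QStatementIf(
    block=QBlock(),
    else_=QStatementIf(block=QBlock(), else_=QBlock()),
)
```

Anything outside the union, such as a string in else_, is rejected at construction time with a TypeError. That is the kind of failure the two tests ran into.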

I got two test failures from doing this. One was from format.007, bless its heart. (It's to date our most realistic use of macros in examples/.) It made me have to put in Q::Unquote in the type union on this line.

That Q::Unquote is needed because of this quasi in format.007.

It immediately felt wrong to have to put a Q::Unquote in that type union. The "steady state" of starting to do that would be basically to put Q::Unquote almost everywhere in Qtype fields. There's something wasteful and repetitive about the whole idea. It indicated to me that I was missing something.

On the other hand, putting those Q::Unquote in was what I had to do, in the short term, to make creating those objects work again.

This was a couple of days ago. I've spent those days (and a long-haul flight) pondering over what to do about this. I've found out the answer, and it feels like it also answers a bunch of unrelated questions we've had about quasis and unquotes. ("The Answer Will Shock You"™)

To be continued.

@masak masak commented Sep 23, 2017

It's worth spending some time at what I thought was a right solution, but which turned out to be a wrong one. There's value in postmorteming things that don't pan out.

A fairly natural response to the fact that unquotes don't "fit" in the Q::Expr-shaped hole in operand in Q::Postfix is "ok, then we make it fit". Maybe Q::Unquote needs to subtype Q::Expr?

That doesn't sound exactly right — Q::Unquote would need to subtype most Qtypes in order to fit everywhere. It would need to be some kind of bottom-ish type.

A more promising idea might be to think of Q::Unquote as being generic in the Qtype it eventually spits out. (Cf #182.) In other words, make it a Q::Unquote<T extends Q>.

But... um. If the goal is still to make a Q::Unquote-shaped peg fit in a Q::Expr (or whatever)-shaped hole, then Q::Unquote would need to extend its generic type parameter:

class Q::Unquote<T> extends T {
    # ...
}

Ew. I mean, it'd be sort of awesome if that worked, but... I'm not aware of a language that does that, at least not this side of the FP fence. I'm 100% sure Java disallows it, 'cus type erasure. I'm reasonably certain C# wouldn't like it either. When I try in TypeScript, it informs me I'm trying to use a type as a value, which sounds about right.

Ultimately, both the generic-types lead and the preceding extend-everything lead are wrong, by this simple demonstration: is a Q::Unquote a kind of Q::Literal::Int? By assumption, yes. Well, what characterizes a literal int? It has a value property with type Int, denoting the literal value of the literal int. What's the value of a Q::Unquote<Q::Literal::Int>? There can't be any, because an unquote is not a literal int yet. Barbara Liskov's ghost appears and gives us a stern look, and we realize our assumption is wrong — Q::Unquote is not a type of Q::Literal::Int, even when it ultimately results in one.
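The substitutability problem can be sketched in a few lines of Python. The class names below are hypothetical stand-ins, not 007's Qtypes. The subclass passes an isinstance check, but any code that relies on the literal's value breaks.

```python
class IntLiteral:
    """Stand-in for a literal int node: it always has a value."""

    def __init__(self, value: int):
        self.value = value


class UnquoteAsIntLiteral(IntLiteral):
    """Hypothetical unquote that claims to be an IntLiteral but has no value yet."""

    def __init__(self, expr_source: str):
        # Deliberately no super().__init__(...): there is no value to pass along.
        self.expr_source = expr_source


def double(lit: IntLiteral) -> int:
    # Written against the IntLiteral contract: every literal has a value.
    return lit.value * 2


hole = UnquoteAsIntLiteral("some_ast")

try:
    double(hole)
    broke_contract = False
except AttributeError:
    broke_contract = True
```

Here isinstance(hole, IntLiteral) is true, yet double(hole) fails. The subtype relation promises something the unquote cannot deliver.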

I had other ideas along the way, like maybe I could have a separate Qtype hierarchy for Qtypes that can appear in quasis. Not so appealing either.

To be continued.

@masak masak commented Sep 23, 2017

This summer I was writing code for a client related to HTML templating. The templating engine was Handlebars, which is a fairly reasonable one, allowing the template author to put directives around elements for for loops and if statements and the like.

By some kind of comfortable default, we were also making use of a WYSIWYG HTML editor. Actually, it wasn't just HTML, but XHTML, that useless extra strictness that people think they need but never really do. The editor was helpfully "fixing" illegal HTML in the templates, essentially creating a conflict of interest between the WYSIWYG editor and the Handlebars template. In particular, a fairly common thing to want to do in Handlebars is to {{#each}} some <tr> table rows. But the editor (correctly) noticed that there was some textual content inside a <tbody> but outside of a <tr>, and moved that text into its own <p> outside of the table. The template broke.

It took me a few days to distill what was wrong down to the surprising essence: Handlebars is not HTML. Handlebars is one level of indirection away from being HTML. It's unexpected because Handlebars kind of looks like HTML... just like an IKEA instruction for how to assemble a sofa kind of looks a little like a sofa — but it's not a sofa.

(We ended up disabling the WYSIWYG editor. It only works for HTML, after all.)

Generalizing, a templating language for some language X is not X. What it is, is one level of indirection away from X. It's X but minus a function application. Now, let's apply that to quasis.

To be continued.

@masak masak commented Sep 23, 2017

Skipping right to the conclusion:

  • I don't think Q::Unquote should be a Qtype at all.
  • I don't think Q::Term::Quasi should contain a Qtree, because a quasi does not contain 007 code.

A quasi contains a template for 007 code, which is not 007 code. There's no (good) way to encode 007 code as a Qtree, because there are freaking holes in it. The unquotes are not made of Qtree, they are made of hole, which does not encode well as Qtree.

How should we encode the quasi? Going back to the IKEA instruction, a quasi is an assembly instruction. It's a functional mapping from zero or more ASTs, to an AST. It's... an IIFE. A sub declaration and a call to that sub. The parameters and the statements of the sub are what we want to encode.

The way this solves the problem of "need to put Q::Unquote everywhere" is that the Qtrees don't actually get constructed until the unquotes are replaced by actual ASTs. All we ever get in 007 land is real, honest-to-grog Qtrees.
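As a rough illustration of "a quasi is a function from ASTs to an AST", here is a sketch in Python. The node classes and the quasi_add_one function are invented for this example; they only stand in for what a quasi with one unquote might be encoded as. No hole-shaped node type is needed, because the tree is only built once the argument AST is supplied.

```python
from dataclasses import dataclass


class Expr:
    """Marker base class for expression nodes (hypothetical, for this sketch)."""


@dataclass(frozen=True)
class Identifier(Expr):
    name: str


@dataclass(frozen=True)
class IntLiteral(Expr):
    value: int


@dataclass(frozen=True)
class Infix(Expr):
    op: str
    lhs: Expr
    rhs: Expr

    def __post_init__(self):
        # Every operand must already be a real expression node.
        for side in (self.lhs, self.rhs):
            if not isinstance(side, Expr):
                raise TypeError(f"operand must be an Expr, got {type(side).__name__}")


def quasi_add_one(operand: Expr) -> Expr:
    """Roughly what a quasi like `{{{operand}}} + 1` could be encoded as:
    the unquote becomes a parameter, the quasi body becomes the function body."""
    return Infix("+", operand, IntLiteral(1))


tree = quasi_add_one(Identifier("n"))
```

The field types of Infix never have to mention an unquote type. A wrongly typed argument fails when the function is applied, which corresponds to the moment the unquote is filled in.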

To be continued with one last installment.

@masak masak commented Sep 23, 2017

The "quasis are templates are functions" interpretation is so obviously correct that I'm a bit dizzy with happiness. I'm glad I did the types thing that made me realize this. (I'm also kinda glad I lost my phone in China, because I've had some wonderful time to focus when I haven't been distracting myself with my phone. True story.)

There are some more or less immediate consequences of this:

  • Just as we can eventually get rid of Q::Unquote, we can also get rid of Q::Unquote::Prefix, Q::Unquote::Infix, and Q::Unquote::Postfix. That makes me, if anything, even happier. Those were a blemish upon the earth. They were necessary because of the nesting structure of operators, but they weren't really useful for anything. They will be replaced by just the proper IKEA assembly instructions.

  • I meant to post a separate issue about "should the inside of unquotes be able to see their surrounding quasi?", but I now believe that falls under this issue. The answer is no, because (again) the thing inside the quasi is not 007 code. (The thing inside the unquotes, however, is.) If the content in quasis isn't proper 007 code, then it also doesn't have a lexical environment. So there's nothing for the unquote to see.

To be clear, we're talking about this situation:

my name = "outside";
quasi {
    my name = "inside";
    {{{ new Q::Identifier { name } }}}
};

I'm saying the identifier has the name "outside", because (by the time the quasi is fully constructed into a 007 AST) the statement my name = "inside"; isn't 007 code yet. (It could be in this case, but it can't be in the general case.)

Even if we did allow the above — and I really don't see why — there would be fatal timing issues involved. The unquote gets evaluated at macro time. The quasi code gets evaluated Late. Even if we BEGIN'd the quasi code, after #216 gets fixed, it'd still run too late. So there's no reasonable way an unquote could benefit from seeing the inside of a quasi.

I'm tempted to make this one an error message, even. If we don't find a binding for the variable outside the quasi, but there's one inside, then we should emit a polite version of "what is it you think you're doing?".

  • The thinking about the exact role of quasis and unquotes has made me want to make the following syntax alteration: currently we write quasi @ Q::Term { ... } and {{{Q::Term @ ... }}}. I propose we change the former syntax to quasi<Q::Term> { ... }. The syntax is meant to be strangely consistent with generic types — which 007 doesn't have but might get sometime in a distant future. Instead of equating what goes on with a quasi to what goes on with an unquote, it sets them apart. That's intentional: the quasi<Q::Term> is a signal primarily to the type system, and the {{{Q::Term @ ... }}} is a signal primarily to the parser. (The former one still affects the parser, though. Hm.)

Ok, I'm done now.

@raiph raiph commented Sep 24, 2017

Clarity!

What about just Q::type { ... } instead of quasi<Q::type> { ... }?

@masak masak commented Sep 25, 2017

Since Q::type { ... } would occur in term position, that'd collide with the syntax for regular terms: Q::type. Of course we could special-case that syntax by looking ahead for a block after the Q::type thing. We've been down that path before — the syntax for new Q::Identifier { ... } used to be Q::Identifier { ... } in 007. See #147. Nowadays I consider keywords like new and quasi to be sensible "guards" in the grammar that serve to disambiguate the parsing rule early so that there's less ambiguity.

Putting Qtypes in the Q:: hierarchy is just a convention. Someone could do class A::B::C::DefyingConvention extends Q and it'd be a Qtype. So any identifier we parsed we'd have to do the lookahead.

Lastly, though I seem to be in a minority in this thread to do so, I kind of like the keyword quasi and its connotations. (😉) It stays for now.

@masak masak commented Sep 25, 2017

Oh! Here's another thing I wanted to mention. Besides the slogan

Quasi quotes are not 007, they're a template.

that I mentioned above, I would also like to champion the slogan

Quasiquotation is a slang.

The two features that I usually associate with slangs fit well with quasiquotes:

  • They return a value out to their parent environment.
  • They allow mixing in "blocks" of the parent environment.

But there's a third one too that I hadn't kept at the front of my attention but that seems blindingly true with quasis:

  • They have different runtime semantics; essentially a different "runloop".

In the case of quasis, the runloop is simply a sub that evaluates to a Qtree.

I don't know if it means that all slangs are templates... I guess time will tell.

@raiph raiph commented Sep 30, 2017

FYI, I just began reading A Guide To Parsing; in it, Federico explains how "In a way parsing can be considered the inverse of templating".

@masak masak commented Oct 2, 2017

Well, yes — parsing extracts a (possibly nested) model from flat text, and templating produces flat text from a (possibly nested) model.

But also, in subtle ways, no... which is why Federico writes "in a way", I guess. 😛 For one thing, spooling a flat structure from a deep model is algorithmically very different from deriving a deep model from something flat. There's a fair amount of guessing/backtracking when you do the latter which is missing from the former.
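A toy sketch in Python shows the asymmetry. The nested-list model and the S-expression-like text format are assumptions chosen for brevity and have nothing to do with 007's grammar. Rendering is a plain recursion that cannot fail. Parsing has to tokenize, track its position, and deal with malformed input. This grammar is simple enough that no backtracking is needed, which a realistic one would not guarantee.

```python
def render(model):
    """Templating direction: nested model -> flat text."""
    if isinstance(model, list):
        return "(" + " ".join(render(item) for item in model) + ")"
    return str(model)


def parse(text):
    """Parsing direction: flat text -> nested model."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing ')'")
            pos += 1  # consume the closing parenthesis
            return items
        if token == ")":
            raise SyntaxError("unexpected ')'")
        return token

    result = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return result


model = ["+", "x", ["*", "y", "2"]]
flat = render(model)
round_tripped = parse(flat)
```

The round trip recovers the model, but only the parsing direction has error paths at all.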

The one point that stands out from skimming the document is that Perl 6 (and 007) are languages such that scannerless parsing is the only option. That's because "where the lexer ends and the parser begins" becomes such an overriding issue that scannerless becomes the only reasonable option. And that's because "you have to know which language you're in" in order to even do lexing properly — in other words, lexing is tied up with and dependent on the parsing up to that point. Not all languages are like that, but it feels reasonable (maybe inevitable) that a syntax-extensible language with macros and slangs be like that.

I liked the text (having skimmed it). Thanks for throwing it my way. 😄

@masak masak commented Oct 2, 2017

And also, yes, quasis (being templates) are the "inverse" of parsing in the sense that when you stuff a string into an EVAL call in Perl 6, the compiler has to parse that string as a new compunit... but when you put something in a quasi, it's being parsed (to the extent possible) along with the rest of the program, and so you've "saved" a parse step at runtime compared to EVAL.

@masak masak mentioned this issue May 26, 2018

@raiph raiph commented May 29, 2018

Brainstorm incoming. I tried to read the entire thread before commenting but some dude raiph hijacked the thread with bikeshedding and I lost heart wading thru it all. I hope I'm not about to repeat his mistake but we are all one after all so I apologize if there ain't that much difference between the two of us.

Also, I'm pretty sure the following will be completely invalid and/or stupid in multiple places, but I'm just going to spew it out rather than pause to worry about that because that can kill the flow and you were gracious about his stuff, so why shouldn't I just post what I've got right now?

#Strawman proposal 1

Macros are their own slang, analogous to the way rules are. The macro slang is almost identical to the main language except it introduces a couple critical tweaks. It's essentially a template language where most of a typical template uses exactly the same syntax as the main language.

Macros with unquotes must initially mention their unquotes (with possibly one exception which I'll get to) as either parameters of the macro in a parenthesized list at the start, or via variable declarations.

Each such mention of an unquote / parameter must use a prefix which I'll call a sigil. I'm thinking symbols like $, @, &, etc. -- I'm sure you get my drift. The use of unquotes must use the same identifier as the initial mention, including the sigil.

So, for example:

macro foo (infix &infix, term $foo) {
    my $a = 42;
    quasi { $a &infix $foo }
}

The infix, term etc. must be, or be subtypes of, the AST type in the main language that correspond to named grammatical categories in the macro slang (and also, quite plausibly, in the main language too).

#Strawman proposal 1.1

The default type for macro parameters depends on their sigil:

  • $ sigil ---> term;

  • & ---> callable (not committing to whether it's a listop, infix, or postfix);

  • @ ---> code -- a list of AST fragments;

  • % ---> ???

#Strawman proposal 2

Replace use of the word quasi with [[ ... ]]. This would be a useless doubling of the square brackets in the main language so is arguably available for Perlish repurposing. {...} declares a fragment of the main language, including the braces. These mean different things:

macro foo [[ my &b = { my $a = 42 }; my $a = 99 ]]
macro foo { my &b = [[ my $a = 42 ]]; my $a = 99 }
macro foo [[ my &b = [[ my $a = 42 ]]; my $a = 42 ]]
macro foo { my &b = { my $a = 42 }; [[ my $a = 42 ]] }

The unicode characters ⟦ ⟧ may be used as an alternative to [[ ]].

#Strawman proposal 3

Let the main language parse anonymous [[ ... ]] as IIMEs -- immediate inline macro expressions. And perhaps -> $foo [[ ... ]] as shorthand for macro ($foo) { ... [[ ... ]] }.

Am I completely crazy?

@masak masak commented Jun 23, 2018

Hi, sorry for dropping the ball on this one. Will attempt a reply now.

@masak masak commented Jun 23, 2018

Here's hoping I manage to write this without coming off as exceedingly nitpicky or grumpy. 😄 @raiph, the summary is that you are addressing issues which (I've come to understand) are important in your mind, but not so highly prioritized in mine. (I'm still glad that you're inventing in this space, don't get me wrong. Keep it coming!)

Because I know that summary is not nearly sufficient, let me list some specifics:

  • I'm kind of happy I have managed to keep sigils out of 007. I'm not hating on Perl's sigils as such, I think they're neat. It's more like they are the kind of extra convenience that 007 eschews in the name of simplicity. It gets that after Python, I guess.

  • TimToady also suggested a sigil for macro arguments, instead of unquotes. He suggested one sigil (¤) instead of three-four, but besides that, his suggestion reminds me of this proposal. I'll write now what I wrote then: it introduces sigils without addressing the challenges the parser actually has with unquotes. As such, it feels more like handwaving than a solution.

  • To be specific, this proposal hopes that the infix keyword in the parameter list of the macro will propagate to make the &infix variable parse correctly as an infix. (Correct me if I'm wrong.) I fail to see how this is better than the current status quo, providing that information to the parser locally where it needs it in the form of a typed/"categorified" unquote: {{{ Q::Infix @ infix }}}.

  • Any solution where information needs to be propagated along the AST in order for the parser to do its job makes the parser that much harder to write, because it deliberately mixes typechecking into parsing. Besides being technically harder, it just feels intuitively odd to me. In Java when you compile, first you get a bunch of syntax-related errors because the compilation unit doesn't parse cleanly. Once you've fixed those, you start getting type errors. In XML, first the parser needs the XML to be well-formed, and only then can we even talk about it being valid. A non-well-formed document is not even invalid; it's a category error to talk about its validity. Similarly (I feel), a computer program should be syntactically correct before we start analyzing its types. Sorry for repeating myself, but I feel this just as strongly as last time. I hope I at least expressed it more clearly this time around.

  • The proposal eradicates the distinction between unquoting/using and mentioning an AST fragment inside of a quasi. (It's been an uphill battle sometimes to defend this as a feature, not a bug. See examples/name.007 for a case of where we unabashedly mention rather than use. I'd hate to see that go.)

  • Introducing sigils for macros makes the solution inherently non-portable into Perl 6, since Perl 6 already uses sigils for indicating container type families.

  • As someone who still has no problem at all with the term "quasiquote" or the keyword quasi, the suggestion of [[ ]] and ⟦ ⟧ comes off as a rehash of previous suggestions. (Which I dismissed back then.) But let me give my main beef with it: double square brackets are not an unlikely-enough syntactical occurrence in macro-free programs. They happen when you want to write a 2D matrix, or a table of stuff, or a chess board... "We can't just waltz in and mess up the main language in the name of macros and quasis." Cf. masak's Deplorable Truth About Templating Syntax — quasis and unquotes are ugly because they need to stay far apart from the main language.

  • I don't see the double square brackets [[ ]] being chosen for any reason other than its (perceived) availability. The curly braces in quasi { }, on the other hand, are consistent with the fact that the quasi is a block and a scope. Braces delimit scopes in 007/Perl; brackets don't.

  • Similarly, using braces { } to mean "take us back outside of the quasi slang" is ruled out because braces are commonly used for all kinds of blocks inside the quasi'd code.

  • Not sure we need immediate inline macro expressions for anything, but the suggested syntax is inconsistent with a long tradition in Perl and other languages of being able to write out an abstraction without it being automatically applied.

That's it. Again, sorry if this comes off as just finding flaws — I hope I'm getting across that the above are not so much reactions from me as they are from the realities of syntax, parsing, compilation, and Perl. (And, likely as not, my own view of things is limited too and there are adequate responses/workarounds to what I consider problematic.)

@masak masak commented Oct 3, 2018

Although I haven't much shifted my stance on whether "quasiquote" (and its current keyword/syntax) is easy to understand or optimal...

...I realized one thing the other day that sets 007 (and Perl 6) apart from Lisp: in 007 (and Perl 6) there's quasiquotation, but no quotation. (See here for a quick explanation of quotation and quasiquotation in Lisp.)

We don't have a construct, analogous to quasi, for quoting code. The closest we have is a quasi without any unquotes in it.

Again, this doesn't much change my stance as to the appropriateness of these terms in 007 and Perl 6... but it does explain to me parts of what people (maybe?) find distasteful about the term "quasiquote" in 007 and Perl 6.
