Skip to content

Qi Meeting May 3 2024

Siddhartha Kasivajhula edited this page May 17, 2024 · 9 revisions

Nothing for a Creator to Do

Qi Meeting May 3 2024

Adjacent meetings: Previous | Up | Next

Summary

Ben presented "curlique syntax," a new, compact way to embed flows, to general acclaim. Sid showed an experiment using curlique syntax in a #lang and met stiff resistance! Dominik presented plans for the architecture of compiler passes that would allow us to easily swap different implementations of the same compiler pass. We discussed the scope of deforestation and the prospect of handling multiple values.

Background

Ben has been working on Frosthaven Manager for some time. This is a large GUI application exhibiting a very cool "functional core, imperative shell" architecture and it makes heavy use of Qi together with Gui Easy. Ben has experimented with writing macros to make working with flows easier in this specialized application in the past, and developed the more general curlique syntax this week.

At various times, we've talked about whether it would make sense to write a #lang qi of some kind. We haven't found a compelling need for it in prior discussions, though the prospect of improved idiomatic performance over Racket for specialized cases (with the release of the optimizing compiler in Qi 4) was the first clear benefit that such a language could provide. It still didn't seem compelling on its own, however.

The Qi compiler is already extensible today, allowing the addition of compiler passes by maintaining a compile-time registry of passes. But providing multiple implementations of the same pass today would require duplication of a lot of the pattern matching logic.

Curlique

Curlique syntax is a new, compact way to embed a flow in Racket. By using the paren-shape functionality in syntax-parse, it converts uses of {} to an embedding of a flow. But crucially, it's not just a simple embedding. In addition to the basic transformation:

{f} → (flow f)

… Curlique also simplifies the use of common Qi forms by eliminating a layer of context they would ordinarily introduce (in the form of parentheses) to embed them directly in the flow specification. For instance:

{~> f ...} → (flow (~> f ...))

It's analogous to the existing embeddings of ~> and ~>> directly into Racket, but within the context of flow specification where these do not compete for space with other Racket identifiers. Ordinarily we are not in a position to make this transformation since translating something like (flow ~> f ...) to (flow (~> f ...)) is neither useful nor is it aesthetically pleasing. But curlique makes it natural.

The specific shape of the syntax (e.g. use of braces) is secondary to the conceptual economy it engenders, the latter of which is what makes it compelling as a language feature.

Implementation

Curlique is currently implemented as an #%app transformer rather than as a reader extension. This has some tradeoffs.

Operating at the expander level, it is able to specifically transform cases that are syntactically "applications," rather than just any form. Thus, we can continue to use (), [], and {} interchangeably in cases that are not applications (for instance, let binding clauses, cond clauses, etc.).

On the other hand, it does mean that Qi forms must be explicitly embedded to eliminate wrapping context (e.g. the ~> and ~>> embeddings we talked about).

Doing it via a reader extension would mean that {} would specifically indicate curlique syntax, and use of it in other cases (such as cond clauses) would be an error. But it would be able to implicitly promote all Qi subforms, including ones that haven't been defined yet (TODO: why couldn't the #%app macro do this?).

Another issue is that something like this wouldn't work in the #%app version:

(map {thunk 1} (list 1 2 3))

Since thunk seemingly occupies an #%app position, one might expect this to result in a list of thunks. But in fact, this results in an error because thunk is a macro rather than a function, so it is not really an #%app position after all. This could be surprising to users as it's not always obvious what is and isn't a macro.

Yet, that may not be a problem unique to Curlique, since Qi does not implicitly treat host-language macros as flows and they would need to be explicitly declared to be used, in any case. So even if Curlique handled this as an application, Qi would ultimately complain about it. And for that matter, mapping a list under a macro would not be supported in Racket either, so there is some awkwardness regarding macros and application in general.

This is just an example to illustrate the issue, but as it happens, in this particular case, curlique enables a much more elegant expression of what might be intended here:

(map {1} (list 1 2 3))  ;=> '(1 1 1)

Why Does It Work?

After being initially dazzled by curlique syntax, some of us started to question why it even worked. Based on the implementation:

(define-syntax curlique-app
  (syntax-parser
   [{~braces _ x ...}
    (syntax/loc this-syntax
      (flow x ...))]
   ...

(define-syntax-parse-rule (define-curlique-syntax name:id qi-form:id)
  (define-syntax name
    (syntax-parser
      [{~braces _ x ...}
       (syntax/loc this-syntax
         (flow (qi-form x ...)))]
      ...

(excerpt)

… it didn't seem that it should work at all!

The Qi expander is expecting Qi forms -- that is, either Qi core syntax or Qi macros -- in the flow expression position in (flow <floe>). Yet, the above expansion seems to put a Racket form in this position.

Michael confirmed that the expander (written in Syntax Spec), in addition to defining and providing the literal bindings for the Qi core language, does literal matching of the form name rather than datum matching, that is, it matches syntax not only in terms of the symbol name but also where it is defined. The compiler does datum matching, but Syntax Spec's design ensures that this is OK, since by the time the compiler receives syntax, it has already been validated.

So we expected that the curlique embedding above should indeed not work.

But after inspecting the expansion in DrRacket's Macro Stepper and Syntax Browser, we discovered that the ~> identifier had both Racket as well as Qi scopes on it, so that depending on context (i.e. Qi expansion or Racket expansion), it is matched to the appropriate form -- a convenient and nice state of affairs.

Another Kind of Language Composition?

Since Curlique operates by providing a custom #%app macro overriding the usual one, we wondered how this would interact with other languages and packages that do the same thing, for instance the Fancy App package that provides an application macro that supports template-based partial application (which is used in Qi), and even just the ability of users to override the #%app macro however they like.

In the past we've talked about language composition of a kind where the composition is structured so that expansion and compilation of syntax by one language is followed by the same done by another in an order explicitly defined by the user. There, the composition is done on a shared core language together with a formal way to promote this core language into the domain of the subsequent language in the composition, ensuring that identically-named forms may still operate after one another and compose in a well-defined way. But in the present case, there isn't necessarily an order indicated by the user and these #%app macros naively collide in a shared namespace. There's also no guarantee that the output of one would be usable by the other. So we considered what would happen if two such app macros interacted.

While composing both #%app macros on the same input syntax in sequence would not be well-defined, at least in cases we've considered so far, we'd typically just want to use one or the other #%app macro in each specific case. So we felt that the main impact here is that users who use the functionality from more than one #%app macro must themselves define the composition by means of a wrapping macro that matches input syntax patterns and employs the appropriate #%app in each case. We'll see an example of this soon when we discuss "#lang raqit".

(require curlique)?

Ben mentioned that he is considering providing curlique as a library so that this syntax can be used anywhere. In addition to being useful to end users, as our experiments next show, it seems that it could also be useful to language designers. Truly, it is the ultimate compact lambda syntax!

"#lang raqit"

Sid showed an experiment incorporating Curlique syntax into an actual language. It's just an experiment to reveal some horizons, and not necessarily something we'll pursue.

We've talked about a possible #lang qi in the past but haven't had any compelling use cases for it. Curlique syntax presents one opportunity for a useful and seamless synthesis between Qi and Racket.

Sid likes to say that if Racket were a painting, it wouldn't be a beautiful painting so much as a collage made up of lots of beautiful paintings. After all, Racket isn't just a language but also a platform for making languages. While seeking diversity or "completeness" of platform features in earnest, it cannot at the same time preserve consistency and unity across them.

So it seemed that if we're considering defining a new language anyway, it is also an opportunity to explore distilling and tweaking other aspects of Racket and Qi syntax to achieve a cohesive whole, a beautiful painting. If we could come up with such a coherent synthesis of Racket and Qi beyond a simple embedding, there may at least some uses for it that make it worthwhile.

We considered a working syntax specification. Of course, there will always be valid arguments for different choices of syntax, such as define over def, but sometimes it's easier to make such choices empirically rather than through rational substantiation.

Definitions

(def var 5)
(def (do-something arg)
  "something")

Standard Racket definitions. This uses the compact version of "define" that has gained currency in recent languages that have reached mainstream adoption, including Python, Ruby, and Clojure.

AList grouping of clauses

In some cases, list-based grouping of syntax is more structurally explicit than is strictly needed to infer appropriate semantics. For instance, bindings in a let form always come in pairs, so it's possible to eliminate one layer of parentheses by treating them in the same way as the syntax for "association lists" (a standard and traditional alternative to hash-based associations between keys and values used widely in Lisp communities). Paul Graham discusses this, and it is also the syntax used in Clojure for such forms.

(let (a 1
      b 2)
  (+ a b)) ;=> 3

Similarly, for cond:

(cond (positive? v) 'positive
      (negative? v) 'negative
	  else 'zero)

... and (Qi's) switch:

(switch (v)
  (positive? 'positive)
  (negative? 'negative)
  (else 'zero)

Racket treats {}, () and [] interchangeably. With one fewer layer of parentheses required in many cases, it frees up some of these parenthesis types for alternate uses.

Brackets for Unquoted List Syntax

[1 2 (abs -3) 4 5]  ;=> '(1 2 3 4 5)

This, like Curlique, is implemented as an #%app transformer.

Curlique syntax

(map {~> sqr add1} [1 2 (abs -3) 4 5])  ;=> '(2 5 10 17 26)

The two #%app macros above are composed as:

(define-syntax raqit-app
  (syntax-parser
    [(~brackets _ x ...)
     (syntax/loc this-syntax
       (list:app x ...))]
    [(~braces _ x ...)
     (syntax/loc this-syntax
       (curlique:app x ...))]
    [(_ x ...)
     (syntax/loc this-syntax
       (racket:app x ...))]))

Flows

Since curlique syntax is available for embedding flows compactly, we no longer need flow for this purpose, and can use it instead as a top-level flow definition form (i.e. define-flow), the Qi counterpart to def.

(flow check-number
  (switch
    (< 0) 'negative
    (> 0) 'positive
    else 'negative))

(map check-number [1 -2 0])

Macros

Syntax

The syntax for defining macros could be as simple as:

(syntax (where body ...+ bindings)
  (let bindings body ...))

for a specific rule, or

(syntax where
  (body ...+ bindings) (let bindings body ...)
  (body ...+ bindings) (let bindings body ...))

for a rule with multiple cases.

But this could be affected by deeper considerations around bindings discussed next.

One Kind of Binding

We didn't discuss this during the meeting, but it's another design aspect to consider.

Racket (via Scheme) has two kinds of bindings: value bindings and transformer bindings. The former are defined using define and the latter, with define-syntax.

The difference between the two is that the latter is not only an ordinary definition at an earlier phase (phase 1), but also an association of the resulting function with the syntax at phase 0, telling the expander that this function is to be employed during expansion. So really, define-syntax is not just a definition form but also does something more and is overloaded in this respect.

Yet, often, it is easier to employ the same concept in a different context than it is to learn two different concepts. The dichotomy of these two different kinds of bindings thus perhaps detracts from the ability of the language to "fit in one's head." To simplify it, we could eliminate the dichotomy in definitions and then introduce the association of resulting functions as syntax transformers by means of a module-level declaration, something like this:

(module mac-mod racket/base
  (define (mac a)
    (+ a a)))

… to define the form. Note that the "macro" is an ordinary definition. Now, as a first approach, we could then achieve the identification of this definition as a macro by using a simple declaration:

(macro mac)

But then, that is effectively the same as the overloaded behavior of define-syntax, which starts to appear a convenience, since it allows us to define and declare the macro in one step.

But instead, we propose the following:

(require (for-lang mac-mod))

This for-lang annotation seems just another way to declare that the required functions should be treated as macros, much like macro. But the difference is that it applies to an entire module and not individual functions.

Already in Racket, there is an identification of languages with modules. Every module defines a language as it implicitly specifies a reader and an expander. The truth is, every macro is an incremental extension of this base module language, but the new language L* which includes the base module language plus all of the macros imported and defined is not explicitly reified or implied anywhere. Encapsulating this language extension in explicit module-level declarations seems a natural way to model it in Racket.

In addition to ensuring that there is now only one type of definition (making macros trivial to test and requiring no special handling, as they are actually just functions and not "functions with something more"), this also has the benefit of encouraging defining macros, which are extensions to the language, in dedicated modules, rather than intersperse them amongst the code they transform.

It does mean that syntactic extensions in the form of macros must always be required for-lang. It would be possible for a single module to contain both functions as well as functions intended to be used as macros, but as a best practice, that should never be done, since the organization is well-modeled in independent modules.

Pattern-matching code could be used both for regular code as well as macros, since "macros" would just be regular functions, and would not require a custom pattern matching implementation. Indeed, the common pattern matching infrastructure (racket/match) could be augmented to include any functionality that would be useful for parsing syntax (from syntax/parse) that it doesn't already contain, or there could be a dedicated syntax parsing module -- but one that operates in phase 0 and does not entail any inter-phase considerations (those are achieved via the semantics of the module-level for-lang declaration).

If it is feasible, it would be another small feature that, together with everything else, could lend a feeling of coherence to the language as a whole, and could even be a practice adopted by other languages in the Racket ecosystem to simplify the interface to Racket's powerful macro machinery.

Feedback

Some of us felt that this was an instance of the "Lisp curse," where the ease of defining a new language could result in fracturing community efforts along related directions that would otherwise synergize into something robust, and that it would be better to contribute to existing community directions rather than introduce a new one.

For instance, as we have talked about before, it could be useful to port Qi to Rhombus. This is very much something we'd like to investigate, but it hasn't been clear if Qi could be easily translated into a non S-expression based language. In any case, it deserves to be explored.

Regarding the use of curly braces for embedding flows, what if we wanted to indicate sets in the language? Would we prefer to use {} for that instead? Then where does that leave us with compact syntax for flows?

Some concerns were raised that a language like this could be a lot of work and would thus take attention away from existing worthwhile directions. But it's likely that this kind of language would be mostly superficial, relying for its language features on Qi and on Racket, and not introducing any special features of its own besides syntax and besides curating and designing interfaces to features provided by the underlying languages. So it's likely that the work would still mainly be done on these underlying technologies.

Michael also opined that while a language with trimmed down, Clojure-like syntax makes sense as a project, and while Qi as a flow-oriented language makes sense as a project, the combination of the two didn't immediately seem motivated, especially if this combined language was the primary way to gain Qi functionality. Certainly, any such #lang must only be superficial and another way to use Qi seamlessly, not the way. More to the point, even if we develop this language further, it will never be the language of the Qi documentation. Rather, it would have distinct, superficial documentation that delegates to Qi and Racket docs as appropriate.

We also considered a #lang qi that was purely Qi without Racket, so that no flow embedding would be required. We felt that such a language could be useful as a scripting alternative to languages like Perl, where it could provide compact and intuitive one-liners that could be used at the command line.

The Perils of a Perfect Language

As a cautionary tale, Dominik shared an anecdote of how making a language "pure" doesn't always go according to plan. Recently, after a prolonged period of hard work on Actor Basic, the actors were communicating amongst themselves using an elegant protocol, and at long last, it had reached a stage where it was "perfect." But at this point, Dominik realized there was no way to interact with these actors and that introducing such a way would necessarily make the system "impure."

In the end, as it were, it was "a universe with no edge in space, no beginning or end in time, and nothing for a Creator to do."

(credit: Remedios Varo / Carl Sagan)

In any case, generally, as far as any "#lang raqit" or "#lang qi," the sentiment was that "thou shalt not mess with surface syntax," and that we should continue to focus our efforts on the many other interesting activities in which we are already engaged. Which of course, is very fair 😄

Decomposing Deforestation

Generalizing Beyond Qi?

We discussed the possibility of generalizing the deforestation in qi/list beyond Qi to the existing threading macro in threading-lib, and, for that matter, even to Racket in general, just by pulling in an appropriate require.

For instance, (require threading/fusion) could bring in a wrapping ~> macro that deforests functional operations and then delegates to the underlying ~> macro. Even more generally, (require fusion/list) could bring in map, filter, and foldl macros that are fusable.

Matching arbitrary syntax and optimizing it could be a seamless way to gain these optimizations in existing code, and it is possible with the power and flexibility that Syntax Spec gives us. On the other hand, reaching into host language syntax conceptually feels like overstepping (e.g. we would be changing invariants), and it seems cleaner to do such optimizations explicitly within the frame of a language.

As it happens, our current implementation of deforestation in Qi does already reach into the host language in this manner, and we've considered before whether it could be better for map, filter, and foldl to be core syntactic forms of a qi/list language rather than an optimization pass existing just at the compiler level, in order to preserve conceptual purity (with appropriate consideration of the risks of such a "pure" language gaining a life of its own!).

If we want to consider generalizing the applicability of deforestation, it would be worthwhile to first decide whether continuing to reach across the language boundary to do optimizations is advisable, or if it would be worth it to invest in proper language composition.

New Architecture

In the new architecture that Dominik presented, compilation takes two passes over the syntax, the first to determine what the fusable pieces are, and the second to call generate-fused-operation so that we have the pieces to compare (TODO: clarify). The goal for this architecture is to create alternative implementations of the optimization (e.g. state tupling or consing vs named let compositing that we talked about last time) easily.

qi-deforest-arch

As part of this there are 4 layers of operation:

  1. Syntax classes for functional operations
  2. Syntax classes for the three types of components: producer, transformer, consumer
  3. Creating the pass
  4. Doing the actual rewrite to apply the optimization

The components in the layout borrow terms from object-oriented programming, but we felt that the analogy was not exactly applicable. One component called "interface" implements a common set of attributes for manipulating each producer, transformer and consumer which may have diverse attributes. Instead of "interface," we considered the name "syntax" or "adapter."

We agreed that even without the desire to support multiple implementations, having the implementation be cleanly decomposed is nice. Eventually, we may at any rate decide that one implementation is better and delete the other.

On whether to extract qi/list as a separate package at this stage, we felt that there was still the possibility that it could be a generic optimization that could apply to any sequence type rather than lists specifically (in which case it should be more like qi/fusion), so we felt that we should retain it in its current location among the core modules until we have more clarity there.

Multiple Values?

We considered whether it's likely that one or the other implementation of fusion could be better for supporting multiple values.

It would be nice to be able to deforest Qi forms like >< and pass, but these support flows that could accept any number of values and return any number of values, for instance:

(~> (pass odd?) (>< (-< _ _)))

It would be enough for an implementation of fusion to be faster than the current implementation of these forms, and it doesn't have to be as fast as the (likely distinct) implementation for single-valued functional list operations, in order for it to be compelling to use fusion here.

Likewise, for map (with curlique and bracketed list syntax here):

(map {-< _ _} [1 2 3])  ;=> '(1 1 2 2 3 3)

We discussed why multiple valued outputs here should imply that those values are spliced in the result. That could have broader implications so it should be discussed and we should get feedback from community members on what they feel the consequences are, before committing to one path.

As another possibility, this expression could result in a multi-valued continuation or even independent continuations (in accordance with Flowy Logic), where each value is a list output corresponding to mapping under one tine of the tee junction, as the following case illustrates:

(map {-< add1 sub1} [1 2 3])  ;=> '(2 3 4) '(0 1 2)

We don't know at this time whether any of these cases would prove especially natural for any particular fusion implementation.

A few other things to consider. The first is that a known number of values can be inlined by the Chez compiler, so rewriting use of arbitrary-arity flows to those that use a specific number of arguments would address some of these cases in a different way. And lastly, cases where the "Dark Side of the Moon" operators (△ and ▽) are used could probably be rewritten to a single-valued fusion pipeline.

Syntax Spec Changes

Michael mentioned that there have been some recent changes to Syntax Spec syntax and it will require some manual labor to make the modification everywhere. Sounds like we've also got help from the other Michael (Mike Delmonaco) for writing some docs for Syntax Spec, so that's very exciting!

Next Steps

(Some of these are carried over from last time)

  • (Ben) Release Curlique as a library?
  • Put up a repo showcasing "#lang raqit" just so we have it up and can play with it.
  • Investigate what implementing Qi for Rhombus would look like. To what extent can we delegate to the present implementation? What would the surface syntax look like? How to compile to shrubberies?
  • Decide on a release model, document it in the user docs, and announce the performance regression (and the remedy) to users.
  • Improve unit testing infrastructure for deforestation.
  • Discuss and work out Qi's theory of effects and merge the corresponding PR.
  • Schedule a discussion for the language composition proposal and implement a proof of concept.
  • Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
  • Deforest other racket/list APIs via qi/list
  • Decide on whether there will be any deforestation in the Qi core, upon (require qi) (without (require qi/list))
  • Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.

Attendees

Ben, Dominik, Michael, Sid

Clone this wiki locally