Skip to content

Qi Compiler Sync Oct 14 2022

Siddhartha Kasivajhula edited this page Dec 15, 2022 · 5 revisions

Compiler Planning and Bindings Implementation

Qi Compiler Sync Oct 14 2022

Adjacent meetings: Previous | Up | Next

Summary

We discussed some high level goals and approaches for the compiler and also made some progress implementing bindings at the compiler level.

Background

Since Sid is on vacation and Michael will be preparing for RacketCon, we spent part of our time taking stock of where things are and planning medium-long term next steps.

Last time, we talked about possible ways in which bindings could be implemented in the compiler. We agreed to go with the "A Normal form" approach for now. This time, we picked up on an A-normal-style implementation of bindings in the compiler and got it working, aside from a couple of issues that require additional poking.

Planning

Goals

Since the compiler is going to be a long term effort, we proposed some initial goals for the compiler before we can merge it into the main branch so that it becomes part of the main line of development. These goals were:

  1. The high-level architecture (e.g. the layout of compiler passes) should be in place.
  2. Performance should be at least as good as before the compiler was added.
  3. We should "beat Racket" on at least one benchmark (see #78).

Optimizations and Compiler Passes

So far, the broad optimizations we've considered are:

  1. Deforestation
  2. Arity inference
  3. Avoiding values ↔ list conversions

Each such optimization may correspond to multiple passes. For instance, an optimization could involve (1) an information gathering pass to collect and propagate nonlocal information that will be necessary to the optimization, and (2) using that information to perform the code transformations.

At the moment, the few optimizations we've written are mostly ad hoc rules, but later, it will likely make sense to have the families of rewrite rules organized into the generalized optimization categories (like the ones above) they represent, and these could constitute the optimization passes.

Optimal Ordering of Compiler Passes

One approach that may help in structuring passes is "equality saturation" or "e-graphs." This involves constructing a data structure that expresses applying all possible optimizations to the code at each step (so that, for N optimizations, a node representing the code being transformed would have N children at each step). The e-graph data structure is thus able to represent all possible paths through compiler transformations, detecting which ones are "confluent" (i.e. converge to the same output code) and which ones aren't. This may be useful in determining the optimal order in which the optimizations should be performed.

Implementing Bindings

Applying the Bindings Transformations on Syntax Instead of Lists

Last time, we'd implemented the bindings transformations in the compiler by making the simplifying assumption that the input was a list rather than syntax. This time, we used some helper utilities from the ee-lib library (in particular, map-transform, which is analogous to map but for mapping a syntax transformer over syntax) to translate the implementation so that it operates on syntax. We also added a "pass through" rule in the expander to treat (as ...) forms as anonymous (i.e. not bindings-specific) core syntax and simply pass it through to the compiler. This allowed us to write unit tests at the Qi level (instead of specifically at the low level of the compiler) in order to test the bindings functionality:

(check-equal? ((☯ (~> (as v) (+ v))) 3)
              3)

We will need to properly declare the as form as a binding form down the line, of course, so that it can check references, preserve hygiene, etc.

Issues to be Resolved

  1. The current approach to implementing bindings in the compiler isn't able to distinguish Qi contexts from Racket contexts, so it would perform the bindings transformations on nested Racket subexpressions (e.g. expressions wrapped in an esc form) in addition to transforming Qi code. It would also transform nested Qi subexpressions within those Racket subexpressions.

This should be changed to only transform code in a Qi context and not look at any Racket subexpression. Alternatively, we could define the mapping function so that it is in some way aware of all of Qi syntax, so that it does the transformations more formally, but this may be challenging to implement for languages with many core forms.

  1. Some Qi core language forms compile to Racket in such a way that they employ the binding before it is available (e.g. curry in the expansion for partial applications). These need to be rewritten so that they don't (e.g. the partial application should be written to use lambda instead of curry).

Next Steps

  • Write a (failing) unit test that validates the unbounded nesting problem.
  • Fix the unbounded nesting problem.
  • Write a (failing) unit test that validates the "anaphoric" binding problem.
  • Fix the core language forms to avoid the anaphoric binding problem.

Note

The next compiler meetup is scheduled for November 4.

Attendees

Michael, Sid

Clone this wiki locally