Racket 7 changes the behavior of syntax-parameterize when combined with local-expand #2035

Closed
lexi-lambda opened this issue Apr 9, 2018 · 12 comments

Comments

@lexi-lambda
Member

Flamebait alternate title: Racket 7 breaks my program!

Given the below program:

#lang racket

(require racket/stxparam
         syntax/parse/define)

(define-syntax (#%prim stx)
  (raise-syntax-error #f "cannot be used as an expression" stx))

(begin-for-syntax
  (define-syntax-class (form [intdef-ctx #f])
    #:attributes [expansion]
    [pattern _ #:with {~var || (expanded-form intdef-ctx)}
                      (local-expand this-syntax 'expression (list #'#%prim) intdef-ctx)])

  (define-syntax-class (expanded-form intdef-ctx)
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [kernel-literals]
    #:literals [#%prim]
    [pattern (head:#%expression ~! {~var a (form intdef-ctx)})
             #:attr expansion (syntax-track-origin #'a.expansion this-syntax #'head)]
    [pattern (letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
             #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
                   (for ([ids (in-list (attribute id))]
                         [e (in-list (attribute e))])
                     (syntax-local-bind-syntaxes ids e intdef-ctx*))]
             #:with {~var || (form intdef-ctx*)} #'t]
    [pattern (#%prim ~! _:id)
             #:attr expansion this-syntax]))

(define-simple-macro (expand-and-quote-form e:form)
  (quote-syntax e.expansion))

(expand-and-quote-form (#%prim x))

(define-syntax-parameter foo #f)

(expand-and-quote-form (syntax-parameterize ([foo (syntax-parser
                                                    [(_ x) #'(#%prim x)])])
                         (foo x)))

…running this on Racket 6.12 and HEAD produces different results:

$ racket6.12 stxparam-bug.rkt
#<syntax:/private/tmp/stxparam-bug.rkt:35:23 (#%prim x)>
#<syntax:/private/tmp/stxparam-bug.rkt:40:61 (#%prim x)>
$ racket7 stxparam-bug.rkt
stxparam-bug.rkt:40:61: #%prim: cannot be used as an expression
  in: (#%prim x)
  location...:
   stxparam-bug.rkt:40:61
  context...:
   /private/tmp/stxparam-bug.rkt:6:0

This is because syntax-parameterize in Racket 7 recursively expands its body, which causes problems when it’s important that certain forms aren’t expanded. This change was introduced in racket/racket7@6c7574f.

Pinging @mflatt and @michaelballantyne as the relevant parties.

@michaelballantyne
Contributor

I believe the problem was actually present for all versions of Racket 7's syntax-parameterize implementation, rather than being specific to the change I made in racket/racket7@6c7574f. Previous versions also local-expanded the body, but handled the parameter environment differently.

This seems like a general problem for uses of local-expand nested within other instances of local-expand that use a stop list.

Perhaps we need a variant of local-expand, or at least local-expand/expression, that respects stop lists established by parent expansions.

@lexi-lambda
Member Author

> I believe the problem was actually present for all versions of Racket 7's syntax-parameterize implementation, rather than being specific to the change I made in racket/racket7@6c7574f. Previous versions also local-expanded the body, but handled the parameter environment differently.

I think you’re right. I was looking at the code in more detail last night after I opened this bug report, and I suspected that might be the case (given how similar the Racket 7 implementation of syntax-local-expand-expression/extend-environment was to syntax-local-expand-expression). I considered trying to find where in Racket 6 syntax-local-expand-expression/extend-environment is actually implemented to better understand what it used to do, but I still find the C expander mostly unreadable, and I didn’t get very far.

> Perhaps we need a variant of local-expand, or at least local-expand/expression, that respects stop lists established by parent expansions.

I don’t think this alone would fix the problem if syntax parameters are implemented using phase 1 parameters, since that approach only works if the body is fully expanded (since otherwise the syntax parameter will reset to its old value once the expander returns to the pieces of syntax that were left unexpanded due to the stop list). I’m unconvinced that this problem can be fixed as long as syntax parameters are implemented using phase 1 dynamic binding rather than modifying the phase 0 lexical environment (which, if I understand correctly, they used to do prior to that commit).

@michaelballantyne
Contributor

The previous Racket 7 implementation also used dynamic binding, just in the expander's context object rather than in a parameter in user code. Racket 6 didn't have syntax-local-expand-expression/extend-environment, and instead used a different implementation of syntax parameters using syntax-local-get-shadower to create unhygienic let-syntax bindings that shadow each other, regardless of lexical scope.

> I don’t think this alone would fix the problem if syntax parameters are implemented using phase 1 parameters, since that approach only works if the body is fully expanded (since otherwise the syntax parameter will reset to its old value once the expander returns to the pieces of syntax that were left unexpanded due to the stop list)

Generally this scenario doesn't make sense to me because we haven't solved the problem of local-expanding under let-syntax binders and using any stop-list stops at letrec-syntaxes+values. It looks like you're doing something very creative to get around this, but without breaking the abstraction of the syntax-parameterize implementation as you do by replacing the letrec-syntaxes+values with definition context bindings, the difference between the three implementations isn't observable.

Can you explain what you're trying to achieve with the code that does this?

@lexi-lambda
Member Author

> The previous Racket 7 implementation also used dynamic binding, just in the expander's context object rather than in a parameter in user code. Racket 6 didn't have syntax-local-expand-expression/extend-environment, and instead used a different implementation of syntax parameters using syntax-local-get-shadower to create unhygienic let-syntax bindings that shadow each other, regardless of lexical scope.

Yes, once again you’re right and I was wrong here. I was looking at the history of racket/private/stxparam in the racket7 repo and saw that syntax-local-get-shadower was replaced with syntax-local-expand-expression/extend-environment in mid-2016, which led me to believe it was in Racket 6.12, but that commit actually exclusively applied to the rewritten expander. Oops.

In any case, I think you’re right that the heart of the issue is a desire to get rid of syntax-local-get-shadower, which is ugly, but other workarounds change semantics. To me, syntax parameters are a fundamentally lexical abstraction, and I expect them to operate like a “safe fluid-let-syntax”, as the original paper presented them. Forcing recursive local expansion and changing expansion order feels just as unsatisfying as syntax-local-get-shadower to me, and indeed, I think it’s worse, since syntax-local-get-shadower was hidden behind a watertight abstraction. Now, syntax-parameterize is leaky.

> Can you explain what you're trying to achieve with the code that does this?

I can, yes, and admittedly, your proposed solution of “a variant of local-expand […] that respects stop lists established by parent expansions” would fix my particular use case (but it wouldn’t necessarily fix things in general). For Hackett, I am defining a custom language with different core forms from Racket’s core forms. As we discussed at RacketCon, I wish to do this so that Hackett can have different compilation targets, as well as a representation more amenable to optimization, but in this case, I’m actually not doing that yet. I’m using this technique for Hackett’s type language.

Here’s an excerpt from what the real code actually looks like:

;; ---------------------------------------------------------------------------------------------------
;; fully-expanded types

(define-syntaxes [#%type:con #%type:app #%type:forall #%type:qual
                  #%type:bound-var #%type:wobbly-var #%type:rigid-var]
  (let ([type-literal (λ (stx) (raise-syntax-error #f "cannot be used as an expression" stx))])
    (values type-literal type-literal type-literal type-literal
            type-literal type-literal type-literal)))

(begin-for-syntax
  (define type-literal-ids
    (list #'#%type:con #'#%type:app #'#%type:forall #'#%type:qual
          #'#%type:bound-var #'#%type:wobbly-var #'#%type:rigid-var))
  (define-literal-set type-literals
    [#%type:con #%type:app #%type:forall #%type:qual
     #%type:bound-var #%type:wobbly-var #%type:rigid-var]))

;; ---------------------------------------------------------------------------------------------------
;; type expansion

(begin-for-syntax
  (define-syntax-class (type [intdef-ctx #f])
    #:description "type"
    #:attributes [expansion]
    [pattern _
             #:do [(println this-syntax)]
             #:with {~var || (expanded-type intdef-ctx)}
             (local-expand this-syntax 'expression type-literal-ids intdef-ctx)])

  (define-syntax-class (expanded-type intdef-ctx)
    #:description #f
    #:attributes [expansion]
    #:commit
    #:literal-sets [kernel-literals type-literals]
    [pattern (head:#%expression ~! {~var a (type intdef-ctx)})
             #:attr expansion (syntax-track-origin #'a.expansion this-syntax #'head)]
    [pattern (letrec-syntaxes+values ~! ([(id:id ...) e:expr] ...) () t:expr)
             #:do [(define intdef-ctx* (syntax-local-make-definition-context intdef-ctx))
                   (for ([ids (in-list (attribute id))]
                         [e (in-list (attribute e))])
                     (syntax-local-bind-syntaxes ids e intdef-ctx*))]
             #:with {~var || (type intdef-ctx*)} #'t]
    [pattern (#%type:con ~! _:id)
             #:attr expansion this-syntax]
    [pattern (head:#%type:app ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (head:#%type:forall ~! x:id {~var t (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head x t.expansion))]
    [pattern (head:#%type:qual ~! {~var a (type intdef-ctx)} {~var b (type intdef-ctx)})
             #:attr expansion (syntax/loc/props this-syntax
                                (head a.expansion b.expansion))]
    [pattern (#%type:bound-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:wobbly-var ~! _:id)
             #:attr expansion this-syntax]
    [pattern (#%type:rigid-var ~! _:id)
             #:attr expansion this-syntax]))

Essentially, this code uses the Racket macroexpander to convert syntax objects like (forall [a] (Either String a)) to (#%type:forall a1 (#%type:app (#%type:app (#%type:con Either) (#%type:con String)) (#%type:bound-var a1))). These fully-expanded types can then be handed off to the typechecker in the same way that fully-expanded programs can be handed off to Racket’s compiler.

The expansion uses a stop list because it needs to treat things like #%type:forall specially in the same way the macroexpander treats #%plain-lambda specially (that is, it should not be expanded, but it still needs to expand some of its subforms). Therefore, despite using a stop list, type expansion is still essentially a recursive expansion, since it recursively calls local-expand in the appropriate places on kernel forms’ subforms.

Since the expansion is still fundamentally recursive, it’s possible to handle letrec-syntaxes+values, since we can just manually do what the real expander does (with the exception that we don’t expand to let-values because runtime bindings don’t make sense for the type language) by adding the syntax bindings to a first-class definition context before recursing. But syntax-parameterize still breaks my code, since it ignores my stop list and tries expanding things like #%type:con.

@lexi-lambda
Member Author

I’ve bumped into this again just now, though in a slightly different way this time. In this case, the only problem is that syntax-parameterize doesn’t respect my local-expand stop list. I’ve been working around the problem by using phase 1 parameters directly, but now I can’t: the use of syntax-parameterize is in match, which I can’t exactly easily reimplement in my code.

It seems clear that I am abusing the stop list. It is designed to be used for head expansion, but I am using it for a different purpose entirely (effectively creating multiple passes of macroexpansion for the purpose of dictionary elaboration). That said, whether it’s an abuse or not, it’s very useful, so I’d appreciate trying to figure out a way to make this work.

First, some context. In Hackett, I need to delay the process of dictionary elaboration, since dictionary elaboration requires fully-solved type information, and type information is only solved by expanding the program. To allow for this, dictionaries are inserted using a #%dictionary-placeholder macro, which is added to the stop list. Once local-expand has expanded everything except the dictionaries, the macroexpander walks the program a second time and expands all the remaining #%dictionary-placeholder macros.

However, syntax-parameterize breaks this, since it expands recursively without a stop list. Since this is used in the implementation of match, match itself indirectly breaks this technique, too. This causes trouble in Hackett’s implementation, and while I’ve traditionally managed to tweak things until they work well enough, it’s gotten impossible to work around.


One thing I realized over the past few days is this: the current handling of the stop list is already broken. For a while, I questioned whether or not the behavior that causes forms to be automatically added to a non-empty stop list served any purpose; if not, I figured we should just remove it (in a backwards-compatible way, of course). But then I realized that partial expansion of let-syntax actually can cause problems—forms left unexpanded won’t have access to the let-syntax bindings because they’re already gone.

However, I’ve recently realized this argument is flawed. While it’s true that partial expansion can mix poorly with local syntax binding forms, any form can be a local syntax binding form, due to the existence of first-class definition contexts. Users of partial expansion already must take care not to lose bindings this way, and currently, we don’t have any mechanisms in place to prevent such things.

So the guarantees we’re providing here are weak at best, which makes me a little uncomfortable. It’s tempting to add a syntax-local-stop-list function or even a current-local-expand parameter, but that seems potentially dangerous and certainly complicated, and existing code would need to be updated to use it. I’m not sure what the right way forward is, but I do seem to have gotten stuck.

@mfelleisen
Collaborator

mfelleisen commented Jun 5, 2018 via email

@rmculpepper
Collaborator

The point of local-expand is generally to expand a form until it fits a known grammar, then do case analysis of the expanded form following the structure of that grammar. The grammar is determined by the stop list. So allowing outside forces (i.e., the dynamic expansion context) to influence the stop list would probably tend to break the case analysis code.
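
A minimal sketch of that expand-then-analyze pattern may help make the point concrete. The `classify` macro below is hypothetical (it does not appear in this thread or in any library); it head-expands a form in an internal-definition context, stopping at `define-values`, and then does case analysis on the result.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Hypothetical macro, for illustration only: head-expand `form` until its
;; head is in the stop list (or is another core form), then classify the
;; result by matching it against the small grammar the stop list implies.
(define-syntax (classify stx)
  (syntax-parse stx
    [(_ form)
     (define expanded
       ;; (list (gensym)) is the conventional way to request an
       ;; internal-definition context from local-expand.
       (local-expand #'form (list (gensym)) (list #'define-values)))
     (syntax-parse expanded
       #:literals [define-values]
       [(define-values (x:id ...) rhs) #''definition]
       [_ #''expression])]))

(classify (define x 1)) ; intended to be classified as a definition
(classify (+ 1 2))      ; intended to be classified as an expression
```

The inner `syntax-parse` only has two cases because the stop list promises that anything that is not a `define-values` form is an expression. If some enclosing expansion could add entries to that stop list, forms could come back with heads the case analysis was never written to handle.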

@mfelleisen
Collaborator

mfelleisen commented Jun 5, 2018 via email

@michaelballantyne
Contributor

michaelballantyne commented Jun 5, 2018

How is partial expansion with a stop list getting far enough into the expansion of match to expand the syntax-parameterize it produces? Is syntax-parameterize appearing as the outermost form of the expansion, outside of any other core forms like let?

Or are you trying to replicate Racket's core form expansion and recurring into subexpressions, as you were for the type language and syntax parameters there?

Your point about the general issue of lost local syntax bindings reveals an implicit protocol between macros for the main language of Racket expressions and definitions which I had not previously noticed. Generally it is safe to expect an extension of the expander environment or a phase 1 parameterize to persist throughout subexpression expansions for a macro expanded with syntax-local-context being 'expression, but it is unsafe in other contexts. The implementation of syntax-parameterize follows this convention by wrapping itself in #%expression when expanded in a non-expression context. This convention holds because expansion with a stop list is mostly used to implement head-expansion of definitions in definition contexts, and expansion of expressions is deferred by stopping on all kernel forms and by #%expression wraps.

We probably shouldn't expect it to be safe to use local-expand with 'expression as the context together with a stop list, unless we know that we're expanding something that is not a normal Racket expression but rather an element of a custom language we've made with a different protocol like we do for types or type rule macros. That would mean that re-implementing expression expansion in user-space isn't safe. Other macros that expand in expression position and expect it to be safe to establish local syntax bindings would also be broken. (class, units, block... anything using definition contexts).

Thus, while we could solve the particular issue of syntax-parameterize not stopping per the surrounding stop list, I don't yet understand where it would help given the parameterization would be lost after the partial expansion. (An example solution: a variant of syntax-local-expand-expression that respects the stop list established by any surrounding local-expand, which syntax-parameterize could use).

Can Hackett work as turnstile does now, where all expansion of type-erased forms, not just dictionary elaboration, is delayed by wrapping with something in the stop list? Turnstile's assign-type now wraps with a form like your #%defer-expansion. Thus, the expansion of match and dictionary elaboration would happen together in the second pass, with no stop lists for syntax-parameterize to interfere with.

@lexi-lambda
Member Author

A status update: I have, on a branch, adjusted the expander to allow finer control of the local-expand stop list, enabling a user to request that additional forms not be added automatically to the stop list. I’ve also adjusted the behavior of the core #%module-begin form to not throw the stop list away completely (as long as the stop list includes module*). This has gotten me somewhat unstuck, and compilation in modules seems to work again, though I have not yet figured out precisely how to handle the top level under Hackett’s new elaboration scheme, which is a little delicate.

This weakening I have done of both local-expand and #%module-begin is not strictly an improvement. As I mentioned in my last comment, this sort of expansion can break certain guarantees most macro users and writers probably expect to hold. In Hackett’s case, this is okay, since I control everything, and I’m actually alright with restricting the macro model to achieve my goals (using macros from Racket probably won’t work anyway, since they aren’t typed), but code without much in the way of strong guarantees always makes me uneasy, all the same.

The new system is also very slow. Hackett compilation times are already very poor, in some part due to recursive uses of local-expand; this makes them worse. The existing solution to nested local-expand is syntax-local-expand-expression, but by design, it doesn’t work with a stop list. Hackett needs to use the stop list, and it compounds the problem further with its new elaboration strategy, which involves multiple “passes” of macroexpansion (in fact, it allows an arbitrary number of such passes), and currently, each pass demands that the expander re-walk the entire tree of the (mostly expanded) program. I haven’t had this working for long enough to quantify (or even qualify) just how slow this is in practice, but from a purely theoretical point of view, the algorithmic complexity is not good.

Anyway, this is clearly an issue bigger than syntax-parameterize, so this issue thread is probably the wrong place to discuss these things. But I think I’m converging on the requests I’d like to make of the macroexpander: requests it currently cannot properly fulfill, and I am not immediately sure how to change it so that it can. I think it’s a solvable problem, though; it just requires more thought and experimentation, and I figured you’d probably be at least a little interested in being kept in the loop. (I certainly appreciate @michaelballantyne’s musings; they’re consistently both interesting and helpful.)

> Can Hackett work as turnstile does now, where all expansion of type-erased forms, not just dictionary elaboration, is delayed by wrapping with something in the stop list?

I’m actually not familiar with this technique. My guess is that no, it would not actually solve my problem, but I’m curious what exactly Turnstile is doing here and why it’s doing it. Could you explain in a little more detail and/or point me to the relevant place in the Turnstile code?

@michaelballantyne
Contributor

Whenever a type rule has computed the type of a term and is ready to return the erased term and the type, it does so by expanding to syntax of the form (erased erased-term-here) with the type attached as a property. erased is in the stop list for the local-expand call in infer+erase, so it and the syntax within will not be expanded. If the type rule is not expanding in the local-expand of another type rule but rather as the outermost typed form, erased won't be in the stop list and will expand to its subform (erased-term-here).

The result is that we build up the entire type-erased term during typechecking before it gets expanded at all, and when it gets expanded it is expanded all together. So there's no problem if a typed term erases to something using syntax-parameterize and a subexpression erases to something that accesses that syntax parameter; everything works out. The other benefit is that we avoid repeatedly re-expanding the core forms from the type-erased term we've already expanded as we build up the type-erased expression on our way back up the tree during typechecking.

Here's where we add the wrapper: https://github.com/stchang/macrotypes/blob/7ad0789111a83b97d02f5749f749396d72c4fb1b/macrotypes/typecheck-core.rkt#L349
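
As a rough illustration of the wrapping technique described above, here is a heavily simplified sketch. It is not Turnstile's actual code: the `infer+erase` helper here is a toy stand-in for the real one, the `'type` property key, the symbol `'Int`, and the `typed-num` and `typed-add1` macros are all assumptions made up for the example.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; When it is not in a stop list, `erased` simply expands to its subform.
(define-syntax (erased stx)
  (syntax-parse stx
    [(_ e) #'e]))

(begin-for-syntax
  ;; Toy stand-in for infer+erase: expand a typed term, stopping at
  ;; `erased`, and return the still-unexpanded erased term and its type.
  (define (infer+erase e)
    (define expanded (local-expand e 'expression (list #'erased)))
    (syntax-parse expanded
      #:literals [erased]
      [(erased e-)
       (values #'e- (syntax-property expanded 'type))])))

;; Hypothetical typed literal: erases to a quoted number of type Int.
(define-syntax (typed-num stx)
  (syntax-parse stx
    [(_ n:number)
     (syntax-property #'(erased (quote n)) 'type 'Int)]))

;; Hypothetical typed add1: checks its argument, then wraps the combined
;; erased term again so enclosing type rules also leave it unexpanded.
(define-syntax (typed-add1 stx)
  (syntax-parse stx
    [(_ e)
     (define-values (e- ty) (infer+erase #'e))
     (unless (eq? ty 'Int)
       (raise-syntax-error #f "expected an Int argument" stx))
     (syntax-property #`(erased (add1 #,e-)) 'type 'Int)]))

;; Outermost use: `erased` is not in any stop list here, so the whole
;; erased term is expanded together, in a single pass.
(typed-add1 (typed-num 41))
```

In this sketch the erased term for the argument is never expanded inside `infer+erase`; it is only spliced into the larger erased term, which the expander then processes once at the outermost use.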

@lexi-lambda
Member Author

Closing this because the discussion has long since settled, and the current implementation is here to stay (which I think is just fine).
