cmd/compile: add a way to declare variables in rewrite rules #37423
Consider this rewrite rule:
It'd be a lot easier to read and write (and compile) if we could declare a variable somewhere to hold the value
Maybe we could let the condition section contain short variable declarations?
It looks a bit weird. Other ideas?
Kinda related: #30818
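For anyone skimming the thread, here is a rough sketch of the motivation in plain Go rather than rule syntax. It is not the rule from this issue and not rulegen output: `offset`, `matchRepeated`, and `matchDeclared` are made-up names, and the counter exists only to show how often the helper runs when the expression is repeated versus bound once.

```go
package main

import "fmt"

// calls counts how often the hypothetical helper runs.
var calls int

// offset is a made-up stand-in for a value a rule might compute from a constant.
func offset(c int64) int64 {
	calls++
	return c*8 + 4
}

// matchRepeated spells the expression out at every use,
// the way a rule without variable declarations has to.
func matchRepeated(c int64) (int64, bool) {
	if offset(c) >= 0 && offset(c) < 1<<12 {
		return offset(c), true
	}
	return 0, false
}

// matchDeclared binds the value once and reuses the name.
func matchDeclared(c int64) (int64, bool) {
	off := offset(c)
	if off >= 0 && off < 1<<12 {
		return off, true
	}
	return 0, false
}

func main() {
	calls = 0
	v1, ok1 := matchRepeated(3)
	fmt.Println(v1, ok1, "helper calls:", calls)

	calls = 0
	v2, ok2 := matchDeclared(3)
	fmt.Println(v2, ok2, "helper calls:", calls)
}
```

Both functions return the same result; the second one evaluates the helper once and reads more easily, which is the effect a declaration inside a rule would be after.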
I hacked up a CL to allow
(Thanks, @mvdan, for your rulegen improvements. They make doing stuff like this so much easier.)
Having added a few
This could potentially be really valuable in a CL like CL 173659 that uses helper types heavily (
Another syntax idea might be to borrow from if statements a little bit (
I'm not a big fan of this because the syntax seems ambiguous. One might read
Though I'm not a fan of parentheses for this use case. I like @mundaym's suggestion to reuse semicolons a little better.
Here's another idea: don't try to fit this straight into the rule, as we could end up with complex syntax all in one single "expression". We don't want to have a full language, but we could add a couple more elements to it, like:
We could also allow many such scoped definitions, like
The reason I make this suggestion is because I think it would keep the code readable, and allow us to reuse definitions across multiple related rules without having to mess with global variables or macros.
Yeah, that's a bummer. We're not exactly using a bulletproof parser for this (as I discovered recently when I wrote something like
I like the semicolon idea (and it could then subsume #30818 by allowing any statement), but there's one weird case, the last cond/stmt before the arrow.
you'd have to write
I think it'd be better to be inline, because it makes scoping obvious, and gives more control over order of evaluation (for both semantics and performance).
The fundamental issue here is that we started with a LISP-like parenthesized prefix notation for the rules, but then tried to bolt on more syntax that departed from the S-expression way of writing things.
I think adding even more may be a mistake. If anything, I would move back to a more regular (and easier to write a parser for!) LISP-like prefix notation.
I understand that many people don't like the LISP fully-parenthesized notation, but this weird mix of prefix/infix is starting to show its limitations.
Do you have a concrete suggestion that we can discuss? That often helps.
I am sympathetic to this, but I'm not sure I agree. I don't actually care that the parser isn't bulletproof; I care that the DSL is easy to use, read, and reason about. The rewrite rules are very much a shop-built jig.
Rules are in the form:
where LHS and RHS are S-expressions. Some would actually suggest that the whole expression should be an S-expr:
but maybe it's not necessary. Keeping pattern-arrow-replacement (in this order) may be clearer.
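To make the "easier to write a parser for" point concrete, here is a minimal sketch of a parser for rules that are purely S-expressions on both sides of the arrow. It is not the parser rulegen uses. The `Node` type, the function names, and the `Add`/`Const` ops in the example rule are all assumptions made up for illustration, and conditions are left out entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is an atom when Atom is non-empty, otherwise a list of children.
type Node struct {
	Atom string
	List []*Node
}

// String prints the node back in prefix notation.
func (n *Node) String() string {
	if n.Atom != "" {
		return n.Atom
	}
	parts := make([]string, len(n.List))
	for i, c := range n.List {
		parts[i] = c.String()
	}
	return "(" + strings.Join(parts, " ") + ")"
}

// tokenize splits the input into parentheses and atoms.
func tokenize(src string) []string {
	src = strings.ReplaceAll(src, "(", " ( ")
	src = strings.ReplaceAll(src, ")", " ) ")
	return strings.Fields(src)
}

// parse reads one S-expression and returns the unread tokens.
func parse(toks []string) (*Node, []string, error) {
	if len(toks) == 0 {
		return nil, nil, fmt.Errorf("unexpected end of input")
	}
	tok, rest := toks[0], toks[1:]
	switch tok {
	case "(":
		n := &Node{}
		for {
			if len(rest) == 0 {
				return nil, nil, fmt.Errorf("missing closing parenthesis")
			}
			if rest[0] == ")" {
				return n, rest[1:], nil
			}
			child, r, err := parse(rest)
			if err != nil {
				return nil, nil, err
			}
			n.List = append(n.List, child)
			rest = r
		}
	case ")":
		return nil, nil, fmt.Errorf("unexpected closing parenthesis")
	default:
		return &Node{Atom: tok}, rest, nil
	}
}

// parseSide parses exactly one S-expression and rejects trailing tokens.
func parseSide(src string) (*Node, error) {
	n, rest, err := parse(tokenize(src))
	if err != nil {
		return nil, err
	}
	if len(rest) != 0 {
		return nil, fmt.Errorf("trailing tokens: %v", rest)
	}
	return n, nil
}

// parseRule splits "LHS -> RHS" at the arrow and parses both sides.
func parseRule(rule string) (*Node, *Node, error) {
	parts := strings.SplitN(rule, "->", 2)
	if len(parts) != 2 {
		return nil, nil, fmt.Errorf("rule has no arrow")
	}
	lhs, err := parseSide(parts[0])
	if err != nil {
		return nil, nil, err
	}
	rhs, err := parseSide(parts[1])
	if err != nil {
		return nil, nil, err
	}
	return lhs, rhs, nil
}

func main() {
	// The ops in this rule are hypothetical, not taken from a real rules file.
	lhs, rhs, err := parseRule("(Add (Const x) (Const y)) -> (Const (add x y))")
	fmt.Println(lhs, rhs, err)
}
```

The whole grammar fits in one recursive function, which is the regularity argument. Whether that outweighs readability for rule authors is a separate question.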
This rule is already in the
No Go-like calls. It's
No mix of infix && and S-expressions. This:
If you don't like that
Want to introduce variable declarations? No need to come up with new syntax. Your example above:
Allowing side-effects expressions in the RHS (proposed in #30818)? Call it
You don't have to like any of this, obviously. I'm just throwing out this idea because it seems possible to me that in some (near, or maybe far) future we'll start to realize that the rules syntax is getting too hairy to handle. If it ever becomes clear that what was built is too hard to extend with new syntax, you may want to reconsider the whole approach and turn to something simpler and more regular.
Another possible syntax for this would be to use backticks to indicate that a statement is meant to be inserted directly inline:
That solves the ambiguity problem, and doesn't have the weird trailing semicolon issue.
This last idea is pretty hacky but I like it. We could limit its power by only allowing declarations initially, and perhaps expand it to any single statement later. It would complicate the rules language for sure, but it's not really meant to be used directly by anyone other than rulegen anyway.
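A rough sketch of how the backtick idea might be handled when splitting a condition, assuming clauses are joined by `&&` and a clause wrapped in backticks is a statement to emit verbatim. The exact syntax is an assumption here, and `Clause` and `splitClauses` are hypothetical names, not anything in rulegen.

```go
package main

import (
	"fmt"
	"strings"
)

// Clause is one piece of a rule condition.
type Clause struct {
	Text string
	Stmt bool // true if the clause was written in backticks
}

// splitClauses splits cond on "&&", ignoring any "&&" inside backticks.
func splitClauses(cond string) []Clause {
	var out []Clause
	var cur strings.Builder
	inTick := false

	// flush turns the text collected so far into a clause.
	flush := func() {
		s := strings.TrimSpace(cur.String())
		cur.Reset()
		if s == "" {
			return
		}
		if len(s) >= 2 && s[0] == '`' && s[len(s)-1] == '`' {
			out = append(out, Clause{Text: strings.TrimSpace(s[1 : len(s)-1]), Stmt: true})
			return
		}
		out = append(out, Clause{Text: s})
	}

	for i := 0; i < len(cond); i++ {
		c := cond[i]
		if c == '`' {
			inTick = !inTick
		}
		if !inTick && c == '&' && i+1 < len(cond) && cond[i+1] == '&' {
			flush()
			i++ // skip the second '&'
			continue
		}
		cur.WriteByte(c)
	}
	flush()
	return out
}

func main() {
	// Hypothetical condition; the helper f and the names are made up.
	for _, c := range splitClauses("`v := f(x)` && v > 0 && v < 8") {
		fmt.Printf("stmt=%v %q\n", c.Stmt, c.Text)
	}
}
```

Because a backtick cannot appear inside a Go raw string, it works as an unambiguous delimiter here, and restricting `Stmt` clauses to declarations would be a check on `Text` after the split.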
Another approach could be to apply common subexpression elimination (CSE) to the expressions, instead of having the rule author give an expression a symbol.
That will help the generated code but keep the human-written code a little repetitive. But the repetitions actually help readability in the examples shown, since one doesn't have to juggle so many symbols.
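Roughly what the CSE alternative could look like, as a toy sketch over a made-up expression tree. This is not how rulegen represents rules: `Expr`, `call`, `cse`, and the op names in the example are all hypothetical. Repeated non-leaf subexpressions are hoisted into numbered temporaries, and the author never has to name them.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a toy expression: an op applied to args, or a leaf when Args is empty.
type Expr struct {
	Op   string
	Args []*Expr
}

// call is a small constructor for building expressions.
func call(op string, args ...*Expr) *Expr {
	return &Expr{Op: op, Args: args}
}

// String renders the expression in call syntax.
func (e *Expr) String() string {
	if len(e.Args) == 0 {
		return e.Op
	}
	parts := make([]string, len(e.Args))
	for i, a := range e.Args {
		parts[i] = a.String()
	}
	return e.Op + "(" + strings.Join(parts, ", ") + ")"
}

// count records how often each non-leaf subexpression occurs.
func count(e *Expr, seen map[string]int) {
	if len(e.Args) == 0 {
		return
	}
	seen[e.String()]++
	for _, a := range e.Args {
		count(a, seen)
	}
}

// cse returns declarations for repeated subexpressions and the rewritten expression.
func cse(e *Expr) (decls []string, out string) {
	seen := map[string]int{}
	count(e, seen)
	names := map[string]string{}

	var rewrite func(e *Expr) string
	rewrite = func(e *Expr) string {
		if len(e.Args) == 0 {
			return e.Op
		}
		key := e.String()
		if name, ok := names[key]; ok {
			return name // already hoisted
		}
		parts := make([]string, len(e.Args))
		for i, a := range e.Args {
			parts[i] = rewrite(a)
		}
		s := e.Op + "(" + strings.Join(parts, ", ") + ")"
		if seen[key] > 1 {
			name := fmt.Sprintf("t%d", len(names))
			names[key] = name
			decls = append(decls, name+" := "+s)
			return name
		}
		return s
	}
	out = rewrite(e)
	return decls, out
}

func main() {
	x, c := call("x"), call("c")
	e := call("both", call("isSmall", call("add", x, c)), call("isEven", call("add", x, c)))
	decls, out := cse(e)
	fmt.Println(decls, out)
}
```

Matching on the printed form is crude and assumes the subexpressions have no side effects, but it shows the trade-off: the generated code gets the temporaries while the rule text stays repetitive.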
The small helper functions the Go team introduces all the time are pretty good at keeping the rules small and understandable.