With, Unhygienic, and Call-By-Name Semantics in Rewrite
Rewrite is a gem that adds code-rewriting to the Ruby programming language. Recently, Caleb Clausen announced RubyMacros. We've had some discussion about the differences between the two projects on the Ruby Forum list and in emails: Caleb has graciously given me permission to repeat some of our discussion here.
Note well that I will say things like "In rewrite, you can do X" and also things like "In rewrite, Y is the case." These statements do not imply that you cannot do X in RubyMacros, nor do they imply that Y is not the case in RubyMacros. They are simply statements about rewrite.
what's with 'with' in rewrite?
Rewrite does not rewrite all of the code in your project out-of-the-box. Instead, you supply the code you wish to be rewritten in a block to the with method, indicating which "rewrites" you wish to apply. In this example, we are taking a block of code and applying the andand rewrite to it:
with(andand) do
  ...
  first_name = Person.find_by_last_name('Braithwaite').andand.first_name
  ...
end
You have probably figured this out from the example, but rewrites are first-class Ruby objects. You can define them in local variables like this:
andand = Rewrite::ByExample::Unhygienic.
  from(:receiver, :message, [:parameters]) {
    receiver.andand.message(parameters)
  }.to {
    lambda { |andand_temp|
      andand_temp.message(parameters) if andand_temp
    }.call(receiver)
  }
Or you can define them inline, just as Ruby lambdas and blocks are equivalent to functions defined inline:
with(
  Rewrite::ByExample::Unhygienic.
    from(:receiver, :message, [:parameters]) {
      receiver.andand.message(parameters)
    }.to {
      lambda { |andand_temp|
        andand_temp.message(parameters) if andand_temp
      }.call(receiver)
    }
) do
  ...
  first_name = Person.find_by_last_name('Braithwaite').andand.first_name
  ...
end
Or you can use some of the built-in rewriters from rewrite's prelude:
include Rewrite::Prelude
with(please, try) do
  # ...
  @phone = Location.find(:first, ...elided... ).try(:phone)
  # ...
  @area_code = @phone.please.area_code
  # ...
end
with is a deliberate design choice. The idea is that you can explicitly state what is to be rewritten and how it is to be rewritten, using Ruby in a Ruby-like way. Of course, some people like magic, and if you look at Rails, the initializers and environment.rb file allow you to sprinkle magic throughout your project implicitly. My feeling when I designed rewrite was that if I started with an explicit "with," it would be easy to build implicit rewriting into a project or framework later.
And yes, with can accept a list of rewrites and it can be nested.
what is the difference between unhygienic and called_by_name?
Rewrite provides a facility for code rewriting, which is one level above unhygienic macros. A traditional unhygienic macro is a way of saying "when you see something that looks like a method call, replace it with the following code, performing substitutions here and here and here." Rewrite supports this as well as a number of other arbitrary rewriting rules.
Rewrite is very low-level. It uses these ridiculous s-expressions generated by a gem called ParseTree. That is not ParseTree's fault: ParseTree is giving us the very implementation-specific AST that MRI 1.8.x produces. Other Ruby implementations will have different trees. RubyMacros uses its own AST format.
Here's an example of rewrite working directly with s-expressions. It's an excerpt from try.rb:
def process_call(exp)
  # [:call, [:dvar, :foo], :try, [:array, [:lit, :bar]]]
  exp.shift
  # [[:dvar, :foo], :try, [:array, [:lit, :bar]]]
  receiver_sexp = exp.first
  if exp[1] == :try
    message_expression = exp[2][1]
    exp.clear
    # lambda { |receiver, message| ... }.call(receiver_sexp, message_expression)
    s(:call,
      s(:iter,
        s(:fcall, :lambda),
        s(:masgn,
          s(:array,
            s(:dasgn_curr, :receiver),
            s(:dasgn_curr, :message)
          )
        ),
        s(:if,
          # receiver.respond_to?(message)
          s(:call, s(:dvar, :receiver), :respond_to?, s(:array, s(:dvar, :message))),
          # receiver.send(message)
          s(:call, s(:dvar, :receiver), :send, s(:array, s(:dvar, :message))),
          s(:nil)
        )
      ),
      :call,
      s(:array,
        process_inner_expr(receiver_sexp), # [:dvar, :foo]
        process_inner_expr(message_expression)
      )
    )
  else
    # pass through
    begin
      s(:call,
        *(exp.map { |inner| process_inner_expr inner })
      )
    ensure
      exp.clear
    end
  end
end
Lovely stuff, that.
Unhygienic and called_by_name are both a level above that kind of direct manipulation of the Abstract Syntax Tree. They both work by defining rewrites in Ruby code, and of course they do it in different ways. So, Rewrite provides a low-level, implementation-specific way to rewrite code. Unhygienic and called_by_name are built on top of rewrite and provide a higher level of abstraction.
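To make the idea of rewriting a tree concrete without needing ParseTree or MRI 1.8, here is a toy sketch that uses plain nested arrays as stand-ins for s-expressions. The method name rewrite_try is hypothetical and is not part of rewrite. To keep the sketch short it expands to a bare if rather than the lambda used in try.rb, so it repeats the receiver expression.

```ruby
# Toy sketch: nested arrays stand in for ParseTree-style s-expressions.
# rewrite_try is a hypothetical helper, not rewrite's API.
def rewrite_try(sexp)
  # Leaves (symbols, literals) pass through untouched.
  return sexp unless sexp.is_a?(Array)

  # Rewrite the children first, then look at this node.
  node = sexp.map { |child| rewrite_try(child) }

  if node[0] == :call && node[2] == :try
    receiver = node[1]
    message  = node[3][1]
    # Simplification: the receiver expression appears twice here,
    # whereas try.rb wraps it in a lambda so it is evaluated once.
    [:if,
      [:call, receiver, :respond_to?, [:array, message]],
      [:call, receiver, :send, [:array, message]],
      [:nil]]
  else
    node
  end
end

# foo.try(:bar)
rewritten = rewrite_try([:call, [:dvar, :foo], :try, [:array, [:lit, :bar]]])
```

Any node that is not a call to try comes back structurally unchanged, which corresponds to the "pass through" branch in the excerpt.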
unhygienic
Unhygienic defines something like a simple search-and-replace. You define a from and a to, specifying which pieces of the from are variables. For example, defining something like && using Unhygienic is:
Unhygienic.from(:x, :y) {
  our_and(x, y)
}.to {
  if temp = x
    y
  else
    temp
  end
}
And we could use it like this:
with(
  Unhygienic.from(:x, :y) {
    our_and(x, y)
  }.to {
    if temp = x
      y
    else
      temp
    end
  }
) do
  # ...
  our_and(MyActiveRecordModel.find(:first, ...), something_something())
  # ...
end
And you will get:
begin
  # ...
  if temp = MyActiveRecordModel.find(:first, ...)
    something_something()
  else
    temp
  end
  # ...
end
Unhygienic is very literal, so it will always call the temporary variable temp. In Ruby 1.8, this is a problem. Also, it replaces x and y with any expression you put in, so if you use one of these variables twice, you can have interesting issues if the expression generates side effects or is computationally expensive.
That's why the example above uses temp. Had we written it as:
Unhygienic.from(:x, :y) {
  our_and(x, y)
}.to {
  if x
    y
  else
    x
  end
}
Then the find would run twice whenever it came back empty in something like our_and(MyActiveRecordModel.find(:first, ...), something_something()): x is evaluated once in the condition and again in the else branch. Note also the scoping issues in Ruby 1.8: temp will interfere with any other variable named temp. Now you know why it is called unhygienic.
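Both problems can be seen in plain Ruby by writing out the expansions by hand. This is only a sketch: find_nothing is a hypothetical stand-in for a query that finds no record, and the two if expressions are what I assume the substitution would produce for each version of the to block.

```ruby
query_count = 0

# Hypothetical stand-in for MyActiveRecordModel.find(:first, ...)
# when no record matches: counts its calls and returns nil.
find_nothing = lambda do
  query_count += 1
  nil
end

# Hand expansion of the to block that mentions x twice.
naive_result =
  if find_nothing.call
    :second_value
  else
    find_nothing.call
  end
naive_queries = query_count

# Hand expansion of the to block that binds x to temp once.
query_count = 0
temp = "a temp the surrounding code cares about"
careful_result =
  if temp = find_nothing.call
    :second_value
  else
    temp
  end
careful_queries = query_count

# The surrounding code's temp has been overwritten by the expansion.
clobbered_temp = temp
```

The temp version queries once instead of twice, but it pays for that by overwriting whatever temp the surrounding code already had.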
called_by_name
called_by_name is a little more complicated than a simple unhygienic rewrite: called_by_name actually defines a lambda that you use. When you write:
with(
  called_by_name(:our_and) { |x,y|
    if temp = x
      y
    else
      temp
    end
  }
) do
  # ...
  our_and(MyActiveRecordModel.find(:first, ...), something_something())
  # ...
end
You get:
lambda do |our_and|
  # ...
  our_and.call(
    lambda { MyActiveRecordModel.find(:first, ...) }, lambda { something_something() })
  # ...
end.call(
  lambda do |x,y|
    if temp = x.call
      y.call
    else
      temp
    end
  end
)
What just happened is that our_and is defined as a lambda, with called_by_name doing some jiggery-pokery to turn the expressions you provide into thunks. This implements call-by-name semantics for Ruby lambdas. And as a bonus, you can get rid of the annoying .call method invocation.
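The thunking can be tried by hand in plain Ruby, without rewrite. In this sketch the lambdas passed to our_and are written out explicitly (the part called_by_name would do for you), and the evaluated array is a hypothetical probe added only to record which thunks actually ran.

```ruby
# Records which argument expressions were actually evaluated.
evaluated = []

# The lambda from the expansion above: arguments arrive as thunks
# and are only evaluated when .call is sent to them.
our_and = lambda do |x, y|
  if temp = x.call
    y.call
  else
    temp
  end
end

# First argument is falsy, so the second thunk must never run.
short_circuit = our_and.call(
  lambda { evaluated << :first; nil },
  lambda { evaluated << :second; :never_reached }
)
evaluated_after_falsy = evaluated.dup

# First argument is truthy, so both thunks run, each exactly once.
evaluated.clear
both = our_and.call(
  lambda { evaluated << :first; :found },
  lambda { evaluated << :second; :second_value }
)
evaluated_after_truthy = evaluated.dup
```

Because each argument is a thunk, the lambda body decides whether and when an argument is evaluated, which is what lets an ordinary lambda short-circuit like &&.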
There are some important implications of this approach. First, with unhygienic, our_and disappears. There is no our_and function or method; invocations are replaced by whatever to expression you provide. called_by_name, by contrast, actually defines a lambda for our_and and defines it in scope for our block of code.
Note: Caleb asked about the fetish for lambdas. In Ruby 1.8 MRI, it makes no difference. In other implementations, these constructs will be hygienic. Yet, these implementations do not exist yet. So, either I don't understand YAGNI, or perhaps I have spent too much time with languages like Javascript that actually get this right, or I am from the future and I know that Ruby will get this right.
Second, even though temp is still around and still could shadow some variable where it is defined, it doesn't shadow any definition inside your block. So you could define rewriters with called_by_name at the top level or inside of a method somewhere and be assured that you are making 100% hygienic code.
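A small plain-Ruby sketch of that second point, again writing the expansion by hand. The names our_and_in_block and value are made up for the illustration. The block assigns its own temp, calls our_and, and then checks that its temp survived.

```ruby
# The rewriter's lambda, defined outside the block it will be used in.
# Its temp is local to this lambda.
our_and = lambda do |x, y|
  if temp = x.call
    y.call
  else
    temp
  end
end

# Stand-in for the rewritten block: it has its own temp, which is a
# different variable from the one inside our_and.
block_result, block_temp = lambda do |our_and_in_block|
  temp = "the block's own temp"
  value = our_and_in_block.call(lambda { :found }, lambda { :second_value })
  [value, temp]
end.call(our_and)
```

The two lambdas are siblings rather than nested, so neither can see the other's temp. Compare that with the unhygienic expansion, where the assignment to temp is pasted directly into the block.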
Caleb pointed out that called_by_name is less powerful than full-blown macros. True and false. It is less powerful than macros, plural, in that you can use macros to define a called_by_name macro that you could then use on your code. And there are things you can do with a macro that you obviously cannot do with called_by_name, because called_by_name does a very specific transformation (defining a lambda and transforming parameters into thunks).
However, called_by_name cannot be replicated using a single macro, because it needs to do one transformation on the entire block and then another on each invocation. If I were implementing it with RubyMacros, I would write a macro-writing macro, in the tradition of Paul Graham's On Lisp.
Although rewrite can do a lot more than called_by_name, I have found that most of what I want to accomplish with macros works surprisingly well with call-by-name semantics. I sincerely think that if call-by-name semantics were an option throughout Ruby, including with method calls and block invocations, the language would become ridiculously powerful.
As an example, things like andand become trivial if you have called-by-name semantics for method calls. YMMV.