# Anaphora in Ruby, 2012 Edition

*The following is an update to a post I wrote in September, 2009. I was spurred to update it by the release of a new anaphoric library for Ruby called [Ampex](https://github.com/rapportive-oss/ampex). It seems that if people have a problem that doesn't get solved, they will keep re-inventing solutions for it.*

> In natural language, an anaphor is an expression which refers back in the conversation. The most common anaphor in English is probably "it," as in "Get the wrench and put it on the table." Anaphora are a great convenience in everyday language--imagine trying to get along without them--but they don't appear much in programming languages. For the most part, this is good. Anaphoric expressions are often genuinely ambiguous, and present-day programming languages are not designed to handle ambiguity. --Paul Graham, [On Lisp](http://www.paulgraham.com/onlisp.html "On Lisp")

## Old School Global Variable Anaphora

Anaphora have actually been baked into Ruby from its earliest days. Thanks to its Perl heritage, a number of global variables act like anaphora. For example, `$&` is a global variable containing the text of the last successful regular expression match, or `nil` if the last attempt to match failed. So instead of writing something like:

```ruby
if match_data = /reg(inald)?/.match(full_name) then puts match_data[0] end
```

You can use `$&` as an anaphor and avoid creating another explicit temporary variable, just like the anaphor in a conditional:

```ruby
if /reg(inald)?/.match(full_name) then puts $& end
```

These 'anaphoric' global variables have a couple of advantages. Since they are tied to the use of things like regular expression matching rather than a specific syntactic construct like an `if` expression, they are more flexible and can be used in more ways. Their behaviour is very well defined.

The disadvantage is that there is a complete hodge-podge of them. Some are read only, some read-write, and none have descriptive names. They look like line noise to the typical programmer, and as a result many people (myself included) simply don't use them outside of writing extremely short shell scripts in Ruby.
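
To make the hodge-podge concrete, here is a small sketch of what a few of these variables hold after a single match. The helper name `match_parts` and the sample string are made up for illustration.

```ruby
# Hypothetical helper: collects the match-related special variables
# that Ruby sets after a successful regular expression match.
def match_parts(full_name)
  return nil unless /reg(inald)?/ =~ full_name

  {
    before: $`, # text to the left of the match
    match:  $&, # the matched text itself
    after:  $', # text to the right of the match
    group:  $1, # first capture group
    data:   $~  # the full MatchData object
  }
end

parts = match_parts("Mr. reginald braithwaite")
```

Five variables, five unrelated punctuation marks: you either memorize them or look them up every time.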

Anaphors like the underscore or a special variable called "it" have the advantage of providing a smaller surface area for understanding. Consider an anaphoric `if` macro in Lisp, where "it" refers to the value of the test expression and nothing more (we ignore the special cases and other ways Ruby expresses conditionals). Compare:

```ruby
if /reg(inald)?/.match(full_name) then puts $& end
```

To:

```ruby
if /reg(inald)?/.match(full_name) then puts it[0] end
```

To my eyes, "it" is easier to understand because it is a very general, well-understood anaphor. "It" always matches the test expression. We don't have to worry about whether `$&` is the result of a match or all the text to the left of a match or the command line parameters or what-have-you. Of course, "it" isn't an anaphor in Ruby. It is (forgive the expression) in other languages like Groovy.

Could anaphors be added to Ruby where none previously existed? Yes. Sort of.

## New School Block Anaphora

### Methodphitamine

A *block anaphor* is a meta-variable that can be used in a Ruby block to refer to its only parameter. Consider the popular Symbol#to\_proc. Symbol#to\_proc is the standard way to abbreviate blocks that consist of a single method invocation, typically without parameters. For example, if you want the first names of a collection of people records, you might use `Person.all(...).map(&:first_name)`.

Some languages provide a special meta-variable that can be used in a similar way. If `it` were a block anaphor in Ruby, you could write `Person.all(...).map { it.first_name }`. Of course, Ruby doesn't have block anaphora built in, so people kludged workarounds, and Symbol#to\_proc was so popular that it became enshrined in the language itself.
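
As a quick illustration of where Symbol#to\_proc helps and where it stops helping, here is a sketch using a throwaway `Struct` in place of the `Person` model (the struct and the sample names are assumptions for the example):

```ruby
# Stand-in for the Person records used in the text.
person = Struct.new(:first_name, :last_name)
people = [person.new("reginald", "braithwaite"), person.new("jay", "phillips")]

# A single message with no arguments abbreviates nicely.
first_names      = people.map(&:first_name)
same_first_names = people.map { |each_person| each_person.first_name }

# An argument or a second message forces you back to an explicit block.
prefixes    = people.map { |each_person| each_person.first_name[0..3] }
capitalized = people.map { |each_person| each_person.first_name.capitalize }
```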

Jay Phillips implemented a simple block anaphor called [Methodphitamine](http://jicksta.com/posts/the-methodphitamine "The Methodphitamine at Adhearsion Blog by Jay Phillips"). `it` doesn't seem like much of a win when you just want to send a message without parameters. But if you want to do more with Symbol#to\_proc, such as invoke a method with a parameter or chain several methods, you are out of luck. Symbol#to\_proc does not allow you to write `Person.all(...).map(&:first_name[0..3])`. With Methodphitamine you can write:

```ruby
Person.all(...).map(&it.first_name[0..3])
```

Likewise with Symbol#to\_proc you can't write `Person.all(...).map(&:first_name.titlecase)`. You have to write `Person.all(...).map(&:first_name).map(&:titlecase)`. With Methodphitamine you can write:

```ruby
Person.all(...).map(&it.first_name.titlecase)
```

This is easy to read and does what you expect for simple cases. Methodphitamine uses a proxy object to create the illusion of an anaphor, allowing you to invoke methods with parameters and to chain more than one method. Here's some code illustrating the technique:

```ruby
# BlankSlate comes from the blankslate gem (Ruby 1.8 era). It strips nearly
# every inherited method so that messages fall through to method_missing.
require 'blankslate'

class AnaphorProxy < BlankSlate

  def initialize(proc = lambda { |x| x })
    @proc = proc
  end

  def to_proc
    @proc
  end

  # Record the message by wrapping the proc built so far in a new lambda.
  def method_missing(symbol, *arguments, &block)
    AnaphorProxy.new(
      lambda { |x| self.to_proc.call(x).send(symbol, *arguments, &block) }
    )
  end

end

class Object

  def it
    AnaphorProxy.new
  end

end

(1..10).map(&it * 2 + 1) # => [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
```

What happens is that "it" is a method that returns an AnaphorProxy. The default proxy is an object that answers the identity function in response to #to\_proc. Think about how `(1..10).map(&it) # => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]` works: "it" is a method that returns the default AnaphorProxy; using `&it` calls AnaphorProxy#to\_proc and receives `lambda { |x| x }` in return; #map now applies this to `1..10` and you get `[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]`.

If you send messages to an AnaphorProxy, you get another AnaphorProxy that "records" the messages you send. So `it * 2 + 1` evaluates to an AnaphorProxy whose #to\_proc returns the equivalent of `lambda { |x| lambda { |x| lambda { |x| x }.call(x) * 2 }.call(x) + 1 }`. This is equivalent to `lambda { |x| x * 2 + 1 }` but more expensive to compute, and it drags along some closed-over variables.
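
For readers on Ruby 1.9 or later, the same recording trick can be sketched on top of `BasicObject`. This is a minimal sketch, not Methodphitamine's or Ampex's code: the names `RecordingProxy` and `item` are invented here, and `item` is used instead of `it` simply to avoid clashing with anything else of that name.

```ruby
# Hypothetical names: RecordingProxy and item are illustrative only.
class RecordingProxy < BasicObject
  # Start from the identity function unless a recorded lambda is supplied.
  def initialize(recorded = ->(x) { x })
    @recorded = recorded
  end

  def to_proc
    @recorded
  end

  # Each message produces a new proxy whose lambda replays everything
  # recorded so far and then sends the new message to the result.
  # Constants need the leading :: because BasicObject sits outside Object.
  def method_missing(name, *arguments, &block)
    previous = @recorded
    ::RecordingProxy.new(
      ->(x) { previous.call(x).public_send(name, *arguments, &block) }
    )
  end
end

# Hypothetical anaphor method.
def item
  RecordingProxy.new
end

odds     = (1..10).map(&item * 2 + 1)
prefixes = %w[reginald jay].map(&item.capitalize[0..3])
```

Note that nothing is computed when `item * 2 + 1` is written. The arithmetic only happens once #map calls the lambda the proxy hands over.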

As you might expect from a hack along these lines, there are all sorts of things to trip us up. `(1..10).map(&it * 2 + 1)` works, but what would you expect from:

```ruby
(1..10).map(&1 + it * 2) # no!
```

This does not work with Methodphitamine, and neither does something like:

```ruby
Person.all(...).select(&it.first_name == it.last_name) # no!
```

Also, unexpected things happen if you try to "record" an invocation of #to\_proc:

```ruby
[:foo, :bar, :blitz].map(&it.to_proc.call(some_object)) # no!
```

We'll have another look at these "gotchas" [below](#technical-gotchas).

### Ampex: Block anaphora updated

[Ampex](https://github.com/rapportive-oss/ampex) is a new block anaphora library. Instead of `it`, Ampex uses `X`:

```ruby
["a", "b", "c"].map &(X * 2)
# => ["aa", "bb", "cc"]
```

As Conrad Irwin explains in a [blog post](http://cirw.in/blog/ampex) announcing Ampex:

> The ampex library is distributed as a rubygem, so to use it, you can either install it one-off or add it to your Gemfile. We've been using ampex in production for over a year now, and because it's written in pure Ruby, it works on Ruby 1.8.7, Ruby 1.9 and JRuby out of the box.

### Technical Gotchas

Using proxy objects (as Methodphitamine and Ampex do) runs you into that curious problem of trying to implement symmetrical behaviour in object-oriented languages where everything is inherently *asymmetrical*. Block anaphora implemented by proxy objects only work properly when they're the receiver of a message. You cannot, for example, use Methodphitamine or Ampex to write:

```ruby
(1..10).map(&1 + it * 2)
(1..10).map(&1 + X * 2)
```

You also have certain issues with respect to when arguments are evaluated:

```ruby
i = 1
(1..10).map(&it.frobbish(i += 1))
```

`i += 1` is only evaluated once, when the proxy is built, not for each iteration.
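
The difference is easy to see side by side. The following sketch uses a throwaway proxy (`EagerProxy` is an invented name, not part of either gem) and `+` in place of the imaginary `frobbish`:

```ruby
# Hypothetical minimal recording proxy, for illustration only.
class EagerProxy < BasicObject
  def initialize(recorded = ->(x) { x })
    @recorded = recorded
  end

  def to_proc
    @recorded
  end

  def method_missing(name, *arguments, &block)
    previous = @recorded
    ::EagerProxy.new(
      ->(x) { previous.call(x).public_send(name, *arguments, &block) }
    )
  end
end

proxy = EagerProxy.new

# With the proxy, (i += 1) runs once, while the argument list is built.
i = 1
via_proxy = (1..3).map(&(proxy + (i += 1)))

# With an ordinary block, (j += 1) runs again for every element.
j = 1
via_block = (1..3).map { |x| x + (j += 1) }
```

After this runs, `via_proxy` is `[3, 4, 5]` and `i` is 2, while `via_block` is `[3, 5, 7]` and `j` is 4.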

### Anaphora via AST Rewriting

To "fix" the problems with using a proxy to implement anaphora, you need to parse and rewrite Ruby directly. No sane person would do this just for the convenience of using block anaphora in their code, but GitHub archaeologists report that a now-extinct society of programmers did this very thing:

The abandonware gem [rewrite_rails](http://github.com/raganwald-deprecated/rewrite_rails "raganwald's rewrite_rails at master - GitHub") supported `it`, `its`, or `_` as block anaphora for blocks taking one argument. When writing a block that takes just one parameter, you can use either `it` or `its` as a parameter without actually declaring the parameter using `{ |it| ... }`.

As with Methodphitamine and Ampex, you can supply parameters:

    User.all(...).each { it.increment(:visits) }

Or chain methods:

    Person.all(...).map { its.first_name.titlecase }

Unlike the other gems, `it` needn't be the receiver:

    Person.all(...).each { name_count[its.first_name] = (name_count[its.first_name] || 0) + 1 }

This style of code works best when you would naturally use the word "it" or the possessive "its" if you were reading the code aloud to a colleague. (You can use the underscore, `_`, instead of `it` or `its` for visual compatibility with certain functional programming languages.) `rewrite_rails` does its magic by parsing the block and rewriting it. So when you write:

```ruby
(1..10).map { 1 + it * 2 }
Person.all(...).select { its.first_name == its.last_name } # and,
[:foo, :bar, :blitz].map { it.to_proc.call(some_object) }
(1..100).map { (1/_)+1 }
```

`rewrite_rails` actually rewrites your code into:

```ruby
(1..10).map { |it| 1 + it * 2 }
Person.all(...).select { |its| its.first_name == its.last_name } # and,
[:foo, :bar, :blitz].map { |it| it.to_proc.call(some_object) }
(1..100).map { |_| (1/_)+1 }
```
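
To give a feel for the transformation, here is a toy approximation that works on source strings rather than on a parsed syntax tree. It is emphatically not how `rewrite_rails` is implemented; `declare_anaphor` is a hypothetical helper, and a regular expression will be fooled by strings, comments, and nested blocks that a real parser handles.

```ruby
# Toy, string-based stand-in for an AST rewrite. declare_anaphor is a
# hypothetical helper, not part of rewrite_rails.
def declare_anaphor(block_source)
  body = block_source.strip.sub(/\A\{/, "").sub(/\}\z/, "").strip
  return block_source if body.start_with?("|") # parameters already declared

  # Pick the first anaphor that appears as a bare word in the body.
  anaphor = %w[it its _].find do |name|
    body.match?(/(?<![\w.])#{Regexp.escape(name)}(?!\w)/)
  end

  anaphor ? "{ |#{anaphor}| #{body} }" : block_source
end

rewritten = declare_anaphor("{ 1 + it * 2 }")
result    = eval("(1..10).map #{rewritten}")
```

Here `rewritten` is the string `{ |it| 1 + it * 2 }`, which is ordinary Ruby with an explicitly declared parameter, so evaluating it needs no proxy and has none of the receiver or evaluation-order gotchas.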

Needless to say, this is a very heavyweight approach to implementing block anaphora, although the result is semantically much cleaner. It's best considered a proof of concept, a pointer towards what could be done if the people governing the Ruby language want to consider baking anaphora directly into the interpreter.

## Speculative Digression: Anaphors for conditionals

Many people are familiar with the [andand gem](http://github.com/raganwald/andand "raganwald's andand at master - GitHub"). Say you want to write some code like this:

    big_long_calculation() && big_long_calculation().foo

Most of the time you ought to "cache" the big long calculation in a temporary variable like this:

    (it = big_long_calculation()) && it.foo

That's such a common idiom, #andand gives you a much more succinct way to write it:

    big_long_calculation().andand.foo
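
As a rough illustration of how a method along these lines could work, consider the sketch below. It rests on assumptions and is not the andand gem's source: `guarded` and `NilAbsorber` are names invented for this example, and only `nil` gets special treatment.

```ruby
# Hypothetical sketch; not the andand gem's implementation.
class NilAbsorber < BasicObject
  # Whatever message arrives, answer nil instead of raising NoMethodError.
  def method_missing(*)
    nil
  end
end

class Object
  # Non-nil receivers simply answer themselves.
  def guarded
    self
  end
end

class NilClass
  # nil answers a proxy that swallows the next message.
  def guarded
    NilAbsorber.new
  end
end

present = "reginald".guarded.capitalize
absent  = nil.guarded.capitalize
```

The calculation is evaluated exactly once, and the message after `guarded` is either delivered to the result or quietly dropped.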

So the idea behind #andand is to express, in a very compact way, a test for nil followed by doing something with the result if it is not nil. This is not a new idea. Paul Graham gives this very example when describing the rationale for [anaphoric macros](http://www.bookshelf.jp/texi/onlisp/onlisp_15.html "Onlisp: Anaphoric Macros"):

> It's not uncommon in a Lisp program to want to test whether an expression returns a non-nil value, and if so, to do something with the value. If the expression is costly to evaluate, then one must normally do something like this:

    (let ((result (big-long-calculation)))
      (if result
          (foo result)))

> Wouldn't it be easier if we could just say, as we would in English:

    (if (big-long-calculation)
        (foo it))

> In natural language, an anaphor is an expression which refers back in the conversation. The most common anaphor in English is probably "it," as in "Get the wrench and put it on the table." Anaphora are a great convenience in everyday language--imagine trying to get along without them--but they don't appear much in programming languages. For the most part, this is good. Anaphoric expressions are often genuinely ambiguous, and present-day programming languages are not designed to handle ambiguity.

With an anaphoric macro, the anaphor "it" is bound to the result of the `if` expression's test clause, so you can express "test for nil and do something with the result if it is not nil" in a compact way.

### Anaphors for conditionals in Ruby?

Reading about Lisp's anaphoric macros made me wonder whether anaphora for conditionals would work in Ruby. I find `(it = big_long_calculation()) && it.foo` cluttered and ugly, but perhaps I could live without #andand if I could write things like:

    if big_long_calculation() then it.foo end

This is relatively easy to accomplish using [rewrite_rails](http://github.com/raganwald-deprecated/rewrite_rails "raganwald's rewrite_rails at master - GitHub"). In the most naïve case, you want to rewrite all of your `if` statements such that:

    if big_long_calculation()
      it.foo
    end

Becomes:

    if (it = big_long_calculation())
      it.foo
    end

You can embellish such a hypothetical rewriter with optimizations such as not assigning `it` unless there is a variable reference somewhere in the consequent or alternate clauses and so forth, but the basic implementation is straightforward.

The trouble with this idea is that in Ruby, *There Is More Than One Way To Do It* (for any value of "it"). If we implement anaphora for conditionals, we ought to implement them for all of the ways a Ruby programmer might write a conditional. As discussed, we must support:

    if big_long_calculation()
      it.foo
    end

Luckily, that's the exact same thing as:

    if big_long_calculation() then it.foo end

They are both parsed into the exact same abstract syntax tree expression. Good. Now what about this case:

    it.foo if big_long_calculation()

That doesn't read properly. The anaphor should follow the subject, not precede it. If we want our anaphora to read sensibly, we really want to write:

    big_long_calculation().foo if it # or
    big_long_calculation().foo unless it.nil?

These read more naturally, but supporting these expressions would invite Yellow Edge Case Cranial Headache or "YECCH." Behind the scenes, Ruby parses both of the following expressions identically:

    big_long_calculation().foo unless it.nil? # and
    unless it.nil?
      big_long_calculation().foo
    end

So you would have to have a rule that if the anaphor appears in the test expression, it refers to something from the consequent expression, not from any preceding test expression. But if you tried that rule, how would you handle this code?

    if calculation_that_might_return_a_foobar()
      if it.kind_of?(Foobar)
        number_of_foobars += 1
      end
    end

This doesn't work as expected because the anaphor would refer forward to its consequent expression `number_of_foobars += 1` rather than backwards to the enclosing test expression `calculation_that_might_return_a_foobar()`. You can try to construct some rules for disambiguating things, but you're going to end up asking programmers to memorize the implementation of how things actually work rather than relying on familiarity with how anaphora work in English.

Another problem with supporting `big_long_calculation().foo unless it.nil?` is that we now need some rules to figure out that the anaphor refers to `big_long_calculation()` and not to `big_long_calculation().foo`. Whatever arbitrary rules we pick are going to introduce ambiguity. What shall we do about:

    big_long_calculation().foo unless it.nil?
    big_long_calculation().foo.bar unless it.nil?
    big_long_calculation() + 3 unless it.nil?
    3 + big_long_calculation() unless it.nil?
    big_long_calculation(3) unless it.nil?
    big_long_calculation(foo()) unless it.nil?
    big_long_calculation(foo(bar())) unless it.nil?

In my opinion, if we can't find clean and easy to understand support for writing conditionals as suffixes, we aren't supporting Ruby conditionals. To underscore the difficulty, let's also remember that Ruby programmers idiomatically use operators to express conditional expressions. Given:

    big_long_calculation() && big_long_calculation().foo

We want to write:

    big_long_calculation() && it.foo

This is near and dear to my heart: The name "andand" comes from this exact formulation. #andand doesn't enhance an `if` expression, it enhances the double ampersand operator. One can see at a glance that implementing support for `big_long_calculation() && it.foo` is fraught with perils. What about `big_long_calculation() + it.foo`? What about `big_long_calculation().bar && it.foo`?

It seems that it is much harder to support anaphora for conditionals in Ruby than it is to support anaphora for conditionals in Lisp. This isn't surprising. Lisp has an extremely regular lack of syntax, so we don't have to concern ourselves with as many cases as we do in Ruby.

## Summing "it" Up

Anaphora allow us to abbreviate code, hiding parameters and temporary variables for certain special cases. This can be a win for readability for short code snippets where the extra verbiage is almost as long as what you're trying to express. That being said, implementing anaphora in Ruby is a hard design problem, in part because There Is More Than One Way To Do It, and trying to provide complete support leads to ambiguities, inconsistencies, and conflicts. And old school anaphora? They are clearly an acquired taste.

## More to read

* [String#to\_proc](http://github.com/raganwald/homoiconic/blob/master/2008-11-28/you_cant_be_serious.md "You can't be serious!?") and its original [blog post](http://weblog.raganwald.com/2007/10/stringtoproc.html "String#to_proc").
* [Methodphitamine](http://github.com/jicksta/methodphitamine "jicksta's methodphitamine at master - GitHub") and its original [blog post](http://jicksta.com/posts/the-methodphitamine "The Methodphitamine at Adhearsion Blog by Jay Phillips")
* [Anaphoric macros](http://www.bookshelf.jp/texi/onlisp/onlisp_15.html "Onlisp: Anaphoric Macros")
* [rewrite_rails](http://github.com/raganwald-deprecated/rewrite_rails "raganwald's rewrite_rails at master - GitHub") contains an improved implementation of String#to\_proc.
* A [usenet discussion](http://groups.google.com/group/ruby-talk-google/browse_thread/thread/26445dcef22f5a5/1772d0c487d4c570?hl=en&amp;lnk=ol&amp; "Introducing the &quot;it&quot; keyword") about anaphora in Ruby.
* [@RobertFischer](http://twitter.com/RobertFischer "Robert Fischer") pointed out that Groovy implements Block Anaphora using exactly the same syntax as rewrite\_rails, as well as mentioning that Groovy provides a special operator, `?.`, for the Maybe Monad.
* Perl has some [anaphora of its own](http://www.wellho.net/mouth/969_Perl-and-.html "Perl - $_ and @_").

---

My recent work:

![](http://i.minus.com/iL337yTdgFj7.png)[![JavaScript Allongé](http://i.minus.com/iW2E1A8M5UWe6.jpeg)](http://leanpub.com/javascript-allonge "JavaScript Allongé")![](http://i.minus.com/iL337yTdgFj7.png)[![CoffeeScript Ristretto](http://i.minus.com/iMmGxzIZkHSLD.jpeg)](http://leanpub.com/coffeescript-ristretto "CoffeeScript Ristretto")![](http://i.minus.com/iL337yTdgFj7.png)[![Kestrels, Quirky Birds, and Hopeless Egocentricity](http://i.minus.com/ibw1f1ARQ4bhi1.jpeg)](http://leanpub.com/combinators "Kestrels, Quirky Birds, and Hopeless Egocentricity")

* [JavaScript Allongé](http://leanpub.com/javascript-allonge), [CoffeeScript Ristretto](http://leanpub.com/coffeescript-ristretto), and my [other books](http://leanpub.com/u/raganwald).
* [allong.es](http://allong.es), practical function combinators and decorators for JavaScript.
* [Method Combinators](https://github.com/raganwald/method-combinators), a CoffeeScript/JavaScript library for writing method decorators, simply and easily.
* [jQuery Combinators](http://github.com/raganwald/jquery-combinators), what else? A jQuery plugin for writing your own fluent, jQuery-like code.

---

[Reg Braithwaite](http://braythwayt.com) | [@raganwald](http://twitter.com/raganwald)