Regex ("Like a Knife")
Yea, I know what you are going to say, "You just reimplemented sed!" And
you would be absolutely correct. But! There is a little bit more to this story.
Sed is not a "Language 2.0" tool (i.e. "post-Ruby"). And what I wanted is a
command-line tool that is both a bit easier to use and a bit more flexible.
Now I could have written this in Perl. I'm sure it would be just as good, if not better, since Perl's regular expression engine rocks, or so I hear. But Ruby's is pretty damn good too, and getting better (with 1.9+). And since I know Ruby very well, well, that's what you get.
In brief, usage simply entails supplying a regular expression and a list of files
to be searched to the regex command.
$ regex '/=begin.*?\n(.*)\n=end/' sample.rb
This example does exactly what you would expect. It returns the content between
the first =begin ... =end clause it comes across. To see all such
block comments, as you would expect, you can add the g
expression mode flag.
$ regex '/=begin.*?\n(.*)\n=end/g' sample.rb
Alternatively you can use the -g option.
$ regex -g '/=begin.*?\n(.*)\n=end/' sample.rb
Notice that in all these examples we have used single quotes to wrap the
regular expression. This is to prevent the shell from expanding or
interpreting special characters in the expression.
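To see what the pattern in these examples captures, here is a minimal sketch in plain Ruby. It does not call regex itself. It assumes that a single search corresponds to String#match and that the g flag behaves like String#scan, which is an assumption about the tool, not something taken from its source. The sample text is made up for illustration.

```ruby
# Hypothetical stand-in for the contents of sample.rb.
SAMPLE = <<~TEXT
  =begin
  first note
  =end
  x = 1
  =begin
  second note
  =end
TEXT

# The same pattern used in the command-line examples.
PATTERN = /=begin.*?\n(.*)\n=end/

# First match only: roughly what a non-repeating search would find.
first = SAMPLE.match(PATTERN)
puts first[1]

# All matches: an assumed analogue of the g flag.
# With one capture group, scan returns an array of one-element arrays.
all = SAMPLE.scan(PATTERN)
p all
```

Note that in plain Ruby the dot does not match a newline unless the m flag is given, so in this sketch the pattern only captures single-line comment bodies. Whether regex applies different defaults is not covered here.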
By default regex produces string output. Regular expression groups are delimited by ASCII character 29 (octal 035, hex 1D), GROUP SEPARATOR, and repeat matches are delimited by ASCII character 30 (octal 036, hex 1E), RECORD SEPARATOR.
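Since the separators are ordinary control characters, the string output can be pulled apart with a few lines of Ruby. The sketch below assumes the layout just described (ASCII 29 between groups, ASCII 30 between repeat matches). The raw string is built by hand for illustration, not captured from an actual run, and parse_regex_output is a hypothetical helper name.

```ruby
GROUP_SEP  = 29.chr  # "\x1D", between groups within one match
RECORD_SEP = 30.chr  # "\x1E", between repeat matches

# Hypothetical helper: split text output into an array of matches,
# each match being an array of group strings.
def parse_regex_output(text)
  text.split(RECORD_SEP).map { |record| record.split(GROUP_SEP) }
end

# Hand-built example: two matches with two groups each.
raw = "a1#{GROUP_SEP}a2#{RECORD_SEP}b1#{GROUP_SEP}b2"
p parse_regex_output(raw)
```

The result has the same nested shape as the structured output formats: one inner array per match.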
Instead of string output, regex also supports YAML and JSON formats. For YAML, use the -y option.
$ regex -y -g '/=begin.*?\n(.*)\n=end/' sample.rb
In this case the matches are returned as an array of arrays.
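To picture what an array of arrays looks like in each format, here is a small sketch using Ruby's standard yaml and json libraries. The match data is made up, and the exact formatting regex emits may differ from what these libraries produce.

```ruby
require 'yaml'
require 'json'

# Made-up match data: two matches, one group each.
matches = [["first note"], ["second note"]]

yaml_text = matches.to_yaml
json_text = JSON.generate(matches)

puts yaml_text
puts json_text

# Reading either format back yields the same nested arrays.
from_yaml = YAML.safe_load(yaml_text)
from_json = JSON.parse(json_text)
```

Either format can be handed to another program without having to deal with control characters.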
To get more information than just the match results use the
Also, we can do without the / / delimiters on the regular
expression if we use the --search/-s option instead. Going back to
our first example:
$ regex -s '=begin.*?\n(.*)\n=end' sample.rb
To replace text, use the -r option.
$ regex --yaml --repeat -s 'Tom' -r 'Bob' sample.rb
This will replace every occurrence of "Tom" with "Bob" in the
sample.rb file. By default regex will back up any file it changes
by adding a .bak extension to the original copy.
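The replace-and-back-up behavior can be sketched in plain Ruby as follows. This is not the tool's own implementation, only an illustration of the idea. The helper name replace_with_backup is hypothetical, and skipping the backup when nothing changes is an assumption.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: replace every match of `pattern` in the file at
# `path`, keeping the original content in "<path>.bak".
# Returns true if the file was changed, false otherwise.
def replace_with_backup(path, pattern, replacement)
  original = File.read(path)
  updated  = original.gsub(pattern, replacement)
  return false if updated == original  # nothing to change, no backup made

  FileUtils.cp(path, "#{path}.bak")    # keep the original copy
  File.write(path, updated)
  true
end

# Try it on a throwaway file so no real file is touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.rb")
  File.write(path, "Tom said hi to Tom\n")
  replace_with_backup(path, /Tom/, "Bob")
  puts File.read(path)           # the rewritten file
  puts File.read("#{path}.bak")  # the untouched original
end
```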
Check out the
--help output and I am sure the rest will be smooth sailing.
But if you want more information, then do us the good favor of jumping over
to the wiki. Feel free to add
additional information there to help others.
As mentioned above, regex has three output modes: YAML, JSON and standard text. The standard text output is unique in that it uses special ASCII characters to separate matches and regex groups. ASCII 30, called the record separator, is used to separate repeat matches. ASCII 29, called the group separator, is used to separate regular expression groups.
The project is maturing but still a touch wet behind the ears. So don't be too surprised if it doesn't have every feature under the sun just yet, or if not every detail works absolutely peachy. But hey, if something needs fixing or a feature needs adding, well then get in there and send us a patch. Open source software is built on TEAM WORK, right?
Copyright © 2010 Rubyworks
Regex is licensed under the terms of the FreeBSD license.
See LICENSE.txt file for details.