Commit a9bb48a

Add 17: "Stream Editing"
1 parent 2a5a60b commit a9bb48a

1 file changed

Lines changed: 117 additions & 0 deletions

@@ -0,0 +1,117 @@
1+
---
2+
title: Stream Editing
3+
date: 2015-05-17
4+
tags: cli-options, golf, globals, one-liner
5+
---
6+
7+
One of Ruby's goals was to replace popular Unix *stream editors* like `awk` or `sed`, both of which manipulate files line by line. Ruby has the `-n` option for this:
8+
9+
Causes Ruby to assume the following loop around your script, which makes it
10+
iterate over file name arguments somewhat like sed -n or awk.
11+
12+
while gets
13+
...
14+
end
15+
16+
And its sibling `-p`:
17+
18+
Acts mostly same as -n switch, but print the value of variable $_ at the each
19+
end of the loop.
20+
For example:
21+
22+
% echo matz | ruby -p -e '$_.tr! "a-z", "A-Z"'
23+
MATZ
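To make the implicit loop more tangible, here is a rough sketch of what the `-p` example above amounts to when written out by hand. The method name `upcase_stream` and the use of `StringIO` as stand-ins for the real input and output are assumptions made for illustration; Ruby itself works on `$_` and `ARGF`.

```ruby
require "stringio"

# Hand-written approximation of: ruby -p -e '$_.tr! "a-z", "A-Z"'
# `upcase_stream` is a hypothetical helper, not part of Ruby.
def upcase_stream(input, output)
  while (line = input.gets)   # the loop that -n / -p wrap around the script
    line.tr!("a-z", "A-Z")    # the script body given to -e
    output.print(line)        # the extra print that -p adds
  end
end

out = StringIO.new
upcase_stream(StringIO.new("matz\n"), out)
puts out.string
```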
24+
25+
What you need to know is that the [special global variable](http://idiosyncratic-ruby.com/9-globalization.html) `$_` contains the last read input. When using `-n` or `-p`, this usually means the current line. Another thing to keep in mind: `gets` reads from [`ARGF`](http://readruby.io/io#argf), not directly from `STDIN`, so you can pass arguments that will be interpreted as the names of the files to process (without arguments, `ARGF` falls back to standard input). Equipped with this knowledge, you can build a very basic example, which just prints out the given file:
26+
27+
$ ruby -ne 'print $_' filename
28+
29+
Since `print` without arguments implicitly prints out `$_`, this can be shortened to:
30+
31+
$ ruby -ne 'print' filename
32+
33+
If one uses `-p` instead of `-n`, no code is required, because `-p` will call `print` implicitly:
34+
35+
$ ruby -pe '' filename
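All three commands behave like `cat` because the input comes from `ARGF`, which walks through the file names found in `ARGV`. The following sketch imitates that from inside a script. It assumes `ARGF` has not been read from earlier in the process, and the temporary file and its contents are made up for the demonstration.

```ruby
require "tempfile"

printed = nil

Tempfile.create("argf-demo") do |file|
  file.write("first\nsecond\n")
  file.flush

  # Pretend the script was started as: ruby script.rb <path of the temp file>
  ARGV.replace([file.path])

  # ARGF treats the files named in ARGV as one continuous input stream
  printed = ARGF.read
end

print printed
```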
36+
37+
Now let's modify each line:
38+
39+
$ ruby -pe '$_.reverse!' filename
40+
41+
This will print out the file with the characters of each line reversed.
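One detail worth knowing: `$_` still carries its trailing newline, so `reverse!` moves that newline to the front of the line. The snippet below shows the effect on a plain string and one possible way around it; the variable names are only illustrative.

```ruby
line = "abc\n"

# Plain reverse: the newline ends up at the start of the string
naive = line.reverse

# Strip the newline first, reverse the rest, then put the newline back
kept_at_end = line.chomp.reverse + "\n"

p naive
p kept_at_end
```

On the command line, combining `-p` with `-l` has a similar effect, since `-l` removes the line ending on input and adds it again on output.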
42+
43+
Here is another example, which will print every line in a random ANSI color:
44+
45+
$ ruby -ne 'print "\e[3#{rand(8)}m#$_"' filename
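The escape sequence `\e[3Xm` selects one of the eight basic ANSI foreground colors. As a small sketch, the hypothetical helper `colorize` below builds the same sequence and additionally appends `\e[0m`, which resets the terminal color afterwards (the one-liner above leaves the last color active).

```ruby
# `colorize` is a hypothetical helper; color is a number from 0 to 7
def colorize(text, color)
  "\e[3#{color}m#{text}\e[0m"
end

red = colorize("hello", 1)
puts red
puts colorize("random color", rand(8))
```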
46+
47+
There is more to assist you in writing these short line manipulation scripts:
48+
49+
## The Ruby One-Liner Toolbox
50+
51+
* CLI Options: `-n` `-p` `-0` `-F` `-a` `-i` `-l`
52+
* Global Variables: `$_` `$/` `$\` `$;` `$F` `$.`
53+
* Methods that operate on `$_` implicitly: `print` `~`
54+
* The special `BEGIN{}` and `END{}` blocks
55+
56+
## Running Code Before or After Processing the Input
57+
58+
You can run code before the loop starts with `BEGIN` and after the loop with `END`. For example, this will count characters:
59+
60+
$ ruby -ne 'BEGIN{ count = 0 }; count += $_.size; END{ print count }' filename
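Written as an ordinary method, the same structure might look like the sketch below. The name `count_characters` and the `StringIO` input are assumptions for the sake of a self-contained example; the comments map each part back to the one-liner.

```ruby
require "stringio"

# Hypothetical long-hand version of the character-counting one-liner
def count_characters(input)
  count = 0                   # BEGIN{ count = 0 }
  while (line = input.gets)   # loop supplied by -n
    count += line.size        # script body
  end
  count                       # value that END{ print count } would print
end

total = count_characters(StringIO.new("ab\ncde\n"))
puts total
```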
61+
62+
## Using Line Numbers
63+
64+
`$.` contains the current line number. A use-case would be counting the lines of a file:
65+
66+
$ ruby -ne 'END{p$.}' filename
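Outside of one-liners, a per-stream counterpart of `$.` is available as `lineno` on IO-like objects. The following sketch uses it to prefix every line with its number; `number_lines` is a hypothetical helper and `StringIO` stands in for a real file.

```ruby
require "stringio"

# Prefix each line with its line number, using #lineno in place of `$.`
def number_lines(input)
  numbered = []
  while (line = input.gets)
    numbered << "#{input.lineno}: #{line}"
  end
  numbered
end

numbered = number_lines(StringIO.new("a\nb\nc\n"))
puts numbered
```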
67+
68+
## String Matching
69+
70+
Now let's do some conditional processing: Only print a line if it contains a digit:
71+
72+
$ ruby -ne 'print if ~/\d/' filename
73+
74+
The message to take away: The `~` method implicitly matches the regex against `$_`.
75+
76+
But it gets even better:
77+
78+
$ ruby -ne 'print if /\d/' filename
79+
80+
You thought a condition with a truthy value will always execute its `if`-branch? It will not if the truthy value is a regex literal that does not match: a bare regex literal in a condition is implicitly matched against `$_`!
81+
82+
This also works when using the ternary operator for conditions:
83+
84+
$ ruby -ne 'puts "#$.: #{ /\d/ ? "first digit: #$&" : "no digit" }"' filename
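For comparison, here is a sketch of the same checks with the match against each line spelled out explicitly. The helper names `lines_with_digit` and `describe` are invented for this example.

```ruby
# Explicit version of: ruby -ne 'print if /\d/'
def lines_with_digit(text)
  text.each_line.select { |line| line =~ /\d/ }
end

# Explicit version of the ternary example; match[0] plays the role of $&
def describe(line)
  match = line.match(/\d/)
  match ? "first digit: #{match[0]}" : "no digit"
end

puts lines_with_digit("a1\nb\nc3\n")
puts describe("x7y8")
puts describe("none")
```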
85+
86+
## In-Place Editing of Files
87+
88+
Using the `-i` option, you can modify files directly (just like `sed`'s `-i` mode). For example, removing all trailing spaces:
89+
90+
$ ruby -ne 'puts $_.rstrip' -i filename
91+
92+
Like in `sed`, you can provide a file extension to the `-i` option which will be used to create a backup file before processing:
93+
94+
$ ruby -pe '$_.upcase!' -i.original filename
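Conceptually, in-place editing with a backup extension boils down to "keep a copy, then overwrite the original with the processed lines". The sketch below illustrates that idea only; it is not how Ruby implements `-i` internally, and `upcase_in_place` as well as the file names are hypothetical.

```ruby
require "fileutils"
require "tmpdir"

# Rough imitation of: ruby -pe '$_.upcase!' -i.original filename
def upcase_in_place(path, backup_ext)
  FileUtils.cp(path, path + backup_ext)        # backup copy, as with -i.original
  processed = File.readlines(path).map(&:upcase)
  File.write(path, processed.join)             # overwrite the original file
end

edited = nil
backup = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  File.write(path, "hello\nworld\n")

  upcase_in_place(path, ".original")

  edited = File.read(path)
  backup = File.read(path + ".original")
end

puts edited
puts backup
```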
95+
96+
## Auto-splitting Lines
97+
98+
The `-a` option will run `$F = $_.split` for every line:
99+
100+
$ ruby -nae 'puts $F.reverse.join(" ")' filename
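Spelled out without the `-a` magic, the one-liner does roughly the following for each line. `reverse_fields` is a hypothetical helper and the local variable `fields` takes the place of `$F`.

```ruby
# Explicit version of: ruby -nae 'puts $F.reverse.join(" ")'
def reverse_fields(line)
  fields = line.split          # what -a stores in $F (splits on whitespace)
  fields.reverse.join(" ")
end

puts reverse_fields("one two three\n")
```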
101+
102+
## Specify Line Format
103+
104+
You might not always want to use `\n` as the character that separates lines. Fortunately, Ruby has [record separators](http://idiosyncratic-ruby.com/16-changing-the-rules.html#change-a-global-default-separator), and you can set some of them via command-line options:
105+
106+
Option | Variable | Description
107+
-------|-----------|------------
108+
`-0` | `$/` | Sets the *input record separator*, which is used by `Kernel#gets`. Character to use must be given as [octal number](http://en.wikipedia.org/wiki/Octal). If no number is given (`-0`), it will use null bytes as separator. Using `-0777` will read in the whole file at once. Another special value is `-00`, which will set `$/` to `"\n\n"` (paragraph mode).
109+
`-F` | `$;` | Sets the *input field separator*, which is used by `String#split`. Useful in combination with the `-a` option.
110+
`-l` | `$\` | Sets the *output record separator* to the value of the *input record separator* (`$/`). Also runs [String#chop!](http://ruby-doc.org/core-2.2.2/String.html#method-i-chop-21) on every line!
111+
{:.table-10-10-X}
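The effect of different separators can be tried out on plain strings, without touching the globals. In this sketch the separator is passed explicitly to `String#each_line` and `String#split`; the sample data is made up, and an empty separator string is used to request paragraph-wise reading.

```ruby
# Null-byte separated records, similar in spirit to -0
null_records = "a\0b\0".each_line("\0").map { |record| record.chomp("\0") }

# Paragraph-wise reading, similar in spirit to -00
paragraphs = "a\nb\n\nc\n".each_line("").map(&:strip)

# Custom field separator, similar in spirit to -F: combined with -a
fields = "x:y:z".split(":")

p null_records
p paragraphs
p fields
```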
112+
113+
## Further Reading
114+
115+
- [sed](https://en.wikipedia.org/wiki/Sed)
116+
- [un](http://idiosyncratic-ruby.com/6-run-ruby-run.html)
117+
- [pru](https://github.com/grosser/pru)
