This repository was archived by the owner on Jun 5, 2024. It is now read-only.

Commit 0b0242e

examples for set and duplicates
1 parent a0e26eb commit 0b0242e

1 file changed: +38 -1 lines changed

ruby_one_liners.md

Lines changed: 38 additions & 1 deletion
@@ -26,6 +26,7 @@
 * [Modifiers](#modifiers)
 * [Code in replacement section](#code-in-replacement-section)
 * [Quoting metacharacters](#quoting-metacharacters)
+* [Dealing with duplicates](#dealing-with-duplicates)
 
 <br>
 
@@ -1338,7 +1339,7 @@ i*(t+9-g)/8,4-a+b
 
 $ # since + is a metacharacter, no match found
 $ # note that #{} allows interpolation
-$ s='a+b' ruby -ne 'print if /^#{ENV["s"]}/' eqns.txt
+$ s='a+b' ruby -ne 'print if /#{ENV["s"]}/' eqns.txt
 
 $ # same as: s='a+b' perl -ne 'print if /\Q$ENV{s}/' eqns.txt
 $ s='a+b' ruby -ne 'print if /#{Regexp.escape(ENV["s"])}/' eqns.txt
@@ -1352,6 +1353,42 @@ a+b,pi=3.14,5e12
 i*(t+9-g)/8,4-a**b
 ```
 
+<br>
+
+## <a name="dealing-with-duplicates"></a>Dealing with duplicates
+
+* retain only the first copy of duplicates
+* the `-r` command line option specifies a library to require
+* here, the `set` data type is used to keep track of unique values, be it the whole line or a particular field
+* the `add?` method adds the element to the `set` and returns `nil` if the element already exists
+* See [ruby-doc add?(o)](https://ruby-doc.org/stdlib-2.5.0/libdoc/set/rdoc/Set.html#method-i-add-3F) for syntax details
+
+```bash
+$ cat duplicates.txt
+abc 7 4
+food toy ****
+abc 7 4
+test toy 123
+good toy ****
+
+$ # whole line, same as: perl -ne 'print if !$seen{$_}++' duplicates.txt
+$ ruby -rset -ne 'BEGIN{s=Set.new}; print if s.add?($_)' duplicates.txt
+abc 7 4
+food toy ****
+test toy 123
+good toy ****
+
+$ # particular column, same as: perl -ane 'print if !$seen{$F[1]}++'
+$ ruby -rset -ane 'BEGIN{s=Set.new}; print if s.add?($F[1])' duplicates.txt
+abc 7 4
+food toy ****
+
+$ # total count, same as: perl -lane '$c++ if !$seen{$F[1]}++; END{print $c}'
+$ ruby -rset -ane 'BEGIN{s=Set.new}; s.add($F[1]);
+END{puts s.length}' duplicates.txt
+2
+```
+
 
 
 
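
Note on the second hunk: the difference between interpolating a string as-is and passing it through `Regexp.escape` can be checked outside the shell with a short script. This is only a sketch and not part of the commit. The helper name `matching_lines` is hypothetical, and the sample lines are just the `eqns.txt` lines visible in the diff context, not necessarily the full file.

```ruby
# Hypothetical helper (not part of the commit): return the lines matching
# `search`, treated either as a raw regex fragment or as literal text.
def matching_lines(lines, search, literal:)
  pattern = literal ? Regexp.new(Regexp.escape(search)) : Regexp.new(search)
  lines.grep(pattern)
end

# Assumed sample data: only the lines visible in the diff context.
sample = ['a+b,pi=3.14,5e12', 'i*(t+9-g)/8,4-a+b', 'i*(t+9-g)/8,4-a**b']

# Unescaped: `a+b` means "one or more a, then b", so none of these lines match.
p matching_lines(sample, 'a+b', literal: false)

# Escaped: `+` is matched literally, so lines containing the text a+b match.
p matching_lines(sample, 'a+b', literal: true)
```

`Regexp.escape` plays the role that `\Q` plays in the Perl equivalent quoted in the diff.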

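
Note on the new "Dealing with duplicates" section: the `add?` filter used in the one-liners can also be sketched as a plain Ruby script, which makes the return-value behaviour easier to see. This is an illustration only and not part of the commit. The helper name `first_occurrences` is hypothetical, and the sample array simply copies the lines of `duplicates.txt` shown in the diff.

```ruby
require 'set'

# Hypothetical helper (not part of the commit): keep the first line seen for
# each key. Without a block the whole line is the key; with a block the key
# is whatever the block returns for that line.
def first_occurrences(lines)
  seen = Set.new
  # Set#add? returns the set itself (truthy) when the element is new,
  # and nil when the element was already present.
  lines.select { |line| seen.add?(block_given? ? yield(line) : line) }
end

# Sample data copied from the duplicates.txt listing in the diff.
lines = ['abc 7 4', 'food toy ****', 'abc 7 4', 'test toy 123', 'good toy ****']

# Whole line as the key.
puts first_occurrences(lines)

# Second whitespace-separated field as the key, like $F[1] with -a.
puts first_occurrences(lines) { |line| line.split[1] }

# Count of unique second fields, like the add/length one-liner.
puts lines.map { |line| line.split[1] }.to_set.length
```

The last line uses `to_set` instead of repeated `add` calls. Both arrive at the same count, since a set stores each value once.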