2626 * [Modifiers](#modifiers)
2727 * [Code in replacement section](#code-in-replacement-section)
2828 * [Quoting metacharacters](#quoting-metacharacters)
29+ * [Dealing with duplicates](#dealing-with-duplicates)
2930
3031<br>
3132
@@ -1338,7 +1339,7 @@ i*(t+9-g)/8,4-a+b
13381339
13391340$ # since + is a metacharacter, no match found
13401341$ # note that #{} allows interpolation
1341- $ s='a+b' ruby -ne 'print if /^#{ENV["s"]}/' eqns.txt
1342+ $ s='a+b' ruby -ne 'print if /#{ENV["s"]}/' eqns.txt
13421343
13431344$ # same as: s='a+b' perl -ne 'print if /\Q$ENV{s}/' eqns.txt
13441345$ s='a+b' ruby -ne 'print if /#{Regexp.escape(ENV["s"])}/' eqns.txt
@@ -1352,6 +1353,42 @@ a+b,pi=3.14,5e12
13521353i*(t+9-g)/8,4-a**b
13531354```
13541355
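The difference between plain interpolation and `Regexp.escape` can also be checked outside a one-liner. This is a minimal sketch; the variable names `s` and `line` are illustrative, and the sample line is taken from the `eqns.txt` output shown above.

```ruby
s = 'a+b'
line = 'i*(t+9-g)/8,4-a+b'

# Plain interpolation: + acts as a quantifier, so the pattern means
# "one or more a, followed by b", which this line does not contain
puts line.match?(/#{s}/)

# Regexp.escape backslash-escapes the metacharacter before interpolation,
# so the pattern matches the literal text a+b
puts Regexp.escape(s)
puts line.match?(/#{Regexp.escape(s)}/)
```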
1356+ <br>
1357+
1358+ ## <a name="dealing-with-duplicates"></a>Dealing with duplicates
1359+
1360+ * retain only the first copy of duplicates
1361+ * the `-r` command line option specifies a library to require
1362+ * here, the `Set` data type is used to keep track of unique values, be it the whole line or a particular field
1363+ * the `add?` method adds an element to the set and returns `nil` if the element already exists
1364+ * see [ruby-doc add?(o)](https://ruby-doc.org/stdlib-2.5.0/libdoc/set/rdoc/Set.html#method-i-add-3F) for syntax details
1365+
1366+ ```bash
1367+ $ cat duplicates.txt
1368+ abc 7 4
1369+ food toy ****
1370+ abc 7 4
1371+ test toy 123
1372+ good toy ****
1373+
1374+ $ # whole line, same as: perl -ne 'print if !$seen{$_}++' duplicates.txt
1375+ $ ruby -rset -ne 'BEGIN{s=Set.new}; print if s.add?($_)' duplicates.txt
1376+ abc 7 4
1377+ food toy ****
1378+ test toy 123
1379+ good toy ****
1380+
1381+ $ # particular column, same as: perl -ane 'print if !$seen{$F[1]}++'
1382+ $ ruby -rset -ane 'BEGIN{s=Set.new}; print if s.add?($F[1])' duplicates.txt
1383+ abc 7 4
1384+ food toy ****
1385+
1386+ $ # total count, same as: perl -lane '$c++ if !$seen{$F[1]}++; END{print $c}'
1387+ $ ruby -rset -ane 'BEGIN{s=Set.new}; s.add($F[1]);
1388+ END{puts s.length}' duplicates.txt
1389+ 2
1390+ ```
1391+
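For use in a script rather than on the command line, the same idea can be written as a small method. This is only a sketch: `first_copies` and its `field` parameter are hypothetical names introduced here, and the sample lines mirror `duplicates.txt`. It relies on `add?` returning the set itself (truthy) for a new element and `nil` for an existing one.

```ruby
require 'set'

# Hypothetical helper: keep only the first line seen for each key.
# With field = nil the whole line is the key; otherwise the
# whitespace-separated field at that index is used (like $F[field]).
def first_copies(lines, field = nil)
  seen = Set.new
  lines.select do |line|
    key = field ? line.split[field] : line
    seen.add?(key) # nil (falsy) when key was already in the set
  end
end

lines = ['abc 7 4', 'food toy ****', 'abc 7 4', 'test toy 123', 'good toy ****']

puts first_copies(lines)     # deduplicate on the whole line
puts first_copies(lines, 1)  # deduplicate on the second field
```

The set never stores a key twice, so `seen.length` after the loop would give the count of unique keys, which is what the `END{puts s.length}` one-liner prints.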
13551392
13561393
13571394