@@ -211,7 +211,7 @@ I bought two bananas and three mangoes
211211* one way is to use `variable =~ /REGEXP/FLAGS` to check for a match
212212 * use `variable !~ /REGEXP/FLAGS` for negated match
213213 * by default acts on `$_` if variable is not specified
214- * see [ruby-doc: Regexp](https://ruby-doc.org/core-2.5.0/Regexp.html) for regular expression details
214+ * see [ruby-doc: Regexp](https://ruby-doc.org/core-2.5.0/Regexp.html) for documentation
215215* as we need to print only selective lines, use `-n` option
216216 * by default, contents of `$_` will be printed if no argument is passed to `print`
217217
925925
926926* assuming that you are already familiar with the basics of regular expressions
927927 * if not, check out [Ruby Regexp](https://leanpub.com/rubyregexp) ebook - step by step guide from beginner to advanced levels
928- * examples/descriptions based only on ASCII encoding
929- * See [ruby-doc: Regexp](https://ruby-doc.org/core-2.5.0/Regexp.html) for syntax and feature details
928+ * examples/descriptions are for strings containing ASCII characters only
929+ * See [ruby-doc: Regexp](https://ruby-doc.org/core-2.5.0/Regexp.html) for documentation
930930* See [rexegg ruby](https://www.rexegg.com/regex-ruby.html) for a bit of ruby regexp history and differences with other regexp engines
931931
932932<br >
@@ -960,9 +960,9 @@ $ echo 'foo,baz,,xyz,,,123' | ruby -lpe 'gsub(/[^,]*/, "A")'
960960AA,AA,A,AA,A,A,AA
961961
962962$ # one workaround is to use lookarounds (covered later)
963- $ echo ' ,baz,,xyz,,,' | ruby -lpe ' gsub(/(?<=^|,)[^,]*(?=,|$) /, "A")'
963+ $ echo ' ,baz,,xyz,,,' | ruby -lpe ' gsub(/(?<=^|,)[^,]*/, "A")'
964964A,A,A,A,A,A,A
965- $ echo ' foo,baz,,xyz,,,123' | ruby -lpe ' gsub(/(?<=^|,)[^,]*(?=,|$) /, "A")'
965+ $ echo ' foo,baz,,xyz,,,123' | ruby -lpe ' gsub(/(?<=^|,)[^,]*/, "A")'
966966A,A,A,A,A,A,A
967967```
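The difference between the plain and the lookbehind-based substitution can also be checked outside the shell. A minimal sketch, assuming the sample line from the example above without its trailing newline; the variable names `naive` and `anchored` are illustrative only:

```ruby
# Sample input taken from the example above (no trailing newline here).
line = 'foo,baz,,xyz,,,123'

# [^,]* can also match the empty string right after a non-empty field,
# so every non-empty field ends up with two replacements.
naive = line.gsub(/[^,]*/, 'A')

# The lookbehind only lets a match start at the beginning of a line
# or immediately after a comma, which rules out those extra empty matches.
anchored = line.gsub(/(?<=^|,)[^,]*/, 'A')

puts naive     # AA,AA,A,AA,A,A,AA
puts anchored  # A,A,A,A,A,A,A
```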
968968
@@ -1007,7 +1007,10 @@ $ seq 14 | ruby -ne 'print if /2\n\z/'
100710072
1008100812
10091009
1010- $ # without newline at end of line, both \z and \Z will give same result
1010+ $ # without newline at end of line, both \z and \Z will behave the same
1011+ $ seq 14 | ruby -lne ' print if /2\z/'
1012+ 2
1013+ 12
10111014```
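The `\z` versus `\Z` distinction shown above can be sketched in plain Ruby as well. This is only an illustration; the variable names are made up for the example, and `match?` is used simply to get a boolean:

```ruby
with_newline    = "12\n"  # what the regexp sees without -l
without_newline = "12"    # what the regexp sees with -l (newline chomped)

# \z matches only at the very end of the string.
p with_newline.match?(/2\z/)     # false
# \Z also matches just before a newline that ends the string.
p with_newline.match?(/2\Z/)     # true

# Once the newline is gone, both anchors agree.
p without_newline.match?(/2\z/)  # true
p without_newline.match?(/2\Z/)  # true
```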
10121015
10131016* delimiters and quoting
@@ -1045,10 +1048,12 @@ $ # \& can also be used instead of \0
10451048
10461049#### <a name="backslash-sequences"></a>Backslash sequences
10471050
1051+ * `\w` for `[A-Za-z0-9_]`
10481052* `\d` for `[0-9]`
10491053* `\s` for `[ \t\r\n\f\v]`
10501054* `\h` for `[0-9a-fA-F]` or `[[:xdigit:]]`
1051- * `\D`, `\S`, `\H`, respectively for their opposites
1055+ * `\W`, `\D`, `\S`, `\H`, respectively for their opposites
1056+ * See also [ruby-doc: scan](https://ruby-doc.org/core-2.5.0/String.html#method-i-scan)
10521057
10531058``` bash
10541059$ # same as: perl -ne 'print if /^[[:xdigit:]]+$/'
@@ -1066,6 +1071,11 @@ $ # note again the use of -l because of newline in input record
10661071$ # same as: perl -lpe 's/\D+/xxx/g'
10671072$ echo ' like 42 and 37' | ruby -lpe ' gsub(/\D+/, "xxx")'
10681073xxx42xxx37
1074+
1075+ $ # get all matches as an array
1076+ $ echo ' tea sea-pit sit' | ruby -ne ' puts $_.scan(/[\w\s]+/)'
1077+ tea sea
1078+ pit sit
10691079```
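For readers who prefer to experiment in a script or `irb`, the backslash sequences and `scan` usage above might be sketched as follows. The input strings come from the examples above (minus the trailing newline), while the variable names are assumptions made for illustration:

```ruby
text = 'tea sea-pit sit'

# \w and \s combined in a character class: runs of word characters and whitespace.
words_and_spaces = text.scan(/[\w\s]+/)
p words_and_spaces  # ["tea sea", "pit sit"]

# \W is the opposite of \w, so this collects the separators instead.
separators = text.scan(/\W+/)
p separators        # [" ", "-", " "]

# \D is the opposite of \d: replace every run of non-digits.
digits_kept = 'like 42 and 37'.gsub(/\D+/, 'xxx')
p digits_kept       # "xxx42xxx37"
```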
10701080
10711081<br >
@@ -1077,11 +1087,11 @@ xxx42xxx37
10771087
10781088``` bash
10791089$ # greedy matching
1080- $ echo ' foo and bar and baz land good' | ruby -pe ' sub(/foo .*and/, "" )'
1081- good
1090+ $ echo ' foo and bar and baz land good' | ruby -lne ' print $_.scan(/ .*and/)'
1091+ [ " foo and bar and baz land " ]
10821092$ # non-greedy matching
1083- $ echo ' foo and bar and baz land good' | ruby -pe ' sub(/foo .*?and/, "" )'
1084- bar and baz land good
1093+ $ echo ' foo and bar and baz land good' | ruby -lne ' print $_.scan(/ .*?and/)'
1094+ [ " foo and " , " bar and" , " baz land" ]
10851095
10861096$ echo ' 12342789' | ruby -pe ' sub(/\d{2,5}/, "")'
10871097789
@@ -1116,7 +1126,6 @@ $ echo '123:42:789:good:5:bad' | ruby -pe 'sub(/:.*:[a-z]/, ":")'
11161126The strings matched by lookarounds, like word boundaries and anchors, do not form part of the matched string. They are termed **zero-width patterns**
11171127
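The zero-width nature of lookarounds can be illustrated with a small sketch. It assumes a sample string of `key=value` pairs similar to the ones used in this section; the variable names are hypothetical:

```ruby
s = 'foo=5, bar=3; x=83, y=120'

# The lookbehind requires '=' before the digits,
# but '=' itself is not part of what gets matched.
values = s.scan(/(?<==)\d+/)
p values       # ["5", "3", "83", "120"]

# Without the lookbehind, '=' is consumed and shows up in each match.
with_equals = s.scan(/=\d+/)
p with_equals  # ["=5", "=3", "=83", "=120"]
```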
11181128* positive lookbehind ` (?<= `
1119- * See also [ ruby-doc: scan] ( https://ruby-doc.org/core-2.5.0/String.html#method-i-scan )
11201129
11211130``` bash
11221131$ s=' foo=5, bar=3; x=83, y=120'
@@ -1211,8 +1220,8 @@ $ echo '1 and 2 and 3 land 4' | ruby -pe 'sub(/(and.*?){2}\Kand/, "-")'
121112201 and 2 and 3 l- 4
12121221```
12131222
1214- * note that `\K` behaves differently than `perl` or `vim`'s `\zs` when it comes to consecutive matches with empty string in between
1215- * `\K` is [not mentioned in documentation](https://bugs.ruby-lang.org/issues/14500), so not sure if this is intended behavior or a bug
1223+ * don't use `\K` if there are consecutive matches
1224+ * this is because of how the regexp engine has been implemented; `perl`'s `\K` and `vim`'s `\zs` don't have this limitation
12161225
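When consecutive matches are possible, a positive lookbehind is one way to avoid relying on `\K`. The following is a sketch rather than a recommendation from the original text; the `\K` result is printed but deliberately left unannotated, since it is the behavior in question and may depend on the Ruby version:

```ruby
input = ',,'

# Lookbehind version: a zero-width match after every comma.
with_lookbehind = input.gsub(/(?<=,)/, 'foo')
puts with_lookbehind  # ,foo,foo

# \K version: compare this output with the lookbehind result above.
with_k = input.gsub(/,\K/, 'foo')
puts with_k
```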
12171226``` bash
12181227$ echo ' ,,' | perl -pe ' s/,\K/foo/g'
@@ -2628,6 +2637,7 @@ $ ruby -e 'nums = %x/seq 3/; print nums'
26282637 * [Ruby one-liners](http://benoithamelin.tumblr.com/ruby1line) based on [awk one-liners](http://www.pement.org/awk/awk1line.txt)
26292638 * [Ruby Tricks, Idiomatic Ruby, Refactorings and Best Practices](https://franzejr.github.io/best-ruby/index.html)
26302639 * [freecodecamp - learning Ruby](https://medium.freecodecamp.org/learning-ruby-from-zero-to-hero-90ad4eecc82d)
2640+ * [Ruby Regexp](https://leanpub.com/rubyregexp) ebook - step by step guide from beginner to advanced levels
26312641 * [regex FAQ on SO](https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean)
26322642* Alternatives
26332643 * [bioruby](https://github.com/bioruby/bioruby)