From e026a7cd771d1276b450e76dd3fd72a384578ef8 Mon Sep 17 00:00:00 2001
From: Vasil Sakarov
Date: Thu, 1 Mar 2012 12:53:11 +0200
Subject: [PATCH] Added section for regular expression.

---
 README.md | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/README.md b/README.md
index b99ed9013..588982517 100644
--- a/README.md
+++ b/README.md
@@ -990,6 +990,67 @@ syntax.
     end
     ```
 
+## Regular Expressions
+
+* Don't use regular expressions if you just need plain text search in a string:
+  `string['text']`
+* For simple constructions you can use a regexp directly through the string index.
+
+    ```Ruby
+    match = string[/regexp/]             # get content of matched regexp
+    first_group = string[/text(grp)/, 1] # get content of captured group
+    string[/text (grp)/, 1] = 'replace'  # string => 'text replace'
+    ```
+
+* Use non-capturing groups when you don't use the captured result of the parentheses.
+
+    ```Ruby
+    /(first|second)/   # bad
+    /(?:first|second)/ # good
+    ```
+
+* Avoid using $1-9 as it can be hard to track what they contain. Named groups
+  can be used instead.
+
+    ```Ruby
+    # bad
+    /(regexp)/ =~ string
+    ...
+    process $1
+
+    # good
+    /(?<meaningful_var>regexp)/ =~ string
+    ...
+    process meaningful_var
+    ```
+
+* Character classes have only a few special characters you should care about:
+  `^`, `-`, `\`, `]`, so don't escape `.` or brackets in `[]`.
+
+* Be careful with `^` and `$` as they match start/end of line, not string endings.
+  If you want to match the whole string use: `\A` and `\Z`.
+
+    ```Ruby
+    string = "some injection\nusername"
+    string[/^username$/]   # matches
+    string[/\Ausername\Z/] # doesn't match
+    ```
+
+* Use the `x` modifier for complex regexps. This makes them more readable and you
+  can add some useful comments. Just be careful as spaces are ignored.
+
+    ```Ruby
+    regexp = %r{
+      start         # some text
+      \s            # white space char
+      (group)       # first group
+      (?:alt1|alt2) # some alternation
+      end
+    }x
+    ```
+
+* For complex replacements `sub`/`gsub` can be used with a block or a hash.
+
 ## Percent Literals
 
 * Use `%w` freely.
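
The last bullet added by this patch (`sub`/`gsub` with a block or a hash) has no example. A minimal sketch of what it could look like follows; the sample string and the variable names are assumptions made for illustration and are not part of the patch.

```Ruby
# Hypothetical sample data, chosen only to illustrate the bullet.
text = 'cat and dog'
names = { 'cat' => 'feline', 'dog' => 'canine' }

# Hash form: each matched substring is looked up as a key in the hash.
swapped = text.gsub(/cat|dog/, names)

# Block form: the block's return value replaces each match.
capitalized = text.gsub(/\b[a-z]/) { |first_letter| first_letter.upcase }

# sub takes the same arguments but replaces only the first match.
first_only = text.sub(/cat|dog/) { |animal| animal.reverse }
```

With the hash form, a match that has no corresponding key is replaced by an empty string, so the block form may be the safer choice when the pattern can match text the hash does not cover.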