Cleanup some discouraged use of 'match'; a few tiny fixes
DmitryOlshansky committed Jun 26, 2015
1 parent aaf842f commit 28d326e
Showing 1 changed file with 9 additions and 12 deletions.
21 changes: 9 additions & 12 deletions regular-expression.dd
@@ -22,7 +22,7 @@ $(D_S Regular Expressions,
$(B convenient and friendly syntax) for typical operations, and integrating it well
)
$(P The D programming language provides a standard library module $(STD regex).
-Being a highly expressive systems language, D allows regexes to be $(I efficiently) implemented
+Being a highly expressive systems language, D allows regexes to be $(I efficiently) implemented
within the language itself, yet have a good level of readability and usability.
And there are a few things a pure D implementation brings to the table that are completely unbelievable
in a traditional compiled language; more on that at the end of the article.
@@ -114,14 +114,14 @@ $(D_S Regular Expressions,
---
import std.algorithm, std.file, std.regex;
auto text = std.file.readText("regex.d");
-auto nonBlank = count(match(text, regex(r"^.*\P{WhiteSpace}+.*$","gm")));
+auto nonBlank = count(matchAll(text, regex(r"^.*\P{WhiteSpace}+.*$", "m")));
---
$(P This by the way tells me that $(STD regex) has 7128 non-blank lines as of this writing.
But let's get back to the regular expression itself.
A seasoned regex user catches instantly that Unicode properties are supported with Perl-style \p{xxx};
to spice that up, all of the Scripts and Blocks are supported as well. Let us duly note that \P{xxx} means not
-having an xxx property, i.e. here not a white space character. Unicode is vital subject to know, and it won't suffice
-to try to cover it here. For details see level 1 of conformance as per Unicode standard
+having an xxx property, i.e. here not a white space character. Unicode is a vital subject to know, and it won't suffice
+to try to cover it here. For details see the accessible $(STD uni) documentation and level 1 of conformance as per Unicode standard
$(LINK2 http://Unicode.org/reports/tr18/, UTS 18).
)
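
The line-counting snippet in this hunk can be tried as a small standalone program. The following is a minimal sketch and not part of the commit: the helper name `countNonBlankLines` and the sample input are made up for illustration, while the pattern and the `matchAll` call are the ones shown in the hunk.

```d
import std.algorithm : count;
import std.regex : matchAll, regex;

// Hypothetical helper (not part of std.regex): counts the lines that
// contain at least one non-whitespace character.
size_t countNonBlankLines(string text)
{
    // "m" lets ^ and $ match at line boundaries. matchAll already walks
    // every match, so the "g" flag used with the older match is not needed.
    auto nonBlank = regex(r"^.*\P{WhiteSpace}+.*$", "m");
    return count(matchAll(text, nonBlank));
}

void main()
{
    // Made-up sample: two lines with content, one empty, one spaces only.
    assert(countNonBlankLines("first\n\n   \nsecond line\n") == 2);
}
```

Returning `size_t` rather than `int` matches what `count` produces, and giving the result a name other than `count` avoids shadowing the function being called.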

@@ -173,7 +173,7 @@ $(D_S Regular Expressions,
write(converted);
}
---
-$(P Getting current conversion rates and supporting more currencies is left as an exercise for the reader.
+$(P Getting current conversion rates and supporting more currencies is left as an exercise for the reader.
What is at work here is the so-called replace with delegate, analogous to the callout ability found in other languages
and regex libraries. The magic is simple: whenever replace finds a match, it calls a user-supplied callback
on the captured piece, then uses the return value as the replacement.
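
A minimal sketch of the replace-with-callback idea, assuming the `replaceAll!(fun)` form of `std.regex`, where `fun` receives the `Captures` of each match and returns the replacement text. The callback `doubled`, the wrapper `doubleNumbers` and the sample sentence are hypothetical and unrelated to the currency example in the article.

```d
import std.conv : to;
import std.regex : Captures, regex, replaceAll;

// Hypothetical callback: receives the captures of one match and
// returns the text that should replace it (here, the number doubled).
string doubled(Captures!string m)
{
    return to!string(to!int(m.hit) * 2);
}

// Hypothetical wrapper: runs the callback on every run of digits.
string doubleNumbers(string text)
{
    return replaceAll!doubled(text, regex(r"\d+"));
}

void main()
{
    assert(doubleNumbers("3 apples and 12 pears") == "6 apples and 24 pears");
}
```

Text outside the matches is copied through unchanged; only the matched pieces are handed to the callback.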
@@ -206,11 +206,11 @@ $(D_S Regular Expressions,
$(P Again the type of splitter is range, thus allowing foreach iteration.
Notice the use of lookaround in the regex; it's a neat trick here, as stripping off the final punctuation is
not our intention. Breaking down this example, the (?<=[.?!]) part looks behind for the first ., ? or !.
-This gets us halfway to our goal because \s* also matches between elements of punctuation like "?!",
+This gets us halfway to our goal because \s* also matches between elements of punctuation like "?!",
so a negative lookahead is introduced $(I inside lookbehind) to make sure we are past all of the punctuation marks.
Admittedly, the barrage of ? and ! makes this regex look more obscure than it actually is.
Observe that there are no restrictions on the contents of lookaround expressions,
-one can go for lookahead inside lookbehind and so on.
+one can go for lookahead inside lookbehind and so on.
However, in general it's recommended to use them sparingly, keeping them as the weapon of last resort.
)
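
To make the lookbehind idea concrete, here is a simplified sketch that is not the regex from the article: it requires at least one whitespace character (`\s+` rather than `\s*`), so a plain lookbehind is enough and no nested lookahead is needed. The function name `sentences` and the sample text are made up for illustration.

```d
import std.array : array;
import std.regex : regex, splitter;

// Hypothetical helper: splits text into sentences, keeping the closing
// punctuation attached to each sentence.
string[] sentences(string text)
{
    // (?<=[.?!]) looks behind for ., ? or ! without consuming it, so only
    // the whitespace that follows is treated as the separator.
    auto boundary = regex(r"(?<=[.?!])\s+");
    return splitter(text, boundary).array;
}

void main()
{
    auto parts = sentences("Hi there! How are you?! Fine.");
    assert(parts == ["Hi there!", "How are you?!", "Fine."]);
}
```

Because the separator must contain whitespace, nothing can match between the `?` and `!` of "?!", which is the case the nested lookahead in the article's version is there to handle.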

@@ -258,13 +258,13 @@ $(D_S Regular Expressions,
$(P This article is a walkthrough of $(D std.regex) focused on showcasing the API.
By following a series of easy yet meaningful tasks, its features were shown in combination,
underlining the elegance and flexibility of this library solution.
-The good thing is that not only is the API natural, but it also follows established
+The good thing is that not only is the API natural, but it also follows established
standards and integrates well with the rest of Phobos.
Putting together its major features for a short-list, $(STD regex) is:
)
$(UL
$(LI Fully Unicode-aware, conforming to full level 1 support as defined by the Unicode standard)
-$(LI Lots of modern extensions, including unlimited generalized lookaround.
+$(LI Lots of modern extensions, including unlimited generalized lookaround.
That makes porting regexes from other libraries a breeze)
$(LI Lean API that consists of a few flexible tools: $(D matchFirst)/$(D matchAll), $(D replaceFirst)/$(D replaceAll) and $(D splitter).)
$(LI Uniform and powerful, with unique abilities like precompiling regex or generating
@@ -278,9 +278,6 @@ $(D_S Regular Expressions,

)
Macros:
TITLE=Regular expressions
H3 = <h3>$0</h3>
DOLLAR = $
STD = $(LINK2 phobos/std_$0.html, std.$0)
WIKI =
_=
