The regex Test-Bench Tutorial
=============================
Language Options and Imports
-----------------------------
This tutorial is a literate Haskell program where we start by specifying
the language pragmas and imports we will need for this module.
\begin{code}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE FlexibleContexts #-}
{-# OPTIONS_GHC -fno-warn-missing-signatures #-}
\end{code}
%main top
\begin{code}
import Data.Functor.Identity
import qualified Data.HashMap.Lazy as HML
import TestKit
import Text.RE.REOptions
import qualified Text.RE.TDFA as TDFA
import Text.RE.TDFA.String
import Text.RE.TestBench
\end{code}
Macros and Parsers
------------------
regex supports macros in regular expressions. There are a number of
standard macros and you can define your own.
RE macros are enclosed in `@{` ... `}`. By convention the macros in
the standard environment start with a `%`. `@{%date}` will match an
ISO 8601 date, with this example
\begin{code}
evalme_MAC_00 = checkThis "" (2) $ countMatches $ "2016-01-09 2015-12-5 2015-10-05" *=~ [re|@{%date}|]
\end{code}
picking out the two well-formed dates (`2015-12-5` lacks a two-digit day).
See the tables listing the standard macros in the tables folder of
the distribution.
See the log-processor example and the `Text.RE.TestBench` module for
more on how you can develop, document and test RE macros with the
regex test bench.
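
To make the substitution concrete, here is a minimal sketch of macro expansion as plain text replacement. It is a toy model, not the library's implementation: `ToyEnv`, `toyEnv`, `expand` and the RE text given for `%date` are all assumptions made for illustration, and the sketch makes a single pass without expanding macros nested inside macro definitions.

```haskell
import qualified Data.Map as M

-- | Toy macro table (hypothetical): macro name -> RE text to substitute.
type ToyEnv = M.Map String String

-- | An assumed, simplified definition for %date; the real one may differ.
toyEnv :: ToyEnv
toyEnv = M.fromList [("%date", "[0-9]{4}-[0-9]{2}-[0-9]{2}")]

-- | Replace each @{name} whose name is in the table with its RE text.
-- Unknown macros and all other characters are copied through unchanged.
expand :: ToyEnv -> String -> String
expand env = go
  where
    go ('@':'{':rest)
      | (name, '}':rest') <- break (== '}') rest
      , Just body         <- M.lookup name env
      = body ++ go rest'
    go (c:cs) = c : go cs
    go []     = []
```

Under these assumptions, `expand toyEnv "^@{%date}$"` yields `^[0-9]{4}-[0-9]{2}-[0-9]{2}$`. A pattern of this shape would reject `2015-12-5`, which is consistent with the count of two in the example above.
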
Adding the Epsilon Macro
------------------------
You can use the regex test bench to add your own macros. As a simple example
we will add an 'epsilon' macro to the standard 'prelude' macro environment.
(See the [`re-nginx-log-processor`](re-nginx-log-processor) for a more
extensive example of macro environments.)
The `@{epsilon}` macro will expand to a RE that matches only the empty
string:
```
.{0}
```
(A use for such a seemingly useless RE macro will become apparent in the
test example below.)
Firstly we define a two-argument function to create a `MacroDescriptor`
from:
1. the `MacroEnv` macro environment argument will be used to compile
the macro RE (we don't need it in this instance, of course,
but we are following a general recipe);
2. the `MacroID` name of the macro (which is passed in to us because
the calling context needs the name of the macro).
\begin{code}
epsilon_macro :: MacroEnv -> MacroID -> MacroDescriptor
epsilon_macro env mid =
  runTests TDFA.regexType Just samples env mid
    MacroDescriptor
      { macroSource          = RegexSource ".{0}" -- the RE to be substituted for the macro
      , macroSamples         = map fst samples    -- list of strings that should match the above macro RE
      , macroCounterSamples  = counter_samples    -- list of strings that should **not** match the above macro RE
      , macroTestResults     = []                 -- for bookkeeping
      , macroParser          = Nothing            -- no parser needed for this one!
      , macroDescription     = "an epsilon parser, matching the empty string only"
      }
  where
    samples :: [(String,String)]
    samples =
      [ dup ""
      ]
      where
        dup x = (x,x)

    counter_samples =
      [ "not an empty string"
      ]
\end{code}
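
The role of `samples`, `counter_samples` and the `Just` argument can be sketched with a toy checker. This is an illustrative model only, not the `runTests` implementation: `ToyReport`, `toyCheck` and `parseEpsilon` are hypothetical names, and the sketch assumes (from the call above) that each sample pairs an input string with the value the parser should produce. On that reading, `dup` pairs each string with itself because the parser is `Just`.

```haskell
-- | Hypothetical report: samples that failed, counter-samples that matched.
data ToyReport = ToyReport
  { badSamples        :: [String]
  , badCounterSamples :: [String]
  } deriving (Eq, Show)

-- | Check (input, expected) samples and counter-samples against a
-- stand-in parser that returns Nothing when the input does not match.
toyCheck :: Eq a => (String -> Maybe a) -> [(String, a)] -> [String] -> ToyReport
toyCheck parse samples counterSamples = ToyReport
  { badSamples        = [ s | (s, expected) <- samples, parse s /= Just expected ]
  , badCounterSamples = [ s | s <- counterSamples, parse s /= Nothing ]
  }

-- | Stand-in for the epsilon macro: accepts only the empty string.
parseEpsilon :: String -> Maybe String
parseEpsilon s = if null s then Just s else Nothing
```

With the tutorial's data, `toyCheck parseEpsilon [("","")] ["not an empty string"]` reports no failures in either list.
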
The compiled `Macros RE` that we will slot into the `REOptions` used to
compile the RE is constructed in two steps. Firstly we provide a function
that takes the `MacroEnv` that all of the macros will use to build their
REs and returns the augmented `MacroEnv` with the new macro definitions.
This `MacroEnv` is generic and not dependent upon any back end —
none of the macros have been compiled.
\begin{code}
my_env :: MacroEnv -> MacroEnv
my_env env0 = env
  where
    env = env0 `HML.union` HML.fromList
      [ f "epsilon" epsilon_macro
      ]

    f nm mk = (mid, mk env mid)
      where
        mid = MacroID nm
\end{code}
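
Note that `env` is defined in terms of itself: each macro is built with the final, augmented environment, which works because the map values are lazy. The same pattern can be sketched with `Data.Map.Lazy` from `containers`; `ToyEnv`, `greeting` and `extendEnv` are hypothetical names used only for this illustration.

```haskell
import qualified Data.Map.Lazy as ML

type ToyEnv = ML.Map String String

-- | A definition that consults the finished environment.
greeting :: ToyEnv -> String
greeting env = "hello, " ++ ML.findWithDefault "?" "name" env

-- | Add "greeting" to an environment, handing its builder the final map.
-- The union is left-biased, so entries already in env0 win on a name clash.
extendEnv :: ToyEnv -> ToyEnv
extendEnv env0 = env
  where
    env = env0 `ML.union` ML.fromList [ mk "greeting" greeting ]
    mk nm build = (nm, build env)
```

Looking up `"greeting"` in `extendEnv (ML.fromList [("name","world")])` gives `Just "hello, world"`: the new entry sees the `"name"` binding even though it is constructed while the map is still being defined.
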
From the `MacroEnv` we compile the macros into a `Macros RE` macro table
that we can insert into an `REOptions` that can be used to compile REs
in the application.
\begin{code}
my_macros :: Macros RE
my_macros = runIdentity $ mkMacros mk TDFA.regexType ExclCaptures $ my_env TDFA.preludeEnv
  where
    mk   = maybe oops Identity . TDFA.compileRegexWithOptions TDFA.noPreludeREOptions

    oops = error "my_macros: unexpected RE compilation error"
\end{code}
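
The use of `runIdentity` suggests that `mkMacros` runs the supplied compile function in a monad of the caller's choosing. Here is a minimal sketch of that shape, with hypothetical names (`toyCompile`, `compileAll`, `compileOrDie`) and a stand-in "compiler" that merely rejects empty sources; it does not reproduce the real `mkMacros`.

```haskell
import           Data.Functor.Identity (Identity (..))
import qualified Data.Map              as M

-- | Stand-in compiler (hypothetical): fails on an empty RE source.
toyCompile :: String -> Maybe String
toyCompile src = if null src then Nothing else Just ("compiled:" ++ src)

-- | Compile every entry of a table in whatever monad the caller picks.
compileAll :: Monad m => (String -> m r) -> M.Map String String -> m (M.Map String r)
compileAll = traverse

-- | As in my_macros: treat a compilation failure as a programming error.
compileOrDie :: M.Map String String -> M.Map String String
compileOrDie = runIdentity . compileAll (maybe oops Identity . toyCompile)
  where
    oops = error "compileOrDie: unexpected RE compilation error"
```

Running `compileAll toyCompile` directly keeps the failure visible as `Nothing`, whereas `compileOrDie` trades that for a plain result and a runtime error on failure, the choice `my_macros` makes.
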
The `makeREOptions` function can be used to construct an `REOptions`
for compiling REs with `[re_| ... |]` and `[ed_| ... /// ... |]` quasi
quoters.
\begin{code}
myOptions :: TDFA.REOptions
myOptions = TDFA.makeREOptions my_macros
\end{code}
Now we can try out the `@{epsilon}` macro, using it to match nothing!
\begin{code}
evalme_TST_00 = checkThis "" (True) $ matched $ "///" ?=~ [re_|^//@{epsilon}/$|] myOptions
\end{code}
Why would we want to match nothing? To break up three '/' in the RE part
of a `[ed_| ... /// ... |]` `SearchReplace` template.
\begin{code}
evalme_TST_01 = checkThis "" ("a <three slashes> replacement example") $ "a <///> replacement example" *=~/ [ed_|<//@{epsilon}/>///<three slashes>|] myOptions
\end{code}
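
To see why the separator needs protecting, consider where a template could be divided at `///`. The helper below is a hypothetical illustration (`candidateSplits` is not part of regex, and the library's own template parsing may differ); it simply lists every way to split a template at an occurrence of `///`.

```haskell
import Data.List (inits, isPrefixOf, tails)

-- | Every way to divide a template at an occurrence of "///":
-- (text before the separator, text after it).
candidateSplits :: String -> [(String, String)]
candidateSplits tpl =
  [ (pre, drop 3 post)
  | (pre, post) <- zip (inits tpl) (tails tpl)
  , "///" `isPrefixOf` post
  ]
```

`candidateSplits "<///>///<three slashes>"` returns two candidates, so that template is ambiguous, while `candidateSplits "<//@{epsilon}/>///<three slashes>"` returns only the intended split into `<//@{epsilon}/>` and `<three slashes>`.
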
For a more extensive example of macro environments see the
[`re-nginx-log-processor`](re-nginx-log-processor).
%main bottom