xre exists to bring the awesome power of Rob Pike's Structural Regular Expressions beyond the reach of the sam editor (appropriately/coincidentally/ironically it is implemented in Go, yielding yet another Rob Pike reference).
WARNING: It is still in a primordial / experimental phase, but works well as a proof of concept.
A short comparison to the grep/ed model:
- a new x/re/ command extracts structure matched by a regular expression
  - ... x[, x{, x(, and x< extract a balanced pair of braces
- a new y/re/ command extracts structure delimited by a regular expression
  - ... y"delim" extracts structure between occurrences of a static delimiter, e.g. y"\n" for classic UNIX line-orientation
  - ... y/start/end/ extracts structure between two regular expressions
  - ... y[, y{, y(, and y< extract content within a balanced pair of braces
- the g/re/ command keeps the current buffer (as extracted by x or y) only if the given pattern matches
- the v/re/ command keeps the current buffer (as extracted by x or y) only if the given pattern doesn't match
- the p command prints
  - ... p"delim" prints with a delimiter, e.g. p"\n" to return to the warm embrace of classic UNIX tools
  - ... p%"format" prints with a format pattern, e.g. p%"%q\n" is particularly useful while developing an xre program
Loosely quoting from Structural Regular Expressions:
...if the interesting quantum of information isn’t a line, most of the (UNIX) tools don’t help, or at best do poorly
For example, it is sometimes useful to deal with things like paragraphs (bytes that are delimited by a blank line, i.e. "\n\n"). For maximal self reference, such a data set can be had from your nearest Go program, either from its /debug/pprof/heap?debug=1 endpoint or by calling pprof.Lookup("heap").WriteTo(f, 1) yourself.
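
For readers who want to produce such a data set by hand, the following is a small sketch of the second option. It assumes a standalone program that dumps its own heap profile to standard output; the helper name heapProfileText is hypothetical, while pprof.Lookup and Profile.WriteTo come from the standard runtime/pprof package.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime/pprof"
)

// heapProfileText returns the current heap profile in the text form
// selected by debug level 1, the same form served by ?debug=1.
func heapProfileText() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("heap").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	text, err := heapProfileText()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print to stdout so the profile can be redirected to a file
	// or piped into another program.
	fmt.Print(text)
}
```

A tiny program like this will rarely show bytes.makeSlice in its profile; a long-running service that grows bytes.Buffer values is a more interesting source.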
The following xre program extracts just the allocation bytes from heap allocations involving a call to bytes.makeSlice (i.e. when a bytes.Buffer needs to grow):
xre 'y"\n\n" g/bytes.makeSlice/ y"\n" v/^#|^$/ x[ x/^\d+: (\d+)/ p"\n"'
Breaking down the above command:
- extract paragraphs (buffers delimited by blank lines)
- keep only the paragraphs that mention "bytes.makeSlice"
- extract lines within those paragraphs
- and keep only the lines that aren't blank and don't start with a "#"
- on those lines, extract the contents of the first balanced [ ] pair
- and then extract the "MMM" in a "NNN: MMM" match within it
- finally, print those numbers delimited by new lines (the classic UNIX paradigm)
As always, summing a stream of numbers is left as an exercise to the reader.