Henry Spencer's old regular expression library, also known as the book regex library, circa 1986.

This is a revision of my well-known regular-expression package, regexp(3).
It gives C programs the ability to use egrep-style regular expressions, and
does it in a much cleaner fashion than the analogous routines in SysV.
It is not, alas, fully POSIX.2-compliant; that is hard.  (I'm working on
a full reimplementation that will do that.)

This version is the one which is examined and explained in one chapter of
"Software Solutions in C" (Dale Schumacher, ed.; AP Professional 1994;
ISBN 0-12-632360-7), plus a couple of insignificant updates, plus one
significant bug fix (done 10 Nov 1995).

Although this package was inspired by the Bell V8 regexp(3), this
implementation is *NOT* AT&T/Bell code, and is not derived from licensed
software, even though U of T is a V8 licensee.  This software is based on
a V8 manual page sent to me by Dennis Ritchie (the manual page enclosed
here is a complete rewrite and hence is not covered by AT&T copyright).
I admit to some familiarity with regular-expression implementations of
the past, but the only one that this code traces any ancestry to is the
one published in Kernighan & Plauger's "Software Tools" (from which
this one draws ideas but not code).

Simplistically:  put this stuff into a source directory, inspect Makefile
for compilation options that need changing to suit your local environment,
and then do "make".  This compiles the regexp(3) functions, builds a
library containing them, compiles a test program, and runs a large set of
regression tests.  If there are no complaints, then put regexp.h into
/usr/include, add regexp.o, regsub.o, and regerror.o to your C library
(or put libre.a into /usr/lib), and install regexp.3 (perhaps with slight
modifications) in your manual-pages directory.

The files are:

COPYRIGHT	copyright notice
README		this text
Makefile	instructions to make everything
regexp.3	manual page
regexp.h	header file, for /usr/include
regexp.c	source for regcomp() and regexec()
regsub.c	source for regsub()
regerror.c	source for default regerror()
regmagic.h	internal header file
try.c		source for test program
timer.c		source for timing program
tests		test list for try and timer

This implementation uses nondeterministic automata rather than the
deterministic ones found in some other implementations, which makes it
simpler, smaller, and faster at compiling regular expressions, but slower
at executing them.  Many users have found the speed perfectly adequate,
although replacing the insides of egrep with this code would be a mistake.
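
To make the tradeoff above concrete, here is a minimal sketch of a
backtracking matcher.  It is NOT code from this package: it interprets the
pattern directly instead of compiling it first, and it handles only literal
characters, ".", "*", "^", and "$".  The names toy_match, match_here, and
match_star are invented for this illustration.  The point is only that a
matcher of this kind needs no setup work before it can run, but may try
many alternatives at each position of the text.

```c
/* Toy backtracking matcher; an illustrative sketch, not this package's code.
 * Supported syntax: literal characters, '.', 'c*', leading '^', trailing '$'. */

static int match_here(const char *re, const char *text);

/* Try "c*" followed by the rest of the pattern, shortest repetition first.
 * Each failed attempt backs up and consumes one more occurrence of c. */
static int match_star(int c, const char *re, const char *text)
{
    do {
        if (match_here(re, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* Match the pattern at exactly this position in the text. */
static int match_here(const char *re, const char *text)
{
    if (re[0] == '\0')
        return 1;                       /* pattern exhausted: success */
    if (re[1] == '*')
        return match_star(re[0], re + 2, text);
    if (re[0] == '$' && re[1] == '\0')
        return *text == '\0';           /* '$' matches only at end of text */
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return match_here(re + 1, text + 1);
    return 0;
}

/* Return 1 if the pattern matches anywhere in the text, else 0. */
int toy_match(const char *re, const char *text)
{
    if (re[0] == '^')
        return match_here(re + 1, text);
    do {                                /* try every starting position */
        if (match_here(re, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}
```

For example, toy_match("a*b", "xaaab") succeeds, while
toy_match("^a.c$", "abcd") fails.  A deterministic matcher would spend time
building a state table first and would then examine each character of the
text once, with no backing up.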

This stuff should be pretty portable, given an ANSI C compiler and
appropriate option settings.  There are no "reserved" char values except for
NUL, and no special significance is attached to the top bit of chars.
The string(3) functions are used a fair bit, on the grounds that they are
probably faster than coding the operations in line.  Some attempts at code
tuning have been made, but this is invariably a bit machine-specific.

This distribution lives at ftp://ftp.zoo.toronto.edu/pub/bookregexp.{tar|shar}
at present.