Skip to content

Henry Spencer's old regular expression library, also known as the book regex library, circa 1986.

Notifications You must be signed in to change notification settings


Folders and files

Last commit message
Last commit date

Latest commit



1 Commit

Repository files navigation

This is a revision of my well-known regular-expression package, regexp(3).
It gives C programs the ability to use egrep-style regular expressions, and
does it in a much cleaner fashion than the analogous routines in SysV.
It is not, alas, fully POSIX.2-compliant; that is hard.  (I'm working on
a full reimplementation that will do that.)

This version is the one which is examined and explained in one chapter of
"Software Solutions in C" (Dale Schumacher, ed.; AP Professional 1994;
ISBN 0-12-632360-7), plus a couple of insignificant updates, plus one
significant bug fix (done 10 Nov 1995).

Although this package was inspired by the Bell V8 regexp(3), this
implementation is *NOT* AT&T/Bell code, and is not derived from licensed
software.  Even though U of T is a V8 licensee.  This software is based on
a V8 manual page sent to me by Dennis Ritchie (the manual page enclosed
here is a complete rewrite and hence is not covered by AT&T copyright).
I admit to some familiarity with regular-expression implementations of
the past, but the only one that this code traces any ancestry to is the
one published in Kernighan & Plauger's "Software Tools" (from which
this one draws ideas but not code).

Simplistically:  put this stuff into a source directory, inspect Makefile
for compilation options that need changing to suit your local environment,
and then do "make".  This compiles the regexp(3) functions, builds a
library containing them, compiles a test program, and runs a large set of
regression tests.  If there are no complaints, then put regexp.h into
/usr/include, add regexp.o, regsub.o, and regerror.o into your C library
(or put libre.a into /usr/lib), and install regexp.3 (perhaps with slight
modifications) in your manual-pages directory. 

The files are:

COPYRIGHT	copyright notice
README		this text
Makefile	instructions to make everything
regexp.3	manual page
regexp.h	header file, for /usr/include
regexp.c	source for regcomp() and regexec()
regsub.c	source for regsub()
regerror.c	source for default regerror()
regmagic.h	internal header file
try.c		source for test program
timer.c		source for timing program
tests		test list for try and timer

This implementation uses nondeterministic automata rather than the
deterministic ones found in some other implementations, which makes it
simpler, smaller, and faster at compiling regular expressions, but slower
at executing them.  Many users have found the speed perfectly adequate,
although replacing the insides of egrep with this code would be a mistake.

This stuff should be pretty portable, given an ANSI C compiler and
appropriate option settings.  There are no "reserved" char values except for
NUL, and no special significance is attached to the top bit of chars.
The string(3) functions are used a fair bit, on the grounds that they are
probably faster than coding the operations in line.  Some attempts at code
tuning have been made, but this is invariably a bit machine-specific.

This distribution lives at{tar|shar}
at present.


Henry Spencer's old regular expression library, also known as the book regex library, circa 1986.







No releases published