Skip to content

Commit

Permalink
Makefile & compat/pcre2: add ability to build an embedded PCRE
Browse files Browse the repository at this point in the history
Add a USE_LIBPCRE2_BUNDLED=YesIHaveNoPackagedVersion flag to the
Makefile. If enabled we'll use the PCRE v2 shipped in compat/pcre2,
instead of trying to find it via -lpcre2-8 on the installed system.

As covered in a previous commits ("grep: add support for PCRE v2",
2017-04-08) there are major benefits to using a bleeding edge PCRE v2,
but more importantly I'd like to experiment with making PCRE a
mandatory dependency to power various internal features of grep/log
without the user being aware that they're using the library under the
hood, similar to how we use kwset now for fixed-string searches.

Imposing that hard dependency on everyone using git would bother a lot
of people, whereas if git itself ships PCRE it's no more bothersome
than the code using kwset, i.e. it can be invisible to the builder &
user, and even allow git to target newer PCRE APIs (or parts of the
API, e.g. pattern conversion) without worrying about versioning.

See [1] for a mostly one-sided pcre-dev mailing list thread discussing
how embed the library.

Implementation details:

 * I configured PCRE v2 with ./configure --enable-jit --enable-utf

 * It sets a lot of -DHAVE_* but these are used by the subset of the
   files I copied, many are either unused or only used by pcre2test.c
   which isn't brought in by the script.

 * These -DHAVE_* flags are something we have already by default &
   assume in other git.git code, so it should be fine to define it.

 * -DPCRE2_CODE_UNIT_WIDTH=8 only compiles the functions linking to
   -lpcre2-8 would have gotten us.

 * All the limits / sizes are the PCRE defaults, the
   MATCH_LIMIT_RECURSION define is a synonym for MATCH_LIMIT_DEPTH in
   older versions, it allows building against older (currently
   release) versions of the library.

 * -DNEWLINE_DEFAULT=2 means only \n is recognized as a newline. This
    corresponds to the --enable-newline-is-lf option. It's also
    possible to set this to CR, CRLF, any of CR, LF, or CRLF, or any
    Unicode newline character being recognized as \n.

    This *might* have to be customized on Windows, but I think the
    grep machinery always splits on newlines for us already, so this
    probably works on Windows as-is, but needs testing.

1. https://lists.exim.org/lurker/thread/20170507.223619.fbee8f00.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
  • Loading branch information
avar committed Jun 19, 2017
1 parent 05ec6e1 commit 79256ee
Show file tree
Hide file tree
Showing 2 changed files with 119 additions and 0 deletions.
52 changes: 52 additions & 0 deletions Makefile
Expand Up @@ -47,6 +47,11 @@ all::
# PCRE this points to determined by the USE_LIBPCRE1 and USE_LIBPCRE2
# variables.
#
# Define USE_LIBPCRE2_BUNDLED=YesIHaveNoPackagedVersion in addition to
# USE_LIBPCRE2=YesPlease if you'd like to use a copy of PCRE version 2
# bunded with Git. This is for setups where getting a hold of a
# packaged PCRE is inconvenient.
#
# Define HAVE_ALLOCA_H if you have working alloca(3) defined in that header.
#
# Define NO_CURL if you do not have libcurl installed. git-http-fetch and
Expand Down Expand Up @@ -1121,8 +1126,10 @@ endif

ifdef USE_LIBPCRE2
BASIC_CFLAGS += -DUSE_LIBPCRE2
ifndef USE_LIBPCRE2_BUNDLED
EXTLIBS += -lpcre2-8
endif
endif

ifdef LIBPCREDIR
BASIC_CFLAGS += -I$(LIBPCREDIR)/include
Expand Down Expand Up @@ -1528,6 +1535,50 @@ ifdef NO_REGEX
COMPAT_CFLAGS += -Icompat/regex
COMPAT_OBJS += compat/regex/regex.o
endif
ifdef USE_LIBPCRE2_BUNDLED
ifndef USE_LIBPCRE2
$(error please set USE_LIBPCRE2=YesPlease when setting \
USE_LIBPCRE2_BUNDLED=$(USE_LIBPCRE2_BUNDLED))
endif
COMPAT_CFLAGS += \
-Icompat/pcre2/src \
-DHAVE_BCOPY=1 \
-DHAVE_INTTYPES_H=1 \
-DHAVE_MEMMOVE=1 \
-DHAVE_STDINT_H=1 \
-DPCRE2_CODE_UNIT_WIDTH=8 \
-DLINK_SIZE=2 \
-DHEAP_LIMIT=20000000 \
-DMATCH_LIMIT=10000000 \
-DMATCH_LIMIT_DEPTH=10000000 \
-DMATCH_LIMIT_RECURSION=10000000 \
-DMAX_NAME_COUNT=10000 \
-DMAX_NAME_SIZE=32 \
-DPARENS_NEST_LIMIT=250 \
-DNEWLINE_DEFAULT=2 \
-DSUPPORT_JIT \
-DSUPPORT_UNICODE
COMPAT_OBJS += \
compat/pcre2/src/pcre2_auto_possess.o \
compat/pcre2/src/pcre2_chartables.o \
compat/pcre2/src/pcre2_compile.o \
compat/pcre2/src/pcre2_config.o \
compat/pcre2/src/pcre2_context.o \
compat/pcre2/src/pcre2_error.o \
compat/pcre2/src/pcre2_find_bracket.o \
compat/pcre2/src/pcre2_jit_compile.o \
compat/pcre2/src/pcre2_maketables.o \
compat/pcre2/src/pcre2_match.o \
compat/pcre2/src/pcre2_match_data.o \
compat/pcre2/src/pcre2_newline.o \
compat/pcre2/src/pcre2_ord2utf.o \
compat/pcre2/src/pcre2_string_utils.o \
compat/pcre2/src/pcre2_study.o \
compat/pcre2/src/pcre2_tables.o \
compat/pcre2/src/pcre2_ucd.o \
compat/pcre2/src/pcre2_valid_utf.o \
compat/pcre2/src/pcre2_xclass.o
endif
ifdef NATIVE_CRLF
BASIC_CFLAGS += -DNATIVE_CRLF
endif
Expand Down Expand Up @@ -2282,6 +2333,7 @@ GIT-BUILD-OPTIONS: FORCE
@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE1)))'\' >>$@+
@echo USE_LIBPCRE2=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE2)))'\' >>$@+
@echo USE_LIBPCRE2_BUNDLED=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE2_BUNDLED)))'\' >>$@+
@echo NO_LIBPCRE1_JIT=\''$(subst ','\'',$(subst ','\'',$(NO_LIBPCRE1_JIT)))'\' >>$@+
@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
@echo NO_PTHREADS=\''$(subst ','\'',$(subst ','\'',$(NO_PTHREADS)))'\' >>$@+
Expand Down
67 changes: 67 additions & 0 deletions compat/pcre2/get-pcre2.sh
@@ -0,0 +1,67 @@
#!/bin/sh -e

# Usage:
# ./get-pcre2.sh '' 'trunk'
# ./get-pcre2.sh '' 'tags/pcre2-10.23'
# ./get-pcre2.sh ~/g/pcre2 ''

srcdir=$1
version=$2
if test -z "$version"
then
version="tags/pcre2-10.23"
fi

echo Getting PCRE v2 version $version
rm -rfv src
mkdir src src/sljit

for srcfile in \
pcre2.h \
pcre2_internal.h \
pcre2_intmodedep.h \
pcre2_ucp.h \
pcre2_auto_possess.c \
pcre2_chartables.c.dist \
pcre2_compile.c \
pcre2_config.c \
pcre2_context.c \
pcre2_error.c \
pcre2_find_bracket.c \
pcre2_jit_compile.c \
pcre2_jit_match.c \
pcre2_jit_misc.c \
pcre2_maketables.c \
pcre2_match.c \
pcre2_match_data.c \
pcre2_newline.c \
pcre2_ord2utf.c \
pcre2_string_utils.c \
pcre2_study.c \
pcre2_tables.c \
pcre2_ucd.c \
pcre2_valid_utf.c \
pcre2_xclass.c
do
if test -z "$srcdir"
then
svn cat svn://vcs.exim.org/pcre2/code/$version/src/$srcfile >src/$srcfile
else
cp "$srcdir/src/$srcfile" src/$srcfile
fi
wc -l src/$srcfile
done

(cd src && ln -sf pcre2_chartables.c.dist pcre2_chartables.c)

if test -z "$srcdir"
then
for srcfile in $(svn ls svn://vcs.exim.org/pcre2/code/tags/pcre2-10.23/src/sljit)
do
svn cat svn://vcs.exim.org/pcre2/code/$version/src/sljit/$srcfile >src/sljit/$srcfile
wc -l src/sljit/$srcfile
done
else
cp -R "$srcdir/src/sljit" src/
wc -l src/sljit/*
fi

0 comments on commit 79256ee

Please sign in to comment.