Commit 1cac281

Merge branch 'merge-pcre' into 10.0
2 parents 895b253 + dfd7749 commit 1cac281

30 files changed, 1449 additions and 1166 deletions

pcre/AUTHORS

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ Email domain: cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2016 University of Cambridge
+Copyright (c) 1997-2017 University of Cambridge
 All rights reserved


@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Emain domain: freemail.hu

-Copyright(c) 2010-2016 Zoltan Herczeg
+Copyright(c) 2010-2017 Zoltan Herczeg
 All rights reserved.


@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Emain domain: freemail.hu

-Copyright(c) 2009-2016 Zoltan Herczeg
+Copyright(c) 2009-2017 Zoltan Herczeg
 All rights reserved.


pcre/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -66,6 +66,7 @@
 # 2013-10-08 PH got rid of the "source" command, which is a bash-ism (use ".")
 # 2013-11-05 PH added support for PARENS_NEST_LIMIT
 # 2016-03-01 PH applied Chris Wilson's patch for MSVC static build
+# 2016-06-24 PH applied Chris Wilson's revised patch (adds a separate option)

 PROJECT(PCRE C CXX)

pcre/ChangeLog

Lines changed: 47 additions & 0 deletions
@@ -4,6 +4,53 @@ ChangeLog for PCRE
 Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
 development is happening in the PCRE2 10.xx series.

+Version 8.40 11-January-2017
+----------------------------
+
+1. Using -o with -M in pcregrep could cause unnecessary repeated output when
+the match extended over a line boundary.
+
+2. Applied Chris Wilson's second patch (Bugzilla #1681) to CMakeLists.txt for
+MSVC static compilation, putting the first patch under a new option.
+
+3. Fix register overwite in JIT when SSE2 acceleration is enabled.
+
+4. Ignore "show all captures" (/=) for DFA matching.
+
+5. Fix JIT unaligned accesses on x86. Patch by Marc Mutz.
+
+6. In any wide-character mode (8-bit UTF or any 16-bit or 32-bit mode),
+without PCRE_UCP set, a negative character type such as \D in a positive
+class should cause all characters greater than 255 to match, whatever else
+is in the class. There was a bug that caused this not to happen if a
+Unicode property item was added to such a class, for example [\D\P{Nd}] or
+[\W\pL].
+
+7. When pcretest was outputing information from a callout, the caret indicator
+for the current position in the subject line was incorrect if it was after
+an escape sequence for a character whose code point was greater than
+\x{ff}.
+
+8. A pattern such as (?<RA>abc)(?(R)xyz) was incorrectly compiled such that
+the conditional was interpreted as a reference to capturing group 1 instead
+of a test for recursion. Any group whose name began with R was
+misinterpreted in this way. (The reference interpretation should only
+happen if the group's name is precisely "R".)
+
+9. A number of bugs have been mended relating to match start-up optimizations
+when the first thing in a pattern is a positive lookahead. These all
+applied only when PCRE_NO_START_OPTIMIZE was *not* set:
+
+(a) A pattern such as (?=.*X)X$ was incorrectly optimized as if it needed
+both an initial 'X' and a following 'X'.
+(b) Some patterns starting with an assertion that started with .* were
+incorrectly optimized as having to match at the start of the subject or
+after a newline. There are cases where this is not true, for example,
+(?=.*[A-Z])(?=.{8,16})(?!.*[\s]) matches after the start in lines that
+start with spaces. Starting .* in an assertion is no longer taken as an
+indication of matching at the start (or after a newline).
+
+
 Version 8.39 14-June-2016
 -------------------------
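
Note on item 9 of the 8.40 entry above: the two example patterns can be checked against the matching behaviour the ChangeLog describes as correct. The sketch below uses Python's `re` module purely as a stand-in engine. This is an assumption for illustration: it is not PCRE, and it has no counterpart to PCRE_NO_START_OPTIMIZE. The helper name `first_match_start` is hypothetical.

```python
import re

def first_match_start(pattern, subject):
    """Return the offset of the first match, or None if there is none."""
    m = re.search(pattern, subject)
    return m.start() if m else None

# 9(a): a single 'X' at the end satisfies both the lookahead and the body,
# so the subject does not need two separate 'X' characters.
only_one_x = first_match_start(r"(?=.*X)X$", "aX")

# 9(b): the match begins after the leading spaces, that is, neither at the
# start of the subject nor immediately after a newline.
after_spaces = first_match_start(
    r"(?=.*[A-Z])(?=.{8,16})(?!.*[\s])", "  Abcdefghij"
)

print(only_one_x, after_spaces)
```

An engine that wrongly required two 'X' characters, or that anchored the second pattern to the start of the line, would report no match for these subjects.
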

pcre/LICENCE

Lines changed: 3 additions & 3 deletions
@@ -25,7 +25,7 @@ Email domain: cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2016 University of Cambridge
+Copyright (c) 1997-2017 University of Cambridge
 All rights reserved.


@@ -36,7 +36,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Emain domain: freemail.hu

-Copyright(c) 2010-2016 Zoltan Herczeg
+Copyright(c) 2010-2017 Zoltan Herczeg
 All rights reserved.


@@ -47,7 +47,7 @@ Written by: Zoltan Herczeg
 Email local part: hzmester
 Emain domain: freemail.hu

-Copyright(c) 2009-2016 Zoltan Herczeg
+Copyright(c) 2009-2017 Zoltan Herczeg
 All rights reserved.


pcre/NEWS

Lines changed: 6 additions & 0 deletions
@@ -1,6 +1,12 @@
 News about PCRE releases
 ------------------------

+Release 8.40 11-January-2017
+----------------------------
+
+This is a bug-fix release.
+
+
 Release 8.39 14-June-2016
 -------------------------

pcre/configure.ac

Lines changed: 5 additions & 5 deletions
@@ -9,17 +9,17 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
 dnl be defined as -RC2, for example. For real releases, it should be empty.

 m4_define(pcre_major, [8])
-m4_define(pcre_minor, [39])
+m4_define(pcre_minor, [40])
 m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2016-06-14])
+m4_define(pcre_date, [2017-01-11])

 # NOTE: The CMakeLists.txt file searches for the above variables in the first
 # 50 lines of this file. Please update that if the variables above are moved.

 # Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:7:2])
-m4_define(libpcre16_version, [2:7:2])
-m4_define(libpcre32_version, [0:7:0])
+m4_define(libpcre_version, [3:8:2])
+m4_define(libpcre16_version, [2:8:2])
+m4_define(libpcre32_version, [0:8:0])
 m4_define(libpcreposix_version, [0:4:0])
 m4_define(libpcrecpp_version, [0:1:0])
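
Note on the libtool triples changed above: they follow the current:revision:age scheme, and only the revision field is bumped here, which in libtool terms means the implementation changed but the interface did not. As a rough sketch, assuming the conventional GNU/Linux mapping (major = current - age, file suffix major.age.revision; other platforms derive names differently), the hypothetical helper `linux_soname_suffix` shows which shared-library version each triple would produce.

```python
def linux_soname_suffix(version_info):
    """Map a libtool 'current:revision:age' triple to 'major.age.revision'.

    Assumes the usual GNU/Linux libtool convention; other platforms
    derive library file names differently.
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    return f"{current - age}.{age}.{revision}"

# Triples taken from the new side of the configure.ac hunk.
for name, info in [
    ("libpcre", "3:8:2"),
    ("libpcre16", "2:8:2"),
    ("libpcre32", "0:8:0"),
]:
    print(f"{name}.so.{linux_soname_suffix(info)}")
```

Under that assumption the major number (current - age) is unchanged from the old 3:7:2, 2:7:2 and 0:7:0 values, so existing binaries keep linking against the same soname.
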

pcre/doc/html/pcrecompat.html

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ <h1>pcrecompat man page</h1>
 14. PCRE's handling of duplicate subpattern numbers and duplicate subpattern
 names is not as general as Perl's. This is a consequence of the fact the PCRE
 works internally just with numbers, using an external table to translate
-between numbers and names. In particular, a pattern such as (?|(?&#60;a&#62;A)|(?&#60;b)B),
+between numbers and names. In particular, a pattern such as (?|(?&#60;a&#62;A)|(?&#60;b&#62;B),
 where the two capturing parentheses have the same number but different names,
 is not supported, and causes an error at compile time. If it were allowed, it
 would not be possible to distinguish which parentheses matched, because both

pcre/doc/html/pcrepattern.html

Lines changed: 20 additions & 17 deletions
@@ -358,24 +358,24 @@ <h1>pcrepattern man page</h1>
 generate the appropriate EBCDIC code values. The \c escape is processed
 as specified for Perl in the <b>perlebcdic</b> document. The only characters
 that are allowed after \c are A-Z, a-z, or one of @, [, \, ], ^, _, or ?. Any
-other character provokes a compile-time error. The sequence \@ encodes
-character code 0; the letters (in either case) encode characters 1-26 (hex 01
-to hex 1A); [, \, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and
-\? becomes either 255 (hex FF) or 95 (hex 5F).
+other character provokes a compile-time error. The sequence \c@ encodes
+character code 0; after \c the letters (in either case) encode characters 1-26
+(hex 01 to hex 1A); [, \, ], ^, and _ encode characters 27-31 (hex 1B to hex
+1F), and \c? becomes either 255 (hex FF) or 95 (hex 5F).
 </P>
 <P>
-Thus, apart from \?, these escapes generate the same character code values as
+Thus, apart from \c?, these escapes generate the same character code values as
 they do in an ASCII environment, though the meanings of the values mostly
-differ. For example, \G always generates code value 7, which is BEL in ASCII
+differ. For example, \cG always generates code value 7, which is BEL in ASCII
 but DEL in EBCDIC.
 </P>
 <P>
-The sequence \? generates DEL (127, hex 7F) in an ASCII environment, but
+The sequence \c? generates DEL (127, hex 7F) in an ASCII environment, but
 because 127 is not a control character in EBCDIC, Perl makes it generate the
 APC character. Unfortunately, there are several variants of EBCDIC. In most of
 them the APC character has the value 255 (hex FF), but in the one Perl calls
 POSIX-BC its value is 95 (hex 5F). If certain other characters have POSIX-BC
-values, PCRE makes \? generate 95; otherwise it generates 255.
+values, PCRE makes \c? generate 95; otherwise it generates 255.
 </P>
 <P>
 After \0 up to two further octal digits are read. If there are fewer than two
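
Note on the \c corrections in the hunk above: the documented code values (\c@ is 0, letters are 1-26, [ \ ] ^ _ are 27-31, and \c? is 127 in an ASCII environment) all follow from one piece of arithmetic. A minimal sketch, assuming the common "uppercase the character, then flip bit 0x40" formulation; the function name `control_code` is hypothetical, the accepted characters mirror the list quoted in the text, and the EBCDIC handling of \c? (255 or 95) is deliberately not modelled.

```python
# Characters the quoted documentation lists as valid after \c
# (letters are accepted in either case and uppercased first).
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ@[\\]^_?")

def control_code(ch):
    """Return the code value of \\c<ch> in an ASCII environment."""
    if len(ch) != 1 or ch.upper() not in ALLOWED:
        raise ValueError(f"\\c{ch} is not a valid escape")
    # Flipping bit 6 maps '@'..'_' onto 0..31 and '?' onto 127 (DEL).
    return ord(ch.upper()) ^ 0x40

for ch in "@", "G", "z", "[", "_", "?":
    print(f"\\c{ch} -> {control_code(ch)}")
```
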
@@ -1512,13 +1512,8 @@ <h1>pcrepattern man page</h1>
 <P>
 When one of these option changes occurs at top level (that is, not inside
 subpattern parentheses), the change applies to the remainder of the pattern
-that follows. If the change is placed right at the start of a pattern, PCRE
-extracts it into the global options (and it will therefore show up in data
-extracted by the <b>pcre_fullinfo()</b> function).
-</P>
-<P>
-An option change within a subpattern (see below for a description of
-subpatterns) affects only that part of the subpattern that follows it, so
+that follows. An option change within a subpattern (see below for a description
+of subpatterns) affects only that part of the subpattern that follows it, so
 <pre>
 (a(?i)b)c
 </pre>
@@ -2160,6 +2155,14 @@ <h1>pcrepattern man page</h1>
 always, does do capturing in negative assertions.)
 </P>
 <P>
+WARNING: If a positive assertion containing one or more capturing subpatterns
+succeeds, but failure to match later in the pattern causes backtracking over
+this assertion, the captures within the assertion are reset only if no higher
+numbered captures are already set. This is, unfortunately, a fundamental
+limitation of the current implementation, and as PCRE1 is now in
+maintenance-only status, it is unlikely ever to change.
+</P>
+<P>
 For compatibility with Perl, assertion subpatterns may be repeated; though
 it makes no sense to assert the same thing several times, the side effect of
 capturing parentheses may occasionally be useful. In practice, there only three
@@ -3264,9 +3267,9 @@ <h1>pcrepattern man page</h1>
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 14 June 2015
+Last updated: 23 October 2016
 <br>
-Copyright &copy; 1997-2015 University of Cambridge.
+Copyright &copy; 1997-2016 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE index page</a>.
