Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revive 'pcre2-chartables-leakfix' #402

Closed
wants to merge 3 commits into from

Conversation

dscho
Copy link
Member

@dscho dscho commented Oct 16, 2019

As I had stated several times, I was really unhappy how the original fix harped on nedmalloc and totally ignored runtime-configured custom allocators.

So this is, at long last, my attempt to give this a new life. It is based off of maint and needs trivial merge conflict resolutions relative to master.

Changes since v1:

  • I managed to mess up the authorship of 3/3. Sorry for that. I fixed it, so that Carlo is shown as author again.

Cc: Carlo Marcelo Arenas Belón carenas@gmail.com
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

63e7e9d ("git-grep: Learn PCRE", 2011-05-09) didn't include a way
to override the system alocator, and so it is incompatible with
USE_NED_ALLOCATOR as reported by Dscho[1] (in similar code from PCRE2)

Note that nedmalloc, as well as other custom allocators like jemalloc
and mi-malloc, can be configured at runtime (via `LD_PRELOAD`),
therefore we cannot know at compile time whether a custom allocator is
used or not.

Make the minimum change possible to ensure this combination is supported
by extending `grep_init()` to set the PCRE1 specific functions to Git's
idea of `malloc()` and `free()` and therefore making sure all
allocations are done inside PCRE1 with the same allocator than the rest
of Git.

This change has negligible performance impact: PCRE needs to allocate
memory once per program run for the character table and for each pattern
compilation. These are both rare events compared to matching patterns
against lines. Actual measurements[2] show that the impact is lost in
the noise.

[1] https://public-inbox.org/git/pull.306.git.gitgitgadget@gmail.com
[2] https://public-inbox.org/git/7f42007f-911b-c570-17f6-1c6af0429586@web.de

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
94da919 (grep: add support for PCRE v2, 2017-06-01) didn't include
a way to override the system allocator, and so it is incompatible with
custom allocators (e.g. nedmalloc). This problem became obvious when we
tried to plug a memory leak by `free()`ing a data structure allocated by
PCRE2, triggering a segfault in Windows (where we use nedmalloc by
default).

PCRE2 requires the use of a general context to override the allocator
and therefore, there is a lot more code needed than in PCRE1, including
a couple of wrapper functions.

Extend the grep API with a "destructor" that could be called to cleanup
any objects that were created and used globally.

Update `builtin/grep.c` to use that new API, but any other future users
should make sure to have matching `grep_init()`/`grep_destroy()` calls
if they are using the pattern matching functionality.

Move some of the logic that was before done per thread (in the workers)
into an earlier phase to avoid degrading performance, but as the use of
PCRE2 with custom allocators is better understood it is expected more of
its functions will be instructed to use the custom allocator as well as
was done in the original code[1] this work was based on.

[1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member Author

dscho commented Oct 16, 2019

/submit

94da919 ("grep: add support for PCRE v2", 2017-06-01) introduced
a small memory leak visible with valgrind in t7813.

Complete the creation of a PCRE2 specific variable that was missing from
the original change and free the generated table just like it is done
for PCRE1.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member Author

dscho commented Oct 16, 2019

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 16, 2019

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Carlo Marcelo Arenas Bel=C3=B3n via GitGitGadget"
<gitgitgadget@gmail.com> writes:

>  builtin/grep.c |  1 +
>  grep.c         | 34 +++++++++++++++++++++++++++++++++-
>  grep.h         |  1 +
>  3 files changed, 35 insertions(+), 1 deletion(-)

> =20
> +#if defined(USE_LIBPCRE2)
> +	if (!pcre2_global_context)
> +		pcre2_global_context =3D pcre2_general_context_create(
> +					pcre2_malloc, pcre2_free, NULL);
> +#endif

This part should use the same "#ifdef" as the other one which is a
minor nit, for consistency.  I do not care too deeply which way we
unify, but we should stick to one.

> +
>  #ifdef USE_LIBPCRE1
>  	pcre_malloc =3D malloc;
>  	pcre_free =3D free;

Other than that, all 3 patches look sensible, and they certainly got
simplified by dropping the conditional ;-).

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2019

This branch is now known as cb/pcre2-chartables-leakfix.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2019

This patch series was integrated into pu via git@f2a3529.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2019

This patch series was integrated into pu via git@14abe4c.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 18, 2019

This patch series was integrated into next via git@5bc67c6.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 22, 2019

This patch series was integrated into pu via git@236d688.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 23, 2019

This patch series was integrated into pu via git@24cb86d.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 23, 2019

This patch series was integrated into pu via git@e0ff2d4.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 23, 2019

This patch series was integrated into next via git@e0ff2d4.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 23, 2019

This patch series was integrated into master via git@e0ff2d4.

@gitgitgadget gitgitgadget bot added the master label Oct 23, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 23, 2019

Closed via e0ff2d4.

@gitgitgadget gitgitgadget bot closed this Oct 23, 2019
@dscho dscho deleted the pcre2-chartables-leakfix branch October 23, 2019 12:23
@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

This series fixes up bugs in our PCRE v2 wrapper code and how it
handles malloc()/free().

Junio: I'm splitting this off my recently sent 25 patch series, which
I should probably have sent as an RFC:
https://lore.kernel.org/git/20210203032811.14979-1-avarab@gmail.com/

It's on top of "next", as it would otherwise conflict with my
in-flight ab/grep-pcre-invalid-utf8, ab/lose-grep-debug and ab/retire-pcre1.

06/10 is a follow-up improvement (not a fix, the in-flight works fine
too) for ab/grep-pcre-invalid-utf8. The latter two just touch adjacent
lines of code.

There's no notable new behavior here, just cleanup of existing
functionality. In mid-2019 there was a lot of discussion around the
code being fixed here:

    https://lore.kernel.org/git/pull.306.git.gitgitgadget@gmail.com/#t
    https://lore.kernel.org/git/pull.402.git.1571227613.gitgitgadget@gmail.com/

As discussed in 08/10 I believe that fix was so difficult to get right
because it was starting out with a fundamentally incorrect assumption
about how PCRE v2's memory handling works. With 08-10/10 we end up
with a much easier to reason about end-state, as the API itself is
actually quite simple.

Ævar Arnfjörð Bjarmason (10):
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: move definitions of pcre2_{malloc,free}

 builtin/grep.c |  1 -
 grep.c         | 99 ++++++++++++++++++++++----------------------------
 grep.h         |  9 ++++-
 3 files changed, 51 insertions(+), 58 deletions(-)

-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> This series fixes up bugs in our PCRE v2 wrapper code and how it
> handles malloc()/free().
>
> Junio: I'm splitting this off my recently sent 25 patch series, which
> I should probably have sent as an RFC:
> https://lore.kernel.org/git/20210203032811.14979-1-avarab@gmail.com/
>
> It's on top of "next", as it would otherwise conflict with my
> in-flight ab/grep-pcre-invalid-utf8, ab/lose-grep-debug and ab/retire-pcre1.

These three seem to be solid and I am planning to merge them down to
'master' soonish.  I was hoping that the series would get enough
reviews by the time it happens so that I can expect an update that
cleanly applies on top of 'master', and the plan seems to be working
well ;-)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Now based on "master", when I sent v1[1] it dependen on other
in-flight topics of mine.

Addresses minor issues with the commit messages raised by Johannes on
v1, and other commit message issues I noticed myself on re-reading
this.

1. https://lore.kernel.org/git/20210204210556.25242-1-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (10):
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: move definitions of pcre2_{malloc,free}

 builtin/grep.c |  1 -
 grep.c         | 99 ++++++++++++++++++++++----------------------------
 grep.h         |  9 ++++-
 3 files changed, 51 insertions(+), 58 deletions(-)

Range-diff:
 1:  a11f1e91552 !  1:  40d2e47c540 grep/pcre2: drop needless assignment + assert() on opt->pcre2
    @@ Commit message
         There was never a good reason for this, it's just a relic from when I
         initially wrote the PCREv2 support. We're not going to have confusion
         about compile_pcre2_pattern() being called when it shouldn't just
    -    because we forgot to cargo-cult this opt->pcre2 option, and "opt"
    -    is (mostly) used for the options the user supplied, let's avoid the
    -    pattern of needlessly assigning to it.
    +    because we forgot to cargo-cult this opt->pcre2 option.
     
    -    With my in-flight removal of PCREv1 [1] ("Remove support for v1 of the
    -    PCRE library", 2021-01-24) there'll be even less confusion around what
    -    we call where in these codepaths, which is one more reason to remove
    -    this.
    +    Furthermore the "struct grep_opt" is (mostly) used for the options the
    +    user supplied, let's avoid the pattern of needlessly assigning to it.
     
    -    1. https://lore.kernel.org/git/xmqqmtwy29x8.fsf@gitster.c.googlers.com/
    +    With my recent removal of the PCREv1 backend in 7599730b7e2 (Remove
    +    support for v1 of the PCRE library, 2021-01-24) there's even less
    +    confusion around what we call where in these codepaths, which is one
    +    more reason to remove this.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 2:  db0ef9189e3 !  2:  e02f9b5ab50 grep/pcre2: drop needless assignment to NULL
    @@ Commit message
         grep/pcre2: drop needless assignment to NULL
     
         Remove a redundant assignment of pcre2_compile_context dating back to
    -    my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). In
    -    create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
    +    my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01).
    +
    +    In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
         need to NULL out individual members here.
     
         I think this was probably something left over from an earlier
 3:  9c5429f4d96 =  3:  2bcb6c53589 grep/pcre2: correct reference to grep_init() in comment
 4:  e5558f5f0f1 !  4:  78477193cd4 grep/pcre2: prepare to add debugging to pcre2_malloc()
    @@ Commit message
         grep/pcre2: prepare to add debugging to pcre2_malloc()
     
         Change pcre2_malloc() in a way that'll make it easier for a debugging
    -    fprintf() to spew out the allocated pointer. This doesn't introduce
    -    any functional change, it just makes a subsequent commit's diff easier
    -    to read. Changes code added in 513f2b0bbd4 (grep: make PCRE2 aware of
    -    custom allocator, 2019-10-16).
    +    fprintf() to spew out the allocated pointer.
    +
    +    This doesn't introduce any functional change, it just makes a
    +    subsequent commit's diff easier to read. Changes code added in
    +    513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16).
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 5:  7968effaa8a !  5:  cbeb521bd65 grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
    @@ Commit message
         grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
     
         Add optional printing of PCREv2 allocations to stderr for a developer
    -    who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
    -    "1".
    +    who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".
    +
    +    You need to manually change the definition in the source file similar
    +    to the DEBUG_MAILMAP, there's no Makefile knob for this.
     
         This will be referenced a subsequent commit, and is generally useful
         to manually see what's going on with PCREv2 allocations while working
 6:  604e183c224 =  6:  788929c3de6 grep/pcre2: use compile-time PCREv2 version test
 7:  f864a9aac4c !  7:  6a4552c6d4e grep/pcre2: use pcre2_maketables_free() function
    @@ Commit message
         grep/pcre2: use pcre2_maketables_free() function
     
         Make use of the pcre2_maketables_free() function to free the memory
    -    allocated by pcre2_maketables(). At first sight it's strange that
    -    10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
    -    which added the free() call here doesn't make use of the pcre2_free()
    -    the author introduced in the preceding commit in 513f2b0bbd4 (grep:
    -    make PCRE2 aware of custom allocator, 2019-10-16).
    +    allocated by pcre2_maketables().
    +
    +    At first sight it's strange that 10da030ab75 (grep: avoid leak of
    +    chartables in PCRE2, 2019-10-16) which added the free() call here
    +    doesn't make use of the pcre2_free() the author introduced in the
    +    preceding commit in 513f2b0bbd4 (grep: make PCRE2 aware of custom
    +    allocator, 2019-10-16).
     
         The reason is that at the time the function didn't exist. It was first
         introduced in PCREv2 version 10.34, released on 2019-11-21.
     
         Let's make use of it behind a macro. I don't think this matters for
         anything to do with custom allocators, but it makes our use of PCREv2
    -    more discoverable. At some distant point in the future we'll be able
    -    to drop the version guard, as nobody will be running a version older
    -    than 10.34.
    +    more discoverable.
    +
    +    At some distant point in the future we'll be able to drop the version
    +    guard, as nobody will be running a version older than 10.34.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 8:  cea9e066951 !  8:  bf58d597a4b grep/pcre2: actually make pcre2 use custom allocator
    @@ Commit message
         functions for allocation. We'll now use it for all PCREv2 allocations.
     
         The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
    -    issue is because it managed to target pretty much the allocation freed
    -    via free(), as opposed to by a pcre2_*free() function. I.e. the
    -    pcre2_maketables() and pcre2_maketables_free() pair. For most of the
    -    rest we continued allocating with stock malloc() inside PCREv2 itself,
    -    but didn't segfault because we'd use its corresponding free().
    +    issue is because it targeted the allocation freed via free(), as
    +    opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
    +    and pcre2_maketables_free() pair.
    +
    +    For most of the rest we continued allocating with stock malloc()
    +    inside PCREv2 itself, but didn't segfault because we'd use its
    +    corresponding free().
     
         In a preceding commit of mine I changed the free() to
         pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
    @@ Commit message
             grep --threads=1 -iP æ.*var.*xyz
     
         Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
    -    corresponding free() call. Now it's 12 calls to each. This can be
    +    corresponding free() calls. Now it's 12 calls to each. This can be
         observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
     
         Reading the history of how this bug got introduced it wasn't present
    @@ Commit message
         Thus the failed attempts to go down the route of only creating the
         general context in cases where we ourselves call pcre2_maketables(),
         before finally settling on the approach 513f2b0bbd4 took of always
    -    creating it.
    +    creating it, but then mostly not using it.
     
         Instead we should always create it, and then pass the general context
         to those functions that accept it, so that they'll consistently use
         our preferred memory allocation functions.
     
    -    1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
    +    1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
         2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
         3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
     
 9:  99eaa918261 !  9:  129adf872fb grep/pcre2: move back to thread-only PCREv2 structures
    @@ Commit message
         grep/pcre2: move back to thread-only PCREv2 structures
     
         Change the setup of the "pcre2_general_context" to happen per-thread
    -    in compile_pcre2_pattern() instead of in grep_init(), as happens with
    -    all the rest of the pcre2_* members of the grep_pat structure.
    +    in compile_pcre2_pattern() instead of in grep_init().
    +
    +    This change brings it in line with how the rest of the pcre2_* members
    +    in the grep_pat structure are set up.
     
         As noted in the preceding commit the approach 513f2b0bbd4 (grep: make
         PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
         pcre2_general_context seems to have been initially based on a
         misunderstanding of how PCREv2 memory allocation works.
     
    -    This approach of creating a global context is just added complexity
    -    for almost zero gain. On my system it's 24 bytes saved per-thread, for
    -    context PCREv2 will then go on to some kilobytes for its own
    -    thread-local state.
    +    The approach of creating a global context in grep_init() is just added
    +    complexity for almost zero gain. On my system it's 24 bytes saved
    +    per-thread. For comparison PCREv2 will then go on to allocate at least
    +    a kilobyte for its own thread-local state.
     
         As noted in 6d423dd542f (grep: don't redundantly compile throwaway
         patterns under threading, 2017-05-25) the grep code is intentionally
    @@ Commit message
         structures globally, while making others thread-local.
     
         So let's remove this special case and make all of them thread-local
    -    for simplicity again.
    +    again for simplicity. With this change we could move the
    +    pcre2_{malloc,free} functions around to live closer to their current
    +    use. I'm not doing that here to keep this change small, that cleanup
    +    will be done in a follow-up commit.
     
         See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
         2017-06-01) about thread safety, and Johannes's comments[1] to the
10:  462f12126c8 ! 10:  688d1b00d5d grep/pcre2: move definitions of pcre2_{malloc,free}
    @@ Commit message
         grep/pcre2: move definitions of pcre2_{malloc,free}
     
         Move the definitions of the pcre2_{malloc,free} functions above the
    -    compile_pcre2_pattern() function they're used it. Before the preceding
    -    commit they used to be needed earlier, but now we can move them to be
    -    adjacent to the other PCREv2 functions.
    +    compile_pcre2_pattern() function they're used in.
    +
    +    Before the preceding commit they used to be needed earlier, but now we
    +    can move them to be adjacent to the other PCREv2 functions.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Correct a comment added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index f27c5de7f56..b9adcd83e7a 100644
--- a/grep.c
+++ b/grep.c
@@ -373,7 +373,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in append_grep_pattern */
+	/* pcre2_global_context is initialized in grep_init */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer.

This doesn't introduce any functional change, it just makes a
subsequent commit's diff easier to read. Changes code added in
513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index b9adcd83e7a..f96d86c9293 100644
--- a/grep.c
+++ b/grep.c
@@ -45,7 +45,8 @@ static pcre2_general_context *pcre2_global_context;
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
-	return malloc(size);
+	void *pointer = malloc(size);
+	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
> +#define GIT_PCRE2_VERSION_10_36_OR_HIGHER
> +#endif

This is not wrong per-se but it does look misleading not to spell it
as

#if (PCRE2_MAJOR == 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11

which would convey the intention much clearly.

Hint to notice the difference: imagine if it were "9.37 and later in
9.X series, 10.36 and later in 10.X series and anything after 11 are
OK".

In other words, the minor version is always tied to a particular
major version and "major >= X && minor >= Y" is often a bug, even
though in this case it happens to be OK only because 10 and 11 are
consecutive.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
> allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
> functions for allocation. We'll now use it for all PCREv2 allocations.
>
> The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
> issue is because it targeted the allocation freed via free(), as
> opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
> and pcre2_maketables_free() pair.
>
> For most of the rest we continued allocating with stock malloc()
> inside PCREv2 itself, but didn't segfault because we'd use its
> corresponding free().
>
> In a preceding commit of mine I changed the free() to
> pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
> far as fixing the segfault goes ...

Wait, wait.  So, because of the previous step, we would start
segfaulting and we need to fix that breakage, which is the reason
why this commit exists?

If so, ...

> we could revert 513f2b0bbd4. But then
> we wouldn't use the desired allocator, let's just use it instead.

... I agree with the conclusion that both the previous step and this
step are needed and better than a reversion of 513f2b0b (grep: make
PCRE2 aware of custom allocator, 2019-10-16) and the previou step.

But even then, it feels somewhat backwards.  Shouldn't this step
come first, so that we would be using a matching alloc/free pair,
and then do the previous step?

> Instead we should always create it, and then pass the general context
> to those functions that accept it, so that they'll consistently use
> our preferred memory allocation functions.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Now based on "master", when I sent v1[1] it dependen on other
> in-flight topics of mine.

This has been in "Needs review" state for too long, so I looked at
them myself.  Aside from a few minor issues, they all made sense to
me.

Thanks.


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Move the definitions of the pcre2_{malloc,free} functions above the
> compile_pcre2_pattern() function they're used in.
>
> Before the preceding commit they used to be needed earlier, but now we
> can move them to be adjacent to the other PCREv2 functions.

Yes, much nicer.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> So let's remove this special case and make all of them thread-local
> again for simplicity. With this change we could move the
> pcre2_{malloc,free} functions around to live closer to their current
> use. I'm not doing that here to keep this change small, that cleanup
> will be done in a follow-up commit.

Nice.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 4, 2021

User Ævar Arnfjörð Bjarmason <avarab@gmail.com> has been added to the cc: list.

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Drop an assignment added in b65abcafc7a (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6e (grep: add support for PCRE v2,
2017-06-01).

There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option, and "opt"
is (mostly) used for the options the user supplied, let's avoid the
pattern of needlessly assigning to it.

With my in-flight removal of PCREv1 [1] ("Remove support for v1 of the
PCRE library", 2021-01-24) there'll be even less confusion around what
we call where in these codepaths, which is one more reason to remove
this.

1. https://lore.kernel.org/git/xmqqmtwy29x8.fsf@gitster.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/grep.c b/grep.c
index aabfaaa4c3..816e23f17e 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	assert(opt->pcre2);
-
 	p->pcre2_compile_context = NULL;
 
 	/* pcre2_global_context is initialized in append_grep_pattern */
@@ -555,7 +553,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 #endif
 	if (p->fixed || p->is_fixed) {
 #ifdef USE_LIBPCRE2
-		opt->pcre2 = 1;
 		if (p->is_fixed) {
 			compile_pcre2_pattern(p, opt);
 		} else {
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). In
create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.

I think this was probably something left over from an earlier
development version of mine.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/grep.c b/grep.c
index 816e23f17e..f27c5de7f5 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	p->pcre2_compile_context = NULL;
-
 	/* pcre2_global_context is initialized in append_grep_pattern */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Correct a comment added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index f27c5de7f5..b9adcd83e7 100644
--- a/grep.c
+++ b/grep.c
@@ -373,7 +373,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in append_grep_pattern */
+	/* pcre2_global_context is initialized in grep_init */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer. This doesn't introduce
any functional change, it just makes a subsequent commit's diff easier
to read. Changes code added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index b9adcd83e7..f96d86c929 100644
--- a/grep.c
+++ b/grep.c
@@ -45,7 +45,8 @@ static pcre2_general_context *pcre2_global_context;
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
-	return malloc(size);
+	void *pointer = malloc(size);
+	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
"1".

This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/grep.c b/grep.c
index f96d86c929..7d262a23d8 100644
--- a/grep.c
+++ b/grep.c
@@ -42,15 +42,25 @@ static struct grep_opt grep_defaults = {
 
 #ifdef USE_LIBPCRE2
 static pcre2_general_context *pcre2_global_context;
+#define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
 	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
 	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
 {
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
 	free(pointer);
 }
 #endif
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-1579169088-1612953538=:29765
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

On Thu, 4 Feb 2021, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

> Add optional printing of PCREv2 allocations to stderr for a developer
> who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
> "1".

Maybe clarify in the oneline that this is not an environment variable, but
a Makefile knob? I had to read all the way to the diff to understand that
aspect.

Otherwise, the patch series so far looks really fine to me.

Thanks,
Dscho

>
> This will be referenced a subsequent commit, and is generally useful
> to manually see what's going on with PCREv2 allocations while working
> on that code.
>
> Signed-off-by: =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index f96d86c929..7d262a23d8 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -42,15 +42,25 @@ static struct grep_opt grep_defaults =3D {
>
>  #ifdef USE_LIBPCRE2
>  static pcre2_general_context *pcre2_global_context;
> +#define GREP_PCRE2_DEBUG_MALLOC 0
>
>  static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_da=
ta)
>  {
>  	void *pointer =3D malloc(size);
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count =3D 1;
> +	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, s=
ize);
> +#endif
>  	return pointer;
>  }
>
>  static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
>  {
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count =3D 1;
> +	if (pointer)
> +		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> +#endif
>  	free(pointer);
>  }
>  #endif
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

--8323328-1579169088-1612953538=:29765--

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987ed (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.

It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 18 ++++--------------
 grep.h |  3 +++
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/grep.c b/grep.c
index 7d262a23d8..e58044474d 100644
--- a/grep.c
+++ b/grep.c
@@ -400,21 +400,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
 		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
 
+#ifdef GIT_PCRE2_VERSION_10_36_OR_HIGHER
 	/* Work around https://bugs.exim.org/show_bug.cgi?id=2642 fixed in 10.36 */
-	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS)) {
-		struct strbuf buf;
-		int len;
-		int err;
-
-		if ((len = pcre2_config(PCRE2_CONFIG_VERSION, NULL)) < 0)
-			BUG("pcre2_config(..., NULL) failed: %d", len);
-		strbuf_init(&buf, len + 1);
-		if ((err = pcre2_config(PCRE2_CONFIG_VERSION, buf.buf)) < 0)
-			BUG("pcre2_config(..., buf.buf) failed: %d", err);
-		if (versioncmp(buf.buf, "10.36") < 0)
-			options |= PCRE2_NO_START_OPTIMIZE;
-		strbuf_release(&buf);
-	}
+	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS))
+		options |= PCRE2_NO_START_OPTIMIZE;
+#endif
 
 	p->pcre2_pattern = pcre2_compile((PCRE2_SPTR)p->pattern,
 					 p->patternlen, options, &error, &erroffset,
diff --git a/grep.h b/grep.h
index ae89d6254b..54e52042cb 100644
--- a/grep.h
+++ b/grep.h
@@ -4,6 +4,9 @@
 #ifdef USE_LIBPCRE2
 #define PCRE2_CODE_UNIT_WIDTH 8
 #include <pcre2.h>
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_36_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.

The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it managed to target pretty much the allocation freed
via free(), as opposed to by a pcre2_*free() function. I.e. the
pcre2_maketables() and pcre2_maketables_free() pair. For most of the
rest we continued allocating with stock malloc() inside PCREv2 itself,
but didn't segfault because we'd use its corresponding free().

In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd4. But then
we wouldn't use the desired allocator, let's just use it instead.

Before this patch we'd on e.g.:

    grep --threads=1 -iP æ.*var.*xyz

Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() call. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.

Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.

My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:

    "most of the time (like when using UTF-8) the chartable (and
    therefore the global context) is not needed (even when using
    alternate allocators)"

That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().

Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().

That's not what PCREv2's compile context is. To quote PCREv2's
documentation:

    "This context just contains pointers to (and data for) external
    memory management functions that are called from several places in
    the PCRE2 library."

Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd4 took of always
creating it.

Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.

1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index c63dbff4b2..0116ff5f09 100644
--- a/grep.c
+++ b/grep.c
@@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 			if (!pcre2_global_context)
 				BUG("pcre2_global_context uninitialized");
 			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
+			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-1170852189-1612960719=:29765
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

ACK!

And thank you for this patch,
Dscho

On Thu, 4 Feb 2021, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

> Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
> allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
> functions for allocation. We'll now use it for all PCREv2 allocations.
>
> The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
> issue is because it managed to target pretty much the allocation freed
> via free(), as opposed to by a pcre2_*free() function. I.e. the
> pcre2_maketables() and pcre2_maketables_free() pair. For most of the
> rest we continued allocating with stock malloc() inside PCREv2 itself,
> but didn't segfault because we'd use its corresponding free().
>
> In a preceding commit of mine I changed the free() to
> pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
> far as fixing the segfault goes we could revert 513f2b0bbd4. But then
> we wouldn't use the desired allocator, let's just use it instead.
>
> Before this patch we'd on e.g.:
>
>     grep --threads=3D1 -iP =C3=A6.*var.*xyz
>
> Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
> corresponding free() call. Now it's 12 calls to each. This can be
> observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
>
> Reading the history of how this bug got introduced it wasn't present
> in Johannes's original patch[1] to fix the issue.
>
> My reading of that thread is that the approach the follow-up patches
> to Johannes's original pursued were based on misunderstanding of how
> the PCREv2 API works. In particular this part of [2]:
>
>     "most of the time (like when using UTF-8) the chartable (and
>     therefore the global context) is not needed (even when using
>     alternate allocators)"
>
> That's simply not how PCREv2 memory allocation works. It's easy to see
> how the misunderstanding came about. It's because (as noted above) the
> issue was noticed because of our use of free() in our own grep.c for
> freeing the memory allocated by pcre2_maketables().
>
> Thus the misunderstanding that PCREv2's compile context is something
> only needed for pcre2_maketables(), and e.g. an aborted earlier
> attempt[3] to only set it up when we ourselves called
> pcre2_maketables().
>
> That's not what PCREv2's compile context is. To quote PCREv2's
> documentation:
>
>     "This context just contains pointers to (and data for) external
>     memory management functions that are called from several places in
>     the PCRE2 library."
>
> Thus the failed attempts to go down the route of only creating the
> general context in cases where we ourselves call pcre2_maketables(),
> before finally settling on the approach 513f2b0bbd4 took of always
> creating it.
>
> Instead we should always create it, and then pass the general context
> to those functions that accept it, so that they'll consistently use
> our preferred memory allocation functions.
>
> 1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b=
.1565005867.git.gitgitgadget@gmail.com/
> 2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkd=
vKBp53YBoA@mail.gmail.com/
> 3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
>
> Signed-off-by: =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index c63dbff4b2..0116ff5f09 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p=
, const struct grep_opt *opt
>  			if (!pcre2_global_context)
>  				BUG("pcre2_global_context uninitialized");
>  			p->pcre2_tables =3D pcre2_maketables(pcre2_global_context);
> -			p->pcre2_compile_context =3D pcre2_compile_context_create(NULL);
> +			p->pcre2_compile_context =3D pcre2_compile_context_create(pcre2_glob=
al_context);
>  			pcre2_set_character_tables(p->pcre2_compile_context,
>  							p->pcre2_tables);
>  		}
> @@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p=
, const struct grep_opt *opt
>  					 p->pcre2_compile_context);
>
>  	if (p->pcre2_pattern) {
> -		p->pcre2_match_data =3D pcre2_match_data_create_from_pattern(p->pcre2=
_pattern, NULL);
> +		p->pcre2_match_data =3D pcre2_match_data_create_from_pattern(p->pcre2=
_pattern, pcre2_global_context);
>  		if (!p->pcre2_match_data)
>  			die("Couldn't allocate PCRE2 match data");
>  	} else {
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

--8323328-1170852189-1612960719=:29765--

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables(). At first sight it's strange that
10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
which added the free() call here doesn't make use of the pcre2_free()
the author introduced in the preceding commit in 513f2b0bbd4 (grep:
make PCRE2 aware of custom allocator, 2019-10-16).

The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.

Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable. At some distant point in the future we'll be able
to drop the version guard, as nobody will be running a version older
than 10.34.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++++
 grep.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/grep.c b/grep.c
index e58044474d..c63dbff4b2 100644
--- a/grep.c
+++ b/grep.c
@@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_compile_context_free(p->pcre2_compile_context);
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
+#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
+	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+#else
 	free((void *)p->pcre2_tables);
+#endif
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 54e52042cb..64666e9204 100644
--- a/grep.h
+++ b/grep.h
@@ -7,6 +7,9 @@
 #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
 #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
 #endif
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-395204050-1612953783=:29765
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

On Thu, 4 Feb 2021, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

> Make use of the pcre2_maketables_free() function to free the memory
> allocated by pcre2_maketables(). At first sight it's strange that
> 10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
> which added the free() call here doesn't make use of the pcre2_free()
> the author introduced in the preceding commit in 513f2b0bbd4 (grep:
> make PCRE2 aware of custom allocator, 2019-10-16).
>
> The reason is that at the time the function didn't exist. It was first
> introduced in PCREv2 version 10.34, released on 2019-11-21.

Git for Windows still uses v10.33. Thanks for the prod, I will update the
library.

Ciao,
Dscho

>
> Let's make use of it behind a macro. I don't think this matters for
> anything to do with custom allocators, but it makes our use of PCREv2
> more discoverable. At some distant point in the future we'll be able
> to drop the version guard, as nobody will be running a version older
> than 10.34.
>
> Signed-off-by: =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++++
>  grep.h | 3 +++
>  2 files changed, 7 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index e58044474d..c63dbff4b2 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
>  	pcre2_compile_context_free(p->pcre2_compile_context);
>  	pcre2_code_free(p->pcre2_pattern);
>  	pcre2_match_data_free(p->pcre2_match_data);
> +#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
> +#else
>  	free((void *)p->pcre2_tables);
> +#endif
>  }
>  #else /* !USE_LIBPCRE2 */
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep=
_opt *opt)
> diff --git a/grep.h b/grep.h
> index 54e52042cb..64666e9204 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -7,6 +7,9 @@
>  #if (PCRE2_MAJOR >=3D 10 && PCRE2_MINOR >=3D 36) || PCRE2_MAJOR >=3D 11
>  #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
>  #endif
> +#if (PCRE2_MAJOR >=3D 10 && PCRE2_MINOR >=3D 34) || PCRE2_MAJOR >=3D 11
> +#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +#endif
>  #else
>  typedef int pcre2_code;
>  typedef int pcre2_match_data;
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

--8323328-395204050-1612953783=:29765--

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Make use of the pcre2_maketables_free() function to free the memory
> allocated by pcre2_maketables(). At first sight it's strange that
> 10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
> which added the free() call here doesn't make use of the pcre2_free()
> the author introduced in the preceding commit in 513f2b0bbd4 (grep:
> make PCRE2 aware of custom allocator, 2019-10-16).
>
> The reason is that at the time the function didn't exist. It was first
> introduced in PCREv2 version 10.34, released on 2019-11-21.
>
> Let's make use of it behind a macro. I don't think this matters for
> anything to do with custom allocators, but it makes our use of PCREv2
> more discoverable. At some distant point in the future we'll be able
> to drop the version guard, as nobody will be running a version older
> than 10.34.

OK.  The same comment about the macro that happens to be OK only
because 10 and 11 are consecutive applies here.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++++
>  grep.h | 3 +++
>  2 files changed, 7 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index e58044474d..c63dbff4b2 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
>  	pcre2_compile_context_free(p->pcre2_compile_context);
>  	pcre2_code_free(p->pcre2_pattern);
>  	pcre2_match_data_free(p->pcre2_match_data);
> +#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
> +#else
>  	free((void *)p->pcre2_tables);
> +#endif
>  }
>  #else /* !USE_LIBPCRE2 */
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
> diff --git a/grep.h b/grep.h
> index 54e52042cb..64666e9204 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -7,6 +7,9 @@
>  #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
>  #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
>  #endif
> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
> +#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +#endif
>  #else
>  typedef int pcre2_code;
>  typedef int pcre2_match_data;

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init(), as happens with
all the rest of the pcre2_* members of the grep_pat structure.

As noted in the preceding commit the approach 513f2b0bbd4 (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

This approach of creating a global context is just added complexity
for almost zero gain. On my system it's 24 bytes saved per-thread, for
context PCREv2 will then go on to some kilobytes for its own
thread-local state.

As noted in 6d423dd542f (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
for simplicity again.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  1 -
 grep.c         | 41 +++++++++++++++--------------------------
 grep.h         |  3 ++-
 3 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 55d06c9513..c69fe99340 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1175,6 +1175,5 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
-	grep_destroy();
 	return !hit;
 }
diff --git a/grep.c b/grep.c
index 0116ff5f09..2599f329cd 100644
--- a/grep.c
+++ b/grep.c
@@ -41,7 +41,6 @@ static struct grep_opt grep_defaults = {
 };
 
 #ifdef USE_LIBPCRE2
-static pcre2_general_context *pcre2_global_context;
 #define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
@@ -163,20 +162,9 @@ int grep_config(const char *var, const char *value, void *cb)
  * Initialize one instance of grep_opt and copy the
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
- *
- * If using PCRE, make sure that the library is configured
- * to use the same allocator as Git (e.g. nedmalloc on Windows).
- *
- * Any allocated memory needs to be released in grep_destroy().
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
-#if defined(USE_LIBPCRE2)
-	if (!pcre2_global_context)
-		pcre2_global_context = pcre2_general_context_create(
-					pcre2_malloc, pcre2_free, NULL);
-#endif
-
 	*opt = grep_defaults;
 
 	opt->repo = repo;
@@ -186,13 +174,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 	opt->header_tail = &opt->header_list;
 }
 
-void grep_destroy(void)
-{
-#ifdef USE_LIBPCRE2
-	pcre2_general_context_free(pcre2_global_context);
-#endif
-}
-
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
 {
 	/*
@@ -384,13 +365,20 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in grep_init */
+	/*
+	 * Call pcre2_general_context_create() before calling any
+	 * other pcre2_*(). It sets up our malloc()/free() functions
+	 * with which everything else is allocated.
+	 */
+	p->pcre2_general_context = pcre2_general_context_create(
+		pcre2_malloc, pcre2_free, NULL);
+	if (!p->pcre2_general_context)
+		die("Couldn't allocate PCRE2 general context");
+
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-			if (!pcre2_global_context)
-				BUG("pcre2_global_context uninitialized");
-			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
+			p->pcre2_tables = pcre2_maketables(p->pcre2_general_context);
+			p->pcre2_compile_context = pcre2_compile_context_create(p->pcre2_general_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +399,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, p->pcre2_general_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
@@ -491,10 +479,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
 #ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
-	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+	pcre2_maketables_free(p->pcre2_general_context, p->pcre2_tables);
 #else
 	free((void *)p->pcre2_tables);
 #endif
+	pcre2_general_context_free(p->pcre2_general_context);
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 64666e9204..72f82b1e30 100644
--- a/grep.h
+++ b/grep.h
@@ -14,6 +14,7 @@
 typedef int pcre2_code;
 typedef int pcre2_match_data;
 typedef int pcre2_compile_context;
+typedef int pcre2_general_context;
 #endif
 #ifndef PCRE2_MATCH_INVALID_UTF
 /* PCRE2_MATCH_* dummy also with !USE_LIBPCRE2, for test-pcre2-config.c */
@@ -75,6 +76,7 @@ struct grep_pat {
 	pcre2_code *pcre2_pattern;
 	pcre2_match_data *pcre2_match_data;
 	pcre2_compile_context *pcre2_compile_context;
+	pcre2_general_context *pcre2_general_context;
 	const uint8_t *pcre2_tables;
 	uint32_t pcre2_jit_on;
 	unsigned fixed:1;
@@ -167,7 +169,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
-void grep_destroy(void);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
-- 
2.30.0.284.gd98b1dd5eaa7

@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
run_pager(&opt, prefix);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used it. Before the preceding
commit they used to be needed earlier, but now we can move them to be
adjacent to the other PCREv2 functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/grep.c b/grep.c
index 2599f329cd..636ac48bf0 100644
--- a/grep.c
+++ b/grep.c
@@ -40,30 +40,6 @@ static struct grep_opt grep_defaults = {
 	.output = std_output,
 };
 
-#ifdef USE_LIBPCRE2
-#define GREP_PCRE2_DEBUG_MALLOC 0
-
-static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
-{
-	void *pointer = malloc(size);
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
-#endif
-	return pointer;
-}
-
-static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-{
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	if (pointer)
-		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
-#endif
-	free(pointer);
-}
-#endif
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len)
 }
 
 #ifdef USE_LIBPCRE2
+#define GREP_PCRE2_DEBUG_MALLOC 0
+
+static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
+{
+	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
+	return pointer;
+}
+
+static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
+{
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
+	free(pointer);
+}
+
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
 {
 	int error;
-- 
2.30.0.284.gd98b1dd5eaa7

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--8323328-766125519-1612960859=:29765
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi =C3=86var,

On Thu, 4 Feb 2021, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote:

> Move the definitions of the pcre2_{malloc,free} functions above the
> compile_pcre2_pattern() function they're used it. Before the preceding

s/it/in/

Thank you for this entire patch series. I like its intention and its
execution.

Ciao,
Dscho

> commit they used to be needed earlier, but now we can move them to be
> adjacent to the other PCREv2 functions.
>
> Signed-off-by: =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 46 ++++++++++++++++++++++------------------------
>  1 file changed, 22 insertions(+), 24 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 2599f329cd..636ac48bf0 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -40,30 +40,6 @@ static struct grep_opt grep_defaults =3D {
>  	.output =3D std_output,
>  };
>
> -#ifdef USE_LIBPCRE2
> -#define GREP_PCRE2_DEBUG_MALLOC 0
> -
> -static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_da=
ta)
> -{
> -	void *pointer =3D malloc(size);
> -#if GREP_PCRE2_DEBUG_MALLOC
> -	static int count =3D 1;
> -	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, s=
ize);
> -#endif
> -	return pointer;
> -}
> -
> -static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
> -{
> -#if GREP_PCRE2_DEBUG_MALLOC
> -	static int count =3D 1;
> -	if (pointer)
> -		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> -#endif
> -	free(pointer);
> -}
> -#endif
> -
>  static const char *color_grep_slots[] =3D {
>  	[GREP_COLOR_CONTEXT]	    =3D "context",
>  	[GREP_COLOR_FILENAME]	    =3D "filename",
> @@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len)
>  }
>
>  #ifdef USE_LIBPCRE2
> +#define GREP_PCRE2_DEBUG_MALLOC 0
> +
> +static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_da=
ta)
> +{
> +	void *pointer =3D malloc(size);
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count =3D 1;
> +	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, s=
ize);
> +#endif
> +	return pointer;
> +}
> +
> +static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
> +{
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count =3D 1;
> +	if (pointer)
> +		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> +#endif
> +	free(pointer);
> +}
> +
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep=
_opt *opt)
>  {
>  	int error;
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

--8323328-766125519-1612960859=:29765--

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants