Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

preliminary pcre2 support #4450

Closed
mc-butler opened this issue Mar 12, 2023 · 11 comments
Closed

preliminary pcre2 support #4450

mc-butler opened this issue Mar 12, 2023 · 11 comments
Assignees
Labels
area: search Search subsystem prio: medium Has the potential to affect progress
Milestone

Comments

@mc-butler
Copy link

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/4450
Reporter broly (gagan@….com)

hi,

since my DD-WRT build is now moving to pcre2, i thought i'd give a stab at providing pcre2 support.

since brainslayer is using an ass-old version of MC, i have provided a patch set of the changes necessary to use pcre2.

it compiles fine and from what i can see, it runs fine.

but i haven't tested the search properly because i don't want to restart my router to accommodate the newer glib/pcre (i used LD_LIBRARY_PATH with the newer libraries to test the new executable).

Note

Original attachments:

@mc-butler
Copy link
Author

Changed by broly (gagan@….com) on Mar 12, 2023 at 20:42 UTC (comment 1)

if anyone wants to fix any potential bugs that may arise from my changes, use these:

apache/httpd@c602ba1#diff-4d479e1bf17e6539c339b182d3c0861a1bd0b0ccbad5226c437d0e30f8cae0c7

i3/i3#4682 (comment)

i will note that, as of the second patch, i have tested the search and it works exactly as you expect!

very happy about this, considering how annoying privoxy is for similar changes.

ENJOY BOYS!

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Mar 13, 2023 at 5:34 UTC (comment 2)

  • Component changed from mc-core to mc-search

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Mar 15, 2023 at 18:00 UTC (comment 3)

The patch is corrupted:

$ git apply shitty_pcre2.patch
shitty_pcre2.patch:27: trailing whitespace.
    lc_mc_search->num_results = pcre2_match (regex, 
shitty_pcre2.patch:53: trailing whitespace.
        int pcre_options = 
shitty_pcre2.patch:56: trailing whitespace.
#else           
shitty_pcre2.patch:62: trailing whitespace.
            pcre_options |= 
shitty_pcre2.patch:70: trailing whitespace.
                pcre_options |= 
error: corrupt patch at line 113

@mc-butler
Copy link
Author

Changed by broly (gagan@….com) on Mar 19, 2023 at 18:31 UTC (comment 3.4)

Replying to andrew_b:

The patch is corrupted:

$ git apply shitty_pcre2.patch
shitty_pcre2.patch:27: trailing whitespace.
    lc_mc_search->num_results = pcre2_match (regex, 
shitty_pcre2.patch:53: trailing whitespace.
        int pcre_options = 
shitty_pcre2.patch:56: trailing whitespace.
#else           
shitty_pcre2.patch:62: trailing whitespace.
            pcre_options |= 
shitty_pcre2.patch:70: trailing whitespace.
                pcre_options |= 
error: corrupt patch at line 113

sorry for the late reply. turns out trying to jam in a forgotten macro corrupted the patch, because i didn't wanna make a new diff and then delete all of the differences for the */Makefile.in files xD

the updated patch has been tested and applied on my end this time. check it out.

@mc-butler
Copy link
Author

Changed by broly (gagan@….com) on Mar 19, 2023 at 18:32 UTC

fix patch corruption issue.

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Mar 25, 2023 at 7:34 UTC (comment 5)

  • Status changed from new to accepted
  • Branch state changed from no branch to on review
  • Milestone changed from Future Releases to 4.8.30
  • Owner set to andrew_b

Branch: 4450_pcre2
[49624e4]

@broly it would be nice if you revealed your real name for AUTHORS file.

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Apr 2, 2023 at 17:53 UTC (comment 6)

  • Votes set to andrew_b
  • Branch state changed from on review to approved

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Apr 2, 2023 at 17:58 UTC (comment 7)

  • Votes changed from andrew_b to committed-master
  • Branch state changed from approved to merged
  • Resolution set to fixed
  • Status changed from accepted to testing

Merged to master: [86a9e0b].

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Apr 2, 2023 at 17:59 UTC (comment 8)

  • Status changed from testing to closed

@mc-butler
Copy link
Author

Changed by eugenesan (eugenesan@….com) on Aug 27, 2023 at 17:20 UTC (comment 9)

I've been doing some testing with PCRE2 enabled MC and found that it is ~2x faster but provides different search results, which is concerning.

I performed "Find File with Content (Regular expression)" ^.*pcre.*$" on a tree of mixed text, binaries and archives (total ~90k / 3GB).
PCRE2 compiled MC provided ~0.005% (4476 vs 4453) less results when compared to default search engine.
Most of the misses where in binary files (.jar archives to be precise).

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Aug 28, 2023 at 17:16 UTC (comment 10)

This is certainly concerning. Did you figure out whether these missed are "real" misses, or the old search is actually wrong?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: search Search subsystem prio: medium Has the potential to affect progress
Development

No branches or pull requests

2 participants