-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BBC: Branch Reset Change Likely Broke Plugin Regex Engines. #20710
Comments
If we wanted to grep CPAN for libraries affected by this change, would there be any strings to grep on? |
re-engine-{RE2,TRE,GNU,Lua} is the place to start. |
Using this pattern/URL: https://grep.metacpan.org/search?size=20&q=re-engine-%28RE2%7CTRE%7CGNU%7CLua%29&qd=&qft= I identified 5 CPAN distributions likely to be affected by this change.
I then tested them on a very recent threaded build on FreeBSD: v5.37.7-134-g25948dfb24. The 4 're-engine' distros all FAILed. The MarpaX distro depends (I think) on re-engine-GNU and so was classified as DISCARD. Relevant CPANtesters reports: [fail] [PERRAD/re-engine-Lua-0.19.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.8] [bc897e06-9529-11ed-96a9-c4aca9ff8ba7] [2023-01-15T23:10:00Z] [fail] [SYP/re-engine-TRE-0.09.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.8] [b6c523e4-9529-11ed-b82d-1841aaff8ba7] [2023-01-15T23:09:51Z] [fail] [DGL/re-engine-RE2-0.14.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.8] [b0855044-9529-11ed-aec9-c528a9ff8ba7] [2023-01-15T23:09:40Z] [fail] [JDDPAUSE/re-engine-GNU-0.024.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.8] [8313edfa-9529-11ed-85de-e76caaff8ba7] [2023-01-15T23:08:24Z] |
On Mon, 16 Jan 2023 at 00:25, James E Keenan ***@***.***> wrote:
If we wanted to grep CPAN for libraries affected by this change, would
there be any strings to grep on?
re-engine-{RE2,TRE,GNU,Lua} is the place to start.
Using this pattern/URL:
https://grep.metacpan.org/search?size=20&q=re-engine-%28RE2%7CTRE%7CGNU%7CLua%29&qd=&qft=
I identified 5 CPAN distributions likely to be affected by this change.
MarpaX-Languages-M4
re-engine-RE2
re-engine-TRE
re-engine-GNU
re-engine-Lua
I then tested them on a very recent threaded build on FreeBSD:
v5.37.7-134-g25948dfb24.
The 4 're-engine' distros all FAILed. The MarpaX distro depends (I think)
on re-engine-GNU and so was classified as DISCARD.
Relevant CPANtesters reports:
[fail] [PERRAD/re-engine-Lua-0.19.tar.gz] [amd64-freebsd-thread-multi]
[perl-v5.37.8] [bc897e06-9529-11ed-96a9-c4aca9ff8ba7] [2023-01-15T23:10:00Z]
http://www.cpantesters.org/cpan/report/bc897e06-9529-11ed-96a9-c4aca9ff8ba7
[fail] [SYP/re-engine-TRE-0.09.tar.gz] [amd64-freebsd-thread-multi]
[perl-v5.37.8] [b6c523e4-9529-11ed-b82d-1841aaff8ba7] [2023-01-15T23:09:51Z]
http://www.cpantesters.org/cpan/report/b6c523e4-9529-11ed-b82d-1841aaff8ba7
[fail] [DGL/re-engine-RE2-0.14.tar.gz] [amd64-freebsd-thread-multi]
[perl-v5.37.8] [b0855044-9529-11ed-aec9-c528a9ff8ba7] [2023-01-15T23:09:40Z]
http://www.cpantesters.org/cpan/report/b0855044-9529-11ed-aec9-c528a9ff8ba7
[fail] [JDDPAUSE/re-engine-GNU-0.024.tar.gz] [amd64-freebsd-thread-multi]
[perl-v5.37.8] [8313edfa-9529-11ed-85de-e76caaff8ba7] [2023-01-15T23:08:24Z]
http://www.cpantesters.org/cpan/report/8313edfa-9529-11ed-85de-e76caaff8ba7
James, can you test them against the last minor release? Eg, before the
branch reset patch? I have a feeling some or all of them will fail anyway,
and it would be nice to rule it out.
cheers,
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
Preliminary results (from my ~/.cpanreporter/reports-sent.db:
You can get more detailed resorts by going to each distro's page at fast-matrix.cpantesters.org and clicking on the freebsd column. |
More results: [pass] [JDDPAUSE/re-engine-GNU-0.024.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.7] [ce30961a-95b7-11ed-84cb-ce76aaff8ba7] [2023-01-16T16:06:58Z] [pass] [SYP/re-engine-TRE-0.09.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.7] [a4efc532-95b7-11ed-ae91-1c8ca9ff8ba7] [2023-01-16T16:05:49Z] pass] [DGL/re-engine-RE2-0.14.tar.gz] [amd64-freebsd-thread-multi] [perl-v5.37.7] [9ed38e5e-95b7-11ed-a06f-8df7a9ff8ba7] [2023-01-16T16:05:39Z] |
Finally, my run of MarpaX-Languages-M4 is hanging on that distro's |
@dgl , @creaktive , @jddurand please take note! |
@andk It is not clear if there is an action we should we take, and which action, or does it mean that external re engines are not supported anymore (?). |
A far as I understood Yves, he is in the middle of fixing things in the regex engine. Currently the breakage is a side effect that should be fixable in the now broken pluggable re engines. There is no guarantee though that this was the last breakage in the current patch frenzy. My take on it is that the sooner the affected parties know, the better are the chances to find the best way forward (or backward). So please keep paying the proper amount of attention. |
Will do - thank you very much for the notification and for the wonderful work you all do with perl btw ;) |
@jddurand: the scope of changes for an engine basically come down to whether the engine supports branch reset or not. If it does not then all that has to be done is to set the new struct member total_logical_parens ( saying this from a faulty memory and this name varies depending on which struct its in -- so check the patch to be sure) to the same value as total_parens. Maybe I can make the internals deal with 0 properly and fallback to the total_parens value automatically, but if it's nonzero and wrong there is nothing I can do. Best to set it to match total_parens. If your engine does support branch reset you need to populate some additional fields with an array of I32 which allows perl to map logical buffers to physical buffers properly, (they can be null if branch reset is not used). The key point is that as of this patch it is possible to set things up so that logical buffers like $1 map to more than one physical buffer at the regex engine. For instance:
Should cause $1 to check buffer 1 then 2 then 3. And $2 should map to buffer 4. It is arguable that using the offs[] array in the internals to expose capture buffer data is broken, and the perl internals should be calling an engine api to resolve the logical capture buffer however the engine likes, and that perl shouldn't need this data in the struct directly, and I think we will move that way long term. Imo if the api did not have abstraction leakage then a change like this shouldn't have broken any engine, and that it did is strongly suggestive we need to refine the api. Fwiw, I added the plug-in engine in the first place, I certainly don't want to do anything that would remove support for it, on the contrary I want to improve it more so that when I improve perls engine I don't break yours. 😀 |
The field is logical_nparens. |
This should fix this module under Perl 5.37.7 and later. See Perl/perl5#20710
I have pushed a patch to re::engine::GNU - jddurand/re-engine-GNU#6 |
I have pushed a patch for re::engine::TRE: creaktive/re-engine-tre#5 And found yet another Dist::Zilla bug at the same time: #20726 |
I tried to fix re-engine-PCRE2 but it depends on Alien::PCRE2 which does not build. I am stuck on that until i get help on |
Currently re-engine-PCRE2 cant build because it is cheekily using unexported variables and functions. Even if I add the defines to let it see PL_core_reg_engine it wants to call Perl_re_op_compile which is not marked API, and as such is not visible in an extension these days. I am not sure if I should change that so it builds or what. |
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue #20710.
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue #20710.
re-engine-GNU 0.025 has been released, tested successfully using perl 5, version 37, subversion 8 (v5.37.8 (c224bbd)) . I want to explicitly say big thanks here to @demerphq for all the time spent on doing the support on this package. |
On Fri, 20 Jan 2023, 02:55 jddurand, ***@***.***> wrote:
re-engine-GNU <https://metacpan.org/pod/re::engine::GNU> 0.025 has been
released, tested successfully using perl 5, version 37, subversion 8
(v5.37.8 (c224bbd
<c224bbd>))
.
I want to explicitly say big thanks here to @demerphq
<https://github.com/demerphq> for all the time spent on doing the support
on this package.
Thank you for the speedy release! It is a pleasure working with someone so
reactive.
Also thanks for maintaining this. It is nice to see that folks are using
the plug in framework.
Cheers
Yves
… |
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue #20710.
I have pushed a patch for re::engine::PCRE2. I think my patch wasnt the reason it was broken though, but its done. |
On Thu, Jan 19, 2023 at 06:49:50AM -0800, Yves Orton wrote:
Currently re-engine-PCRE2 cant build because it is cheekily using unexported variables and functions. Even if I add the defines to let it see PL_core_reg_engine it wants to call Perl_re_op_compile which is not marked API, and as such is not visible in an extension these days. I am not sure if I should change that so it builds or what.
I'm surprised it needs access to re_op_compile() rather than the basic
re_compile(). The former was added by me as part of my jumbo fix for
embedded code in regexps, i.e. /(?{...})/ and similar. It was needed
because the lexer and parser were changed to no longer treat a literal
pattern in the source as single string, but to parse any code blocks in it
and return the whole thing as an optree consisting of one or more
OP_CONSTs containing the stringy parts of the pattern, plus ops for
whatever the ... in /(?{...}) compiled to.
Does re-engine-PCRE2 handle code blocks?
…--
All wight. I will give you one more chance. This time, I want to hear
no Wubens. No Weginalds. No Wudolf the wed-nosed weindeers.
-- Life of Brian
|
On Tue, 24 Jan 2023 at 19:33, iabyn ***@***.***> wrote:
On Thu, Jan 19, 2023 at 06:49:50AM -0800, Yves Orton wrote:
> Currently re-engine-PCRE2 cant build because it is cheekily using unexported variables and functions. Even if I add the defines to let it see PL_core_reg_engine it wants to call Perl_re_op_compile which is not marked API, and as such is not visible in an extension these days. I am not sure if I should change that so it builds or what.
I'm surprised it needs access to re_op_compile() rather than the basic
re_compile(). The former was added by me as part of my jumbo fix for
embedded code in regexps, i.e. /(?{...})/ and similar. It was needed
because the lexer and parser were changed to no longer treat a literal
pattern in the source as single string, but to parse any code blocks in it
and return the whole thing as an optree consisting of one or more
OP_CONSTs containing the stringy parts of the pattern, plus ops for
whatever the ... in /(?{...}) compiled to.
Does re-engine-PCRE2 handle code blocks?
No, not directly. As far as I can tell what re::engine::PCRE2 wants
to do is fallback to the perl regex engine when the pattern cant be
compiled.
I have to admit I didnt dig /that/ deeply. All I know is that when we
introduced the "attribute hidden" stuff to the embed framework we
broke his code. Since his code "looked" like it was broken by my
change (which was also true), I wanted to get it to build so I could
scratch it off my BBC list. I couldn't even compile his code to test
my fix for the fallout from my patch until I pushed the patch that
allowed him to access re_op_compile().
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
Also affected: PFEIFFER/App-pl-0.91.0.tgz
|
@jkeenan you can also grep for |
@andk out of curiosity have you checked against latest blead? There was a fence post error I fixed just after 5.37.8 was released. (Sigh.) I'll check myself shortly, but I'd wouldn't be surprised if this is already fixed. |
@andk fwiw, it is still in bleadperl. totally different issue. this is a very useful report, thanks! |
Try this search pattern:
... after a couple of refreshes |
I then built a threaded blead at v5.37.8-16-g4ef62b4258 and attempted to install most, but not all, of the 96 distributions found above. (I skipped distros
|
@jkeenan given the list context bug, I could imagine some stuff is failing. Amazing that wasn't caught by the tests when I wrote the original patch. I plan to beef up list context tests in the future. Ill take a look at a random sample later. |
@jkeenan if you wanted you could run your tests again against the branch yves/fix_branch_reset_list_context_assignment That would weed out stuff that is broken by the list assignment bug. |
I recommend you look first at http://www.cpantesters.org/cpan/report/08ba39cc-9dd7-11ed-8633-4c8da9ff8ba7. |
@jkeenan that has all the fingerprints of the bug i just mentioned. I will check in a bit, i have to do some chores first. |
I've got other work that needs my attention. You can build a
|
Thanks, that was helpful. With cpanm I get this from my branch:
same command on 5.36.0:
Looks to me to be about the same. There are a few additional fails on 5.37.x but as far as I can tell they are unrelated to branch reset patch. I will file reports on at least one, if not more of them. |
re-engine-GNU passes test on blead now. |
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue Perl#20710.
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue Perl#20710.
We can default a 0 rx->logical_nparens to rx->nparens. If rx->logical_nparens is zero then either rx->nparens is also zero, or it can be defaulted. This will fix most re::engine::XXX modules that do not know about the new field, provided they zero the rx structure during construction. If they don't then this patch won't hurt anything and we will have to patch them directly. Also mark re_op_compile() as available to extensions. Marking it as hidden means that re::engine::PCRE2 and others cannot build. This patch should go a long way towards fixing issue Perl#20710.
Description
In fe5492d ((v5.37.7-120-gfe5492d916) we likely broke plug-in regex engines if there are any still being maintained. This ticket is to collect any bugs related to this, although none have been reported as yet.
Steps to Reproduce
N/A
Expected behavior
N/A
Perl configuration
N/A really.
The text was updated successfully, but these errors were encountered: