PHP Warnings re: LinksEncoder.php fails with "... preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20" #3848

Closed
JeremiPlazas opened this issue Mar 25, 2019 · 17 comments
Labels
bug Occurrence of an unintended or unanticipated behaviour that causes a vulnerability or fatal error question

Comments

@JeremiPlazas

JeremiPlazas commented Mar 25, 2019

Setup and configuration

  • SMW version: 3.0.1
  • MW version: 1.31.1
  • PHP version: 7.0.33 (apache2handler)
  • DB system and version: MySQL 5.6.42-log

Issue

I'm encountering the following recurring errors when executing runJobs.php in this wiki farm of ours (just 2 wikis on there at the moment):

PHP Warning: Invalid argument supplied for foreach() in /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php on line 182
PHP Warning: preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20 in /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php on line 173

Any clues what is causing this? Is it SMW version or other extensions needing updating?

Thank you for any help, it would be much appreciated.

@JeroenDeDauw JeroenDeDauw added the bug Occurrence of an unintended or unanticipated behaviour that causes a vulnerability or fatal error label Mar 25, 2019
@mwjames
Contributor

mwjames commented Mar 25, 2019 via email

@kghbln
Member

kghbln commented Mar 25, 2019

It could also be that you are using an outdated PCRE library. Again, without knowing the specifics this is going to be difficult to answer.

The server looks like a Debian 9, so https://packages.debian.org/stretch/libpcre3 is at 8.39

If this is about links in values, then perhaps $smwgLinksInValues = true; is set in "LocalSettings.php", which could cause pain, but in the end I am not sure what could cause the issue observed.

@JeremiPlazas
Author

JeremiPlazas commented Mar 26, 2019

Thanks for the replies so far.

@mwjames I am trying to identify what exactly is triggering the warnings. I'm sorry if my lack of developer skills means it takes me a little while. I will get back to you on that...

@kghbln I will look into the PCRE version and make sure we don't need to update it. We did have issues with this $smwgLinksInValues setting. I noticed it was deprecated in the latest version of SMW so in this newest farm, we instead have $smwgParserFeatures = SMW_PARSER_LINV;.

@JeremiPlazas
Author

OK, so it looks like we're already running the latest version of PCRE available to us on our Amazon EC2 instance:

[ec2-user@garuda ~]$ yum list | grep pcre
pcre.x86_64                          8.21-7.8.amzn1                @amzn-updates
pcre.i686                            8.21-7.8.amzn1                amzn-updates 
pcre-devel.x86_64                    8.21-7.8.amzn1                amzn-updates 
pcre-static.x86_64                   8.21-7.8.amzn1                amzn-updates 
pcre-tools.x86_64                    8.21-7.8.amzn1                amzn-updates 

[ec2-user@garuda ~]$ yum list installed | grep pcre
pcre.x86_64                          8.21-7.8.amzn1                @amzn-updates
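
A side note on the listing above: the package version reported by yum is not necessarily the PCRE that PHP itself was built against, since PHP can be compiled with its own bundled copy. A minimal sketch, assuming only the built-in PCRE_VERSION constant, that prints what PHP actually uses (the helper name pcreVersionNumber is made up for this example):

```php
<?php
// Hypothetical helper: returns only the numeric part of PCRE_VERSION,
// which PHP reports as "<version> <release date>".
function pcreVersionNumber(): string
{
    return (string) strtok(PCRE_VERSION, ' ');
}

// Print the PHP version together with the PCRE version it links against.
echo 'PHP ' . PHP_VERSION . ' uses PCRE ' . PCRE_VERSION . PHP_EOL;
echo 'Numeric PCRE version: ' . pcreVersionNumber() . PHP_EOL;
```

Running this through the same SAPI that serves the wiki (apache2handler here) matters, because the command-line PHP binary may be a different build.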

@mwjames
Contributor

mwjames commented Mar 26, 2019 via email

@JeremiPlazas
Author

Yes, I'm painfully aware of this. I'm having a hard time isolating what triggers it, though. How would I go about isolating what change is causing this issue? The error is sprinkled throughout the output of the runJobs.php command without a clear correlation to the lines that come right above it.

@mwjames
Contributor

mwjames commented Mar 26, 2019 via email

@JeremiPlazas
Author

JeremiPlazas commented Mar 26, 2019

So I did as you suggested. From the debug log, it seems the only lines of interest may be:

[runJobs] smw.update Category:2015_Applications origin=EntityRebuildDispatcher requestId=e45953a175ba6afc73ba0b7c (id=26753,timestamp=20190326180927) STARTING
[ContentHandler] Created handler for wikitext: WikitextContentHandler
[error] [e45953a175ba6afc73ba0b7c] /index.php?title=Special:SemanticMediaWiki&tab=rebuild   ErrorException from line 173 of /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php: PHP Warning: preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php(173): preg_match_all(string, string, NULL)
#2 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php(41): SMW\Parser\LinksEncoder::matchAndReplace(string, SMW\Parser\InTextAnnotationParser)
#3 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/Parser/InTextAnnotationParser.php(160): SMW\Parser\LinksEncoder::findAndEncodeLinks(string, SMW\Parser\InTextAnnotationParser)
#4 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Hooks/InternalParseBeforeLinks.php(134): SMW\Parser\InTextAnnotationParser->parse(string)
#5 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Hooks/InternalParseBeforeLinks.php(65): SMW\MediaWiki\Hooks\InternalParseBeforeLinks->performUpdate(string)
#6 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Hooks/HookListener.php(267): SMW\MediaWiki\Hooks\InternalParseBeforeLinks->process(string)
#7 /mnt/tsadra/websites/MediawikiFarm/includes/Hooks.php(177): SMW\MediaWiki\Hooks\HookListener->onInternalParseBeforeLinks(Parser, string, StripState)
#8 /mnt/tsadra/websites/MediawikiFarm/includes/Hooks.php(205): Hooks::callHook(string, array, array, NULL)
#9 /mnt/tsadra/websites/MediawikiFarm/includes/parser/Parser.php(1305): Hooks::run(string, array)
#10 /mnt/tsadra/websites/MediawikiFarm/includes/parser/Parser.php(443): Parser->internalParse(string)
#11 /mnt/tsadra/websites/MediawikiFarm/includes/content/WikitextContent.php(323): Parser->parse(string, Title, ParserOptions, boolean, boolean, integer)
#12 /mnt/tsadra/websites/MediawikiFarm/includes/content/AbstractContent.php(516): WikitextContent->fillParserOutput(Title, integer, ParserOptions, boolean, ParserOutput)
#13 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/includes/ContentParser.php(183): AbstractContent->getParserOutput(Title, integer, ParserOptions, boolean)
#14 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/includes/ContentParser.php(144): SMW\ContentParser->fetchFromContent()
#15 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Jobs/UpdateJob.php(196): SMW\ContentParser->parse()
#16 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Jobs/UpdateJob.php(136): SMW\MediaWiki\Jobs\UpdateJob->parse_content()
#17 /mnt/tsadra/websites/MediawikiFarm/extensions/SemanticMediaWiki/src/MediaWiki/Jobs/UpdateJob.php(93): SMW\MediaWiki\Jobs\UpdateJob->doUpdate()
#18 /mnt/tsadra/websites/MediawikiFarm/includes/jobqueue/JobRunner.php(296): SMW\MediaWiki\Jobs\UpdateJob->run()
#19 /mnt/tsadra/websites/MediawikiFarm/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob(SMW\MediaWiki\Jobs\UpdateJob, Wikimedia\Rdbms\LBFactorySimple, BufferingStatsdDataFactory, integer)
#20 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(1002): JobRunner->run(array)
#21 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(988): MediaWiki->triggerSyncJobs(integer, MediaWiki\Logger\LegacyLogger)
#22 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(912): MediaWiki->triggerJobs()
#23 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(727): MediaWiki->restInPeace(string, boolean)
#24 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(750): MediaWiki->{closure}()
#25 /mnt/tsadra/websites/MediawikiFarm/includes/MediaWiki.php(557): MediaWiki->doPostOutputShutdown(string)
#26 /mnt/tsadra/websites/MediawikiFarm/index.php(42): MediaWiki->run()
#27 {main}

@mwjames
Contributor

mwjames commented Mar 26, 2019 via email

@mwjames mwjames changed the title PHP Warnings re: LinksEncoder.php when running runJobs.php PHP Warnings re: LinksEncoder.php fails with "... preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20" Mar 30, 2019
@mwjames
Contributor

mwjames commented Mar 30, 2019

[0] https://3v4l.org/TeuQA

As the above test shows, there is nothing wrong with our pattern, so the issue most likely relates to the PHP/PCRE library in use.

If for some reason you cannot make any administrative changes to your PHP environment, then remove SMW_PARSER_LINV from $smwgParserFeatures to avoid running the LinksEncoder with said pattern.
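
One way to drop a single flag from a bitmask setting is sketched below. It assumes $smwgParserFeatures is a plain integer bitmask. The fallback constant values and the EXAMPLE_OTHER_FLAG name are made up so the snippet runs outside MediaWiki; inside LocalSettings.php the real SMW_PARSER_LINV constant comes from Semantic MediaWiki.

```php
<?php
// Stand-in definitions so this sketch runs standalone. The numeric values
// are illustrative assumptions, not the real SMW values.
defined('SMW_PARSER_LINV') || define('SMW_PARSER_LINV', 32);
defined('EXAMPLE_OTHER_FLAG') || define('EXAMPLE_OTHER_FLAG', 2); // hypothetical flag

// Example starting point: links-in-values enabled alongside another flag.
$smwgParserFeatures = EXAMPLE_OTHER_FLAG | SMW_PARSER_LINV;

// Clear only the SMW_PARSER_LINV bit and leave every other flag untouched.
$smwgParserFeatures = $smwgParserFeatures & ~SMW_PARSER_LINV;

// Confirm the links-in-values bit is no longer set.
var_dump(($smwgParserFeatures & SMW_PARSER_LINV) === 0); // bool(true)
```

In a real LocalSettings.php only the clearing line would be needed, placed after Semantic MediaWiki has been loaded.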

@mwjames mwjames closed this as completed Mar 30, 2019
@kghbln kghbln added the invalid not related to SMW label Apr 2, 2019
@krabina
Contributor

krabina commented Oct 8, 2019

I am getting the same message due to a planned migration of a server.

Old server: PCRE version 7.8.7.el6, no warning messages
New server: PCRE 8.32-17.el7, warning message appears.

I could not find any hint on which version should be used.

@JeroenDeDauw JeroenDeDauw reopened this Oct 8, 2019
@JeroenDeDauw JeroenDeDauw added question and removed invalid not related to SMW labels Oct 8, 2019
@plegault3397

I am getting the same error while upgrading MediaWiki and Semantic MediaWiki.
From:
MediaWiki 1.31.1
PHP 7.1.30 (apache2handler)
MySQL 5.6.10
Semantic MediaWiki 2.5.8

To:
MediaWiki 1.33.2
PHP 7.1.30 (apache2handler)
MySQL 5.6.41-log
Semantic MediaWiki 3.1.3

After the upgrade I started getting these errors:
Warning: preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20 in
Warning: Invalid argument supplied for foreach() in extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php on line 182

pcre.x86_64 8.32-17.el7 @anaconda/7.4
pcre-devel.x86_64 8.32-17.el7 @jnj-RHEL7.7-202001
pcre.i686 8.32-17.el7 jnj-RHEL7.7-202002
pcre-devel.i686 8.32-17.el7 jnj-RHEL7.7-202002
pcre-static.i686 8.32-17.el7 jnj-RHEL7.7-202002-optional
pcre-static.x86_64 8.32-17.el7 jnj-RHEL7.7-202002-optional
pcre-tools.x86_64 8.32-17.el7 jnj-RHEL7.7-202002-optional

@plegault3397

I changed my config to:

MediaWiki | 1.31.6
Semantic MediaWiki | 3.1.4
PHP | 7.2.24 (apache2handler)

Still the same error. My homepage shows the error 3 times. I went through all my extensions and found that with DynamicPageList3 3.3.3 (2019-04-03) disabled, 2 of the instances of the error went away.
I will post it to the DynamicPageList3 repository.
I am still looking for the cause of the last instance of this error:

Warning: preg_match_all(): Compilation failed: recursive call could loop indefinitely at offset 20 in /app/mediawiki/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php on line 173

Warning: Invalid argument supplied for foreach() in /app/mediawiki/extensions/SemanticMediaWiki/src/Parser/LinksEncoder.php on line 182

@mwjames
Contributor

mwjames commented Feb 22, 2020

I think I was pretty clear in my previous answer #3848 (comment) on the matter of "Compilation failed: recursive call could loop indefinitely at offset 20".

To give a bit more background, the LinksEncoder [0] relies on (?R), the recursive pattern construct [1]. The PCRE library, which is required by all preg* functions in PHP, has to be up to date, given that it has seen various changes to its JIT to deal with recursive expressions (see [2]).

Old server: PCRE version 7.8.7.el6, no warning messages
New server: PCRE 8.32-17.el7, warning message appears.

This is an outdated version bundled with your environment (Version 8.32 30-November-2012, see [2]).

pcre.x86_64 8.32-17.el7 @anaconda/7.4
pcre-devel.x86_64 8.32-17.el7 @jnj-RHEL7.7-202001
pcre.i686 8.32-17.el7 jnj-RHEL7.7-202002
pcre-devel.i686 8.32-17.el7 jnj-RHEL7.7-202002
pcre-static.i686 8.32-17.el7 jnj-RHEL7.7-202002-optional
pcre-static.x86_64 8.32-17.el7 jnj-RHEL7.7-202002-optional
pcre-tools.x86_64 8.32-17.el7 jnj-RHEL7.7-202002-optional

This is an outdated version bundled with your environment (Version 8.32 30-November-2012, see [2]).

I could not find any hint on which version should be used.

I added #4576, so users can see which PCRE version is used in the CI environment. For example, the CI with PHP 7.1.11 uses 8.38 2015-11-23 while PHP 7.3.15 uses 10.32 2018-09-10 [3]. There are no issues with those versions; otherwise the integration tests would have shown a similar message as above.

Furthermore, you can use [4] to test string manipulation independent of SMW and get feedback about the PCRE version in connection with the used regular expression in SMW [0].

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/master/src/Parser/LinksEncoder.php#L173
[1] https://www.rexegg.com/regex-recursion.html
[2] http://www.pcre.org/original/changelog.txt
[3] https://travis-ci.org/SemanticMediaWiki/SemanticMediaWiki/jobs/653733963
[4] https://3v4l.org/oCCWC
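
A standalone check along the lines of [4] might look like the sketch below. The pattern is an illustrative recursive expression for nested [[...]] links and is not necessarily the exact one used in LinksEncoder.php [0]; the sample text is invented for this example.

```php
<?php
// Illustrative recursive pattern: matches [[...]] and uses (?R) to allow
// nested [[...]] inside. An assumption for demonstration, not SMW's pattern.
$pattern = '/\[\[(?:[^\[\]]++|(?R))*\]\]/';
$text = 'Intro [[Has value::[[Some page]]]] outro';

// Suppress the warning so a compile failure surfaces as a false return value.
$count = @preg_match_all($pattern, $text, $matches);

echo 'PCRE version: ' . PCRE_VERSION . PHP_EOL;
if ($count === false) {
    // In this case $matches is not populated, so iterating over it would fail.
    echo 'The recursive pattern could not be compiled or executed.' . PHP_EOL;
} else {
    echo "Matched $count link(s): " . implode(', ', $matches[0]) . PHP_EOL;
}
```

If the first branch is reached on a given server but not on a current PHP build, that points at the PCRE library rather than at the wiki content.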

@mwjames mwjames closed this as completed Feb 22, 2020
@mwjames
Contributor

mwjames commented Feb 22, 2020

@kghbln I don't know if you want to mention this somewhere, but if users rely on software from, let's say, 30-November-2012, as in the case of 8.32 [0], then there is no guarantee that this library or software is compatible with any recent version of SMW or MediaWiki.

As for pcre-8.32-17 [1], even though it is marked with 2016-12-06 it only includes security fixes which means the library itself is based on 8.32 from 30-November-2012 [1] and that is out of date.

[0] http://www.pcre.org/original/changelog.txt
[1] https://centos.pkgs.org/7/centos-x86_64/pcre-8.32-17.el7.x86_64.rpm.html

@kghbln
Member

kghbln commented Feb 22, 2020

I don't know if you want to mention this somewhere

Yes I do. Added it prominently.

@plegault3397

plegault3397 commented Feb 24, 2020

Thanks
