rebuildData.php / ContentHandler.php: No handler for model 'xml' registered in $wgContentHandlers #1448

Closed
D-Groenewegen opened this issue Mar 11, 2016 · 8 comments

Comments

@D-Groenewegen
Contributor

Following the fatal error mentioned in #1446, I tested the pre-release 2.4 version of Semantic MediaWiki. Hitting refreshData.php now gave me a different error:

 ............................................................ 0 %
...........................[cf430997] [no req]   Exception from line 324 of C:\Program Files (x86)\Ampps\www\vhcodecs\includes\content\ContentHandler.php: No handler for model 'xml' registered in $wgContentHandlers
Backtrace:
#0 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\content\ContentHandler.php(262): ContentHandler::getForModelID(string)
#1 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\Title.php(4735): ContentHandler::getForTitle(Title)
#2 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\parser\Parser.php(859): Title->getPageLanguage()
#3 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\parser\Parser.php(1938): Parser->getTargetLanguage()
#4 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\parser\Parser.php(1901): Parser->replaceInternalLinks2(string)
#5 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\parser\Parser.php(1262): Parser->replaceInternalLinks(string)
#6 C:\Program Files (x86)\Ampps\www\vhcodecs\includes\parser\Parser.php(405): Parser->internalParse(string)
#7 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\includes\ContentParser.php(202): Parser->parse(string, Title, ParserOptions, boolean, boolean, integer)
#8 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\includes\ContentParser.php(147): SMW\ContentParser->fetchFromParser()
#9 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(136): SMW\ContentParser->parse()
#10 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(119): SMW\MediaWiki\Jobs\UpdateJob->needToParsePageContentBeforeUpdate()
#11 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(88): SMW\MediaWiki\Jobs\UpdateJob->doPrepareForUpdate()
#12 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(57): SMW\MediaWiki\Jobs\UpdateJob->doUpdate()
#13 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\SQLStore\ByIdDataRebuildDispatcher.php(192): SMW\MediaWiki\Jobs\UpdateJob->run()
#14 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(416): SMW\SQLStore\ByIdDataRebuildDispatcher->dispatchRebuildFor(string)
#15 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(229): SMW\Maintenance\DataRebuilder->deleteMarkedSubjects(SMW\SQLStore\ByIdDataRebuildDispatcher)
#16 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(170): SMW\Maintenance\DataRebuilder->doRebuildAll()
#17 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\maintenance\rebuildData.php(145): SMW\Maintenance\DataRebuilder->rebuild()
#18 C:\Program Files (x86)\Ampps\www\vhcodecs\maintenance\doMaintenance.php(101): SMW\Maintenance\RebuildData->execute()
#19 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\maintenance\rebuildData.php(178): require_once(string)
#20 C:\Program Files (x86)\Ampps\www\vhcodecs\extensions\SemanticMediaWiki\maintenance\SMW_refreshData.php(9): require_once(string)
#21 {main}

Perhaps this is related to #295?

Another odd thing I noticed is the appearance of the words 'off' and 'on' surrounding a link when that link is produced by a query. The link comes from a value of a property that has the datatype 'Text'.

@mwjames
Contributor

mwjames commented Mar 11, 2016

I'm guessing MW 1.24?

Perhaps this is related to #295?

No.

Exception from line 324 of C:\Program Files (x86)\Ampps\www\vhcodecs\includes\content\ContentHandler.php: No handler for model 'xml' registered in $wgContentHandlers

This is the issue: some setting on your wiki doesn't correspond to the expected $wgContentHandlers. I'm not sure why you would need a model 'xml'. I haven't seen this before, but some page is expecting XML. I'm guessing that some extension comes with a special content handler, as in [0] and [1], and needs extra handling.

Because we parse the whole page, we also need to parse everything else on it (including other parser functions), since it may influence the state of property values; that is why we cannot skip them.

https://www.mediawiki.org/wiki/Manual:$wgContentHandlers
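
To find out which pages are "expecting XML", one option is to query the core page table for rows whose stored content model is 'xml'. The sketch below only builds and prints the query. It assumes the standard page table with its page_content_model column, no table prefix, and a MySQL-style database; WIKIUSER and WIKIDB in the comment are placeholders, not names from this thread.

```shell
#!/bin/sh
# Build a query that lists pages whose stored content model is 'xml'.
# Assumption: standard MediaWiki "page" table without a table prefix.
model="xml"
query="SELECT page_id, page_namespace, page_title FROM page WHERE page_content_model = '${model}';"

# Print the query so it can be reviewed before it is run.
echo "$query"

# To run it (placeholders, adjust to your setup):
#   echo "$query" | mysql -u WIKIUSER -p WIKIDB
```

Pages that rely on the default model of their namespace may store NULL in that column, so an empty result does not rule out an extension registering the model some other way.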

I tested the pre-release 2.4 version of Semantic MediaWiki.
Hitting refreshData.php now gave me a different error:

Since this isn't related to SMW you can skip this with --ignore-exceptions (#1327).

Another odd thing I noticed is the appearance of the words 'off' and 'on' surrounding a link when that link is produced by a query. The link comes from a value of a property that has the datatype 'Text'.

For this we need a specific example, and it would be best to have it reproducible on http://sandbox.semantic-mediawiki.org/wiki.

Are you using the VisualEditor?

[0] https://phabricator.wikimedia.org/T47750
[1] https://phabricator.wikimedia.org/T43309

@mwjames changed the title from "ContentHandler.php exception (SMW 2.4alpha)" to "rebuildData.php / ContentHandler.php: No handler for model 'xml' registered in $wgContentHandlers" on Mar 11, 2016
@D-Groenewegen
Contributor Author

Still using MW 1.24.2 here.

--ignore-exceptions did not work at first (the exception did interrupt the
process), perhaps because I used it with SMW_refreshData.php, while others
seem to have been using rebuildData.php (is there a difference?).

XML was used on some pages, which may have been the issue, so I cleaned up
some of the code and triggered rebuildData.php with --ignore-exceptions.
The rebuild process is currently running, without any interruptions so far.

VisualEditor is not installed (and never has been).

@mwjames
Contributor

mwjames commented Mar 11, 2016

--ignore-exceptions did not work at first (the exception did interrupt the process)

I just simulated a similar error by setting $GLOBALS['wgContentHandlers'] = array(); in the configuration. Running php maintenance/rebuildData.php without the option then stops at the exception:

@TAURUS .../mw-26-00/extensions/SemanticMediaWiki (ExtraneousLanguage)
$ php maintenance/rebuildData.php

Running for storage: SPARQLStore

Removing table entries (marked for deletion).

.......

7 IDs removed.

Refreshing selected pages (properties).
[b92eecda] [no req]   MWException from line 326 of ...\mw-26-00\includes\content\ContentHandler.php: No handler for model 'wikitext' registered in $wgContentHandlers
Backtrace:
#0 ...\mw-26-00\includes\Revision.php(1141): ContentHandler::getForModelID(string)
#1 ...\mw-26-00\includes\Revision.php(1076): Revision->getContentHandler()
#2 ...\mw-26-00\includes\Revision.php(1027): Revision->getContentInternal()
#3 ...\mw-26-00\includes\Revision.php(1003): Revision->getContent(integer, NULL)
#4 ...\mw-26-00\extensions\SemanticMediaWiki\includes\ContentParser.php(196): Revision->getText()
#5 ...\mw-26-00\extensions\SemanticMediaWiki\includes\ContentParser.php(147): SMW\ContentParser->fetchFromParser()
#6 ...\mw-26-00\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(136): SMW\ContentParser->parse()
#7 ...\mw-26-00\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(119): SMW\MediaWiki\Jobs\UpdateJob->needToParsePageContentBeforeUpdate()
#8 ...\mw-26-00\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(88): SMW\MediaWiki\Jobs\UpdateJob->doPrepareForUpdate()
#9 ...\mw-26-00\extensions\SemanticMediaWiki\src\MediaWiki\Jobs\UpdateJob.php(57): SMW\MediaWiki\Jobs\UpdateJob->doUpdate()
#10 ...\mw-26-00\extensions\SemanticMediaWiki\src\Maintenance\DistinctEntityDataRebuilder.php(167): SMW\MediaWiki\Jobs\UpdateJob->run()
#11 ...\mw-26-00\extensions\SemanticMediaWiki\src\Maintenance\DistinctEntityDataRebuilder.php(147): SMW\Maintenance\DistinctEntityDataRebuilder->doExecuteUpdateJobFor(Title)
#12 ...\mw-26-00\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(196): SMW\Maintenance\DistinctEntityDataRebuilder->doRebuild()
#13 ...\mw-26-00\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(233): SMW\Maintenance\DataRebuilder->doRebuildDistinctEntities()
#14 ...\mw-26-00\extensions\SemanticMediaWiki\src\Maintenance\DataRebuilder.php(170): SMW\Maintenance\DataRebuilder->doRebuildAll()
#15 ...\mw-26-00\extensions\SemanticMediaWiki\maintenance\rebuildData.php(145): SMW\Maintenance\DataRebuilder->rebuild()
#16 ...\mw-26-00\maintenance\doMaintenance.php(103): SMW\Maintenance\RebuildData->execute()
#17 ...\mw-26-00\extensions\SemanticMediaWiki\maintenance\rebuildData.php(178): require_once(string)
#18 {main}

Yet if I run php maintenance/rebuildData.php --ignore-exceptions, it does ignore the exceptions:

@TAURUS .../mw-26-00/extensions/SemanticMediaWiki (ExtraneousLanguage)
$ php maintenance/rebuildData.php --ignore-exceptions

Running for storage: SPARQLStore

Refreshing selected pages (properties).

.........................................................

57 pages refreshed.

57 exceptions were ignored! (See ...\mw-26-00\extensions\SemanticMediaWiki\rebuilddata-exceptions-2016-03-11.log).

Refreshing all semantic data in the database!
---
 Some versions of PHP suffer from memory leaks in long-running
 scripts. If your machine gets very slow after many pages
 (typically more than 1000) were refreshed, please abort with
 CTRL-C and resume this script at the last processed page id
 using the parameter -s (use -v to display page ids during
 refresh). Continue this until all pages have been refreshed.
---
 The progress displayed is an estimation and is self-adjusting
 during the update process.
---
Processing all IDs from 1 to 10704 ...


....................

So, --ignore-exceptions does work as expected.

perhaps because I used it with SMW_refreshData.php,

https://www.semantic-mediawiki.org/wiki/Help:RebuildData.php#Important_Notes

@D-Groenewegen
Contributor Author

Thanks, then the docs must be outdated and incomplete because
SMW_refreshData.php rather than rebuildData.php is still being advertised
in the installation/upgrade guides:

* https://www.semantic-mediawiki.org/wiki/Help:Installation/Upgrade_from_SMW_1.9%2B_for_MW_1.22%2B (which is what I followed)
* https://www.semantic-mediawiki.org/wiki/Help:Installation/Upgrade_from_SMW_1.9%2B_for_MW_1.19_-_1.21
* https://www.semantic-mediawiki.org/wiki/Help:Installation/Using_Composer_with_MediaWiki_1.22_-_1.24

@mwjames
Contributor

mwjames commented Mar 13, 2016

Can we close this?

@D-Groenewegen
Contributor Author

We're done here I think.

@mwjames mwjames closed this as completed Mar 15, 2016
@D-Groenewegen
Contributor Author

Update:

The new, upgraded site running SMW 2.4alpha is now online at vanhamel.nl/codecs. Transferring the site to a local computer has been a wise choice, all the more because it allowed me to do things I could not have done without shell access.

Running rebuildData.php seems to have had positive effects on the database: it has now gone down dramatically in size from about 1.8 to 1.1 GB, although a small part of this is attributable to other factors.

Unfortunately, two specific IDs* were causing a bottleneck: because of --ignore-exceptions, rebuildData.php was not interrupted but instead it continually attempted to process these same IDs over and over again, resulting in quite a drain on my computer's resources, to put it mildly. About 24GB became unusable to the point where I had to shut down my computer. The log shows numerous repetitions of the same report pointing to an issue with XML - similar to the one reported earlier.
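
For anyone wanting to confirm such repetitions, a rough way to summarise the exceptions log is to count how often each "No handler for model" message occurs. This is only a sketch: the layout of the log is an assumption, the log file name is a placeholder, and count_handler_errors is a hypothetical helper name.

```shell
#!/bin/sh
# Count how often each "No handler for model '<name>'" message occurs in a log.
count_handler_errors() {
    # $1: path to the exceptions log (its exact layout is an assumption)
    grep -o "No handler for model '[^']*'" "$1" | sort | uniq -c | sort -rn
}

# Placeholder path; substitute the file named in the rebuildData.php output.
log="rebuilddata-exceptions.log"
if [ -f "$log" ]; then
    count_handler_errors "$log"
fi
```

The output lists the most frequent message first, which makes a runaway repetition of one model name easy to spot.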

The source of the problem is beyond me and I haven't exactly solved this other than telling rebuildData.php to skip the problematic IDs, which then allowed rebuildData.php to complete the data refresh.
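
Skipping IDs can be done by splitting the rebuild into ranges with the -s and -e parameters. The sketch below only prints the commands rather than running them. The ID values are made up for illustration (the actual IDs are not given in this thread), print_rebuild_ranges is a hypothetical helper, and it assumes that -s and -e are both inclusive and that the IDs to skip are passed in ascending order.

```shell
#!/bin/sh
# Print one rebuildData.php command per ID range, leaving out the given IDs.
# Usage: print_rebuild_ranges FIRST_ID LAST_ID SKIP_ID...
print_rebuild_ranges() {
    first_id=$1
    last_id=$2
    shift 2
    start=$first_id
    # Remaining arguments are the IDs to skip, in ascending order.
    for skip in "$@"; do
        if [ "$skip" -gt "$start" ]; then
            echo "php maintenance/rebuildData.php --ignore-exceptions -v -s $start -e $((skip - 1))"
        fi
        start=$((skip + 1))
    done
    # Emit the tail range after the last skipped ID, if anything is left.
    if [ "$start" -le "$last_id" ]; then
        echo "php maintenance/rebuildData.php --ignore-exceptions -v -s $start -e $last_id"
    fi
}

# Hypothetical example: IDs 1 to 1000, skipping 250 and 600.
print_rebuild_ranges 1 1000 250 600
```

Printing the commands first makes it possible to check the ranges before committing to a long-running rebuild.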

This may be an issue of long standing since I can remember how earlier attempts at doing a "Data repair and upgrade" from Special:SMWAdmin appeared to 'hang' indefinitely.

(*P.S. These IDs could not be discovered in the Object ID lookup from Special:SMWAdmin)

@mwjames
Contributor

mwjames commented Apr 21, 2016

Unfortunately, two specific IDs* were causing a bottleneck: because of --ignore-exceptions, rebuildData.php was not interrupted

If you know the ID then have a look at the smw_object_ids DB table to see which smw_title is being referenced.
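
A minimal sketch of that lookup, assuming a MySQL-style database, no table prefix, and the smw_id, smw_namespace and smw_title columns of smw_object_ids. smw_id_query is a hypothetical helper, and the ID, WIKIUSER and WIKIDB are placeholders.

```shell
#!/bin/sh
# Build the lookup query for one SMW object ID.
smw_id_query() {
    # $1: numeric ID; anything else is rejected so it is safe to interpolate.
    case $1 in
        ''|*[!0-9]*)
            echo "not a numeric ID: $1" >&2
            return 1
            ;;
    esac
    echo "SELECT smw_id, smw_namespace, smw_title FROM smw_object_ids WHERE smw_id = $1;"
}

# Placeholder ID; prints the query so it can be checked before it is run.
smw_id_query 12345

# To run it (placeholders, adjust to your setup):
#   smw_id_query 12345 | mysql -u WIKIUSER -p WIKIDB
```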

but instead it continually attempted to process these same IDs over and over again, resulting in quite a drain on my computer's resources, to put it mildly. About 24GB became unusable to the point where I had to shut down my computer. The log shows numerous

That should not happen and a way to replicate this behaviour would be much appreciated.

The source of the problem is beyond me and I haven't exactly solved this other than telling rebuildData.php to skip the problematic IDs, which then allowed rebuildData.php to complete the data refresh.

Adding a range (-s/-e options) just before and after the ID that causes the issue, together with option -v, should output more verbose information about the object in question.

The log shows numerous repetitions of the same report pointing to an issue with XML - similar to the one reported earlier.

If the entity is known, try to delete the related page (using the standard MW action=delete).
