MW 1.23 / User-defined properties disappearing from pages #347
The potential PR #303 that could have caused such behaviour has been reverted and as we see now it did not introduce any inconsistencies. The behaviour of user-defined properties disappearing is similar to the uncovered redirect bug in MW 1.23. The reason for user-defined properties not being parsed is that the Since it is similar to the redirect issue, there are currently two things that I would suggest to see if any change can be monitored. The underlying cause for the redirect issue was the use of the ContentHandler, therefore the first thing to test is to modify the
into
The above change is only to investigate if the use of the If the change doesn't show any significant improvement it is suggested to switch back to MW 1.22+ as code between 1.9.2 and 1.9.3 did not alter any part of the parsing/annotation (besides #303 which has been reverted). |
Does it make sense that this happens very infrequently and only for a subset of properties for a given page? If the annotations are being removed I don't get why most are still there. I'll try applying that change by hand when I’m online later tonight. |
As I said on WikiApiary/WikiApiary#114, I personally was unable to replicate the issue on either MW 1.23/MW 1.24 but the cause of the disappearance of user-defined properties can currently only be linked to the InternalParseBeforeLinks hook. Above suggestions are made in anticipation of the symptoms and are not the result of an analysis of the actual cause (since it "happens very infrequently" with no indication of what is triggering the disappearance as normally no user interaction is involved). |
@mwjames the change to |
Ahh... I just looked at the history of InternalParseBeforeLinks which was last changed Feb 17, 2014 (taking into account the revert from Jun 04, 2014 for changes made on May 05, 2014). The reason I'm fixated on If the This leaves me currently only to suggest returning to the previous MW version (as the issue seems to have first appeared close to the date when the update was made). PS: In order to analyze the problem, I need to know how to trigger the issue in order to make a suggestion on how to fix it. |
Re PS: I totally get that. I wish I could figure out a way to trigger it to happen. Similar to #191, I have no idea what is causing this to happen. :-\ The only thing I know on this one is that a manual refresh will fix it, whereas for #191 that will never fix it. I would happily grant shell access to my server to trusted folks if there was an interest in looking directly. Also, FWIW, if anyone ever wanted to set up a duplicate of the WikiApiary site I dump a fresh XML backup weekly. I’m not sure that is helpful at all, but thought I would share. |
This would only make sense if we know what triggers it but right now the cause is unknown (in terms of a nondeterministic behavior). For example, is it a bot write to a page, is it a job run, or is it something else that those pages share in common that would make them classifiable (as a group) in terms of their lost data (belonging to a certain entity etc.)? |
Just highlighting that I don't believe a bot write to a page is related to this or #191. It has been mentioned many times as a potential reason, but @kghbln and I have found many examples of pages with these issues that have no recent bot activity. |
I would still like to throw in my comment at WikiApiary/WikiApiary#114 (comment) From what I have seen it seems to be connected to edits to the respective template holding the properties. I think it is a weird behaviour but this is the only direct cause and effect thing I observed so far. |
Bill Whitson has reported a similar behaviour for MediaWiki 1.22.0 + Semantic MediaWiki (Version 1.9-RC1) + Semantic Forms (Version 2.6). |
Continuing my thoughts from #397 (comment): SESP properties generally seem not to be missing, which means that the process responsible for the store update is still active (namely the update hook). Before the As I explained above, the only point where In order for annotations to be stored successfully, the execution of |
SMW, uses the The annotations that were found during If for some reason during the processing, the |
@thingles Could you add the following, I'd like to track some activities (+ is marking the added line).
LocalSettings.php
InternalParseBeforeLinks.php
$this->parser->getOutput()->setProperty(
	'smw-semanticdata-status',
	$parserData->getSemanticData()->getProperties() !== array()
);
+ wfDebugLog( 'smw-report', __METHOD__ . ' ' . $this->parser->getTitle()->getDBKey() . ' Properties: ' . count( $parserData->getSemanticData()->getProperties() ) . "\n" );
return true;
LinksUpdateConstructed.php
+ wfDebugLog( 'smw-report', __METHOD__ . ' ' . $this->linksUpdate->getTitle()->getDBKey() . ' Properties: ' . count( $parserData->getSemanticData()->getProperties() ) . "\n" );
$parserData->updateStore();
return true;
This should report all activities in connection with user-defined annotation storage. The expectation is that when a If for some reason |
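A minimal sketch of how the resulting smw-report log could be summarized, for example to count how often LinksUpdateConstructed reported 0 properties. It assumes each entry ends with the message built by the wfDebugLog calls above (method name, page DB key, then "Properties: N"); any timestamp or host prefix the logger adds in front is ignored. The function name summarizeSmwReport is hypothetical and not part of SMW or MediaWiki.

```php
<?php
/**
 * Summarize smw-report log lines per reporting method.
 * Hypothetical helper for log inspection, not part of SMW or MediaWiki.
 *
 * @param string[] $lines raw log lines
 * @return array method name => array( 'total' => int, 'empty' => int )
 */
function summarizeSmwReport( array $lines ) {
	$summary = array();
	foreach ( $lines as $line ) {
		// Assumed entry shape: "<method> <page DB key> Properties: <count>",
		// possibly preceded by a prefix written by the logger.
		if ( !preg_match( '/(\S+) (\S+) Properties: (\d+)\s*$/', $line, $match ) ) {
			continue; // skip lines that do not carry a property count
		}
		$method = $match[1];
		if ( !isset( $summary[$method] ) ) {
			$summary[$method] = array( 'total' => 0, 'empty' => 0 );
		}
		$summary[$method]['total']++;
		if ( (int)$match[3] === 0 ) {
			$summary[$method]['empty']++; // entry reported zero properties
		}
	}
	return $summary;
}
```

It could be fed with the lines of the log file, e.g. `summarizeSmwReport( file( $pathToLog ) )`, where `$pathToLog` is whatever location the smw-report log group is configured to write to.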
@mwjames I added these debug messages as described. I let it run for a few hours. Here is the log file. There are 63 times that LinksUpdateConstructed had 0. I also see stuff like this, and I’m not sure if it’s an error
This pattern also seems odd…
This was a result in the edit of the whois record for that wiki. It’s worth noting that the root page "Rain World Wiki" had no edits, however, it does transclude the subpages. That transclusion would cause the root page to be refreshed as well. |
This is an expected pair because it passes the properties from a page process to the updater (note 87 - 85 = 2, 2 means additional pre-defined properties such as category etc.).
What?
The Looking at Rain_World_Wiki confirms reported activities, namely that it lost its user-defined properties because of the
Who?
Since there is no human interaction (last edit was 19:02, 24 June 2014 Audit Bee) recorded for "Rain World Wiki", who is making those updates from
How?
Transclusion ..., I don't really have time to go through all the templates and build a similar model, therefore I would appreciate it if you could post a simple example/description of how those master-page transclusion template scenarios are built and work together in WikiApiary so that I can try to create a simple example on my own (I don't have SF installed since it doesn't support Composer, therefore any logic in a template or page should refrain from using SF functionality, arraymap etc.). Preferably in a way so that I can copy and paste. |
@thingles can you add an additional line for reporting in LinksUpdateConstructed.php
wfDebugLog( ... )
+ wfDebugLog( 'smw-report', __METHOD__ . ' smw-semanticdata-status: ' . ( isset( $this->linksUpdate->mProperties['smw-semanticdata-status'] ) && $this->linksUpdate->mProperties['smw-semanticdata-status'] ? 'yes' : 'no' ) . "\n" );
I'm looking for entries with |
The way WikiApiary uses transclusion is that Rain World Wiki transcludes each of the subpages that exist. So, the extension data is written in the wikitext of Rain World Wiki/Extensions, but is never parsed there, it is an a The subpages do have edits. I believe that MediaWiki then fires off a job to refresh all pages that transclude that subpage. That is what triggers the update, I believe. You could easily replicate this by having page Foo with
and then Foo/Bar
Then edit Foo/Bar and you should see Foo get refreshed. |
@mwjames I've added this and it’s now collecting. I'll let it run overnight and then share the details. |
@mwjames additional line added and full log available. Example lines I found that showed the behavior we saw before:
|
@thingles @kghbln Blame MW, or the jobqueue or the template but don't blame SMW for what is happening. With the following scenario I was able to recreate a LocalSettings (on a MediaWiki 1.23.0)
Template:Lorem_ipsum
Page: ExampleOfEmptyTemplateEdit
Page: ExampleOfEmptyTemplateEdit/TransclusionOfValueTemplate contains the actual template with values.
Issue
If I change Running |
@thingles After wrecking my head about this issue I looked at
Experiment
Let's see if my hunch produces some positive results by disabling the following lines where instead of relying on
RefreshLinksJob.php (MW-core)
/*
if ( isset( $this->params['rootJobTimestamp'] ) ) {
$skewedTimestamp = wfTimestamp( TS_UNIX, $this->params['rootJobTimestamp'] ) + 5;
if ( $page->getLinksTimestamp() > wfTimestamp( TS_MW, $skewedTimestamp ) ) {
// Something already updated the backlinks since this job was made
return true;
}
if ( $page->getTouched() > wfTimestamp( TS_MW, $skewedTimestamp ) ) {
$parserOutput = ParserCache::singleton()->getDirty( $page, $parserOptions );
if ( $parserOutput && $parserOutput->getCacheTime() <= $skewedTimestamp ) {
$parserOutput = false; // too stale
}
}
}
*/
At all times the actual Having a re-parse is vital in order for |
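To make the effect of the quoted block easier to follow, here is a simplified standalone model of its decision logic. It is only a sketch under stated assumptions: timestamps are plain Unix integers (MediaWiki itself works with TS_MW strings), the job is assumed to re-parse the page whenever no cached parser output is accepted, and the names decideRefreshAction and SKEW_SECONDS are hypothetical. It is not the MediaWiki implementation.

```php
<?php
// The "+ 5" seconds added to rootJobTimestamp in the quoted block.
const SKEW_SECONDS = 5;

/**
 * Simplified model of the skew check (hypothetical helper).
 *
 * @param int|null $rootJobTimestamp when the root job was created, null if not set
 * @param int $linksTimestamp when the page's links were last updated
 * @param int $pageTouched when the page was last touched
 * @param int|null $cacheTime cache time of the stored parser output, null if none
 * @return string 'skip', 'reuse-cache' (no re-parse) or 'reparse'
 */
function decideRefreshAction( $rootJobTimestamp, $linksTimestamp, $pageTouched, $cacheTime ) {
	if ( $rootJobTimestamp === null ) {
		return 'reparse'; // the quoted block is not entered at all
	}
	$skewed = $rootJobTimestamp + SKEW_SECONDS;
	if ( $linksTimestamp > $skewed ) {
		return 'skip'; // something already updated the backlinks
	}
	if ( $pageTouched > $skewed && $cacheTime !== null && $cacheTime > $skewed ) {
		return 'reuse-cache'; // cached output is considered fresh enough
	}
	return 'reparse'; // no cached output, or it is too stale
}
```

In this model the 'reuse-cache' branch is the one of interest for this issue: the page is not parsed again, so any hook that only runs during a parse would not get a chance to collect annotations for that update.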
@mwjames sorry for the delay. I've enabled the logging again and have commented out the lines you requested in |
@mwjames Grabbed the log file after running for the day with the commented out block in RefreshLinksJob.php. I can’t seem to find any entries like the ones we saw before, but please verify on your end. Full log file. With that said though WikiApiary is still losing millions of properties just today. Of note, the longstanding Semantic Forms bug 51479 made an appearance (/cc @yaronkoren @s7eph4n) :
FYI: I will be away from a computer most of this weekend. |
The
It can't be, The skew calculation as seen in the |
/cc @hexmode as maybe he is familiar with some related changes in 1.23. Hasn't this been observed in 1.22 as well though? Or is that specifically just the race condition issue. |
@hexmode Mark I leave it to Mark to decide how to proceed but the easiest and probably most satisfying In
if ( isset( $this->params['rootJobTimestamp'] ) &&
	$wgUseRefreshJobSkewFactorCalculation ) {
[0] Looking at the RefreshLinksJob.php 1.22.8 we can clearly see that the skew factor calculation was introduced in MW 1.23 (hence its absence in MW 1.22.8). [1] The issue can be reproduced by following #347 (comment) |
Haha. I wonder if this enables having fun :) |
What can I say. Blame it on evil globals... |
For details on the issue, see [0] as well as the method to replicate the problem. [0] #347 (comment)
For details on the issue, see [0] as well as the method to replicate the problem. [0] #347#issuecomment-48595047 Change-Id: I5cf085ecad30a9807a16642226897a6b7caf8245
@thingles @kghbln What is the status? I'd like for you to test out #405 which doesn't fix the issue in If you do check out this patch, revert all changes made in regards to this issue and only apply #405 and run Feedback is welcome so it can find its way into the 2.0/2.0.1 release. |
@thingles It would be cool if you could switch WikiApiary to #405 as suggested. Currently I only have very little content on my test wiki and the effect will probably be easier to see at WikiApiary. |
#405 has been merged into master, which means that if run EDIT: Please make sure that all changes from above to SMW/MW-core are reverted before running the |
@mwjames Thanks. I'll work on getting the update. On first attempt composer is complaining to me.
I need to dig into what is happening there. |
It seems the current master (as of 2.0) needs @JeroenDeDauw Any thoughts? |
Well, something is referring to |
WikiApiary is now running 2.0-rc3 (ref WikiApiary/WikiApiary#177). Regarding above I updated my composer.json to |
While #405 tries to mitigate the effect of #405 does a re-parse in case the If we still see those property values decay, then the only fix that avoids the issue is in core, by doing #347 (comment) (well I hope not so but either |
On WikiApiary I’m seeing a large number of pages losing properties for no reason. This bug was originally highlighted at WikiApiary/WikiApiary#114 and has persisted now through a few iterations of master.
I've been seeing an increase in the number of data errors that bots are getting because properties that should be there are not. Looking at properties for WikiApiary for example shows:
That should have an "Is active" and "Is audited" property. They are in the template and in the page. If I manually force a "refresh" of the page the properties will appear again.
I’m seeing this a lot. Various tasks executing from cron are routinely throwing exceptions because mandatory properties have gone missing.