New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MW 1.31 ALL postgres tests are broken!! #3101
Comments
@mwjames, it would help us to know a bit more about the nature of the failures. The patch [0] fixed 10 test errors and 9 failures when run with PostgreSQL, while introducing no new errors or failures in our testing. Compare [1] and [2]. Brad speculates that maybe if you're passing false for $temporary to duplicateTableStructure(), then when CloneDatabase drops the table it won't drop the sequence and a subsequent run might throw an error at the point where it tries to re-create it. That's just a guess, but he has proposed a core patch to address that situation [3]. If that is not the issue, please provide more information. Please understand that our intent is to improve support for Postgres, not degrade it. You are testing unreleased software, which can be frustrating, and we appreciate your feedback. We will work, as always, to address issues that you find. What is not acceptable, as you know, is to impugn the professionalism of the developers on either side of this interaction. Cindy [0] https://gerrit.wikimedia.org/r/#/c/420251/ |
Well, I believe that is the common imperative of software development but when I come back each weekend and see that things are broken again and again I'm questioning that directive.
I'm aware of that and is the reason why we are writing tests which includes tests that are categorized as integration tests and not just unit tests that mock a connection.
I created [0] which runs with [1] (mw-master) and as of this writing [1] reports "There were 101 errors:" which is lower than the initial reported "There were 539 errors:" in numbers but still 101 errors too many. Those errors from [1] don't appear for any other system or release. To simplify and help the analysis, I also created [2] which runs the same SMW master but uses the MW 1.30 release as base and reports with "1 error" (which is linked to #2903). This should clarify that whatever changed between MW 1.30 and mw-master creates a massive failure which can not be accounted to SMW as it is used unaltered during the test run. You are free to use the [0] as branch to switch releases. [0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/tree/postgres-issue-3101 [1] https://travis-ci.org/SemanticMediaWiki/SemanticMediaWiki/jobs/369790915
[2] https://travis-ci.org/SemanticMediaWiki/SemanticMediaWiki/jobs/369795775
PS: I only respond during a weekend to questions or inquiries. |
@mwjames, I asked one of our devs to take a look at the 101 remaining errors in the PHPUnit run. A summary of the cause of these errors is as follows:
The full analysis of each error follows: Error # 1: This seems to be the same failure you are already tracking at #2903. You seem to have identified the MediaWiki change that broke SMW's expectations there. TL;DR: In MySQL, SQLite, and MSSQL, the "sequence" for auto-incrementing ID columns is not separable from the ID column itself, while in PostgreSQL and Oracle the sequence is a separate database object. MediaWiki used to have a method that only needed to be called for PostgreSQL and Oracle to fetch the next ID from the sequence object so it could be used when inserting into the table and known afterwards, meaning that most code forgot to ever call that method because MySQL and Sqlite don't need it. We decided to deprecate and remove that method in favor of having MediaWiki's database later handle it all transparently for PostgreSQL and Oracle. It seems SMW's tests were somehow or other depending on that separation in PostgreSQL and will need to be fixed. Chances are there are 27 errors caused by this issue now versus only 1 when running with MediaWiki 1.30 because, prior to https://gerrit.wikimedia.org/r/c/420251/, later tests probably worked because they continued on incrementing the sequence after the one colliding ID instead of resetting the sequence each time the table is re-cloned. Errors # 2–44: These all seem to be because the database handle is in an error state from # 1 and was never rolled back. This "error state" functionality was added in https://gerrit.wikimedia.org/r/c/421496/ to make handling of database errors less error-prone. Somewhat ironically, the new behavior matches PostgreSQL's native behavior that MediaWiki had previously been working around in a hacky and half-broken manner. MediaWiki's MediaWikiTestCase class calls rollback() on the database handle if the test left a transaction open (i.e. because the test failed) so any such error state will get cleared for subsequent tests. But SMW doesn't include that rollback logic in each test's tearDown(), hence the error state persists until a rollback does happen. Error # 45: This is use of a deprecated function. SMW will need to be updated to not use it. Error # 46: This is another duplicate ID error, see error # 1. Errors # 47–73: These all seem to be because the database handle is in an error state from # 46 and was never rolled back. See error # 2. Errors # 74–92: More duplicate ID errors, see error # 1. Error # 93: This was caused by a change in MediaWiki, but not in the database layer. When we were updating code related to the import duplicate detection logic, we updated that logic to use the rev_sha1 field added 6 years go. And SMW is using a dump that's older than that in its import tests. A fix has been submitted to avoid the PostgreSQL database error as https://gerrit.wikimedia.org/r/c/428521/. Errors # 94–95: More duplicate ID errors, see error # 1 . Error # 96: Same as error # 93. Error # 97–99: More duplicate ID errors, see error # 1. Error # 100: Same as error # 93. Error # 101: Another duplicate ID error, see error # 1. |
@mwjames Are there any remaining Postgres test failures that you need assistance with? |
@mwjames Just checking in with respect to Postgres. If there are issues that
could use attention, please let me know.
Again, I spent a fair amount of time during the weekend, reviewing
mentioned patches (to find a remedy) and above comments but none
allowed this to restore the status quo before the tests were broke
therefore the only viable option remained #3397.
To be honest, I'm really tired of issue like this me spending hours
unproductively (in a sense that doesn't move the project forward) in a
queue on the conclusion to introduce fixes such as #3397 that make me
unhappy to a degree that questions my support for this in future.
…On 8/28/18, cicalese ***@***.***> wrote:
> @mwjames Are there any remaining Postgres test failures that you need
> assistance with?
@mwjames Just checking in with respect to Postgres. If there are issues that
could use attention, please let me know.
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#3101 (comment)
|
Running a weekly test sprint on mw-master/postgres locally I found that [0] broke the entire setup for almost every test.
Going back to [1] restores to last mentioned stats in #2905 (comment).
PS: Seriously, I don't know why this happens every week anew that things are broken beyond an acceptable level. We are all (or most) professionals that now and then make mistakes but this is becoming a constant irritation.
[0] wikimedia/mediawiki@e9553f6
[1] wikimedia/mediawiki@9688e5a
The text was updated successfully, but these errors were encountered: