BUG: Fix broken test result TestAtlanticProfiles #671
Conversation
This changes the default tolerance for all image tests though, doesn't it? Is that desirable? Would it be better just to override the tolerance parameter for this particular test? When I created the Atlantic profiles example I tested it on two machines, both using matplotlib v1.2.0, and it passed on each (although I created it on one of them, so it is bound to pass there). Since the example tests don't run on Travis, it is particularly difficult to decide whether a custom tolerance is required when creating an example. Did you just run this on your own machine? I'd be interested in the version of matplotlib used, because I'm not really sure why the images would be so different.
I'm using matplotlib v1.2.0 and running these tests on my own machine. Since this number cannot be predicted (we cannot know how hardware/software changes might affect the result), as you have found, what is important is encapsulating the tests with a suitable tolerance to reduce the occurrence of this kind of test failure in future.
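For reference, a per-test override does not have to touch the global default. Below is a minimal sketch using matplotlib's public `compare_images` helper; the decorator and keyword actually used by this test suite may be named differently, and the file names and tolerance value here are purely illustrative.

```python
from matplotlib.testing.compare import compare_images

def assert_images_match(expected_png, actual_png, tol=0.001):
    # compare_images returns None when the RMS difference is within `tol`,
    # otherwise a message describing how far apart the images are.
    err = compare_images(expected_png, actual_png, tol=tol)
    if err is not None:
        raise AssertionError(err)

# Hypothetical usage: only the Atlantic profiles test gets a looser tolerance.
# assert_images_match('atlantic_profiles.expected.png',
#                     'atlantic_profiles.result.png', tol=0.03)
```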
I was just concerned because ...
@ajdawson I understand your concern; however, my point is that the currently set tolerance may already be doing this. The value already in use was chosen on exactly the same basis as the one in this PR (making the chosen value no less valid than the one before it). The problem you describe lies in how graphics unit tests are implemented, not, I think, in the change made by this PR. To restate: the tolerance is not calculated, cannot be predicted, and the differences are also subject to problem-specific variation. In answer to your concern, one possibility would be to write a set of unit tests that describe the boundary between those differences which are acceptable and those which are not. However, writing such an exhaustive set of unit tests, i.e. one that tries to encompass all types of differences, may well be unfeasible.
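To make the numbers concrete: the tolerance being discussed bounds a root-mean-square (RMS) difference between the rendered image and the stored baseline. A hypothetical illustration of that metric (the function name is mine, not the suite's API):

```python
import numpy as np

def rms_difference(expected, actual):
    """Root-mean-square difference between two equally sized image arrays."""
    diff = expected.astype(np.float64) - actual.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Because the metric depends on every pixel the renderer touches, small font or anti-aliasing changes between matplotlib versions shift it in ways that cannot be predicted in advance, which is the crux of the discussion above.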
Yeah, I understand the issue and that it is a deeply rooted problem. I don't necessarily disagree with the change in this PR, I just wanted to make certain that it has been thought through and discussed properly.
I agree.
This commit increases the graphics tolerance for this specific test. A visual comparison of the failing output has been made, confirming that it is correct.
BUG: Fix broken test result TestAtlanticProfiles
@ajdawson - just checking, did you originally generate this file with mpl 1.3?
I'm pretty much certain it was with 1.2, the version I use inside my iris development virtualenv.
OK, thanks - it's just that I came across this issue separately and assumed it had been a v1.3.0 issue. Ah, actually, could it be that you're using v1.2.1? I fixed a pixel difference between v1.2.0 and v1.2.1 which could be having a similar impact. Cheers!
I'm using v1.2.0, just checked.
I guess I could have messed up somehow, but the image compares fine on another machine I have that uses v1.2.0 of matplotlib...
@pelson - could it be something to do with the behaviour of ...
Not sure tbh. If I get a chance, I'll have a go at reproducing the image exactly and then using git blame on mpl to figure out what feature has caused it.
I've just discovered something worrying... on my system at least, the graphics test does not detect the difference between the correct version of the first plot and the same plot upside down! This is crazy, because it moves the plot lines and the y-axis labels. I tested this by commenting out ... but there was no test failure. I thought that something in my system/virtualenv might be broken, so I tried modifying the plot in other ways (I changed the colour of the text in the top x-axis) and the test fails as expected. Can you think of any reason this may be happening? I fiddled with the tolerance and it made no difference... There is something mighty strange going on here!
This is not the full story. I had only reset the tolerance to the test suite default. I tested again and made the tolerance smaller, and I was able to get the test to fail, but get this: only with a tolerance of 0.00043. I'm struggling to believe that these plots can't be distinguished at a tolerance higher than that... The very existence of the discussion I'm writing in suggests this cannot be correct!
I think this is fixed by matplotlib/matplotlib#1291 - if I remember correctly, the RMS was being computed incorrectly...
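That would also be consistent with the upside-down plot going undetected: if the pre-fix comparison effectively reduced each image to a histogram of pixel values (which is what the referenced fix appears to address), a vertical flip would be invisible to it, since flipping an image does not change its value distribution. An illustration of the two metrics (not matplotlib's actual code):

```python
import numpy as np

image = np.arange(96, dtype=np.uint8).reshape(8, 12)
flipped = image[::-1, :]  # the same image upside down

# Histograms are identical, so a histogram-based comparison sees no change.
hist_a, _ = np.histogram(image, bins=256, range=(0, 256))
hist_b, _ = np.histogram(flipped, bins=256, range=(0, 256))
print(np.array_equal(hist_a, hist_b))  # True

# A pixel-wise RMS difference does see the flip.
diff = image.astype(np.float64) - flipped.astype(np.float64)
print(np.sqrt(np.mean(diff ** 2)))  # non-zero
```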
@pelson do you know which released version of matplotlib that fix is included in?
Just learnt something new about git (from matplotlib clone):
Don't know how much I trust it, since v1.3.0 isn't actually in there, but it looks like it may be v1.3.0 ... EDIT: I need to do a fetch before I make statements like that 😉. It is in v1.3.0.
This commit increases the graphics tolerance for these tests. A visual
comparison of the failing output has been made, confirming
that it is correct.