DM-34355: Improve analysis_drp scatterPlot unit test #32
Conversation
Force-pushed from 8f25d08 to 5411692.
I really like a lot of this - what separates it from most of our other (disliked) hard-coded-numbers tests is that it has an easy-to-use flag that regenerates the reference data, and there are some really good ideas here for testing plot-generation that I never would have come up with on my own. It will still require frequent updates to the test data, but for the most part that looks easy enough to not be a burden on developers, who really should be looking at the plots they've changed before merging anything to main anyway. I've made one comment on the text comparisons that I hope could make them less fragile without much more work.
That said, I think the PNG-image comparison has to go - turning it into a warning just lets the images get out of sync, which seems worse to me than an actual failing test, not an improvement. The big problem IMO is the need to add a new PNG to the repo (git never forgets, so we're not replacing the old one!) whenever the plot changes; that just doesn't scale.
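The regenerate-on-flag pattern praised above can be sketched as follows. This is illustrative only - `check_or_resave` and its arguments are hypothetical names, not the PR's actual API:

```python
import os
import tempfile


def check_or_resave(texts, filename, resave=False):
    # When resave=True, overwrite the reference file with the current
    # plot texts instead of comparing -- the developer then inspects the
    # diff and commits it if the change is intended.
    if resave:
        with open(filename, "w") as f:
            f.writelines(f"{t}\n" for t in texts)
    with open(filename) as f:
        reference = [line.rstrip("\n") for line in f]
    return texts == reference


# Regenerate once, then subsequent runs compare against the saved file.
ref_path = os.path.join(tempfile.mkdtemp(), "texts_ref.txt")
assert check_or_resave(["title", "x label"], ref_path, resave=True)
assert check_or_resave(["title", "x label"], ref_path)
assert not check_or_resave(["title", "y label"], ref_path)
```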
texts : `List[str]`
    A list of all text strings (title and axis/legend/tick labels).
line_xys : `List[np.ndarray]`
    A list of all line _xy attributes.
Shape information would be very helpful to document here, too.
Does `List[numpy.ndarray]`, (N, 2) make sense?
I can guess what that means, assuming the sphinx/documenteer logic doesn't mangle it, but I think it'd be just better to put it in the prose description, e.g.
- A list of all line _xy attributes.
+ A list of all line ``_xy`` attributes. Each element is an array of shape ``(N, 2)``.
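For illustration of the shape convention being documented (a standalone sketch, not the PR's code): matplotlib's `Line2D` stores its vertices as an `(N, 2)` array with x in column 0 and y in column 1, which `np.column_stack` reproduces from separate coordinate arrays:

```python
import numpy as np

# Build an (N, 2) vertex array the way matplotlib lays out line data:
# column 0 holds x coordinates, column 1 holds y coordinates.
x = np.linspace(0.0, 1.0, 5)
y = x ** 2
line_xy = np.column_stack([x, y])

assert line_xy.shape == (5, 2)
assert np.array_equal(line_xy[:, 0], x)
assert np.array_equal(line_xy[:, 1], y)
```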
tests/test_scatterPlot.py
Outdated
with open(filename, 'w') as f:
    f.writelines(f'{text}\n' for text in texts)

self.assertTrue(filecmp.cmp(filename_texts_tmp, filename_texts_ref))
I think we could make this check a bit more robust to minor changes by replacing the full-file comparisons with logic that checks that the new plot's text lines (as a set of `str` lines) are a superset of those saved in the file. That would let us save just the lines we think are particularly important (e.g. by git interactive staging after running with `resave=True`), and more importantly it would insulate us from matplotlib deciding to order the lines differently in response to e.g. figure elements being moved around or added to the figure in a different order.
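A minimal sketch of the superset comparison suggested above (`texts_match` is a hypothetical helper name, not part of the PR):

```python
def texts_match(new_texts, reference_texts):
    # Pass when every curated reference line still appears among the new
    # plot's text elements; ordering and extra labels are ignored, which
    # insulates the test from matplotlib reordering its text artists.
    return set(reference_texts) <= set(new_texts)


# Extra lines in the new plot are fine...
assert texts_match(["title", "x label", "tick 0"], ["title", "x label"])
# ...but a missing reference line still fails.
assert not texts_match(["title"], ["title", "x label"])
```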
tests/test_scatterPlot.py
Outdated
if resave:
    np.savetxt(filename, line)
arr = np.loadtxt(filename)
assert np.isclose(arr, line).flat
Are you sure this is doing what you think it's doing? I would have thought coercing a `.flat` access to a boolean would not automatically imply something like `np.all`.
You could also just use `assertFloatsAlmostEqual` (from `lsst.utils.tests.TestCase`).
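To illustrate the concern (a standalone demonstration, not the PR's test code): `.flat` returns a `flatiter` object, which is truthy for any non-empty array, so the assertion can never fail:

```python
import numpy as np

arr = np.array([1.0, 2.0])
line = np.array([1.0, 99.0])  # clearly not close to arr

# The pattern under review: np.isclose returns [True, False], but .flat
# wraps it in a flatiter, which is truthy whenever the array is
# non-empty -- so this assertion passes despite the mismatch.
assert np.isclose(arr, line).flat

# Reducing over all elements catches the mismatch as intended.
assert not np.all(np.isclose(arr, line))
assert not np.allclose(arr, line)
```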
I used `assertFloatsAlmostEqual` as suggested, but the default tolerances failed on macOS at the `1e-12` level, which is somewhat concerning (I hadn't run Jenkins before). I set `atol=1e-8, rtol=1e-10` for now, but I'm not sure how robust that's going to be.
Given that the result is normally consumed by just human eyes, I'm not particularly bothered by really any loss of precision here (who knows what shenanigans matplotlib might be up to?) and would recommend tolerances around `1e-3`.
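`assertFloatsAlmostEqual` lives in `lsst.utils`, but the same rtol/atol trade-off can be illustrated with `np.isclose`, which passes when `|a - b| <= atol + rtol * |b|` (the specific numbers below are illustrative):

```python
import numpy as np

a, b = 1.0, 1.0005  # ~5e-4 apart, e.g. sub-pixel jitter in plot coordinates

# Loose tolerances around 1e-3 absorb platform-level float differences.
assert np.isclose(a, b, rtol=1e-3, atol=1e-3)

# The tighter tolerances tried first (atol=1e-8, rtol=1e-10) reject them.
assert not np.isclose(a, b, rtol=1e-10, atol=1e-8)
```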
Force-pushed from 3ea70c2 to faca789.
Force-pushed from faca789 to e798d94.