DM-4957: Generate JSON output from validate_drp for inclusion in a test harness #7

wmwv · 2016-02-14T23:40:34Z

Generates individual JSON output files for each of the 5 Key Project metrics:
AM1, AM2, AM3, PA1, PA2

Re-organized and standardized the calculation, passing around, printing, plotting, and saving of the Key Project metrics and associated information.

Add makePrint, makePlot, makeJson keyword options to validate.run. Each defaults to True; the options allow more selective usage in the future. Document new make* options. Specifying outputPrefix overrides repoNametoPrefix generation of the plot and JSON file prefix strings.

Use .tolist() instead of list() to convert numpy.ndarrays. .tolist() works for multi-dimensional arrays, whereas list() converts only the first dimension, so a 2D array yields a list of numpy.ndarray rows instead of the list of lists you get with .tolist(). Rename magrange->magRange for consistency.
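To illustrate the list() versus .tolist() difference described above, here is a minimal sketch. The array values are made up for illustration and are not taken from validate_drp.

```python
import json

import numpy as np

# A small 2-D array standing in for a multi-dimensional metric result.
arr = np.array([[1.0, 2.0], [3.0, 4.0]])

shallow = list(arr)    # converts only the first dimension: a list of ndarray rows
deep = arr.tolist()    # converts every dimension: a list of lists of Python floats

print(type(shallow[0]).__name__)
print(type(deep[0]).__name__)

# The nested lists serialize to JSON; the list of ndarray rows does not.
print(json.dumps(deep))
try:
    json.dumps(shallow)
    shallow_ok = True
except TypeError:
    shallow_ok = False
print(shallow_ok)
```

This is why .tolist() matters for the JSON output: the standard json module has no encoder for numpy.ndarray objects.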

A struct is now returned from each calculation, and that struct is then passed to the print* or plot* functions. Refactor calcAMx, printAMx to move calculations to calc. printAMx is now just a calculation-free printing function. Removes need for printAM1, printAM2, printAM3 wrappers. Remove no-longer-relevant plotAMx tests.

Gracefully catch new ValidateErrorNoStars for AM1, AM2, AM3. The astrometric repeatability looks at different distances: 5, 20, and 200 arcminutes. It is quite possible that a small field or a single chip will not span the larger of these distances. ValidateErrorNoStars is raised in these cases and caught by the main validate.py, which then skips further printing, plotting, or saving of the AMx that was not calculable. Add `util.calcOrNone` function that catches a specified error and returns None if that error is raised. Otherwise returns the result of the function. Passes through any other errors.
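The behavior described for `util.calcOrNone` can be sketched roughly as follows. The argument order and the toy `calcAmxSketch` function are assumptions made for illustration; they are not copied from the validate_drp source, and the toy calculation is not the real AMx algorithm.

```python
class ValidateError(Exception):
    """Base class for exceptions in validate_drp."""


class ValidateErrorNoStars(ValidateError):
    """Raised when a calculation has no usable stars (wording assumed)."""


def calcOrNone(func, data, errorClass, **kwargs):
    """Return func(data, **kwargs), or None if errorClass is raised.

    Any other exception propagates to the caller unchanged.
    Signature is an assumption for this sketch.
    """
    try:
        return func(data, **kwargs)
    except errorClass:
        return None


def calcAmxSketch(separations, D=5):
    """Hypothetical stand-in: count pair separations of at least D arcmin."""
    usable = [s for s in separations if s >= D]
    if not usable:
        raise ValidateErrorNoStars("no star pairs at D=%d arcmin" % D)
    return {"name": "AMx", "D": D, "nPairs": len(usable)}


# Made-up separations (arcmin) from a small field: nothing reaches 200 arcmin.
separations = [1.0, 6.0, 25.0]
results = {
    D: calcOrNone(calcAmxSketch, separations, ValidateErrorNoStars, D=D)
    for D in (5, 20, 200)
}

for D, result in results.items():
    if result is None:
        continue  # skip printing, plotting, and saving for this metric
    print(result)
```

A caller can then test each result against None before printing, plotting, or saving, which matches the skipping behavior described above.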

Remove `pipe_tasks` from deps. Only `pipe.base` is needed.

Previously plotting and printing were each re-calculating PA1. Plotting was doing it once, while printing was doing it 50 times. New centralization standardizes and more clearly separates each of calculation, plotting, and printing and will allow for saving to JSON. The name of the Key Performance Metric is now stored in the .name field of Structs returned from calc* routines in calcSrd.py. Improved printing of PA1. Uses srdSpec to look up target goals instead of using hardcoded values.

Units are just stored as simple strings for ease of reference and use in output functions. E.g., 'mmag', 'mag', 'mas', 'arcmin'
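As a rough sketch of the calculate-once pattern with a named struct and string units: `types.SimpleNamespace` stands in for `pipe.base.Struct` here, and the function names, field names, and the simple RMS calculation are illustrative assumptions rather than the actual PA1 implementation.

```python
import math
from types import SimpleNamespace


def calcRmsSketch(values, name, units):
    """Compute the metric once and return everything downstream code needs."""
    mean = sum(values) / len(values)
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    # The metric name and its units travel with the result as plain strings.
    return SimpleNamespace(name=name, rms=rms, rmsUnits=units, values=values)


def formatMetric(metric):
    """Calculation-free formatting: only reads fields from the struct."""
    return "%s: %.2f %s" % (metric.name, metric.rms, metric.rmsUnits)


# Made-up magnitude differences in mmag.
pa1 = calcRmsSketch([-2.0, 0.0, 2.0], name="PA1", units="mmag")
line = formatMetric(pa1)
print(line)  # PA1: 1.63 mmag
```

Because the print, plot, and save functions only read from the struct, none of them needs to repeat the calculation.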

AM1, AM2, AM3, PA1, and PA2 metrics are now all saved as JSON files. Replace saveAmxToJson with general saveKpmToJson.
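A general saver along the lines of saveKpmToJson might look like the sketch below. Only the function name and its two arguments appear in this PR's diff; the body, the `Sketch` suffix, the field names, and the use of `types.SimpleNamespace` in place of `pipe.base.Struct` are assumptions.

```python
import json
import os
import tempfile
from types import SimpleNamespace


def saveKpmToJsonSketch(kpmStruct, filename):
    """Write every field of a metric struct to a JSON file."""
    data = {}
    for key, value in vars(kpmStruct).items():
        # numpy arrays (or anything else offering .tolist()) become nested lists.
        data[key] = value.tolist() if hasattr(value, "tolist") else value
    with open(filename, "w") as outfile:
        json.dump(data, outfile, indent=2, sort_keys=True)


# Made-up AM1-like values; units are stored as plain strings.
am1 = SimpleNamespace(name="AM1", rmsDistMas=[3.1, 4.2], rmsUnits="mas",
                      D=5, annulusUnits="arcmin")

path = os.path.join(tempfile.mkdtemp(), "sketch_AM1.json")
saveKpmToJsonSketch(am1, path)

with open(path) as infile:
    loaded = json.load(infile)
print(loaded["name"], loaded["rmsUnits"])
```

Since the function only iterates over the struct's fields, the same code can serve AM1, AM2, AM3, PA1, and PA2 without metric-specific branches.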

timj · 2016-02-15T17:27:28Z

I have a general comment that I don't really agree that general python class files should be executable with shebangs. There is an implication that this means they should be able to run on their own but I don't think that is true. check.py and validate.py are executable but don't include a main. Is the shebang being used to convince the editor that it's python code? Doesn't the .py file suffix suffice for that?

timj · 2016-02-15T17:27:52Z

python/lsst/validate/drp/base.py

@@ -26,3 +26,7 @@
 class ValidateError(Exception):
    """Base classes for exceptions in validate_drp."""
    pass
+
+class ValidateErrorNoStars(ValidateError):
+    """Base classes for exceptions in validate_drp."""


I don't think this comment is correct.

(I don't see how to comment on the general comment). The way that emacs identifies files is suffix, but failing that a comment of the form -*- python -*-, not the #! line

I think you just add a PR-wide comment.

wmwv · 2016-02-15T18:26:24Z

Totally agree about no `#!/usr/bin/env python` lines for the individual non-executable Python files. Thank you, this was just left over from copying the header from other files.

These shebang lines have been removed from the routines not meant to be executed.

Document 'outputPrefix'. Relabeled docstring 'Inputs'->'Parameters'. Updated README to note that the requirement is `pipe_base` not `pipe_tasks`. Whitespace fixes.

wmwv · 2016-02-15T19:48:09Z

JSON test updated to real unittest. Thanks for catching this oversight.

timj · 2016-02-15T19:49:57Z

tests/testJson.py

+
+if __name__ == "__main__":
+    if "--display" in sys.argv:
+        display = True


The only mention of this variable is in this line. How does it have an effect?

To be honest, I don't know. I copied this in from some test code from some other package. I thought it was standard boilerplate for running tests. It doesn't really make sense.

I think it would be less confusing if it was removed.

Agreed. See the commit I snuck into DM-5121 to improve this test.

f325eec

ktlim · 2016-02-15T20:56:28Z

python/lsst/validate/drp/io.py

+import numpy as np
+
+
+def saveKpmToJson(KpmStruct, filename):


I think there was some prior conversation on this topic, which I appear to have missed, but I'm worried about 1) non-use of the Butler, 2) use of an explicit file format (what is the other side of this interface, if known?), and 3) use of an explicit file format other than YAML (effectively adding an additional dependency to the stack).

You are right about the JSON in that I saw JSON but kept reading YAML (and it uses pyyaml).

@KTL For context, I understand the motivations behind your questions and don't really disagree.

But for the purposes of this Pull Request:
As part of DM-2050 I was told by @frossie to have validate_drp output to JSON, so that's what I did. Happy to have a further discussion about how DM-4957 should have been implemented, but it's out of scope for the Pull Request.

A few notes:

json is in the Python standard library. There is no additional dependency being added by using JSON.

JSON is a subset of YAML.

I got a bit confused myself about YAML vs. JSON when answering @timj's question above about implementing this in pipe.base.Struct (which is also beyond scope -- and a partial answer may be that pipe_base doesn't want to do any serialization).

There is no current "other side of this interface" (DM-2050).

timj · 2016-02-15T21:04:51Z

@wmwv I think you merged this code even though there were open comments on the new test code.

wmwv · 2016-02-15T23:34:30Z

I did. I didn't realize that your original request was to review the test code.

wmwv added 10 commits February 14, 2016 18:36

Fix testLoading test, which was failing due to wrong import.

074605d

Linter fixes.

a64f3c4

Remove `pipe_tasks` from deps. Only `pipe.base` is needed.

Store and use units (as strings) in Struct, print, plot, save.

3642b13

Units are just stored as simple strings for ease of reference and use in output functions. E.g., 'mmag', 'mag', 'mas', 'arcmin'

Save PA1 and PA2 to JSON files on output.

c137b23

AM1, AM2, AM3, PA1, and PA2 metrics are now all saved as JSON files. Replace saveAmxToJson with general saveKpmToJson.

Whitespace fixes.

caabb2b

wmwv force-pushed the tickets/DM-4957 branch from 611264f to caabb2b Compare February 15, 2016 02:42

timj reviewed Feb 15, 2016

wmwv added 2 commits February 15, 2016 14:42

Update JSON test to a real unittest.

fac1013

Improve documentation in response to PR comments.

ed74ad9

Document 'outputPrefix'. Relabeled docstring 'Inputs'->'Parameters'. Updated README to note that the requirement is `pipe_base` not `pipe_tasks`. Whitespace fixes.

wmwv force-pushed the tickets/DM-4957 branch from 0d21d1a to ed74ad9 Compare February 15, 2016 19:47

timj reviewed Feb 15, 2016

wmwv merged commit ed74ad9 into master Feb 15, 2016

ktlim reviewed Feb 15, 2016

ktlim deleted the tickets/DM-4957 branch August 25, 2018 06:50


wmwv commented Feb 14, 2016

timj commented Feb 15, 2016

timj Feb 15, 2016

RobertLuptonTheGood Feb 15, 2016

timj Feb 15, 2016

wmwv commented Feb 15, 2016

wmwv commented Feb 15, 2016

timj Feb 15, 2016

wmwv Feb 15, 2016

timj Feb 16, 2016

wmwv Feb 16, 2016

ktlim Feb 15, 2016

timj Feb 15, 2016

wmwv Feb 15, 2016

timj commented Feb 15, 2016

wmwv commented Feb 15, 2016
