freeze degree string representation #2347

volodymyrss · 2022-03-29T09:20:37Z

For some reason, the default representation depends on astropy and/or python version.

codecov · 2022-03-29T14:53:25Z

Codecov Report

Merging #2347 (a1beaaf) into main (db3f08b) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #2347   +/-   ##
=======================================
  Coverage   63.40%   63.40%           
=======================================
  Files         131      131           
  Lines       17044    17044           
=======================================
  Hits        10807    10807           
  Misses       6237     6237

Impacted Files	Coverage Δ
astroquery/heasarc/core.py	`73.79% <ø> (ø)`

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

eerovaher · 2022-03-29T16:23:29Z

If the failures were indeed caused by using some particular versions of external software then why were the tests failing only in this subpackage?

volodymyrss · 2022-03-29T16:29:44Z

If the failures were indeed caused by using some particular versions of external software then why were the tests failing only in this subpackage?

This subpackage selects mock request data by constructing filename from the request. This way all of the remote tests can have a local version.
In some cases, request is subtly changed due to some external changes. In this case, there are three options:

store mock data for the new request.
- This is possible and may be feasible if we want to handle the case of a potentially meaningfully different request.
use the unmodified mock data for the modified request.
- this may case the request to break on real remote that, since what was used for the mock data is not the same as the one actually sent
modify the request so that it is always the same
- this is what I've done. Formatting of the degrees is more consistent with fixed precision.

eerovaher · 2022-03-29T16:50:28Z

I was wondering what makes this subpackage different from all the rest. I think your explanation sounds reasonable.

But if the problem can be solved by improving the robustness of generating filenames from the requests then why is it necessary to update thousands of lines in data files? If the precision of numerical values would be explicitly set to what it is with the combinations of software that did not cause the tests to fail then the generated filenames should be the same as they have been so far, and there should be no need to modify the mock data at all.

bsipocz · 2022-03-29T18:48:21Z

I share the concern about regenerated data, do you expect them to be stable after this PR? For the long term we need a solution where mock data may change only if there is a server-side API overhaul and not when more data gets added, etc.

volodymyrss · 2022-03-29T19:26:17Z

I was wondering what makes this subpackage different from all the rest. I think your explanation sounds reasonable.

But if the problem can be solved by improving the robustness of generating filenames from the requests then why is it necessary to update thousands of lines in data files? If the precision of numerical values would be explicitly set to what it is with the combinations of software that did not cause the tests to fail then the generated filenames should be the same as they have been so far, and there should be no need to modify the mock data at all.

I think you are referring to the option 2 in my list above: using the unchanged mock data for changed requests, by making filename generation insensitive to some changes of request, right?
Note that I took this approach in couple of cases, see here

astroquery/astroquery/heasarc/tests/test_heasarc_remote_isdc.py

Lines 77 to 78 in db3f08b

    
           patch_get.assume_fileid_for_request( 
        
               lambda url, params: f"last-month-{params['tablehead'].split()[-1]}")

. But in general it can be problematic, since it may create divergence between remote tests and local tests of the same functionality.

I instead went for option 3 - making the request the same for the same user input, with different software versions.
In fact, the output could potentially be meaningfully different for different requests. So user might get meaningfully different results for the same user code with different astropy/python versions. It's good to fix this.
Using request hash as filename helps to catch this kind of cases.

For changing files (also responding to @bsipocz), there are several factors:

since some requests are now slightly different (and, hopefully, the same for all versions), it's not unreasonable to keep them in different files, since the output for different request is also different. The content is either identical or slightly different (see below). This accounts for the 1000-line changes. I could make sure the
In some cases, the metadata at the service changed. E.g. now there is new table "Fermi LAT 12-Year Point Source Catalog (4FGL-DR3)", and there is IXPE. I guess we can assume this does not happen often - with major releases of datasets, new missions, etc. Should such a change be updated in the mock data? I could just revert to the previous file, with the same name.
Some responses are actually a bit random. I could keep the previous version with little effort, since the filename is the same.

bsipocz · 2022-03-29T19:41:42Z

I think all of the above should be covered by the remote data tests, the mock ones is fine if to be limited to ensure _parse_result(), and other astroquery internals work as expected. E.g. throwing a votable or a fits file to it the method processes it properly. It really is irrelevant that the actual content is roughly right for the mock query or exactly right.
Since we already have many files in the repo it's fine to keep them as per your first point, but I don't think we need to track slight changes. A fits file is a fits file in this context, the main points to be tested is that it reads incorrectly, and e.g. units are parsed well (that's a current know bug).

As for your second example, we're in that situation with skyview, and it's an annoyance really. Releases change, not too often, but often enough, so we're in a constant catch-up with our test suite without really serving any real purpose. Would be nice to not be exactly sensitive on exact matches.

And for random responses, I would say the same as above, if the structure is the same, and one that is expected then it doesn't matter. The big things to look for that if we run into an empty return, or a truncated one or a dropped column when we expect full results, etc.

volodymyrss · 2022-03-29T19:50:05Z

Alright, as I try to elaborate in #2348 , there is really no special reason in this parameterized fixture approach I had to change most of the files. I just wanted to make sure that the code will continue to be able to read new responses. But given the disadvantages, I will revert them. Actually, I could just close this PR, and continue in #2348, as you suggested elsewhere. The comments become similar too.

edit: maybe, to fix the tests quickly, we could revert all the data changes in this PR, and but keep the change to the request - it's an actual fix for the module functionality. And then we can proceed in the other PRs.

astroquery/heasarc/core.py

eerovaher · 2022-03-30T16:01:18Z

The commits have to be squashed because otherwise the data files that were added in the first commit will remain in the repository.

volodymyrss · 2022-03-30T19:25:16Z

The commits have to be squashed because otherwise the data files that were added in the first commit will remain in the repository.

Done!

bsipocz

Thank you!

bsipocz · 2022-03-30T19:36:51Z

astroquery/heasarc/core.py

@@ -171,7 +171,7 @@ def query_region_async(self, position: Union[coordinates.SkyCoord, str],
        # Generate the request


you may want to leave a one-liner comment as a reminder why formatting is applied here, otherwise it's likely someone would wonder a few years down the line 😅

Sure, done. I did not manage in one line, only two, but if you have better ideas - please suggest!

multi-liner is fine, too, I just meant that it can be just a very short reminder :)

keflavich approved these changes Mar 29, 2022

View reviewed changes

keflavich mentioned this pull request Mar 29, 2022

Improve developer experience with finding and comparing mock requests #2348

Draft

volodymyrss mentioned this pull request Mar 29, 2022

Fixed failing test in HEASARC module, adding local test data #2250

Merged

volodymyrss force-pushed the fix-degree-representation branch from 5619d32 to 19f43d0 Compare March 29, 2022 14:44

bsipocz added the heasarc label Mar 29, 2022

eerovaher reviewed Mar 29, 2022

View reviewed changes

astroquery/heasarc/core.py Outdated Show resolved Hide resolved

volodymyrss force-pushed the fix-degree-representation branch from 98f4d59 to 8802f84 Compare March 30, 2022 19:09

volodymyrss added 2 commits March 30, 2022 21:22

freeze rounding

183d06d

freeze rounding in heasarc query, adapted data files

714e6fd

volodymyrss force-pushed the fix-degree-representation branch from 8802f84 to 714e6fd Compare March 30, 2022 19:22

bsipocz added this to the v0.4.7 milestone Mar 30, 2022

bsipocz added the Affects-dev label Mar 30, 2022

bsipocz approved these changes Mar 30, 2022

View reviewed changes

comment about fixed representation

a1beaaf

bsipocz merged commit c081ee4 into astropy:main Mar 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

freeze degree string representation #2347

freeze degree string representation #2347

volodymyrss commented Mar 29, 2022

codecov bot commented Mar 29, 2022 •

edited

eerovaher commented Mar 29, 2022

volodymyrss commented Mar 29, 2022

eerovaher commented Mar 29, 2022

bsipocz commented Mar 29, 2022

volodymyrss commented Mar 29, 2022

bsipocz commented Mar 29, 2022

volodymyrss commented Mar 29, 2022 •

edited

eerovaher commented Mar 30, 2022

volodymyrss commented Mar 30, 2022

bsipocz left a comment

bsipocz Mar 30, 2022

volodymyrss Mar 30, 2022

bsipocz Mar 30, 2022

		@@ -171,7 +171,7 @@ def query_region_async(self, position: Union[coordinates.SkyCoord, str],
		# Generate the request

freeze degree string representation #2347

freeze degree string representation #2347

Conversation

volodymyrss commented Mar 29, 2022

codecov bot commented Mar 29, 2022 • edited

Codecov Report

eerovaher commented Mar 29, 2022

volodymyrss commented Mar 29, 2022

eerovaher commented Mar 29, 2022

bsipocz commented Mar 29, 2022

volodymyrss commented Mar 29, 2022

bsipocz commented Mar 29, 2022

volodymyrss commented Mar 29, 2022 • edited

eerovaher commented Mar 30, 2022

volodymyrss commented Mar 30, 2022

bsipocz left a comment

Choose a reason for hiding this comment

bsipocz Mar 30, 2022

Choose a reason for hiding this comment

volodymyrss Mar 30, 2022

Choose a reason for hiding this comment

bsipocz Mar 30, 2022

Choose a reason for hiding this comment

codecov bot commented Mar 29, 2022 •

edited

volodymyrss commented Mar 29, 2022 •

edited