DM-32058 Fail dataset type query early on non extant type #591

natelust · 2021-11-02T21:17:05Z

If a dataset type is used in a queryDatasetTypes call and that
type is not known to the registry, the method should raise
instead of silently continuing. The silent behavior was causing
downstream consumers of this method to hit confusing registry errors.
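
For illustration only, here is a minimal sketch of the fail-early behavior described above. It uses a toy stand-in, not the daf_butler registry: `ToyRegistry`, `query_dataset_types`, and `MissingDatasetTypeError` are hypothetical names invented for this example.

```python
class MissingDatasetTypeError(LookupError):
    """Hypothetical error for a dataset type name the registry does not know."""


class ToyRegistry:
    """Toy stand-in for a registry; not the daf_butler implementation."""

    def __init__(self, registered):
        self._registered = set(registered)

    def query_dataset_types(self, names):
        # Check every requested name up front so the caller sees a clear
        # error here, rather than a confusing failure further downstream.
        unknown = sorted(set(names) - self._registered)
        if unknown:
            raise MissingDatasetTypeError(
                f"Dataset type(s) not registered: {', '.join(unknown)}"
            )
        return sorted(names)


registry = ToyRegistry({"raw", "calexp"})
print(registry.query_dataset_types(["raw"]))

try:
    registry.query_dataset_types(["raw", "nope"])
except MissingDatasetTypeError as err:
    print(f"failed early: {err}")
```

The point of the sketch is only where the check happens: at the lookup, before any downstream query is built.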

Checklist

ran Jenkins
added a release note for user-visible changes to doc/changes

codecov · 2021-11-02T21:31:58Z

Codecov Report

Merging #591 (336a321) into master (1393936) will decrease coverage by 0.00%.
The diff coverage is 76.92%.

@@            Coverage Diff             @@
##           master     #591      +/-   ##
==========================================
- Coverage   83.52%   83.52%   -0.01%     
==========================================
  Files         241      241              
  Lines       30251    30256       +5     
  Branches     4515     4518       +3     
==========================================
+ Hits        25267    25270       +3     
- Misses       3788     3790       +2     
  Partials     1196     1196

Impacted Files	Coverage Δ
python/lsst/daf/butler/registries/remote.py	`0.00% <0.00%> (ø)`
python/lsst/daf/butler/registry/tests/_registry.py	`98.93% <ø> (ø)`
python/lsst/daf/butler/registries/sql.py	`81.54% <100.00%> (+0.11%)`	⬆️
python/lsst/daf/butler/registry/_registry.py	`72.31% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

TallJimbo

This fixes the serious problem that users are seeing, and we should get it in the weekly one way or another.

As-is, I think this either leaves a branch deep in the query logic that can never fire, or that branch fires when DatasetType objects rather than string names are passed to queryDatasets, and then produces different behavior when the dataset type is not registered (exception vs. no results with diagnostics). Both of those possibilities (dead code and inconsistent behavior) are undesirable, but no results and exceptions are both better than wrong results, which is what we're getting on master now.

If we do decide that we want no results with diagnostics, as the behavior on master was supposed to be, then after fixing that (e.g. by catching the exception raised by queryDatasetTypes, which I think probably should raise) we'll want the original unit tests, and they'll pass without the workarounds added on this branch. If we want an exception instead, we should just replace that unit test code with a with self.assertRaises check; I don't think there's a scenario where the unit test workaround is what we want.
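
To make the two test shapes concrete, here is a hedged sketch against a toy query function. Nothing here is the real registry API: `ToyResults`, `query_datasets`, and the `strict` flag are hypothetical and exist only to show an exception check next to a "no results with diagnostics" check.

```python
import unittest


class ToyResults:
    """Hypothetical result object: rows plus human-readable diagnostics."""

    def __init__(self, rows, messages=()):
        self.rows = list(rows)
        self.messages = list(messages)

    def explain_no_results(self):
        return list(self.messages)


def query_datasets(registered, name, strict):
    """Toy query; `registered` maps dataset type names to their rows."""
    if name in registered:
        return ToyResults(registered[name])
    if strict:
        # Behavior A: an unregistered type is an error.
        raise LookupError(f"Dataset type {name!r} is not registered.")
    # Behavior B: an unregistered type yields no rows, with a reason attached.
    return ToyResults([], [f"Dataset type {name!r} is not registered."])


class UnregisteredTypeTests(unittest.TestCase):
    registered = {"raw": [1, 2]}

    def test_exception_behavior(self):
        with self.assertRaises(LookupError):
            query_datasets(self.registered, "nope", strict=True)

    def test_no_results_with_diagnostics(self):
        results = query_datasets(self.registered, "nope", strict=False)
        self.assertEqual(results.rows, [])
        self.assertIn("not registered", results.explain_no_results()[0])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UnregisteredTypeTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Whichever behavior is chosen, only one of these two tests would apply to a given call path; the sketch keeps both so they can be compared side by side.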

In the interest of getting this in the weekly, without leaving too much debt if a follow-up ticket takes time, I think I'd recommend just removing the now-failing test code and the UNIT_TEST_DATASET_TYPE logic added to support it. If the follow-up ticket changes the behavior to return no results instead of raising, we can get the original tests back from git. But if you need this to land somewhere beyond the weekly, please merge as-is and we'll try to make sure the follow-up doesn't take too long.

TallJimbo

@natelust and I pair-coded what we believe is a better solution to this (including better at passing downstream tests). We approve each other's contributions.

TallJimbo approved these changes Nov 2, 2021

natelust force-pushed the tickets/DM-32058 branch from 645f9c3 to f04e8f6 on November 3, 2021 17:44

TallJimbo force-pushed the tickets/DM-32058 branch from f04e8f6 to 3510615 on November 4, 2021 01:07

Fix data ID queries with unregistered dataset types.

336a321

queryDatasetTypes does not yield dataset types that aren't registered, but queryDataIds needs to recognize the absence of any dataset type as dooming the entire query.

Co-authored-by: Nate Lust <nlust@astro.princeton.edu>
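
As a rough sketch of the logic this commit message describes, assuming a toy model rather than the actual registry code: the lookup skips unregistered names, and the data ID query treats any skipped name as making the result empty instead of silently dropping that constraint. The function names, the dict-of-sets data model, and the diagnostics list are all hypothetical.

```python
def query_dataset_types(registered, names):
    """Yield only the names that are registered; skip the rest silently."""
    for name in names:
        if name in registered:
            yield name


def query_data_ids(registered, datasets):
    """Toy data ID query constrained by dataset existence.

    `registered` maps a dataset type name to the set of data IDs that have
    a dataset of that type. Returns a (data_ids, diagnostics) pair.
    """
    requested = list(datasets)
    found = list(query_dataset_types(registered, requested))
    missing = [name for name in requested if name not in found]
    if missing:
        # No data ID can have a dataset of an unregistered type, so the
        # query is doomed. Return nothing, with a reason, rather than
        # ignoring the constraint and returning unconstrained data IDs.
        return set(), [f"Dataset type {n!r} is not registered." for n in missing]
    # Start from every known data ID and keep those present for all types.
    result = set().union(*registered.values())
    for name in found:
        result &= registered[name]
    return result, []


registered = {"raw": {1, 2, 3}, "calexp": {2, 3}}
print(query_data_ids(registered, ["raw", "calexp"]))
print(query_data_ids(registered, ["raw", "nope"]))
```

In this toy version, forgetting the `missing` check would make the second call return the data IDs for `raw` alone, which is the kind of wrong result an ignored constraint produces.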

TallJimbo force-pushed the tickets/DM-32058 branch from 3510615 to 336a321 on November 4, 2021 01:14

TallJimbo approved these changes Nov 4, 2021

TallJimbo merged commit 86adc55 into master Nov 4, 2021

TallJimbo deleted the tickets/DM-32058 branch November 4, 2021 22:44
