DM-43109: Add update-day-obs command #36

Merged
merged 2 commits into from
Mar 12, 2024

@timj (Member) commented Mar 5, 2024

Checklist

  • added documentation for a new migration script

codecov bot commented Mar 5, 2024

Codecov Report

Attention: Patch coverage is 17.18750%, with 53 lines in your changes missing coverage. Please review.

Project coverage is 52.39%. Comparing base (8e3643d) to head (7a9c064).

Files                                                    Patch %   Lines
...n/lsst/daf/butler_migrate/script/update_day_obs.py    17.85%    46 Missing ⚠️
python/lsst/daf/butler_migrate/cli/cmd/commands.py       0.00%     5 Missing ⚠️
...ython/lsst/daf/butler_migrate/cli/opt/arguments.py    0.00%     2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #36      +/-   ##
==========================================
- Coverage   54.35%   52.39%   -1.97%     
==========================================
  Files          33       34       +1     
  Lines        1148     1212      +64     
  Branches      255      271      +16     
==========================================
+ Hits          624      635      +11     
- Misses        490      543      +53     
  Partials       34       34              

@andy-slac (Collaborator) left a comment

Looks good. For scalability it may be better to move all visit logic inside the exposure batch loop, but it's up to you.

visit_defs = butler.registry.queryDimensionRecords(
    "visit_definition",
    where="exposure in (exps)",
    bind={"exps": list(exposures_to_be_updated)},
Collaborator

This potentially needs to be in batches too, but if you are sure that it's not going to be large then it's OK.

Member Author

I've been assuming that queryDimensionRecords would be clever enough to do the batching internally if the list is too long, but if you think that's going to be problematic I can chunk it externally.

Collaborator

My concern here is that WHERE exposure in (...) will become very long, and we do not batch that expression internally.

Member Author

I will chunk. I think we do batch internally in some cases and there might even be a ticket for doing it with dimension records.
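
For readers following the thread, external chunking of the kind discussed here could look roughly like the sketch below. It is not the code that was merged: `chunked`, `query_visit_definitions`, and the default `chunk_size` of 1000 are hypothetical names and values chosen for illustration, and `query` stands in for `butler.registry.queryDimensionRecords`, called with the same `where` and `bind` arguments as the snippet under review. Any other arguments the real call needs are omitted.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice
from typing import Any


def chunked(values: Iterable[Any], size: int) -> Iterator[list[Any]]:
    """Yield successive lists of at most ``size`` items from ``values``."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(values)
    # islice returns an empty list once the iterator is exhausted,
    # which ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


def query_visit_definitions(
    query: Callable[..., Iterable[Any]],
    exposures: Iterable[int],
    chunk_size: int = 1000,  # assumed value, not taken from the PR
) -> list[Any]:
    """Run one query per chunk so each ``IN (...)`` list stays short.

    ``query`` is a stand-in for ``butler.registry.queryDimensionRecords``.
    """
    records: list[Any] = []
    # Deduplicate and sort so the chunks are deterministic.
    for chunk in chunked(sorted(set(exposures)), chunk_size):
        records.extend(
            query(
                "visit_definition",
                where="exposure in (exps)",
                bind={"exps": chunk},
            )
        )
    return records
```

With a real butler this would be called as `query_visit_definitions(butler.registry.queryDimensionRecords, exposures_to_be_updated)`. Passing the query function in as an argument keeps the sketch runnable without a repository; the trade-off of chunking is more round trips to the database in exchange for a bounded expression length.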

This implementation is a proof of concept that uses standard
butler APIs to do the updates. It may be too slow for large
repositories.
@timj merged commit 793c232 into main on Mar 12, 2024
10 of 12 checks passed
@timj deleted the tickets/DM-43109 branch on March 12, 2024 16:45