hsinfang
Collaborator

This is the first part of this ticket: getting test data into s3://rubin:rubin-pp-users/central_repo

@hsinfang
Collaborator Author

An obvious problem here is that some of the export scripts are quite similar, but I haven't bothered to consolidate them. In particular, this new script and make_hsc_rc2_export.py are very close, and I probably should combine them into one.

@hsinfang hsinfang requested a review from kfindeisen September 15, 2023 04:31
Member

@kfindeisen kfindeisen left a comment

The initial script looks very clean and clear. However, I don't think I understand the --target-repo argument or the code that depends on it. Could you please clarify how it works, and when the caller might opt not to use it?

@hsinfang
Collaborator Author

This script makes an export.yaml file, which may be used either to populate an empty, freshly created butler repo or to add more datasets to an existing repo. My motivation for adding the --target-repo argument was the second case: I wanted to skip the datasets that already exist in the target repo (s3://rubin:rubin-pp-users/central_repo, to be precise). It turns out that, for what I'm doing in this ticket, the only existing datasets are refcats. That's expected; the script as it was at the time was used to verify that.
I think I'm struggling a bit between making the script more generic and tailoring it to the very specific usage I have in mind. The script already assumes a lot about the data, so maybe checking every dataset doesn't make sense.
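
To make the --target-repo behavior concrete, here is a rough sketch of the selection logic only, not the actual script. `DatasetKey` and `select_for_export` are hypothetical names, and plain Python collections stand in for the butler registry queries; the dataset type names and data IDs are made up for illustration.

```python
from typing import Iterable, NamedTuple, Optional


class DatasetKey(NamedTuple):
    """Hypothetical stand-in for a butler dataset reference."""

    dataset_type: str
    collection: str
    data_id: str


def select_for_export(
    source: Iterable[DatasetKey],
    target: Optional[Iterable[DatasetKey]] = None,
) -> list[DatasetKey]:
    """Return the source datasets that are not already in the target.

    With no target, behave as if the target were an empty repo,
    so every source dataset is selected.
    """
    existing = set(target) if target is not None else set()
    return [key for key in source if key not in existing]


# Illustrative (made-up) contents of the source and target repos.
source = [
    DatasetKey("refcat", "refcats", "htm7=1"),
    DatasetKey("skyMap", "skymaps", "latiss_v1"),
    DatasetKey("template", "LATISS/templates", "tract=0"),
]
target = [DatasetKey("refcat", "refcats", "htm7=1")]

to_export = select_for_export(source, target)  # refcat is skipped
everything = select_for_export(source)  # no target: nothing is skipped
```

A caller would omit the target when populating a fresh repo, and pass it when topping up an existing one.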

@hsinfang hsinfang force-pushed the tickets/DM-37387 branch 6 times, most recently from 4c834ca to 8651555 Compare September 19, 2023 16:39
@hsinfang
Collaborator Author

@kfindeisen could you please take another look? Thanks to your feedback, I think the script looks better now. It still assumes some peculiarities of /repo/embargo for those datasets and collections, but the script could probably be adapted relatively easily when we want to do something similar in the future.

Member

@kfindeisen kfindeisen left a comment

Looks good, thanks! Minor comments only.

@hsinfang hsinfang force-pushed the tickets/DM-37387 branch 2 times, most recently from 0e92228 to 07c3347 Compare September 19, 2023 23:20
This is used to export templates, refcats, skymap, and calib datasets
from /repo/embargo to s3://rubin:rubin-pp-users/central_repo/. This
also exports the LATISS/calib and LATISS/templates chains from
/repo/embargo.
If no target repo is given, make a temporary empty butler repo,
so that all selected datasets are exported.
@hsinfang hsinfang merged commit 3a8e392 into main Sep 19, 2023
@hsinfang hsinfang deleted the tickets/DM-37387 branch September 19, 2023 23:38