Skip to content

Conversation

kfindeisen
Copy link
Member

This PR provides scripts for distilling ap_verify_ci_hits2015 into only the data we want to put into the Prompt Processing "initial" repo. Note that make_remote_butler.py can only be run on Google Cloud.

@kfindeisen kfindeisen requested a review from ktlim March 14, 2022 20:19

args = _make_parser().parse_args()
seed_config = lsst.daf.butler.Config(args.seed_config)
logging.info("Creating repository at %s...", args.target_repo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the make_local_butler.py file exist? Isn't that identical to this version if the default value for --target-repo was "."? Have you considered making this command a butler pluggable command so that you could make use of the normal butler --help click infrastructure? (it's butler create + butler import).

Copy link
Member Author

@kfindeisen kfindeisen Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe that integrating what should be a local, highly specific script into the butler ecosystem would be beneficial, even if we had the time to do so.

We are deliberately not using the butler command-line itself (i.e., this is not a shell script) because we need to understand what our overheads are, and the butler utility loses a lot of time on package import and Butler setup.

Copy link
Contributor

@ktlim ktlim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Tim's comments as well.

"""
butler = daf_butler.Butler(repo)
with butler.export(format="yaml") as contents:
# Need all detectors, even those without data, for visit definition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this comment. I'd think we could pretend for the purposes of any given repo that a camera has a subset of its actual detectors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something carried over from ap_verify. I don't know if it is still true, but visit definition used to fail if any of the detector IDs in the instrument (camera?) definition were missing from the registry.

The file can be used to create a reproducible dump of any ap_verify
dataset's preloaded repo.
@kfindeisen kfindeisen force-pushed the tickets/DM-33937 branch 3 times, most recently from cff1146 to 8afbfaa Compare March 16, 2022 19:09
@kfindeisen kfindeisen merged commit fdd5447 into main Mar 16, 2022
@kfindeisen kfindeisen deleted the tickets/DM-33937 branch March 16, 2022 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants