-
Notifications
You must be signed in to change notification settings - Fork 0
DM-33970: Use real images in upload.py #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly sure why the uploader needs to be so dynamic, but I'll let John figure that out. The rest looks fine. At some point we need to straighten out what's a group and what's a visit, but the distinction only matters in specialized cases.
ra=hsc_metadata[exposure_id]["ra"], | ||
dec=hsc_metadata[exposure_id]["dec"], | ||
kind="SURVEY", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going to need to add rot
here and also generate a fixed or random one above.
Because, for me, modifying the existing code was easier than trying to figure out brand-new behavior (e.g., CLI) from scratch. |
The client that posts ``next_visit`` messages. | ||
bucket : `google.cloud.storage.Bucket` | ||
The bucket to which to transfer the raws, once observed. | ||
visit_infos : `set` [`activator.Visit`] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit nervous calling these "visit_infos", since VisitInfo
is a different thing that doesn't align with this very well (though maybe we should make it do so?). Why not just visits
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because they're not visits either (in any of the senses of the word)! 😭
A name like "_info" feels natural to me, because these are blocks of metadata rather than identifiers (like visit
in the Butler context). I'm willing to be flexible on the first part, though.
python/tester/upload.py
Outdated
|
||
# TODO: may be cleaner to use a functor object than to depend on | ||
# closures for the bucket and data. | ||
def upload_dummy(visit, snap_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe upload_faked_files
instead of dummy? Or does that not even upload files at all, so just upload_faked_visits_no_files
(which is too long, but...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does, in fact, upload a file -- most of the contents of -main
were uploaded using this code (pre-refactor, of course).
instrument : `str` | ||
The short name of the active instrument. | ||
date : `str` | ||
The current date in YYYYMMDD format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a datetime or astropy date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. This is part of the original group generation code, and I don't want to touch it because then I'd need to come up with a different scheme for avoiding collisions between uploaded files.
This format is assumed by several other files.
This change moves the detector-handling logic from process_group to main, where it will be easier to customize for real test datasets.
This version of the code has a hardcoded dependency on specific HSC files, which will need to be removed later.
The uploaded files weren't conforming to the Prompt Processing path schema.
Without this fix, we would need to delete old files from the main bucket on every pass.
Current HSC raws use detectors 50 and 51 only.
4661aae
to
c96d1cb
Compare
c96d1cb
to
9c99605
Compare
This PR does a lot of refactoring of
upload.py
to make it easier to edit and more flexible with its inputs, then adds partial support for "observing" preloaded raw files. Because some of the raw metadata is hardcoded into theget_samples
function, this script will break if any other raws are uploaded torubin-prompt-proto-unobserved
.Because of the heavy refactoring on the first part of the ticket, I recommend reviewing this PR one or a few commits at a time; the net diff combines all the changes into one big rewrite.