
Conversation

kfindeisen (Member)

This PR does a lot of refactoring of upload.py to make it easier to edit and more flexible with its inputs, then adds partial support for "observing" preloaded raw files. Because some of the raw metadata is hardcoded into the get_samples function, this script will break if any other raws are uploaded to rubin-prompt-proto-unobserved.

Because of the heavy refactoring on the first part of the ticket, I recommend reviewing this PR one or a few commits at a time; the net diff combines all the changes into one big rewrite.

@kfindeisen kfindeisen requested a review from parejkoj March 17, 2022 18:46

@ktlim ktlim (Contributor) left a comment


I'm not exactly sure why the uploader needs to be so dynamic, but I'll let John figure that out. The rest looks fine. At some point we need to straighten out what's a group and what's a visit, but the distinction only matters in specialized cases.

ra=hsc_metadata[exposure_id]["ra"],
dec=hsc_metadata[exposure_id]["dec"],
kind="SURVEY",
)

Contributor

We're going to need to add rot here and also generate a fixed or random one above.
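
A minimal sketch of that suggestion, assuming a ``rot`` keyword in degrees; the ``Visit`` stand-in and the ``make_visit`` helper below are illustrative only, not the actual activator.Visit definition or the script's code:

    # Sketch only: generate a fixed or random rotator angle alongside the
    # ra/dec lookup and pass it through as ``rot``.
    import dataclasses
    import random

    @dataclasses.dataclass(frozen=True)
    class Visit:  # stand-in for the real Visit type
        ra: float
        dec: float
        rot: float
        kind: str

    def make_visit(exposure_id, hsc_metadata):
        return Visit(
            ra=hsc_metadata[exposure_id]["ra"],
            dec=hsc_metadata[exposure_id]["dec"],
            rot=random.uniform(0.0, 360.0),  # or a hardcoded angle
            kind="SURVEY",
        )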

@kfindeisen (Member, Author)

I'm not exactly sure why the uploader needs to be so dynamic

Because, for me, modifying the existing code was easier than trying to figure out brand-new behavior (e.g., CLI) from scratch.

The client that posts ``next_visit`` messages.
bucket : `google.cloud.storage.Bucket`
The bucket to which to transfer the raws, once observed.
visit_infos : `set` [`activator.Visit`]

Contributor

I'm a bit nervous calling these "visit_infos", since VisitInfo is a different thing that doesn't align with this very well (though maybe we should make it do so?). Why not just visits?

Member Author

Because they're not visits either (in any of the senses of the word)! 😭

A name like "_info" feels natural to me, because these are blocks of metadata rather than identifiers (like visit in the Butler context). I'm willing to be flexible on the first part, though.


# TODO: may be cleaner to use a functor object than to depend on
# closures for the bucket and data.
def upload_dummy(visit, snap_id):

Contributor

Maybe upload_faked_files instead of dummy? Or does that not even upload files at all, so just upload_faked_visits_no_files (which is too long, but...)

Member Author

This does, in fact, upload a file -- most of the contents of -main were uploaded using this code (pre-refactor, of course).
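
For reference, the functor-object alternative mentioned in the TODO above might look roughly like this; the class name, the blob path, and the ``visit.group`` attribute are assumptions for illustration, not the script's actual behavior:

    class DummyUploader:
        """Callable that uploads a placeholder raw for one snap of one visit.

        Holding the bucket and the file as instance state replaces the
        closure over module-level variables.
        """

        def __init__(self, bucket, filename):
            self.bucket = bucket      # a google.cloud.storage.Bucket
            self.filename = filename  # local path of the file to upload

        def __call__(self, visit, snap_id):
            # Same signature as upload_dummy, so it can be swapped in
            # wherever the closure was passed; the blob path is illustrative.
            blob = self.bucket.blob(f"{visit.group}/{snap_id}.fits")
            blob.upload_from_filename(self.filename)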

instrument : `str`
The short name of the active instrument.
date : `str`
The current date in YYYYMMDD format.

Contributor

Not a datetime or astropy date?

Member Author

No. This is part of the original group generation code, and I don't want to touch it because then I'd need to come up with a different scheme for avoiding collisions between uploaded files.

This format is assumed by several other files.
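
For concreteness, a date string in that format can be produced with standard-library formatting; whether the script uses UTC or local time is not shown here, so treat the timezone choice below as an assumption:

    from datetime import datetime, timezone

    # YYYYMMDD, e.g. "20220317"; per the discussion above, the string feeds
    # into group generation to keep uploads from colliding.
    date = datetime.now(timezone.utc).strftime("%Y%m%d")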

This change moves the detector-handling logic from process_group to main, where it will be easier to customize for real test datasets.

This version of the code has a hardcoded dependency on specific HSC files, which will need to be removed later.

The uploaded files weren't conforming to the Prompt Processing path schema. Without this fix, we would need to delete old files from the main bucket on every pass.

Current HSC raws use detectors 50 and 51 only.
@kfindeisen kfindeisen merged commit a6b8322 into main Mar 17, 2022
@kfindeisen kfindeisen deleted the tickets/DM-33970 branch March 17, 2022 22:14