
Conversation

kfindeisen (Member)

This PR reorganizes the raw file upload in upload_hsc_rc2.py to use a process pool that runs in parallel with the main "exposure-taking" thread. As a result, I/O is no longer a bottleneck for this program; it now runs in the time needed to perform the virtual slew and exposure operations (plus a few seconds for the last visit's upload).

This factoring gives the program more flexibility in how upload is
handled for large numbers of files.
This change allows the bucket to be re-initialized for each process
(it's not picklable), without being re-initialized for each task
(it's expensive).
upload_hsc_rc2.py needs to modify and upload roughly a hundred images
for each visit, and doing so serially is much too slow -- a full visit
takes 1-2 minutes, which is slower than the complete system is
supposed to run.
Experiments show that the main overhead comes from initializing the
process pool itself, not from initializing the bucket. A non-optimal
chunk size adds another 50-100% of overhead.
This change minimizes the pool startup overhead, by making sure it's
initialized exactly once.
Pool startup time increases with process count, but doing it in
parallel with the "exposure" makes it a non-issue. Processing time is
very sensitive to chunk size in non-obvious ways.
kfindeisen requested a review from hsinfang on September 8, 2023 18:05
kfindeisen marked this pull request as ready for review on September 8, 2023 18:05
This lets the uploader(s) run in parallel with the fake observations,
eliminating overhead and letting the exposures be generated at exactly
the advertised cadence.
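
To make the pool-initializer pattern from these commits concrete, here is a minimal sketch, assuming an S3-style bucket; the helper names (_init_worker, _upload_one, observe) and paths are illustrative, not upload_hsc_rc2.py's actual API:

    # Minimal sketch of the pattern described in the commits above; the
    # names here are illustrative, not the script's actual API.
    import math
    import multiprocessing
    import time

    import boto3

    _bucket = None  # per-process global, created once by the pool initializer


    def _init_worker(bucket_name):
        # Bucket objects aren't picklable, so each worker builds its own --
        # once per process, rather than once per (expensive) task.
        global _bucket
        _bucket = boto3.resource("s3").Bucket(bucket_name)


    def _upload_one(task):
        path, key = task
        _bucket.upload_file(path, key)
        return key


    def observe(visits, bucket_name):
        max_processes = math.ceil(0.25 * multiprocessing.cpu_count())
        # Create the pool exactly once; its startup cost, which grows with
        # the process count, then overlaps with the first slew/exposure.
        with multiprocessing.Pool(max_processes, _init_worker, (bucket_name,)) as pool:
            pending = None
            for visit in visits:
                time.sleep(2.0)  # stand-in for the virtual slew and exposure
                uploads = [(f"/tmp/{visit}_{i}.fits", f"raw/{visit}_{i}.fits")
                           for i in range(100)]  # ~100 images per visit
                # chunksize is a real tuning knob: a poor value added
                # 50-100% overhead in the experiments described above.
                pending = pool.map_async(_upload_one, uploads, chunksize=4)
            if pending is not None:
                pending.wait()  # only the last visit's upload blocks here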
hsinfang (Collaborator) left a comment


Looks good to me and the speedup is awesome!

I was a bit surprised that boto3 doesn't have a built-in multithreading/multiprocessing option, but a quick search seems to confirm that, and I'm sure you looked into it even more. The client, unlike resource, is thread safe, but it's not clear to me whether we could refactor the script to use just the client, or whether that would help.

            The datasets to upload
        """
        try:
            max_processes = math.ceil(0.25*multiprocessing.cpu_count())
hsinfang (Collaborator)

Curious why you chose 0.25 here?

kfindeisen (Member, Author)

It was pretty arbitrary -- I didn't want to use a large fraction of shared resources, even if it would only be in bursts of ~15 seconds. On the current rubin-devl, max_processes = 32.

kfindeisen (Member, Author) commented Sep 11, 2023

> I was a bit surprised that boto3 doesn't have a built-in multithreading/multiprocessing option, but a quick search seems to confirm that, and I'm sure you looked into it even more. The client, unlike resource, is thread safe, but it's not clear to me whether we could refactor the script to use just the client, or whether that would help.

Even if we could, we might not want to use it, since thread-safe objects slow down processing even if they're not shared. The deciding issue was not any explicit parallel-processing support, but the fact that Bucket objects aren't picklable.
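
For reference, the client-only refactor discussed here might look roughly like the sketch below. This is a hedged illustration, not the script's code; and per the comment above, the unpicklable Bucket (plus the per-image modification work, which is CPU-bound) is what pushed the script toward processes instead.

    # Hedged sketch of the client-based alternative: boto3's low-level
    # client is documented as thread safe, so one instance can be shared
    # across threads (names and paths here are illustrative).
    from concurrent.futures import ThreadPoolExecutor

    import boto3

    client = boto3.client("s3")  # shareable across threads, unlike resource/Bucket


    def upload_one(task):
        path, bucket_name, key = task
        client.upload_file(path, bucket_name, key)


    tasks = [("/tmp/img0.fits", "my-bucket", "raw/img0.fits")]  # illustrative
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(upload_one, tasks))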

The previous cycle was slew, send next_visit, expose, upload. The
sequence next_visit, slew, expose, upload is more representative of
real observing procedures.
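
As a sketch, the reordered per-visit cycle looks like this; the stubs stand in for the script's real steps and are not its actual function names:

    # Sketch of the reordered per-visit cycle with hypothetical stubs.
    def send_next_visit(visit): ...  # announce the visit first, matching real observing
    def slew(visit): ...
    def expose(visit): ...
    def upload_async(visit): ...     # hand off to the process pool

    for visit in ["visit-1", "visit-2"]:
        send_next_visit(visit)  # was second in the old cycle
        slew(visit)
        expose(visit)
        upload_async(visit)     # runs in parallel with the next iteration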
