Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-28964: add incremental progress reporting to daf_butler #479

Merged
merged 7 commits into from Mar 18, 2021

Conversation

TallJimbo
Copy link
Member

This creates an abstract interface to allow at least daf_butler and some downstream middleware code to report progress in a way that would let higher-level tools show progress bars. It also makes a start at letting click-based command-line tools actually show those progress bars, but that's disabled by default as I'm not sure how it interacts with logging.

I haven't actually instrumented too many points in daf_butler - just a few known hot spots, like datastore transfers and loops over expandDataId in Registry - and there may not be much more we can do there, because we've actually done a pretty good job of vectorizing things, and that means a lot of slow operations are single commands from our perspective (and there's no way to e.g. get incremental progress on a query from the database). But if the middleware pundits are happy with the approach I've taken here, I'll go ahead and instrument at least BaseSkyMap.register, RawIngestTask, and ConvertRepoTask (on this ticket), where we certainly have lots of loops that would benefit from this kind of reporting.

@TallJimbo
Copy link
Member Author

One really obvious flaw here is that the tests are failing, and it's because the global progress-handler state I've introduced interacts poorly with pytest-xdist's multithreading. I thought had fixed that (and indeed, cannot reproduce this locally anymore) by using ContextVar to wrap the global, but maybe I misunderstood how that interacts with threading or how pytest-xdist actually works. In any case, I'm at a bit of loss for how best to isolate this test from all of the others, and I'm hoping someone has a suggestion.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user interface looks fine. Given this is a "is this okay to continue" review request I'm not looking in detail. As for xdist, it forks a new python proces and is guaranteed to run setUpClass and setUp. It probably also runs setup_module. It's all meant to be distinct so I'm not sure what's going on.

python/lsst/daf/butler/datastores/fileDatastore.py Outdated Show resolved Hide resolved
@TallJimbo
Copy link
Member Author

As for xdist, it forks a new python process and is guaranteed to run setUpClass and setUp. It probably also runs setup_module. It's all meant to be distinct so I'm not sure what's going on.

Huh, that is at least not what I thought it did, so it will help me debug. It seems like it would have to be running other tests in the same process, between one TestCase's setUp and test invocation. But I should look more closely; maybe I missed a way this could happen just because I'm missing a tearDown.

@timj
Copy link
Member

timj commented Mar 1, 2021

I'm 99.9% sure that pytest guarantees that setUp runs immediately before the test is executed and then tearDown runs immediately after the test has completed. All the problems I've had with xdist relate to access to external resources like files.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay. My main comment is how this interacts with lsst.log but since you use it in obs_base it seems like it's all working fine.

general the name of a group of progress bars, not necessarily as single
one, and it should have the same form as a logger names: the
fully-qualified name of the class, method, or function, minus the
initial "lsst" (and without any "implemntation detail" packages or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minus the inital "lsst" if outside of daf_butler....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just remove the parenthetical.

same name and level as ``self`` is enabled.
"""
if self._active_handler is not None:
logger = logging.getLogger(self._name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work for obs_base which uses lsst.log and not logging?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it works to set the level for a "logger" that really corresponds to a progress bar via lsst.log; I used logging to do that (while still using lsst.log for actual logs). I think that's okay, because I'm really just using the logging system as a way to get a global mapping of name->level, but it probably is worth documenting.

python/lsst/daf/butler/core/progress.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/progress.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/progress.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/progress.py Outdated Show resolved Hide resolved
tests/test_progress.py Outdated Show resolved Hide resolved
No low-level code has been implemented with this tooling yet, but once
it is, high-level code will be able to produce progress bars and
possibly other displays of how long various slow operations are taking.

This does include an option to turn on progress via a Click-based
progress bar in the butler command, but progress displays are off by
default even there.
@TallJimbo TallJimbo merged commit 386f0bb into master Mar 18, 2021
@TallJimbo TallJimbo deleted the tickets/DM-28964 branch March 18, 2021 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants