DM-33963: Add resource-stats gathering task. #15

Merged
merged 13 commits into from Mar 25, 2022

TallJimbo
Member

No description provided.

@TallJimbo TallJimbo force-pushed the tickets/DM-33963 branch 4 times, most recently from 41fd792 to d2f13e5 Compare March 14, 2022 15:15
@TallJimbo TallJimbo changed the title Add resource-stats gathering task. DM-33963: Add resource-stats gathering task. Mar 14, 2022
Contributor

@natelust natelust left a comment

Consider adding type annotations, at least where convenient. Even if nothing checks it now, each bit added now potentially makes things easier in the future.

python/lsst/analysis/drp/gatherResourceStatistics.py (outdated)


class GatherResourceStatisticsTask(PipelineTask):
    """A `PipelineTask` that gathers resource usage statistics from task
Contributor

Should this be qualified with pipe_base, so that Sphinx does not try to link to the local import? I know there was a similar problem at some point, but I do not know Sphinx well enough to say.

Member Author

@TallJimbo TallJimbo Mar 24, 2022

I think if there is a from lsst.pipe.base import PipelineTask then just PipelineTask will work, but if you import it another way you need to qualify it. Another reason to like that import style!

python/lsst/analysis/drp/gatherResourceStatistics.py (outdated)
@TallJimbo TallJimbo force-pushed the tickets/DM-33963 branch 2 times, most recently from 56b8b3c to 02bb26e Compare March 21, 2022 14:31
@TallJimbo
Member Author

> Consider adding type annotations, at least where convenient. Even if nothing checks it now, each bit added now potentially makes things easier in the future.

I thought about it when I started, but I did not want to impose my preferences on a package where most of the work is done by other developers. After the experience in middleware, I have also grown wary of adding annotations without checking them, because they are easy to get wrong and a bad annotation is often worse than no annotation.

@natelust
Contributor

On the type annotations, I see what you are saying, and I am happy for you to leave them out if you prefer. However, I am not sure I buy that argument: there is no more danger than the docstring being wrong, which is also not checked or really enforced. At least with type annotations, an author is more likely to update them if an interface changes, by virtue of their being right next to the argument, and they are useful when cross-checking the docstring.
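As an illustration of the point about annotations sitting next to the arguments, here is a minimal sketch. The function `extract_max_rss` and its parameters are hypothetical and are not taken from this pull request; the sketch only shows an annotated signature alongside a numpydoc-style docstring so the two can be cross-checked.

```python
from typing import Mapping


def extract_max_rss(metadata: Mapping[str, float], key: str = "max_rss") -> float:
    """Return one resource-usage value from a metadata mapping.

    Hypothetical helper used only for illustration.

    Parameters
    ----------
    metadata : `Mapping` [`str`, `float`]
        Mapping from statistic name to value.
    key : `str`, optional
        Name of the statistic to extract.

    Returns
    -------
    value : `float`
        The requested statistic.
    """
    # The annotation and the docstring describe the same types, so a
    # reviewer can compare them when the interface changes.
    return float(metadata[key])


value = extract_max_rss({"max_rss": 12.5})
```

If the signature later changes (for example, `key` becomes a list of names), the stale annotation is on the same line as the edit, while the docstring entry is several lines away.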

@TallJimbo TallJimbo marked this pull request as ready for review March 22, 2022 14:09
bin.src/build-gather-resource-usage-qg
python/lsst/analysis/drp/gatherResourceUsage.py (outdated)
    )

    def __init__(self, *, config):
        super().__init__(config=config)
Contributor

I think you should validate that the names were actually overridden here; otherwise PLACEHOLDER gets registered into butler repos if people run with --register-dataset-types, which I think runs before the graph builder to ensure that the types being registered are known to the butler.

Member Author

👍 Dataset type registration always runs after GraphBuilder, but worth doing anyway.

Contributor

Is that true? I thought there was a situation the graph builder could get into if the types didn't exist yet. (I know we fixed some bug related to that, but I thought there was additional undefined behavior from not registering first in some cases.)

Member Author

There have certainly been lots of bugs and tricky behavior in that area, but we have always been consistent about not modifying the data repository at all until after QuantumGraph generation; I think we may even use a read-only butler at first to guarantee this.
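The validation suggested at the start of this thread could look roughly like the sketch below. It is written in plain Python so that it runs without the LSST stack; the `PLACEHOLDER` value matches the name mentioned in the review, but `validate_names` and the keyword names passed to it are hypothetical stand-ins, not the code that was merged.

```python
PLACEHOLDER = "PLACEHOLDER"


def validate_names(**names: str) -> None:
    """Raise if any configured name still has its placeholder default.

    Hypothetical helper: each keyword is a config field name and its
    value is the dataset type name currently configured for it.
    """
    unset = sorted(field for field, value in names.items() if value == PLACEHOLDER)
    if unset:
        raise ValueError(
            f"Config fields {unset} must be overridden; "
            f"the default {PLACEHOLDER!r} is not a usable dataset type name."
        )


# Overridden names pass silently.
validate_names(input_name="isr_metadata", output_name="isr_resource_usage")

# A name left at its default is rejected before anything is registered.
try:
    validate_names(input_name="isr_metadata", output_name=PLACEHOLDER)
except ValueError as err:
    error_message = str(err)
```

Calling such a check from the task's `__init__` would fail early, whichever order registration and graph building run in.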

python/lsst/analysis/drp/gatherResourceUsage.py (outdated)
                outputs=outputs,
            )
            quanta_by_task_def[task_def] = {quantum}
        return QuantumGraph(quanta_by_task_def)
Contributor

Do you want to check for cycles here, or subset to connected components and verify that you have the expected number of independent chains? Possibly not, since I think you have just a bunch of independent quanta, but I wanted to raise it so it would be on your mind.

Member Author

Should always be a bunch of independent quanta.
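If a check were wanted anyway, the independence claim could be verified along the lines of the sketch below. It does not use the real `QuantumGraph` API; `SimpleQuantum` is a hypothetical stand-in that records only the names of each quantum's input and output datasets, and the dataset names are invented for the example.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Tuple


class SimpleQuantum(NamedTuple):
    """Hypothetical stand-in for a quantum: a label plus dataset names."""

    label: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def find_dependencies(quanta: Iterable[SimpleQuantum]) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs; an empty list means independence."""
    quanta = list(quanta)
    # Map each output dataset to the quantum that produces it.
    producers = {name: q.label for q in quanta for name in q.outputs}
    edges = []
    for q in quanta:
        for name in sorted(q.inputs):
            if name in producers:
                edges.append((producers[name], q.label))
    return edges


independent = [
    SimpleQuantum("isr_usage", frozenset({"isr_metadata"}), frozenset({"isr_resource_usage"})),
    SimpleQuantum("calibrate_usage", frozenset({"calibrate_metadata"}), frozenset({"calibrate_resource_usage"})),
]
chained = independent + [
    SimpleQuantum("summary", frozenset({"isr_resource_usage"}), frozenset({"summary_table"})),
]

independent_edges = find_dependencies(independent)
chained_edges = find_dependencies(chained)
```

With no edges at all, there can be no cycles, and every quantum is its own connected component.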

n_rows = len(handles_by_data_id)
# Create a dict of empty column arrays that we'll ultimately make into
# a table.
columns = {
Contributor

Why use this dict of numpy arrays rather than a recarray?

Member Author

pandas.DataFrame is column-oriented, so it is effectively a transpose of a recarray and pretty much identical to a dict of numpy arrays under the hood. It may still make a copy when I convert the dict to a DataFrame, but at least it won't need to do a transpose.
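A small sketch of the two layouts being compared, assuming numpy and pandas are installed. The column names (`quantum_id`, `max_rss`) and the values are hypothetical; only the pattern of preallocating one array per column and filling it row by row is taken from the snippet under review.

```python
import numpy as np
import pandas as pd

n_rows = 3

# Column-oriented: one preallocated array per column, filled row by row.
columns = {
    "quantum_id": np.zeros(n_rows, dtype=np.int64),
    "max_rss": np.zeros(n_rows, dtype=np.float64),
}
for i in range(n_rows):
    columns["quantum_id"][i] = i
    columns["max_rss"][i] = 100.0 * (i + 1)
table_from_dict = pd.DataFrame(columns)

# Row-oriented alternative: a structured (record) array holding the same data.
records = np.zeros(n_rows, dtype=[("quantum_id", np.int64), ("max_rss", np.float64)])
for i in range(n_rows):
    records[i] = (i, 100.0 * (i + 1))
table_from_records = pd.DataFrame.from_records(records)
```

Both routes yield the same table; the difference is only in how the data are laid out in memory before the conversion.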

handle : `lsst.daf.butler.DeferredDatasetHandle`
    Butler handle for the metadata dataset; used to identify the
    metadata in diagnostic messages only.
warned_about_metadata_version : `bool`
Contributor

For some reason I thought warnings themselves could be configured to be emitted only once.

Member Author

If so, I'm not familiar with it, and the message does include the data ID, so the messages wouldn't be identical anyway.
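For reference, Python's standard `warnings` module does have a `"once"` filter action, but it deduplicates on the message text and category, so it fits the reply above: messages that embed a data ID would each still be shown. The sketch below assumes the `warnings` module is the mechanism in question (the task may well use `logging` instead), and the message wording is invented for the example.

```python
import warnings


def emit(messages):
    """Emit each message as a UserWarning and return those actually shown."""
    with warnings.catch_warnings(record=True) as caught:
        # "once": show only the first occurrence of each distinct message.
        warnings.simplefilter("once")
        for message in messages:
            warnings.warn(message, UserWarning)
    return [str(w.message) for w in caught]


# Three identical messages collapse to a single warning.
shown_identical = emit(["unexpected metadata version"] * 3)

# Messages that embed a (hypothetical) data ID differ, so all three appear.
shown_distinct = emit(
    [f"unexpected metadata version for {{visit: {visit}}}" for visit in (1, 2, 3)]
)
```

This is why the task carries its own `warned_about_metadata_version` flag: the suppression has to be keyed on the kind of problem rather than on the full message text.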

The tricky part here is the dynamic aspect of the connections. This
looks like it'll work, but hasn't been tested.
The actual CLI script will be going into drp_pipe, because the intent
is to only run it inside the SCons scripts there.
This drops the pipeline-building stuff; the pipelines it generated were
just too fragile, because they embedded the labels of other tasks in
this task's config.
It now calls three private methods.
@TallJimbo TallJimbo merged commit 3ef4a34 into main Mar 25, 2022
@TallJimbo TallJimbo deleted the tickets/DM-33963 branch March 25, 2022 19:15

3 participants