DM-33963: Add resource-stats gathering task. #15

Merged

TallJimbo merged 13 commits into main from tickets/DM-33963

Mar 25, 2022

Member

TallJimbo commented Mar 7, 2022

No description provided.

TallJimbo force-pushed the tickets/DM-33963 branch from 07d00db to 4d2302b Compare

March 7, 2022 19:59

TallJimbo commented

timj reviewed

TallJimbo force-pushed the tickets/DM-33963 branch 4 times, most recently from 41fd792 to d2f13e5 Compare

March 14, 2022 15:15

TallJimbo changed the title ~~Add resource-stats gathering task.~~ DM-33963: Add resource-stats gathering task.

TallJimbo force-pushed the tickets/DM-33963 branch from 6de0042 to be06824 Compare

March 14, 2022 17:47

natelust approved these changes

Contributor

natelust left a comment

Consider adding type annotations, at least where convenient. Even if nothing checks it now, each bit added now potentially makes things easier in the future.

python/lsst/analysis/drp/gatherResourceStatistics.py Outdated



    class GatherResourceStatisticsTask(PipelineTask):
        """A `PipelineTask` that gathers resource usage statistics from task

Contributor

natelust Mar 14, 2022

Should this be qualified to pipe_base, so that it does not try to link to the local import? I know there was some similar problem at some point, but I am not good enough at Sphinx to know.

Member Author

TallJimbo Mar 24, 2022 •

edited

I think if there is a `from lsst.pipe.base import PipelineTask` then just `PipelineTask` will work, but if you import it another way you need to qualify it. Another reason to like that import style!

TallJimbo force-pushed the tickets/DM-33963 branch 2 times, most recently from 56b8b3c to 02bb26e Compare

March 21, 2022 14:31

Member Author

TallJimbo commented Mar 21, 2022

> Consider adding type annotations, at least where convenient. Even if nothing checks it now, each bit added now potentially makes things easier in the future.

I thought about it when I started, but I did not want to start to impose my preferences on a package where most of the work is done by other developers, and after the experience in middleware I've grown wary of adding them without checking, because it's easy to get them wrong and a bad annotation is often worse than no annotation.

Contributor

natelust commented Mar 21, 2022

On the type annotations, I see what you are saying, and I am happy for you to leave them out if you prefer. However, I am not quite sure I buy that argument: there is no more danger than the docstring being wrong, which is also something that is not checked or really enforced. At least with type annotations, an author is more likely to update them if an interface changes, by virtue of their being right next to the argument, and they are useful when cross-checking the docstring.
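To illustrate the point about annotations sitting right next to the arguments they describe, here is a minimal sketch. The function `peak_memory_mib` and the `max_rss_bytes` key are hypothetical and are not taken from this PR; the docstring follows the numpydoc layout only as an example.

```python
from typing import Mapping, get_type_hints


def peak_memory_mib(metadata: Mapping[str, float], key: str = "max_rss_bytes") -> float:
    """Return a memory metric converted from bytes to MiB.

    Hypothetical helper for illustration only; not part of this package.

    Parameters
    ----------
    metadata : `~collections.abc.Mapping` [ `str`, `float` ]
        Mapping from metric name to a value in bytes.
    key : `str`, optional
        Name of the metric to convert.

    Returns
    -------
    mib : `float`
        The selected metric in MiB.
    """
    return metadata[key] / 2**20


# Annotations are available at runtime, so a reviewer or a tool can
# cross-check them against the types written in the docstring.
hints = get_type_hints(peak_memory_mib)
print(hints["key"], hints["return"])
```

If the signature changes, the annotation is on the same line as the argument being edited, while the docstring entry is several lines away.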

TallJimbo force-pushed the tickets/DM-33963 branch from 33c142e to 8bcd0f3 Compare

March 22, 2022 14:07

TallJimbo marked this pull request as ready for review

March 22, 2022 14:09

timj reviewed

natelust approved these changes

python/lsst/analysis/drp/gatherResourceUsage.py

+                  )
+                  def __init__(self, *, config):
+                      super().__init__(config=config)

Contributor

natelust Mar 24, 2022

I think you should validate that the names were actually overridden here; otherwise PLACEHOLDER gets registered into butler repos if people run --register-dataset-types, which I think runs before the graph builder to ensure the types that are registered are known to the butler.

Member Author

TallJimbo Mar 24, 2022

👍 Dataset type registration always runs after GraphBuilder, but worth doing anyway.

Contributor

natelust Mar 24, 2022

Is that true? I thought there was a situation the graph builder could get into if the types didn't exist yet (I know we fixed some bug related to that, but I thought there was additional undefined behavior from not registering first in some cases).

Member Author

TallJimbo Mar 25, 2022

There have certainly been lots of bugs and tricky behavior in that area, but we have always been consistent about not modifying the data repository at all until after QuantumGraph generation; I think we may even use a read-only butler at first to guarantee this.
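The validation suggested in this thread could look roughly like the sketch below. It is self-contained and deliberately does not use the `lsst.pipe.base` API: `ConnectionsSketch`, the plain mapping standing in for the config, and the dataset type names are all assumptions made for illustration.

```python
from typing import Mapping

# Assumed sentinel for a dataset type name that the user must override.
PLACEHOLDER = "PLACEHOLDER"


class ConnectionsSketch:
    """Stand-in for a connections class; not the real middleware API."""

    def __init__(self, *, config: Mapping[str, str]):
        # Collect every connection whose name still contains the sentinel.
        unset = sorted(name for name, value in config.items() if PLACEHOLDER in value)
        if unset:
            # Fail at construction, before anything could be registered.
            raise ValueError(f"Connection name(s) not overridden: {', '.join(unset)}")
        self.config = dict(config)


# Hypothetical dataset type names, for illustration only.
connections = ConnectionsSketch(
    config={"input_metadata": "isr_metadata", "output_table": "isr_resource_usage"}
)
print(connections.config)
```

Raising in the constructor means a forgotten override fails early, regardless of when dataset type registration happens relative to graph building.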

python/lsst/analysis/drp/gatherResourceUsage.py

+                              outputs=outputs,
+                          )
+                          quanta_by_task_def[task_def] = {quantum}
+                      return QuantumGraph(quanta_by_task_def)

Contributor

natelust Mar 24, 2022

Do you think you want to check for cycles here, or subset to connected, and verify you have the expected number of independent chains? (Possibly not; I think you have just a bunch of independent quanta, but I wanted to raise it so it would be on your mind.)

Member Author

TallJimbo Mar 24, 2022

Should always be a bunch of independent quanta.
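If one did want to assert that property, a check along the following lines would be enough. This is only a sketch: `QuantumSketch` is a hypothetical stand-in with string dataset names, not the real `Quantum` or `QuantumGraph` classes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class QuantumSketch:
    """Hypothetical stand-in for a quantum: named inputs and outputs."""

    label: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def are_independent(quanta: Iterable[QuantumSketch]) -> bool:
    """Return True if no quantum reads or duplicates any quantum's output."""
    quanta = list(quanta)
    produced = set()
    for quantum in quanta:
        if not produced.isdisjoint(quantum.outputs):
            return False  # two quanta write the same dataset
        produced |= quantum.outputs
    # With no producer-to-consumer edges there can be no chains or cycles.
    return all(produced.isdisjoint(quantum.inputs) for quantum in quanta)


a = QuantumSketch("a", frozenset({"a_metadata"}), frozenset({"a_usage"}))
b = QuantumSketch("b", frozenset({"b_metadata"}), frozenset({"b_usage"}))
print(are_independent([a, b]))
```

A graph that passes this check has no edges at all, so cycle detection or connected-subset counting would add nothing.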

python/lsst/analysis/drp/gatherResourceUsage.py

+                      n_rows = len(handles_by_data_id)
+                      # Create a dict of empty column arrays that we'll ultimately make into
+                      # a table.
+                      columns = {

Contributor

natelust Mar 24, 2022

Why use this dict of numpy arrays vs. a recarray?

Member Author

TallJimbo Mar 24, 2022

pandas.DataFrame is a transpose of recarray but pretty much identical to a dict of numpy arrays under the hood. It may still do a copy when I convert it to a DataFrame, but at least it won't need to do a transpose.
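The layout difference described here can be seen from array strides. The sketch below uses hypothetical column names (`memory`, `run_time`) and a plain structured array in place of a recarray, which shares the same row-oriented layout; whether `pandas` copies the column arrays depends on its version and settings, so the sketch makes no claim about that.

```python
import numpy as np
import pandas as pd

n_rows = 4

# Column-oriented: one independent, contiguous array per column.
columns = {
    "memory": np.zeros(n_rows, dtype=float),
    "run_time": np.zeros(n_rows, dtype=float),
}

# Row-oriented: a structured array interleaves the fields of each row.
records = np.zeros(n_rows, dtype=[("memory", float), ("run_time", float)])

# Byte step between consecutive values of the same column in each layout.
print(columns["memory"].strides)
print(records["memory"].strides)

# A dict of equal-length arrays maps directly onto DataFrame columns.
table = pd.DataFrame(columns)
print(table.shape)
```

In the structured array, stepping along one column skips over the other fields of every record, which is the sense in which it is a transpose of the column-oriented layout.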

python/lsst/analysis/drp/gatherResourceUsage.py

+                      handle : `lsst.daf.butler.DeferredDatasetHandle`
+                          Butler handle for the metadata dataset; used to identify the
+                          metadata in diagnostic messages only.
+                      warned_about_metadata_version : `bool`

Contributor

natelust Mar 24, 2022

For some reason I thought warnings themselves could be configured to only be emitted once.

Member Author

TallJimbo Mar 24, 2022

If so I'm not familiar with it, and the message does include the data ID so the messages wouldn't be identical anyway.
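For reference, the standard-library `warnings` module does have a `"once"` filter action, but it matches on the message text and category, so messages that embed a data ID are not collapsed. The sketch below demonstrates this with made-up message text; it says nothing about how this task actually emits its diagnostics, which may go through logging instead.

```python
import warnings


def count_emitted(messages):
    """Return how many of ``messages`` are emitted under the "once" filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("once")
        for message in messages:
            warnings.warn(message, UserWarning)
    return len(caught)


# Three identical messages versus three that differ only by an ID.
identical = count_emitted(["unexpected metadata version"] * 3)
distinct = count_emitted(
    [f"unexpected metadata version for data ID {i}" for i in range(3)]
)
print(identical, distinct)
```

Identical messages are emitted a single time, while each distinct message is emitted separately, so an explicit flag such as `warned_about_metadata_version` is still needed to warn only once when the text varies.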

TallJimbo added 9 commits

March 25, 2022 11:03


          First-draft connections and configs for resource-stats gathering task.

89f9bbf

The tricky part here is the dynamic aspect of the connections. This
looks like it'll work, but hasn't been tested.


          Add task for consolidating resource usage from metadata into tables.

9b9085f


          Add methods to machine-generate pipelines for gathering resource stats.

5dae6a2

The actual CLI script will be going into drp_pipe, because the intent
is to only run it inside the SCons scripts there.


          Use new TaskDef classmethod to avoid duplicating file template.


          Rewrite GatherResourceStatistics to operate on a single input.

e8792ac

This drops the pipeline-building stuff; the pipelines it generated were
just too fragile, because they embedded the labels of other tasks in
this task's config.


          Add custom QuantumGraph generation logic and script.

133ce95


          Rename GatherResourceStatistics -> GatherResourceUsage.

cb7541f


          Refactor GatherResourceUsageTask.run for clarity.

d3bbd94

It now calls three private methods.


          Convert pex_config proxy list to a real list.

TallJimbo force-pushed the tickets/DM-33963 branch from 4a31f52 to 8627753 Compare

March 25, 2022 15:03

TallJimbo added 2 commits

March 25, 2022 11:12


          Use connection config templates in GatherResourceUsageTask.

1ebf4e3


          Convert defaultdict back to regular dict to avoid surprises.

670d87d

TallJimbo added 2 commits

March 25, 2022 11:13


          Use regex for matching metadata dataset type names.

8fbe37f


          Update to Python 3.8 in lint Github Action.

a8b2d46

TallJimbo merged commit 3ef4a34 into main

TallJimbo deleted the tickets/DM-33963 branch

March 25, 2022 19:15
