Post fine performance metrics to Spans #7885

crusaderky · 2023-06-06T18:27:42Z

Part of User-defined spans #7860
Closes Fine performance metrics: apportion to Computations #7776

Out of scope: make the Bokeh GUI span-aware

github-actions · 2023-06-06T19:32:42Z

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    20 files  ±0       20 suites ±0     12h 6m 36s ⏱️ +15m 27s
 3 679 tests  +4    3 565 ✔️ +2        108 💤 ±0     6 ❌ +2
35 580 runs   +40  33 801 ✔️ +29     1 765 💤 +1    14 ❌ +10

For more details on these failures, see this check.

Results for commit f7721e1. ± Comparison against base commit e31c864.

This pull request removes 1 and adds 5 tests. Note that renamed tests count towards both.

Removed:
distributed.tests.test_scheduler ‑ test_cumulative_worker_metrics

Added:
distributed.tests.test_spans ‑ test_worker_metrics
distributed.tests.test_worker_metrics ‑ test_no_spans_extension
distributed.tests.test_worker_metrics ‑ test_reschedule
distributed.tests.test_worker_metrics ‑ test_send_metrics_to_scheduler
distributed.tests.test_worker_metrics ‑ test_user_metrics_weird

♻️ This comment has been updated with latest results.

crusaderky · 2023-06-07T15:45:28Z

distributed/dashboard/components/scheduler.py

+        items = defaultdict(float)
+        for k, v in self.scheduler.cumulative_worker_metrics.items():
+            if isinstance(k, tuple) and k[0] == "get-data":
+                items[k] = v
+            elif isinstance(k, tuple) and k[0] == "execute":
+                items[k[:1] + k[2:]] += v  # sum all span_id's together
+        # Note: this sort works because we removed span_id's. Otherwise, it would crash
+        # when the spans extension is disabled and span_id's are None as a consequence.
+        items = sorted(items.items())


FYI @milesgranger
Support in bokeh for the spans is highly desirable but I'd rather leave it to a different PR.
XREF #7889
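
For anyone who wants to try the aggregation from the diff above in isolation, here is a minimal sketch. It is not the dashboard code: the sample dictionary and the function name `sum_over_spans` are made up for illustration, the `execute` key layout follows the unpacking in `distributed/spans.py` quoted later in this thread, and the `get-data` key layout is an assumption.

```python
from collections import defaultdict

# Hypothetical stand-in for scheduler.cumulative_worker_metrics.
# "execute" keys are assumed to be (context, span_id, prefix, activity, unit);
# the "get-data" key layout is an assumption made for this example.
cumulative_worker_metrics = {
    ("execute", "span-a", "inc", "thread-cpu", "seconds"): 1.5,
    ("execute", "span-b", "inc", "thread-cpu", "seconds"): 2.0,
    ("execute", None, "dec", "thread-noncpu", "seconds"): 0.25,
    ("get-data", "network", "seconds"): 0.5,
    "latency": 0.01,  # non-tuple keys are ignored by the loop below
}


def sum_over_spans(metrics):
    """Drop the span_id from execute keys and sum the values that collide."""
    items = defaultdict(float)
    for k, v in metrics.items():
        if isinstance(k, tuple) and k[0] == "get-data":
            items[k] = v
        elif isinstance(k, tuple) and k[0] == "execute":
            items[k[:1] + k[2:]] += v  # key without its span_id
    # After span_id is removed, every key element is a str, so sorting is safe
    return sorted(items.items())


result = sum_over_spans(cumulative_worker_metrics)
```

In this sample the two `inc` entries from different spans collapse into a single row, which is what the current span-unaware GUI expects.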

crusaderky · 2023-06-07T15:55:26Z

distributed/tests/test_scheduler.py

-
-    assert all(isinstance(value, float) for value in metrics.values())
-
-


moved to test_worker_metrics.py and overhauled

crusaderky · 2023-06-07T16:04:44Z

distributed/worker_state_machine.py

        self,
        task_name: str,
        func: Callable[P, Awaitable[StateMachineEvent]],
        /,
        *args: P.args,
+        span_id: str | None = None,


By the time the async instruction finishes, the task may have been forgotten, so we need to preserve the span_id. If we didn't, time wasted by work stealing would quietly disappear from all span-specific metrics.
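
The idea can be illustrated with a toy asyncio example. This is only a sketch of the pattern, not the worker state machine: `tasks`, `metrics`, `gather_dep` and `run_instruction` are all hypothetical names, and the elapsed time is hard-coded.

```python
import asyncio

tasks = {"x": {"span_id": "span-a"}}  # hypothetical task registry
metrics = {}                          # hypothetical per-span metrics store


async def gather_dep(key):
    """Pretend to fetch a dependency; return the seconds it took."""
    await asyncio.sleep(0)
    return 0.1


async def run_instruction(key, func, *, span_id=None):
    elapsed = await func(key)
    # Do not look up tasks[key] here: the task may already be forgotten.
    # The span_id captured at scheduling time is still available.
    k = (span_id, "seconds")
    metrics[k] = metrics.get(k, 0.0) + elapsed


async def main():
    span_id = tasks["x"]["span_id"]  # capture while the task still exists
    fut = asyncio.create_task(run_instruction("x", gather_dep, span_id=span_id))
    del tasks["x"]  # the task is forgotten while the instruction is in flight
    await fut


asyncio.run(main())
```

Without the captured `span_id`, the only option at the end of `run_instruction` would be to look the task up again, which fails once it has been forgotten.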

crusaderky · 2023-06-07T18:27:18Z

A test is failing; investigating

  File "/Users/runner/work/distributed/distributed/distributed/spans.py", line 423, in heartbeat
    _, span_id, prefix, activity, unit = k
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: too many values to unpack (expected 5)
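
The failure mode in the traceback is easy to reproduce outside of distributed. The sketch below assumes nothing about which key actually triggered it in CI; the six-element key is invented, and `unpack_execute_key` and `iter_execute_keys` are hypothetical helpers.

```python
def unpack_execute_key(k):
    # Same shape as the unpacking in the traceback: exactly five elements
    _, span_id, prefix, activity, unit = k
    return span_id, prefix, activity, unit


# Invented key with one element too many
six_part_key = ("execute", "span-a", "inc", "thread-cpu", "seconds", "extra")

try:
    unpack_execute_key(six_part_key)
except ValueError as exc:
    message = str(exc)  # starts with "too many values to unpack"


def iter_execute_keys(keys):
    """Yield unpacked execute keys, skipping anything with another shape."""
    for k in keys:
        if isinstance(k, tuple) and len(k) == 5 and k[0] == "execute":
            yield unpack_execute_key(k)


good_key = ("execute", "span-a", "inc", "thread-cpu", "seconds")
parsed = list(iter_execute_keys([good_key, six_part_key, "latency"]))
```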

milesgranger

Overall seems good to me, barring the test failure you're already looking into.

milesgranger · 2023-06-08T10:51:02Z

distributed/spans.py

+        At the moment of writing, all keys are
+        ``("execute", <task prefix>, <activity>, <unit>)``
+        but more may be added in the future with a different format; please test for
+        ``k[0] == "execute"``.


Do you think it'd be possible later on to map everything to some dataclass or similar? There's an awful lot of tedious checks building up for key lengths / values to determine what's what.

It is certainly possible and sensible, yes.
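
One possible shape for such a dataclass, as a rough sketch only. The class name `ExecuteMetric` and the `from_key` helper are hypothetical and not part of distributed; the field layout follows the `("execute", <task prefix>, <activity>, <unit>)` format from the docstring above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExecuteMetric:
    """Hypothetical typed view of an ("execute", prefix, activity, unit) key."""

    prefix: str
    activity: str
    unit: str

    @classmethod
    def from_key(cls, k: object) -> ExecuteMetric | None:
        # Centralise the shape checks so callers do not repeat them
        if isinstance(k, tuple) and len(k) == 4 and k[0] == "execute":
            return cls(*k[1:])
        return None


metric = ExecuteMetric.from_key(("execute", "inc", "thread-cpu", "seconds"))
other = ExecuteMetric.from_key(("get-data", "network", "seconds"))
```

Callers would then test for `None` once and read named fields instead of indexing into tuples.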

crusaderky · 2023-06-08T13:17:33Z

This is now blocked by and built on top of #7894
[EDIT] merged

crusaderky · 2023-06-08T13:21:49Z

This is ready for final review. @milesgranger could you re-review distributed/dashboard/components/scheduler.py? I just made major changes to it.

milesgranger

The overhaul w/ get_metrics is an improvement IMO, thanks!

crusaderky self-assigned this Jun 6, 2023

crusaderky force-pushed the spans/metrics branch 5 times, most recently from 7b9b21a to 74a4a07 Compare June 7, 2023 14:33

crusaderky mentioned this pull request Jun 7, 2023

Fix more Fine Performance Metrics bokeh crashes #7889

Closed

crusaderky marked this pull request as ready for review June 7, 2023 16:06

crusaderky requested a review from fjetter as a code owner June 7, 2023 16:06

crusaderky requested a review from hendrikmakait June 7, 2023 16:06

crusaderky mentioned this pull request Jun 7, 2023

User-defined spans #7860

Closed

milesgranger approved these changes Jun 8, 2023

crusaderky force-pushed the spans/metrics branch from dd9abfa to f2e03ec Compare June 8, 2023 13:16

crusaderky marked this pull request as draft June 8, 2023 13:17

crusaderky force-pushed the spans/metrics branch from 32a59f2 to 93bd3c1 Compare June 8, 2023 13:20

Post fine performance metrics to spans

763b0c1

crusaderky force-pushed the spans/metrics branch from 93bd3c1 to 763b0c1 Compare June 8, 2023 13:20

crusaderky marked this pull request as ready for review June 8, 2023 13:20

milesgranger approved these changes Jun 8, 2023

crusaderky merged commit 3ab53fc into dask:main Jun 8, 2023
24 of 26 checks passed

crusaderky deleted the spans/metrics branch June 8, 2023 15:14

crusaderky mentioned this pull request Jun 16, 2023

Fix race condition in Fine Performance Metrics sync #7927

Merged
