-
-
Notifications
You must be signed in to change notification settings - Fork 750
Make disk access in system monitor configurable #6537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jrbourbeau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mrocklin -- I expect CI will complain that we need to add a corresponding entry to distributed-schema.yaml
Disk access is of marginal value, and somewhat of a pain to deal with. This makes it configurable. We might also ask ourselves if we should turn it off by default.
|
Ah indeed. I had forgotten about those tests.
…On Wed, Jun 8, 2022 at 4:50 PM James Bourbeau ***@***.***> wrote:
***@***.**** commented on this pull request.
Thanks @mrocklin <https://github.com/mrocklin> -- I expect CI will
complain that we need to add a corresponding entry to
distributed-schema.yaml
<https://github.com/dask/distributed/blob/main/distributed/distributed-schema.yaml>
—
Reply to this email directly, view it on GitHub
<#6537 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTHCWAJSCXH3DLDLALTVOEIRZANCNFSM5YH6YNAA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
jrbourbeau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mrocklin, this looks reasonable to me. The only issue I can see is existing code that expects disk information to always be reported. For example, if I go to the /individual-workers-disk-timeseries figure in the dashboard when distributed.admin.system-monitor.disk is set to False I get a 500: Internal Server Error error on the browser and
2022-06-08 16:57:28,292 - tornado.application - ERROR - Uncaught exception GET /individual-workers-disk-timeseries (127.0.0.1)
HTTPServerRequest(protocol='http', host='127.0.0.1:8787', method='GET', uri='/individual-workers-disk-timeseries', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/tornado/web.py", line 1704, in _execute
result = await result
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/bokeh/server/views/doc_handler.py", line 54, in get
session = await self.get_session()
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/bokeh/server/views/session_handler.py", line 144, in get_session
session = await self.application_context.create_session_if_needed(session_id, self.request, token)
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/bokeh/server/contexts.py", line 243, in create_session_if_needed
self._application.initialize_document(doc)
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/bokeh/application/application.py", line 194, in initialize_document
h.modify_document(doc)
File "/Users/james/mambaforge/envs/distributed/lib/python3.9/site-packages/bokeh/application/handlers/function.py", line 143, in modify_document
self._func(doc)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/Users/james/projects/dask/distributed/distributed/dashboard/components/scheduler.py", line 3982, in individual_doc
fig = cls(scheduler, sizing_mode="stretch_both", **kwargs)
File "/Users/james/projects/dask/distributed/distributed/utils.py", line 767, in wrapper
return func(*args, **kwargs)
File "/Users/james/projects/dask/distributed/distributed/dashboard/components/scheduler.py", line 1049, in __init__
update(self.source, self.get_data())
File "/Users/james/projects/dask/distributed/distributed/dashboard/components/scheduler.py", line 1176, in get_data
read_bytes_disk += ws.metrics["read_bytes_disk"]
KeyError: 'read_bytes_disk'
logged in my local Python session
|
That can already be the case if, for example, the system doesn't have a disk (somewhat common in HPC scenarios). I've gone ahead and fixed the dashboard and modified tests to cover this scenario. |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ±0 15 suites ±0 6h 4m 59s ⏱️ - 12m 27s For more details on these failures, see this check. Results for commit 0ed6b47. ± Comparison against base commit 9e4e3ab. |
|
@jrbourbeau good to merge? |
Disk access is of marginal value, and somewhat of a pain to deal with.
This makes it configurable.
We might also ask ourselves if we should turn it off by default.