-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[data] Ray Data jobs detail table #40756
Conversation
2e3356a
to
bcdd57d
Compare
MAX = ( | ||
"max", | ||
"max_over_time(sum({}) by (dataset)[" + f"{MAX_TIME_WINDOW}:{SAMPLE_RATE}])", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a simpler way to get the max of a metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do the brackets ([]
) do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something like [1h:1s]
in this case means that we're querying the last 1 hour at 1 second intervals.
dashboard/modules/data/data_head.py
Outdated
for metric, queries in DATASET_METRICS.items(): | ||
for query in queries: | ||
result = await self._query_prometheus(query.value[1].format(metric)) | ||
for res in result["data"]["result"]: | ||
dataset, value = res["metric"]["dataset"], res["value"][1] | ||
if dataset in datasets: | ||
datasets[dataset][metric][query.value[0]] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could store these values back in the _StatsActor
for completed datasets so we don't have to query prometheus for datasets that are done executing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, let's do this in a future PR (feel free to leave a TODO
comment or open an issue)
Can we detect if people use Ray data in a ray job? If so, I'd like to move this table to the top above the "Task/actor overview" and hide this table when people don't use Ray Data |
Yeah we can just hide the table if there aren't any datasets. |
Great! Can you update the screenshot? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data changes look good to me
MAX = ( | ||
"max", | ||
"max_over_time(sum({}) by (dataset)[" + f"{MAX_TIME_WINDOW}:{SAMPLE_RATE}])", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do the brackets ([]
) do?
LGTM. Thanks! |
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work!
dashboard/modules/data/data_head.py
Outdated
for metric, queries in DATASET_METRICS.items(): | ||
for query in queries: | ||
result = await self._query_prometheus(query.value[1].format(metric)) | ||
for res in result["data"]["result"]: | ||
dataset, value = res["metric"]["dataset"], res["value"][1] | ||
if dataset in datasets: | ||
datasets[dataset][metric][query.value[0]] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, let's do this in a future PR (feel free to leave a TODO
comment or open an issue)
Signed-off-by: Andrew Xue <andewzxue@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for data side.
Hi @alanwguo - do you wanna do a final review on dashboard side? Thanks. |
Creates a table under the jobs page to display dataset-level metrics. The `_StatsActor` now stores dataset metadata like `state`, `progress` and `start/end_time` for each executed dataset, that is directly queried by the new `data_head` dashboard api. This api also makes requests to the prometheus server to get other metrics that are displayed. Signed-off-by: Andrew Xue <andewzxue@gmail.com>
Why are these changes needed?
Creates a table under the jobs page to display dataset-level metrics.
The
_StatsActor
now stores dataset metadata likestate
,progress
andstart/end_time
for each executed dataset, that is directly queried by the newdata_head
dashboard api. This api also makes requests to the prometheus server to get other metrics that are displayed.cluster env:
jobs-data-overview
to test on workspaces.Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.