Create a graph that shows a ratio of cpu utilization / logical cpu per task name? #45910
Labels
core
Issues that should be addressed in Ray Core
enhancement
Request for new feature and/or capability
observability
Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling
p0.5
uueeehhh
A common problem is that users under-utilize their cluster because their tasks request more logical CPUs than they actually need.
One idea is a metric that shows the ratio of actual CPU utilization to requested logical CPUs, broken down by task name.
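To make the proposed ratio concrete, here is a minimal sketch of how it could be computed. It assumes per-task samples of the form `(task_name, cpu_cores_used, logical_cpus_requested)` are already available from somewhere; that sample format and the function name `utilization_ratio_by_task` are hypothetical, not an existing Ray API.

```python
from collections import defaultdict


def utilization_ratio_by_task(samples):
    """Return {task_name: cores actually used / logical CPUs requested}.

    `samples` is an iterable of (task_name, cpu_cores_used,
    logical_cpus_requested) tuples. This format is an assumption made
    for illustration; it is not something Ray exports in this shape.
    """
    used = defaultdict(float)
    requested = defaultdict(float)
    for name, cores_used, cpus_requested in samples:
        used[name] += cores_used
        requested[name] += cpus_requested
    # Skip task names that requested zero logical CPUs to avoid dividing by zero.
    return {
        name: used[name] / requested[name]
        for name in used
        if requested[name] > 0
    }


# Hypothetical samples: two "map_batches" tasks and one "read" task.
samples = [
    ("map_batches", 1.0, 4),
    ("map_batches", 2.0, 4),
    ("read", 0.9, 1),
]
ratios = utilization_ratio_by_task(samples)
```

A ratio well below 1.0 for a task name would suggest that its tasks request more logical CPUs than they use.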
One way this manifests is with Ray Data: each dataset task requests m CPUs and runs with concurrency n. All m x n logical CPUs are reserved as expected, but overall cluster CPU utilization sits at 40%. The fix is counter-intuitive: to improve utilization, concurrency should not be increased; instead, `num_cpus` should be decreased.
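The arithmetic behind this can be sketched with a deliberately simplified model. It assumes the scheduler admits as many tasks as the logical CPUs allow (up to the concurrency cap) and that each task's real CPU usage stays constant; the cluster size, per-task usage, and function name are all made-up numbers and names for illustration.

```python
def cluster_utilization(cluster_cpus, num_cpus, concurrency, cores_used_per_task):
    """Fraction of physical CPU in use under a simplified scheduling model.

    Assumption: the number of running tasks is limited both by the
    concurrency cap and by how many `num_cpus`-sized reservations fit
    into the cluster's logical CPUs.
    """
    running = min(concurrency, int(cluster_cpus // num_cpus))
    return running * cores_used_per_task / cluster_cpus


# Hypothetical 100-CPU cluster where each task really uses about 1.6 cores.
baseline = cluster_utilization(100, num_cpus=4, concurrency=25, cores_used_per_task=1.6)

# Raising concurrency alone changes nothing: all 100 logical CPUs are already reserved.
more_concurrency = cluster_utilization(100, num_cpus=4, concurrency=50, cores_used_per_task=1.6)

# Lowering num_cpus frees logical CPUs, so more tasks can actually run.
fewer_cpus = cluster_utilization(100, num_cpus=2, concurrency=50, cores_used_per_task=1.6)
```

Under these assumptions, the baseline and the higher-concurrency case both come out at about 40% utilization, while halving `num_cpus` brings it to about 80%.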