[dashboard][kubernetes] Show container's memory info on K8s, not the physical host's. #14499
Conversation
Why do we need the if statement? Shouldn't ray.utils.get_system_memory() already do all these checks?
If we have a good reason for deviating from how the core measures memory usage, we should carefully document it, especially now that we're auto-reporting memory resources within ray.available_resources().
Btw, the issue also mentions CPU usage. Should this PR also include that?
I think psutil's definition of percent is different, but sure, I'll get rid of the if statement. I decided not to deal with CPU right now, as it seems slightly more subtle -- unless you (or anyone else) have a quick-fix idea!
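For context, the container-aware reading works by consulting the cgroup memory controller rather than psutil's host-wide numbers. Below is a minimal sketch of that idea, assuming cgroup v1 mounted at the usual path; the paths and helper names are illustrative, not Ray's actual implementation.

```python
import psutil

# cgroup v1 memory controller mount point -- an assumption about the
# container runtime's setup, as is everything below.
CGROUP_MEM = "/sys/fs/cgroup/memory"


def container_memory_limit() -> int:
    """Container memory limit in bytes, falling back to host RAM.

    When no limit is set, the cgroup file holds a huge sentinel value,
    so taking the min of the two readings handles both cases.
    """
    host_total = psutil.virtual_memory().total
    try:
        with open(f"{CGROUP_MEM}/memory.limit_in_bytes") as f:
            cgroup_limit = int(f.read())
    except FileNotFoundError:
        # Not in a container (or cgroup v1 not mounted here).
        return host_total
    return min(cgroup_limit, host_total)


def container_memory_percent() -> float:
    """Percent used relative to the *container* limit.

    psutil.virtual_memory().percent is computed against the physical
    host's total memory, which is why the dashboard previously showed
    the host's numbers on K8s.
    """
    with open(f"{CGROUP_MEM}/memory.usage_in_bytes") as f:
        usage = int(f.read())
    return round(100.0 * usage / container_memory_limit(), 1)
```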
When I run on minikube with 8 CPUs allocated to minikube, attach to the head pod, and run […]
We do want accurate usage and percent for CPU too.
Isn't that a pretty big issue, since it means Ray will incorrectly set […]
KubernetesNodeProvider reads from the pod spec: https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/_private/kubernetes/config.py#L93
After looking into it more carefully -- it reads 8 because it does indeed use all 8 CPUs available to it. But it uses at most x CPUs' worth of cycles per unit time, where x is the limit specified in the pod spec.
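That "sees 8 CPUs but is throttled to x" behavior is how Kubernetes enforces CPU limits, via the CFS quota. A hedged sketch of reading that quota follows, assuming cgroup-v1 paths; the helper name is illustrative and this is not KubernetesNodeProvider's code.

```python
import psutil

# cgroup v1 CPU controller mount point -- an assumption.
CGROUP_CPU = "/sys/fs/cgroup/cpu"


def effective_cpu_limit() -> float:
    """CPUs' worth of cycles the container may consume per period.

    Kubernetes enforces a pod's CPU limit through the CFS quota, so a
    pod on an 8-CPU minikube node still sees 8 CPUs
    (psutil.cpu_count() == 8) while being throttled to quota/period
    CPUs of actual runtime.
    """
    try:
        with open(f"{CGROUP_CPU}/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open(f"{CGROUP_CPU}/cpu.cfs_period_us") as f:
            period = int(f.read())
    except FileNotFoundError:
        return float(psutil.cpu_count())
    if quota <= 0:
        # -1 means no CPU limit was set in the pod spec.
        return float(psutil.cpu_count())
    return quota / period
```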
I think the change in this PR should be pretty uncontroversial. I've started looking at CPUs; I'd like to isolate those in a separate PR.
Got rid of the if, deferred CPU to another PR.
Dealing with CPUs in a separate PR sounds good to me.
Why are these changes needed?
Shows the container's memory info in the dashboard when running on Kubernetes, rather than the physical host's.
The K8s dashboard situation looks a little better (screenshot: 1 head, 2 min_workers, all with a 512Mi memory limit).
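One way to sanity-check this from inside the head pod (an illustration, not part of the PR): with a 512Mi pod limit, the auto-reported memory resource mentioned above should track the container limit rather than the node's physical RAM.

```python
import ray

# Connect to the running cluster from inside the head pod.
ray.init(address="auto")

# With a 512Mi memory limit on the pod, the auto-reported "memory"
# resource should reflect the container limit, not the host's RAM.
print(ray.available_resources())
```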
Related issue number
Addresses the memory subproblem of #11172.
Checks
I've run scripts/format.sh to lint the changes in this PR.