[Bug][Doc] Increase default operator resource requirements, improve docs #727

Merged: 6 commits, Nov 18, 2022


kevin85421
Member

Why are these changes needed?

We have some indication from users that the default resource limits for the KubeRay operator may be too small when managing many Ray pods.

See the discussion here: https://ray-distributed.slack.com/archives/C02GFQ82JPM/p1667409518664019

  • Increase the operator's default resource limits.
  • Document that users should monitor the operator's resource usage and adjust the limits as needed.
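
As a rough illustration of the "observe and adjust" step, here is a minimal Python sketch that compares an observed peak memory reading against a configured limit and proposes a new limit with some headroom. The function names, the 80% threshold, the 1.5x headroom factor, and the example quantities are all assumptions made for illustration; they are not values taken from this PR or from the KubeRay defaults.

```python
import math

# Binary and decimal suffixes that Kubernetes accepts for memory quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '1G' to a byte count."""
    quantity = quantity.strip()
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # no suffix: already a plain byte count


def needs_adjustment(observed_peak: str, limit: str, threshold: float = 0.8) -> bool:
    """True when the observed peak exceeds `threshold` of the limit (hypothetical rule)."""
    return parse_memory(observed_peak) > threshold * parse_memory(limit)


def suggest_limit(observed_peak: str, headroom: float = 1.5) -> str:
    """Propose a limit in Mi equal to the observed peak times `headroom`, rounded up."""
    target_bytes = parse_memory(observed_peak) * headroom
    return f"{math.ceil(target_bytes / 2**20)}Mi"


if __name__ == "__main__":
    # Hypothetical readings, e.g. as reported by a metrics tool for the operator pod.
    peak, limit = "450Mi", "512Mi"
    if needs_adjustment(peak, limit):
        print(f"Peak {peak} is close to limit {limit}; consider {suggest_limit(peak)}")
```

The threshold and headroom are deliberately left as parameters, since the right margin depends on how many Ray pods the operator manages.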

Related issue number

Closes #685

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@DmitriGekhtman
Collaborator

Could you mention the need to monitor and adjust memory in the operator portion of the docs?

Co-authored-by: Dmitri Gekhtman <62982571+DmitriGekhtman@users.noreply.github.com>
Signed-off-by: Kai-Hsun Chen <kaihsun@apache.org>
@kevin85421
Member Author

> Could you mention the need to monitor and adjust memory in the operator portion of the docs?

Which docs should I update? Thanks!

@DmitriGekhtman
Collaborator

> > Could you mention the need to monitor and adjust memory in the operator portion of the docs?
>
> Which docs should I update? Thanks!

I think this section would be best:
https://ray-project.github.io/kuberay/components/operator/

@kevin85421
Member Author

> > > Could you mention the need to monitor and adjust memory in the operator portion of the docs?
> >
> > Which docs should I update? Thanks!
>
> I think this section would be best: https://ray-project.github.io/kuberay/components/operator/

Thank you for the review! Updated.

@DmitriGekhtman DmitriGekhtman merged commit deec37c into ray-project:master Nov 18, 2022
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
…ocs (ray-project#727)