[core] Correct OOM score adjustment logic for workers#62470
edoakes merged 9 commits into ray-project:master
Conversation
Signed-off-by: peterjc123 <peterghost86@gmail.com>
Code Review
This pull request modifies the AdjustWorkerOomScore function in src/ray/raylet/worker_pool.cc to allow for a lower OOM score adjustment. Specifically, the minimum allowed oom_score_adj value has been changed from 0 to -1000, enabling workers to be configured with a higher priority against the OOM killer. There are no review comments to address.
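The bound change described above can be sketched as a simple clamp against the kernel's valid `oom_score_adj` range. This is a minimal illustration, not the actual `worker_pool.cc` code; the function name `ClampWorkerOomScoreAdj` and the explicit `lower_bound` parameter are hypothetical, introduced only to show the effect of moving the floor from 0 to -1000.

```cpp
#include <algorithm>

// Hypothetical sketch: the kernel accepts oom_score_adj values in
// [-1000, 1000]. Before this PR, Ray floored worker values at 0;
// after, the floor is the kernel minimum of -1000.
int ClampWorkerOomScoreAdj(int requested, int lower_bound) {
  return std::clamp(requested, lower_bound, 1000);
}
```

With the old floor of 0, a requested value of -500 would be raised to 0; with the new floor of -1000, it is preserved as -500.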
|
Hi @peterjc123 thanks for the PR! Quick question about the use case for this: in what scenario would it be preferable to set the worker score lower than 0? |
|
@Kunchd Thanks for your prompt response. Our use case is a multi-tenant physical machine running multiple jobs with different priority levels. At the job level, those priorities are already expressed via oom_score_adj on the parent process. In that setup, Ray workers are child processes of the job and should generally inherit/follow the OOM preference established for that job. Allowing values below 0 is important because higher-priority jobs may already be assigned negative oom_score_adj values so they are less likely to be killed under memory pressure. With the previous lower bound of 0, Ray could not preserve that policy for workers belonging to those jobs. This change lets worker processes align with the parent/job-level OOM configuration, so the kernel’s OOM selection remains consistent with the job priorities already configured on the machine. |
|
Got it, so the goal here is essentially to provide OOM-killing priority at job-level granularity. However, there's an issue with this approach. One workaround: if your OOM score can be specified without sudo privileges, you could modify the OOM score at the start of your user-defined function (for tasks) or in the constructor of your user-defined class (for actors) based on the job. For a more complete solution, adding task/actor-granularity prioritization is something we've been considering, so we are also open to helping shepherd this effort if you are interested. |
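The workaround described above amounts to the worker process writing its own `/proc/self/oom_score_adj` at startup. A hedged sketch, assuming a Linux host; the helper name `WriteOomScoreAdj` is hypothetical, and raising priority (writing a value more negative than the current one) typically requires root or `CAP_SYS_RESOURCE`:

```cpp
#include <fstream>
#include <string>

// Write an oom_score_adj value to the given procfs path.
// Returns false if the file cannot be opened or written.
bool WriteOomScoreAdj(const std::string &path, int value) {
  std::ofstream f(path);
  if (!f) return false;
  f << value;
  return static_cast<bool>(f);
}

// In a task or actor, at startup (hypothetical usage):
//   WriteOomScoreAdj("/proc/self/oom_score_adj", 500);
```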
|
Well, actually the jobs are managed by K8s, which assigns the different OOM score adjustments during container setup, so we actually run isolated Ray servers in the containers. So in my use case, the code change is sufficient. |
|
I'm still not clear on how adjusting the OOM scores in different containers will allow you to specify different OOM scores on a per-job basis. Do you have Ray nodes dedicated to running specific jobs, where each node is configured with a different OOM score depending on its assigned job? And why can't you configure the priority between jobs with positive OOM scores only?

Clarifying questions aside, I do have one more concern with this change. Setting the score to a negative value might allow the kernel to OOM-kill the raylet before the workers. Doing so would take down the Ray node with all workers running on it, which would be very destructive. |
|
Let's answer the questions one by one.

The K8s setup and why we cannot use positive scores

In our multi-tenant K8s environment, each job runs in its own isolated K8s pod (container), so each pod runs its own independent Ray instance (raylet + workers) dedicated solely to that job. The K8s scheduling team enforces a strict, machine-wide OOM policy across all workloads on the physical machine, both Ray and non-Ray. They dictate the container's oom_score_adj to establish priority: -500 for online services, 0 for system, and 500 for interruptible workloads. Because this is a global infrastructure policy, I do not have the authority to shift these priorities into positive numbers. If my online service is assigned -500 by K8s, the raylet inherits -500. However, because Ray currently clamps worker processes to a minimum of 0, the workers inside my -500 container are artificially inflated to 0. This makes them highly vulnerable to the host machine's OOM killer compared to other -500-tier processes running on the same physical node.

The raylet safety concern

You make an excellent point about the danger of the kernel OOM-killing the raylet before the workers. Taking down the whole node is definitely something we want to avoid. By allowing negative numbers, my goal is actually to maintain Ray's intended kill hierarchy, just shifted into the negative space. Because the K8s container (and thus the raylet) is already sitting at -500, I need to be able to set the workers to something like -450. If workers are forced to 0, they are badly out of alignment with the pod's baseline. By removing the 0 floor, power users can configure the workers to be slightly more OOM-prone than the raylet (e.g., raylet at -500, workers at -450), preserving the safety mechanism you mentioned while still respecting the strict negative baselines enforced by the host K8s environment.
Because this change only takes effect if a user explicitly overrides the default configuration, it remains strictly opt-in. Standard users will still get the default positive scores, ensuring the default Ray experience remains safe. |
|
Thanks, that clarifies things a lot more. I still have a couple more questions to make sure we're making the right fix here, but I see the use case for this change now.
Modifying the OOM score can cause significant cluster-stability issues if misconfigured, so I want to make sure the change to support negative worker OOM scores clearly warns users about potential issues. I'll leave a couple of nit comments. |
As a tenant, I don't know the whole picture, but looking into the logs, I think they are actually oversubscribing the physical resources.
Anyway, I'm running the job using the Ray server in my pouch/container. I believe we are not using any Ray cluster tools to manage the box or jobs. |
|
Will approve after all tests pass. |
|
@Kunchd Thanks, I've updated the title and the descriptions of the PR. |
Cursor Bugbot reviewed the changes for commit 0393e2a and found 1 potential issue.
Kunchd
left a comment
LGTM. Thanks for the contribution!

Description
Looking at the comment on worker_oom_score_adjustment in ray_config_def.h, it says:
/// A value to add to workers' OOM score adjustment, so that the OS prioritizes
/// killing these over the raylet. 0 or positive values only (negative values
/// require sudo permissions).
But the code doesn't actually add this value to the current OOM score adjustment; it just sets it as the absolute OOM score adjustment. So I updated the logic to correctly reflect the documented behaviour.
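The corrected semantics can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual AdjustWorkerOomScore implementation: the configured worker_oom_score_adjustment is added to the raylet's current oom_score_adj, then clamped to the kernel's valid range; the function name ComputeWorkerOomScoreAdj is hypothetical.

```cpp
#include <algorithm>

// Hypothetical sketch of the fix: the worker's oom_score_adj is the
// raylet's current value plus the configured adjustment, clamped to
// the kernel range [-1000, 1000], rather than the adjustment taken
// as an absolute value floored at 0.
int ComputeWorkerOomScoreAdj(int raylet_score_adj, int adjustment) {
  return std::clamp(raylet_score_adj + adjustment, -1000, 1000);
}
```

For example, a raylet at -999 with an adjustment of 1 yields workers at -998, so workers remain slightly more OOM-prone than the raylet even in the negative range.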
Related issues
When the raylet process has an oom_score_adj of -999, the oom_score_adj of the worker processes can currently only be set to 0, but it should be possible to set it to -998.
Additional information