
replace concurrency with compute in RayDataAdapter backend#1538

Open
omkar-334 wants to merge 5 commits into NVIDIA-NeMo:main from omkar-334:rayback

Conversation

@omkar-334
Contributor

Fixes #1520

@copy-pr-bot

copy-pr-bot Bot commented Feb 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Omkar Kabde <omkarkabde@gmail.com>
@omkar-334
Contributor Author

cc @praateekmahajan @ayushdg

@greptile-apps
Contributor

greptile-apps Bot commented Feb 23, 2026

Greptile Summary

Replaced deprecated concurrency parameter with the newer compute parameter in Ray Data's map_batches API, addressing the deprecation warning introduced in Ray 2.51.

  • For actor-based stages, now uses ActorPoolStrategy with either size (for fixed concurrency) or min_size/max_size (for dynamic concurrency range)
  • For task-based stages, removed the redundant concurrency: None parameter
  • CPU and GPU resource specifications (num_cpus, num_gpus) are now passed alongside the compute strategy in compute_kwargs
  • Updated logging to reflect the new parameter name
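The mapping described in the bullets above can be sketched as follows. This is a minimal illustration, not the code from `adapter.py`: the helper names `actor_pool_args` and `build_compute_kwargs` and the `is_actor_stage` flag are hypothetical, and the sketch assumes the old `concurrency` value is either an int or a `(min, max)` tuple. The only Ray API it relies on is `ray.data.ActorPoolStrategy` with its `size`, `min_size`, and `max_size` keyword arguments.

```python
from typing import Any, Dict, Optional, Tuple, Union

# Assumed shape of the legacy `concurrency` value: a fixed size or a (min, max) range.
Concurrency = Union[int, Tuple[int, int]]


def actor_pool_args(concurrency: Concurrency) -> Dict[str, int]:
    """Translate a legacy concurrency value into ActorPoolStrategy keyword arguments."""
    if isinstance(concurrency, int):
        # Fixed concurrency becomes a fixed-size actor pool.
        return {"size": concurrency}
    min_size, max_size = concurrency
    # A (min, max) range becomes an autoscaling actor pool.
    return {"min_size": min_size, "max_size": max_size}


def build_compute_kwargs(
    is_actor_stage: bool,
    concurrency: Optional[Concurrency],
    num_cpus: Optional[float] = None,
    num_gpus: Optional[float] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for map_batches (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if num_cpus is not None:
        kwargs["num_cpus"] = num_cpus
    if num_gpus is not None:
        kwargs["num_gpus"] = num_gpus
    if is_actor_stage and concurrency is not None:
        # Imported lazily so that task-based stages do not need this import.
        from ray.data import ActorPoolStrategy

        kwargs["compute"] = ActorPoolStrategy(**actor_pool_args(concurrency))
    # Task-based stages pass neither `compute` nor `concurrency`.
    return kwargs
```

The resulting dictionary would then be splatted into the call, for example `dataset.map_batches(fn, **build_compute_kwargs(True, (1, 8), num_gpus=1))`. Keeping the int-versus-range translation in a small pure function makes it testable without a running Ray cluster.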

Confidence Score: 5/5

  • This PR is safe to merge - it's a straightforward API migration that addresses a deprecation warning
  • The change correctly migrates from the deprecated concurrency parameter to the new compute parameter using ActorPoolStrategy. The logic preserves the same behavior while adhering to Ray Data 2.51+ API changes. The implementation properly handles both fixed concurrency (using size) and dynamic concurrency ranges (using min_size/max_size), and correctly maintains resource specifications.
  • No files require special attention

Important Files Changed

Filename | Overview
nemo_curator/backends/experimental/ray_data/adapter.py | Migrated from deprecated concurrency parameter to compute with ActorPoolStrategy for Ray Data 2.51+ compatibility

Last reviewed commit: a1b2e10

Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

Comment thread nemo_curator/backends/experimental/ray_data/adapter.py Outdated
Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

Contributor

@weijiac0619 weijiac0619 left a comment

The migration from concurrency to compute LGTM.

Contributor

@praateekmahajan praateekmahajan left a comment

LGTM, thank you @omkar-334 for your contribution!

Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@praateekmahajan
Contributor

/ok to test a1b2e10


Development

Successfully merging this pull request may close these issues.

Ray Data Adapter (backend) should use compute instead of concurrency due to deprecation

3 participants