@Kunchd Kunchd commented Oct 3, 2025

Why are these changes needed?

This commit takes a step towards utilizing real classes during unit testing instead of mocks for components that are simple and local (e.g. reference counter). We remove the reference counter mock as promised in #57177 (review).

Testing directly against the real reference_counter surfaces a mismatch between AddNewActorHandle and its documented behavior. Specifically, the EmplaceNewActorHandle doc states that the function should return false rather than fail a RAY_CHECK, and the test checks exactly that. However, because it calls AddOwnedObject, EmplaceNewActorHandle actually fails a RAY_CHECK. The commit addresses this issue by making AddNewActorHandle idempotent.
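
A minimal sketch of the mismatch described above, using toy stand-ins rather than Ray's actual classes. `ToyReferenceCounter`, `ToyActorManager`, `EmplaceUnchecked`, `EmplaceIdempotent`, and the use of `std::logic_error` in place of a fatal check are all assumptions made for illustration; only the name `AddOwnedObject` comes from the description.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Toy stand-in for a reference counter (hypothetical, not Ray's class).
// It refuses to register the same owned object twice and throws where
// production code might fail a fatal check.
class ToyReferenceCounter {
 public:
  void AddOwnedObject(const std::string &object_id) {
    if (!owned_.insert(object_id).second) {
      throw std::logic_error("object already owned: " + object_id);
    }
  }

 private:
  std::set<std::string> owned_;
};

// Toy stand-in for the actor manager (hypothetical).
class ToyActorManager {
 public:
  explicit ToyActorManager(ToyReferenceCounter &rc) : rc_(rc) {}

  // Touches the reference counter first, so a repeated call throws
  // before it ever gets the chance to return false.
  bool EmplaceUnchecked(const std::string &actor_id) {
    rc_.AddOwnedObject(actor_id);
    return handles_.insert(actor_id).second;
  }

  // Checks for an existing handle first, so a repeated call returns
  // false and leaves the reference counter untouched.
  bool EmplaceIdempotent(const std::string &actor_id) {
    if (handles_.count(actor_id) > 0) {
      return false;
    }
    rc_.AddOwnedObject(actor_id);
    const bool inserted = handles_.insert(actor_id).second;
    assert(inserted);  // guaranteed by the lookup above
    (void)inserted;
    return true;
  }

 private:
  ToyReferenceCounter &rc_;
  std::set<std::string> handles_;
};
```

A mock that accepts any call would let both variants pass a "returns false on repeat" test; only a counter that enforces its own invariants distinguishes them.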

Related issue number

N/A

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@Kunchd Kunchd requested a review from a team as a code owner October 3, 2025 23:11
@Kunchd Kunchd marked this pull request as draft October 3, 2025 23:11
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors AddNewActorHandle to EmplaceNewActorHandle and fixes a bug where it was not idempotent, potentially causing issues with reference counting on repeated calls. The changes also improve the actor_manager_test by using a real ReferenceCounter instance instead of a mock, which helps in catching such behavioral mismatches. The logic fix in EmplaceNewActorHandle is correct. However, I've found a critical issue in the test setup where dangling pointers are passed to the ReferenceCounter constructor, which could lead to test flakiness or crashes. My review includes a fix for this issue.
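
The review does not show the offending setup code, so the following only illustrates the general hazard it names: a component that stores a raw pointer to a dependency which is destroyed first. All names (`ToyPublisher`, `ToyCounter`, `ToyFixture`) are hypothetical, and the real ReferenceCounter constructor arguments are not assumed here.

```cpp
#include <cassert>
#include <string>

// Hypothetical dependency that the component only borrows.
struct ToyPublisher {
  std::string name = "publisher";
};

// Hypothetical component that keeps a non-owning pointer.
class ToyCounter {
 public:
  explicit ToyCounter(ToyPublisher *publisher) : publisher_(publisher) {
    assert(publisher_ != nullptr);
  }
  const std::string &PublisherName() const { return publisher_->name; }

 private:
  ToyPublisher *publisher_;  // not owned; must outlive this object
};

// Hazard (shown as a comment only): passing the address of a temporary
// or of a local that dies before the component does, for example
//   ToyCounter counter(&local_publisher);  // local_publisher goes out of scope
// leaves publisher_ dangling.

// Safer fixture: the dependency is a member declared before the component,
// so it is constructed first and destroyed last.
struct ToyFixture {
  ToyFixture() = default;
  // Copying would leave the copy's counter pointing at the original's
  // publisher, so copies are disabled.
  ToyFixture(const ToyFixture &) = delete;
  ToyFixture &operator=(const ToyFixture &) = delete;

  ToyPublisher publisher;
  ToyCounter counter{&publisher};
};
```

Because C++ destroys members in reverse declaration order, `counter` is torn down before `publisher`, so the pointer stays valid for the fixture's whole lifetime.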

@ZacAttack ZacAttack added the go add ONLY when ready to merge, run all tests label Oct 6, 2025
@Kunchd Kunchd force-pushed the actor_manager_rc branch 2 times, most recently from a8e00e1 to e859ad5 Compare October 10, 2025 03:35
@Kunchd Kunchd added go add ONLY when ready to merge, run all tests and removed go add ONLY when ready to merge, run all tests labels Oct 10, 2025
This commit refactors all tests that previously used the mock reference counter
to use the real reference counter. This moves towards more comprehensive
testing where applicable. The refactor also addresses existing mismatches
between documentation and behavior in the actor_manager tests.

Signed-off-by: davik <davik@anyscale.com>
@Kunchd Kunchd marked this pull request as ready for review October 10, 2025 17:45
@Kunchd Kunchd requested a review from israbbani October 10, 2025 17:45
@Kunchd Kunchd changed the title from "[Core] Address actor manager behavior mismatch with documentation" to "[Core] Remove reference counter mock for real reference counter in testing" Oct 10, 2025
@ray-gardener ray-gardener bot added the core Issues that should be addressed in Ray Core label Oct 10, 2025
}
// Place a sentinel value in the map to indicate that the actor handle is being
// created.
actor_handles_.emplace(actor_id, nullptr);
Contributor

uhh what's the point of this sentinel?

Contributor Author

The sentinel serves as a placeholder for the actor_id, preventing a race in which multiple calls to EmplaceNewActorHandle act on the same actor ID at the same time.
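
A rough sketch of the sentinel idea under stated assumptions: `ToyHandle`, `ToyHandleTable`, the mutex, and the two-phase structure are all hypothetical and are not taken from Ray's actor manager. Only the `emplace(actor_id, nullptr)` pattern mirrors the snippet under review.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical handle type.
struct ToyHandle {
  std::string actor_id;
};

class ToyHandleTable {
 public:
  // Returns true if this caller claimed actor_id and installed the handle,
  // false if some caller had already claimed it.
  bool Emplace(const std::string &actor_id) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      // Sentinel: a null entry marks the id as "being created". emplace
      // does nothing and reports false if the key is already present.
      if (!handles_.emplace(actor_id, nullptr).second) {
        return false;
      }
    }
    // Work done outside the lock (stand-in for slower setup steps).
    auto handle = std::make_shared<ToyHandle>(ToyHandle{actor_id});

    std::lock_guard<std::mutex> lock(mu_);
    auto it = handles_.find(actor_id);
    assert(it != handles_.end() && it->second == nullptr);
    it->second = std::move(handle);  // replace the sentinel
    return true;
  }

  // True once the sentinel has been replaced by a real handle.
  bool IsReady(const std::string &actor_id) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = handles_.find(actor_id);
    return it != handles_.end() && it->second != nullptr;
  }

 private:
  mutable std::mutex mu_;
  std::unordered_map<std::string, std::shared_ptr<ToyHandle>> handles_;
};
```

The cost of this pattern is that every reader must tolerate a null entry, which is one reason to prefer guaranteeing unique ids up front.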

Contributor Author

Per our offline conversation, the better approach is to guarantee that actor_id is unique. Investigating and enforcing actor ID uniqueness will be addressed in a future PR.

Signed-off-by: davik <davik@anyscale.com>
@dayshah dayshah left a comment

Wait, why is the test breaking? Is it because we end up double-adding the same ref in the test?

I think the problem is probably the test code, not the src code. The only way you can have the actor handle in actor_handles_ before AddNewActorHandle is if you went through GetNamedActorHandle which will always set owned to false, and it actually won't touch the ref counter at all in this codepath in that case.

Also, for the future, describe the issue you're trying to fix in the PR description, e.g. why you need to make AddNewActorHandle idempotent.

@Kunchd Kunchd requested a review from dayshah October 14, 2025 17:56
@israbbani israbbani left a comment

I agree with what @dayshah said. We should change the comment and the tests instead of the implementation of AddNewActorHandle. It's a fatal error if the ActorHandle already exists and we should raise the error as close to the call-site as possible.
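
The fail-fast alternative suggested here could look roughly like the sketch below. `ToyActorTable`, `AddNewHandle`, and `RejectsDuplicate` are hypothetical names, and the exception stands in for whatever fatal check the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical handle table that treats a duplicate id as a caller bug.
class ToyActorTable {
 public:
  void AddNewHandle(const std::string &actor_id) {
    // Fail right at the call site, before any other state is touched,
    // so the error points at the caller that broke the contract.
    if (handles_.count(actor_id) > 0) {
      throw std::logic_error("actor handle already exists: " + actor_id);
    }
    handles_.insert(actor_id);
    assert(handles_.count(actor_id) == 1);
  }

  std::size_t Size() const { return handles_.size(); }

 private:
  std::unordered_set<std::string> handles_;
};

// Test helper: reports whether adding actor_id is rejected as a duplicate.
bool RejectsDuplicate(ToyActorTable &table, const std::string &actor_id) {
  try {
    table.AddNewHandle(actor_id);
  } catch (const std::logic_error &) {
    return true;
  }
  return false;
}
```

With a process-aborting check instead of an exception, a GoogleTest suite would typically assert the failure with a death test such as `EXPECT_DEATH` rather than a try/catch helper.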

@Kunchd Kunchd requested a review from israbbani October 27, 2025 16:43
davik and others added 2 commits October 28, 2025 18:54
@israbbani israbbani left a comment

LGTM. A few small nits around comments, logging, and documentation.

@israbbani israbbani left a comment

One small nit.

Co-authored-by: Ibrahim Rabbani <israbbani@gmail.com>
Signed-off-by: Kunchen (David) Dai <54918178+Kunchd@users.noreply.github.com>
@edoakes edoakes enabled auto-merge (squash) November 17, 2025 19:10
@github-actions github-actions bot disabled auto-merge November 17, 2025 19:50
@edoakes edoakes merged commit 957568d into ray-project:master Nov 17, 2025
6 checks passed
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…sting (ray-project#57178)

---------

Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: Kunchen (David) Dai <54918178+Kunchd@users.noreply.github.com>
Co-authored-by: davik <davik@anyscale.com>
Co-authored-by: Ibrahim Rabbani <irabbani@anyscale.com>
Co-authored-by: Ibrahim Rabbani <israbbani@gmail.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>