-
Notifications
You must be signed in to change notification settings - Fork 6.9k
[Core] Remove reference counter mock for real reference counter in testing #57178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors AddNewActorHandle to EmplaceNewActorHandle and fixes a bug where it was not idempotent, potentially causing issues with reference counting on repeated calls. The changes also improve the actor_manager_test by using a real ReferenceCounter instance instead of a mock, which helps in catching such behavioral mismatches. The logic fix in EmplaceNewActorHandle is correct. However, I've found a critical issue in the test setup where dangling pointers are passed to the ReferenceCounter constructor, which could lead to test flakiness or crashes. My review includes a fix for this issue.
3c67aca to
f513826
Compare
a8e00e1 to
e859ad5
Compare
This commit refactors all tests previously using mock reference counter to the real reference counter. This is done to move towards more comprehensive testing when applicable. The refactor also addresses existing documentation and behavior mismatches in actor_manager tests. Signed-off-by: davik <davik@anyscale.com>
e859ad5 to
db60fca
Compare
Signed-off-by: davik <davik@anyscale.com>
src/ray/core_worker/task_submission/tests/actor_task_submitter_test.cc
Outdated
Show resolved
Hide resolved
src/ray/core_worker/actor_manager.cc
Outdated
| } | ||
| // Place a sentinel value in the map to indicate that the actor handle is being | ||
| // created. | ||
| actor_handles_.emplace(actor_id, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uhh what's the point of this sentinel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sentinel will serve as a place holder associated with the actor_id to prevent the race condition of multiple calls to EmplaceNewActorHandle from acting on the actor id at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From our conversation offline, the better approach will be to guarantee that actor_id is unique. The investigation + enforcement of actor id uniqueness will be addressed in a future PR.
Signed-off-by: davik <davik@anyscale.com>
dayshah
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait why is the test breaking, is it because we double end up double adding the same ref in the test.
I think the problem is probably the test code, not the src code. The only way you can have the actor handle in actor_handles_ before AddNewActorHandle is if you went through GetNamedActorHandle which will always set owned to false, and it actually won't touch the ref counter at all in this codepath in that case.
Also for the future, describe the issue you're trying to fix in the pr description, e.g. why do you need to make AddNewActorHandle idempotent
israbbani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with what @dayshah said. We should change the comment and the tests instead of the implementation of AddNewActorHandle. It's a fatal error if the ActorHandle already exists and we should raise the error as close to the call-site as possible.
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
Signed-off-by: davik <davik@anyscale.com>
…ef counter Signed-off-by: davik <davik@anyscale.com>
israbbani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A few small nits around comments, logging, and documentation.
Signed-off-by: davik <davik@anyscale.com>
israbbani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small nit.
Co-authored-by: Ibrahim Rabbani <israbbani@gmail.com> Signed-off-by: Kunchen (David) Dai <54918178+Kunchd@users.noreply.github.com>
…sting (ray-project#57178) This commit takes a step towards utilizing real classes during unit testing instead of mocks for components that are simple and local (e.g. reference counter). We remove the reference counter mock as promised in ray-project#57177 (review). By directly testing on the real reference_counter, we are able to catch a behavior mismatch between AddNewActorHandle and its doc described behavior. Specifically, the EmplaceNewActorHandle doc states that the function should return false instead of failing due to ray check and the test tests exactly that. However, due to calling AddOwnedObject, EmplaceNewActorHandle actually ray check fails. The commit addresses this issue by making AddNewActorHandle idempotent. --------- Signed-off-by: davik <davik@anyscale.com> Signed-off-by: Kunchen (David) Dai <54918178+Kunchd@users.noreply.github.com> Co-authored-by: davik <davik@anyscale.com> Co-authored-by: Ibrahim Rabbani <irabbani@anyscale.com> Co-authored-by: Ibrahim Rabbani <israbbani@gmail.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Why are these changes needed?
This commit takes a step towards utilizing real classes during unit testing instead of mocks for components that are simple and local (e.g. reference counter). We remove the reference counter mock as promised in #57177 (review).
By directly testing on the real reference_counter, we are able to catch a behavior mismatch between AddNewActorHandle and its doc described behavior. Specifically, the EmplaceNewActorHandle doc states that the function should return false instead of failing due to ray check and the test tests exactly that. However, due to calling AddOwnedObject, EmplaceNewActorHandle actually ray check fails. The commit addresses this issue by making AddNewActorHandle idempotent.
Related issue number
N/A
Checks
git commit -s) in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.