-
Notifications
You must be signed in to change notification settings - Fork 6.9k
[core] Remove is_from_worker
flag and consumed_bytes
plasma counter
#55260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @edoakes, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
I've made changes to the core Ray system to remove an unnecessary is_from_worker
flag and the consumed_bytes
metric from the Plasma object store. This refactoring addresses an abstraction leak and eliminates a metric that provided little value to users while complicating testing and maintenance. The changes simplify the Plasma client API, clean up related C++ and Python code, and update the protobuf schema, resulting in a more streamlined and robust system.
Highlights
- Abstraction Leak Removal: I've removed the
is_from_worker
flag from the Plasma object store'sGet
operations. This flag was identified as an abstraction leak, unnecessarily complicating the Plasma API by exposing internal worker-related concerns. - Metric Deprecation: The
consumed_bytes
plasma counter and its associated logic have been entirely removed. This metric was primarily used for theray memory
CLI command and brittle object spilling tests, offering limited utility to end-users and proving difficult to maintain. - Test Code Simplification: The
assert_no_thrashing
test utility and all its invocations withintest_object_spilling.py
have been deleted. This cleanup was possible because the utility relied on theconsumed_bytes
metric, which is no longer relevant. - Protobuf Schema Update: The
ObjectStoreStats
protobuf definition has been updated to remove theconsumed_bytes
field, ensuring that the system's data structures accurately reflect the current state of metrics tracking.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request removes the is_from_worker
flag and the consumed_bytes
metric from the plasma store, which simplifies the code by removing an abstraction leak. The changes are consistent across the C++ and Python codebase, including updates to the plasma protocol, client, and store implementations. The corresponding tests that relied on these features have also been updated. My review found one issue in the test files where an assertion related to the removed metric was not deleted, which would cause a test to fail. Otherwise, the changes look good and align with the stated goals of the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
…es/rm-is-from-worker
…es/rm-is-from-worker
…es/rm-is-from-worker
…es/rm-is-from-worker
…es/rm-is-from-worker
…er (#55260) We have a deep abstraction leak in plasma, where we set `is_from_worker` when requesting objects from plasma and thread it through many layers. It is used for one thing: to decide whether or not to record a metric on the number of "consumed" bytes. Note that the flag is not always `true` from the worker (in `GetIfLocal` we set it to `false`). Original PR: #13853 The metric is only used for the "memory summary", which is output by the `ray memory` CLI command. For end users, this metric is very very unlikely to be useful. It is also used for the object spilling tests, which parse the string and make an assertion on the embedded values. This is very very brittle. If/when we revisit spilling we will test it in a more robust way. --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Signed-off-by: sampan <sampan@anyscale.com>
…er (ray-project#55260) We have a deep abstraction leak in plasma, where we set `is_from_worker` when requesting objects from plasma and thread it through many layers. It is used for one thing: to decide whether or not to record a metric on the number of "consumed" bytes. Note that the flag is not always `true` from the worker (in `GetIfLocal` we set it to `false`). Original PR: ray-project#13853 The metric is only used for the "memory summary", which is output by the `ray memory` CLI command. For end users, this metric is very very unlikely to be useful. It is also used for the object spilling tests, which parse the string and make an assertion on the embedded values. This is very very brittle. If/when we revisit spilling we will test it in a more robust way. --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Signed-off-by: Andrew Grosser <dioptre@gmail.com>
…er (ray-project#55260) We have a deep abstraction leak in plasma, where we set `is_from_worker` when requesting objects from plasma and thread it through many layers. It is used for one thing: to decide whether or not to record a metric on the number of "consumed" bytes. Note that the flag is not always `true` from the worker (in `GetIfLocal` we set it to `false`). Original PR: ray-project#13853 The metric is only used for the "memory summary", which is output by the `ray memory` CLI command. For end users, this metric is very very unlikely to be useful. It is also used for the object spilling tests, which parse the string and make an assertion on the embedded values. This is very very brittle. If/when we revisit spilling we will test it in a more robust way. --------- Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com> Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
We have a deep abstraction leak in plasma, where we set
is_from_worker
when requesting objects from plasma and thread it through many layers. It is used for one thing: to decide whether or not to record a metric on the number of "consumed" bytes. Note that the flag is not alwaystrue
from the worker (inGetIfLocal
we set it tofalse
). Original PR: #13853The metric is only used for the "memory summary", which is output by the
ray memory
CLI command. For end users, this metric is very very unlikely to be useful. It is also used for the object spilling tests, which parse the string and make an assertion on the embedded values. This is very very brittle. If/when we revisit spilling we will test it in a more robust way.