[data][fault tolerance] Do not eagerly free root RefBundles #39016

Merged: 5 commits, Aug 29, 2023

Contributor

@raulchen raulchen commented Aug 28, 2023

Why are these changes needed?

Fix the issue that we eagerly free the root RefBundles. This bug broke fault tolerance.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Hao Chen <chenh1024@gmail.com>
cleaned_metadata(read_task),
)
],
owns_blocks=True,
Contributor


Setting owns_blocks=False would equivalently prevent them from being freed.

Contributor


Though, it sounds like we'd like to free them at the end of the workload, just not as soon as possible, so there is some subtlety here.

Contributor Author


Yes, owns_blocks=False is the right fix. At the end of the workload, they will eventually be freed by Python GC, and their sizes are small, so it should be fine.
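
The trade-off in this thread can be illustrated with a toy model. This is a sketch under stated assumptions, not Ray's implementation: `ToyObjectStore`, `ToyRefBundle`, its `destroy_if_owned` method, and `root_survives_consumption` are hypothetical names invented for illustration, and only the `owns_blocks` flag is taken from the diff above. It assumes fault tolerance breaks because a retried task needs its root input again after that input has been explicitly freed.

```python
from dataclasses import dataclass
from typing import Dict, List


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, ref: str, value: bytes) -> None:
        self._objects[ref] = value

    def free(self, ref: str) -> None:
        # Explicit free: the object is gone immediately, even if
        # someone still holds the ref string.
        self._objects.pop(ref, None)

    def contains(self, ref: str) -> bool:
        return ref in self._objects


@dataclass
class ToyRefBundle:
    """Hypothetical bundle of refs plus an ownership flag."""

    refs: List[str]
    owns_blocks: bool

    def destroy_if_owned(self, store: ToyObjectStore) -> int:
        """Free the refs only if this bundle owns them; return how many were freed."""
        if not self.owns_blocks:
            return 0
        for ref in self.refs:
            store.free(ref)
        return len(self.refs)


def root_survives_consumption(owns_blocks: bool) -> bool:
    """Consume a root bundle once, then report whether a retry could still read it."""
    store = ToyObjectStore()
    store.put("read_task_0", b"serialized read task")
    root = ToyRefBundle(refs=["read_task_0"], owns_blocks=owns_blocks)

    # A downstream operator finishes with the bundle and tries to release it.
    root.destroy_if_owned(store)

    # A retried task would need the same input again.
    return all(store.contains(ref) for ref in root.refs)


if __name__ == "__main__":
    print(root_survives_consumption(owns_blocks=True))   # False
    print(root_survives_consumption(owns_blocks=False))  # True
```

In the toy model, the non-owning bundle never frees its refs explicitly, so cleanup is left to whatever eventually drops the last reference. That mirrors the point above that the root inputs are small and can wait for Python GC at the end of the workload.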

@raulchen raulchen changed the title [wip] Keep object refs in input data [data] Do not eagerly free root RefBundles Aug 29, 2023
@raulchen raulchen changed the title [data] Do not eagerly free root RefBundles [data][fault tolerance] Do not eagerly free root RefBundles Aug 29, 2023
@raulchen raulchen merged commit e2079cb into ray-project:master Aug 29, 2023
44 of 50 checks passed
@raulchen raulchen deleted the input-data-keep-refs branch August 29, 2023 23:01
raulchen added a commit to raulchen/ray that referenced this pull request Aug 29, 2023
…ect#39016)

GeneDer pushed a commit that referenced this pull request Aug 30, 2023
…39085)

arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…ect#39016)

Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
LeonLuttenberger pushed a commit to jaidisido/ray that referenced this pull request Sep 5, 2023
…ect#39016)

jimthompson5802 pushed a commit to jimthompson5802/ray that referenced this pull request Sep 12, 2023
…ect#39016)

Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…ect#39016)

Signed-off-by: Victor <vctr.y.m@example.com>

4 participants