Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core]Pull off timers out of heartbeat in raylet #13963

Merged
merged 17 commits into from
Feb 24, 2021

Conversation

WangTaoTheTonic
Copy link
Contributor

Why are these changes needed?

I found raylet mixed up many period actions with heartbeat, which is ugly and not easy for us to pick up heartbeat reporting into a separate thread.

This PR do 3 things:

  1. pull off other things out of heartbeat period.
  2. Add a function RunFnPeriodically.
  3. Use RunFnPeriodically to replace dummy methods that dispatch task in a loop.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@rkooo567 rkooo567 self-assigned this Feb 7, 2021
@WangTaoTheTonic WangTaoTheTonic changed the title [Core]Pull off timer out of heartbeat in raylet [Core]Pull off timers out of heartbeat in raylet Feb 7, 2021
@rkooo567
Copy link
Contributor

rkooo567 commented Feb 7, 2021

Tests all failing...

@rkooo567 rkooo567 added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Feb 7, 2021
@WangTaoTheTonic
Copy link
Contributor Author

Looks like something wrong with pull manager, revert that part first and test are passed.
Will modify that later. @rkooo567

@WangTaoTheTonic WangTaoTheTonic added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Feb 8, 2021
@WangTaoTheTonic
Copy link
Contributor Author

ping @rkooo567

@@ -362,3 +362,18 @@ std::shared_ptr<std::unordered_map<std::string, std::string>> ParseURL(std::stri
result->emplace(key_value_pair.first, key_value_pair.second);
return result;
}

void RunFnPeriodically(std::function<void()> fn, boost::posix_time::milliseconds period,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can change this to a helper class (say PeriodicRunner), so the callers can just pass in a function and a period, without having to be aware of the timers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great! I'll change that.

@rkooo567 rkooo567 added @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. and removed tests-ok The tagger certifies test failures are unrelated and assumes personal liability. labels Feb 22, 2021
@rkooo567
Copy link
Contributor

Still seems to have many test failures.

Copy link
Contributor

@raulchen raulchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

/// \class PeriodicalRunner
/// A periodical runner attached with an io_context.
/// It can run functions with specified period. Each function is triggered by its timer.
/// To run a function, call `RunFnPeriodically(fn, period_ms)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document that when the PeriodicalRunner object is destructed, the registered functions will stop running.

~PeriodicalRunner();

void RunFnPeriodically(std::function<void()> fn,
boost::posix_time::milliseconds period);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this function can accept a long, so callers don't need to construct a posix_time before calling it.

@WangTaoTheTonic WangTaoTheTonic added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Feb 24, 2021
@raulchen raulchen merged commit 6af0291 into ray-project:master Feb 24, 2021
@raulchen raulchen deleted the raylet_thread branch February 24, 2021 03:59
richardliaw added a commit to richardliaw/ray that referenced this pull request Feb 24, 2021
richardliaw added a commit to richardliaw/ray that referenced this pull request Feb 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants