Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(frontend, meta): support RECOVER command to trigger recovery #16259

Merged
merged 10 commits into from
Apr 16, 2024

Conversation

kwannoel
Copy link
Contributor

@kwannoel kwannoel commented Apr 11, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Closes #14700

The current shortcoming of this approach is that while recovery is triggered, the command does not wait for recovery to complete, it operates in an async way. Welcome any ideas for improvement. If not we can improve it later.

Path of recover command:

Frontend -> Meta Client -> Stream Manager -> Notification Service -> Barrier Manager -> trigger recovery

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

The user can use the command: RECOVER to trigger an adhoc recovery. This can be used in cases where barrier latency is high, and we want to force a recovery to kick-in, so commands like cancel, drop can immediately take effect.

@kwannoel kwannoel marked this pull request as ready for review April 15, 2024 04:43
Copy link
Contributor

@yezizp2012 yezizp2012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx for the PR.

@kwannoel kwannoel added the user-facing-changes Contains changes that are visible to users label Apr 15, 2024
src/meta/src/barrier/mod.rs Outdated Show resolved Hide resolved
src/frontend/src/handler/recover.rs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a command to trigger recovery manually
3 participants