Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cdc): add a parameter to control timeout of cdc source waiting time #16598

Merged
merged 4 commits into from
May 7, 2024

Conversation

KeXiangWang
Copy link
Contributor

@KeXiangWang KeXiangWang commented May 6, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

In our production environment, we found one case that the upstream system's table schema was too complicated so that takes long time to fetch. The time required exceeded the preset timeout threshold in #14406 and leads to the failure of source creation. In this PR, we add a runtime parameter to control it. If a user encounter a error like this when creating cdc source:

RROR:  Failed to run the query

Caused by these errors (recent errors listed first):
  1: gRPC request to meta service failed: Internal error
  2: get error from control stream: worker node {}, gRPC request to stream service failed: Internal error: Failed to send barrier with epoch {} to actor 1: channel closed;

he can run SET cdc_source_wait_streaming_start_timeout TO 120; to increase timeout from the default 30s to 120s can recreate the source.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

risingwavelabs/risingwave-docs#2126

Copy link
Contributor

@kwannoel kwannoel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customer-facing issue, so LGTM. Best if we can add test cases for this.

e.g.

  1. Add the complex catalog.
  2. Show that it takes a long time and timeout occurs.

Then subsequently we can optimize this path as well without having to change timeout, since we have a repro case.

@StrikeW StrikeW added the need-cherry-pick-release-1.9 Open a cherry-pick PR to branch release-1.9 after the current PR is merged label May 7, 2024
@KeXiangWang
Copy link
Contributor Author

we can add test cases for this.

Added an issue to track. #16622

Copy link

gitguardian bot commented May 7, 2024

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@KeXiangWang KeXiangWang enabled auto-merge May 7, 2024 19:24
@KeXiangWang KeXiangWang force-pushed the wkx/cdc-source-wait-timeout branch from cda3e32 to f438137 Compare May 7, 2024 19:29
@KeXiangWang KeXiangWang added this pull request to the merge queue May 7, 2024
Merged via the queue into main with commit b64a0ac May 7, 2024
27 of 28 checks passed
@KeXiangWang KeXiangWang deleted the wkx/cdc-source-wait-timeout branch May 7, 2024 20:02
@fuyufjh
Copy link
Member

fuyufjh commented May 8, 2024

The error message is not clear. Let's improve it later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need-cherry-pick-release-1.9 Open a cherry-pick PR to branch release-1.9 after the current PR is merged type/feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants