feat(cdc): introduce with option to configure cdc snapshot #16426
Merged
24 commits
3766731 wip: optimize cdc backfill (StrikeW)
b74f43c WIP: encapsulate snapshot read full table in a function (StrikeW)
5c6e8cf refactor to start snapshot in a fixed interval (StrikeW)
adff2d4 WIP: test the upstream buffered events (StrikeW)
279fdd1 Merge remote-tracking branch 'origin/main' into siyuan/optimize-cdc-b… (StrikeW)
65df138 refine ut (StrikeW)
3faa90c clean code (StrikeW)
d8f1d97 refactor cdc snapshot options (StrikeW)
2931d9c minor (StrikeW)
4e3445c Merge branch 'siyuan/optimize-cdc-backfill-new' into siyuan/cdc-snaps… (StrikeW)
3269095 set interval to 5 for test (StrikeW)
792ce11 solved mysql scan uncomplete problem (StrikeW)
b587a8b fix mysql incomplete scan problem (StrikeW)
c03c3c1 Merge remote-tracking branch 'origin/main' into siyuan/optimize-cdc-b… (StrikeW)
833a8be fix state commit (StrikeW)
3f35356 minor (StrikeW)
d5b5a9f minor (StrikeW)
3d5be28 Merge remote-tracking branch 'origin/siyuan/optimize-cdc-backfill-new… (StrikeW)
0bf584a set interval to 1 for test (StrikeW)
a9177e4 To test cdc-commit-offset (StrikeW)
e13726a Merge remote-tracking branch 'origin/main' into siyuan/cdc-snapshot-o… (StrikeW)
9637b2b minor (StrikeW)
a9996d7 backward compatibility (StrikeW)
afe3bd1 minor (StrikeW)
@@ -39,6 +39,7 @@ use crate::executor::backfill::cdc::upstream_table::snapshot::{
 use crate::executor::backfill::utils::{
     get_cdc_chunk_last_offset, get_new_pos, mapping_chunk, mapping_message, mark_cdc_chunk,
 };
+use crate::executor::backfill::CdcScanOptions;
 use crate::executor::prelude::*;
 use crate::task::CreateMviewProgress;

@@ -70,12 +71,7 @@ pub struct CdcBackfillExecutor<S: StateStore> {
     /// Rate limit in rows/s.
     rate_limit_rps: Option<u32>,

-    disable_backfill: bool,
-
-    // TODO: make these options configurable
-    snapshot_interval: u32,
-
-    snapshot_read_limit: u32,
+    options: CdcScanOptions,
 }

 impl<S: StateStore> CdcBackfillExecutor<S> {
@@ -89,9 +85,8 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
         metrics: Arc<StreamingMetrics>,
         state_table: StateTable<S>,
         rate_limit_rps: Option<u32>,
-        disable_backfill: bool,
-        snapshot_interval: u32,
-        snapshot_read_limit: u32,
+        disable_backfill: bool, // backward compatibility
+        scan_options: Option<CdcScanOptions>,
     ) -> Self {
         let pk_in_output_indices = external_table.pk_in_output_indices().clone().unwrap();
         let upstream_table_id = external_table.table_id().table_id;
@@ -101,6 +96,11 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
             pk_in_output_indices.len() + METADATA_STATE_LEN,
         );

+        let options = scan_options.unwrap_or(CdcScanOptions {
+            disable_backfill,
+            ..Default::default()
+        });
+
         Self {
             actor_ctx,
             external_table,

Review comment on lines +99 to +102:
I would suggest moving this into

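For readers following the fallback in the hunk above, here is a minimal sketch of how an explicit `scan_options` takes precedence over the legacy `disable_backfill` flag. The struct is a stand-in for the PR's `CdcScanOptions`: the field names come from the diff, but the field types, the default values, and the helper `resolve_options` are assumptions made for illustration only.

```rust
/// Stand-in for the PR's `CdcScanOptions`. Field names follow the diff;
/// the derive list and default values are assumptions, not the real ones.
#[derive(Debug, Clone, PartialEq)]
struct CdcScanOptions {
    disable_backfill: bool,
    snapshot_interval: u32,
    snapshot_batch_size: u32,
}

impl Default for CdcScanOptions {
    fn default() -> Self {
        Self {
            disable_backfill: false,
            snapshot_interval: 1,      // placeholder default (assumed)
            snapshot_batch_size: 1000, // placeholder default (assumed)
        }
    }
}

/// Hypothetical helper mirroring the `unwrap_or` fallback in the constructor:
/// use the explicit options when present, otherwise build defaults that
/// carry over the legacy `disable_backfill` flag.
fn resolve_options(
    disable_backfill: bool,
    scan_options: Option<CdcScanOptions>,
) -> CdcScanOptions {
    scan_options.unwrap_or(CdcScanOptions {
        disable_backfill,
        ..Default::default()
    })
}
```

Under this sketch, the legacy flag only matters when no options are supplied; once `scan_options` is `Some`, its own `disable_backfill` wins.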
@@ -110,9 +110,7 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
             progress,
             metrics,
             rate_limit_rps,
-            disable_backfill,
-            snapshot_interval,
-            snapshot_read_limit,
+            options,
         }
     }

@@ -176,7 +174,7 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
         let state = state_impl.restore_state().await?;
         current_pk_pos = state.current_pk_pos.clone();

-        let to_backfill = !self.disable_backfill && !state.is_finished;
+        let to_backfill = !self.options.disable_backfill && !state.is_finished;

         // The first barrier message should be propagated.
         yield Message::Barrier(first_barrier);
@@ -200,10 +198,12 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
             initial_binlog_offset = ?last_binlog_offset,
             ?current_pk_pos,
             is_finished = state.is_finished,
-            disable_backfill = self.disable_backfill,
             snapshot_row_count = total_snapshot_row_count,
             rate_limit = self.rate_limit_rps,
-            "start cdc backfill"
+            disable_backfill = self.options.disable_backfill,
+            snapshot_interval = self.options.snapshot_interval,
+            snapshot_batch_size = self.options.snapshot_batch_size,
+            "start cdc backfill",
         );

         // CDC Backfill Algorithm:
@@ -269,7 +269,7 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
                 );

                 let right_snapshot = pin!(upstream_table_reader
-                    .snapshot_read_full_table(read_args, self.snapshot_read_limit)
+                    .snapshot_read_full_table(read_args, self.options.snapshot_batch_size)
                     .map(Either::Right));

                 let (right_snapshot, valve) = pausable(right_snapshot);
@@ -298,7 +298,7 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
                     // increase the barrier count and check whether need to start a new snapshot
                     barrier_count += 1;
                     let can_start_new_snapshot =
-                        barrier_count == self.snapshot_interval;
+                        barrier_count == self.options.snapshot_interval;

                     if let Some(mutation) = barrier.mutation.as_deref() {
                         use crate::executor::Mutation;
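The barrier-counting check in the hunk above can be pictured with a small standalone sketch. `SnapshotTicker` and `on_barrier` are hypothetical names, and resetting the counter once a snapshot may start is an assumption; the hunk itself shows only the increment and the equality test.

```rust
/// Hypothetical helper: counts barriers and reports when a new snapshot
/// may start, using the same `==` comparison as the diff.
struct SnapshotTicker {
    barrier_count: u32,
    snapshot_interval: u32,
}

impl SnapshotTicker {
    fn new(snapshot_interval: u32) -> Self {
        Self {
            barrier_count: 0,
            snapshot_interval,
        }
    }

    /// Call once per barrier. Returns true on every `snapshot_interval`-th call.
    fn on_barrier(&mut self) -> bool {
        self.barrier_count += 1;
        let can_start_new_snapshot = self.barrier_count == self.snapshot_interval;
        if can_start_new_snapshot {
            // Assumed: the counter restarts once a snapshot is allowed to begin.
            self.barrier_count = 0;
        }
        can_start_new_snapshot
    }
}
```

One consequence of comparing with `==` in this sketch: an interval of 0 never matches, because the counter is already 1 at the first check.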
@@ -567,7 +567,7 @@ impl<S: StateStore> CdcBackfillExecutor<S> {
                 state_impl.commit_state(pending_barrier.epoch).await?;
                 yield Message::Barrier(pending_barrier);
             }
-        } else if self.disable_backfill {
+        } else if self.options.disable_backfill {
             // If backfill is disabled, we just mark the backfill as finished
             tracing::info!(
                 upstream_table_id,

Should we follow the batch_size specified in the config file instead?