New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DBZ-6828 Ad-hoc blocking snaps trigger emits schema changes of all tables #4824
Conversation
addSchemaEvent(snapshotContext, "", connection.setStatementFor(connection.readMySqlCharsetSystemVariables())); | ||
if (!snapshottingTask.isBlocking()) { | ||
// Record default charset | ||
addSchemaEvent(snapshotContext, "", connection.setStatementFor(connection.readMySqlCharsetSystemVariables())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpechane I noted that for Charset statement no database name is provided. Is it right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, IIRC it does not depnd on database
@Override | ||
protected void assertDdl(List<String> schemaChangesDdls) { | ||
|
||
assertThat(schemaChangesDdls.get(schemaChangesDdls.size() - 2)).isEqualTo("DROP TABLE IF EXISTS `blocking_snapshot_test_1`.`b`"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently this will fail because for the DROP
events we don't have information about the table. See https://github.com/debezium/debezium/pull/4824/files#diff-63840528f25c3335e78d8b91c2ea67602750af012c1620d9981f11cb476b480aR642
@@ -635,6 +639,10 @@ protected void createSchemaChangeEventsForTables(ChangeEventSourceContext source | |||
LOGGER.debug("Processing schema event {}", event); | |||
|
|||
final TableId tableId = event.getTables().isEmpty() ? null : event.getTables().iterator().next().id(); | |||
if (snapshottingTask.isBlocking() && !snapshotContext.capturedTables.contains(tableId)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpechane Here I need to avoid to resend already processed events and process the only ones requested in bloking snapshot signal. Is there a way to check the same for the DROP
statements for which tableId
is null?
I mean something more robust then checking the name of the table in the ddl
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfvitale Under which conditions the DROP
even will not contain tableId
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpechane Any DROP events both from snapshot and streaming. Do you know if it's for some limitation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfvitale It is good workaround. But I'd argu that the missing table change is bug in this case. Would you be able to dig into it a bit deeper? If not we can merge this PR and create another Jira just for that problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same feeling but not 100% sure.
I think in this case is better to create a separate Jira issue to investigate it.
5110249
to
1936ad9
Compare
@mfvitale Applied, thanks! Please create the follow-up Jira for |
|
List<SchemaChangeEvent> missingSchemaChangeEvents = schemaChangeEvents.stream() | ||
.filter(not(schemaEvents::contains)) | ||
.collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfvitale Hi here have a performance bottleneck. time complexity is O(databaseSchemaNumber²)
, We have 140+ databases on single instance, here process need 2 hour+.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @minchowang can you please open a separate issue? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With these changes the blocking snapshot will generate schema events only for signaled tables
closes: https://issues.redhat.com/browse/DBZ-6828