-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cdc: Handle changefeed over multiple tables with the same name. #62115
Conversation
Require the user specify `full_table_name` option when creating changefeed over multiple tables with the same name. Release Notes: None
@amruss, this seems like a reasonable behavior change to me but it is a behavior change, do you endorse it? It means that if someone wanted to have changes from multiple tables flowing to the same topic, they'd have to do it in two separate changefeed statements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)
pkg/ccl/changefeedccl/changefeed_processors.go, line 587 at r1 (raw file):
} func (t *tableDescriptorTopic) GetName() string {
Maybe GetTargetName()
? tableDescriptorTopic.GetName()
reads like it returns the topic name, but the topic can be escaped and prefixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HonoreDB @amruss
To be a bit pedantic: while it is a behavior change, right now, creating changefeed over same named tables is almost certainly a mistake. You cannot differentiate between events in different tables.
So, this forces you to add full_table_name option.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB)
pkg/ccl/changefeedccl/changefeed_processors.go, line 587 at r1 (raw file):
Previously, HonoreDB (Aaron Zinger) wrote…
Maybe
GetTargetName()
?tableDescriptorTopic.GetName()
reads like it returns the topic name, but the topic can be escaped and prefixed.
I used to have that... But during last round of refactoring, decided to keep GetName/version/etc to be identical to the API offered by table descriptor (something we used before introduction of TopicDescriptor).
That's why the method is GetName.
However, now that I have to override it anyway, renaming it to GetTargetName makes sense.
Perhaps in another PR though.
You can distinguish when the output is Avro, since the message format includes a schema identifier and the confluent schema registry distinguishes using (at least) metadata; this is how Netflix currently handles this case. Also I don't know, maybe somebody doesn't want to differentiate downstream and just wants dbv1.public.users and dbv2.public.users to flow into the same datalake. If anything it's less likely to be a mistake when you're referencing both tables in the same statement, harder to do by accident. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm.. Interesting and a good point. Perhaps the bug itself was a "feature". Though, I'd argue that this feature was a byproduct of mistake. We should have required the user to state their intent (via an option -- possibly different one from full_table_name).
Since we now have this feature, perhaps I should just drop this PR and close the bug as "working as intended"...
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB)
I think that this is intended behavior right now, might be better solved with a docs update (I don't think we do a great job today of explaining topic naming in our docs today anyways). |
Require the user specify
full_table_name
option whencreating changefeed over multiple tables with the same name.
Fixes #32509
Release Notes: None