MySQL CDC sync fails because starting binlog position not found in DB #6425
Comments
@sherifnada just adding some thoughts here regarding the specific issue of waiting long periods of time between syncs. I ran into this issue and was going to post in the Slack group, but was curious what was considered "expected" behaviour. Upon reviewing the Debezium docs as well as Airbyte's:
I thought that's fair. The docs could potentially be more explicit, but they do state that syncs should be frequent. If a user is implementing CDC, I would presume they understand what that means, i.e. reading binlogs, and thus they should understand that binlogs are not persistent and that syncs should be frequent enough to consume the logs before they're dropped.
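To make the "frequent enough" point concrete, here is a rough sketch of the arithmetic. The function name and the `safety_factor` parameter are hypothetical (they are not part of Airbyte or Debezium); the retention value would come from the MySQL server's settings, e.g. `binlog_expire_logs_seconds` on MySQL 8.0 or `expire_logs_days` on older versions.

```python
from datetime import timedelta


def sync_is_frequent_enough(
    sync_interval: timedelta,
    binlog_retention: timedelta,
    safety_factor: float = 2.0,  # hypothetical margin, not an Airbyte setting
) -> bool:
    """Return True if binlogs are kept for at least `safety_factor` sync intervals.

    A margin above 1.0 leaves room for a delayed or failed sync before
    the server purges the binlog file the next sync needs to resume from.
    """
    return binlog_retention >= sync_interval * safety_factor


# Daily syncs against a 7-day retention leave plenty of headroom.
print(sync_is_frequent_enough(timedelta(hours=24), timedelta(days=7)))

# Weekly syncs against a 3-day retention will lose the starting position.
print(sync_is_frequent_enough(timedelta(days=7), timedelta(days=3)))
```

With the default margin of 2, a connection that syncs once a day would need binlogs retained for at least two days.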
CDC uses binlog files to apply changes on top of the DB snapshot. Binlog files have a retention policy (a time to live); during that time the MySQL server keeps them available for use.
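As a minimal sketch of what this means for a sync: the saved starting position refers to a binlog file by name, and the sync can only resume if that file has not been purged yet. The helper below is hypothetical (it is not Airbyte or Debezium code) and assumes the list of available files was read from the `Log_name` column of `SHOW BINARY LOGS`; the file names are made up for illustration.

```python
from typing import Sequence


def saved_binlog_is_available(saved_file: str, available_files: Sequence[str]) -> bool:
    """Return True if the binlog file from the saved offset still exists on the server.

    `available_files` is assumed to hold the Log_name values returned by
    `SHOW BINARY LOGS`; `saved_file` is the file name stored with the
    connection's last CDC position.
    """
    return saved_file in set(available_files)


# Example file names only; real names depend on the server's log_bin setting.
on_server = ["mysql-bin.000007", "mysql-bin.000008", "mysql-bin.000009"]

print(saved_binlog_is_available("mysql-bin.000008", on_server))  # still retained
print(saved_binlog_is_available("mysql-bin.000003", on_server))  # already purged
```

If the check fails, the starting binlog position cannot be found on the server, which is the situation described in this issue.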
A user who is going to configure CDC for a MySQL database should make sure that:
Resetting the data on our side during a DB sync might cause some issues:
This is the kind of critical error that we cannot handle, because binlogs and DB settings must be properly configured. Instead of resetting data on our side, I suggest:
@sherifnada what do you think?
@sherifnada please take a look at this comment #6425 (comment)
@mkhokh-33 your suggestions sound good, let's go with those!
Environment
Source: MySQL 0.4.4
Destination: Snowflake 0.3.14
EC2 Docker Airbyte instance 0.29.17-alpha
Current Behavior
context dump from #5870
logs-23-0.txt
Looking back at these logs, they actually start with the error
but the job continues and appears to "succeed" while still causing a retry
attached sync timestamps
Separately, on point 2: I do have multiple connections with the same CDC source to the same destination. Is this not allowed? I run these connections at the same time and they appeared to work as expected, although I run into the "hanging" issue, where it doesn't actually COPY the data after reading it.
Expected Behavior
Presumably we should either: