🐛 Source Postgres (CDC): Sync fails when records in Postgres are deleted for tables with non null constraint on columns #8557
Comments
@malcolmtivelius could you upgrade your Postgres connector from 0.3.13 to 0.3.17 and tell us if the error persists?
Hey, I just upgraded Airbyte to v.0.32.6 and the Postgres connector to 0.3.17. Still running into the same issue when deleting a record! Let me know if you need anything else.
@sherifnada I'll let you prioritize 😄
Hi! I just stumbled upon the same issue. It seems to come from the way data is meant to be formatted when leaving the source: for deletes, if REPLICA IDENTITY is set to DEFAULT on the table, we have no information about the original row. A workaround I've found is to use REPLICA IDENTITY FULL, which keeps the previous data for updates and deletes and so avoids the failure on delete, even though the best fix would be not to run schema validation on these source events.
In the meantime, I tried to understand what was happening, since it seems to be internal Debezium behaviour with Kafka Connect, and I wondered whether it could be caused by the way DELETE tombstone events are stripped here. I'm really new to this, but that code seems to affect the same case as this issue (it would make sense that, if there is a method indicating that the key should be dereferenced, stripping it causes the nature of the input to be misidentified).
Thanks for letting me know about this workaround! @shrodingers I just tried it out and my issue was resolved. But as you say, it would still be beneficial to only send the primary key (and not the entire row) if a record is deleted. Will follow this issue for further updates.
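For anyone who wants to apply the workaround to several tables, here is a minimal sketch that only builds the `ALTER TABLE ... REPLICA IDENTITY FULL` statements as strings. The helper names and the schema and table names are hypothetical, not something from Airbyte or Debezium; run the generated SQL with whatever Postgres client you already use.

```python
from typing import Iterable, List


def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def replica_identity_full_sql(schema: str, table: str) -> str:
    """Build the statement that switches one table to REPLICA IDENTITY FULL."""
    return (
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        "REPLICA IDENTITY FULL;"
    )


def statements_for(schema: str, tables: Iterable[str]) -> List[str]:
    """Build one statement per table that is part of the CDC sync."""
    return [replica_identity_full_sql(schema, table) for table in tables]


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for statement in statements_for("public", ["users", "orders"]):
        print(statement)
```

Keep in mind that with REPLICA IDENTITY FULL Postgres logs the whole old row for every update and delete, so WAL volume grows on busy tables.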
Hello @malcolmtivelius |
Hi @yurii-bidiuk
I fixed it with the full replica identity as suggested by @shrodingers
@yurii-bidiuk, I think the column needs to be non-nullable in the source for the error to be reproducible (because it will then fail the JSON schema validation, I guess). So the DELETE event, which has every column except the primary key set to null, will fail validation (if the columns are nullable, I think validation succeeds, since null is allowed). Let me know if I can help on this issue :)
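To illustrate the reasoning above, here is a small self-contained sketch of a nullability check. It is not Airbyte's actual validator; the `SCHEMA` mapping, the column names and the `validation_errors` helper are all made up for the example. It only shows why a delete record of the form [id, null, null, ...] is rejected when a column is declared non-nullable, and why the same delete passes once the old row values are available.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema: column name -> (expected Python type, nullable?)
SCHEMA: Dict[str, Tuple[type, bool]] = {
    "id": (int, False),
    "email": (str, False),    # NOT NULL in the source table
    "nickname": (str, True),  # nullable in the source table
}


def validation_errors(record: Dict[str, Any]) -> List[str]:
    """Return one message per column that violates the schema."""
    errors = []
    for column, (expected_type, nullable) in SCHEMA.items():
        value = record.get(column)
        if value is None:
            if not nullable:
                errors.append(f"{column}: null not allowed")
        elif not isinstance(value, expected_type):
            errors.append(f"{column}: expected {expected_type.__name__}")
    return errors


# Delete under REPLICA IDENTITY DEFAULT: only the primary key is known.
delete_default = {"id": 7, "email": None, "nickname": None}
# Delete under REPLICA IDENTITY FULL: the old row values are present.
delete_full = {"id": 7, "email": "a@example.com", "nickname": None}

if __name__ == "__main__":
    print(validation_errors(delete_default))
    print(validation_errors(delete_full))
```

In this sketch the first record fails only on `email`, the non-nullable column, which would also explain why a table with only nullable columns does not reproduce the error.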
@lucienfregosibodyguard @shrodingers thank you, guys. Previously I used a table with 2 columns (id and testData), but after increasing the number of columns to 5 the bug started reproducing for me as well.
Environment
Current Behavior
Trying to set up CDC with Postgres 12 (CloudSQL hosted) -> BigQuery, I keep getting errors when data is deleted. I think it has something to do with the Postgres schema having non-null constraints on certain columns. The deletes are picked up as rows like [id, null, null, ...], and since some of the columns have non-null constraints, this produces an error. I have tried with both wal2json and pgoutput.
Expected Behavior
The expected behavior is that delete events are picked up by Debezium without throwing any errors, since the non-null constraints are only relevant when writing to the Postgres table, not when reading from it.
Logs
LOG
Steps to Reproduce
Are you willing to submit a PR?
I do not know how to solve this; it seems like an issue with Debezium.