
DBZ-6046 Add instructions for upgrading the PG database used by Debezium #4403

Conversation

roldanbob

DBZ-6046

Adds steps for upgrading the PostgreSQL database that Debezium connects to.

Tested in a local Antora build.

I've embedded some questions as comments (L2295, L2298, L2314)


@jpechane jpechane left a comment


@roldanbob I think that, in general, this is a good description. I added answers to the inline comments and also mentioned a few clarifications.

For example, when a database server that a connector monitors stops or crashes, after the connector re-establishes communication with the PostgreSQL server, it continues to read from the last position recorded by the log sequence number (LSN) offset.
The connector obtains the information about the last recorded offset from the write-ahead log (WAL) at the configured PostgreSQL replication slot.

However, during a PostgreSQL upgrade, all replication slots are removed, and these slots are not recreated automatically.

This needs an additional point. It is possible to recreate the slot in advance, but the LSNs after the database upgrade are different. So even the worst-case scenario is that Debezium finds the slot, but its LSN value is inconsistent with the one the connector has stored, so the connector can jump over existing changes while trying to resume from the position it believes it should start from. In this case the connector will not fail, but data will in fact be lost silently.


7. Shut down the connector gracefully by stopping Kafka Connect. +
Kafka Connect stops the connectors, flushes all event records to Kafka, and records the last offset received from each connector.
// Do we need to delete the connector and its offset topic?

Stopping the connector means deleting it, unless you stop the whole Kafka Connect cluster.
The offset topic should be kept, because it is shared among all connectors.
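For illustration, stopping a single connector without shutting down the whole cluster therefore means deleting it through the Kafka Connect REST API. A minimal sketch; the host `localhost:8083` and the connector name `inventory-connector` are assumed examples, not values from this PR:

```shell
# Check the connector status first (connector name is an assumed example).
curl -s http://localhost:8083/connectors/inventory-connector/status

# Delete the connector. Kafka Connect stops its tasks; the shared offset
# topic is left untouched, so the recorded offsets are preserved.
curl -s -X DELETE http://localhost:8083/connectors/inventory-connector
```

Both commands require a running Kafka Connect cluster, so they are shown only as a sketch.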


8. As a PostgreSQL administrator, drop the replication slot on the primary database server.
// Can this be done via setting xref:postgresql-property-slot-drop-on-stop[`slot.drop.on.stop`] to `true`?

No, this option is only for testing.
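So the slot has to be dropped manually. A hedged sketch of what that might look like; the slot name `debezium`, the database `inventory`, and the user `postgres` are assumed examples:

```shell
# Run on the primary server as a superuser or a role with REPLICATION.
# pg_drop_replication_slot() fails while the slot is active, so make
# sure the connector has been stopped first.
psql -U postgres -d inventory \
     -c "SELECT pg_drop_replication_slot('debezium');"
```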


13. As a PostgreSQL administrator, create a {prodname} logical replication slot on the database.
You must create the slot before enabling writes to the database.
Otherwise, {prodname} cannot capture the changes, resulting in data loss.

See above; there are three situations that can occur:

  • slot is not created and auto creation of slot enabled at Debezium config - probable data loss
  • slot is not created and auto creation of slot disabled at Debezium config - connector refuses to start
  • slot is created but old LSN stored in connector offsets - undefined behaviour, data loss
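To avoid the first and third situations, the slot can be created explicitly after the upgrade and before writes resume. A sketch, assuming the slot name `debezium` and the built-in `pgoutput` decoding plugin (the plugin must match the connector's `plugin.name` setting):

```shell
# Create the logical replication slot before re-enabling writes, so that
# no changes can occur while the slot does not yet exist.
psql -U postgres -d inventory \
     -c "SELECT pg_create_logical_replication_slot('debezium', 'pgoutput');"
```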

endif::product[]

15. In the {prodname} connector configuration, set the xref:postgresql-property-publication-name[`publication.name`] property to the name of the publication.
// Is this necessary? Is the previous configured publication name still valid? Should automatic creation be disabled? (i.e., `publication.autocreate.mode` set to `disabled`)]

I'd modify this a bit to say something like: if the publication is not present after the database upgrade, it should be re-created. This is more of an insurance step, as IIRC publications should survive the upgrade.
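If the publication is managed manually along these lines, the relevant connector properties might look like the following fragment. The publication name `dbz_publication` is an assumed example, and `publication.autocreate.mode` is set to `disabled` so that the connector never attempts to create the publication itself:

```json
{
  "publication.name": "dbz_publication",
  "publication.autocreate.mode": "disabled"
}
```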


14. As a PostgreSQL administrator, create a publication that defines the tables to be captured.
// Can this step be skipped, assuming that a publication was previously configured, and it's preserved during upgrade?

See comment below. IMHO we can assume that the publication is preserved; this step is more about checking that it was preserved and, if not, re-creating it.
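A sketch of that check, run as the PostgreSQL administrator; the publication name `dbz_publication`, the database, and the table list are assumed examples:

```shell
# Verify that the publication survived the upgrade; an empty result
# means it has to be re-created.
psql -U postgres -d inventory \
     -c "SELECT pubname FROM pg_publication WHERE pubname = 'dbz_publication';"

# Only if the publication is missing, re-create it for the captured tables:
psql -U postgres -d inventory \
     -c "CREATE PUBLICATION dbz_publication FOR TABLE public.customers, public.orders;"
```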


jpechane commented Apr 6, 2023

@roldanbob Have you had a chance to see my comments? Thanks

@roldanbob


@jpechane Returning to this now that the downstream release is complete. I've noted your comments, but because this branch grew stale in the interim, I'm going to renew work on a new branch and close this PR.
