pg cdc feature branch #2548

Merged
jrhizor merged 37 commits into master from jrhizor/debezium
Apr 9, 2021
Conversation

jrhizor
Contributor

@jrhizor jrhizor commented Mar 22, 2021

Add Postgres CDC support.

@jrhizor
Contributor Author

jrhizor commented Mar 22, 2021

The biggest open question is where to inject the CDC logic.

@cgardens cgardens changed the title from "debezium wip" to "debezium feature branch" Mar 25, 2021
@cgardens cgardens changed the title from "debezium feature branch" to "pg cdc feature branch" Mar 25, 2021
@cgardens cgardens marked this pull request as ready for review April 7, 2021 23:24
jrhizor and others added 9 commits April 7, 2021 16:38
* cdc docs

* Update docs/integrations/sources/postgres.md

Co-authored-by: Charles <giardina.charles@gmail.com>

* address gcp

* learn too english

* add link

* add more disk space warnings

* add additional cdc use case

* add information on how to find postgresql.conf

* add how to find the file

Co-authored-by: Charles <giardina.charles@gmail.com>
* postgres cdc race condition

* working? but different process

* add additional logging to help debug in the future

* everything done except working config

* remove unintended change
* add oneof configuration for cdc postgres

* fmt

Co-authored-by: Charles <giardina.charles@gmail.com>
@cgardens
Contributor

cgardens commented Apr 9, 2021

/test connector=source-postgres

🕑 source-postgres https://github.com/airbytehq/airbyte/actions/runs/734377112
✅ source-postgres https://github.com/airbytehq/airbyte/actions/runs/734377112

@jrhizor
Contributor Author

jrhizor commented Apr 9, 2021

For manual testing I ran:

docker run --rm --name source -v $(pwd)/resources/postgresql.conf:/etc/postgresql/postgresql.conf -e POSTGRES_PASSWORD=password -p 2000:5432 -d postgres -c 'config_file=/etc/postgresql/postgresql.conf'

I ran this from ~/code/airbyte/airbyte-integrations/connectors/source-postgres/src/main, so that $(pwd)/resources/postgresql.conf resolves to the connector's config file.

Then I did:

→ docker exec -it source psql -U postgres
psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

postgres=# CREATE TABLE cars(id INTEGER, name VARCHAR(200), PRIMARY KEY (id));
CREATE TABLE
postgres=# INSERT INTO cars VALUES (0, 'mazda');
INSERT 0 1
postgres=# INSERT INTO cars VALUES (1, 'ferrari');
INSERT 0 1
postgres=# SELECT pg_create_logical_replication_slot('slot1', 'pgoutput');
 pg_create_logical_replication_slot
------------------------------------
 (slot1,0/15E4058)
(1 row)

postgres=# CREATE PUBLICATION pub1 FOR ALL TABLES;
CREATE PUBLICATION
postgres=# INSERT INTO cars VALUES (3, 'tesla');
INSERT 0 1
postgres=# INSERT INTO cars VALUES (4, 'hotwheels');
INSERT 0 1
postgres=# DELETE FROM cars WHERE name = 'hotwheels';
DELETE 1
postgres=#
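
The slot's starting position in the session above, 0/15E4058, is in PostgreSQL's textual pg_lsn form: two hexadecimal numbers separated by a slash, holding the high and low 32 bits of a 64-bit WAL position. Below is a minimal sketch of converting between that form and a plain integer. The helper names are hypothetical and are not part of the connector. The _ab_cdc_lsn values in the sync output look like integers of this kind, but that is inferred from the numbers, not confirmed in this thread.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual pg_lsn such as '0/15E4058' to a 64-bit integer."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def int_to_lsn(value: int) -> str:
    """Convert a 64-bit integer back to the textual 'HIGH/LOW' hex form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


if __name__ == "__main__":
    # Slot start position reported by pg_create_logical_replication_slot above.
    print(lsn_to_int("0/15E4058"))  # 22954072
    # First _ab_cdc_lsn value from the sync output, rendered back as text.
    print(int_to_lsn(22954984))
```

Under this reading, every _ab_cdc_lsn in the output is at or after the slot's starting position, which is what one would expect for changes read through that slot.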

I ran three syncs: one after creating the publication, one after inserting two more rows, and one after the delete.

Final output:

# jrhizor in /tmp/airbyte_local/file [14:26:46]
→ cat _airbyte_raw_public_cars.jsonl
{"_airbyte_ab_id":"099f4399-df2f-4f84-ba31-0f1978c04434","_airbyte_emitted_at":1618003545939,"_airbyte_data":{"id":0,"name":"mazda","_ab_cdc_updated_at":1618003547642,"_ab_cdc_lsn":22954984,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"84e056ce-ed41-4fd5-888c-0d04762b9d7a","_airbyte_emitted_at":1618003545939,"_airbyte_data":{"id":1,"name":"ferrari","_ab_cdc_updated_at":1618003547655,"_ab_cdc_lsn":22954984,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"aad4d213-5490-42e4-abf6-51933a7d2cb9","_airbyte_emitted_at":1618003589691,"_airbyte_data":{"id":3,"name":"tesla","_ab_cdc_updated_at":1618003573423,"_ab_cdc_lsn":22955080,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"853e2385-8835-4fd0-b568-c54609e6b550","_airbyte_emitted_at":1618003589691,"_airbyte_data":{"id":4,"name":"hotwheels","_ab_cdc_updated_at":1618003581506,"_ab_cdc_lsn":22955576,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"cf11e9b7-0277-4fa0-9162-b314ed7b8e54","_airbyte_emitted_at":1618003634540,"_airbyte_data":{"id":4,"name":null,"_ab_cdc_updated_at":1618003625921,"_ab_cdc_lsn":22955816,"_ab_cdc_deleted_at":1618003625921}}
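
The raw output above is an append-only change log: the delete of id 4 arrives as a new record with _ab_cdc_deleted_at set, not as a removal of the earlier insert. As a rough sketch of how a consumer might fold such a log into the current table contents, the snippet below replays records in _ab_cdc_lsn order and drops rows whose latest record is a delete. The function name and the choice of "id" as the key are assumptions for illustration; the sample lines are the _airbyte_data payloads from the output above with the other fields trimmed.

```python
import json

# _airbyte_data payloads copied from the sync output above (other fields trimmed).
SAMPLE = [
    '{"_airbyte_data":{"id":0,"name":"mazda","_ab_cdc_updated_at":1618003547642,"_ab_cdc_lsn":22954984,"_ab_cdc_deleted_at":null}}',
    '{"_airbyte_data":{"id":1,"name":"ferrari","_ab_cdc_updated_at":1618003547655,"_ab_cdc_lsn":22954984,"_ab_cdc_deleted_at":null}}',
    '{"_airbyte_data":{"id":3,"name":"tesla","_ab_cdc_updated_at":1618003573423,"_ab_cdc_lsn":22955080,"_ab_cdc_deleted_at":null}}',
    '{"_airbyte_data":{"id":4,"name":"hotwheels","_ab_cdc_updated_at":1618003581506,"_ab_cdc_lsn":22955576,"_ab_cdc_deleted_at":null}}',
    '{"_airbyte_data":{"id":4,"name":null,"_ab_cdc_updated_at":1618003625921,"_ab_cdc_lsn":22955816,"_ab_cdc_deleted_at":1618003625921}}',
]


def current_rows(jsonl_lines, key="id"):
    """Replay CDC records in LSN order and return the surviving rows by key."""
    records = [json.loads(line)["_airbyte_data"] for line in jsonl_lines if line.strip()]
    # list.sort is stable, so records sharing an LSN keep their file order.
    records.sort(key=lambda record: record["_ab_cdc_lsn"])
    rows = {}
    for record in records:
        if record["_ab_cdc_deleted_at"] is not None:
            # A delete marker removes whatever version of the row we had.
            rows.pop(record[key], None)
        else:
            # Keep only the table columns, not the CDC metadata.
            rows[record[key]] = {
                k: v for k, v in record.items() if not k.startswith("_ab_cdc_")
            }
    return rows


if __name__ == "__main__":
    for row in current_rows(SAMPLE).values():
        print(row)
```

For the sample, this leaves the mazda, ferrari, and tesla rows, matching the table contents after the DELETE in the psql session.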

@cgardens
Contributor

cgardens commented Apr 9, 2021

Output of my test:

cdc

➜  airbyte git:(jrhizor/debezium) cat /tmp/airbyte_local/json_data/cdc_test/_airbyte_raw_cdc_public_cars.jsonl
{"_airbyte_ab_id":"3c00aed2-03c8-4c7e-a3a3-bb6e3f9f013c","_airbyte_emitted_at":1618004801536,"_airbyte_data":{"id":0,"name":"mazda","_ab_cdc_updated_at":1618004802907,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"c3574d33-f5f9-4f99-8e0e-5dea9a401a60","_airbyte_emitted_at":1618004801536,"_airbyte_data":{"id":1,"name":"ferrari","_ab_cdc_updated_at":1618004802921,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}
➜  airbyte git:(jrhizor/debezium) cat /tmp/airbyte_local/json_data/cdc_test/_airbyte_raw_cdc_public_plays.jsonl
{"_airbyte_ab_id":"24630352-7d90-448a-acda-3971686f5eac","_airbyte_emitted_at":1618004801536,"_airbyte_data":{"id":1,"name":"victory","_ab_cdc_updated_at":1618004802942,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"84599002-372e-473e-abb7-badba57a4558","_airbyte_emitted_at":1618004801536,"_airbyte_data":{"id":3,"name":"the art of success","_ab_cdc_updated_at":1618004802942,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"371dac39-f10d-4216-af2c-ad161fedc9a8","_airbyte_emitted_at":1618004801536,"_airbyte_data":{"id":2,"name":"much ado about nothing","_ab_cdc_updated_at":1618004802943,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}
➜  airbyte git:(jrhizor/debezium)

conventional incremental

➜  airbyte git:(jrhizor/debezium) cat /tmp/airbyte_local/json_data/cdc_test/_airbyte_raw_cdc_public_plays.jsonl
{"_airbyte_ab_id":"a3631c00-fac5-4f15-839d-8f9d76a59f97","_airbyte_emitted_at":1618004465233,"_airbyte_data":{"id":1,"name":"victory","_ab_cdc_updated_at":1618004466612,"_ab_cdc_lsn":267773585880,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"1f2e656f-68d0-49c2-9386-a89b3fb6f6c9","_airbyte_emitted_at":1618004465233,"_airbyte_data":{"id":3,"name":"the art of success","_ab_cdc_updated_at":1618004466613,"_ab_cdc_lsn":267773585880,"_ab_cdc_deleted_at":null}}
{"_airbyte_ab_id":"0aeb4c8f-b88e-42e2-a7c5-9475f9422fd3","_airbyte_emitted_at":1618004465233,"_airbyte_data":{"id":2,"name":"much ado about nothing","_ab_cdc_updated_at":1618004466613,"_ab_cdc_lsn":267773585880,"_ab_cdc_deleted_at":null}}
➜  airbyte git:(jrhizor/debezium) cat /tmp/airbyte_local/json_data/cdc_test/_airbyte_raw_conventional_public_cars.jsonl
{"_airbyte_ab_id":"d5656811-ae3c-4e55-aff5-28fc3ca49495","_airbyte_emitted_at":1618004780885,"_airbyte_data":{"id":0,"name":"mazda"}}
{"_airbyte_ab_id":"de742e13-4d34-40b2-960d-bbd02d8b1358","_airbyte_emitted_at":1618004780885,"_airbyte_data":{"id":1,"name":"ferrari"}}
➜  airbyte git:(jrhizor/debezium) cat /tmp/airbyte_local/json_data/cdc_test/_airbyte_raw_conventional_public_plays.jsonl
{"_airbyte_ab_id":"24ab4a31-b9e4-468a-9e0f-c403d89c5ee0","_airbyte_emitted_at":1618004780885,"_airbyte_data":{"id":1,"name":"victory"}}
{"_airbyte_ab_id":"61f187bd-7c45-4040-9914-be60d924e1d7","_airbyte_emitted_at":1618004780885,"_airbyte_data":{"id":3,"name":"the art of success"}}
{"_airbyte_ab_id":"1d5c3bbd-5ac4-4e78-9661-45efece7316f","_airbyte_emitted_at":1618004780885,"_airbyte_data":{"id":2,"name":"much ado about nothing"}}
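
The two outputs above differ only in the three _ab_cdc_* metadata fields that the CDC records carry. A small sketch of checking that mechanically follows: strip the metadata from the CDC payloads and compare them with the conventional payloads, ignoring record order. The function names are hypothetical, and the sample lines are the cars payloads from the outputs above with the other fields trimmed.

```python
import json

# Cars payloads copied from the CDC and conventional outputs above (other fields trimmed).
CDC_CARS = [
    '{"_airbyte_data":{"id":0,"name":"mazda","_ab_cdc_updated_at":1618004802907,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}',
    '{"_airbyte_data":{"id":1,"name":"ferrari","_ab_cdc_updated_at":1618004802921,"_ab_cdc_lsn":267773586200,"_ab_cdc_deleted_at":null}}',
]
CONVENTIONAL_CARS = [
    '{"_airbyte_data":{"id":0,"name":"mazda"}}',
    '{"_airbyte_data":{"id":1,"name":"ferrari"}}',
]


def payloads(jsonl_lines, drop_prefix="_ab_cdc_"):
    """Return the row payloads without CDC metadata, in a canonical order."""
    rows = []
    for line in jsonl_lines:
        if not line.strip():
            continue
        data = json.loads(line)["_airbyte_data"]
        rows.append({k: v for k, v in data.items() if not k.startswith(drop_prefix)})
    # Sort on a canonical JSON rendering so file order does not matter.
    return sorted(rows, key=lambda row: json.dumps(row, sort_keys=True))


def same_rows(cdc_lines, conventional_lines):
    """True when both outputs contain the same rows once CDC metadata is removed."""
    return payloads(cdc_lines) == payloads(conventional_lines)


if __name__ == "__main__":
    print(same_rows(CDC_CARS, CONVENTIONAL_CARS))
```

This comparison only makes sense for a first sync with no deletes, as in the test above; once delete records appear, the CDC file has extra lines that the conventional file does not.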

@cgardens
Contributor

cgardens commented Apr 9, 2021

/publish connector=source-postgres

❌ source-postgres https://github.com/airbytehq/airbyte/actions/runs/734471323

@cgardens
Contributor

cgardens commented Apr 9, 2021

/publish connector=source-postgres

❌ source-postgres https://github.com/airbytehq/airbyte/actions/runs/734477157

* add docs on creating replica identities

* emphasize danger

* grammar
@cgardens
Contributor

cgardens commented Apr 9, 2021

/test connector=source-postgres

🕑 source-postgres https://github.com/airbytehq/airbyte/actions/runs/734542603
✅ source-postgres https://github.com/airbytehq/airbyte/actions/runs/734542603

@cgardens
Contributor

cgardens commented Apr 9, 2021

/publish connector=source-postgres

❌ source-postgres https://github.com/airbytehq/airbyte/actions/runs/734558600

@cgardens
Contributor

cgardens commented Apr 9, 2021

/publish connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/734578944
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/734578944

@jrhizor jrhizor merged commit 2b19da8 into master Apr 9, 2021
@jrhizor jrhizor deleted the jrhizor/debezium branch April 9, 2021 23:36