feat: Introduce MySQL Connector #1829

Merged
merged 1 commit into from
Aug 17, 2023

Conversation

abcpro1
Contributor

@abcpro1 abcpro1 commented Aug 7, 2023

Define and implement a MySQL connector for Dozer.

It operates similarly to the Postgres connector, especially when it comes to listing tables, columns, and schemas.

For replication, MySQL CDC is used to replay database changes in Dozer, after initially copying the table contents with a simple SELECT query.
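As a rough illustration of the two phases described above (an initial copy, then replaying changes), here is a minimal in-memory sketch. It is not the connector's actual code: `Row`, `ChangeEvent`, `snapshot`, and `replay` are hypothetical names, and the event shapes are only loosely modeled on row-based change events.

```rust
use std::collections::BTreeMap;

/// A row keyed by primary key; a simplified, hypothetical stand-in for a table row.
type Row = (u64, String);

/// Hypothetical change events, loosely modeled on row-based CDC events.
#[derive(Debug, Clone, PartialEq)]
enum ChangeEvent {
    Insert(Row),
    Update { old: Row, new: Row },
    Delete(Row),
}

/// Phase 1: copy the current table contents (the result of a plain SELECT).
fn snapshot(rows: &[Row]) -> BTreeMap<u64, String> {
    rows.iter().cloned().collect()
}

/// Phase 2: apply change events, in order, on top of the snapshot.
fn replay(state: &mut BTreeMap<u64, String>, events: &[ChangeEvent]) {
    for event in events {
        match event {
            ChangeEvent::Insert((id, value)) => {
                state.insert(*id, value.clone());
            }
            ChangeEvent::Update { old, new } => {
                // The primary key may change, so remove the old row first.
                state.remove(&old.0);
                state.insert(new.0, new.1.clone());
            }
            ChangeEvent::Delete((id, _)) => {
                state.remove(id);
            }
        }
    }
}
```

Applying the events strictly in order on top of the snapshot is what keeps the replica consistent with the source in this simplified model.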

The connector implementation is essentially complete. The sole exception is partial updates to JSON fields, a MySQL feature that, as far as I can tell, Dozer does not yet support.

closes #1670
/claim #1670

@algora-pbc algora-pbc bot mentioned this pull request Aug 7, 2023
@abcpro1 abcpro1 force-pushed the mysql branch 2 times, most recently from 3da019b to aef9633 Compare August 9, 2023 22:35
@snork-alt
Contributor

Thanks for your contribution @abcpro1. Waiting for @chubei to review so we can merge and pay out the bounty. @abcpro1 would you be up for a call? You can drop me a note at matteo@getdozer.io.

@chubei chubei self-requested a review August 10, 2023 12:59
@chubei
Contributor

chubei commented Aug 10, 2023

Hi @abcpro1 thank you for your work! The implementation looks amazing and I can't wait to merge it.

I hope you can add several more things to this PR. Please let me know what you think.

  1. Implement MySqlConnector::types_mapping.

This method is mostly for documentation purposes, so we can quickly see which types are supported instead of digging into the code.

  2. Add an integration test in dozer-ingestion/tests.

I see that you've added an end-to-end test in dozer-tests and really appreciate that. It's not documented, but we also have an integration test suite in dozer-ingestion/tests specifically designed for testing the connectors (while dozer-tests tests all of Dozer end to end).

Adding a new connector to the integration test suite is very easy. You implement several traits that do the database setup, and add a test that calls the test suite on your connector.

You can take the PostgresConnectorTest as a reference.

  3. Unit tests where necessary.

I don't see any unit tests yet (or did I miss them?). I hope the core logic can be covered by unit tests.
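For readers unfamiliar with the pattern mentioned in the second item (traits that set up the database, plus a shared suite that runs against any connector), a minimal sketch might look like the following. The trait `ConnectorTest`, the function `run_suite`, and the `InMemoryTest` stand-in are all hypothetical names chosen for illustration; the real traits in dozer-ingestion/tests are not shown in this conversation and likely differ.

```rust
use std::collections::HashMap;

/// Hypothetical trait: prepares the external database for a test run
/// and exposes what the connector under test ingested.
trait ConnectorTest {
    /// Creates the table with the given rows and returns the rows inserted.
    fn setup(&mut self, table: &str, rows: Vec<String>) -> Vec<String>;
    /// Returns what the connector under test ingested from the table.
    fn ingest(&self, table: &str) -> Vec<String>;
}

/// Shared suite: runs against any connector that implements the trait.
/// Returns true when the ingested rows match the rows that were set up.
fn run_suite<T: ConnectorTest>(test: &mut T) -> bool {
    let expected = test.setup("users", vec!["alice".into(), "bob".into()]);
    test.ingest("users") == expected
}

/// In-memory stand-in for a real database, used only in this sketch.
#[derive(Default)]
struct InMemoryTest {
    tables: HashMap<String, Vec<String>>,
}

impl ConnectorTest for InMemoryTest {
    fn setup(&mut self, table: &str, rows: Vec<String>) -> Vec<String> {
        self.tables.insert(table.to_string(), rows.clone());
        rows
    }

    fn ingest(&self, table: &str) -> Vec<String> {
        // An unknown table yields no rows.
        self.tables.get(table).cloned().unwrap_or_default()
    }
}
```

The appeal of this design is that each new connector only supplies the setup code, while the assertions live once in the shared suite.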

@abcpro1
Contributor Author

abcpro1 commented Aug 10, 2023

@chubei Thank you for reviewing this pull request. I'm glad you like it. I will add comprehensive tests as you suggested. As for MySqlConnector::types_mapping, I have a question: does "external type" here mean MySQL type?

@chubei
Contributor

chubei commented Aug 11, 2023

does "external type" here mean MySQL type?

Yes.
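To make the idea concrete, a types mapping of this kind could be sketched as a simple list pairing each external (MySQL) type name with an internal type. Everything below is an assumption for illustration: `FieldKind`, the signature of `types_mapping`, the `lookup` helper, and the particular entries are hypothetical and do not reflect the mapping implemented in this PR.

```rust
/// Hypothetical internal field kinds; Dozer's real type enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FieldKind {
    Int,
    UInt,
    Float,
    Text,
    Binary,
    Json,
    Timestamp,
}

/// Lists each external (MySQL) type name with the internal kind it maps to,
/// or `None` when the type is treated as unsupported in this sketch.
fn types_mapping() -> Vec<(&'static str, Option<FieldKind>)> {
    vec![
        ("int", Some(FieldKind::Int)),
        ("int unsigned", Some(FieldKind::UInt)),
        ("double", Some(FieldKind::Float)),
        ("varchar", Some(FieldKind::Text)),
        ("blob", Some(FieldKind::Binary)),
        ("json", Some(FieldKind::Json)),
        ("datetime", Some(FieldKind::Timestamp)),
        ("geometry", None), // illustrative only, not a statement about the PR
    ]
}

/// Looks up a MySQL type name case-insensitively.
fn lookup(mysql_type: &str) -> Option<FieldKind> {
    types_mapping()
        .into_iter()
        .find(|(name, _)| name.eq_ignore_ascii_case(mysql_type))
        .and_then(|(_, kind)| kind)
}
```

Keeping the mapping as plain data makes it easy to read as documentation, which matches the stated purpose of the method.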

@snork-alt
Contributor

@abcpro1 it would be great if you could also document all the config parameters of the connector in the PR, and maybe add a small sample using MySQL to dozer-samples. Thanks.

@abcpro1
Contributor Author

abcpro1 commented Aug 11, 2023

@snork-alt Sure.

@chubei chubei linked an issue Aug 15, 2023 that may be closed by this pull request
Define and implement a MySQL connector for Dozer. This connector operates similarly to the Postgres connector.

The replication process starts by replicating rows from MySQL tables using a SELECT query.
The next step is using the MySQL CDC protocol (binary log events) for watching live changes to the tables and replicating them in Dozer.

All connector requirements are implemented, including listing tables, columns, schemas, and replication.

The implementation respects the column and table selection such that only the requested columns
are included in the generated schemas and in the replication process; the rest of the columns are ignored.
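The column selection behaviour described above can be sketched as computing the indices of the requested columns once, then projecting every replicated row through them. This is a simplified illustration, not the connector's code; `selected_indices` and `project` are hypothetical helpers, and rows are modeled as plain strings.

```rust
/// Returns the positions of the requested columns, in table column order.
/// Requested names that do not exist in the table are simply skipped.
fn selected_indices(table_columns: &[&str], requested: &[&str]) -> Vec<usize> {
    let mut indices = Vec::new();
    for (index, name) in table_columns.iter().enumerate() {
        if requested.contains(name) {
            indices.push(index);
        }
    }
    indices
}

/// Keeps only the selected values of a row; all other columns are ignored.
fn project(row: &[String], indices: &[usize]) -> Vec<String> {
    indices.iter().map(|&i| row[i].clone()).collect()
}
```

Using the same index list for both schema generation and row replication is one way to keep the two consistent.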
@chubei chubei added this pull request to the merge queue Aug 17, 2023
Merged via the queue into getdozer:main with commit 2dcb6e0 Aug 17, 2023
5 checks passed
@abcpro1 abcpro1 deleted the mysql branch August 27, 2023 22:21
Development

Successfully merging this pull request may close these issues.

MySQL Connector
3 participants