🎉 Source MySQL: support all MySQL 8.0 types #7970

Merged
tuliren merged 17 commits into master from liren/support-all-mysql-types on Dec 12, 2021

Conversation

Contributor

@tuliren tuliren commented Nov 15, 2021

What

How

  • The current AbstractJdbcSource is changed to a generic class.
  • A new JdbcCompatibleSourceOperations interface is added that extends SourceOperations. It declares the additional methods that AbstractJdbcSource needs.
    • All the type-related methods are moved from the source class to this interface.
  • The current JdbcSourceOperations is split into an abstract base class and two concrete subclasses:
    • AbstractJdbcCompatibleSourceOperations holds most of the original logic, now written against a generic type.
    • JdbcSourceOperations extends AbstractJdbcCompatibleSourceOperations with JDBCType.
    • MySqlSourceOperations extends AbstractJdbcCompatibleSourceOperations with MysqlType.
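
As a rough illustration of the layering described in this list, here is a minimal sketch. All `Demo*` names and method signatures are hypothetical and are not the PR's actual API. `DemoMysqlType` is a stand-in for the MySQL driver's `MysqlType` enum so the sketch compiles without the driver on the classpath.

```java
import java.sql.JDBCType;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the driver's MysqlType enum (assumption, not the real enum).
enum DemoMysqlType { TINYINT, YEAR, VARCHAR }

// Plays the role of JdbcCompatibleSourceOperations: type-related methods,
// parameterized by the database's own type enum. Method names are invented.
interface DemoOperations<T> {
  T parseType(String rawTypeName);

  String jsonType(T fieldType);
}

// Plays the role of AbstractJdbcCompatibleSourceOperations: logic shared by every type enum.
abstract class AbstractDemoOperations<T> implements DemoOperations<T> {

  // Turns (column name -> raw type name) into (column name -> JSON schema type).
  public Map<String, String> describe(final Map<String, String> rawColumnTypes) {
    final Map<String, String> result = new LinkedHashMap<>();
    for (final Map.Entry<String, String> entry : rawColumnTypes.entrySet()) {
      result.put(entry.getKey(), jsonType(parseType(entry.getValue())));
    }
    return result;
  }
}

// Plays the role of JdbcSourceOperations: the type parameter is java.sql.JDBCType.
class DemoJdbcOperations extends AbstractDemoOperations<JDBCType> {

  @Override
  public JDBCType parseType(final String rawTypeName) {
    return JDBCType.valueOf(rawTypeName.toUpperCase(Locale.ROOT));
  }

  @Override
  public String jsonType(final JDBCType fieldType) {
    switch (fieldType) {
      case TINYINT:
      case SMALLINT:
      case INTEGER:
      case BIGINT:
        return "number";
      default:
        return "string";
    }
  }
}

// Plays the role of MySqlSourceOperations: the type parameter is the MySQL-specific enum,
// which can express types (such as YEAR) that JDBCType has no constant for.
class DemoMySqlOperations extends AbstractDemoOperations<DemoMysqlType> {

  @Override
  public DemoMysqlType parseType(final String rawTypeName) {
    return DemoMysqlType.valueOf(rawTypeName.toUpperCase(Locale.ROOT));
  }

  @Override
  public String jsonType(final DemoMysqlType fieldType) {
    switch (fieldType) {
      case TINYINT:
      case YEAR:
        return "number";
      default:
        return "string";
    }
  }
}
```

The shared `describe` method is written once against `T`, so each subclass only supplies the type lookup and the type-to-JSON mapping.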

TODOs

  • Add more unit tests.
  • Handle complex MySQL types (e.g. Point).
  • Use MysqlType in all methods.

Recommended reading order

  1. MySqlSourceOperations.java
  2. The rest.

Pre-Merge Checklist - Updating Connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow the instructions in the README. For Java connectors, run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

@github-actions github-actions bot added the area/connectors Connector related issues label Nov 15, 2021
Contributor

@sherifnada sherifnada left a comment

I like it! Great direction

@tuliren tuliren force-pushed the liren/support-all-mysql-types branch from cf5e44f to 3f675b2 Compare November 29, 2021 10:52
@tuliren tuliren force-pushed the liren/support-all-mysql-types branch from 809be96 to 55a5499 Compare November 29, 2021 11:12
@tuliren tuliren marked this pull request as ready for review November 30, 2021 22:53
Contributor Author

tuliren commented Dec 2, 2021

@sherifnada, @etsybaev, @DoNotPanicUA, this PR is ready for review. Would you mind taking a look?

Contributor

@etsybaev etsybaev left a comment

This looks quite impressive!

* fail it, so it is turned off by default. It should be enabled for all databases eventually.
*/
protected boolean testCatalog() {
return false;
Contributor

Is that expected to be committed or just accidentally forgotten?

Contributor

same comment

Contributor Author

This is expected. See the comment for the rationale. We need to check each database before we can turn it on by default.
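
The opt-in hook discussed in this thread can be sketched roughly as follows. Only the `testCatalog()` method comes from the snippet above. The surrounding classes and the `runChecks` method are hypothetical and exist only to make the default-off behavior observable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base test class; only testCatalog() mirrors the snippet under review.
abstract class DemoDatatypeTest {

  // Off by default. A database-specific test overrides this once its catalog check passes.
  protected boolean testCatalog() {
    return false;
  }

  // Returns the names of the checks that ran (invented for this sketch).
  public List<String> runChecks() {
    final List<String> ran = new ArrayList<>();
    ran.add("data");
    if (testCatalog()) {
      ran.add("catalog");
    }
    return ran;
  }
}

// A database that has not been verified yet keeps the default.
class DemoDefaultTest extends DemoDatatypeTest {}

// A database whose catalog has been verified opts in.
class DemoOptedInTest extends DemoDatatypeTest {

  @Override
  protected boolean testCatalog() {
    return true;
  }
}
```

With this shape, flipping the default to `true` later only requires that every subclass has been checked first, which matches the rationale given in the reply.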

* fail it, so it is turned off by default. It should be enabled for all databases eventually.
*/
protected boolean testCatalog() {
return false;
Contributor

same comment

@@ -97,9 +97,4 @@ public static void main(final String[] args) throws Exception {
LOGGER.info("completed source: {}", CockroachDbSource.class);
}

@Override
protected JdbcSourceOperations getSourceOperations() {
Contributor

how come this was removed? is the class still useful?

Contributor Author

This method is unnecessary. The sourceOperations is an instance variable, and it is assigned in the constructor. The AbstractJdbcSource can just reference that variable.
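
A minimal sketch of the pattern described in this reply, assuming a simplified shape for the classes. `DemoOps`, `DemoAbstractSource`, and `DemoConcreteSource` are hypothetical names, not the connector's real classes.

```java
// Hypothetical operations interface with a single method, for illustration only.
interface DemoOps {
  String name();
}

// The base class receives the operations object once, in its constructor,
// and reads the field directly instead of calling an overridable getter.
abstract class DemoAbstractSource {

  protected final DemoOps sourceOperations;

  protected DemoAbstractSource(final DemoOps sourceOperations) {
    this.sourceOperations = sourceOperations;
  }

  public String describeOperations() {
    return "using " + sourceOperations.name();
  }
}

// The subclass passes its operations to super(...); no getSourceOperations() override is needed.
class DemoConcreteSource extends DemoAbstractSource {

  DemoConcreteSource() {
    super(() -> "demo-ops");
  }
}
```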

case TIME -> putTime(json, columnName, resultSet, colIndex);
// The returned year value can either be a java.lang.Short (when yearIsDateType=false)
// or a java.sql.Date with the date set to January 1st, at midnight (when yearIsDateType=true).
// Currently, JsonSchemaPrimitive does not support integer, but only supports number.
Contributor

seems like fixing up jsonSchemaPrimitive is high up on the list of things to fix also?

Contributor Author

Added an issue: #8722
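
For illustration, the two return shapes mentioned in the code comment above (a short, or a `java.sql.Date` set to January 1st) could be normalized to a plain year number along these lines. `YearValues.toYear` is a hypothetical helper, not the method used in the PR.

```java
import java.sql.Date;

// Hypothetical helper: normalizes a YEAR column value to an int,
// whichever of the two representations the driver returned.
final class YearValues {

  private YearValues() {}

  static int toYear(final Object value) {
    if (value instanceof Date) {
      // yearIsDateType=true: the year is carried by a date set to January 1st.
      return ((Date) value).toLocalDate().getYear();
    }
    if (value instanceof Number) {
      // yearIsDateType=false: the year arrives as a numeric value.
      return ((Number) value).intValue();
    }
    throw new IllegalArgumentException("Unsupported YEAR value: " + String.valueOf(value));
  }
}
```

Either way the result is a number, which fits the constraint that JsonSchemaPrimitive supports number but not integer.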

* does not contain information about its charset, which is needed to determine whether the column
* is a string or binary. We don't distinguish between string vs binary in
* {@link JsonSchemaPrimitive} for now. So it is fine. However, in the future, if we want to
* separate these two types, we need to update the column metadata query for MySQL. See
Contributor

should we have an issue for this?

Contributor Author

Added an issue: #8723

Contributor Author

Actually JDBC does return BINARY for CHAR columns with binary character set. Same for VARCHAR. So this is a false alarm.

.fullSourceDataType("varchar(256) character set cp1251")
.addInsertValues("null", "'тест'")
.addExpectedValues(null, "тест")
// MySQL converts values in the ranges '0' - '69' to YEAR value in the range 2000 - 2069
Contributor

what on earth...
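
For readers equally surprised by the test comment, here is a small sketch of the two-digit YEAR rule as MySQL documents it: strings '0' to '69' become 2000 to 2069, strings '70' to '99' become 1970 to 1999, and the number 0 (unlike the string '0') is stored as 0000. `MysqlYearRule` is a hypothetical helper that models only this mapping. It assumes non-negative input and does not validate the 1901 to 2155 range.

```java
// Hypothetical model of MySQL's documented YEAR conversion for short inputs.
final class MysqlYearRule {

  private MysqlYearRule() {}

  // One- or two-digit strings are expanded; longer strings are taken as-is.
  static int fromString(final String text) {
    final int value = Integer.parseInt(text);
    if (text.length() > 2) {
      return value;
    }
    return value <= 69 ? 2000 + value : 1900 + value;
  }

  // Numbers follow the same windows, except that numeric 0 stays 0000.
  static int fromNumber(final int value) {
    if (value == 0) {
      return 0;
    }
    if (value < 70) {
      return 2000 + value;
    }
    if (value < 100) {
      return 1900 + value;
    }
    return value;
  }
}
```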

Contributor Author

tuliren commented Dec 12, 2021

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568650475
❌ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568650475
🐛 https://gradle.com/s/bcgx7vv2yn5n2

@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Dec 12, 2021
Contributor Author

tuliren commented Dec 12, 2021

/test connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568694629
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568694629
No Python unittests run

Contributor Author

tuliren commented Dec 12, 2021

/publish connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568730855
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1568730855

@tuliren tuliren merged commit 6843bc1 into master Dec 12, 2021
@tuliren tuliren deleted the liren/support-all-mysql-types branch December 12, 2021 05:49
schlattk pushed a commit to schlattk/airbyte that referenced this pull request Jan 4, 2022
* Add jdbc compatible layer

* Support routine mysql types

* Format code

* Fix build

* Refactor abstract jdbc source and operation classes

* Update mysql source operations

* Test discover command for mysql

* Remove abstract jdbc compatible source layer

* Format code

* Update template

* Fix more types

* Bump version

* Log original field type

* Update comments

* Bump version in seed
@rodireich rodireich assigned rodireich and unassigned rodireich Apr 10, 2023
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation
Development

Successfully merging this pull request may close these issues.

Support all types in MySQL source
7 participants