Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDBC Sources: Consolidate putDate behavior, and potentially other methods #17068

Closed
edgao opened this issue Sep 22, 2022 · 5 comments · Fixed by #20346
Closed

JDBC Sources: Consolidate putDate behavior, and potentially other methods #17068

edgao opened this issue Sep 22, 2022 · 5 comments · Fixed by #20346
Assignees

Comments

@edgao
Copy link
Contributor

edgao commented Sep 22, 2022

Multiple sources override the default putDate implementation (

protected void putDate(final ObjectNode node, final String columnName, final ResultSet resultSet, final int index) throws SQLException {
node.put(columnName, DataTypeUtils.toISO8601String(resultSet.getDate(index)));
}
), because it's incorrect. Specifically, it produces a full ISO8601 timestamp, when dates should just be YYYY-MM-DD.

Postgres and Mysql currently have the exact same implementation (https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-postgres/src/main/java/io/airbyte/integrations/source/postgres/PostgresSourceOperations.java#L214-L217), and we intend to add that to Snowflake as well. Would be good to just modify the base class and fix this for all JDBC sources.


We should probably also check for other methods in a similar situation, and consolidate them too.

@VitaliiMaltsev
Copy link
Contributor

@edgao i have made a small investigation into DATE datatype in different databases
For the most of them YYYY-MM-DD is a valid pattern for DATE but also we have exceptions like OracleDb
DATE in OracleDb could contain not only year/month/day but also time so for Oracle yyyy-MM-dd'T'HH:mm:ss is a valid pattern

  1. Should we use the common putDate for all jdbc sources with pattern YYYY-MM-DD and override it for OracleDb?
  2. Which is a correct way to handle DATE column if it is a cursor field in incremental sync and all of previously synced records have full ISO8601 timestamp pattern?

@edgao
Copy link
Contributor Author

edgao commented Dec 8, 2022

if an oracle DATE column can include a timestamp... shouldn't we just discover it as a timestamp without timezone?

@VitaliiMaltsev
Copy link
Contributor

VitaliiMaltsev commented Dec 8, 2022

if an oracle DATE column can include a timestamp... shouldn't we just discover it as a timestamp without timezone?

yes, we can do it. Do I understand you correctly that it is necessary to do what I described in paragraph 1?
OracleDB has also separate TIMESTAMP data type which could also include the same values

@edgao
Copy link
Contributor Author

edgao commented Dec 8, 2022

yeah, that sounds right. I.e. most DB sources would continue to discover date fields, and write YYYY-MM-DD values; only source-oracle would discover timestamp_without_timezone and write full timestamp values

for previously-synced data - wouldn't the time component always be 00:00:00? so we can always just ignore it

@VitaliiMaltsev VitaliiMaltsev linked a pull request Dec 14, 2022 that will close this issue
37 tasks
@VitaliiMaltsev
Copy link
Contributor

blocked by #20634

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants