Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PECO-1297] sqlalchemy: fix: can't read columns for tables containing a TIMESTAMP_NTZ column #296

Merged
merged 4 commits into from
Nov 30, 2023

Conversation

susodapop
Copy link
Contributor

@susodapop susodapop commented Nov 30, 2023

Description

Our sqlalchemy dialect contains a method that parses the contents of a Thrift TGetColumnsResponse and maps column types from Databricks → sqlalchemy types. This uses a dictionary, which prior to this PR didn't include an entry for TIMESTAMP_NTZ types. This was not caught by sqlalchemy's reusable dialect compliance tests because TIMESTAMP_NTZ is specific to Databricks and was therefore not covered. For this PR, I added an e2e test to capture this and wrote the fix to make the test pass.

This change only affects the sqlalchemy codepath. I reran the ComponentReflectionTest portion of the sqlalchemy dialect compliance test suite (which is the only test suite that uses the dialect's get_columns method). All tests pass.

All non sqlalchemy e2e tests pass as well.

Related Tickets and Documents

Closes #295

        _raw_col_type = re.search(pat, thrift_resp_row.TYPE_NAME).group(0).lower()  # type: ignore
>       _col_type = GET_COLUMNS_TYPE_MAP[_raw_col_type]
E       KeyError: 'timestamp_ntz'

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Closes issue #295

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
@susodapop susodapop merged commit 2a68448 into main Nov 30, 2023
12 checks passed
@susodapop susodapop deleted the peco-1297 branch November 30, 2023 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TIMESTAMP_NTZ datatype used by DataFrame.to_sql but fails on read_sql_table
2 participants