fix: Get all columns with describe table method from RedshiftData-api #3377

Merged


beubeu13220
Contributor

Signed-off-by: Alexandre Brehelin <alexandre.brehelin@needelp.com>

What this PR does / why we need it:
As described in the linked issue, with the current get_table_column_names_and_types method in the Redshift source we can't register data with more than 500 columns.
The push method raises an error because the input DataFrame's columns do not match the columns returned by get_table_column_names_and_types, so the check (set(input_columns) != set(source_columns)) evaluates to true.
By default, the describe table method returns only the first 500 columns of a table. To get all columns, we have to iterate through the paged results.
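
For illustration only, the paging idea can be sketched roughly as below. This is not the code from this PR: `get_all_columns` and `FakeClient` are hypothetical names, and the sketch assumes the response carries a `ColumnList` of entries with `name` and `typeName` keys plus an optional `NextToken`, which is how the Redshift Data API `describe_table` call is understood to behave. In real use the client would be a boto3 `redshift-data` client.

```python
from typing import Any, Dict, List, Optional, Tuple


def get_all_columns(client: Any, **describe_kwargs: Any) -> List[Tuple[str, str]]:
    """Collect (name, type) pairs from every page of describe_table."""
    columns: List[Tuple[str, str]] = []
    next_token: Optional[str] = None
    while True:
        kwargs = dict(describe_kwargs)
        if next_token:
            # Ask for the page that follows the one we just read.
            kwargs["NextToken"] = next_token
        response = client.describe_table(**kwargs)
        columns.extend(
            (col["name"], col["typeName"]) for col in response.get("ColumnList", [])
        )
        next_token = response.get("NextToken")
        if not next_token:
            # No token means the last page has been consumed.
            return columns


class FakeClient:
    """Hypothetical stand-in for the real client; serves fixed pages."""

    def __init__(self, pages: List[List[Dict[str, str]]]) -> None:
        self._pages = pages

    def describe_table(self, **kwargs: Any) -> Dict[str, Any]:
        index = int(kwargs.get("NextToken", 0))
        response: Dict[str, Any] = {"ColumnList": self._pages[index]}
        if index + 1 < len(self._pages):
            response["NextToken"] = str(index + 1)
        return response


pages = [
    [{"name": "id", "typeName": "int4"}],
    [{"name": "ts", "typeName": "timestamp"}],
]
all_columns = get_all_columns(FakeClient(pages), Table="demo")
```

With a single call and no token handling, only the first page (`id` in this toy example) would come back, which is the same failure mode as the 500-column cutoff.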

Which issue(s) this PR fixes:
Fixes #3282

@beubeu13220 beubeu13220 changed the title fix: get all columns with describe table method from RedshiftData-api fix: Get all columns with describe table method from RedshiftData-api Dec 4, 2022
Collaborator

@adchia adchia left a comment


/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adchia, beubeu13220

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@adchia adchia force-pushed the fixRedshiftGetColumnNamesAndTypes branch from b837110 to b7c480a Compare December 14, 2022 21:40
@feast-ci-bot feast-ci-bot removed the lgtm label Dec 14, 2022
@adchia
Collaborator

adchia commented Dec 14, 2022

/lgtm

…names and types

Signed-off-by: Alexandre Brehelin <alexandre.brehelin@needelp.com>
@adchia adchia force-pushed the fixRedshiftGetColumnNamesAndTypes branch from b7c480a to dc7fa49 Compare December 14, 2022 22:44
@feast-ci-bot feast-ci-bot removed the lgtm label Dec 14, 2022
@adchia
Collaborator

adchia commented Dec 14, 2022

/lgtm

@adchia adchia merged commit fd97254 into feast-dev:master Dec 14, 2022
adchia pushed a commit that referenced this pull request Dec 15, 2022
## [0.27.1](v0.27.0...v0.27.1) (2022-12-15)

### Bug Fixes

* Enable registry caching in SQL Registry ([#3395](#3395)) ([2e57376](2e57376))
* Fix bug where SQL registry was incorrectly writing infra config around online stores ([#3394](#3394)) ([6bcf77c](6bcf77c))
* Get all columns with describe table method from RedshiftData-api ([#3377](#3377)) ([fd97254](fd97254))
* ODFV able to handle boolean pandas type ([#3384](#3384)) ([8f242e6](8f242e6))
* Remove PySpark dependency from Snowflake Offline Store ([#3388](#3388)) ([7b160c7](7b160c7))
felixwang9817 pushed a commit that referenced this pull request Jan 3, 2023
# [0.28.0](v0.27.0...v0.28.0) (2023-01-03)

### Bug Fixes

* Apply billing project when infer schema ([#3417](#3417)) ([4f9ad7e](4f9ad7e))
* Assertion condition when value is 0 ([#3401](#3401)) ([98a24a3](98a24a3))
* Enable registry caching in SQL Registry ([#3395](#3395)) ([2e57376](2e57376))
* Fix bug where SQL registry was incorrectly writing infra config around online stores ([#3394](#3394)) ([6bcf77c](6bcf77c))
* Get all columns with describe table method from RedshiftData-api ([#3377](#3377)) ([fd97254](fd97254))
* ODFV able to handle boolean pandas type ([#3384](#3384)) ([8f242e6](8f242e6))
* Remove PySpark dependency from Snowflake Offline Store ([#3388](#3388)) ([7b160c7](7b160c7))
* Specifies timeout in exception polling ([#3398](#3398)) ([c0ca7e4](c0ca7e4))
* Update import logic to remove `pyspark` dependency from Snowflake Offline Store ([#3397](#3397)) ([cf073e6](cf073e6))

### Features

* Add template for Github Codespaces ([#3421](#3421)) ([41c0537](41c0537))
* Adds description attribute for features/fields ([#3425](#3425)) ([26f4881](26f4881))
* Snowflake skip materialization if no table change ([#3404](#3404)) ([0ab3942](0ab3942))
Successfully merging this pull request may close these issues.

Feast push (Redshift/DynamoDb) not work with PushMode.ONLINE_AND_OFFLINE when more than 500 columns
3 participants