File CDK: S3 config adapter (Parquet options) #28135

Closed
2 tasks
clnoll opened this issue Jul 11, 2023 · 4 comments

Comments

Contributor

clnoll commented Jul 11, 2023

The S3 connector offers a columns option for Parquet files. It does not appear to be in use by any connectors, but this should be verified. If any connectors are using it, we should update the file-based CDK to handle this option.

Acceptance Criteria

  • The existing Parquet config options are mapped and handled appropriately by the file-based CDK.
  • Any options that we cannot support are identified, along with the connectors that will be impacted.
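A minimal sketch of how the two criteria above could be checked together: split a legacy Parquet config into options that have a mapping and options that do not. Everything here is an assumption for illustration. `adapt_parquet_options`, `SUPPORTED_OPTION_MAP`, and the placeholder option names are hypothetical and are not taken from the S3 connector or the file-based CDK; only `columns` is named in this issue.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from legacy option names to file-based CDK option names.
# The single entry is a placeholder, not a real connector option.
SUPPORTED_OPTION_MAP: Dict[str, str] = {
    "example_legacy_option": "example_cdk_option",
}


def adapt_parquet_options(
    legacy: Dict[str, Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Return (mapped options, sorted names of options with no mapping)."""
    mapped: Dict[str, Any] = {}
    unsupported: List[str] = []
    for name, value in legacy.items():
        if name in SUPPORTED_OPTION_MAP:
            # Carry the value over under the new option name.
            mapped[SUPPORTED_OPTION_MAP[name]] = value
        else:
            # No known equivalent: report it so impacted connectors can be found.
            unsupported.append(name)
    return mapped, sorted(unsupported)
```

Running this over each stored connector config would yield the list of unsupported options per connector, which is the information the second criterion asks for.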
Contributor Author

clnoll commented Jul 11, 2023

Grooming notes:

  • Verify that column selection would be a good replacement for this.
  • We'll deprecate the columns option (after verifying that nobody on cloud is using it), and if any open-source (OS) users write in, we'll tell them to use column selection.
  • To deprecate: go through the normal procedure (message to cloud users) and post a Slack message for OS users. If this option is set, we'll raise an exception stating that the field is deprecated and that they should use column selection.
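The deprecation check described in the last note could look roughly like the sketch below. It is not the CDK's implementation: `DeprecatedOptionError` and `check_parquet_columns_option` are hypothetical names, the config is assumed to be a plain mapping with a `columns` key, and an empty or missing value is assumed to mean "not set".

```python
from typing import Any, Mapping


class DeprecatedOptionError(ValueError):
    """Raised when a config still uses an option that has been deprecated."""


def check_parquet_columns_option(parquet_format: Mapping[str, Any]) -> None:
    """Raise if the legacy Parquet `columns` option is set.

    Assumption: None, a missing key, or an empty list all count as "not set".
    """
    if parquet_format.get("columns"):
        raise DeprecatedOptionError(
            "The Parquet `columns` option is deprecated; "
            "use column selection instead."
        )
```

Calling the check once during config validation would fail fast, before any file is read, with a message that points the user to column selection.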

maxi297 self-assigned this Aug 8, 2023
Contributor

maxi297 commented Aug 8, 2023

Closing, as only two cloud customers use this option (one has never synced, and the other was a one-time use) and there is a path forward: the customer can use column selection. That path could hurt performance, but we can re-evaluate if the problem arises.

maxi297 closed this as completed Aug 8, 2023
Contributor

maxi297 commented Aug 8, 2023

Commented on the wrong issue. Re-opening.

maxi297 reopened this Aug 8, 2023
Contributor

maxi297 commented Aug 8, 2023

It was the right issue after all, so closing for the reason already given:

Closing, as only two cloud customers use this option (one has never synced, and the other was a one-time use) and there is a path forward: the customer can use column selection. That path could hurt performance, but we can re-evaluate if the problem arises.

maxi297 closed this as completed Aug 8, 2023