Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Don't cast all types to string for CSV #498

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

candiduslynx
Copy link
Contributor

@candiduslynx candiduslynx commented May 9, 2024

Fixes https://github.com/cloudquery/cloudquery-issues/issues/1644

Requires apache/arrow#41595 & bump to arrow v16.1.0.

Diff in files generated:

  1. date32 & date64: 2021-01-02 00:00:00 is now 2021-01-02 00:00:00Z (so this can also be considered a follow-up for fix(deps): Upgrade github.com/apache/arrow/go to v16 #494)
  2. Lists are marshaled as lists instead of strings, so it required a deep conversion for proper support (ValueStr ignores null value set to "")

@candiduslynx candiduslynx requested a review from a team as a code owner May 9, 2024 05:30
@candiduslynx candiduslynx requested review from maaarcelino and disq and removed request for a team and maaarcelino May 9, 2024 05:30
@github-actions github-actions bot added fix and removed fix labels May 9, 2024
@github-actions github-actions bot added fix and removed fix labels May 9, 2024
return dt
}
switch dt := dt.(type) {
case *arrow.MapType:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maps aren't supported

}
switch dt := dt.(type) {
case *arrow.MapType:
case *arrow.FixedSizeListType:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lists are converted as list(converted elem)

@github-actions github-actions bot added fix and removed fix labels May 9, 2024
@erezrokah
Copy link
Contributor

erezrokah commented May 9, 2024

Can we wait for someone to complain about the CSV format before fixing this? We recently did a breaking change in all our destinations that are using this module due to #494 (which was justified as it fixed some stuff in Parquet, JSON and CSV for timestamps). Not sure we should do another one only because of a CSV issue nobody reported on.

@candiduslynx candiduslynx marked this pull request as draft May 9, 2024 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants