ARROW-12493: Add support for writing dictionary arrays to CSV and JSON #16

Merged (1 commit) on Apr 22, 2021

tustvold
Contributor

Provide support for serializing dictionary arrays to CSV and JSON by hydrating them to their underlying representation. This is not the most efficient approach, but it was the simplest way I could think of to cover all bases.

It may be worthwhile special-casing StringDictionaries with a more efficient implementation in a subsequent PR, as I imagine they're the most common form of DictionaryArray.
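
For readers unfamiliar with the term, "hydrating" here means replacing each dictionary key with the value it points at, so the column can be written like any plain array. The sketch below illustrates that idea only. `DictColumn`, `hydrate`, and `to_csv_lines` are hypothetical names, not the arrow crate's `DictionaryArray` API or the code in this PR, and CSV quoting and escaping are deliberately left out.

```rust
/// Toy stand-in for a dictionary-encoded string column.
/// Hypothetical type for illustration; not the arrow crate's DictionaryArray.
struct DictColumn {
    /// One key per row; `None` marks a null row.
    keys: Vec<Option<usize>>,
    /// The distinct values that the keys index into.
    values: Vec<String>,
}

/// "Hydrate" the column: replace each key with the value it refers to.
/// Panics if a key is out of bounds for `values`.
fn hydrate(col: &DictColumn) -> Vec<Option<&str>> {
    col.keys
        .iter()
        .map(|&key| key.map(|k| col.values[k].as_str()))
        .collect()
}

/// Write the hydrated column as one cell per line, with nulls as empty cells.
/// Assumption: no quoting or escaping, which a real CSV writer would need.
fn to_csv_lines(col: &DictColumn) -> String {
    let mut out = String::new();
    for cell in hydrate(col) {
        out.push_str(cell.unwrap_or(""));
        out.push('\n');
    }
    out
}
```

The cost mentioned above is visible in the sketch: every row is materialized before it is written, even when only a handful of distinct values exist. A string-specific path could instead look each key up at write time and skip the intermediate vector.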

github-actions bot added the "arrow" label (Changes to the arrow crate) on Apr 21, 2021
@codecov-commenter

Codecov Report

Merging #16 (4aaa465) into master (5479e19) will increase coverage by 0.01%.
The diff coverage is 97.29%.

❗ Current head 4aaa465 differs from pull request most recent head 4fa2a1c. Consider uploading reports for the commit 4fa2a1c to get more accurate results

@@            Coverage Diff             @@
##           master      #16      +/-   ##
==========================================
+ Coverage   82.47%   82.48%   +0.01%     
==========================================
  Files         162      162              
  Lines       43414    43447      +33     
==========================================
+ Hits        35806    35838      +32     
- Misses       7608     7609       +1     
Impacted Files Coverage Δ
arrow/src/csv/writer.rs 83.01% <91.66%> (+0.27%) ⬆️
arrow/src/json/writer.rs 91.92% <100.00%> (+0.43%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5479e19...4fa2a1c.

Contributor

@alamb left a comment

Looks good to me. Thank you @tustvold


assert_eq!(
String::from_utf8(buf).unwrap(),
r#"{"c1":"cupcakes","c2":"sdsd"}
Contributor

🧁

@jorgecarleitao
Member

We had to perform a small rewrite of master's history. The commits may look a bit odd, but it should not cause conflicts. Could you kindly rebase this against the latest master to make it easier to review?

Labels
arrow Changes to the arrow crate
4 participants