Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-9777: [Rust] [IPC] write custom metadata [WIP] #9011

Closed
wants to merge 4 commits into from

Conversation

mqy
Copy link
Contributor

@mqy mqy commented Dec 25, 2020

I think this PR is a sub task of issue ARROW-9777.
Sorry had not seen this issue before I finished this PR.

This PR is extracted from one of my dev branch, which has more changes for custom metadata. You may have a look at the diff to arrow master.

Two design choices:

Added a new type CustomMetaData (alias to BTreeMap), in datatypes.rs.

Why BTreeMap instead of HashMap? Because Field implements traits Hash, PartialOrd, and Ord,
but HashMap breaks the behavior.

Additional refactoring

It tends to be tedious when add more args, for example:
try_new, try_new_with_options, try_new_with_options_and_custom_metadata.

Finally, I refactored several functions according to builder pattern. Sorry that, this refactoring may introduce some inconveniences.

@github-actions
Copy link

@codecov-io
Copy link

codecov-io commented Dec 25, 2020

Codecov Report

Merging #9011 (f13c00b) into master (a4f7c4a) will decrease coverage by 0.04%.
The diff coverage is 51.06%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9011      +/-   ##
==========================================
- Coverage   82.87%   82.82%   -0.05%     
==========================================
  Files         201      201              
  Lines       49739    49781      +42     
==========================================
+ Hits        41220    41232      +12     
- Misses       8519     8549      +30     
Impacted Files Coverage Δ
rust/arrow-flight/src/utils.rs 0.00% <0.00%> (ø)
rust/arrow/src/datatypes.rs 76.45% <ø> (ø)
...ntegration-testing/src/bin/arrow-file-to-stream.rs 0.00% <0.00%> (ø)
...ion-testing/src/bin/arrow-json-integration-test.rs 0.00% <0.00%> (ø)
...ntegration-testing/src/bin/arrow-stream-to-file.rs 0.00% <0.00%> (ø)
rust/arrow/src/ipc/writer.rs 76.88% <55.69%> (-6.37%) ⬇️
rust/arrow/src/ipc/reader.rs 82.90% <100.00%> (+0.03%) ⬆️
rust/parquet/src/arrow/schema.rs 90.62% <100.00%> (ø)
rust/arrow/src/array/transform/fixed_binary.rs 78.94% <0.00%> (-5.27%) ⬇️
rust/arrow/src/ipc/gen/Message.rs 34.37% <0.00%> (+2.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a4f7c4a...f13c00b. Read the comment docs.

@mqy mqy marked this pull request as draft December 25, 2020 13:10
@mqy mqy force-pushed the ARROW-9777_ipc_write_custom_metadata branch from cc11d56 to 8dfa2f5 Compare December 25, 2020 13:56
@mqy
Copy link
Contributor Author

mqy commented Dec 25, 2020

I just mark this PR as draft, because I think there must have some missing tests to add.

@nevi-me since this PR may overlaps with your existing work, please let me know should I close this PR or not?

@mqy mqy changed the title ARROW-9777: [Rust] make IPC writer able to write custom metadata ARROW-9777: [Rust] [IPC] write custom metadata [WIP] Dec 25, 2020
@mqy mqy closed this Dec 28, 2020
Copy link
Contributor

@nevi-me nevi-me left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mqy, I'm on vacation, so I haven't been checking the project much.

ARROW-10299 is for implementing ipc::MetadataVersion::V5 on the write side (reads are mostly fine)
ARROW-10258 depends on ARROW-10259 that you're working on.

}

/// Creates the FileWriter.
pub fn new(writer: W, schema: &Schema) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major downside with a new() that doesn't create the header, is that it creates a burden on end-users to remember to call the write_header_schema(). I would prefer to stick with try_new() and try_new_with_options()(happy to rename the latter), where we now amend the latter to also take other parameters.

I don't mind us using a builder pattern, but we shouldn't change the try_new in the process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, so add metadata to IpcWriteOptions?

@mqy
Copy link
Contributor Author

mqy commented Dec 30, 2020

Hey @mqy, I'm on vacation, so I haven't been checking the project much.

ARROW-10299 is for implementing ipc::MetadataVersion::V5 on the write side (reads are mostly fine)
ARROW-10258 depends on ARROW-10259 that you're working on.

@nevi-me I had seen that your status is busy.

The major reason that I closed this PR is because I don't want it in the pull request list even if in draft mode too long.
Now since I got you feedback, I'll continue this PR later after #9025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants