ARROW-898: [C++/Python] Use shared_ptr to avoid copying KeyValueMetadata, add to Field type also#605
ARROW-898: [C++/Python] Use shared_ptr to avoid copying KeyValueMetadata, add to Field type also#605wesm wants to merge 6 commits intoapache:masterfrom
Conversation
…d / not copied Change-Id: Ia70e00c9709483899993024ae5a33405114fc84b
Change-Id: I860102ffb34934240ff482d0182d68f5db4217b7
Change-Id: I42532f474b858c401e395a3fc0f476b41d9951d6
Change-Id: Ic27aec0fc84f70c55fe935f4498206a26a951570
| flatbuf::CreateKeyValue(fbb, fbb.CreateString(key), fbb.CreateString(value))); | ||
| } | ||
| *out = flatbuf::CreateSchema( | ||
| fbb, endianness(), fb_offsets, fbb.CreateVector(key_value_offsets)); |
There was a problem hiding this comment.
If there is no metadata (and not a length-0 object), we will write null to flatbuffers
There was a problem hiding this comment.
This makes more sense than what I did originally.
Change-Id: Ief9409543b7c7f0c37c473ec0a145a5a6b733430
|
This is ready to go. I created https://issues.apache.org/jira/browse/ARROW-906 as a follow up to serialize the field-level metadata to Flatbuffers. |
cpp/src/arrow/type-test.cc
Outdated
| auto f0 = field("f0", int32()); | ||
| auto f1 = field("f0", int32(), true, metadata); | ||
| std::shared_ptr<Field> f2; | ||
| ASSERT_OK(f1->RemoveMetadata(&f2)); |
There was a problem hiding this comment.
Since this and Schema::RemoveMetadata can only really fail for esoteric reasons (OOM in std::vector copy operations), we should maybe make them return the result rather than Status
Change-Id: I80f9e8219920598e12edad763062e6f8aa16709c
| const auto& value = custom_metadata_.value(i); | ||
| key_value_offsets.push_back( | ||
| flatbuf::CreateKeyValue(fbb, fbb.CreateString(key), fbb.CreateString(value))); | ||
| const KeyValueMetadata* metadata = schema.metadata().get(); |
There was a problem hiding this comment.
Is this because there's one fewer function call per iteration in the loop just below?
There was a problem hiding this comment.
The get() is not strictly needed, I think it was mostly making the type on the LHS easier to write down =)
|
+1 |
|
Sweet, thanks for doing this |
…ata, add to Field type also This also adds support for adding and removing schema-level metadata to the Python Schema wrapper object. Need to do the same for Field but putting this up for @cpcloud to review since he's working on using this in parquet-cpp Author: Wes McKinney <wes.mckinney@twosigma.com> Closes apache#605 from wesm/ARROW-898 and squashes the following commits: 03873f7 [Wes McKinney] RemoveMetadata not return Status c621c2c [Wes McKinney] Add metadata methods to Field, some code cleaning 581b9fa [Wes McKinney] Add unit tests for passing metadata to Field constructor 51fae29 [Wes McKinney] Add metadata to Field 2ce4003 [Wes McKinney] Test sharing metadata 48aa3ca [Wes McKinney] Use shared_ptr<const T> for KeyValueMetadata so metadata can be shared / not copied
Fixes apacheGH-605. If `dev/release/release.sh` shows what should we do at repository.apache.org, it's helpful.
This also adds support for adding and removing schema-level metadata to the Python Schema wrapper object. Need to do the same for Field but putting this up for @cpcloud to review since he's working on using this in parquet-cpp