@paleolimbot
Member

Fixes #6.

This PR adds the functions ArrowMetadataBuilderAppend(), which blindly but efficiently adds a key/value pair to the end of some metadata, and ArrowMetadataBuilderSet(), which less efficiently replaces or removes the value for a key.

The use case I had in mind is building extension type metadata from some input (i.e., making an output schema like the input except with a new extension type or with new serialized extension type metadata). It's rather difficult to replicate the "replace" or "remove" behaviour otherwise.
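To make the append/set trade-off concrete, here is a minimal sketch, not this PR's implementation. It relies on the metadata layout defined by the Arrow C data interface: an int32 pair count followed, for each pair, by an int32 key length, the key bytes, an int32 value length, and the value bytes. Every `toy_` name is hypothetical, keys and values are null-terminated strings for brevity, and the choice to replace the first match in place (dropping later duplicates) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for this sketch; not nanoarrow's ArrowBuffer. */
struct toy_buf {
  uint8_t* data;
  size_t size;
  size_t capacity;
};

/* Copy n bytes to the end of the buffer, growing it if needed. */
static int toy_put(struct toy_buf* b, const void* src, size_t n) {
  if (b->size + n > b->capacity) {
    size_t cap = b->capacity ? b->capacity : 64;
    while (cap < b->size + n) cap *= 2;
    uint8_t* p = (uint8_t*)realloc(b->data, cap);
    if (p == NULL) return -1;
    b->data = p;
    b->capacity = cap;
  }
  if (n > 0) memcpy(b->data + b->size, src, n);
  b->size += n;
  return 0;
}

/* Number of pairs: the leading int32, or 0 for an empty buffer. */
static int32_t toy_count(const struct toy_buf* b) {
  int32_t n = 0;
  if (b->size >= sizeof(n)) memcpy(&n, b->data, sizeof(n));
  return n;
}

/* Append: write one pair at the end and bump the count; no key lookup. */
static int toy_append_n(struct toy_buf* b, const char* k, int32_t kl,
                        const char* v, int32_t vl) {
  int32_t n = toy_count(b) + 1;
  if (b->size == 0 && toy_put(b, &n, sizeof(n)) != 0) return -1;
  if (toy_put(b, &kl, sizeof(kl)) != 0 || toy_put(b, k, (size_t)kl) != 0 ||
      toy_put(b, &vl, sizeof(vl)) != 0 || toy_put(b, v, (size_t)vl) != 0) {
    return -1;
  }
  memcpy(b->data, &n, sizeof(n)); /* update the pair count in the header */
  return 0;
}

static int toy_append(struct toy_buf* b, const char* key, const char* value) {
  return toy_append_n(b, key, (int32_t)strlen(key), value,
                      (int32_t)strlen(value));
}

/* Set: rebuild the whole buffer. The first pair matching key gets the new
 * value and later duplicates are dropped; a NULL value drops every match.
 * A key that was not present is appended at the end. */
static int toy_set(struct toy_buf* b, const char* key, const char* value) {
  struct toy_buf out = {NULL, 0, 0};
  assert(key != NULL);
  int32_t klen = (int32_t)strlen(key);
  int32_t n = toy_count(b);
  size_t pos = sizeof(int32_t);
  int replaced = 0;
  for (int32_t i = 0; i < n; i++) {
    int32_t kl, vl;
    memcpy(&kl, b->data + pos, sizeof(kl));
    const char* k = (const char*)b->data + pos + sizeof(kl);
    pos += sizeof(kl) + (size_t)kl;
    memcpy(&vl, b->data + pos, sizeof(vl));
    const char* v = (const char*)b->data + pos + sizeof(vl);
    pos += sizeof(vl) + (size_t)vl;
    if (kl == klen && memcmp(k, key, (size_t)kl) == 0) {
      if (value == NULL || replaced) continue; /* remove, or drop duplicate */
      v = value;
      vl = (int32_t)strlen(value);
      replaced = 1;
    }
    if (toy_append_n(&out, k, kl, v, vl) != 0) {
      free(out.data);
      return -1;
    }
  }
  if (value != NULL && !replaced &&
      toy_append_n(&out, key, klen, value, (int32_t)strlen(value)) != 0) {
    free(out.data);
    return -1;
  }
  free(b->data);
  *b = out;
  return 0;
}
```

Appending touches only the header and the tail of the buffer, while setting walks and copies every existing pair, which is the efficiency difference the description refers to. Error handling here is deliberately minimal.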

@codecov-commenter

codecov-commenter commented Aug 5, 2022

Codecov Report

Merging #12 (1a4bb88) into main (51e5052) will decrease coverage by 1.65%.
The diff coverage is 81.00%.

@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
- Coverage   91.97%   90.32%   -1.66%     
==========================================
  Files           5        6       +1     
  Lines         798      930     +132     
  Branches       30       38       +8     
==========================================
+ Hits          734      840     +106     
- Misses         41       59      +18     
- Partials       23       31       +8     

Impacted Files                   Coverage Δ
src/nanoarrow/metadata.c         86.02% <80.20%> (-13.98%) ⬇️
src/nanoarrow/schema_view.c      98.87% <100.00%> (+<0.01%) ⬆️
src/nanoarrow/buffer_inline.h    84.78% <0.00%> (ø)

  return NANOARROW_OK;
}

ArrowErrorCode ArrowMetadataBuilderAppendView(struct ArrowBuffer* buffer,

Member

Worth possibly exposing this variant too?

Member Author

I pushed a compromise of sorts: all values are now ArrowStringView but all keys stay const char*, which sits somewhere in between the programmability and non-null-terminated-ness of ArrowStringView and the reality that keys are almost always a literal string. I tried having both be ArrowStringView, but that was rather painful to actually use.

Member

Wonder if a helper macro to convert const char* to struct ArrowStringView might help work around that?

Member Author

I added ArrowCharView():

/// \brief Create a string view from a null-terminated string
struct ArrowStringView ArrowCharView(const char* value);

It still feels a little awkward if I convert keys to use struct ArrowStringView, but I don't really mind either way.
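As a rough illustration of what such a helper buys at the call site, the sketch below uses a hypothetical `toy_string_view` whose field names follow the diff context quoted in this thread (`data`, `n_bytes`). The `toy_` functions are assumptions for illustration, not nanoarrow's API. The example key, "ARROW:extension:name", is the metadata key Arrow uses for an extension type's name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ArrowStringView. */
struct toy_string_view {
  const char* data;
  int64_t n_bytes;
};

/* Same idea as ArrowCharView(): wrap a null-terminated string.
 * A NULL pointer yields an empty view. */
static struct toy_string_view toy_char_view(const char* value) {
  struct toy_string_view out;
  out.data = value;
  out.n_bytes = value ? (int64_t)strlen(value) : 0;
  return out;
}

/* A slice of a larger buffer, which a bare const char* cannot express
 * without copying and terminating the bytes first. */
static struct toy_string_view toy_slice(const char* data, int64_t offset,
                                        int64_t n_bytes) {
  struct toy_string_view out;
  assert(data != NULL && offset >= 0 && n_bytes >= 0);
  out.data = data + offset;
  out.n_bytes = n_bytes;
  return out;
}

/* Views compare by length and bytes, so no terminator is required. */
static int toy_view_equal(struct toy_string_view a, struct toy_string_view b) {
  return a.n_bytes == b.n_bytes &&
         (a.n_bytes == 0 || memcmp(a.data, b.data, (size_t)a.n_bytes) == 0);
}
```

With a helper like this, a literal key costs one extra call (`toy_char_view("ARROW:extension:name")`), while a value taken from the middle of a serialized buffer needs no copy at all.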

Member

Not a big deal for me either, though it is a little awkward to have the types differ between key and value even if there is a pragmatic reason.

Member Author

I'll change them to be the string views...anybody looking for convenience isn't writing C code anyway!

Member Author

Done!

Comment on lines +144 to +141
if (value == NULL) {
  return NANOARROW_OK;
}

Member

Hmm, to me it's a little weird to accept NULL as the value and then just do nothing with it. If we just considered append(key, NULL) to be an error, we could drop this check and pass the views by value instead of indirecting through a pointer.

Member Author

I moved these to be internal implementation details. The NULL mostly just helps minimize the number of times one has to copy a metadata string (correspondingly, I added ArrowMetadataBuilderRemove() to make removal more explicit).
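The pattern described here, a NULL sentinel kept internal with explicit public entry points that take views by value, might look roughly like the sketch below. It is an assumption-laden illustration: the `toy_` types and functions are hypothetical, and the store is a small fixed array of non-owning views instead of a serialized metadata buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not nanoarrow's types. Views do not own bytes. */
struct toy_view {
  const char* data;
  int64_t n_bytes;
};

#define TOY_MAX_PAIRS 8

struct toy_meta {
  struct toy_view keys[TOY_MAX_PAIRS];
  struct toy_view values[TOY_MAX_PAIRS];
  int n;
};

static struct toy_view toy_view_of(const char* s) {
  struct toy_view out;
  out.data = s;
  out.n_bytes = (int64_t)strlen(s);
  return out;
}

static int toy_same(struct toy_view a, struct toy_view b) {
  return a.n_bytes == b.n_bytes &&
         (a.n_bytes == 0 || memcmp(a.data, b.data, (size_t)a.n_bytes) == 0);
}

/* Internal: a NULL value pointer means "remove"; otherwise replace or add.
 * A single pass serves both public entry points. */
static int toy_set_internal(struct toy_meta* m, struct toy_view key,
                            const struct toy_view* value) {
  int kept = 0;
  int found = 0;
  assert(m != NULL && m->n >= 0 && m->n <= TOY_MAX_PAIRS);
  for (int i = 0; i < m->n; i++) {
    if (toy_same(m->keys[i], key)) {
      if (value == NULL || found) continue; /* drop this pair */
      m->values[i] = *value;                /* replace the first match */
      found = 1;
    }
    m->keys[kept] = m->keys[i];
    m->values[kept] = m->values[i];
    kept++;
  }
  m->n = kept;
  if (value != NULL && !found) {
    if (m->n == TOY_MAX_PAIRS) return -1; /* toy store is full */
    m->keys[m->n] = key;
    m->values[m->n] = *value;
    m->n++;
  }
  return 0;
}

/* Public: views are passed by value, so there is no NULL to misuse. */
static int toy_meta_set(struct toy_meta* m, struct toy_view key,
                        struct toy_view value) {
  return toy_set_internal(m, key, &value);
}

/* Public: removal is spelled out instead of hiding behind a sentinel. */
static int toy_meta_remove(struct toy_meta* m, struct toy_view key) {
  return toy_set_internal(m, key, NULL);
}
```

Callers never see the pointer form, which addresses the review concern about silently accepting NULL while keeping one shared code path.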

@paleolimbot marked this pull request as ready for review August 8, 2022 16:51

out.data = value;
out.n_bytes = 0;
if (value) {
  while (*value++) {

Member

Why not strlen? To avoid the include?

Member Author

Yes...too much of a hack? I'll check if string.h gets included in another inline file anyway (it probably does).

Member

I don't think it's a big deal to include string.h (it'll certainly raise fewer eyebrows than this).
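For comparison, the two approaches discussed in this thread can be put side by side. This is a sketch with hypothetical function names. The loop is written in the spirit of the snippet under review (whose body is truncated above), and both versions map a NULL pointer to a length of 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hand-rolled length: avoids <string.h> but needs a second look to trust. */
static int64_t toy_len_loop(const char* value) {
  int64_t n_bytes = 0;
  if (value) {
    while (*value++) {
      n_bytes++;
    }
  }
  return n_bytes;
}

/* The strlen() version suggested in review; NULL is still handled by hand
 * because passing NULL to strlen() is undefined behaviour. */
static int64_t toy_len_strlen(const char* value) {
  return value ? (int64_t)strlen(value) : 0;
}

/* Both versions should agree on ordinary, empty, and NULL input. */
static void toy_len_selfcheck(void) {
  assert(toy_len_loop("abc") == toy_len_strlen("abc"));
  assert(toy_len_loop("") == toy_len_strlen(""));
  assert(toy_len_loop(NULL) == toy_len_strlen(NULL));
}
```

The results are identical, so the choice comes down to readability and whether the extra include is acceptable in an inline header.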

@paleolimbot merged commit 89b5932 into apache:main Aug 9, 2022
@paleolimbot deleted the build-metadata branch August 9, 2022 16:25
Development

Successfully merging this pull request may close these issues.

Implement metadata building utility

3 participants