Support custom metadata for schema and columns #20

Merged
quinnj merged 1 commit into master from jq/metadata on Sep 15, 2020

quinnj (Member) commented Sep 15, 2020

Closes #13. This PR adds two new functions, `Arrow.getmetadata(x)` and
`Arrow.setmetadata!(x, ::Dict{String, String})`, which allow, rather
obviously, setting metadata for an arbitrary object and then retrieving
that metadata. With these functions, users can get/set custom
metadata that is serialized in the arrow format at the schema level and
at the field (column) level. More specifically, to set arrow schema custom
metadata, a user calls `Arrow.setmetadata!(tbl, meta)` on their
table object `tbl`. To retrieve arrow schema custom metadata, one can
call `tbl = Arrow.Table(...); meta = Arrow.getmetadata(tbl)`. Similarly,
for column/field-level metadata, one can call
`Arrow.setmetadata!(col, colmeta)` to cause custom metadata to be
serialized in the arrow message, and call `Arrow.getmetadata(tbl.colX)`
to retrieve custom metadata for a specific column in an `Arrow.Table`.
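
As a rough illustration of the round trip described above, here is a minimal sketch. It assumes an Arrow.jl version that contains the two functions added in this PR (later releases may expose metadata through a different API), and that `Arrow.write` and `Arrow.Table` accept an `IOBuffer`. The table contents, column names, and metadata keys are made up for the example.

```julia
using Arrow  # assumption: a version that includes Arrow.setmetadata! / Arrow.getmetadata

# A NamedTuple of vectors serves as a simple table; names and values are illustrative.
tbl = (id = Int64[1, 2, 3], reading = Float64[0.5, 1.5, 2.5])

# Schema-level metadata is attached to the table object itself.
schema_meta = Dict("source" => "example", "units" => "metric")
Arrow.setmetadata!(tbl, schema_meta)

# Field-level metadata is attached to an individual column object.
id_meta = Dict("description" => "row identifier")
Arrow.setmetadata!(tbl.id, id_meta)

# Serialize to an in-memory buffer, then read the arrow data back.
io = IOBuffer()
Arrow.write(io, tbl)
seekstart(io)
roundtrip = Arrow.Table(io)

# Retrieve what was stored at the schema level and for the `id` column.
read_schema_meta = Arrow.getmetadata(roundtrip)
read_id_meta = Arrow.getmetadata(roundtrip.id)
```

Note that the metadata is set before `Arrow.write` is called, since the description above says it is picked up when the table and columns are serialized.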

Note that technically the arrow `Message` and `Footer` objects also
allow setting custom metadata, but those are not addressed at all in
this PR since they seem to be less useful/urgent.


codecov bot commented Sep 15, 2020

Codecov Report

Merging #20 into master will increase coverage by 0.68%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #20      +/-   ##
==========================================
+ Coverage   79.26%   79.95%   +0.68%     
==========================================
  Files          13       13              
  Lines        1992     2020      +28     
==========================================
+ Hits         1579     1615      +36     
+ Misses        413      405       -8     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/table.jl | 93.16% <100.00%> (+0.17%) ⬆️ |
| src/write.jl | 91.98% <100.00%> (+0.36%) ⬆️ |
| src/metadata/Schema.jl | 77.54% <0.00%> (+2.32%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

quinnj merged commit b345f03 into master Sep 15, 2020
quinnj deleted the jq/metadata branch September 15, 2020 15:17
quinnj added a commit that referenced this pull request Oct 3, 2020
Successfully merging this pull request may close these issues.

Support reading + storing, and writing custom metadata