[R] schema$metadata should be properly typed #24859

Closed
asfimport opened this issue May 5, 2020 · 4 comments

@asfimport

I am trying to export numeric data plus some metadata from Python to a Parquet file and read it in R. However, the metadata is a dict in Python but a string in R. I would have expected a list (roughly R's counterpart to a Python dict). Am I missing something? Here is the code to demonstrate the issue:

import sys
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq
print(sys.version)
print(pa.__version__)
x = np.random.randint(0, 10, (10, 3))
arrays = [pa.array(x[:, i]) for i in range(x.shape[1])]
table = pa.Table.from_arrays(arrays=arrays, names=['A', 'B', 'C'],
                             metadata={'foo': '42'})
pq.write_table(table, 'array.parquet', compression='snappy')
table = pq.read_table('array.parquet')
metadata = table.schema.metadata
print(metadata)
print(type(metadata))

 

And in R:

library(arrow)
print(R.version)
print(packageVersion("arrow"))
table <- read_parquet("array.parquet", as_data_frame = FALSE)
metadata <- table$schema$metadata
print(metadata)
print(is(metadata))
print(metadata["foo"])

Output Python:

3.6.8 (default, Aug 7 2019, 17:28:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
0.13.0
OrderedDict([(b'foo', b'42')])
<class 'collections.OrderedDict'>
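
A side note on the Python output above: the keys and values come back as bytes (b'foo', b'42'), not str, so a lookup with 'foo' would miss. Below is a minimal sketch of decoding them into a plain dict. It assumes the metadata is UTF-8 text, and `decode_metadata` is a hypothetical helper written for this illustration, not part of pyarrow. The `None` check is there because, as far as I know, `schema.metadata` is `None` when no metadata was set.

```python
from collections import OrderedDict


def decode_metadata(metadata, encoding="utf-8"):
    """Return a plain dict with str keys and values.

    Hypothetical helper, not a pyarrow function. Assumes every bytes
    key and value is text in the given encoding.
    """
    if metadata is None:
        # No metadata attached: return an empty dict
        return {}
    return {
        (k.decode(encoding) if isinstance(k, bytes) else k):
        (v.decode(encoding) if isinstance(v, bytes) else v)
        for k, v in metadata.items()
    }


# Same shape as the metadata printed in the output above
raw = OrderedDict([(b"foo", b"42")])
decoded = decode_metadata(raw)
print(decoded["foo"])
```

With the real table this would be called as `decode_metadata(table.schema.metadata)`. The value stays the string '42'; converting it to a number is left to the caller.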

 

Output R:

[1] ‘0.17.0’
[1] " -- metadata -- foo: 42"
[1] "character" "vector" "data.frameRowLabels"
[4] "SuperClassMethod"
[1] NA

 

Reporter: René Rex
Assignee: Neal Richardson / @nealrichardson

PRs and other links:

Note: This issue was originally created as ARROW-8703. Please see the migration documentation for further details.

@asfimport
Author

Francois Saint-Jacques / @fsaintjacques:
You are correct, see TODO

@asfimport
Author

René Rex:
Yes, that explains it :) I hadn't checked the source code... Thanks!

@asfimport
Author

Francois Saint-Jacques / @fsaintjacques:
Issue resolved by pull request 7236
#7236

@asfimport
Author

René Rex:
Thanks a lot! That was quick :)

@asfimport asfimport added this to the 1.0.0 milestone Jan 11, 2023