Nicola Crane / @thisisnic:
This is indeed a bug; thanks for reporting it, [~dmedw01]. It's due to how we infer the types of lists. I will get a PR up to fix this soon. A temporary workaround would be to reorder the list so that the first element is never NULL, though I can see that this is not ideal.
library(arrow)

df2 <- tibble::tibble(x = list(NULL, 1, 2))
# manually specify the schema of the list column
df_to_save <- as_arrow_table(df2, schema = schema(x = list_of(int32())))
arrow::write_parquet(df_to_save, tempfile(fileext = ".parquet"))
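To illustrate why the position of the NULL might matter, here is a toy sketch in plain Python. It is not arrow's actual inference code: both function names are hypothetical, and the assumption that inference looks only at the first element is inferred from the workaround described above rather than confirmed by the source.

```python
from typing import Any, Sequence


def infer_from_first(values: Sequence[Any]) -> str:
    """Naive inference (assumed failure mode): inspect only the first element."""
    if not values or values[0] is None:
        return "null"
    return type(values[0]).__name__


def infer_skipping_nulls(values: Sequence[Any]) -> str:
    """More robust inference: use the first non-null element, if there is one."""
    for value in values:
        if value is not None:
            return type(value).__name__
    return "null"


works = [[1, 2], None, [3, 4]]  # same values as /tmp/test1.parquet
fails = [None, [1, 2], [3, 4]]  # same values as /tmp/test2.parquet

print(infer_from_first(works), infer_from_first(fails))          # list null
print(infer_skipping_nulls(works), infer_skipping_nulls(fails))  # list list
```

Under this assumption, a leading NULL makes the whole column look like a null type, which would explain why both reordering the list and passing an explicit schema avoid the problem.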
Works
reticulate::py_run_string("
import pandas as pd
df = pd.DataFrame({'col1': [[1, 2], None, [3, 4]]})
df.to_parquet('/tmp/test1.parquet')
")
df1 <- arrow::read_parquet("/tmp/test1.parquet")
arrow::write_parquet(df1, tempfile(fileext = ".parquet"))
Fails in arrow 9.0; works in arrow 5.0
reticulate::py_run_string("
import pandas as pd
df = pd.DataFrame({'col1': [None, [1, 2], [3, 4]]})
df.to_parquet('/tmp/test2.parquet')
")
df2 <- arrow::read_parquet("/tmp/test2.parquet")
arrow::write_parquet(df2, tempfile(fileext = ".parquet"))
Environment: Ubuntu 18.04; R 4.1.1; arrow 9.0
Reporter: David
Assignee: Nicola Crane / @thisisnic
PRs and other links:
Note: This issue was originally created as ARROW-17639. Please see the migration documentation for further details.