incorporate parquet files? #9897
Labels
Feature: File Upload & Handling
Type: Suggestion
an idea
User Role: Depositor
Creates datasets, uploads data, etc.
Has there been any discussion of using parquet files at some level of Dataverse? (I see it mentioned in only one issue.)
I've used them some, and I love how well they work with R, Python, DuckDB, Spark, and others.
Several R programmers (like @kuriwaki) have advocated for rds files over RData files. From my recent experience with parquet files, they have all the advertised advantages of rds files (e.g., compression, strong typing, and factor levels), plus the appeal of interoperability with other platforms.
I haven't thought much beyond this. But when I read about problems with RData files and the messiness of Rserve described by @landreev, I see parquet as an improvement for many reasons, not least the ability to replace a flaky remote instance with a local parquet library.
cc: @pdurbin