Allow use of ParquetWriter, ParquetReader w/o compiling all compression deps (WASM support) #16729
Comments
As an aside, there's https://github.com/kylebarron/parquet-wasm. I don't know if figuring out the interoperability is worth it. Back to your point, it seems another idea would be to make the compression dependencies a la carte instead of enabling all of them.
With #16731, this is possible with zstd:
polars-io = { version = "0.40", default-features = false, features = ["polars-parquet", "zstd"] }
For the other compression dependencies, it's still possible, but you have to add both of the following:
polars-io = { version = "0.40", default-features = false, features = ["polars-parquet"] }
polars-parquet = { version = "0.40", default-features = false, features = ["lz4"] }
Adding passthrough feature flags for |
I am trying to compile Polars for wasm32 target (to be used within Leptos app on client side) and I found this issue here.
This line was introduced in PR 15408 to fix an overflow problem. I think it's due to this type definition:
This will cause |
Why would you need parquet on wasm, if I may ask? I have never seen uncompressed parquet files, and supporting this adds a lot of complexity and breaks assumptions, which I don't think is worth it.
I use duckdb-wasm to read parquet files and it's great. It means I can have a dashboard without an api layer and I can write most of my logic in sql instead of javascript. That said, the duckdb parquet extension supports compression. I don't know how much it'd save on bundle size and load time if it only supported uncompressed parquets. I don't know what the benefit of uncompressed parquets is vs uncompressed ipc files, though.
@ritchie46 sure: I am reading and analyzing parquet files from within the browser (client side only; no server side).
The aim was not to disable all compression. I was just trying to debug the issues and disabled one feature at a time and found this overflow issue in wasm32. I am actually using compression.
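The offending line and type definition are not shown in this thread, so the following is only a generic illustration of how a size computation that is fine on 64-bit targets can overflow on wasm32, where usize is 32 bits wide. The function names are hypothetical, and u32 and u64 stand in for usize on the two kinds of target.

```rust
/// Scales a length using 32-bit arithmetic, standing in for `usize`
/// on wasm32. Returns `None` if the product does not fit.
pub fn scaled_len_32(len: u32, factor: u32) -> Option<u32> {
    len.checked_mul(factor)
}

/// Same computation with 64-bit arithmetic, standing in for `usize`
/// on a typical 64-bit host.
pub fn scaled_len_64(len: u64, factor: u64) -> Option<u64> {
    len.checked_mul(factor)
}
```

With len = 1 << 20 and factor = 1 << 12, the product is 2^32: the 64-bit version returns a value, while the 32-bit version reports overflow. Code that only ever ran on 64-bit hosts can hide this kind of assumption until it is compiled for wasm32.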
I understand. No worries. |
Similar to @jccampagne, I don't necessarily need uncompressed parquet files. I just want to use a single compression scheme (zstd), but currently all of the compression dependencies are built. This is probably fine for most targets, but not really suitable for wasm for 2 reasons:
I updated the issue to reflect this more accurately. Really I just need some way to load a DataFrame into wasm from S3. I'm not even attached to parquet, but IPC has the same issue: enabling the |
I can make a similar change to |
Description
Currently, enabling the parquet feature in polars-io also enables polars-parquet/compression, which in turn enables the following dependencies that are hard to cross-compile to wasm32-unknown-unknown:
Also, supporting all of these compression schemes bloats up the WASM bundle. So, it would be nice to not gate polars_io::parquet behind a flag that necessarily brings in all of these dependencies.
To address this, I propose gating polars_io::parquet behind the polars-parquet flag instead of the polars flag.
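As a rough sketch of what a la carte compression features could look like in Rust, the snippet below gates each codec behind its own Cargo feature, so a build only compiles the codecs it asks for. This is not the Polars source: the Compression enum, the two functions, and the feature names "zstd" and "lz4" are hypothetical and chosen only for illustration.

```rust
/// Compression schemes this build supports. Every variant except
/// `Uncompressed` exists only when its Cargo feature is enabled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    Uncompressed,
    #[cfg(feature = "zstd")]
    Zstd,
    #[cfg(feature = "lz4")]
    Lz4,
}

/// Lists the codec names compiled into this build.
pub fn available_codecs() -> Vec<&'static str> {
    let mut codecs = vec!["uncompressed"];
    if cfg!(feature = "zstd") {
        codecs.push("zstd");
    }
    if cfg!(feature = "lz4") {
        codecs.push("lz4");
    }
    codecs
}

/// Resolves a codec by name, failing for codecs that were not compiled in.
pub fn codec_from_name(name: &str) -> Result<Compression, String> {
    match name {
        "uncompressed" => Ok(Compression::Uncompressed),
        #[cfg(feature = "zstd")]
        "zstd" => Ok(Compression::Zstd),
        #[cfg(feature = "lz4")]
        "lz4" => Ok(Compression::Lz4),
        other => Err(format!("codec `{other}` was not compiled into this build")),
    }
}
```

Built with no features enabled, available_codecs() returns only "uncompressed" and codec_from_name("zstd") returns an error. Enabling a single feature adds just that codec, which keeps both the cross-compilation surface and the WASM bundle small.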