-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Object store wasm usage #490
Object store wasm usage #490
Conversation
The main question I had was around your experiments with ehttp (as a replacement for reqwest) - did you find it was worthwhile? I did quite like the lower impedance model (that is, the very fetch-like behaviour), though the lack of header enums was a slight annoyance (the primary reason I've left the reqwest dep in on the object-store-wasm - I'm fairly certain the compiler should just leave in the enums, string conversion logic and strip everything else). I had hoped that ehttp would be the solution to the streaming problem (having to fake it, that is), but the mpsc channel + spawn_local trick actually ended up being the solve. It looks like it shouldn't be a problem to do the exact same thing with reqwest (I suspect I just didn't see it at the time), so it's really just a question of ergonomics (for me, and any other maintainers), how much appetite there is for porting the changes over to the OG object-store. |
Ah, also, given it does change the deps (hopefully not significantly), I'll get the delta generation working again in a separate PR (before this one's ready). |
🙏 I'm happy to have both in the the codebase for a while as we figure out which design makes more sense.
I think I just got to a point where I didn't fully understand Rust-async-streams or something like that, and there was existing code using reqwest that I could look at and figure out how to make work. So I don't think I got to the point where I formed a concrete opinion over whether to use ehttp or reqwest.
That would be awesome 🙏 |
This is great! I actually sat down last night trying to use parquet-wasm to make my own wasm object-store replacement for Datafusion... It's great to see someone else is already on it 🥳 |
Asset Sizes
|
How are you doing on this? Is this ready for a review? |
Yeah, I've published the MVP version of the crate (didn't feel comfortable contributing a git-pinned dep), I think it's good to go. |
…g responses, transitioning over to ehttp for better wasm-compat/lighter weight)
168109b
to
be200c1
Compare
…patch to enabled object-store-wasm sub-features)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is amazing, thank you! Just a couple more comments
pub async fn new(url: String) -> WasmResult<AsyncParquetFile> { | ||
let client = Client::new(); | ||
let mut reader = HTTPFileReader::new(url.clone(), client.clone(), 1024); | ||
pub async fn new(url: String, options: Option<Object>) -> WasmResult<AsyncParquetFile> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is using Object
here better than using JsValue
? Is that just for TS types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting 🤔 apparently OptionFromWasmAbi
is not defined for JsValue (but it is for Object) - I wondered why I did that 😮 . Yeah, both necessary and very slightly more accurate (since JsValue includes things like numbers and so on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Previously I took in JsValue
but parsed it via serde_wasm_bindgen
to Option<OptionsStruct>
. https://github.com/geoarrow/geoarrow-rs/blob/c296ced466b34e5a9dd487058d34910f08738833/js/src/io/parquet/async.rs#L175-L176
I'm not sure if either way is better than the other. I think in that case where we have strict object types, we should be able to define the typescript code for the input though.
Can you merge in main? Then I think we should be good to go |
Amazing, thank you! |
NB: Starting this a little earlier for design feedback on the object-store-wasm crate itself (primarily around ehttp vs reqwest).
The usage is very straightforward (given all the plumbing is present for ParquetObjectReader), I ended up doing it as a straight swap, leaving the existing HttpFileReader code for posterity/easy switchback.