-
Notifications
You must be signed in to change notification settings - Fork 166
Make sure we Fetch
query result when ready
#1996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ef2f016
to
8cff0fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is properly hacky, love it.
Left some comments, for you or for me in a bit.
I have a setup to run most duckdb unittester in duckdb-wasm (currently only node) vs various duckdb-wasm interfaces, I need to upstream that, and that can be used to stress test this more.
} | ||
|
||
const res = copyBuffer(this.mod, d, n); | ||
dropResponseBuffers(this.mod); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropResponseBuffers(this.mod); | |
dropResponseBuffers(this.mod); |
I am not sure if this needs to be performed, or moved afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it may need to be revisited though might be worth a dedicated PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking moving this below the return null
might be it, but I need to check.
In any case, this impact only retry
cases, that are currently not exposed in the wild, so might be ok-ish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want it in both "SUCCESS" cases since a buffer is instantiated, otherwise we would leak the (1-byte) retry buffer each time no?
if (res.length === 1 && res[0] == 84) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative would be returning an arrow error with custom error code, unsure if that is properly supported, let me check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's the other option we considered, and I agree it should work just as well. We felt this way was the most straightforward and explicit compared to abuse a specific Arrow error code.
Actually not properly approved, left some comments that I need to double check. |
@Y--: I think the best path forward is wrapping Uint8Array in an object that can then hold also additional informations about what kind of buffer is returned and allows more flexibility, this is still needed going forward. At that point instead of the magic byte a duckdb-wasm level error can be cleanly returned with a payload. |
I will set this up and ping back. |
I can try to work on that ~ my main concern is to do this preserving the 0-copy transfer as much as possible, but hopefully the Arrow API can help here. |
Sadly it looks like ConcatenateBuffers does copy its input. But we can use |
Implemented here - |
I ended up sketching an alternative here: main...carlopi:duckdb-wasm:refs/heads/wrap_fetch, what do you think? I might clean up and PR that, then you can add your on top |
Hi, I ended up with #2000, after which it should be somewhat trivial to add this on top, it should be a matter of adding two enums, one at lib/include/duckdb/web/webdb.h on the C++ side and one at packages/duckdb-wasm/src/status.ts in the typescript layer. Plus adding the relevant handling in Would you mind taking a look, feedback is welcome |
Had a quick look and it look very reasonable. I would need to look in more detail to make sure we don't copy the buffer un-necessarily, but I assume you already made sure of it? |
I will check this myself and verify this is the case. I would be inclined to say so, but the question is reasonable and better to get another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
In the current code, we call
Fetch
on a query result before checking if there is a result available, which may cause deadlocks since we don't return to the event loop in theBLOCKED
andNO_TASKS_AVAILABLE
cases.This PR solves it by returning a special buffer containing one "magic" byte out of the C++ layer, then a
null
buffer indicating the caller they need to retry fetching.We tested this with MotherDuck operators which do return "BLOCKED", and this fixes the streaming issue we had.
Ideally, this would test this with a custom "test" extension that reproduces the behavior.
Another way to test it (albeit less ideal), would be to run DuckDB Wasm tests with a special version of DuckDB compiled with
FORCE_ASYNC_SINK_SOURCE
flag.