Manual paging #89
Manual paging is supported:

- in `eachRow`, via the `fetchSize` option, resulting in a `nextPage` method attached to the `resultStats` in `callback(err, resultStats)`;
- in `stream`, implicitly via the Node.js stream implementation, or again via the `fetchSize` option.

Fix, per https://nodejs.org/api/stream.html#stream_readable_push_chunk_encoding: "if push() returns false, then we need to stop reading from source", i.e. `this.paused = !this.push(this.buffer.shift());`
You want to use krassif:manual-paging as a local branch, right? Looks good, I will squash the commits. I will move it to the NODEJS-68 dev branch.
And some integration tests also.
Squashed and merged into the NODEJS-68 branch and added 2 integration tests. We should continue by adding unit tests. If you want to implement them, make sure to use NODEJS-68 as your target branch.
I tested the … The …
Sure. I filled a Cassandra database with a million rows (~3GB) and called … Take a look at the code I mentioned; I don't see how it can possibly apply backpressure when the …
In the case you describe, you should expect buffering, but it is not directly related to ResultStream.

```javascript
resultStream
  .pipe(stringifierStream)
  .pipe(process.stdout);
```

If that is the case, what is causing the excessive buffering is the stringifier -> stdout reading: the transform stream is successfully pulling the rows (objects) from the result stream as soon as they are received, causing the result stream to query for more. You should implement throttling in your transform stream. About the implementation: the eachRow final callback is invoked once all the rows have been yielded.
Transform streams throttle automatically, though; that's a key advantage over writing a Readable yourself. Only when stdout is under its watermark does it ask for more data from stringifierStream, which asks for more data from cassandra-driver. It's true that I might have no idea what's going on with the … Anyway, I did test 59cd16e, and I saw the ResultStream's buffer.length climb to 310700 items. That should never happen even if the consumer of the row stream is reading "too fast": cassandra-driver should wait for ResultStream's buffer to drain before doing another query. Sorry for the lack of a clean test case; this is part of a big piece of software.
Ah, I see that what I mentioned is what `nodejs-driver/lib/types/result-stream.js` (line 25 in a5a51db) does.
This works really well. It handles ending the stream prematurely very well. I look forward to this branch being merged into master. Until then, we will need to clone this branch into our product instead of doing npm install.
I see a similar behaviour to @ivan's with 3.0.0-rc1. I pipe a very long SELECT result stream through a transform stream and into a writable stream. For analysis I changed the writer's _write method to a simple setTimeout. The table consists of about 500m rows, but the OOM happens much earlier, around 20k rows.
I condensed my problem into a more concise example, getting rid of the transform stream and basically just piping the stream into a writer that uses setTimeout. See here: https://gist.github.com/bkw/6c8d6d80e2d382ced59d
@bkw

```javascript
var stream = require('stream');
var util = require('util');

function MyWritable(options) {
  stream.Writable.call(this, options);
}

util.inherits(MyWritable, stream.Writable);

MyWritable.prototype._write = function (row, e, callback) {
  setTimeout(callback, 0);
};
```
Thanks for commenting, @jorgebay. I'm afraid that's not it, though. I'm just using the simplified constructor API for the example, as described here: https://nodejs.org/api/stream.html#stream_simplified_constructor_api This does not really call write(); it just sets up a writable stream, which internally does use _write().
Revert "…ed by client.stream's lack of backpressure"
This reverts commit 6da836c. The NODEJS-68 branch did not fix any problems: datastax/nodejs-driver#89 (comment)
This does contain code changes as per Jorge's comments. Please review and merge.
The Datastax agreement has been accepted as well.
Thanks.