Document batch configuration options in run #248

Closed
coffeemug opened this issue Mar 27, 2014 · 20 comments

@coffeemug
Contributor

@wojons asked how to configure batching. @mlucy pointed out the following. We should document it properly.

So, this isn't documented anywhere, but there's a batch_conf optarg which lets you set these things. If you write query.run(batch_conf:{max_els:5, max_dur:50*1000}), you'll get a batch back as soon as 5 rows are available or 50*1000 microseconds pass, whichever happens first. (You can also use max_size to configure a maximum serialized size in bytes, which is what we use internally for most batch sizing.)
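For JavaScript readers, here is a minimal sketch of what the equivalent call might look like with the official JS driver. The nested batch_conf optarg and the old option names (max_els, max_dur) are copied from the comment above; since this was undocumented at the time, the exact spelling the driver accepts is an assumption (and is questioned later in this thread).

```js
var r = require('rethinkdb');

r.connect({host: 'localhost', port: 28015}, function (err, conn) {
  if (err) throw err;
  // Sketch only: batch_conf and its old option names come from the comment
  // above. Ask for a batch as soon as 5 rows are ready or 50,000 µs pass.
  r.table('users').run(conn, {batch_conf: {max_els: 5, max_dur: 50 * 1000}},
    function (err, cursor) {
      if (err) throw err;
      cursor.each(function (err, row) {
        if (err) throw err;
        console.log(row);
      });
    });
});
```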

@AtnNn
Member

AtnNn commented Mar 27, 2014

@mlucy I believe you left it undocumented for a reason. Is the code foolproof enough? Does it error correctly if zero or negative values are passed?

@mlucy
Member

mlucy commented Mar 27, 2014

I honestly have no idea. It wasn't originally written for external use. We should write some tests for error cases before we document it for people.

@wojons

wojons commented May 2, 2014

Does this batching also control the internal batch sizes?

@mlucy
Member

mlucy commented May 2, 2014

Yeah; it affects the intracluster batch sizes as well.

@coffeemug
Contributor Author

This has been fixed (see rethinkdb/rethinkdb#2185), so we should document it for 1.16.

coffeemug added this to the 1.16 milestone on Sep 18, 2014
@neumino
Member

neumino commented Sep 25, 2014

The changes landed in 1.15.
Shouldn't we document it now?

@coffeemug
Contributor Author

Yep, we missed this. /cc @chipotle

@chipotle
Contributor

chipotle commented Oct 1, 2014

What are the defaults for the various options? There's max_batch_rows, max_batch_bytes, max_batch_seconds, and first_batch_scaledown_factor. Also, what does that last one do? (This ticket references the first three by their old names, but not the scaledown factor.)

chipotle self-assigned this on Oct 1, 2014
@coffeemug
Contributor Author

/cc @mlucy

@chipotle
Contributor

chipotle commented Oct 1, 2014

@neumino pointed me at the source code with the defaults. One other question: I'm assuming that in the JS driver these "sub-optargs" are not translated from camelCase the way top-level optargs are, so you would write

run(conn, {batchConf: {max_batch_rows: 10}})

rather than

run(conn, {batchConf: {maxBatchRows: 10}})

Is that right?

@neumino
Member

neumino commented Oct 1, 2014

I just looked at the code; there's currently a bug in the JS driver.

We accept only batchConf, but we do not translate it. And as @chipotle said, we do not translate sub-optargs.

@neumino
Member

neumino commented Oct 1, 2014

Err sorry, I just read ast.coffee and not net.coffee.

The available options seem to be here:
rethinkdb/rethinkdb#2463 (comment)

I don't see nested options, so I'm a bit confused about what the new syntax is.
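
For concreteness, here is a sketch of a run call under the assumption (based on the linked comment) that the batch options are flat, snake_case optargs rather than a nested batchConf object; whether the JS driver translates these names is exactly the open question above.

```js
// Sketch assuming flat, top-level batch optargs per the linked #2463 comment
// (not the nested batchConf form discussed earlier in this thread):
r.table('users').run(conn, {
  max_batch_rows: 10,     // hand back at most 10 rows per batch
  max_batch_seconds: 0.5  // or whatever is ready after half a second
}, function (err, cursor) {
  if (err) throw err;
  cursor.toArray(function (err, rows) {
    if (err) throw err;
    console.log(rows.length);
  });
});
```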

@coffeemug
Contributor Author

Do we need to create an issue to keep track of the js driver bug?

@neumino
Member

neumino commented Oct 1, 2014

I'm not sure if it's a bug in the JS driver, or the actual spec.
@gchpaco should know more.

@chipotle
Contributor

chipotle commented Oct 1, 2014

I've looked at batching.cc lines 173-199; I can see what effect changing first_scaledown_factor has on the first result batch retrieved, but I don't understand the use case for changing that parameter.

@coffeemug
Contributor Author

@chipotle -- it's perceived latency. What happens is, people type a query in the repl, run it, and measure how long it takes to get the first batch. They then treat it as an indication of RethinkDB performance. Of course what really happens is that there is a tradeoff between latency and throughput, but that's not generally how people think.

So the scaledown factor gives people the first batch quickly to improve perceived latency in repl interactions, then optimizes for throughput on subsequent batches. This turns out to strike a good balance between perceived and real latency/throughput.
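
A minimal sketch of the resulting arithmetic, assuming the first batch's row cap is simply max_batch_rows divided by first_batch_scaledown_factor, clamped to at least one row (the helper name is hypothetical; batching.cc has the authoritative rounding rules):

```js
// Hypothetical helper illustrating the scaledown behavior described above.
// Assumption: first-batch cap = floor(max_batch_rows / first_batch_scaledown_factor),
// never below 1 row. See batching.cc lines 173-199 for the real logic.
function firstBatchRows(maxBatchRows, firstBatchScaledownFactor) {
  return Math.max(1, Math.floor(maxBatchRows / firstBatchScaledownFactor));
}

console.log(firstBatchRows(8, 8)); // => 1: the defaults favor perceived latency
console.log(firstBatchRows(8, 1)); // => 8: first batch behaves like later ones
```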

@chipotle
Contributor

chipotle commented Oct 2, 2014

It looks like the default values ensure that the initial batch will be one row -- max_batch_rows and first_batch_scaledown_factor are both 8. If you adjust max_batch_rows, is there guidance we should give on what to change the scaledown factor to? Is there really an advantage to making first_batch_scaledown_factor user-tunable, rather than just automatically setting it to be the same as max_batch_rows? And if so, would you ever want to set it to anything other than max_batch_rows or 1, which (I think) would make the first batch perform the same as the following batches?

chipotle removed this from the 1.16 milestone on Oct 2, 2014
@AtnNn
Member

AtnNn commented Oct 5, 2014

The batch_conf documentation was only added to the JavaScript docs. I think it should be added to the Ruby and Python docs as well. Re-opening.

AtnNn reopened this on Oct 5, 2014
@chipotle
Contributor

chipotle commented Oct 6, 2014

Argh.

@chipotle
Contributor

chipotle commented Oct 6, 2014

Code review 2181 open. (@AtnNn, I've set you as the reviewer for this one.)
