New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when streaming, the connection can time out #849

Closed
calvinmetcalf opened this Issue May 28, 2015 · 11 comments

Comments

Projects
None yet
2 participants
@calvinmetcalf
Contributor

calvinmetcalf commented May 28, 2015

I have a very large table in postgres, and if I consume it slowly enough it will time out, reproduce with

var knex = require('knex');
var stream = require('stream');// if on node replace with io-stream
var i = 0;
knex({
  client: 'pg',
  connection: {}
}).raw('select * from generate_series(0, 1000000, 1)').pipe(new stream.Writable({
  objectMode: true,
  write: function(chunk, _, next) {
    console.log(++i);
    setTimeout(next, 10);
  }
}));

If I run that with DEBUG=* then eventually I get

5941
5942
  pool2 reap (cur=3, av=3) +10s
  pool2 Attempting to gracefully remove resource (id=0) +1ms
  pool2 Shrinking pool: destroying 1 idle connections +0ms
5943
  pool2 Ungracefully destroying resource (id=0) +7ms
5944
5945
5946
5947
5948
5949
5950
5951
5952
5953
events.js:141
      throw er; // Unhandled 'error' event
            ^
Error: Connection terminated
    at null.<anonymous> (/Users/me/projects/project/node_modules/pg/lib/client.js:184:29)
    at g (events.js:260:16)
    at emitNone (events.js:67:13)
    at emit (events.js:166:7)
    at Socket.<anonymous> (/Users/me/projects/project/node_modules/pg/lib/connection.js:66:10)
    at emitOne (events.js:77:13)
    at Socket.emit (events.js:169:7)
@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

Looking at the node-postgres code, it is clear that this error is the result of a connection explicitly ending: https://github.com/brianc/node-postgres/blob/v4.3.0/lib/client.js#L175-L189

This would not be something under the purview of knex or pool2; you will likely need to investigate your network or database configuration, and/or verify there's no separate process potentially killing connections (pg_terminate_backend, for example)

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

Apparently node-postgres emits errors on the query object when there is a query active, so the error binding on the client object is insufficient. Thanks pg. I'm not even sure where to bind that handler after reading the docs.

@calvinmetcalf

This comment has been minimized.

Contributor

calvinmetcalf commented May 29, 2015

So I suspect though that connection is being closed by knex or pool2, notice earlier around 5942 where pool2 attempts to reclaim the connection, i belive the unhangled error is just a symptom

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

I rewrote my comments above a few times but that is already ruled out; pool2 cannot reclaim a connection that's "checked out" / idle; the log messages above are just routine cleanup of other resources that had been allocated and were cleaned out of the pool while the above was happening.

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

brianc set me straight about pg's behavior: brianc/node-postgres#795 (comment)

So it appears that the "stream mode" call to pg is missing an error handler somewhere. Presumably it's not using the callback interface, since then it wouldn't be streaming. I do notice here: https://github.com/tgriesser/knex/blob/master/lib/dialects/postgres/index.js#L107-L115
The 'error' event is bound to the stream that gets passed, not directly to the PGQueryStream object, but I thought errors propagated along pipes?

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

To the matter at hand, another look shows me this: pool2 reap (cur=3, av=3) +10s

Which seems to be indicating that 3/3 resources in the pool are available. Significantly, this includes the resource that should be checked out, because you're using it. Another oversight in the stream handling perhaps?

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

Think I found the culprit:

https://github.com/tgriesser/knex/blob/master/lib/runner.js#L84-L95

The stream-promise is being returned after the call to Promise.using, meaning that the connection is being cleaned up before the stream query is executed. It should be returning the runner.client.stream call inside of line 93, I think. I'll push a tweak shortly and hopefully you can verify that it behaves correctly.

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

If you could, try the 'issue-849' branch and see if it solves your problem? I don't quite want to merge to master since I have no idea where or how to write a test for this in the current structure...

@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

To the other problem, the crash, it does not appear that pipes propagate errors, so we should definitely be binding an error event on the created query object, not on the proxy-stream, but I'm having trouble wrapping my head around all aspects of the flow here so I'm far from certain about how this should be handled.

(Node does bind an error handler in the .pipe machinery, but it re-throws if there are no non-.pipe listeners: https://github.com/joyent/node/blob/master/lib/stream.js#L92-L101)

@calvinmetcalf

This comment has been minimized.

Contributor

calvinmetcalf commented May 29, 2015

fix looks good, solves my problem, also as to errers and streams, you are correct they don't propagate errors, usual response is something like

var destStream = getDestStream();
stream.on('error', function(e) {destStream.emit('error', e)}).pipe(destStream);
@myndzi

This comment has been minimized.

Collaborator

myndzi commented May 29, 2015

Glad to hear it. I'm gonna leave this issue open until @tgriesser or someone with fuller knowledge of knex can address the dangling bits.

@tgriesser tgriesser closed this in 1135b3e Jun 1, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment