Connection handling after streaming result set #40

Closed
lokkilok opened this issue Jan 28, 2019 · 10 comments

@lokkilok

I've encountered problems when streaming results. I've been using connections obtained from a pool and then calling connection.queryStream(...) to get a stream, which was subsequently piped to an HTTP response. Although it's possible to give the connection back to the pool with conn.end(), such a connection often ends up in a weird state: conn.ping() fails. As a workaround I decided to call conn.destroy() whenever the pipe finished. This however causes a problem, because conn.destroy() tries to send an error over the stalled socket, in line 342 of lib/connection.js: socketErrorDispatchToQueries(destroyError);
When I commented this line out, everything worked fine.

As far as I can see it will occasionally be necessary to destroy the connection, e.g. when the HTTP client aborts the HTTP request, so the HTTP response closes before all data was sent. So conn.destroy() should be made to always work.

The other issue is why the connection (socket) sometimes stalls even after a completely successfully piped response. In my testing this happens more frequently in a framework like Koa than when piping directly to the Node HTTP response stream.

@rusher
Collaborator

rusher commented Jan 28, 2019

Using a stream means two things:

  • not giving the connection back to the pool until the result set is entirely read.
  • reading results quickly.

Not giving the connection back until the stream has ended

The stream implementation doesn't rely on any specific server feature: the server sends the whole result set. Until this result set is completely sent to the client, the server won't be able to handle the next command, such as a ping. A ping command won't resolve until the result set is entirely read.

For information, the MariaDB server has a protocol that permits streaming in batches, which allows other queries to be executed while streaming, but this means many more exchanges with the server, resulting in a slow stream. This type of implementation has not been implemented in the connector.
The best way of using streams with pools is therefore to not give the connection back until the stream ends.
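The rule above can be sketched with in-memory stand-ins (a minimal sketch: conn here is a mock object, not a real connector connection; with the real driver it would come from pool.getConnection() and the stream from conn.queryStream(sql)):

```javascript
// Sketch: release the connection only once the query stream has been
// fully consumed. `conn`, `queryStream` and `sink` are stand-ins; with
// the real connector, conn comes from pool.getConnection() and the
// stream from conn.queryStream(sql).
function releaseWhenDone(conn, queryStream, sink) {
  queryStream
    .pipe(sink)
    .on('finish', () => conn.end())      // fully consumed: safe to give back to the pool
    .on('error', () => conn.destroy());  // broken mid-stream: discard the connection
}
```

The same shape applies whether the sink is an HTTP response, a file, or a transform chain: the release call lives in the stream's terminal events, never before them.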

Possible error if the stream takes too long

The server usually expects clients to read the result set relatively quickly. The net_write_timeout server variable controls this behavior (default 60 s). If you don't expect results to be handled within this time, there are other options:

  • With MariaDB >= 10.1.2, you can use the query "SET STATEMENT net_write_timeout=10000 FOR XXX", with XXX being your "normal" query. This indicates that, for this query only, net_write_timeout is set to a longer time (10000 seconds in this example).
  • For older servers, a separate query will have to temporarily set net_write_timeout ("SET net_write_timeout=...") and set it back afterward.

If your application generally runs a lot of long queries with a fetch size, the connection can be configured using the option "sessionVariables=net_write_timeout=xxx".
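For illustration, the per-query form can be wrapped in a small helper (a hypothetical function, not part of the connector API; the SET STATEMENT ... FOR syntax requires MariaDB >= 10.1.2):

```javascript
// Hypothetical helper: scope net_write_timeout to a single query.
// Requires MariaDB >= 10.1.2 (SET STATEMENT ... FOR <query>).
function withNetWriteTimeout(sql, seconds) {
  return `SET STATEMENT net_write_timeout=${seconds} FOR ${sql}`;
}

// Pool-wide alternative (assumed option shape; check the connector docs):
// const pool = mariadb.createPool({
//   host: 'localhost',
//   sessionVariables: { net_write_timeout: 7200 }
// });
```

The wrapped string would then be passed to conn.queryStream(...) as usual.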

@lokkilok
Author

Thanks for letting me know about net_write_timeout, that's good to know. But I'm talking about query streams that are completely read in < 1 s (approx. 300 ms).

Perhaps I wasn't sufficiently clear, but the problem of a "stalled" connection happened when giving the connection back to the pool after the stream ended "successfully". As said, this doesn't always occur, and the frequency differed between frameworks. E.g. when using the Node stream function pipeline(...) directly on the Node HTTP response it was rare, but it did happen. A framework like Koa does its own thing when application code sets the (Koa-level) response to a stream, and with Koa it happened almost all the time.

The second, more important, issue was that such a stalled connection cannot be destroyed without throwing an error. I.e. when application code does not give the connection back to the pool, for example because the pipeline broke due to a "downstream" timeout (aborted HTTP request), but instead decides to call conn.destroy(), that function throws an error (from within socketErrorDispatchToQueries), as far as I can see for no good reason. And it is thrown on the next tick, so it's hard (impossible?) to catch, and it then crashes the whole Node application. But I could be wrong....

The 2nd issue I resolved by commenting out that one line.
The 1st issue I can live with, but it would be nice if there were a way to make the driver/pool more robust.

@rusher
Collaborator

rusher commented Jan 28, 2019

Can you indicate the version used and, if possible, provide some code to reproduce the issue?

@lokkilok
Author

I'm using 2.0.2-rc, i.e. the latest on GitHub.
Here is some (incomplete) code that shows the 2nd issue:

// create a pool with only 2 connections
const DB = MariaDB.createPool({
  host: env.DBHost,
  user: env.DBUser,
  password: env.DBPassword,
  database: env.DBName,
  connectionLimit: 2,
  acquireTimeout: 5000, // Keep this shorter than the typical timeout used on a client!
  noControlAfterUse: true // This allows for quick release after the connection has streamed results.
})

// grab two connections and release them only after a while, to test...
const conn1 = await DB.getConnection()
setTimeout((c) => c.end(), 6000, conn1)
console.log(`DB has ${DB.idleConnections()} idle connections and ${DB.taskQueueSize()} pending connection requests`)
const conn2 = await DB.getConnection()
setTimeout((c) => c.end(), 3900, conn2)
console.log(`DB has ${DB.idleConnections()} idle connections and ${DB.taskQueueSize()} pending connection requests`)

// this getConnection() waits until conn2 is released after 3.9 s
const conn3 = await DB.getConnection()
const recordStream = conn3.queryStream(someSQLHere) // I make a query that results in 100k records
// pipe the records as JSON to the HTTP response
pipeline(recordStream, JSONStream.stringify(), res, err => {
  conn3.destroy()
})

The above code should run inside an HTTP server. As you can see, this request handler grabs both connections of the pool but doesn't do anything with them. After 3.9 s the second one is released back to the pool. At that point the connection is assigned to conn3, and soon the stream will be flowing....

Now to test this I use curl with the option -m 4, which means curl will abort the request after 4 s, i.e. approximately 100 ms after streaming starts. On my system that is way before the end of the stream is reached. conn3.destroy() gets called but results in an error. If one uses conn3.end() instead, there's no error, but when the pool later gives the connection to application code it will not communicate (I think because ping() fails/times out).

@rusher
Collaborator

rusher commented Jan 29, 2019

Thanks, that helps a lot.

For the first issue, I imagine that the stream (which is a Readable) is no longer in flowing mode, so the stream command is paused and the connection hangs.

@rusher
Collaborator

rusher commented Jan 30, 2019

The problem is, when piping, how to handle the query stream when the writable stream fails.

When a writable stops for any reason (in your case a ClientRequest that has been ended), the query stream goes into pause mode, waiting for the Writable to consume the feed. All subsequent commands are blocked until the stream finishes.

I don't see any other solution but to document piping properly, indicating that the stream must be resumed.
Probably a good solution for the pool state is that a stream that was not fully consumed must be handled when the connection is given back to the pool.

example:

pool
  .getConnection()
  .then(conn => {
    const queryStream = conn.queryStream("SELECT seq as val FROM seq_1_to_10000");

    const someWriterStream = fs.createWriteStream("./tmp.file")
      .on("error", err => {
        // the writer has failed: resume the query stream so the
        // connection can finish reading the result set
        queryStream.resume();
      });

    const transform = new Transform({
      transform: function transformer(chunk, encoding, callback) {
        callback(null, JSON.stringify(chunk));
      },
      objectMode: true
    });

    queryStream
      .pipe(transform)
      .pipe(someWriterStream)
      .on('close', () => {
        conn.end();
      });

    // forcing a writer destroy to simulate this error
    setTimeout(someWriterStream.destroy.bind(someWriterStream), 2);
  })
  .catch(err => {
    // error
  });

@rusher
Collaborator

rusher commented Jan 30, 2019

hmm, this is actually because the current implementation uses an explicit pause/resume mechanism ( https://github.com/MariaDB/mariadb-connector-nodejs/blob/master/lib/cmd/stream.js#L33 ); without that, the stream would continue in flowing mode, avoiding this kind of issue.

Need to investigate the best way to handle backpressure there ...

@rusher
Collaborator

rusher commented Jan 30, 2019

The correction is done with ddb2080, simplifying the implementation by using basic TCP back pressure.
This ensures a consistent connection state in all cases and avoids a lot of edge cases.

@rusher rusher closed this as completed Jan 30, 2019
@lokkilok
Author

Thx, looks like a good improvement to me! Will test later this week.

@lokkilok
Author

Already had a chance to test, and it works very well now! Both issues are essentially gone; I can now use connection.end(), i.e. release the connection used for streaming back to the pool, in all cases.
Thanks again!
