
Add support for a connection pool #1

Merged
merged 1 commit into from
Nov 18, 2018

Conversation

jgaskins
Contributor

This keeps us from having to reconnect to the DB on every query, removing the latency cost of repeated TCP handshakes, TLS negotiations, and TCP slow start. It made a huge difference just running the specs against my local Neo4j instance:

Before

➜  neo4j_model.cr git:(master) NEO4J_URL=bolt://neo4j:password@localhost crystal spec
........

Finished in 545.02 milliseconds
8 examples, 0 failures, 0 errors, 0 pending

With this PR

➜  neo4j_model.cr git:(connection-pool) ✗ NEO4J_URL=bolt://neo4j:password@localhost crystal spec
........

Finished in 256.88 milliseconds
8 examples, 0 failures, 0 errors, 0 pending

It removes the Model.connection class method and replaces it with a connection-checkout block for the ConnectionPool object:

models = Model.with_connection do |conn|
  conn.execute "MATCH (n:{{@type.name}}) RETURN n"
end

At the end of the block, the connection is checked back into the pool so another fiber can use it immediately. The pool's default capacity is 25, but connections are only created as needed — if you're only ever handling one HTTP request at a time, for example, you'll only ever have a single connection, but if a second request comes in while the first one's query is still in flight, the pool will spin up another connection.
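The checkout/check-in behavior described above can be sketched roughly like this. This is a hypothetical illustration, not the actual implementation of the pool shard — `LazyPool`, its method names, and the exhaustion behavior are all assumptions:

```crystal
# Hypothetical sketch — not the pool shard's actual API.
class LazyPool(T)
  def initialize(@capacity : Int32 = 25, &@factory : -> T)
    @idle = Deque(T).new
    @size = 0
    @lock = Mutex.new
  end

  # Check out an idle connection, creating one lazily if none are
  # available and we're still under capacity.
  def checkout : T
    @lock.synchronize do
      if conn = @idle.shift?
        return conn
      elsif @size < @capacity
        @size += 1
        return @factory.call
      end
    end
    raise "pool exhausted" # a real pool would block the fiber instead
  end

  def checkin(conn : T)
    @lock.synchronize { @idle.push conn }
  end

  # Block form mirroring Model.with_connection from this PR: the
  # connection goes back into the pool even if the block raises.
  def with_connection
    conn = checkout
    begin
      yield conn
    ensure
      checkin conn
    end
  end
end
```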

The pool shard does allow pinning a connection to the current fiber, but I don't know that I'd recommend using that functionality. In a web app, each HTTP::Server::Context is handled in a new fiber that gets thrown away when the request completes. Rails, on the other hand, uses a pool of persistent threads, which is why ActiveRecord::Base.connection makes sense there. Doing that here might leak connections as each fiber is thrown away.

It's possible that isn't true and that once the Neo4j::Bolt::Connection object goes out of scope and gets GCed its connection will be closed, but even in that case, we'd still be spinning up a new connection for each request if we used one connection per fiber.

@anamba anamba merged commit cd58c38 into upspring:master Nov 18, 2018
@anamba
Contributor

anamba commented Nov 18, 2018

Thanks for this! Very helpful.

One thing I am wondering about though is whether there could be any cases where a connection gets into an invalid state before being returned to the pool, and the next unfortunate query checks out that connection and fails. But I haven't quite used your library enough yet to know what kinds of failures are possible. Maybe this is a total non-issue. 😄

@jgaskins jgaskins deleted the connection-pool branch November 18, 2018 23:49
@jgaskins
Contributor Author

I won't say it's impossible, but I've added a few things to try to guard against that case. In case something goes wrong within the query or while deserializing, RESET is an actual Bolt command that clears all state from the connection on the server side, and it's exposed via Connection#reset.

One place this is handled automatically is inside transactions. If a Neo4j-specific exception occurs inside a transaction block, the transaction gets rolled back and the connection acknowledges the failure to the server (the server sends Ignored results for every query until you send ACK_FAILURE, kinda like Postgres, except we can keep going within the same transaction once we send that command). If any other exception is raised, we roll back and reset the connection to a pristine state.
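The two rescue paths described above could be sketched like this. This is an illustrative paraphrase of the behavior, assuming stand-in names — `run`, `ack_failure`, and `Neo4jException` are placeholders, not the driver's exact internals (though `reset` does exist as Connection#reset):

```crystal
# Illustrative sketch only — method and exception names are stand-ins.
class Neo4jException < Exception
end

class SketchConnection
  def run(query : String)
    # send the query over Bolt...
  end

  def ack_failure
    # send ACK_FAILURE so the server stops ignoring subsequent queries
  end

  def reset
    # send the Bolt RESET command to clear all server-side state
  end

  def transaction
    run "BEGIN"
    yield self
    run "COMMIT"
  rescue e : Neo4jException
    # Neo4j-specific failure: roll back and acknowledge it to the server
    run "ROLLBACK"
    ack_failure
    raise e
  rescue e
    # any other failure: roll back and reset to a pristine state
    run "ROLLBACK"
    reset
    raise e
  end
end
```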

Because the driver eagerly deserializes data from Bolt, this shouldn't happen in a way that needs to be handled outside the driver — by the time execute returns, the query is completely finished and all results are consumed. I'd love to hear about any failure cases that do leave the connection in a weird state, though. I just haven't seen any, despite doing as many weird things as I could think of. :-)

I've experimented with streaming responses/lazy deserialization (instead of Array(Array(Neo4j::Type)), the result contains an Iterator(Array(Neo4j::Type))), which definitely could (and did in some cases) put a connection into that sort of inconsistent state. I need to read through the Java driver a bit more to understand how they do it, but unless a streaming response performs better in all scenarios, is roughly API-compatible with the current execute implementation, and doesn't come with asynchronicity issues, execute will always be synchronous and the streaming API will be separate.

@anamba
Contributor

anamba commented Nov 21, 2018

BTW I did find today that an exception in one of the specs would cause all the remaining specs to fail. Adding a rescue inside {{@type.id}}::QueryProxy#execute that calls Connection#reset before returning the connection to the pool allowed the other specs to run normally.
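That workaround might look roughly like the sketch below, assuming the macro-generated QueryProxy#execute checks a connection out of the pool; the exact structure is paraphrased, not copied from the library:

```crystal
# Sketch of the rescue-and-reset workaround described above.
def execute(query : String)
  Model.with_connection do |conn|
    begin
      conn.execute query
    rescue e
      # Clear any bad server-side state before the pool hands
      # this connection to the next query.
      conn.reset
      raise e
    end
  end
end
```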
