r2d2 for tokio-postgres #233

Closed
theduke opened this issue Feb 14, 2017 · 20 comments
@theduke commented Feb 14, 2017

Since Postgres does not support multiplexing / concurrent queries on a single connection, do you have ideas about how to handle multiple connections?

Would r2d2 be a good fit? I don't know how r2d2 manages connections internally. Does it use an extra thread, or does it just do maintenance whenever a new connection is acquired?

Does r2d2 fit into this scenario?

Would it make sense to come up with a generic pool manager that is designed to be run on an event loop?

Other thoughts?

@sfackler (Owner) commented Feb 15, 2017

I have spent some time thinking about asynchronous pooling, and it's a bit more challenging implementation-wise than synchronous pooling. In particular, async driver methods take self by value, so the current approach of having a smart pointer wrapping the connection will no longer work. The API may end up looking something like

fn get<F, U, T, E>(&self, f: F) -> BoxFuture<T, E>
    where F: FnOnce(Connection) -> U,
          U: Future<Item = (T, Connection), Error = (E, Option<Connection>)>,
          E: From<GetError>
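
For illustration, a caller of that hypothetical get would thread the connection through each step explicitly, roughly like this (futures 0.1 style; Pool, MyError, and the by-value batch_execute combinator are assumptions for the sketch, not an existing API):

// Sketch only: assumes the get signature above plus a hypothetical by-value
// Connection::batch_execute(self, sql) that resolves to the connection on
// success and to (Error, Connection) on failure.
fn setup(pool: &Pool) -> BoxFuture<(), MyError> {
    pool.get(|conn: Connection| {
        conn.batch_execute("CREATE TABLE IF NOT EXISTS foo (id INT)")
            // Hand the connection back on success so the pool can reuse it.
            .map(|conn| ((), conn))
            // On failure the connection may be broken, so report None.
            .map_err(|(e, _conn)| (MyError::from(e), None))
    })
}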
@theduke (Author) commented Feb 15, 2017

Yeah, that is along the lines of what I imagined.

Though I would probably call it run_fn rather than get.

@jwilm (Contributor) commented Feb 15, 2017

Apparently Hyper's tokio branch has support for keep-alive (at least on the client side), which may be a good model for tokio-postgres, both in the API paradigm and the implementation.

For example, creating a client in hyper is done with something like hyper::Client::new which returns a clonable handle for submitting requests to be run on the event loop. Internally, the event loop will either create a new connection if existing connections are busy (up to some limit) or reuse an existing connection if there is one available.

Abstracting away connection management seems like a perfectly natural thing to do for a tokio-based postgres client, and it would certainly solve the issues with connection ownership.

@sfackler (Owner) commented Feb 15, 2017

Things are a bit more complicated here since a database connection is inherently stateful. You often need to run a sequence of commands on specifically the same connection.

@theduke (Author) commented Feb 15, 2017

Thinking about it a bit more, and along the lines of @jwilm's suggestion, it might be nice to provide an abstraction that removes the need for a connection handle completely, with a tokio_service implementation and some convenience functions on top.

The Chained request variant would allow running multiple requests on the same connection.

#[derive(Clone)]
struct Pool { /* ... */ }

enum Request {
  Execute { query: String, params: Vec<Box<ToSql>> },
  BatchExecute { query: String },
  Query { query: String, params: Vec<Box<ToSql>> },
  Chained(Vec<Request>),
}

enum Response {
  Ok,
  RowsAffected(u64),
  Rows(Rows),
  Chained(Vec<Result<Response, Error>>),
}

impl Service for Pool {
  type Request = Request;
  type Response = Response;
  type Error = PoolError;
  type Future = BoxFuture<Self::Response, Self::Error>;

  fn call(&self, req: Request) -> Self::Future { /* ... */ }
}

impl Pool {
  fn execute<S: Into<String>>(&self, stmt: S, params: &[&ToSql]) -> BoxFuture<u64, Error> {
    self.call(Request::Execute { /* ... */ })
  }

  fn chain(&self) -> ChainBuilder {
    ChainBuilder { pool: self.clone(), requests: Vec::new() }
  }
}

struct ChainBuilder {
  pool: Pool,
  requests: Vec<Request>,
}

impl ChainBuilder {
  fn and_execute<S: Into<String>>(mut self, stmt: S, params: &[&ToSql]) -> Self {
    self.requests.push(Request::Execute { /* ... */ });
    self
  }
  // ...

  fn run(self) -> BoxFuture<Vec<Result<Response, Error>>, Error> { /* ... */ }
}
@jwilm (Contributor) commented Feb 15, 2017

@sfackler understood, but I don't think it changes what an ideal API looks like. Projects like PgBouncer have managed to address this sort of complexity in connection pooling.

In the case of futures/tokio, I imagine that each transaction or series of statements would be wrapped up into a single future. Each connection could be a task that drives one future to completion.

@theduke (Author) commented Feb 15, 2017

It probably makes sense to have a PoolFuture type that holds the Connection, a Pool handle, and the actual future as its state.

That would make chaining easy, and the connection can be released back to the pool either in poll upon completion of the query, or in Drop if it's dropped without being completed.
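
A minimal sketch of such a PoolFuture, assuming futures 0.1 and a hypothetical Pool::release method, might look like:

use futures::{Async, Future, Poll};

// Sketch only: wraps an inner future, handing the connection back to the pool
// when the inner future resolves, or in Drop if it is abandoned mid-flight.
struct PoolFuture<F> {
    pool: Pool,
    conn: Option<Connection>,
    inner: F,
}

impl<F: Future> Future for PoolFuture<F> {
    type Item = F::Item;
    type Error = F::Error;

    fn poll(&mut self) -> Poll<F::Item, F::Error> {
        let result = self.inner.poll();
        match result {
            Ok(Async::NotReady) => Ok(Async::NotReady),
            done => {
                // Completed (Ok or Err): return the connection to the pool.
                if let Some(conn) = self.conn.take() {
                    self.pool.release(conn); // hypothetical method
                }
                done
            }
        }
    }
}

impl<F> Drop for PoolFuture<F> {
    fn drop(&mut self) {
        // Dropped before completion: still give the connection back.
        if let Some(conn) = self.conn.take() {
            self.pool.release(conn);
        }
    }
}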

I'll start hacking on an implementation and do a PR for feedback.

@sfackler (Owner) commented Feb 16, 2017

PgBouncer is a pretty different story from my understanding - you're maintaining a persistent connection to the bouncer, which is bound to one or more Postgres connections in a configurable way: https://pgbouncer.github.io/features.html

Keep in mind that buffering up a series of commands is not sufficient - you need to be able to run a query and then run some code that inspects the result of that before deciding what to do next.
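
In futures 0.1 terms that usually means an and_then chain over one checked-out connection, e.g. the following sketch (query_scalar is a hypothetical helper and the error shape follows the by-value sketch earlier in this thread; none of this is the current tokio-postgres API):

// Sketch only: the second statement depends on inspecting the result of the
// first, so both must run on the same connection; a pool that only replays a
// pre-built batch of statements cannot express this.
fn flow(conn: Connection) -> BoxFuture<Connection, (Error, Connection)> {
    conn.query_scalar("SELECT count(*) FROM jobs WHERE state = 'queued'") // hypothetical helper
        .and_then(|(queued, conn): (i64, Connection)| {
            let next = if queued > 0 {
                "UPDATE workers SET state = 'busy'"
            } else {
                "UPDATE workers SET state = 'idle'"
            };
            conn.batch_execute(next)
        })
        .boxed()
}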

@theduke (Author) commented Feb 16, 2017

You could always fall back on manually forwarding the connection if you need that, though.

I think the majority of queries won't actually need this.

@jwilm (Contributor) commented Feb 16, 2017

@sfackler PgBouncer's transaction pooling mode seems analogous to what's needed here. In this mode, PgBouncer shares one database connection between multiple client connections for the duration of a transaction. PgBouncer has the added challenge that it actually needs to parse some of the query data to detect transactions.

Keep in mind that buffering up a series of commands is not sufficient - you need to be able to run a query and then run some code that inspects the result of that before deciding what to do next.

I think this is addressed with a transaction pooling approach; the connection is not returned to the pool until the Future that started a transaction either commits or rolls it back. A series of statements outside of a transaction wouldn't be guaranteed to execute on the same connection. I don't see any problems with that, but maybe you've got something in mind.
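
A sketch of that lifecycle, under the assumptions above (Pool::checkout and Pool::release are hypothetical names, error plumbing is simplified to a bare Error, and rollback handling is elided):

// Sketch only: the connection leaves the pool when the transaction starts and
// is handed back only after COMMIT, mirroring PgBouncer's transaction pooling.
fn transactionally<F>(pool: Pool, body: F) -> BoxFuture<(), Error>
where
    F: FnOnce(Connection) -> BoxFuture<Connection, Error> + Send + 'static,
{
    pool.checkout()
        .and_then(|conn| conn.batch_execute("BEGIN"))
        .and_then(body)
        .and_then(|conn| conn.batch_execute("COMMIT"))
        .map(move |conn| pool.release(conn)) // returned to the pool only now
        .boxed()
}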

@sfackler (Owner) commented Feb 17, 2017

I am pretty wildly opposed to anything that involves trying to parse query strings.

Another reason I want to have a separate pooling library is that there's a bunch of logic involved that can be shared among all of the various database backends.

@jwilm (Contributor) commented Feb 17, 2017

Hmm, I didn't mean to suggest you actually parse query strings. As the client library, you have a priori knowledge of whether you're in a transaction or not.

Having a separate pooling library seems fine, but as a user, I don't really want to be managing my own connections.

@maurer (Contributor) commented Mar 15, 2017

@sfackler It's not done yet, but I was interested in any comments you might have on my attempt to clone r2d2 into an asynchronous style. It's at maurer/r5d4.

Notable questions I have before I try to document/robustify it:

  • Does it make more sense for this to live in a separate repository, or in the main one?
  • I'm currently using a version of BoxFuture whose outputs are not necessarily Send, since asynchrony doesn't require multithreading (see the type-alias sketch after this list). Is this going overboard? Should we use the regular BoxFuture from futures?
  • Unless I'm mistaken, tokio_postgres's TLS support doesn't allow reusing TLS modes at the moment, which is why r5d4_postgres just uses no TLS. Is that the case? Can we work around it?
  • From my trivial test it's clear that we either need some adapter functions to make fitting the function type more convenient, or some kind of fancy trait work on the output. Do you have an opinion on which way it makes sense to do it?
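
For reference, the Send question is just about the bound on the boxed trait object. futures 0.1 defines BoxFuture with the bound, and a single-threaded variant (the LocalBoxFuture name below is only illustrative) simply drops it:

use futures::Future;

// futures 0.1's alias requires Send so the future can move across threads:
pub type BoxFuture<T, E> = Box<Future<Item = T, Error = E> + Send>;

// A non-Send variant is enough for an event loop pinned to one thread:
pub type LocalBoxFuture<T, E> = Box<Future<Item = T, Error = E>>;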
@sfackler (Owner) commented Mar 15, 2017

Nice! I will take a look when I have some time, which unfortunately is probably not until tomorrow.

@rrichardson commented Dec 11, 2017

Is this still being worked on? Is there a new strategy/direction that obviates this effort?

@maurer (Contributor) commented Dec 11, 2017

I am not working on it; the work that needed it no longer uses Postgres. The repo is still there as a starting point if anyone wants to complete it.

@khuey (Contributor) commented Jan 29, 2018

I wrote a crate that's basically r2d2 but async, with a tokio_postgres adapter.

https://github.com/khuey/bb8

@nicoburns commented Jul 29, 2018

Adding this to the list of async pool implementations: https://github.com/tailhook/tk-pool
Also, redis-rs would be a good candidate for anyone wanting to prove out an implementation with a second adapter...

@tyranron commented Jan 22, 2019

Again, adding to the list of async pools.

@jwilm pointed out:

you can try out the l3-37 connection pooler we wrote and use at OneSignal. The pool can be cloned indefinitely, and you can ask it for a connection via a method like get_conn() -> Future<Conn>.

@sfackler closed this Apr 7, 2019