Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement connection pools in contrib #167

Closed
SergioBenitez opened this issue Feb 3, 2017 · 20 comments
Closed

Implement connection pools in contrib #167

SergioBenitez opened this issue Feb 3, 2017 · 20 comments
Labels
enhancement A minor feature request help wanted Contributions to this issue are needed
Milestone

Comments

@SergioBenitez
Copy link
Member

SergioBenitez commented Feb 3, 2017

Rocket's contrib should make it ease to retrieve a connection to a database from a connection pool. My initial thoughts on a design for this can be summarized as:

  • All configuration, including the pool size, which database to use, credentials, etc, should be done in the Rocket.toml file. It would be nice if configuration issues were reported at Rocket initialization, but this likely requires a more involved change to Rocket.
  • Dependencies should be included on an as-needed basis. For instance, contrib shouldn't pull in postgres if the user only wants sqlite.
  • A single type should be exposed per configured database. The type should implement FromRequest, which results in a connection being taken from the connection pool.
  • Multiple databases, including several of the same type, should be allowed in a single project.
  • All major databases with Rust adapters should be supported, this includes:
    • postgres
    • redis
    • mysql
    • sqlite
    • cypher/neo4j
    • couchdb
    • diesel (postgres, sqlite)

The list of "major databases" is just that from the r2d2 adapters page. The r2d2 library should likely be used as the backing connection pool. The state crate should likely be used to make the connection pool accessible globally to the contrib database module.

Edit: To clarify, it should be possible, and indeed a requirement of a good implementation, to use multiple databases in a single project, and even multiple different connections to a single type of database. For instance, I should be able to get handles to two different MySQL databases, a Redis store, and a PostgreSQL database without issue.

@SergioBenitez SergioBenitez added the enhancement A minor feature request label Feb 3, 2017
@SergioBenitez SergioBenitez added this to the 0.4.0 milestone Feb 3, 2017
@SergioBenitez SergioBenitez added the help wanted Contributions to this issue are needed label Feb 3, 2017
@impowski
Copy link
Contributor

impowski commented Feb 3, 2017

I might take on implementation of Diesel, will see in couple of days if I don't go anywhere.

@Maaarcocr
Copy link

Would this be a good start point for helping with the project? I would love to be involved.

@impowski
Copy link
Contributor

impowski commented Feb 3, 2017

So this is what I see so far, there is an easy way of doing it, and a hard way. The easy way will be just create a new directory inside of contrib call it database and put there multiple files for every database which we want to implement like postgres.rs, diesel.rs etc. Then wrap it in state::Storage and I guess implement all the things which we need for it like FromRequest and etc.

For the hard way we will need to implement a structure which will wrap any type of connection, like struct Database<C> which will accept any connection manager which implement ManageConnection and it will hold all information we need and PooledConnnection inside of it.

Then we need to think about our global state, which will create our global pool, so I guess it will like a helper function which will accept Config<C, Error> and connection url, and return our pool. Not sure how it will be done with state, but I guess Storage<Pool<C>> and we will work from it.

Implement checks on it if something wrong and then just use it. Then we will implement FromRequest on our Database struct and return any type of connection we will need, and then we will pass it in request guards, and just Database::connection will give us needed one, so the user will use some. Even better will be if we implement our own wrapper for connection so user can use it and don't thing about writing like:

pub fn create(connection: &PgConnection, username: &str)

And it will be:

pub fn create(connection: &OurWrapper, username: &str)

OurWrapper will basically will work with any type of connection. And then when a user will implement an API he will just use Database::connection which will return a reference on OurWrapper which will be a connection to a database, with some other safety checks.

So about Rocket.toml, I guess we will use like [development.database], where we will have a type (which will give us a needed connection through deserialization with enum), url (which we will use inside of our r2d2::Config, like pool_size maybe and something else.

With the first one it will be not hard to use all of those cfg things and we will not pull everything out, it will be less painful to implement it and much safer, but it will generate a little bit more code and at least our connections will not collide and we will know that we use diesel and diesel macros. And I guess second way of doing it really will make no sense with different types of connections it will only be more confusing.

We still could implement a wrapper for every database type, like I said like MysqlDatabase which will return reference on MysqlConnection and same for others. With Diesel the problem is that we have a two of them but it's not so hard to work it out. And like @SergioBenitez said, we don't really want to compile everything, because a user will decide which database he will use with which features, will it be diesel with postgres or sqlite, or just postgres or just mysql. And with such modular thing it will make more sense for him too when he will see that he uses a MysqlDatabase or a DieselDatabase.

@mehcode
Copy link
Contributor

mehcode commented Feb 3, 2017

OurWrapper will basically will work with any type of connection. And then when a user will implement an API he will just use Database::connection which will return a reference on OurWrapper which will be a connection to a database, with some other safety checks.

👎 on this (if I understand what you're saying correctly). Rocket should not strive to allow code-less database driver changes. That's the job of something like Diesel.


I also would like to bring up a problem with using request guards / FromRequest for this. An API I worked on recently used 3 databases (2 postgres and 1 redis). Django took forever to cleanly support multiple databases.

Don't have a good idea to fix this but it's something we should consider.

@impowski
Copy link
Contributor

impowski commented Feb 3, 2017

@mehcode Like I said I consider the second way is more complicated and not worth it.

@flosse
Copy link

flosse commented Feb 4, 2017

Some day I could contribute for the neo4j part :)
At the moment it works with lazy_static:
https://github.com/flosse/openfairdb/blob/master/src/infrastructure/web/neo4j.rs

@scull7
Copy link

scull7 commented Feb 6, 2017

I'm working on an API right now using rocket where we not only need connections to multiple databases but connections to multiple 3rd party REST APIs. Each of these APIs needs to have configuration and some need persistent connections (connection pooling). I think that designing around the case of a single database connection is not the right way to go. Even if you use just one database, what if you have sharded connections?

If it's going to be included into the library it should consider multiple connections from the start. From my very limited knowledge I think you could just write a macro that would instantiate a state container which could then be passed to the manage() function.

I think something like the following would be the very ergonomic:

// A very trivialized and contrived example.
struct MyCustomPool {
  foo: String,
  bar: String,
  thing: Option<String>,
}

impl MyCustomPool {
    pub fn init(f: String, b: String, t: Option<String>) -> Result<MyCustomPool, Error> {
        Ok(MyCustomPool { foo: f, bar: b, thing: t })
    }
}

impl PoolState for MyCustomPool {
    fn retrieve(&self) -> Result<&FooConnection, Error> {
        Ok(&self)
    }
}

conn1 = rocket_contrib::mysql::Connection::init("usa.mysql",  1234, "user", None);
conn2 = rocket_contrib::mysql::Connection::init("eu.mysql", None, "user", "password");
conn3 = MyConfig::init("foo", "bar", None);

// state_pool!(<container_name>, <implementation_name>, <instance>);
let pool1 = state_pool!("MysqlUs", rocket_contrib::mysql::Pool, conn1);
let pool2 = state_pool!("MysqlEu", rocket_contrib::mysql::Pool, conn2);
let pool3 = state_pool!("Foo", MyCustomPool, conn3);

rocket
    .manage(pool1)
    .manage(pool2)
    .manage(pool3);

I'm sure there's much that could be done to remove much of the boiler plate (maybe even a #[derive StatePool]), but as an initial implementation it certainly would make setup of several connections trivial. I've only been writing Rust for the past 6 days, so I have no idea regarding the merit of my ideas yet, but hopefully they're at least constructive to the discussion.

@SergioBenitez
Copy link
Member Author

SergioBenitez commented Feb 6, 2017

I've updated the initial post to clarify: I consider it a requirement of this feature to be able to use multiple different databases of the same or different type in a single project.

@shuaizhiwen
Copy link

why not mongodb?

@tomhoule
Copy link

tomhoule commented Jan 14, 2018

I'm looking into working on this feature, and one difficulty we have is that we want to be able to have multiple databases with the same (r2d2/adapter) connection type and still be able to distinguish them by their type, at compile time, so we can implement FromRequest for each individual database. That way we open only the connections we need in each handler without boilerplate in the handler body.

Here's one possible solution:

In Rocket.toml

[development.databases.mysql-eu]
host = "..."

[development.databases.mysql-us]
host =  "..."

[development.databases.redis]
host = "..."

In the app

#[derive(DatabaseConnectionPool)]
struct MysqlUs;

#[derive(DatabaseConnectionPool)]
struct MysqlEu;

#[derive(DatabaseConnectionPool)]
#[name = "redis"]
struct RedisConnection;

Since I don't think it is possible to read the Rocket.toml in a custom derive implementation, the generated implementation would have to fail on initialization if there is no corresponding database entry in Rocket.toml.

One other question is where the custom derive code should live. I thought it would have to be in rocket-codegen or its own crate, which would be unfortunate, but apparently diesel manages to reexport the derive functionality so we could conveniently put it behind a feature gate.

Does this look like a reasonable way to approach this feature?

edit: I have thought about what the impl for DatabaseConnectionPool for the structs would be, it's not as straightforward as I initially imagined. They probably need to hold an r2d2 pool of the right type, etc. Also, an ordinary macro probably makes more sense than a custom derive.

@ELD
Copy link
Member

ELD commented Feb 2, 2018

@SergioBenitez So, I've thought a bit about this too and would be interested in working on implementing it. It's similar to the one above here and also seems similar (but not identical) to your wishlist from your RustConf talk. For this particular approach, the Cargo.toml is what's used to specify the connection type(s). The #[derive(DbConn)] concept is almost identical to mine.

Cargo.toml:

...
[dependencies]
rocket_contrib = { version = "0.4", features = ["diesel_postgres"] }
...

Example usage:

#[derive(ConnectionPool)]
pub struct PrimaryConnection(pub Pool<ConnectionManager<PgConnection>>);

fn main() {
    let my_connection_string = env::var("DATABASE_URL").ok();
    rocket::ignite()
        .manage(init_pool::<PrimaryConnection>(my_connection_string))
        .mount("/", routes![test])
        .launch();
}

#[get("/")]
fn test(primary_connection: PrimaryConnection) {}

I think it's a good idea to refine the desired API before beginning work on this so I know what the desired outcome would be before getting too far in working on an implementation.

@SergioBenitez
Copy link
Member Author

SergioBenitez commented Feb 2, 2018

@tomhoule This is approximately the interface I had in mind as well. In particular, pages 84 through 86 of my RustConf talk slides on Rocket demonstrate what I had in mind. There, the user's type represents a single connection, not a pool, and directly indicates the type of connection via an inner type.

Because configuration can be dynamically set at run-time via environment variables or in the code itself, we can't rely on anything in Rocket.toml; we can only rely on the Config at run-time. Thus, the implementation would, indeed, need to fail at initialization. This is made nice and clean with attach fairings. This implies that the user is going to need to attach a fairing for this to work. In all, the user will need to write something like the following:

In Rocket.toml, or the equivalent at run-time:

[[global.databases]]
name = "my_sqlite_db"
adapter = "diesel"

[[global.databases]]
name = "my_postgres_db"
adapter = "rust-postgres"
url = "pg://my/postgres/db"
username = "sergio"
password = "honyedoodle"

In main.rs:

use rocket_contrib::Databases;

#[derive(DbConn)]
#[database("my_sqlite_db")]
struct DbConn(diesel::SqliteConnection);

#[derive(DbConn)]
#[database("my_postgres_db")]
struct DbConn(postgres::Connection);

fn main() {
    rocket::ignite()
        .attach(Databases::fairing())
        .launch();
}

The Databases fairing would read the Config at run-time and generate the pools for the specified adapters, building a structure mapping database name to database pool. We'd implement FromRequest for all T: DbConn by looking up the database pool for the given database name, grabbing one connection, and storing it in the user's type. This is identical to what we advocate for in the guide.

Ideally, it would be impossible for the request guard implementation to fail if the database fairing attaches successfully. This would mean that we check, at initialization, that the user configured a database with every given name of the right type. I'm not sure how to accomplish this just yet, but I'd love ideas. One possible approach is for the DbConn derive to store structures containing the relevant database information in a special section of the binary. The fairing would then iterate through structures in this section, checking the required information.

As far as where the code should go: we should write the code in a new proc-macro crate (still likely a derive) and reexport it from contrib when a feature-flag is set.

@ELD
Copy link
Member

ELD commented Feb 4, 2018

@tomhoule Just curious if you're already working on this? If not, would you mind if I take it on. You seem to have a similar idea to Sergio, so if you've already started working on it, that's no problem.

@tomhoule
Copy link

tomhoule commented Feb 4, 2018

No I haven't, and I am unlikely to have time for this in the near future, so by all means feel free to start :)

I noticed there is already an open PR for this feature, but I haven't looked at it in detail yet: #542

@ELD
Copy link
Member

ELD commented Feb 5, 2018

@tomhoule Sounds good. I'll probably start hacking on this over the next few days and see what I can do about it.

Excited to see this functionality land in the project!

@jebrosen
Copy link
Collaborator

Leaving this note here so it's not forgotten: it may be convenient, in the future, to allow to "shortcut" the url in the same way as cargo allows for versions:

My already-proposed structure for #571, using database name as the key:

[global.databases]
db1 = { url = "db1.sqlite" }
db2 = { url = "mysql://.../db2", pool_size = 3 }
db3 = { url = "postgres://.../db3" }

With my proposed url shortcut as well:

[global.databases]
db1 = "db1.sqlite" 
db2 = { url = "mysql://.../db2", pool_size = 3 }
db3 = "postgres://.../db3"

This would also simply environment variable syntax from ROCKET_DATABASES={db1={url="db1.sqlite"}} to ROCKET_DATABASES={db1="db1.sqlite"}

@SergioBenitez
Copy link
Member Author

As a note: I'd love for our database support to make transactional testing trivial.

@JeanMertz
Copy link

As a note: I'd love for our database support to make transactional testing trivial.

Has anything changed in recent months regarding this, or is the Todo test still the canonical way to do integration testing using Rocket and (e.g.) Diesel?

@SergioBenitez
Copy link
Member Author

@JeanMertz Nothing has changed officially. @ELD, we'd love your take on how we can make this happen.

@ELD
Copy link
Member

ELD commented May 14, 2019

@SergioBenitez @JeanMertz I'd love for this to be the case, too. I've given it some passing thought, but I need to devote more time to thinking about how to go about making this happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A minor feature request help wanted Contributions to this issue are needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.