Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: consider getting a single conn from the pool #18081

Closed
kardianos opened this issue Nov 28, 2016 · 9 comments
Closed

database/sql: consider getting a single conn from the pool #18081

kardianos opened this issue Nov 28, 2016 · 9 comments

Comments

@kardianos
Copy link
Contributor

@kardianos kardianos commented Nov 28, 2016

In some instances it is important to be on the same connection between queries.
This is especially true for:

  • DDL Queries
  • Queries that use session specific temp tables

It isn't always desirable to use transactions in these cases due to the nature of DDL or certain long running query locking behavior.

The API could look like this:

func (db *DB) Connection(ctx context.Context) (Queryer, error) {...}

Where Queryer could be an interface or struct. When the ctx closes the Queryer connection would be closed and returned to the pool if possible.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Nov 28, 2016

We've always told people to use a Transaction for such things. Does that not work?

@kardianos
Copy link
Contributor Author

@kardianos kardianos commented Nov 28, 2016

DDL (database schema changes) and long running queries esp with temp tables are where that would fall down. Many DDL can't be rolled back, and you don't want a tran to lock or bloat database without need.

I could be wrong here, but I'd like to track it at least for a bit and report exact findings here.

@kardianos kardianos self-assigned this Nov 28, 2016
@bradfitz bradfitz added this to the Go1.9Maybe milestone Nov 28, 2016
@j7b
Copy link

@j7b j7b commented Nov 30, 2016

Many rdbms' support transactional DDL (db2 and postgresql come to mind). Most of the subtleties of temporary tables and session scope are avoided by using explicit transactions. It sounds like this issue addresses particular mysql-related deficiencies or peculiarities. The "lock or bloat" seems to presume that autocommit is somehow more lightweight than explicit transactions and isolation levels, which if valid sounds again like a deficiency or peculiarity of a particular backend. The Context type doesn't make sense in general in the database/sql package but in this particular implementation-as-proposal ctx being context.Background() would seem to imply the ability to permanently check a connection out of a pool which seems odd design for a pool, some mechanism to get a session for a given driver and connection string independent of a pool would be more reasonable and there's probably compelling cases for that.

@kardianos
Copy link
Contributor Author

@kardianos kardianos commented Nov 30, 2016

@j7b Let me flush out my thinking here and indirectly respond to you.

My primary experiences with databases are in the following order: MS SQL Server, Postgresql, Oracle, Sqlite, MySQL. In databases that take locks, such as SQL Server and MySQL, holding onto locks is often a huge issue. (At least one department in my State has mandated the use of (no lock) everywhere!) Sometimes I will try to do transactions in multiple batches batches just to ensure locks are released sooner. Part of this is not doing all the work in a single transaction, but doing it a part at a time.

In SQL Server in particular, it is common to have long alter scripts that reads like:

set xact_abort on;
begin tran;
alter table Invoice add running_total decimal(30,4) null;
go -- This separates query batches and happens prior to the database engine.
update Invoice set running_total = 0;
alter table Invoice alter column running_total decimal(30,4) not null;
commit tran;

The above query would be executed as two different sql batches (two different db.Exec calls). You wouldn't want to run this script within a transaction (it already contains one, but don't dwell on the details here, it is just an example) nor would you want to run this within two different sessions. One solution for this use case would be for the driver to do a basic parse and submit separately, but that would add overhead to the normal case. However with the ability to remove a single connection, use it for a stated duration, then return it, it allows other devs to handle this case with ease.


To address some of the points you mentioned:

Most of the subtleties of temporary tables and session scope are avoided by using explicit transactions.

Right. Often this comes up not in the context of an app itself, but in building plumbing around the app. And are assuming you want the driver to begin the transaction and not the script. This isn't always the case.

The "lock or bloat" seems to presume that autocommit is somehow more lightweight than explicit transactions and isolation levels, which if valid sounds again like a deficiency or peculiarity of a particular backend.

Thank you, I'll get right on moving all our clients to a better backend... That was slightly sarcastic. I'm not sure what your point was. I will freely acknowledge that some of the backends I use have deficiencies. I still have a product to deliver. It isn't a matter of lightweight or not, it is a matter of locking rows that have been read in a Tx so another process can't modify them until they are released, which could be a long time in a batch process as many of these use cases are.

The Context type doesn't make sense in general in the database/sql package but in this particular implementation-as-proposal ctx being context.Background() would seem to imply the ability to permanently check a connection out of a pool which seems odd design for a pool

Yes, you could do that, but you probably wouldn't. Normally in these uses you have a tight loop of queries you want to execute on the same session before returning them. There are many things you could do, but I'm not suggesting you do that.

some mechanism to get a session for a given driver and connection string independent of a pool would be more reasonable and there's probably compelling cases for that.

That would also be fine, but probably more work then just using the pool. Today you can call db.Driver().Open(dsn) to get a driver conn, but then you loose the client API.

... To be clear, I'm not sure this API is worth it. I will also admit that the cases where you need this is much smaller than the normal application use case. This isn't so much about transactional DDL, but about operations that must be run in different batches that involve DDL, alter scripts, and nightly processes more than anything.

@j7b
Copy link

@j7b j7b commented Dec 1, 2016

@kardianos thank you for sharing your train of thought.

To my way of thinking database/sql should be as generic as possible, as described in the package comment, so my point in pointing out several databases have transactional ddl was mainly that for a number of existant drivers your specific justifications work adequately with Transactions.

In terms of your implementation, mandating the Done() channel close is probably against the intent of the context package and probably unprecedented in the standard library as it invalidates the reason for Background() and TODO() to exist, so at the very least Querier needs a Close() that must be called.

But I think those points are probably moot as it appears a driver.Conn isn't necessarily something you could treat as something that would be Querier-esque for your purposes, I think you'd need an optional lower-level interface in driver and some sort of request/response interface that'd probably look, in practice, similar to something based on os/exec'ing a cli for a given database.

@kardianos
Copy link
Contributor Author

@kardianos kardianos commented Dec 1, 2016

@j7b It could also return a QueryCloser. It is just the connection would also be returned when the ctx wasn canceled.

The driver wouldn't need anything new. The change would just be in the sql package.

type Queryer interface {
    ExecContext(ctx context.Context, query string, args ...interface{}) (Result, error)
    QueryContext(ctx context.Context, query string, args ...interface{}) (*Rows, error)
    QueryRowContext(ctx context.Context, query string, args ...interface{}) (*Row, error)
    PrepareContext(ctx context.Context, query string) (*Stmt, error)

     // This would imply nested tx API rather than a RollbackTo/SavePoint API.
    BeginContext(ctx context.Context) (*Tx, error)
}
type QueryCloser interface {
    Queryer
    io.Closer
}

// Conn returns a connection from the database pool. The connection may be
/// returned to the pool by calling Close or canceling the connection.
func (db *DB) Conn(ctx context.Context) (QueryCloser, error) {...}

Assuming we were to move forward with a generic interface of Queryer, which has its own problems, the total API addition would be small. If we were to return a new type rather than an interface, the API surface wouldn't be huge.

type Queryer struct {...}
func (q *Queryer) ExecContext(ctx context.Context, query string, args ...interface{}) (Result, error)
func (q *Queryer) QueryContext(ctx context.Context, query string, args ...interface{}) (*Rows, error)
func (q *Queryer) QueryRowContext(ctx context.Context, query string, args ...interface{}) (*Row, error)
func (q *Queryer) PrepareContext(ctx context.Context, query string) (*Stmt, error)
func (q *Queryer) BeginContext(ctx context.Context) (*Tx, error)
func (q *Queryer) Close() error

// Conn returns a connection from the database pool. The connection may be
/// returned to the pool by calling Close or canceling the connection.
func (db *DB) Conn(ctx context.Context) (*Queryer, error) {...}

Behind the scenes it would only add a minimal amount of code. But that is the implementation.

The question I want to answer is "What is the need?" and "Does it enable reasonable uses?" I'm not too worried about the API semantics, esp at this point.

@disq
Copy link

@disq disq commented Jan 5, 2017

DDL statements autocommit. So, if I were to create a temporary table in the middle of my SQL session, and if I had uncommited (DML) changes in the same transaction, they would be committed. This is not ideal.

Temporary tables are usually connection-bound, so I can't create the said temporary table outside of the sql.Tx and expect it to be available on the next line when I get a connection from the pool using DB.Begin().

So, a separate method/way to reserve a connection from the pool is needed here. Something like DB.Begin(), but not calling the dc.ci.Begin(), so no transaction is started on the driver. Clearly separate the actual pool (on the database/sql) and the transaction (on the driver side) if you will.

@disq
Copy link

@disq disq commented Jan 24, 2017

As a workaround, recently I had to manually implement connection pooling mechanism in a private project. Redacted code can be seen here: https://gist.github.com/disq/63f1f7e5e1c4756aca19436e6214d1b1

The said project heavily uses temporary tables and runs as a long-lived daemon, so a pool of 1-connection connection pools had to be used.

@gopherbot
Copy link

@gopherbot gopherbot commented Apr 13, 2017

CL https://golang.org/cl/40694 mentions this issue.

@gopherbot gopherbot closed this in d234f9a Apr 24, 2017
@golang golang locked and limited conversation to collaborators Apr 24, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.