
database/sql: lock contentions in DB, Stmt #9484

Closed
methane opened this issue Jan 1, 2015 · 20 comments

@methane (Contributor) commented Jan 1, 2015

This happens when using the same prepared statement concurrently. I confirmed it with 128 concurrent HTTP accesses.

Issue happened here: TechEmpower/FrameworkBenchmarks#1164
And stackdump: https://gist.github.com/methane/dd3c3f8de8128730bd63

There are some issues:

1. db.mu contention

stmt.Query() uses addDep and removeDep, which lock db.mu.
stmt.connStmt() calls db.connIfFree() multiple times. Each call locks and unlocks db.mu.

2. stmt.connStmt() holds stmt.mu for a long time

Stmt.css may be fairly long (e.g. 128 entries).
https://github.com/golang/go/blob/master/src/database/sql/sql.go#L1375-L1388
This loop can take a long time.
Moving the db.mu lock from inside DB.connIfFree to before the loop makes it a bit faster.

But I think moving Stmt.css into driverConn would be better.
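To illustrate the second problem, here is a simplified sketch of the shape of that loop (the types and function names below are hypothetical stand-ins, not the actual sql.go code): each cached connection in Stmt.css is checked against the pool's free list, so the work is roughly the product of the two lengths, all while holding the statement mutex.

```go
package main

import "fmt"

// Hypothetical simplified stand-ins for database/sql internals.
type driverConn struct{ id int }

// connStmt records that a statement was prepared on a connection.
type connStmt struct{ dc *driverConn }

// quadraticPick mimics the shape of the loop in Stmt.connStmt:
// for every connection the statement was prepared on, scan the
// free list to see if that connection is available. With C cached
// conns and F free conns this is O(C*F) work under stmt.mu.
func quadraticPick(css []connStmt, free []*driverConn) *driverConn {
	for _, cs := range css {
		for _, fc := range free { // connIfFree-style scan
			if cs.dc == fc {
				return fc
			}
		}
	}
	return nil
}

func main() {
	a, b := &driverConn{1}, &driverConn{2}
	css := []connStmt{{dc: a}, {dc: b}}
	free := []*driverConn{b}
	fmt.Println(quadraticPick(css, free).id)
}
```

With 128 concurrent requests and 128 cached connections per statement, this nested scan is where the reported time goes.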

@mikioh mikioh changed the title Lock contention in database/sql.DB database/sql: a few lock contentions in DB, Stmt Jan 1, 2015
@mikioh mikioh changed the title database/sql: a few lock contentions in DB, Stmt database/sql: lock contentions in DB, Stmt Jan 1, 2015
@johto (Contributor) commented Jan 1, 2015

> But I think moving Stmt.css into driverConn would be better.

How would that work? The point of the css slice is to know which connections already have the statement prepared in order to reuse it. If you put it in the driverConn, how do you find out which connection to use?

@johto (Contributor) commented Jan 1, 2015

One idea would be to add another mutex to DB which would only guard DB.dep. It would ease the contention a bit in this case.
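A minimal sketch of that suggestion (field names here are hypothetical, not the actual sql.go code): give the dependency map its own mutex so addDep/removeDep no longer contend with the pool operations guarded by DB.mu.

```go
package main

import (
	"fmt"
	"sync"
)

// DB sketch: depMu guards only the dependency map, so dependency
// bookkeeping no longer contends with pool operations under mu.
type DB struct {
	mu    sync.Mutex // guards the connection pool (freeConn, numOpen, ...)
	depMu sync.Mutex // guards dep only (the proposed split)
	dep   map[interface{}]map[interface{}]bool
}

// addDep notes that x must not be finalized before dep is done.
func (db *DB) addDep(x, dep interface{}) {
	db.depMu.Lock()
	defer db.depMu.Unlock()
	if db.dep == nil {
		db.dep = make(map[interface{}]map[interface{}]bool)
	}
	xdep := db.dep[x]
	if xdep == nil {
		xdep = make(map[interface{}]bool)
		db.dep[x] = xdep
	}
	xdep[dep] = true
}

func main() {
	db := &DB{}
	db.addDep("stmt", "query1")
	db.addDep("stmt", "query2")
	fmt.Println(len(db.dep["stmt"]))
}
```

This is the "depmu" variant benchmarked later in the thread.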

@methane (Contributor, Author) commented Jan 1, 2015

> One idea would be to add another mutex to DB which would only guard DB.dep. It would ease the contention a bit in this case.

Yes, I've tried that and it gives some improvement. (Sorry, I lost the benchmark results.)
But using multiple DBs and Stmts is 2x faster than that.
We should make this loop faster.

@methane (Contributor, Author) commented Jan 1, 2015

> How would that work? The point of the css slice is to know which connections already have the statement prepared in order to reuse it. If you put it in the driverConn, how do you find out which connection to use?

I was talking about changing driverConn.openStmt from map[driver.Stmt]bool to map[*Stmt]driver.Stmt.
https://github.com/golang/go/blob/master/src/database/sql/sql.go#L244

But I can't simply remove Stmt.css because Stmt.finalClose() uses it.
I'll try changing Stmt.css from []connStmt to map[*driverConn]driver.Stmt.
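The point of the map-keyed-by-connection shape can be sketched like this (simplified hypothetical types, not the actual sql.go code): given a candidate connection, the "is this statement already prepared there?" check becomes a single map lookup instead of a linear scan of the slice.

```go
package main

import "fmt"

// Hypothetical simplified stand-ins for database/sql internals.
type driverConn struct{ id int }
type driverStmt struct{ query string }

// Stmt with css as a map keyed by connection, as proposed.
type Stmt struct {
	css map[*driverConn]*driverStmt // was: []connStmt
}

// preparedOn reports whether the statement is already prepared on
// dc, in O(1), without scanning every cached connection.
func (s *Stmt) preparedOn(dc *driverConn) (*driverStmt, bool) {
	ds, ok := s.css[dc]
	return ds, ok
}

func main() {
	dc := &driverConn{1}
	s := &Stmt{css: map[*driverConn]*driverStmt{dc: {query: "SELECT 1"}}}
	ds, ok := s.preparedOn(dc)
	fmt.Println(ok, ds.query)
}
```

finalClose can still iterate the map to close every per-connection statement, which is why css can't simply be deleted.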

@johto (Contributor) commented Jan 1, 2015

> I'll try changing Stmt.css from []connStmt to map[*driverConn]driver.Stmt.

So would the idea then be to iterate once over the freeConn list, doing lookups into Stmt.css?

I wonder if it wouldn't be simpler to just remove the first loop altogether. In other words, get a free conn from conn(), prepare the statement if it hasn't been prepared yet, or just execute if it already has been. This would likely spread the statements over the connections in the pool a bit more than the current routine does, but it's not clear that that would be a bad idea.

@methane (Contributor, Author) commented Jan 1, 2015

@johto See here.
If I remove the first loop, nothing removes garbage from Stmt.css.

@methane (Contributor, Author) commented Jan 1, 2015

My current idea is this: https://gist.github.com/methane/0699f88fa6bc9f1e56de
The first loop is now O(len(Stmt.css) + len(DB.freeConn)) instead of O(len(Stmt.css) * len(DB.freeConn)).

I'll benchmark it, fix the comments, and put it on Gerrit.
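The complexity change can be sketched as follows (simplified hypothetical types, not the gist's actual code): collect the free connections into a set once, then scan Stmt.css a single time against that set.

```go
package main

import "fmt"

// Hypothetical simplified stand-ins for database/sql internals.
type driverConn struct{ id int }
type connStmt struct{ dc *driverConn }

// linearPick builds a set of free conns once, then scans css once:
// O(len(css) + len(free)) total, instead of the nested
// O(len(css) * len(free)) scan.
func linearPick(css []connStmt, free []*driverConn) *driverConn {
	freeSet := make(map[*driverConn]bool, len(free))
	for _, fc := range free {
		freeSet[fc] = true
	}
	for _, cs := range css {
		if freeSet[cs.dc] {
			return cs.dc // free conn that already has the stmt prepared
		}
	}
	return nil
}

func main() {
	a, b := &driverConn{1}, &driverConn{2}
	css := []connStmt{{dc: a}, {dc: b}}
	fmt.Println(linearPick(css, []*driverConn{b}).id)
}
```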

@methane (Contributor, Author) commented Jan 2, 2015

Quick benchmark.

environment

Two c3.8xlarge on same placement group. (10Gbit ethernet)
Intel Xeon E5-2680 v2 (32 core)
Amazon Linux (SRIOV enabled)

source code

Run the web app and wrk on one machine, and MySQL on the other.
The wrk invocation is: $ ./wrk -c128 -t8 -d10 http://localhost:8080/db

results

| variant | req/sec |
| --- | --- |
| original | 68376.92 |
| depmu | 91589.11 |
| depmu + map css | 63830.53 |
| depmu + map css (remove closed conn gc) | 144693.79 |

Splitting the mutex gives a significant improvement.

This patch makes it slower since iterating over a map is slow. I'll try another approach.

FYI, using multiple DBs to avoid lock contention hits 16k req/sec.
I want 15k req/sec with a single DB.

@johto (Contributor) commented Jan 2, 2015

> This patch makes it slower since iterating over a map is slow. I'll try another approach.

Can you try what I suggested, i.e. not call connIfFreePrepared() at all? If that's fast enough, perhaps we can figure out a way to clean up dead connections from Stmt.css. It would also be interesting to see that approach with Stmt.css being a slice vs. it being a map.

@johto (Contributor) commented Jan 2, 2015

An idea for cleaning up: store an atomic counter in *DB, and increment it in driverConn.Close(). When creating a Stmt, store the current value of the counter inside the Stmt. In Stmt.connStmt(), compare the current value of the counter to the stored value. If the difference is bigger than N, iterate over css and clean up dead connections and store the new value of the counter in Stmt.

This way if no connections have recently died, we only need to do one atomic fetch + a comparison in connStmt(). Additionally, only one execution has to pay the price of cleaning up since the next one will see the counter having increased.

@methane (Contributor, Author) commented Jan 2, 2015

@johto That looks like a nice idea!
I've implemented it: https://gist.github.com/methane/1341a204256e5dab52ac
I'll benchmark it in a few days.

@methane (Contributor, Author) commented Jan 2, 2015

Benchmark updated: https://gist.github.com/methane/1341a204256e5dab52ac
The results look very good. I'll send a patch to Gerrit.

@johto (Contributor) commented Jan 2, 2015

I'd personally have used sync/atomic for it to avoid having to grab the DB mutex when not doing the cleanup.

Also, shouldn't lastNumClosed only be assigned to when the cleanup is actually done? The current coding seems incorrect.

@methane (Contributor, Author) commented Jan 2, 2015

> I'd personally have used sync/atomic for it to avoid having to grab the DB mutex when not doing the cleanup.

I used DB.mu since the threshold is based on db.numOpen.
But a fixed threshold (e.g. 10) may be enough. I'll try using atomic.

> Also, shouldn't lastNumClosed only be assigned to when the cleanup is actually done? The current coding seems incorrect.

You're right. I'll fix it.

@dvyukov (Member) commented Jan 11, 2015

Block profile says that the main bottleneck is in connStmt.

The code is quite complex and I can't understand what it is trying to do, or which parts of that complexity are inherent and which are not. Can somebody describe what connStmt/putConn are trying to achieve?

I would expect that connStmt must be a pop from a stack/queue and putConn must be a push to the stack/queue. And the rest must be designed around that idea. E.g. if a connection can't be used, then it must not be in the stack/queue.
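That mental model can be written down as a minimal sketch (hypothetical types, not the actual sql.go code): the pool is just a LIFO stack of usable connections; putConn pushes, getConn pops, and anything unusable never enters the stack.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical simplified stand-in for database/sql internals.
type driverConn struct{ id int }

// pool is a plain stack of usable connections.
type pool struct {
	mu   sync.Mutex
	free []*driverConn
}

// putConn pushes a connection back; broken connections are simply
// dropped so the stack only ever holds usable conns.
func (p *pool) putConn(dc *driverConn, usable bool) {
	if !usable {
		return
	}
	p.mu.Lock()
	p.free = append(p.free, dc)
	p.mu.Unlock()
}

// getConn pops the most recently returned connection, or returns
// nil when the pool is empty (the caller would dial a new conn).
func (p *pool) getConn() *driverConn {
	p.mu.Lock()
	defer p.mu.Unlock()
	n := len(p.free)
	if n == 0 {
		return nil
	}
	dc := p.free[n-1]
	p.free = p.free[:n-1]
	return dc
}

func main() {
	p := &pool{}
	p.putConn(&driverConn{1}, true)
	p.putConn(&driverConn{2}, false) // dropped, never pooled
	fmt.Println(p.getConn().id)
}
```

The statement-reuse and css-cleanup logic discussed above is exactly the "rest" that has to be designed around this core.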

@dvyukov (Member) commented Jan 11, 2015

@bradfitz can you describe the additional requirements for connStmt/putConn besides just popping and pushing a connection?

@johto (Contributor) commented Jan 11, 2015

> I would expect that connStmt must be a pop from a stack/queue and putConn must be a push to the stack/queue. And the rest must be designed around that idea. E.g. if a connection can't be used, then it must not be in the stack/queue.

Yeah, that's pretty much it. The code in HEAD does some additional work:

  1. It tries to prefer connections which already have the statement prepared on them
  2. While doing that, it cleans up dead connections from the statement's list of connections which the statement has been prepared on

That's this loop. I would say other than that loop, the code is straightforward.

The suggested change is to make that cleanup happen less often, and spend less resources in the cleanup loop. That also means that it doesn't try to prefer connections which already have the statement prepared, but I don't consider that a problem.

Another alternative would be to go through all the statements in driverConn.finalClose(), and remove the connection from the css list. It's not obvious that that would be better, though.

@mikioh mikioh closed this in 1b61a97 Jan 24, 2015
@methane (Contributor, Author) commented Jan 24, 2015

Thanks to all reviewers for persevering through the code review.

@golang golang locked and limited conversation to collaborators Jun 25, 2016