usage of thread local #22

Closed
josh-devops-center opened this issue May 22, 2015 · 5 comments

Comments

@josh-devops-center

Hi,

The masterdb pinning uses a thread local. In Django, a thread may outlive a request and may be reused for later requests, so state kept in a thread local is not automatically request-safe, and thread locals are generally not recommended.

For the masterdb pinning, this may mean that queries which can query standby servers may instead query the master, and queries which should only query the master may query slaves.

I didn't find a disclaimer regarding this. What guarantees does multidb router provide?

Thanks

@jsocol
Collaborator

jsocol commented May 22, 2015

multidb router was written with these constraints in mind. It is designed specifically to handle the case where the thread is used for multiple requests.

If you encounter issues where it fails, please let us know with specifics.

@jsocol jsocol closed this as completed May 22, 2015
@josh-devops-center
Author

Threads may be reused between requests.

How does a cookie handle a request being reused? Multiple requests would overwrite each other.

"Also, whether every request gets its own thread is not guaranteed in WSGI. It could be that a request is reusing a thread from before, and hence data is left in the thread local object."

So when the db is actually selected, it would have the wrong value when multiple threads stomp on each other. So my original question still stands.

@jsocol
Collaborator

jsocol commented May 22, 2015

Please read the source code.

> Threads may be reused between requests.

Yes, and as I said, the package is specifically designed to handle this. Cookies are sent—or not—by each individual client with each request.

If you are actually encountering errors, please open issues with specific details including how it's running (Apache/mod_wsgi, uwsgi, gunicorn, etc) and any relevant settings (e.g. number of processes/threads-per-process, etc).

@josh-devops-center
Author

So there are no test cases, just the fact that you say the design handles this?

What's so special about a cookie that makes it different from storing the session in a thread local, which is not advised?

session error

Repository owner locked and limited conversation to collaborators May 22, 2015
@jsocol
Collaborator

jsocol commented May 22, 2015

There are tests for this library. Not all deployments can be included in automated tests. Please file issues if you encounter errors.

I strongly encourage you to actually read the source code. I also encourage you to read up on HTTP cookies. Here is, briefly, how the library works:

The middleware determines whether the request either (a) is not a "safe" method (GET, TRACE, OPTIONS, HEAD) or (b) includes a pinning cookie. If either is true, the thread is "pinned" to the master and all database reads are sent to the master. If neither is true, the library explicitly "unpins" the thread, to solve the exact issue you're describing (this is the code I linked earlier). An "unpinned" thread sends database reads to the read replicas.
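
The pin/unpin decision described above can be sketched roughly as follows. This is a minimal illustration without Django, not the library's actual code: the function names, the `pin_to_master` cookie name, and the plain-dict `cookies` argument are all assumptions made for the example.

```python
import threading

# Per-thread storage for the pinned flag (illustrative, not the library's module).
_locals = threading.local()

SAFE_METHODS = {"GET", "TRACE", "OPTIONS", "HEAD"}
PINNING_COOKIE = "pin_to_master"  # hypothetical cookie name


def pin_current_thread():
    _locals.pinned = True


def unpin_current_thread():
    _locals.pinned = False


def current_thread_is_pinned():
    # A thread that has never handled a request counts as unpinned.
    return getattr(_locals, "pinned", False)


def process_request(method, cookies):
    """Set the flag on every request, so stale state from a previous
    request on the same thread is always overwritten."""
    if method not in SAFE_METHODS or PINNING_COOKIE in cookies:
        pin_current_thread()
    else:
        unpin_current_thread()


# Simulate one thread serving three requests in a row.
process_request("POST", {})
pinned_after_post = current_thread_is_pinned()

process_request("GET", {})
pinned_after_plain_get = current_thread_is_pinned()

process_request("GET", {PINNING_COOKIE: "y"})
pinned_after_cookie_get = current_thread_is_pinned()
```

The point of the sketch is that the flag is written unconditionally at the start of each request, so a reused thread never carries the previous request's value into the next one.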

The thread-local storage is used because the DB router does not have access to the current HttpRequest object, so we need to store the pinned state somewhere we can guarantee the router can access.
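
To illustrate why the state has to live outside the request, here is a hedged sketch of a router that only sees a model and hints, and so consults a thread local. The method signatures follow Django's documented router interface (`db_for_read(model, **hints)`, `db_for_write(model, **hints)`), but the class name, the `"default"` and `"replica"` aliases, and the helper function are assumptions for this example and do not come from the library.

```python
import threading

_state = threading.local()  # illustrative per-thread pinned flag

MASTER_ALIAS = "default"   # hypothetical alias for the master
REPLICA_ALIAS = "replica"  # hypothetical alias for a read replica


class PinningRouterSketch:
    """Router-shaped class: it never receives the request object."""

    def db_for_read(self, model, **hints):
        if getattr(_state, "pinned", False):
            return MASTER_ALIAS
        return REPLICA_ALIAS

    def db_for_write(self, model, **hints):
        # Writes always go to the master.
        return MASTER_ALIAS


def read_alias_in_new_thread(pinned):
    """Run a read on a separate thread whose flag is set to `pinned`."""
    result = []

    def worker():
        _state.pinned = pinned
        result.append(PinningRouterSketch().db_for_read(model=None))

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result[0]
```

Each thread sees only its own flag, so pinning one worker thread does not affect reads routed on another.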

A very common, and recommended, pattern after a POST request (e.g. creating a forum post) is to end it with a redirect to the new content, which causes a new GET request that will probably be handled by a different thread, if not a different host entirely. Cookies are used to ensure that users who just wrote content will read from the master for a few seconds (so their content will still be available even if there is some replication lag) regardless of how many requests they make or what hosts or threads serve those requests.
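
The cookie side of this can be sketched with the standard library alone. The example below assumes a hypothetical cookie name and a 15-second lifetime (neither is taken from the library or its settings) and shows a short-lived cookie being issued after a write, then recognized on a follow-up GET regardless of which thread or host handles it.

```python
from http.cookies import SimpleCookie

SAFE_METHODS = {"GET", "TRACE", "OPTIONS", "HEAD"}
PINNING_COOKIE = "pin_to_master"  # hypothetical cookie name
PINNING_SECONDS = 15              # hypothetical lifetime


def pinning_set_cookie_value(method):
    """Return a Set-Cookie header value after a write, else None."""
    if method in SAFE_METHODS:
        return None
    cookie = SimpleCookie()
    cookie[PINNING_COOKIE] = "y"
    cookie[PINNING_COOKIE]["max-age"] = PINNING_SECONDS
    cookie[PINNING_COOKIE]["path"] = "/"
    return cookie[PINNING_COOKIE].OutputString()


def request_should_pin(method, cookie_header):
    """Decide from the request alone, with no server-side memory."""
    cookies = SimpleCookie(cookie_header or "")
    return method not in SAFE_METHODS or PINNING_COOKIE in cookies


# POST creates content; the response carries the pinning cookie.
set_cookie = pinning_set_cookie_value("POST")

# The redirected GET sends the cookie back, possibly to another host.
redirect_get_pins = request_should_pin("GET", PINNING_COOKIE + "=y")

# A different client without the cookie is not pinned.
other_client_pins = request_should_pin("GET", "")
```

Because the cookie travels with each client's own requests, the decision does not depend on which thread served the earlier POST.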

It is certainly possible that there are bugs, particularly with gunicorn with gevent workers (though gevent should patch threading.local with everything else) or other green-thread solutions. If you experience bugs, please open issues. Otherwise, please read the source code if you want to know how something works.
