Fix lock_many race condition #43
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
    try:
        commit()
    except (sqlalchemy.exc.DBAPIError, sqlalchemy.exc.InvalidRequestError):
        rollback()
There is no need to roll back here: the TransactionHook will do that for you when the request is a PUT/UPDATE/DELETE/POST.
Unless you are allowing this method to be hit on GET requests?
I put that there because without it, I was getting exceptions saying I needed to roll back.
Here is the output of the new test in paddles/tests/controllers/test_nodes_race.py with rollback():
https://gist.github.com/zmc/beb1ecf31a8c3bed16a5
And here is the output of the same test without rollback()
https://gist.github.com/zmc/978b3df9d12e29c51c20
I don't see 'error locking nodes. please retry request' in the links that you pasted. If you are just removing rollback(), how come no RaceConditionError with that message is shown?
That's the server log; it doesn't show the error messages themselves, so you wouldn't see the "error locking nodes" string. As for the exception, it is caught in the block above...
These are the strings you'd see in the server log:
2014-09-04 14:32:12,077 WARNI [paddles.controllers.nodes] lock_many() detected race condition
2014-09-04 14:32:12,077 INFO [paddles.controllers.nodes] retrying after race avoidance (1 tries left)
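For anyone following along, here is a minimal sketch of the retry-with-rollback pattern being discussed. It is not the code in this PR: `commit_with_retry`, its parameters, and the `RaceConditionError` class defined here are hypothetical stand-ins, and the `commit`/`rollback` callables are passed in so the example runs without a database. The log strings mirror the ones quoted above.

```python
import logging

log = logging.getLogger("lock_many_sketch")


class RaceConditionError(Exception):
    """Hypothetical: raised when retries run out and the client must retry."""


def commit_with_retry(attempt, commit, rollback, retryable, tries=2):
    """Run attempt() and commit(); on a retryable failure, roll back and retry.

    All arguments are assumptions made for this sketch:
    attempt   -- callable that stages the changes
    commit    -- callable that commits them
    rollback  -- callable that discards the failed transaction
    retryable -- exception class (or tuple of classes) that signals a race
    tries     -- total number of attempts allowed
    """
    while tries > 0:
        try:
            attempt()
            commit()
            return
        except retryable:
            # Discard the failed transaction before reusing the session.
            rollback()
            tries -= 1
            log.warning("lock_many() detected race condition")
            if tries:
                log.info("retrying after race avoidance (%s tries left)", tries)
    raise RaceConditionError("error locking nodes. please retry request")
```

With SQLAlchemy, `retryable` would presumably be the tuple from the diff above, `(sqlalchemy.exc.DBAPIError, sqlalchemy.exc.InvalidRequestError)`.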
    @classmethod
    def lock_many(cls, count, locked_by, machine_type, description=None):
        assert count >= 1
Are these asserts here for debugging, or are they really meant to be safeguards? I haven't seen them in paddles so far.
Ah, that's some leftover debugging. I'll tweak those.
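One possible way to turn a debugging assert into a real safeguard is an explicit check that raises. Asserts are skipped entirely when Python runs with `-O`, so they cannot be relied on for input validation. The sketch below is only an illustration: `validate_count` is a hypothetical helper, not something in paddles, and it covers only the `count >= 1` condition visible in the diff.

```python
def validate_count(count):
    """Hypothetical helper: reject a bad node count with an explicit error.

    Unlike `assert count >= 1`, this check still runs under `python -O`.
    """
    if not isinstance(count, int) or count < 1:
        raise ValueError("count must be an integer >= 1, got %r" % (count,))
    return count
```

A caller could then translate the `ValueError` into whatever error response the API normally returns.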
We'll raise this when the only way to avoid a race condition is to have the client repeat the request.
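On the client side, "repeat the request" could look roughly like the sketch below. Everything here is an assumption for illustration: `RetryableRequestError` stands in for however a client would recognise the "please retry request" response, and `send` is any callable that performs the request. It is not teuthology's or paddles' actual client code.

```python
import time


class RetryableRequestError(Exception):
    """Hypothetical: the server asked the client to repeat the request."""


def repeat_request(send, attempts=3, delay=0.0):
    """Call send() until it succeeds or the attempts are used up.

    send     -- callable performing the request (assumed for this sketch)
    attempts -- maximum number of times to call send()
    delay    -- base pause in seconds, scaled by the attempt number
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except RetryableRequestError as exc:
            last_error = exc
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    raise last_error
```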
This test requires paddles to be running in a multi-worker mode, or else it is skipped.