Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lock_many race condition #43

Merged
merged 15 commits into from Sep 5, 2014
Merged

Fix lock_many race condition #43

merged 15 commits into from Sep 5, 2014

Conversation

zmc
Copy link
Member

@zmc zmc commented Sep 4, 2014

No description provided.

Zack Cerza added 6 commits August 28, 2014 08:34
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
try:
commit()
except (sqlalchemy.exc.DBAPIError, sqlalchemy.exc.InvalidRequestError):
rollback()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no need to rollback here because the TransactionHook will do that for you when the request is a PUT/UPDATE/DELETE/POST

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless you are allowing to hit this method on get requests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put that there because without it, I was getting exceptions saying I needed to rollback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the output of the new test in paddles/tests/controllers/test_nodes_race.py with rollback()
https://gist.github.com/zmc/beb1ecf31a8c3bed16a5

And here is the output of the same test without rollback()
https://gist.github.com/zmc/978b3df9d12e29c51c20

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see 'error locking nodes. please retry request' in the links that you pasted, if you are just removing rollback, how come there is not RaceConditionError with that message shown?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the server log; it doesn't show the error messages themselves, so you wouldn't see the "error locking nodes" string. As for the exception, it is caught in the block above...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the strings you'd see in the server log:

2014-09-04 14:32:12,077 WARNI [paddles.controllers.nodes] lock_many() detected race condition
2014-09-04 14:32:12,077 INFO  [paddles.controllers.nodes] retrying after race avoidance (1 tries left)


@classmethod
def lock_many(cls, count, locked_by, machine_type, description=None):
assert count >= 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these asserts here for debugging or they are really meant to be set as safeguards? Haven't seen them in paddles so far.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's some leftover debugging. I'll tweak those.

Zack Cerza added 9 commits September 5, 2014 08:52
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
We'll raise this when the only way to avoid a race condition is to have
the client repeat the request

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
This test requires paddles to be running in a multi-worker mode, or else
it is skipped.

Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
Signed-off-by: Zack Cerza <zack.cerza@inktank.com>
alfredodeza added a commit that referenced this pull request Sep 5, 2014
Fix lock_many race condition
@alfredodeza alfredodeza merged commit 14eb619 into master Sep 5, 2014
@zmc zmc deleted the fix_locking branch September 9, 2014 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants