Fix issue with concurrent contextual stack modification #747

Merged (2 commits, Feb 17, 2021)

uriyyo
Contributor

@uriyyo uriyyo commented Dec 15, 2020

We are using GINO in production, and we have been getting a lot of unexpected exceptions from asyncpg with this message:

asyncpg.exceptions._base.InterfaceError: cannot perform operation: another operation is in progress

The problem was that these exceptions were unpredictable, and we had no idea what we were doing wrong.

Basically, we are using an asyncio-based web framework, and we had a lot of code patterns like this:

from asyncio import create_task, gather

from gino import Gino

db = Gino()

async def worker(entity):
    # Each worker asks for its own connection rather than reusing the parent's
    async with db.acquire(reuse=False):
        ...  # do some database queries based on entity

async def main():
    async with db.acquire(reuse=False):
        entities = await Entity.query.gino.all()
        tasks = [create_task(worker(entity)) for entity in entities]
        # do some useful thing concurrently to started tasks
        result = await Entity.query.where(...).gino.all()  # make another query to db
        await gather(*tasks)

And sometimes we had an exception at this place:

result = await Entity.query.where(...).gino.all()  # make another query to db

This exception cancels one of the running worker tasks.

Today, after hours of debugging, I found the root cause of this issue 🥳

The problem is concurrent modification of the _ContextualStack._ctx value. When a new asyncio task is created, the current contextvars context is copied into it. If there are reusable connections, the copied context still references the same deque of those connections. When the child task then acquires a new DB connection, it modifies that deque, which is the very same deque object the main task is using.
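To make the sharing concrete, here is a minimal sketch of the effect described above. It does not use GINO at all: `_stack` is a hypothetical stand-in for the contextual stack variable, and the strings stand in for connections.

```python
import asyncio
import contextvars
from collections import deque

# Hypothetical stand-in for the contextual stack; not GINO's actual variable.
_stack = contextvars.ContextVar("stack")


async def child():
    # The child task runs in a copy of the parent's context, but that copy
    # still points at the very same deque object.
    _stack.get().append("child-connection")


async def main():
    parent_stack = deque(["parent-connection"])
    _stack.set(parent_stack)
    await asyncio.create_task(child())
    # The parent's deque was mutated by the child task.
    return list(parent_stack)


result = asyncio.run(main())
print(result)
```

Copying a context copies the variable bindings only, not the objects they are bound to, so any in-place mutation is visible to every task that inherited the binding.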

I hope you understood the issue)

The solution I used is an immutable singly linked list as the contextual stack; that way, concurrent tasks cannot mutate each other's stacks.
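A rough sketch of the linked-list idea, under the assumption that the stack head is stored in a context variable. `Node`, `push`, `pop`, `items` and `_top` are illustrative names, not the code from this PR.

```python
import asyncio
import contextvars
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """One immutable stack entry pointing at the entry below it."""
    value: Any
    parent: Optional["Node"]


# Hypothetical variable holding the top of the stack (None means empty).
_top = contextvars.ContextVar("top", default=None)


def push(value):
    # Rebinding the variable only affects the current context; existing
    # nodes are never modified.
    _top.set(Node(value, _top.get()))


def pop():
    node = _top.get()
    if node is None:
        raise IndexError("pop from empty stack")
    _top.set(node.parent)
    return node.value


def items():
    # Walk from the top of the stack down to the bottom.
    out, node = [], _top.get()
    while node is not None:
        out.append(node.value)
        node = node.parent
    return out


async def child():
    push("child-connection")
    return items()


async def main():
    push("parent-connection")
    child_view = await asyncio.create_task(child())
    return child_view, items()


child_view, parent_view = asyncio.run(main())
print(child_view, parent_view)
```

In this sketch the child's push never reaches the parent, although the child still starts from the parent's entries because it inherits the binding.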

@fantix It would be great for me and my team if you could review this PR and release a new version of GINO that includes this fix, so that we can remove the workaround we currently have in our code.

@fantix fantix added the bug Describes a bug in the system. label Dec 15, 2020
@fantix
Member

fantix commented Dec 15, 2020

Thanks for the very detailed explanation and the PR, and sorry for the trouble! It does make sense to me - I'd like to further check the code, at the same time would you mind fixing the syntax for Python 3.5 pls?

@fantix fantix added this to the GINO 1.0 milestone Dec 15, 2020
@uriyyo
Contributor Author

uriyyo commented Dec 15, 2020

I have fixed the Python 3.5 syntax error. Sorry about that, I forgot that GINO supports version 3.5)

@uriyyo
Contributor Author

uriyyo commented Dec 19, 2020

Hi @fantix,
Do you have time to look at this PR?)

@suleimanmahmoud

Great work @uriyyo

I was having the same issue over similar code steps and your PR fixed it.

@fantix
Member

fantix commented Jan 4, 2021

Oh sorry for the late reply - I'll take a deeper look tonight.

@uriyyo
Contributor Author

uriyyo commented Feb 1, 2021

Hi @fantix

Any updates? Have you had a chance to take a look at this PR?

@wwwjfy
Member

wwwjfy commented Feb 1, 2021

He's probably been busy.

While I do understand the issue, I wonder whether there is an easier way to fix it, as a linked list seems a bit overkill in this case.
I'll take some time to experiment with a non-sleepy mind tomorrow.

@fantix
Member

fantix commented Feb 1, 2021

Oh thank you Tony! Yeah I'm sorry that I've been on a move and trying to put up a new uvloop release.

@wwwjfy
Member

wwwjfy commented Feb 2, 2021

While this PR can fix the case in the test, the stack is still shared with tasks created by the current coroutine.
This could cause the same problem, for example, when a connection is created and then reused in the new coroutines.

Ideally, in my mind, a new coroutine can automatically reset contextvars to empty. But it's not how contextvars works. Instead the new coroutine inherits the current one.

The solution I can think of now is to create the task with _ctx reset to None before the actual execution. GINO could probably expose a method to do that. Any thoughts?
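One way this suggestion might look, as a sketch only: `create_task_with_fresh_stack` is a hypothetical helper rather than an existing GINO method, and `_ctx` here is a stand-in variable, not GINO's internal one.

```python
import asyncio
import contextvars

# Stand-in for the contextual stack variable (assumption, not GINO's own).
_ctx = contextvars.ContextVar("ctx", default=None)


def create_task_with_fresh_stack(coro):
    """Hypothetical helper: run coro in a task whose _ctx starts as None."""

    async def runner():
        # This set() happens inside the new task, so it only changes the
        # task's own copy of the context, not the parent's.
        _ctx.set(None)
        return await coro

    return asyncio.create_task(runner())


async def child():
    return _ctx.get()


async def main():
    _ctx.set(["parent-connection"])
    seen_by_child = await create_task_with_fresh_stack(child())
    return seen_by_child, _ctx.get()


seen_by_child, seen_by_parent = asyncio.run(main())
print(seen_by_child, seen_by_parent)
```

Other context variables are still inherited by the task in this sketch, since only `_ctx` is reset.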

(Another option is to use loop.call_soon(task(), context=contextvars.Context()) to run it with an empty Context, but that would lose other contextvars too.)
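A sketch of the empty-context option, assuming Python 3.7+. Since `loop.call_soon` expects a callback rather than a coroutine object, this version creates the task inside `Context.run` instead; `request_id` is a hypothetical unrelated variable added to show what else gets lost.

```python
import asyncio
import contextvars

# Both variables are illustrative stand-ins, not GINO internals.
_ctx = contextvars.ContextVar("ctx", default=None)
request_id = contextvars.ContextVar("request_id", default=None)


async def child():
    return _ctx.get(), request_id.get()


async def main():
    _ctx.set(["parent-connection"])
    request_id.set("req-1")
    empty = contextvars.Context()
    # A task copies whichever context is current when it is created, so
    # creating it inside empty.run() gives it a copy of the empty context.
    task = empty.run(asyncio.create_task, child())
    return await task


child_stack, child_request_id = asyncio.run(main())
print(child_stack, child_request_id)
```

The child sees the default for both variables, which illustrates the drawback mentioned above: unrelated context variables are dropped along with the stack.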

@uriyyo uriyyo force-pushed the fix-context-stack-issue branch 2 times, most recently from 06d3761 to 10e8b2c Compare February 10, 2021 10:08
@uriyyo
Contributor Author

uriyyo commented Feb 10, 2021

Hi @wwwjfy, @fantix

I have reimplemented this PR. I agree that the previous solution with a linked list can be overkill for this issue.

With the new approach, the contextual stack is bound to the task in which it was created; if ctx is used in another task, it is rewritten.
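The general idea can be sketched as follows. This is an illustration assuming Python 3.7+ and is not the merged implementation: `get_stack` and `_ctx` are hypothetical names, and the owner task is stored next to the deque.

```python
import asyncio
import contextvars
from collections import deque

# Stand-in variable holding an (owner_task, deque) pair, or None.
_ctx = contextvars.ContextVar("ctx", default=None)


def get_stack():
    """Return the deque owned by the current task, creating it if needed."""
    current = asyncio.current_task()
    entry = _ctx.get()
    if entry is None or entry[0] is not current:
        # The value is missing or was inherited from another task, so it is
        # replaced with a fresh deque bound to the current task.
        entry = (current, deque())
        _ctx.set(entry)
    return entry[1]


async def child():
    get_stack().append("child-connection")
    return list(get_stack())


async def main():
    get_stack().append("parent-connection")
    child_view = await asyncio.create_task(child())
    return child_view, list(get_stack())


child_view, parent_view = asyncio.run(main())
print(child_view, parent_view)
```

Here the child neither mutates nor inherits the parent's entries, which covers both the original bug and the sharing concern raised earlier in the thread.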

> While this PR can fix the case in the test, the stack is still shared with tasks created by current coroutine.
> This could cause the same problem, for example, when a connection is created and then reused in the new coroutines.

This issue is also resolved in this PR.

What is your opinion regarding the new implementation?

@wwwjfy
Member

wwwjfy commented Feb 17, 2021

Looks good. One concern is that if we add trio support later, we will need to apply a similar strategy there.

BTW, can you rebase on the latest master, which fixed CI?

@uriyyo
Contributor Author

uriyyo commented Feb 17, 2021

Hi Tony, I have rebased on master. Thanks for the feedback regarding the implementation 🙂

@wwwjfy
Member

wwwjfy commented Feb 17, 2021

Thanks for the fix!

@wwwjfy wwwjfy merged commit 163e9a0 into python-gino:master Feb 17, 2021