This repository has been archived by the owner on Feb 21, 2023. It is now read-only.

[2.0] Should Lock default to using contextvars? #1040

Open
Tracked by #1225
bmerry opened this issue Jun 28, 2021 · 17 comments

@bmerry
Collaborator

bmerry commented Jun 28, 2021

I've never used the Lock class in redis-py. But when glancing at the code, I noticed that the default is to store the lock state in thread-local storage (so that two threads can each independently acquire the lock without interfering with each other).

For aioredis, the standard concurrency mechanism is tasks, not threads, so it probably makes sense to default to using contextvars (so that two tasks can independently acquire the lock).
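For reference, a minimal sketch (not the actual redis-py code) of the thread-local pattern being described:

```python
import threading
from types import SimpleNamespace

class Lock:
    """Simplified sketch of redis-py's Lock token storage."""

    def __init__(self, redis, name, thread_local=True):
        self.redis = redis
        self.name = name
        # With thread_local=True, each thread gets its own token slot,
        # so two threads sharing this Lock instance can acquire and
        # release independently without clobbering each other's token.
        self.local = threading.local() if thread_local else SimpleNamespace()
        self.local.token = None
```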

@abrookins added the "need investigation" label Jul 2, 2021
@abrookins
Contributor

This does not appear to be straightforward. The Python docs claim ContextVars should only be created at the top level of a module, that is, outside of a class. Multiple Tasks will see different values for the ContextVar, but multiple Lock instances within the same Task, or multiple Lock instances outside of a Task context, would share the same ContextVar. That doesn't work with the current design of Lock.

Putting a ContextVar instance inside of each Lock instance the same way that redis-py uses thread-local storage works, but the docs claim these variables won't get garbage-collected. 🤔
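To make the conflict concrete, a hypothetical illustration (names invented) of why one module-level ContextVar can't serve several Lock instances:

```python
from contextvars import ContextVar

# One module-level variable, shared by every Lock in the same context.
_token: ContextVar = ContextVar("lock_token", default=None)

# Two hypothetical Lock instances acquiring within the same Task:
_token.set("token-for-lock-a")  # lock A "acquires"
_token.set("token-for-lock-b")  # lock B "acquires" and overwrites A's token
print(_token.get())             # -> token-for-lock-b; lock A's state is lost
```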

@abrookins
Contributor

@Andrew-Chen-Wang Have you looked into this much? I'm not sure how important this garbage collection reference in the docs is, or how different this is from the situation we already had with thread-local storage.

@Andrew-Chen-Wang
Collaborator

@abrookins take a look at NoneGG/aredis#120

@abrookins
Contributor

abrookins commented Jul 29, 2021

I wonder why we even need a ContextVar at all. 🤔

redis-py was trying to solve the problem that multiple threads could try to change the same Lock instance.

We want to solve the problem that multiple Tasks might try to change the same Lock instance. Right?

Wouldn't using an asyncio.Lock work here, so that only one Task could change the state of the Lock instance at a given time?

... Without any memory leaks.
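A rough sketch of that idea, against a hypothetical simplified Lock (not the real class):

```python
import asyncio

class Lock:
    """Hypothetical Redis-backed lock guarding its own state."""

    def __init__(self, redis, name):
        self.redis = redis
        self.name = name
        self.token = None
        self._state_lock = asyncio.Lock()  # serializes state mutation

    async def _set_token(self, token):
        async with self._state_lock:
            # Only one Task at a time can change the instance's token.
            self.token = token
```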

@Andrew-Chen-Wang
Collaborator

Andrew-Chen-Wang commented Jul 29, 2021

I was about to say use asyncio.Lock, but without looking at the code at all, I assumed this was a Redis lock itself like https://github.com/joanvila/aioredlock

Edit: currently looking at an async database implementation here encode/databases#230

@abrookins
Contributor

abrookins commented Jul 29, 2021

Right, it's a lock backed by a Redis key.

I see in that thread that they did not actually resolve the question of whether the garbage collection problem mattered.

It seems like using asyncio.Lock to lock the internal state of the (heh, Redis-backed) Lock instance would achieve the goal without the memory-leak risk of keeping ContextVar instances inside the (Redis-backed) Lock instance.

I've been working on a PR that uses ContextVar, but let me refashion it to use asyncio.Lock to see if that makes more sense.

@abrookins
Contributor

abrookins commented Jul 29, 2021

Oh, right, locks don't do the same thing.

With thread-local storage, if two threads were using the same (Redis) Lock instance, one thread acquiring the Redis Lock would block the other thread. That's because each thread would use a different token value, which is what Lock uses to decide whether the caller currently "owns" the protected resource, i.e. current Redis value == Lock token.

If we used an asyncio.Lock inside the Redis Lock, two Tasks sharing the same Lock wouldn't block each other: they would see the same token value.

So we either do need to use ContextVar in the form that the aredis folks did, which is to say against the advice of the Python docs, or else...? I'm not sure what the alternative is.
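A tiny runnable illustration of that failure mode (names hypothetical): with plain instance state, both Tasks read the same token, so ownership can't be distinguished:

```python
import asyncio

class FakeLock:
    token = None  # a single shared slot, unlike thread-local storage

async def worker(lock, name):
    if lock.token is None:
        lock.token = name                    # the first Task in wins...
    print(name, "sees token:", lock.token)   # ...but both Tasks see it

async def main():
    lock = FakeLock()
    await asyncio.gather(worker(lock, "task-a"), worker(lock, "task-b"))

asyncio.run(main())  # both Tasks report token "task-a"
```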

abrookins pushed a commit that referenced this issue Jul 29, 2021
The Lock implementation in redis-py uses thread-local storage
so that multiple threads using the same Lock instance can
correctly block each other from acquiring the Lock.

Thread-local storage ensures that each thread sees a different
token value.

Thread-local storage does not apply in the Task-based concurrency
that asyncio programs use. To achieve a similar effect, we need
to embed a ContextVar instance within each Lock and store the Lock
instance's token within the ContextVar instance. This allows every
Task that uses the same Lock instance to see a different token.

Thus, if both Task A and Task B refer to Lock 1, Task A can "acquire"
Lock 1 and block Task B from acquiring the same Lock until Task A
"releases" the Lock.

NOTE: The Python documentation suggests only storing ContextVar
instances in the top-level module scope due to issues around
garbage collection. That won't work in the current design of
Lock. For lack of a better alternative, and to preserve the
original design of Lock taken from redis-py, we have created
instances of ContextVar within instances of Lock.

Fixes #1040.
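Condensed, the approach the commit describes amounts to something like this (a sketch under the commit's assumptions, not the full implementation):

```python
import uuid
from contextvars import ContextVar

class Lock:
    """Sketch: a per-instance ContextVar holds each Task's token."""

    def __init__(self, redis, name):
        self.redis = redis
        self.name = name
        # Per-instance ContextVar, against the docs' top-level advice,
        # so each Task sees its own token for this Lock instance.
        self.local = ContextVar(f"lock_token_{name}", default=None)

    async def acquire(self):
        token = uuid.uuid4().hex
        self.local.set(token)  # visible only in the current context
        # ... then SET name token NX PX <timeout> against Redis ...
```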
@abrookins
Contributor

abrookins commented Jul 29, 2021

@Andrew-Chen-Wang I assume you already get pinged out the wazoo, but: I opened a PR with my changes to use ContextVar. See linked.

@Andrew-Chen-Wang
Collaborator

Andrew-Chen-Wang commented Jul 29, 2021

Thanks for opening the PR, and sorry for the lack of responses. Just a quick thought: would asyncio.Queue be feasible for storing the aforementioned Lock tokens, continuously putting and popping them? asyncio.Queue is not thread-safe, but janus from aio-libs is a thread-safe asyncio Queue.

Also after looking around a bunch of async repos like FastAPI (comment, though I think that's actually a correct use case) and db wrappers like encode/databases, I keep seeing everyone recommend using contextvars. In the latter repo, they saw a problem with contextvars but haven't created a PR to resolve it yet (comment).

@abrookins
Contributor

abrookins commented Jul 30, 2021

The FastAPI comment sounds like the advised way to use ContextVars, based on the docs. I.e., a single top-level module instance.

As for the second link, that appears to solidly reject my PR and the use of ContextVars in general for this problem, because of the text quoted:

Tasks support the contextvars module. When a Task is created it copies the current context and later runs its coroutine in the copied context.

encode/databases did exactly what I'm proposing to do, but it appears not to work correctly because Tasks copy the current context -- thus a new Task might see the same token as another Task, and the Lock would not work as intended.

Update: Jeeze, I'm still wondering if there's a way to use ContextVar correctly here. I'll have to look at this again after I take a break, maybe draw some flow charts. 😂
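The copying behavior is easy to demonstrate (a runnable snippet, variable names invented):

```python
import asyncio
from contextvars import ContextVar

token: ContextVar = ContextVar("token", default="unset")

async def child():
    # The Task runs in a *copy* of the parent's context, so it inherits
    # the token set before the Task was created.
    print("child sees:", token.get())  # -> parent-token

async def main():
    token.set("parent-token")
    await asyncio.create_task(child())

asyncio.run(main())
```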

@abrookins
Contributor

abrookins commented Jul 30, 2021

So with a Queue, what do you imagine? Something like this?

  • Create a Lock(), Lock 1 -> empty queue inside Lock 1
  • Task A acquires Lock 1 -> queue.join() exits -> queue.put(token 123) -> queue.get() retrieves token 123
  • Task B tries to acquire Lock 1 -> queue.join() blocks
  • Task A releases Lock 1 -> queue.task_done(token) -> Queue is empty
  • Task B acquires Lock 1 -> queue.join() exits -> queue.put(token 456) -> queue.get() retrieves token 456

That might work. If the queue's max size is 1, it can only ever hold one Lock token. So whichever blocked Task (waiting on the release of the Lock) pulls out a token first "wins."
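One way to realize that (a hedged sketch; it leans on put() blocking when a maxsize=1 queue is full, rather than the join()/task_done() bookkeeping above):

```python
import asyncio
import uuid

class QueueLock:
    """Hypothetical token holder: a full queue means 'locked'."""

    def __init__(self):
        self._q = asyncio.Queue(maxsize=1)  # holds at most one token

    async def acquire(self):
        token = uuid.uuid4().hex
        await self._q.put(token)  # blocks while another Task holds the slot
        return token

    def release(self):
        self._q.get_nowait()  # free the slot so the next put() succeeds
```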

@abrookins added the "bug" label and removed the "need investigation" label Jul 30, 2021
@Andrew-Chen-Wang
Collaborator

Andrew-Chen-Wang commented Jul 30, 2021

Basically, yes, we hope? I guess the problem would be ordering: with a max size of 1, there's a chance a token can never be put into the queue, unless I'm misunderstanding how put() interacts with the locking mechanism. It'd be worth a try. By the way, it was just a guess; I have no idea whether it'd work and didn't think it through thoroughly.

@fedej

fedej commented Aug 4, 2021

Don't mean to beat a dead horse but I do think you can use contextvars (top-level module) without problems when starting nested tasks.
Made this minimal example: https://gist.github.com/fedej/8957769ff3db3b377d06f4bb74bcef77

Maybe I misunderstood the problem or the order of concurrent calls.

@achimnol
Contributor

Initially, the module-level contextvar-based token must be unset; when entering an async context (async with), we initialize the token and pass the reference returned by set() to the exit handler for reset(). As long as the enter/exit handlers run in the same asyncio task and the lock is not used as if it were reentrant, this keeps lock operations unique across asyncio tasks, since the lock instance is identified by its name while the lock's stack is identified by the token.
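In code, the described pattern might look like this (a sketch with hypothetical names; set() returns the reference that reset() consumes):

```python
import uuid
from contextvars import ContextVar

_lock_token: ContextVar = ContextVar("lock_token", default=None)  # unset initially

class Lock:
    """Hypothetical lock using one module-level ContextVar."""

    def __init__(self, name):
        self.name = name
        self._reset_ref = None

    async def __aenter__(self):
        # Initialize the token on enter and keep set()'s return value
        # so the exit handler can reset() to the previous state.
        self._reset_ref = _lock_token.set(uuid.uuid4().hex)
        return self

    async def __aexit__(self, *exc_info):
        _lock_token.reset(self._reset_ref)
        self._reset_ref = None
```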

@achimnol
Contributor

achimnol commented Dec 2, 2021

Tested with my own fork of the lock module, but no lock acquirers can actually acquire the lock, just like before patching with contextvars. 🤔

@Andrew-Chen-Wang
Collaborator

multiple Lock instances within the same Task, or multiple Lock instances outside of a Task context, would share the same ContextVar.

Has anyone experimented with just creating a contextvar whose value is a dictionary, then assigning each lock a unique id such as f"aioredis_lock_{hash(self)}" and saving the lock instance inside the contextvar? Ref asgiref.local, which is a thread- and context-based local, whereas our sole concern is context.
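A hedged sketch of that dictionary idea (helper names invented; copy-on-write avoids mutating a dict shared with a parent context):

```python
from contextvars import ContextVar

_locks: ContextVar = ContextVar("aioredis_locks", default={})

def store(lock, token):
    key = f"aioredis_lock_{hash(lock)}"
    # Replace the dict instead of mutating it in place, so writes stay
    # local to the current context rather than leaking into copies.
    _locks.set({**_locks.get(), key: token})

def lookup(lock):
    return _locks.get().get(f"aioredis_lock_{hash(lock)}")
```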

@heyfavour

threading.local or contextvars.ContextVar("lock") should be global for this to work. In aioredis.lock you use threading.local(), which creates a new instance every time, so it can't work.

Are you still solving this problem?
