Skip to content

Conversation

@janpantel
Copy link
Contributor

@janpantel janpantel commented Nov 28, 2018

Resolves: #26213

Before this change out of order releases of the same cache lock could lead to situation where client A acquired the lock, took longer than the timeout, which made client B successfully acquire the lock. If A now finishes while B is still holding the lock A will release the lock that does not belong to it any more.

This fix introduces a unique value that is written as the cache value to stop A from deleting B's lock.

Example from the original issue:

  • Client A acquires lock foo with a 10 second expiration
  • Client A begins performing a task
  • Client A takes longer than expected and the lock is automatically released by the expiration
  • Client B acquires the lock
  • Client A finishes it's task and attempts to release the lock, deleting client B's lock
  • Client C is now able to acquire the lock

This change does not break the existing API as it introduces the scoped($scope) function, plus a sugar function safe() that generates a random $scope using the uniqid() function.

This is most likely something that should be default for 5.8 (I will create a follow up PR after this one is merged) as it makes Cache locking much more robust. For 5.7 the opt-in is important as it could otherwise disrupt already existing locks in production deployments.

Happy to discuss the API of this :-)

@janpantel janpantel changed the title Scoped locks - allow locks to be secure against out of order releases. [5.7] Scoped locks - allow locks to be secure against out of order releases. Nov 28, 2018
@X-Coder264
Copy link
Contributor

This should target the master (5.8) branch as there are breaking changes (adding a method to a contract etc.).

@sisve
Copy link
Contributor

sisve commented Nov 28, 2018

Is there any reason this isn't the new default logic? It looks like it's opt-in by calling safe().

@janpantel
Copy link
Contributor Author

janpantel commented Nov 28, 2018

@sisve you're right this is opt-in. This is to keep the API stable, especially true for 5.7 as the docs currently suggest using locks like:

if (Cache::lock('foo', 10)->get()) {
    // Lock acquired for 10 seconds...

    Cache::lock('foo')->release();
}

Which means the probability of having production apps out there doing it that way is very high. Making it the default behaviour will make the example above break as the ->release() would try to release with a different scope value.

To use the scoped logs you have to retain the lock instance. So the example above translated would be:

$lock = Cache::lock('foo', 10)->safe();
if ($lock->get()) {
    // Lock acquired for 10 seconds...

    $lock->release();
}

What you could also do without keeping the instance:

if (Cache::lock('foo', 10)->scoped($someUniqueProcessIdentifier)->get()) {
    // Lock acquired for 10 seconds...

    Cache::lock('foo')->scoped($someUniqueProcessIdentifier)->release();
}

This approach could become the default in 5.8 in disregard of the drawback that it makes the API a little less pleasant. IMO it is a good trade off to make as it will make the locking way more robust. As I wrote above I will follow up with a second PR for 5.8 after this one was accepted and everyone is on the same page.

One idea for 5.8 would be to return a ?Lock from get() instead of bool. Like:

if ($lock = Cache::lock('foo', 10)->get()) {
    // Lock acquired for 10 seconds...

    $lock->release();
}

The only thing I am currently not super happy with is the wording scope though it is the best I have been able to come up with. I would like to find a more expressive wording.

@taylorotwell
Copy link
Member

Retaining a lock instance may not always be possible though? For example, you could acquire a lock during a web request, fire a queued job, when that queued job is finished you would release the lock from the job.

{
$this->redis->del($this->name);
if ($this->canRelease()) {
$this->redis->del($this->name);
Copy link
Contributor

@matt-allan matt-allan Nov 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a race condition here. It's possible for canRelease to return true, the lock to then be released because it was expired, and another client to acquire the lock before del is called. If that happens you will delete the other clients lock instead of your own.

With Redis you can use a Lua script to perform an atomic operation. Since scripts are blocking no other client can acquire the lock between the check and the delete.

if redis.call("get",KEYS[1]) == ARGV[1] then
    return redis.call("del",KEYS[1])
else
    return 0
end

Unfortunately I don't think there is a good solution for memcached.

https://redis.io/topics/distlock

Copy link
Contributor Author

@janpantel janpantel Nov 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we use a transaction for that? The Redis docs say:

It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction.

For Memcached the cas command can be used setting it to null. Though this would eliminate the possibility of storing the lock for later release as @taylorotwell pointed out above, as the cas assumes a state on the client: http://php.net/manual/de/memcached.cas.php

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we use a transaction for that?

You need to check if the key matches the token before deleting it. I don't think there is an equivalent Redis command that does that. Redis transactions aren't like database transactions, the GET will just be queued until EXEC and you can't read the value of the key in PHP before you call EXEC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the GET will just be queued until EXEC

yep, right just tried that

@matt-allan
Copy link
Contributor

matt-allan commented Nov 28, 2018

Retaining a lock instance may not always be possible though? For example, you could acquire a lock during a web request, fire a queued job, when that queued job is finished you would release the lock from the job.

If you needed to use the same lock instance maybe you could pass the lock token around? i.e.

// in web request
$lock = Cache::lock('foo', 10)->get();
$job = SomeJob::dispatch($lock->token());

// in worker
$lock = Cache::lock('foo', 10, $this->token)->get();

In this example the token parameter would be optional, and if provided would be used instead of the random token. If an existing token is used you would need to check if the lock is still held with that token instead of trying to obtain a new lock (Actually, maybe that makes sense as a new method 😆 ).

@janpantel
Copy link
Contributor Author

What @yuloh said, actually using the ->scoped($token) function you can already do that. I think the question is whether it makes sense to bring this to 5.7 or rewrite it to target 5.8 and create a more pleasant API?

@janpantel janpantel changed the title [5.7] Scoped locks - allow locks to be secure against out of order releases. [5.7] Owned locks - allow locks to be secure against out of order releases. Nov 28, 2018
@janpantel
Copy link
Contributor Author

After some research I renamed the ->safe() method to ->owned() and changed scope to owner. This matches the usual lingo of lock packages in the OSS world.

@janpantel
Copy link
Contributor Author

@taylorotwell actually after trying it out I can confirm that you can dispatch a queued job (tested using the redis queue driver) containing the lock instance that you marked with ->owned() before.
From the job's handle() function the lock can be properly released as the owner property is properly restored.

@taylorotwell
Copy link
Member

taylorotwell commented Nov 28, 2018

Personally I would target all of this towards 5.8 and try to write it in the best way we can, and accept the current state of 5.7 locks as just having a known limitation. Laravel 5.8 is due in February so we aren't far off.

@janpantel
Copy link
Contributor Author

@taylorotwell totally fine with me. I will change the PR to default to owned locks without the need of explicitly stating it and target the 5.8 branch.
Do you have any strong objections against the need of keeping the lock instance in order to be able to release it?

Before this change out of order releases of the same cache lock could lead
to situation where client A acquired the lock, took longer than the timeout,
which made client B successfully acquire the lock. If A now finishes while B
is still holding the lock A will release the lock that does not belong to it.

This fix introduces a unique value that is written as the cache value to stop
A from deleting B's lock.
@taylorotwell
Copy link
Member

taylorotwell commented Dec 2, 2018

@janpantel I don't totally see a way around that if you want locks to work this way? You would at least need to know the scope. Have you researched how symfony/lock component is solving this issue? That would probably be a good thing to look into as they may have already worked through some "gotchas" that we may not be thinking of.

@taylorotwell
Copy link
Member

It looks like symfony/lock uses a similar "key" approach. However, I do not see a way to release the lock from within another process like a queued job. Maybe the lock can be serialized? I'm not sure.

@janpantel
Copy link
Contributor Author

janpantel commented Dec 4, 2018

@taylorotwell you just wrote that when I was about to answer ;-) . Yes exactly, they also use a LUA script for Redis. So given my real world test with my 5.7 code it was possible to hand the lock into a job which gets serialised into a Redis queue. The release was successful when done from the Job instance.

My biggest struggle currently is writing an integration test that can ensure that automatically. Is there a way to start a queue listener in the integration tests or maybe to at least work through the queue contents?

@janpantel janpantel changed the base branch from 5.7 to master December 4, 2018 14:23
@janpantel janpantel changed the title [5.7] Owned locks - allow locks to be secure against out of order releases. [5.8] Owned locks - allow locks to be secure against out of order releases. Dec 4, 2018
@GrahamCampbell GrahamCampbell changed the title [5.8] Owned locks - allow locks to be secure against out of order releases. [5.8] Allow locks to be secure against out of order releases Dec 4, 2018
/**
* Returns the owner value written into the driver for this lock.
*
* @return mixed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want this to be string|null?

Copy link
Contributor Author

@janpantel janpantel Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the nullability of the owner field is just an artifact that was left from the 5.7 implementation. Shouldn't we move to string only for all @return and @var (except for the constructors)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GrahamCampbell any opinion on that one?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if that is their type.

@taylorotwell
Copy link
Member

Eh, I don't think testing that specifically is important as simply testing the creation of a lock from an existing scope, etc.

@CthulhuDen
Copy link

How about Cache::block method, will it auto-generate token as well? Also, currently (in 5.7) if closure throws the Cache::block will not release the lock, do you think that also deserves amending?

@janpantel
Copy link
Contributor Author

janpantel commented Dec 10, 2018

Added the tests and changed the memcached implementation.
The only thing which is currently unpleasant about the API is that you will have to give the expiry second before the owner when restoring, like:

$original = Cache::lock('foo', 10)->get();
$owner = $original->getOwner();

$second = Cache::lock('foo', 10, $owner);
$second->release();

My first reaction was that it might make sense to move the expiry seconds to the ->get() call instead, so that we'd have:

Cache::lock('foo')->get(10);

I think that would also be cool semantically as it reads like get the lock for N seconds.

What do you guys think?

@taylorotwell
Copy link
Member

Hmm, yeah I don't think being forced to pass the expiry time in order to release the lock makes much sense.

@taylorotwell
Copy link
Member

It seems like you could also work around that by adding another method that accepts the owner as its second argument for reconstituting locks... Cache::from('foo', $owner)... etc.

@taylorotwell
Copy link
Member

Are there plans to keep working on this?

@janpantel
Copy link
Contributor Author

@taylorotwell yes, just had some downtime due to the holidays. I will add a commit introducing Cache::from() later. Also I'm waiting to @GrahamCampbell to answer :-)

@GrahamCampbell
Copy link
Collaborator

I've replied. Please let me know if you need anything else.

@janpantel
Copy link
Contributor Author

@GrahamCampbell The only thing left is to remove the request changes if you're fine with the current state.

@taylorotwell I created the restoreLock function and also ported to Redis implementation to the LUA script that @yuloh mentioned. Anything else you see left tbd?

@taylorotwell
Copy link
Member

Can you just comment with the basic usage documentation given the current API so I can look over it and use it as a guide in my own testing?

@janpantel
Copy link
Contributor Author

@taylorotwell for sure. I will also go ahead and update the real docs once we have committed on the API.

To obtain a lock and release it it is now necessary to carry the instance around.

if ($lock = Cache::lock('foo', 10)->get()) {
    // do stuff

    $lock->release()
}

If you are sure that there will be no race condition or the out of order release does not trouble you you can also use the forceRelease function without remembering the lock instance.

if (Cache::lock('foo', 10)->get()) {
    // do stuff

    Cache::lock('foo')->forceRelease()
}

If you have to release the lock in another process you can get the owner token from the lock instance and restore it later.

if ($lock = Cache::lock('foo', 10)->get()) {
    // do stuff

    event(new LockReleasingEvent($lock->getOwner()));
}

// later ...

$lock = Cache::restoreLock('foo', $owner);

// do stuff

$lock->release

A counter example, as in how to not do it would be:

if (Cache::lock('foo', 10)->get()) {
    // do stuff

    Cache::lock('foo')->release()

    /*
    lock still in place as the constructor generates a random
    owner token each time a lock instance is created
    */
}

@taylorotwell taylorotwell merged commit e26265b into laravel:master Jan 15, 2019
@taylorotwell
Copy link
Member

Nice work. I made the necessary changes to the new DynamoDbLock as well.

@driesvints
Copy link
Member

@janpantel thanks for your work on this! 👏

@janpantel
Copy link
Contributor Author

Thanks for making my first major contribution such a pleasant experience! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants