Skip to content

Loading…

DDC-178: Query Hint for LOCK mechanisms plus support in $em->find() #2432

Closed
doctrinebot opened this Issue · 32 comments

2 participants

@doctrinebot

Jira issue originally created by user @beberlei:

In some scenarios it is necessary to explicitly lock rows for update in a select query. My idea would be to support it twofold:

  1. Add a LockMode Class:
final abstract class Lock
{
    const NONE = 0;
    const OPTIMISTIC = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}
  1. Add methods to platforms that can add necessary READ/WRITE Lock query additions like FOR UPDATE/SHARED which afaik are supported by all rmdbs in some way.
  2. Add a query hint "lockMode" which takes a Lock constant.
  3. Add a query hint "lockVersion" which takes an integer or timestamp versioning value.
  4. Change DQL Parser to apply lock mode if set, and if lock_mode = optimistic, add a where clause for the only top-level class (more not supported? Reread Evans DDD / Aggregates)
  5. Change the $em->find() method to the following signature:
public function find($class, $identifier, $lockMode=0, $lockVersion=null);

And if the values are set, set the appropriate query hints.


Updated API specification

LockModes

final abstract class LockMode
{
    const NONE = 0;
    const OPTIMISTIC = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}

Constraints

  • LockMode::OPTIMISTIC requires entities to be versioned
  • LockMode::PESSIMISTIC_READ/WRITE works similarly for versioned as well as non-versioned entities. However, lock() after read only works for versioned entities.

API spec

$query->setLockMode(LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if any of the entities fetched (or fetch-joined) by the query are not versioned.
  • Otherwise proceed normally, SQL is not modified.

$query->setLockMode(LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no running transaction.
  • Modify the SQL with an appropriate locking clause (i.e. FOR UPDATE) that can be platform-specific, to acquire a pessimistic lock on all read entities.
  • Throw PessimisticLockException if lock(s) could not be obtained.

$em->find($entity, LockMode::OPTIMISTIC) (or findBy et al)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise proceeed normally.

$em->find($entity, LockMode::OPTIMISTIC, $version) (or findBy et al)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • find() entity normally, no SQL modification.
  • Throw OptimisticLockException if there is a version mismatch ($version != $entity->version)
  • Otherwise proceed normally.

$em->find($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Modify the SQL to incude an appropriate platform-specific pessimistic lock (i.e. FOR UPDATE)
  • throw PessimisticLockException If lock could not be obtained
  • Otherwise proceed normally
  • Refresh entity with lock when entity with id exists in identity map already

$em->refresh($entity, LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise proceed normally. (What about cascades here? Need to take that into account probably)

$em->refresh($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Ensure the SQL used for refreshing is modified accordingly with the platform-specific pessimistic locking clause.
  • Throw PessimisticLockException if the lock could not be obtained.
  • Otherwise proceed normally.

$em->lock($entity, LockMode::OPTIMISTIC)

Effects:

  • Throw OptimisticLockException if entity is not versioned.
  • Otherwise do nothing (NOOP).

$em->lock($entity, LockMode::PESSIMISTIC_READ/WRITE)

Effects:

  • Throw TransactionRequiredException if there is no active transaction.
  • Throw PessimisticLockException if entity is not versioned (this is always a "lock after read") ** Issue straight, minimal locking SQL (we probably must* include the version column in the select), platform-specific. Note: Probably get the SQL from the persisters to account for different inheritance mapping strategies. The last part of the SQL, the locking clause, is taken from the platforms.
  • Throw PessimisticLockException if the lock could not be obtained.
  • Compare the newly read version with the old version. If they dont match throw PessimisticLockException (this means the entity was changed since it was read).
  • Otherwise proceed normally.
@doctrinebot

Comment created by @guilhermeblanco:

We should be aware that PHP 5.3 now uses mysqlnd driver internally.
This means that queries like LOCK are applied via unbuffered queries, which may compromise subsequent data changes on DB table.

We need to do some testing before effectively apply any type of approach here.

@doctrinebot

Comment created by @beberlei:

What does this have to do with locking? I don't understand it :-) Please elaborate :D

@doctrinebot

Comment created by @beberlei:

Attached a quick try for pessimistic lock.

Optimistic lock enforcement is much more difficult...

@doctrinebot

Comment created by @beberlei:

Update of the patch adding tests to the Query SELECT tests.

@doctrinebot

Comment created by romanb:

Regarding Nr. 5, indeed more than one root entity is supported in DOctrine 2 when querying.

Example (assuming Customer and Employee are not related in any way):

$q = $em->createQuery("SELECT c, e FROM Customer c, Employee e WHERE c.hatsize = e.shoesize"

The result would (or should) be an array with both Customer and Employee objects.

@doctrinebot

Comment created by @beberlei:

We should probably go only for the PESSIMISTIC lock in the interface.

However it might be convenient to make optimistic locking possible via an additional parameter. Given that PHP is Stateless it might be a good pattern to do:

$id = $_POST['id'];
$lastKnownVersion = $_POST['version'];
$entity = $em->find('Yadda', $id, $lastKnownVersion);
// do stuff with $entity

$em->flush();
@doctrinebot

Comment created by @beberlei:

Hm given our discussion yesterday, we should translate the OPTIMISTICFORCEINCREMENT into PHPs execution context (script per request):

final abstract class Lock
{
    const NONE = 0;
    const OPTIMISTIC_READ = 1;
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 4;
}

OPTIMISTIC_READ requires the fourth parameter of EM::find() to find being a positive integer and adds a version chck for that given version. This wont work with Dql though.

EntityRepository should also be changed, this affects:

EntityRepository::find($id, Lock $lockMode = null, $lockVersion = null);
EntityRpository::findOneBy(array $criteria, Lock $lockMode = null, $lockVersion = null);

Entity Repository would translate a $lockVersion != null into a $criteria. Then we onlly need to change:

StandardEntityPersister::load($criteria, $entity, $assoc, $hints = array(), $lockMode = null)
StandardEntityPersister::_getSelectEntitiesSQL($criteria, $assoc=null, $lockMode = null)

The all queries have to be done with DQL to get pessimistic locks imho. A DQL Query with locking would look like:

$q->setHint(Query::HINT*LOCK_MODE, Lock::PESSIMISTIC*READ);
@doctrinebot

Comment created by romanb:

I would also like to see an EntityManager#lock() method. However, after re-reading the spec and testing with Hibernate 3.5.0-CR-2, I'm not sure we really need the *FORCE_INCREMENT variations. In Hibernate, em.lock() with OPTIMISTIC doesnt do anything (it doesnt need to, versions are compared on update anyway). Even with OPTIMISTIC_FORCE_INCREMENT nothing happens, except the normal version update when the entity changed. Not sure whether this is a bug or intended. PESSIMISTIC_FORCEINCREMENT works as expected.

So I currently think we should have the following as a start:

final abstract class LockMode
{
    const NONE = 1; // default
    const PESSIMISTIC_READ = 2;
    const PESSIMISTIC_WRITE = 3;
}

The LockMode can then be used either in queries or with EntityManager#lock(). Moreso, these lock modes work independantly of whether the entity is versioned (@Version) or not. PESSIMISTICREAD would acquire a shared read lock while PESSIMISTICWRITE would acquire an exclusive write lock.

What I'm not sure yet about are the following things:

1) Whether to add a query hint or an explicit setLockMode() method + $_lockMode property on the query. I'm tending towards the latter to give a more explicit API for locking (hints are a bit non-obvious and may be overlooked).

2) Whether we need any kind of OPTIMISTIC support in LockMode or *FORCEINCREMENT support. I currently think we should start without both.

@doctrinebot

Comment created by @beberlei:

In my opinion EntityManager#lock() is not necessary for PHP

I think it has a place in long running scripts, where you dont know if you have already a write lock for an entity you have retrieved from the session some time ago, however in PHP you certainly know if a script you are executing needs a read/write consistency lock or not.

This is also where optimistic force increment shines in Java, because you update the version column you are certain you have a read consistent version of your entity. In PHP the time-frame between retrievial and a call to lock() is just unnecessary small and could be solved by directly locking the entity.

This is why i think EntityManager::find needs to be able to set the version also. A long running script in php means:

  1. Session A retrieves Entity with Id 1 and Version 1 and displays a form
  2. Session B retrieves Entity with Id 1 and Version 1and displays a form
  3. Session B retrieves Entity withId 1 and Version 1 and updates it to Version 2
  4. Session A retrieves Entity with Id 1 and Version 2 and a lost update happens.

This would be the case without checking the version column. A pessimistic lock in this case is not possible, because it would be lost between point 1 and point 4. However an optimistic lock could check upon retrieval of the entity in point 4, if it is still at version 1.

@doctrinebot

Comment created by romanb:

You dont always know at the point of reading an entity whether you want to lock it or not because these decisions can be made in different layers of the application, even during a single HTTP request. And pessimistic (online) locks are never retrieved in "long-running scripts" / "business transactions", i.e. held during user think time, that would be a concurrency nightmare! And the pessimistic locks in JPA are online locks (select ... for update), not offline locks, so there is no difference between Java and PHP when it comes to the pessimistic online locks (select ... for update).

We are really only talking about pessimistic online locks and optimistic offline locks here. (I dont even know if there is such a thing as an optimistic online lock).

All/Most of the examples on the following blog do not refer to long-running business transactions (transactions that span multiple database transactions with user-think time in between).
http://weblogs.java.net/blog/2009/07/30/jpa-20-concurrency-and-locking

Of course EntityManaher#find et al should be extended with optional lockMode and lockVersion arguments like you said but nevertheless there should still be EntityManager#lock().

@doctrinebot

Comment created by romanb:

Let me clarify, pessimistic online locks only really make sense for the duration of a transaction since they're released at the end of a transaction. You dont want to hold such a transaction open during a long-running business transaction (user-think time / multiple requests). Thats why there is no difference in usage of such locks in Java or PHP.

@doctrinebot

Comment created by @lsmith77:

actually pessimistic (time limited) offline locks are also commonly used.

@doctrinebot

Comment created by romanb:

Moreso, even if you do know you want to lock at the point of reading the entity (which may not be the case like I said above, these decisions can be made in different layers, or even in code of an extension that receives an existing object and now wants to lock it to do its work properly. Without em.lock() that would mean re-reading the whole entity), you may still not want to lock at the point of reading:

1) read entity
2) do some (potentially expensive) computations/calculations/whatever based on the state of the entity but we dont need a lock yet
3) after inspecting the outcome of 2) we decide we want to lock

(yes, locking after reading risks stale data but it locks for a shorter duration, its a compromise)

When you can only lock during reading, this would mean in such a case holding the lock unnecessarily long.

@doctrinebot

Comment created by @lsmith77:

Err .. what you just described is optimistic offline locking. Aka you read the version identifier, go off do your thing, commit .. possibly discover that there was a concurrent write ..

@doctrinebot

Comment created by romanb:

@Lukas: Yea. We're really only talking about pessimistic online locks and optimistic offline locks here. I know there are pessimistic offline locks (D1 even has an implementation of that which I wrote) its not currently in the focus for D2 and can even be provided by an extension.

@doctrinebot

Comment created by @beberlei:

Ok given layering EntityManager#lock() has use-cases.

However from my experience I want to lock entities at the beginning of long running scripts, because these runs are more important to my business than those small runs from users.

Say i have an entity that at a certain point in time needs to calculate very expensive stuff, before I start the calculations i would want to make sure that these wont get busted by a user updating the entity during that run, so i would apply SELECT * FOR UPDATE before the calculations and not after them. At point 3 when you apply the lock, you only know if you have a consistent lock between 1 and 3 when you have a version field to check against, a pessimistic lock wont help there.

The pessimistic lock in EntityManager#lock() makes sense if you want to make sure that after the aquiring no changes happen, for example:

  1. read entity
  2. do some simple checks
  3. realize after 2, we need to make expensive computations, lock() or refresh(entity, lockMode) here? (probably depends if the entity has a version column or not)
  4. do expensive computations
  5. flush
@doctrinebot

Comment created by romanb:

Based on all the discussion so far I updated the main issue description with an API specification. Feel free to comment.

@doctrinebot

Comment created by romanb:

Next on my list is working out the transaction semantics for all the cases, i.e. where to enforce that it is invoked inside an active transaction and in which cases the Optimistic/Pessimistic exceptions cause a transaction rollback (or whether there is actually a case in which they do not result in a rollback).

@doctrinebot

Comment created by romanb:

First update for transaction requirements.

@doctrinebot

Comment created by romanb:

Updated refresh() specification.

@doctrinebot

Comment created by @beberlei:

Add another condition for PESSIMITIC locks in combination with find()*

  • Refresh entity with lock when entity with id exists in identity map already
@doctrinebot

Comment created by @beberlei:

hm actually only EntityManager#lock($entity, $mode) has to be called.

@doctrinebot

Comment created by @beberlei:

Patch for the said functionality, its missing refresh() changes though.

@doctrinebot

Comment created by @beberlei:

I made the following observations on lock timeouts:

  • MySQL supports them for InnoDb at the Database Configuration level
  • Postgres has either NO WAIT or no timeout at all
  • Oracle has either NO WAIT, no timeout, or a specified number of miliseconds.
  • Sqlite has no locking

Unrelated but:

  • MsSql has a very weird row lock syntax, we need a new construct in AbstractPlatform for this.

Question:

Should we support something like lock timeouts via a query hint? If yes:

  • Should we only specifiy something like setHint(Query::HINTLOCK_TIMEOUTNOWAIT, true);
  • or like setHint(Query::HINTLOCKTIMEOUT, 0);

Vendors except Oracle and Postgres would simply ignore this option. Postgres would only support TIMEOUT = 0 as NOWAIT

@doctrinebot

Comment created by @beberlei:

The latest version of lock-support is available here:

http://github.com/beberlei/doctrine2/tree/lock-support

The Lock Support is now tested using Gearman Job Server allowing to have functional scenarios where waiting for lock releases is necessary, see:

http://www.whitewashing.de/blog/articles/129

Refresh() is still missing, I am not sure if this should be included in the 2.0 version (use-case is very slim).

@doctrinebot

Comment created by romanb:

Great work so far. I think we can skip refresh() support for now, so post-2.0 if at all.

@"Should we support something like lock timeouts via a query hint?"

I think setHint(Query::HINTLOCKTIMEOUT, 0) would be good. That way we keep the possibility open for later enhancements regarding other timeout values (i.e. if features change on databases) without requiring public API changes. However, it should be clearly documented that this is a hint and not a guarantee and it should be documented which database vendors interpret the timeout in which way I think.
As a start, only having timeout = 0 => NOWAIT would be enough.

@doctrinebot

Comment created by romanb:

You can reschedule the lock timeout / nowait to beta3 if you want that. I think we already have enough for beta2.

@doctrinebot

Comment created by romanb:

Pushing outstanding work back to beta3.

@doctrinebot

Comment created by @beberlei:

Implemented, Lock Timeouts will be handled in a dedicated ticket.

@doctrinebot

Issue was closed with resolution "Fixed"

@beberlei beberlei was assigned by doctrinebot
@doctrinebot doctrinebot added this to the 2.0-BETA3 milestone
@doctrinebot doctrinebot closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.