Jira issue originally created by user @lsmith77:
This is just a reminder ticket. It is important that all SQL code in D2 takes concurrent requests into account: simply fetching data and then writing it back to the RDBMS must account for concurrent requests that may change the fetched data in the meantime. That means employing explicit locking, optimistic locking, etc. where necessary.
Comment created by romanb:
I don't think we use a read-update-write approach that is supposed to be atomic anywhere. It's obvious that this is fragile. It's like the select-max(), increment-in-memory, write-record variant for generating primary keys, which is obviously a bad idea (at least it would need SERIALIZABLE transaction isolation, I think).
(Obviously, any object retrieval -> modification -> persistence is a read-update-write, but that is expectedly not "concurrency safe". That is what optimistic or pessimistic (not yet implemented) locking is for.)
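For illustration, the select-max() id generation pattern mentioned above can be sketched as follows (hypothetical `users` table and PDO connection, not code from D2); two concurrent requests can both read the same maximum and compute the same "new" id:

```php
// UNSAFE sketch of the read-update-write id generation anti-pattern.
// $pdo is an open PDO connection; "users" is a hypothetical table.
$max = $pdo->query('SELECT MAX(id) FROM users')->fetchColumn(); // requests A and B both read 41
$id  = $max + 1;                                                // both compute 42 in memory

$stmt = $pdo->prepare('INSERT INTO users (id, name) VALUES (?, ?)');
$stmt->execute(array($id, 'alice')); // the second insert fails with a duplicate key,
                                     // or silently duplicates if no constraint exists
```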
As this is a "reminder" and I don't see any concrete issues currently, this is surely not a blocker until someone finds a serious (and valid) problem related to it.
Comment created by @lsmith77:
And I want to reverse that logic: I want to ensure that things have been reviewed, so that it is clear they will work concurrently. For example, using CURRENT_TIMESTAMP, even just as an option, for version columns is, uhm, a very bad idea. At least make it microtime.
Also, D2 needs to support LOCK TABLE, and it needs to be ensured that it is possible to write "behaviors" that figure out the list of tables that need to be locked at the beginning of a flush() execution, because without that you are in a world of concurrency hurt when you want to implement things like Sortable or NestedSet on top of D2.
What is your definition of "things working concurrently"? The default transaction isolation for most databases (READ COMMITTED) is designed to allow concurrency for more liveness. This implies that things like lost updates and other anomalies are allowed to happen. If you don't want that, you can simply use a SERIALIZABLE isolation level for the transactions, which gives stronger consistency but less liveness/performance, right?
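With Doctrine's DBAL, raising the isolation level for a critical transaction would look roughly like this (a sketch; assumes a `Doctrine\DBAL\Connection` obtained from the EntityManager):

```php
use Doctrine\DBAL\Connection;

// Raise the isolation level for a transaction that must not race (sketch).
$conn->setTransactionIsolation(Connection::TRANSACTION_SERIALIZABLE);
$conn->beginTransaction();
try {
    // ... the read-modify-write that needs strong consistency ...
    $conn->commit();
} catch (\Exception $e) {
    $conn->rollBack();
    throw $e;
}
```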
I'm not convinced (yet) that timestamps are a "very bad idea" for optimistic locking (that is what the feature is used for; there is no "versionable" behavior or anything like it built into Doctrine). Yes, you can still get lost updates in rare, high-concurrency situations, but most of the time lost updates are prevented, and if that is fine for the particular application and a timestamp is a more usable and meaningful value, then I don't see anything wrong with it. We added that feature because it was in the JPA spec, and every major ORM I know supports timestamps for optimistic locking; I'm relatively confident that they know what they're doing. (It's still possible that we just have a broken implementation in Doctrine; I don't know yet, as I haven't had the time to look at the other implementations in detail.)
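For reference, D2's optimistic locking is driven by a version field on the entity; a minimal sketch with an integer version (rather than a timestamp), using D2's annotation style:

```php
/** @Entity */
class Article
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * @Version @Column(type="integer")
     * Incremented by the ORM on every UPDATE; flushing an entity whose
     * in-memory version is stale raises an OptimisticLockException.
     */
    private $version;
}
```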
As for LOCK TABLE, we will probably provide an abstraction in the DBAL over the specific SQL of the different platforms. Then maybe Doctrine will use it in some scenarios for pessimistic locking, but that is stuff for the future.
I'm not sure what you're aiming at with this ticket. Yes, locking features are going to be improved. Yes, concurrent transactions can cause lost updates and other anomalies unless you use a higher isolation level. But none of this is new. The transaction isolation level is already under your control, and the locking features will be improved. Where is the flaw you're apparently seeing? From your ticket and comment I get the feeling that you are very worried about concurrency, thinking that Doctrine doesn't do enough or does it wrong. I, however, don't yet see what more we should do beyond providing support for optimistic and pessimistic locking, and I don't see the world of concurrency hurt you're seeing, or in which way it would be the fault of Doctrine. I think I am very aware of concurrency-related problems, but as it is often a very difficult topic, it's easy to overlook something.
I am aiming at being able to deliver algorithms on top of D2 that ensure the C in ACID. For simple INSERT/UPDATE etc. you can rely on transactions and native isolation levels, but for the things D2 needs to be able to do (versionable, sortable etc.) we need more than that. Of course most apps are not going to have a bazillion parallel requests, but it's still quite common for people to double-submit, or for multiple admins to spot and fix the same data issue at the same time, and if that breaks your nested set tree, you are screwed. Currently I have no reason to believe that D2 was written with sufficient brain cycles spent on these issues. If you are willing to release D2 1.0 without this being checked, then I think you are making a horrible mistake.
D1 behaviors have tons of flaws that you have mentioned in your blog, but the biggest one never got mentioned, nor did it ever generate a response when I mailed the dev list. I fear that there is just too little attention being paid to this topic, maybe also because there isn't enough expertise in the area, but I think it's mainly just not being looked at enough.
Anyway, if you are certain enough that none of my concerns above are valid, and that if there are any issues you can fix them without a BC break, then close the ticket.
"I am aiming at being able to deliver algorithms on top of D2 that ensure the C in ACID. For simple INSERT/UPDATE etc. you can rely on transactions and native isolation levels, but for the things D2 needs to be able to do (versionable, sortable etc.) we need more than that."
I assume you mean "application level" consistency, right? Such that an ordering does not end up with two items having the same position, or that a nested set tree is not corrupted? I understand that, but why can't these cases be solved with a higher isolation level for the operations that modify the order/the tree? I would be especially interested in examples that can not be solved by using a higher isolation level. I currently can't see how it would be possible to break the consistency of a sorted association or nested set tree when using a SERIALIZABLE isolation level for the transactions that modify the order or the tree.
I'm not trying to dismiss your critiques here, and I'm taking you very seriously, but you need to get more specific than "ensure the SQL works with concurrent requests". That alone doesn't tell us much. We need concrete examples of problematic concurrent scenarios and of what Doctrine itself can or should do to avoid these problems.
Currently, we dont know what to look for.
Thanks for your help.
The issue is that with FOR UPDATE and ultra-paranoid isolation levels you can get locks on rows that exist, but it is not necessarily possible to get a lock on rows that are just being inserted, or altered so that they suddenly fall into your "range", or worse yet get appended to your "range". Some RDBMS do support range locks (e.g. if I do WHERE foo BETWEEN 1 AND 100), but not all do, and IIRC you do not really have much control over them (it is simply a lock escalation strategy an RDBMS may use to reduce the number of row-level locks). This means that in many situations, fetching data even with a FOR UPDATE might not be enough to ensure consistency when you do writes later on.
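A sketch of that phantom-row problem (hypothetical `items` table; on engines without range/next-key locking, session B's insert is not blocked by session A's row locks):

```php
// Session A (sketch): locks all *existing* rows in the range...
$pdo->beginTransaction();
$pdo->query('SELECT * FROM items WHERE position BETWEEN 1 AND 100 FOR UPDATE');

// ...but on engines without range/next-key locks, a concurrent session B can
// still execute: INSERT INTO items (position) VALUES (50);
// Session A's subsequent write then operates on a stale view of the range:
$pdo->exec('UPDATE items SET position = position + 1 WHERE position >= 50');
$pdo->commit();
```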
I have not done a code review of D2; actually I have spent very little time looking at any of the code, but from the questions and comments I have seen, there doesn't seem to be enough emphasis on this. For example, I also saw code for the slug extension, which again employs the model of fetching all potential collisions and then doing an insert based on the fetched data, without ensuring that between fetching and inserting there are no new collisions. Stuff like this should try harder to do things in a single commit if possible, or take the necessary locks.
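One way to harden the slug case without a table lock is to let the database arbitrate through a unique constraint and retry on collision (a sketch with a hypothetical `articles` schema, not the extension's actual code; assumes a UNIQUE constraint on `articles.slug`):

```php
// Insert with a slug; on a duplicate-key error, append a suffix and retry.
// The database, not a prior SELECT, decides whether the slug is taken.
function insertWithUniqueSlug(PDO $pdo, $baseSlug, $title)
{
    $stmt = $pdo->prepare('INSERT INTO articles (slug, title) VALUES (?, ?)');
    for ($i = 0; $i < 10; $i++) {
        $slug = ($i === 0) ? $baseSlug : $baseSlug . '-' . $i;
        try {
            $stmt->execute(array($slug, $title));
            return $slug;
        } catch (PDOException $e) {
            if ($e->getCode() != '23000') { // SQLSTATE 23000: integrity constraint violation
                throw $e;                   // some other error; re-throw
            }
        }
    }
    throw new RuntimeException('could not find a free slug');
}
```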
Locking is non-trivial business, so expecting end users to get this right is a bad move imho.
But aside from any of the internal SQL, I would also urge ensuring that some of the things I have been asking for are really possible: an efficient way to determine the affected tables, and whether they need to be locked, at the start of a flush(); and a solid way to mark specific models or all model instances as dirty, so that users can choose to reload certain dirty models either at once or automatically when they are accessed again (see DDC-343).
At the same time I acknowledge that I am just pointing at theoretical problems, without a promise to do the actual analysis and fixing. It's just that this month I have promised people to spend my spare time on other stuff. Next month might be more doable; I am traveling a lot, during which I can hopefully spend some time reviewing, but then again traveling means that providing feedback and discussing things is more difficult.
Comment created by @beberlei:
OK, I have read some stuff. InnoDB, for example, locks on index ranges (next-key locking), also locking the previous and next keys to avoid insert-behind problems.
We are also planning to add support for "FOR UPDATE" and "FOR SHARE" as a query hint on all DQL queries, as well as a "Lock" enum that allows setting these flags for transactions through the entity repositories (i.e. EntityManager::find(), EntityRepository::findOne(), findAll() and such). See the linked issue DDC-178.
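A sketch of what that planned API could look like (names follow the DDC-178 direction, e.g. a `LockMode` constant class, but details may differ in the final implementation):

```php
use Doctrine\DBAL\LockMode;

// Pessimistic write lock (SELECT ... FOR UPDATE) when loading by id.
$account = $em->find('Account', $id, LockMode::PESSIMISTIC_WRITE);

// Or as a lock hint on a DQL query (FOR SHARE / LOCK IN SHARE MODE).
$query = $em->createQuery('SELECT a FROM Account a WHERE a.id = :id');
$query->setParameter('id', $id);
$query->setLockMode(LockMode::PESSIMISTIC_READ);
```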
An automatic table lock is slightly more complex to compute, but it is possible using the "onFlush" event. My draft of a table locking manager is attached to the ticket.
A TableLocking manager, usage:

```php
$lock = new TableLocking();
$evm = new EventManager();
// register the manager so its onFlush() hook runs during every flush
$evm->addEventListener(Events::onFlush, $lock);
$em = EntityManager::create(..., $evm);
// do stuff
```
There should probably be a postCommit() event or something to clean up the mess automatically ;-)
Btw, this won't work at all with Joined Inheritance.
@lsmith: can you comment on the code? :)
This is looking quite promising. I guess it would be wise to give each model class the choice of whether it wants to cause a lock or not, because especially in high-concurrency scenarios locking a table can of course have drastic performance effects. Actually, if I for example have a sortable behavior and I am not messing with the sorting at all, I do not need to lock the table either. So probably the best approach would be to set a flag when a model method is called that requires locking. Each model could register an onFlush() event listener (theoretically it could try to be smart and only register the event when it knows it has something to do) and then, given the value of the flag, tell the lock class to issue a lock for the given underlying table.
BTW: as for InnoDB, like I said, some RDBMS have this behavior, but it's nothing we can rely on at all.
Yeah, of course this code locks all the time. Some kind of configuration to check for models that need full table locks is probably necessary, but as you can see from the code, it is easy to implement.
My idea would be to add a "TableLockRequired" marker interface and, in the loops, check whether any dirty model implements this marker interface.
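A minimal sketch of that marker-interface idea (class and interface names are hypothetical; the event arguments and UnitOfWork accessors are D2's onFlush API):

```php
// Entities opting into a full table lock implement this empty marker interface.
interface TableLockRequired {}

class TableLocking
{
    public function onFlush(\Doctrine\ORM\Event\OnFlushEventArgs $args)
    {
        $em  = $args->getEntityManager();
        $uow = $em->getUnitOfWork();

        // Collect the tables of all dirty entities that require locking.
        $tables = array();
        $dirty = array_merge(
            $uow->getScheduledEntityInsertions(),
            $uow->getScheduledEntityUpdates(),
            $uow->getScheduledEntityDeletions()
        );
        foreach ($dirty as $entity) {
            if ($entity instanceof TableLockRequired) {
                $class = $em->getClassMetadata(get_class($entity));
                $tables[$class->getTableName()] = true;
            }
        }
        // ...issue platform-specific LOCK TABLE statements for array_keys($tables)
    }
}
```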
Comment created by seldaek:
Just as a reminder: I think if some code attempts to lock and the current RDBMS/engine doesn't allow the requested lock, D2 should throw an exception saying locking doesn't work with the current engine (make it an option maybe, but default true imo). Silently doing nothing is good in production to avoid pages exploding, but in development I want to know, when I lock, whether it's really locking in the background. And if I can know that without reading all the detailed specs about whatever engine is in use, that's even better.
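Something like the following could implement that fail-fast behavior (an entirely hypothetical helper; D2 exposes no such capability check, so how "the engine supports locks" gets determined is left open here):

```php
// Hypothetical guard: refuse to pretend-lock when the engine cannot lock.
class LockGuard
{
    private $strict;

    public function __construct($strict = true) // strict by default, as suggested above
    {
        $this->strict = $strict;
    }

    public function assertLockable($engineSupportsLocks)
    {
        if (!$engineSupportsLocks && $this->strict) {
            throw new RuntimeException(
                'Locking is not supported by the current RDBMS/engine'
            );
        }
        // non-strict mode (e.g. production): silently continue without a lock
    }
}
```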
Synchronizing schedule with DDC-178. This issue is considered resolved when the improved locking support from DDC-178 has been implemented. As Benjamin demonstrated, more custom (and generic, reusable) locking strategies are possible through the use of the event system.
Issue was closed with resolution "Fixed"
Imported 1 attachments from Jira into https://gist.github.com/2ab71126707ea7d96d10