Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DboSource::_matchRecords() race conditions and usage re affectedrows #1746

Closed
ghost opened this issue Oct 11, 2013 · 9 comments
Closed

DboSource::_matchRecords() race conditions and usage re affectedrows #1746

ghost opened this issue Oct 11, 2013 · 9 comments
Labels
Milestone

Comments

@ghost
Copy link

ghost commented Oct 11, 2013

Created by Dieter Plaetinck, 16th Mar 2010. (originally Lighthouse ticket #468):


DboSource::_matchRecords() is used in 2 places:
DboSource::update() and DboSource::delete()

In both places it's used to check if records can be found matching the specified conditions before doing the actual update/delete.

However:

  1. this is vulnerable to race conditions. Matching rows can appear or vanish between the find('all') call in _matchRecords() and the actual query execution in DboSource::update() or DboSource::delete().
  2. (imho) it's bad practice anyway to return false in DboSource::update() and DboSource::delete() when there will be/was no affected row. "false" is a return value which should only be used in case of errors. This also how it's documented in the api:
    http://api.cakephp.org/class/dbo-source#method-DboSourceupdate
    http://api.cakephp.org/class/dbo-source#method-DboSourcedelete

I propose to get rid of the redundant find('all') calls, just execute the update/delete queries. And (only) in case of failure, return false.
If users want a reliable way to check if rows were affected, they can always use Model::getAffectedRows()

If you guys agree with this logic, I will write a patch for 1.3

@ghost
Copy link
Author

ghost commented Oct 11, 2013

15th Mar 2010, Mark Story said:


How do race conditions get created? _matchConditions is used to fix database servers that don't allow deletion or updating using joins. Without _matchConditions if you were to attempt a delete query that requires joins, and were using a database server that does not support joins on delete/update, the query will error out, which is not acceptable.

I also don't see the issue with returning false from update() and delete(). The documentation says it returns the boolean of the queries success. If the conditions would fail to find anything, the query was unsuccessful. Running queries that could potentially generate errors doesn't seem like a good idea to me.

@ghost
Copy link
Author

ghost commented Oct 11, 2013

15th Mar 2010, Dieter Plaetinck said:


when you use DboSource::update() or DboSource::delete() to update/delete records matching certain conditions, the only way the query can be unsucessfull is when it failed to do what you ask. I.e. when it failed to update/delete the records matching your conditions.

When there are no matching records, that's something completely different then a failure.
Let's make the analogy with a find() call. If you supply conditions that match nothing, you just get an empty array. the query executed properly; there are just no rows in the resultset because the conditions did not match.

This is good, because if you get false back, you know the query/database failed, an empty array means no results, and so on. unambiguos return value. The return code of DboSource::update() and DboSource::delete() is right now very dubious.

race conditions: in _matchRecords you first do a find('all') with the given conditions and return false in no rows are found.
This return value is checked in DboSource::update() and DboSource::delete(), and if _matchRecords() returned false, then these functions also return false. but records that match the conditions can be added or removed by other php processes, so that _matchRecords() return value may not be accurate anymore.

@ghost
Copy link
Author

ghost commented Oct 11, 2013

16th Mar 2010, Mark Story said:


Fair enough, in a multi process environment there could be race conditions. I'd be more than willing to accept a patch that fixes this short coming. As long as that patch didn't involve shenanigans like locking tables.

@ghost
Copy link
Author

ghost commented Oct 11, 2013

16th Mar 2010, Dieter Plaetinck said:


in some transactionless storage engines (i.e. myisam) table locks are afaik the only way to do this in a reliable manner. but that's just how those engines are designed, and table locks are not that expensive in myisam.

But anyway my primary concern here is the return value. Do you agree with what i said about the return value?

@ghost
Copy link
Author

ghost commented Oct 11, 2013

28th Mar 2010, Mark Story said:


Your original suggestion of removing all the 'extra' find() calls isn't going to work, as those extra find calls are used to qualify conditions that the dbo cannot normally handle.

I also disagree that the return of update() and delete() should change to an array. I don't see how returning an empty array is a better alternative to false. It creates an additional type you need to check for and handle separately.

Moving to RFC, as I don't see the return of important methods changing in stable releases, but its something that can be discussed for future versions. Personally, I don't like the idea of additional return values, and would rather see exceptions used instead.

@ghost
Copy link
Author

ghost commented Oct 11, 2013

29th Mar 2010, cake_joe said:


I disagree with the suggestion that delete() and update() should return true if no data was changed due to no rows being affected. I consider that bad practice - the ORM should not force you to rely on something like getAffectedRows(), else one could directly start doing queries manually.

If at all there could be an additional option/flag that specifies if delete() and update() should still return true if they got executed without errors but did affect 0 rows (working somehow like deleteIfExists() and updateIfExists()).

I agree that on actual "errors" (like DBO/SQL errors, errors resulting out of race conditions) it should throw an error.
p.s.: Wouldn't optional transactions, if available, help with the race condition?

@ghost
Copy link
Author

ghost commented Oct 11, 2013

29th Mar 2010, Dieter Plaetinck said:


  • There must be a way to distinguish "no rows affected" from "an error occurred". Not being able to distinguish those is very bad. I suggested returning an empty array because that's how it's implemented for find(). Any solution is better then the ambiguous return value we have now. If you prefer the usage of exceptions over an extra possible return value, that's fine by me. I'm also fine with ionas' "deleteIfExists() and updateIfExists()" proposal.
    I'm willing to patch this, I just want clarity on how to deal with this.
  • As for the race conditions. I'm glad I managed to convince Mark, I leave the implementation to someone else, as I'm not familiar enough with the context.

@ghost
Copy link
Author

ghost commented Oct 11, 2013

29th Mar 2010, Mark Story said:


ionas: The Dbo's currently return false, when no rows were deleted, as well as when an error is generated.

Using transactions, is something that Cake often leaves up to the end developer to manage. Outside of saveAll there are no transactions used by cake.

There are other return value problems that are not addressed by changing the return for when _matchRecords() is falsey. If you grab a dbo object and run a delete command that doesn't hit anything it will return true, which is also technically incorrect from the arguments presented here. You would still need to check the affected rows in this situation as well.

@lorenzo
Copy link
Member

lorenzo commented Jan 23, 2014

Closing as now update and delete happen inside a transaction in 3.0

@lorenzo lorenzo closed this as completed Jan 23, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant