New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization failure, Deadlock found when trying to get lock #14757

Closed
PVince81 opened this Issue Mar 9, 2015 · 39 comments

Comments

Projects
None yet
@PVince81
Member

PVince81 commented Mar 9, 2015

Steps to reproduce

  1. Prepare a local folder "lots" with 1600+ files
  2. Setup the sync client and let it upload all the files
  3. Rename "lots" to "lots2", wait for sync
  4. Rename "lots2" to "lots3"

Expected result

Renamed properly.

Actual result

Very strange internal server error:

{"reqId":"913558a89371e56c7248d7c0ac24f820","remoteAddr":"127.0.0.1","app":"remote","message":"An exception occurred while executing 'UPDATE `oc_filecache` SET `path` = ?, `path_hash` = ?, `name` = ?, `parent` =? WHERE `fileid` = ?' with params [\"files\\\/lots3\", \"6496454381429272daca5f1453986396\", \"lots3\", \"3\", 8]:\n\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","level":4,"time":"2015-03-09T09:45:40+00:00","method":"MOVE","url":"\/owncloud\/remote.php\/webdav\/lots2"}

And then the sync client reuploads all the files which takes a while...

Versions

ownCloud 8.0.0 / stable8 found while testing #14752
mariadb-10.0.14-2.3.x86_64
php5-5.6.6-1.1.x86_64
openSUSE Factory

Not sure if this can be reproduced consistently, will try again.

That could be another argument for closeCursor.

@DeepDiver1975 @icewind1991 @MorrisJobke

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Mar 9, 2015

Member

Hmm I wonder if it's due to #14752 which introduces a transaction because that's the branch I got the issue on.

Member

PVince81 commented Mar 9, 2015

Hmm I wonder if it's due to #14752 which introduces a transaction because that's the branch I got the issue on.

@MorrisJobke

This comment has been minimized.

Show comment
Hide comment
@MorrisJobke

MorrisJobke Mar 9, 2015

Member

@PVince81 Also @felixboehm mentioned that we maybe need to handle aborted transactions because of conflicts somehow.

Member

MorrisJobke commented Mar 9, 2015

@PVince81 Also @felixboehm mentioned that we maybe need to handle aborted transactions because of conflicts somehow.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Mar 9, 2015

Member

Hmmm just now I noticed that while I renamed "lots3" to "lots4", now I have two duplicate local folders called "lots3" and "lots4".

Aborted transactions ? You mean when the sync client tries to do something but the server is still busy from the previous parallel operation ?

Member

PVince81 commented Mar 9, 2015

Hmmm just now I noticed that while I renamed "lots3" to "lots4", now I have two duplicate local folders called "lots3" and "lots4".

Aborted transactions ? You mean when the sync client tries to do something but the server is still busy from the previous parallel operation ?

@MorrisJobke

This comment has been minimized.

Show comment
Hide comment
@MorrisJobke

MorrisJobke Mar 9, 2015

Member

No aborted transaction as in: The database says "no I can't do this transaction, because there was a parallel concurrent transaction and this transaction isn't valid anymore"

Member

MorrisJobke commented Mar 9, 2015

No aborted transaction as in: The database says "no I can't do this transaction, because there was a parallel concurrent transaction and this transaction isn't valid anymore"

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Mar 9, 2015

Member

I tried to reproduce the issue but this run didn't. One thing I observed though is that the sync client sends a MOVE operation for EVERY file inside the renamed folder instead of just renaming the folder. It doesn't seem to always do that though.

Member

PVince81 commented Mar 9, 2015

I tried to reproduce the issue but this run didn't. One thing I observed though is that the sync client sends a MOVE operation for EVERY file inside the renamed folder instead of just renaming the folder. It doesn't seem to always do that though.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Mar 9, 2015

Member

I'm trying to sequentially rename "lotsX" to "lotsY" and see what happens, in the hope to reproduce the issue.
There's more weird stuff happening:

{"reqId":"174cfbf66b485a8274cb2bc8071e6a21","remoteAddr":"127.0.0.1","app":"core","message":"unable to rename, file does not exists : files_versions\/lots9.v","level":3,"time":"2015-03-09T10:12:04+00:00","method":"MOVE","url":"\/owncloud\/remote.php\/webdav\/lots9"}

(notice that lots9 is a folder, not a file)

Member

PVince81 commented Mar 9, 2015

I'm trying to sequentially rename "lotsX" to "lotsY" and see what happens, in the hope to reproduce the issue.
There's more weird stuff happening:

{"reqId":"174cfbf66b485a8274cb2bc8071e6a21","remoteAddr":"127.0.0.1","app":"core","message":"unable to rename, file does not exists : files_versions\/lots9.v","level":3,"time":"2015-03-09T10:12:04+00:00","method":"MOVE","url":"\/owncloud\/remote.php\/webdav\/lots9"}

(notice that lots9 is a folder, not a file)

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Mar 9, 2015

Member

I tried several times and didn't manage to reproduce the deadlock issue.
Maybe it was a one off or race condition of some sorts... but with the nasty side effect of having the folder duplicate itself.

Member

PVince81 commented Mar 9, 2015

I tried several times and didn't manage to reproduce the deadlock issue.
Maybe it was a one off or race condition of some sorts... but with the nasty side effect of having the folder duplicate itself.

@christianrj

This comment has been minimized.

Show comment
Hide comment
@christianrj

christianrj Mar 9, 2015

Maybe this has something to do with one these bugs that I previously reported?
#10711
#11847
#12659
#12661

christianrj commented Mar 9, 2015

Maybe this has something to do with one these bugs that I previously reported?
#10711
#11847
#12659
#12661

@lparmentier-quatral

This comment has been minimized.

Show comment
Hide comment
@lparmentier-quatral

lparmentier-quatral Jun 2, 2015

We've trying OC 8.1.0.6 beta2 on a cloud (HaProxy on frontend apache, HaProxy on frontend MariaDB Cluster) and there is the same problem. Not due to renaming directory, so not due to #14752 cause we've the last version where it's resolved, and we caused the problem without renaming directory.

lparmentier-quatral commented Jun 2, 2015

We've trying OC 8.1.0.6 beta2 on a cloud (HaProxy on frontend apache, HaProxy on frontend MariaDB Cluster) and there is the same problem. Not due to renaming directory, so not due to #14752 cause we've the last version where it's resolved, and we caused the problem without renaming directory.

@MCoLoL

This comment has been minimized.

Show comment
Hide comment
@MCoLoL

MCoLoL Jul 17, 2015

hi all, we have the same problem with OC8.0.5 and client version 1.8.4 ( build 2531 ), when we upload big files ( eg 300MB), the synchronization will abort with "Internal server error"
In the owncloud log, we can find:

Jul 17 09:29:02 sdcoca15 ownCloud[24989]: {remote} An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1437118142, "55a8aebee9f3a", 1437118142, "55a8aebee9f3a", "11251183"]:#12#012SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction

Probably is the same bug?
thanks

MCoLoL commented Jul 17, 2015

hi all, we have the same problem with OC8.0.5 and client version 1.8.4 ( build 2531 ), when we upload big files ( eg 300MB), the synchronization will abort with "Internal server error"
In the owncloud log, we can find:

Jul 17 09:29:02 sdcoca15 ownCloud[24989]: {remote} An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1437118142, "55a8aebee9f3a", 1437118142, "55a8aebee9f3a", "11251183"]:#12#012SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction

Probably is the same bug?
thanks

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Oct 8, 2015

Member

Is anyone still seeing this with 8.1.3 with transactional redis locking enabled ? (https://doc.owncloud.org/server/8.1/admin_manual/configuration_files/files_locking_transactional.html)

Member

PVince81 commented Oct 8, 2015

Is anyone still seeing this with 8.1.3 with transactional redis locking enabled ? (https://doc.owncloud.org/server/8.1/admin_manual/configuration_files/files_locking_transactional.html)

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Oct 16, 2015

Contributor

We are using OC 8.1.2 with transactional filelocking and MariaDB and are still getting this error.

Contributor

michi-roth commented Oct 16, 2015

We are using OC 8.1.2 with transactional filelocking and MariaDB and are still getting this error.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Oct 16, 2015

Member

Can you check the "Admin" page and under "Status" you should see a message that says that locking is enabled.

Member

PVince81 commented Oct 16, 2015

Can you check the "Admin" page and under "Status" you should see a message that says that locking is enabled.

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Oct 16, 2015

Contributor

The message is visible.
Before activating the new locking we often had deadlocks in the table "file_locks" (I am not sure of the correct name, but is was the table for the file locking). After activating the new lock we only have deadlocks while updating mtime and etag in the "file_cache" table.#

Also our behaviour is, that after the message the sync client does not sync the new files. I think this is because of the unchanged etag

Contributor

michi-roth commented Oct 16, 2015

The message is visible.
Before activating the new locking we often had deadlocks in the table "file_locks" (I am not sure of the correct name, but is was the table for the file locking). After activating the new lock we only have deadlocks while updating mtime and etag in the "file_cache" table.#

Also our behaviour is, that after the message the sync client does not sync the new files. I think this is because of the unchanged etag

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Oct 16, 2015

Member

Hmmm, the table "file_locks" doesn't exist in 8.1.x. The app "files_locking" is something different and doesn't use the database.
Did you actually upgrade to 8.2RC2 ?

Member

PVince81 commented Oct 16, 2015

Hmmm, the table "file_locks" doesn't exist in 8.1.x. The app "files_locking" is something different and doesn't use the database.
Did you actually upgrade to 8.2RC2 ?

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Oct 21, 2015

Contributor

I just upgraded to OC 8.2.0 and got the same error again. The updated file is shown in the web interface but no client downloads the changes because the etag of the parentfolder is not changed.

The oc_file_locks table is still in the database and the admin page does not show the message for the activated transactional locking. But I don't think that this has something to do with the error while updating the database. What is ownCloud doing when a DB transaction fails? I would assume that ownCloud retries to change the DB but I doesn't look so.

The exact error message:

{"reqId":"DEPld/dVX2BKKGWwRuGK","remoteAddr":"137.250.18.11","app":"index","message":"Exception: {"Exception":"Doctrine\DBAL\Exception\DriverException","Message":"An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1445422901, \"56276735aadd0\", 1445422901, \"56276735aadd0\", 2511469]:\n\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","Code":0,"Trace":"#0 \/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/DBALException.php(116): Doctrine\DBAL\Driver\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException))\n#1 \/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/Statement.php(172): Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOMySql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'UPDATE oc_file...', Array)\\n#2 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/db\\\/statementwrapper.php(73): Doctrine\\\\DBAL\\\\Statement->execute(Array)\\n#3 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/db.php(147): OC_DB_StatementWrapper->execute(Array)\\n#4 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/files\\\/cache\\\/cache.php(300): OC_DB::executeAudited('UPDATE*PREFIX...', Array)\n#5 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/cache\/changepropagator.php(79): OC\Files\Cache\Cache->update(2511469, Array)\n#6 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/cache\/updater.php(117): OC\Files\Cache\ChangePropagator->propagateChanges(NULL)\n#7 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/view.php(1038): OC\Files\Cache\Updater->update('\/thumbnails\/251...')\n#8 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/view.php(585): OC\Files\View->basicOperation('file_put_conten...', 'thumbnails\/2517...', Array, '\x89PNG\r\n\x1A\n\x00\x00\x00\rIHD...')\n#9 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(1048): OC\Files\View->file_put_contents('thumbnails\/2517...', '\x89PNG\r\n\x1A\n\x00\x00\x00\rIHD...')\n#10 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(894): OC\Preview->storePreview(2517738, 32, 32)\n#11 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(1162): OC\Preview->resizeAndStore(2517738)\n#12 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(770): OC\Preview->generatePreview(2517738)\n#13 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(795): OC\Preview->getPreview()\n#14 \/var\/www\/owncloud-8.2.0\/core\/ajax\/preview.php(63): OC\Preview->showPreview()\n#15 \/var\/www\/owncloud-8.2.0\/lib\/private\/route\/route.php(154) : runtime-created function(1): require_once('\/var\/www\/ownclo...')\n#16 [internal function]: __lambda_func(Array)\n#17 \/var\/www\/owncloud-8.2.0\/lib\/private\/route\/router.php(282): call_user_func('\x00lambda_83', Array)\n#18 \/var\/www\/owncloud-8.2.0\/lib\/base.php(831): OC\Route\Router->match('\/core\/preview.p...')\n#19 \/var\/www\/owncloud-8.2.0\/index.php(39): OC::handleRequest()\n#20 {main}","File":"\/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/Driver\/AbstractMySQLDriver.php","Line":115}","level":3,"time":"2015-10-21T12:21:41+02:00"}

Contributor

michi-roth commented Oct 21, 2015

I just upgraded to OC 8.2.0 and got the same error again. The updated file is shown in the web interface but no client downloads the changes because the etag of the parentfolder is not changed.

The oc_file_locks table is still in the database and the admin page does not show the message for the activated transactional locking. But I don't think that this has something to do with the error while updating the database. What is ownCloud doing when a DB transaction fails? I would assume that ownCloud retries to change the DB but I doesn't look so.

The exact error message:

{"reqId":"DEPld/dVX2BKKGWwRuGK","remoteAddr":"137.250.18.11","app":"index","message":"Exception: {"Exception":"Doctrine\DBAL\Exception\DriverException","Message":"An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1445422901, \"56276735aadd0\", 1445422901, \"56276735aadd0\", 2511469]:\n\nSQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","Code":0,"Trace":"#0 \/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/DBALException.php(116): Doctrine\DBAL\Driver\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException))\n#1 \/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/Statement.php(172): Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOMySql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'UPDATE oc_file...', Array)\\n#2 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/db\\\/statementwrapper.php(73): Doctrine\\\\DBAL\\\\Statement->execute(Array)\\n#3 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/db.php(147): OC_DB_StatementWrapper->execute(Array)\\n#4 \\\/var\\\/www\\\/owncloud-8.2.0\\\/lib\\\/private\\\/files\\\/cache\\\/cache.php(300): OC_DB::executeAudited('UPDATE*PREFIX...', Array)\n#5 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/cache\/changepropagator.php(79): OC\Files\Cache\Cache->update(2511469, Array)\n#6 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/cache\/updater.php(117): OC\Files\Cache\ChangePropagator->propagateChanges(NULL)\n#7 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/view.php(1038): OC\Files\Cache\Updater->update('\/thumbnails\/251...')\n#8 \/var\/www\/owncloud-8.2.0\/lib\/private\/files\/view.php(585): OC\Files\View->basicOperation('file_put_conten...', 'thumbnails\/2517...', Array, '\x89PNG\r\n\x1A\n\x00\x00\x00\rIHD...')\n#9 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(1048): OC\Files\View->file_put_contents('thumbnails\/2517...', '\x89PNG\r\n\x1A\n\x00\x00\x00\rIHD...')\n#10 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(894): OC\Preview->storePreview(2517738, 32, 32)\n#11 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(1162): OC\Preview->resizeAndStore(2517738)\n#12 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(770): OC\Preview->generatePreview(2517738)\n#13 \/var\/www\/owncloud-8.2.0\/lib\/private\/preview.php(795): OC\Preview->getPreview()\n#14 \/var\/www\/owncloud-8.2.0\/core\/ajax\/preview.php(63): OC\Preview->showPreview()\n#15 \/var\/www\/owncloud-8.2.0\/lib\/private\/route\/route.php(154) : runtime-created function(1): require_once('\/var\/www\/ownclo...')\n#16 [internal function]: __lambda_func(Array)\n#17 \/var\/www\/owncloud-8.2.0\/lib\/private\/route\/router.php(282): call_user_func('\x00lambda_83', Array)\n#18 \/var\/www\/owncloud-8.2.0\/lib\/base.php(831): OC\Route\Router->match('\/core\/preview.p...')\n#19 \/var\/www\/owncloud-8.2.0\/index.php(39): OC::handleRequest()\n#20 {main}","File":"\/var\/www\/owncloud-8.2.0\/3rdparty\/doctrine\/dbal\/lib\/Doctrine\/DBAL\/Driver\/AbstractMySQLDriver.php","Line":115}","level":3,"time":"2015-10-21T12:21:41+02:00"}

@nickvergessen

This comment has been minimized.

Show comment
Hide comment
@nickvergessen

nickvergessen Nov 17, 2015

Contributor

Duplicate of #20226 and #20555 ?

Contributor

nickvergessen commented Nov 17, 2015

Duplicate of #20226 and #20555 ?

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Nov 17, 2015

Contributor

#20555 refers to the oc_file_locks table whereas our problem occurs in the oc_filecache table. Can this to problems be related?

Contributor

michi-roth commented Nov 17, 2015

#20555 refers to the oc_file_locks table whereas our problem occurs in the oc_filecache table. Can this to problems be related?

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Dec 22, 2015

Member

Actually with the new locking this should not happen any more.

But if you observed it on 8.2.0 where the new locking exists, then there might be an additional issue.

Member

PVince81 commented Dec 22, 2015

Actually with the new locking this should not happen any more.

But if you observed it on 8.2.0 where the new locking exists, then there might be an additional issue.

@PVince81 PVince81 added this to the 9.0-current milestone Dec 22, 2015

@catborise

This comment has been minimized.

Show comment
Hide comment
@catborise

catborise Feb 1, 2016

Hi,
i am using 8.2.2 and windows client 2.1.0

I add a file to sync folder which is a big file about 4Gb.
Firstly, application try to sync that file and give error like insufficent space. Then i change the quota and change the file name...

Then every sync operation gives

Jul 17 09:29:02 sdcoca15 ownCloud[24989]: {remote} An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1437118142, "55a8aebee9f3a", 1437118142, "55a8aebee9f3a", "11251183"]:#12#012SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction

error.

i am not using redis cache, only php cache...

catborise commented Feb 1, 2016

Hi,
i am using 8.2.2 and windows client 2.1.0

I add a file to sync folder which is a big file about 4Gb.
Firstly, application try to sync that file and give error like insufficent space. Then i change the quota and change the file name...

Then every sync operation gives

Jul 17 09:29:02 sdcoca15 ownCloud[24989]: {remote} An exception occurred while executing 'UPDATE oc_filecache SET mtime = ?, etag=? WHERE (mtime <> ? OR etag <> ? ) AND fileid = ? ' with params [1437118142, "55a8aebee9f3a", 1437118142, "55a8aebee9f3a", "11251183"]:#12#012SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction

error.

i am not using redis cache, only php cache...

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Feb 12, 2016

Member

Please update to 8.2.2, it solves a few locking issues when using DB locking (like stray locks)

@catborise you seem to have a different issue, so you say it always happen when hitting insufficent space ? Maybe that code path somehow doesn't free the database table ?

Member

PVince81 commented Feb 12, 2016

Please update to 8.2.2, it solves a few locking issues when using DB locking (like stray locks)

@catborise you seem to have a different issue, so you say it always happen when hitting insufficent space ? Maybe that code path somehow doesn't free the database table ?

@PVince81 PVince81 added the sev2-high label Feb 12, 2016

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Feb 17, 2016

Contributor

Hello everybody,
we updated to 9.0RC1 on our test system and are still getting the problem.
We are using redis and mariadb.

{"reqId":"SuM+doLziYR6e8RC6p6c","remoteAddr":"xxx.xxx.xxx.xxx","app":"webdav","message":"Exception: {\"Message\":\"An exception occurred while executing 'UPDATE `oc_filecache` SET `mtime` = ?, `etag` = ?, `size`=? WHERE (`mtime` <> ? OR `etag` <> ? OR `size` <> ? OR `mtime` IS NULL OR `etag` IS NULL OR `size` IS NULL) AND `fileid` = ? ' with params [1455720965, \\\"56c48a05b0476\\\", -1, 1455720965, \\\"56c48a05b0476\\\", -1, 2334640]:

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction\",\"Exception\":\"Doctrine\DBAL\Exception\DriverException\",\"Code\":0,\"Trace\":\"#0 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php(116): Doctrine\DBAL\Driver\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException))
#1 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Connection.php(836): Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOMySql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'UPDATE `oc_file...', Array)
#2 lib/private/db/connection.php(183): Doctrine\DBAL\Connection->executeQuery('UPDATE `oc_file...', Array, Array, NULL)
#3 lib/private/files/cache/cache.php(308): OC\DB\Connection->executeQuery('UPDATE `*PREFIX...', Array)
#4 lib/private/files/cache/propagator.php(67): OC\Files\Cache\Cache->update(2334640, Array)
#5 lib/private/files/view.php(1242): OC\Files\Cache\Propagator->propagateChange('.local/share/ev...', 1455720965)
#6 lib/private/files/view.php(1337): OC\Files\View->getCacheEntry(Object(OC\Files\Storage\Wrapper\Encryption), '.local/share/ev...', '/home/.local/sh...')
#7 apps/dav/lib/connector/sabre/directory.php(225): OC\Files\View->getDirectoryContent('/home/.local/sh...')
#8 3rdparty/sabre/dav/lib/DAV/Tree.php(195): OCA\DAV\Connector\Sabre\Directory->getChildren()
#9 3rdparty/sabre/dav/lib/DAV/Server.php(857): Sabre\DAV\Tree->getChildren('home/.local/sha...')
#10 3rdparty/sabre/dav/lib/DAV/Server.php(911): Sabre\DAV\Server->addPathNodesRecursively(Array, Object(Sabre\DAV\PropFind))
#11 3rdparty/sabre/dav/lib/DAV/CorePlugin.php(336): Sabre\DAV\Server->getPropertiesForPath('home/.local/sha...', Array, 1)
#12 [internal function]: Sabre\DAV\CorePlugin->httpPropFind(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
#13 3rdparty/sabre/event/lib/EventEmitterTrait.php(105): call_user_func_array(Array, Array)
#14 3rdparty/sabre/dav/lib/DAV/Server.php(459): Sabre\Event\EventEmitter->emit('method:PROPFIND', Array)
#15 3rdparty/sabre/dav/lib/DAV/Server.php(248): Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
#16 apps/dav/appinfo/v1/webdav.php(54): Sabre\DAV\Server->exec()
#17 remote.php(137): require_once('/var/www/ownclo...')
#18 {main}\",\"File\":\"3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Driver/AbstractMySQLDriver.php\",\"Line\":115}","level":4,"time":"2016-02-17T14:56:05+00:00"}
Contributor

michi-roth commented Feb 17, 2016

Hello everybody,
we updated to 9.0RC1 on our test system and are still getting the problem.
We are using redis and mariadb.

{"reqId":"SuM+doLziYR6e8RC6p6c","remoteAddr":"xxx.xxx.xxx.xxx","app":"webdav","message":"Exception: {\"Message\":\"An exception occurred while executing 'UPDATE `oc_filecache` SET `mtime` = ?, `etag` = ?, `size`=? WHERE (`mtime` <> ? OR `etag` <> ? OR `size` <> ? OR `mtime` IS NULL OR `etag` IS NULL OR `size` IS NULL) AND `fileid` = ? ' with params [1455720965, \\\"56c48a05b0476\\\", -1, 1455720965, \\\"56c48a05b0476\\\", -1, 2334640]:

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction\",\"Exception\":\"Doctrine\DBAL\Exception\DriverException\",\"Code\":0,\"Trace\":\"#0 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/DBALException.php(116): Doctrine\DBAL\Driver\AbstractMySQLDriver->convertException('An exception oc...', Object(Doctrine\DBAL\Driver\PDOException))
#1 3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Connection.php(836): Doctrine\DBAL\DBALException::driverExceptionDuringQuery(Object(Doctrine\DBAL\Driver\PDOMySql\Driver), Object(Doctrine\DBAL\Driver\PDOException), 'UPDATE `oc_file...', Array)
#2 lib/private/db/connection.php(183): Doctrine\DBAL\Connection->executeQuery('UPDATE `oc_file...', Array, Array, NULL)
#3 lib/private/files/cache/cache.php(308): OC\DB\Connection->executeQuery('UPDATE `*PREFIX...', Array)
#4 lib/private/files/cache/propagator.php(67): OC\Files\Cache\Cache->update(2334640, Array)
#5 lib/private/files/view.php(1242): OC\Files\Cache\Propagator->propagateChange('.local/share/ev...', 1455720965)
#6 lib/private/files/view.php(1337): OC\Files\View->getCacheEntry(Object(OC\Files\Storage\Wrapper\Encryption), '.local/share/ev...', '/home/.local/sh...')
#7 apps/dav/lib/connector/sabre/directory.php(225): OC\Files\View->getDirectoryContent('/home/.local/sh...')
#8 3rdparty/sabre/dav/lib/DAV/Tree.php(195): OCA\DAV\Connector\Sabre\Directory->getChildren()
#9 3rdparty/sabre/dav/lib/DAV/Server.php(857): Sabre\DAV\Tree->getChildren('home/.local/sha...')
#10 3rdparty/sabre/dav/lib/DAV/Server.php(911): Sabre\DAV\Server->addPathNodesRecursively(Array, Object(Sabre\DAV\PropFind))
#11 3rdparty/sabre/dav/lib/DAV/CorePlugin.php(336): Sabre\DAV\Server->getPropertiesForPath('home/.local/sha...', Array, 1)
#12 [internal function]: Sabre\DAV\CorePlugin->httpPropFind(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
#13 3rdparty/sabre/event/lib/EventEmitterTrait.php(105): call_user_func_array(Array, Array)
#14 3rdparty/sabre/dav/lib/DAV/Server.php(459): Sabre\Event\EventEmitter->emit('method:PROPFIND', Array)
#15 3rdparty/sabre/dav/lib/DAV/Server.php(248): Sabre\DAV\Server->invokeMethod(Object(Sabre\HTTP\Request), Object(Sabre\HTTP\Response))
#16 apps/dav/appinfo/v1/webdav.php(54): Sabre\DAV\Server->exec()
#17 remote.php(137): require_once('/var/www/ownclo...')
#18 {main}\",\"File\":\"3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Driver/AbstractMySQLDriver.php\",\"Line\":115}","level":4,"time":"2016-02-17T14:56:05+00:00"}
@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Feb 17, 2016

Member

Some cleanup background jobs can also cause the DB to be locked for too long: #22243

Member

PVince81 commented Feb 17, 2016

Some cleanup background jobs can also cause the DB to be locked for too long: #22243

@cmonteroluque cmonteroluque added this to the 9.0.1-next-maintenance milestone Feb 23, 2016

@cmonteroluque cmonteroluque removed this from the 9.0-current milestone Feb 23, 2016

@MorrisJobke MorrisJobke modified the milestones: 9.0.2-current-maintenance, 9.0.1 Apr 11, 2016

@MorrisJobke

This comment has been minimized.

Show comment
Hide comment
@MorrisJobke

MorrisJobke Apr 11, 2016

Member

But if you observed it on 8.2.0 where the new locking exists, then there might be an additional issue.

8.2.2: https://github.com/owncloud/enterprise/issues/1150

Member

MorrisJobke commented Apr 11, 2016

But if you observed it on 8.2.0 where the new locking exists, then there might be an additional issue.

8.2.2: https://github.com/owncloud/enterprise/issues/1150

@MorrisJobke

This comment has been minimized.

Show comment
Hide comment
@MorrisJobke

MorrisJobke Apr 11, 2016

Member

@PVince81 @icewind1991 I could have a look at this instance - what are we looking for? What information is needed to debug this?

Member

MorrisJobke commented Apr 11, 2016

@PVince81 @icewind1991 I could have a look at this instance - what are we looking for? What information is needed to debug this?

@PVince81 PVince81 assigned PVince81 and unassigned icewind1991 Apr 19, 2016

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 21, 2016

Member

@MorrisJobke not sure. It might depend for what query it happens, then need to find out why it happens in the first place.

Member

PVince81 commented Apr 21, 2016

@MorrisJobke not sure. It might depend for what query it happens, then need to find out why it happens in the first place.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

Looking at @michi-roth's stack trace here #14757 (comment) it looks like the file was modified outside of ownCloud, and at the time of the PROPFIND ownCloud noticed the change and tried to update the cache and propagate it.

@michi-roth did you explicitly modify files outside of ownCloud, directly in the data folder ? If not, then this might be a side-effect of another issue.
Or potentially an occurrence of a similar issue before where the exception happened after updating the file on-disk but before the cache could be updated.

Member

PVince81 commented Apr 22, 2016

Looking at @michi-roth's stack trace here #14757 (comment) it looks like the file was modified outside of ownCloud, and at the time of the PROPFIND ownCloud noticed the change and tried to update the cache and propagate it.

@michi-roth did you explicitly modify files outside of ownCloud, directly in the data folder ? If not, then this might be a side-effect of another issue.
Or potentially an occurrence of a similar issue before where the exception happened after updating the file on-disk but before the cache could be updated.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

So far there are only to possibly long-running transactions:

Normally as I said before, the new file locking from 8.2/9.0 should be able to prevent the DB code to be reached in the first place.

Let's see whether there are some code paths that slip through the cracks and still produce this condition.

Member

PVince81 commented Apr 22, 2016

So far there are only to possibly long-running transactions:

Normally as I said before, the new file locking from 8.2/9.0 should be able to prevent the DB code to be reached in the first place.

Let's see whether there are some code paths that slip through the cracks and still produce this condition.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

I found this fix 26939a2 but it seems that it was already in 9.0 RC1, where the report has been reported again.

Member

PVince81 commented Apr 22, 2016

I found this fix 26939a2 but it seems that it was already in 9.0 RC1, where the report has been reported again.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

@michi-roth can you check whether you have a setting "file_system_check_changes" in your config.php ? Your last stack trace seems to imply that you have explicitly set this value to 1, while the default is 0. Usually setting it to 0 would solve the issue.

Member

PVince81 commented Apr 22, 2016

@michi-roth can you check whether you have a setting "file_system_check_changes" in your config.php ? Your last stack trace seems to imply that you have explicitly set this value to 1, while the default is 0. Usually setting it to 0 would solve the issue.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

For now I'll assume that you had "filesystem_check_changes" set to 1. Or maybe it was an external storage for which update detection is enabled anyway, so the situation is likely to happen there too.

Here is my first test run:

  1. Set "filesystem_check_changes" to 1 in config.php
  2. Create a folder "test" in the web UI
  3. Upload about 70 files into the "test" folder
  4. In the source code, add sleep(10) here in the loop https://github.com/owncloud/core/blob/v9.0.1/lib/private/files/cache/cache.php#L521
  5. Rename "test" to "test2" in the web UI.
  6. Set breakpoint in https://github.com/owncloud/core/blob/v9.0.1/lib/private/files/cache/scanner.php#L364
  7. On the CLI with cadaver, do a "ls" on the root folder, the one containing the "test" folder

The sleep call will make sure that every entry in the folder takes 10 to be renamed in the database.
Then when doing the "ls" / PROPFIND, continued debugging until the scanner tries to scan "files/test2". At this point, observe that the lock is preventing it to enter the folder and returns the old cache data.

This test did NOT reproduce the issue and showed that locking worked properly so far.

Member

PVince81 commented Apr 22, 2016

For now I'll assume that you had "filesystem_check_changes" set to 1. Or maybe it was an external storage for which update detection is enabled anyway, so the situation is likely to happen there too.

Here is my first test run:

  1. Set "filesystem_check_changes" to 1 in config.php
  2. Create a folder "test" in the web UI
  3. Upload about 70 files into the "test" folder
  4. In the source code, add sleep(10) here in the loop https://github.com/owncloud/core/blob/v9.0.1/lib/private/files/cache/cache.php#L521
  5. Rename "test" to "test2" in the web UI.
  6. Set breakpoint in https://github.com/owncloud/core/blob/v9.0.1/lib/private/files/cache/scanner.php#L364
  7. On the CLI with cadaver, do a "ls" on the root folder, the one containing the "test" folder

The sleep call will make sure that every entry in the folder takes 10 to be renamed in the database.
Then when doing the "ls" / PROPFIND, continued debugging until the scanner tries to scan "files/test2". At this point, observe that the lock is preventing it to enter the folder and returns the old cache data.

This test did NOT reproduce the issue and showed that locking worked properly so far.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

Side note: the web UI's regular "getstoragestats.php" call will also trigger a scan while trying to computer the used space. That one also successfully was excluded thanks to the lock.

Member

PVince81 commented Apr 22, 2016

Side note: the web UI's regular "getstoragestats.php" call will also trigger a scan while trying to computer the used space. That one also successfully was excluded thanks to the lock.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

Next step: try the reverse condition: have the scanner run first and then try and rename.

Member

PVince81 commented Apr 22, 2016

Next step: try the reverse condition: have the scanner run first and then try and rename.

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 22, 2016

Member

So far I could not reproduce this exact issues. All my attempts at reproducing it lead to a deadlock on oc_file_lock, but not oc_filecache, which were documented here: #20555 (comment)

Member

PVince81 commented Apr 22, 2016

So far I could not reproduce this exact issues. All my attempts at reproducing it lead to a deadlock on oc_file_lock, but not oc_filecache, which were documented here: #20555 (comment)

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Apr 25, 2016

Member

I'm out of ideas for this and was unable to reproduce the deadlock on oc_filecache.
With a bit of luck the fix from #20555 (comment) might help mitigate/eliminate this case altogether.

Moving to 9.0.3 to keep in scope, and hopefully you guys will have upgraded to 9.0.2 and reported back whether the fix worked.

Thanks!

Member

PVince81 commented Apr 25, 2016

I'm out of ideas for this and was unable to reproduce the deadlock on oc_filecache.
With a bit of luck the fix from #20555 (comment) might help mitigate/eliminate this case altogether.

Moving to 9.0.3 to keep in scope, and hopefully you guys will have upgraded to 9.0.2 and reported back whether the fix worked.

Thanks!

@PVince81 PVince81 modified the milestones: 9.0.3-next-maintenance, 9.0.2-current-maintenance Apr 25, 2016

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Jun 2, 2016

Member

Is anyone still seeing this in 9.0.2 ?
Also make sure to have the proper transaction isolation mode: https://doc.owncloud.org/server/9.0/admin_manual/configuration_database/linux_database_configuration.html#mysql-mariadb-with-binary-logging-enabled

Member

PVince81 commented Jun 2, 2016

Is anyone still seeing this in 9.0.2 ?
Also make sure to have the proper transaction isolation mode: https://doc.owncloud.org/server/9.0/admin_manual/configuration_database/linux_database_configuration.html#mysql-mariadb-with-binary-logging-enabled

@michi-roth

This comment has been minimized.

Show comment
Hide comment
@michi-roth

michi-roth Jun 3, 2016

Contributor

We were able to find the reason for the error in our configuration. The problem was the MariaDB Galera cluster when write requests were sent to all servers deadlock can occur. The solution was to send all write requests to a single server.
A more detailed explanation can be found under:
http://severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads

Contributor

michi-roth commented Jun 3, 2016

We were able to find the reason for the error in our configuration. The problem was the MariaDB Galera cluster when write requests were sent to all servers deadlock can occur. The solution was to send all write requests to a single server.
A more detailed explanation can be found under:
http://severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads

@centredecalcul

This comment has been minimized.

Show comment
Hide comment
@centredecalcul

centredecalcul Jun 8, 2016

Verified ! No error since I changed my HAproxy configuration, thank you !

centredecalcul commented Jun 8, 2016

Verified ! No error since I changed my HAproxy configuration, thank you !

@PVince81

This comment has been minimized.

Show comment
Hide comment
@PVince81

PVince81 Jun 17, 2016

Member

@michi-roth thanks a lot for the hint. I raised a ticket for the documentation, maybe having it as a hint there would help other people with this issue: owncloud/documentation#2490

Member

PVince81 commented Jun 17, 2016

@michi-roth thanks a lot for the hint. I raised a ticket for the documentation, maybe having it as a hint there would help other people with this issue: owncloud/documentation#2490

@PVince81 PVince81 closed this Jun 17, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment