pnfsmanager: Fix race leading to transaction failures in Chimera
Motivation:

29 Feb 2016 16:57:34 (PnfsManager) [mCo:6882279:srm2:prepareToPut:-1232458768:-1232458767 SRM PnfsCreateUploadPath] Create upload path failed
org.springframework.jdbc.UncategorizedSQLException: PreparedStatementCallback; uncategorized SQLException for SQL [SELECT ipnfsid,isize,inlink,itype,imode,iuid,igid,iatime,ictime,imtime from path2inodes(?, ?)]; SQL state [25P02]; error code [0]; ERROR: current transaction is aborted, commands ignored until end of transaction block; nested exception is org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:84) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:645) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:680) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:762) ~[spring-jdbc-4.2.4.RELEASE.jar:4.2.4.RELEASE]
    at org.dcache.chimera.PgSQLFsSqlDriver.path2inodes(PgSQLFsSqlDriver.java:198) ~[chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.JdbcFs.path2inodes(JdbcFs.java:633) ~[chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.JdbcFs.path2inodes(JdbcFs.java:626) ~[chimera-2.14.13.jar:2.14.13]
    at sun.reflect.GeneratedMethodAccessor312.invoke(Unknown Source) ~[na:na]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_72]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
    at org.dcache.commons.stats.MonitoringProxy.invoke(MonitoringProxy.java:54) ~[dcache-common-2.14.13.jar:2.14.13]
    at com.sun.proxy.$Proxy33.path2inodes(Unknown Source) ~[na:na]
    at org.dcache.chimera.namespace.ChimeraNameSpaceProvider.pathToInode(ChimeraNameSpaceProvider.java:188) ~[dcache-chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.namespace.ChimeraNameSpaceProvider.lookupDirectory(ChimeraNameSpaceProvider.java:1154) ~[dcache-chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.namespace.ChimeraNameSpaceProvider.installDirectory(ChimeraNameSpaceProvider.java:1145) ~[dcache-chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.namespace.ChimeraNameSpaceProvider.installDirectory(ChimeraNameSpaceProvider.java:1139) ~[dcache-chimera-2.14.13.jar:2.14.13]
    at org.dcache.chimera.namespace.ChimeraNameSpaceProvider.createUploadPath(ChimeraNameSpaceProvider.java:1175) ~[dcache-chimera-2.14.13.jar:2.14.13]
    at diskCacheV111.namespace.PnfsManagerV3.createUploadPath(PnfsManagerV3.java:1107) [dcache-core-2.14.13.jar:2.14.13]

The error is caused by two concurrent uploads trying to create the same target directory. The code tries to recover from the failed mkdir in one of the transactions, but at that point the transaction is already invalid due to the failure.

Modification:

Propagate the error as a LockedCacheException and let SRM retry instead.

Result:

Fixed a race condition between two concurrent uploads to the same non-existing target directory. Symptoms of the race condition were 'PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block' failures in the pnfs manager log. Both the srm and pnfsmanager services need to be updated to effectively resolve the race.
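
Illustration (not part of the patch):

The following is a minimal sketch of the pattern described under Modification, not the actual dCache code. It assumes a hypothetical in-memory Namespace in which each createUploadPath call stands for one database transaction, a local stand-in for LockedCacheException, and a hypothetical createWithRetry helper playing the role of the SRM. When the mkdir loses the race, the operation reports a transient error instead of running further queries in the aborted transaction, and the caller retries in a fresh one.

```java
import java.util.HashSet;
import java.util.Set;

public class UploadPathRetrySketch {

    /** Local stand-in for dCache's LockedCacheException, so the sketch compiles on its own. */
    static class LockedCacheException extends Exception {
        LockedCacheException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory namespace; each createUploadPath call models one transaction. */
    static class Namespace {
        private final Set<String> directories = new HashSet<>();
        private boolean raceOnNextCreate;

        /** Test hook: make the next mkdir lose the race against another upload. */
        synchronized void simulateRaceOnNextCreate() {
            raceOnNextCreate = true;
        }

        synchronized boolean exists(String dir) {
            return directories.contains(dir);
        }

        synchronized void createUploadPath(String dir) throws LockedCacheException {
            if (directories.contains(dir)) {
                return; // lookup found the directory, nothing to create
            }
            if (raceOnNextCreate) {
                raceOnNextCreate = false;
                directories.add(dir); // the competing upload's mkdir committed first
                // Our mkdir failed, so this transaction is unusable: report a
                // transient error instead of issuing further queries in it.
                throw new LockedCacheException("Concurrent creation of " + dir);
            }
            directories.add(dir);
        }
    }

    /** Caller side (the role SRM plays): retry in a new transaction; returns attempts used. */
    static int createWithRetry(Namespace ns, String dir, int maxAttempts)
            throws LockedCacheException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        LockedCacheException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                ns.createUploadPath(dir);
                return attempt;
            } catch (LockedCacheException e) {
                last = e; // transient failure, try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        Namespace ns = new Namespace();
        ns.simulateRaceOnNextCreate();
        int attempts = createWithRetry(ns, "/data/new-dir", 3);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

In this sketch the second attempt succeeds because its lookup already sees the directory created by the competing upload, which is why retrying is sufficient and no recovery inside the failed transaction is needed.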

Target: trunk
Require-notes: yes
Require-book: no
Request: 2.15
Request: 2.14
Request: 2.13
Acked-by: Paul Millar <paul.millar@desy.de>
Patch: https://rb.dcache.org/r/9084/
(cherry picked from commit 77b450d)