IllegalMonitorStateException in Transaction handling over HTTP #630

Closed
abrokenjester opened this issue Oct 26, 2016 · 7 comments

@abrokenjester
Contributor

abrokenjester commented Oct 26, 2016

During transactions executed over HTTP, an exception can occur when commit is invoked.

Stacktrace:

org.eclipse.rdf4j.http.server.ServerHTTPException: Transaction handling error: null
        at org.eclipse.rdf4j.http.server.repository.transaction.TransactionController.processModificationOperation(TransactionController.java:311)
        at org.eclipse.rdf4j.http.server.repository.transaction.TransactionController.handleRequestInternal(TransactionController.java:194)
        at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:147)
        at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:50)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
        at org.springframework.web.servlet.FrameworkServlet.doPut(FrameworkServlet.java:883)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:651)

....

Caused by: java.lang.IllegalMonitorStateException: null
        at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
        at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457)
        at org.eclipse.rdf4j.sail.nativerdf.NativeSailStore$NativeSailSink.close(NativeSailStore.java:341)
        at org.eclipse.rdf4j.sail.nativerdf.SailSourceModel.dataset(SailSourceModel.java:397)
        at org.eclipse.rdf4j.sail.nativerdf.SailSourceModel.iterator(SailSourceModel.java:299)
        at org.eclipse.rdf4j.sail.nativerdf.MemoryOverflowModel.iterator(MemoryOverflowModel.java:128)
        at org.eclipse.rdf4j.sail.base.SailSourceBranch.flush(SailSourceBranch.java:536)
        at org.eclipse.rdf4j.sail.base.SailSourceBranch.flush(SailSourceBranch.java:492)
        at org.eclipse.rdf4j.sail.base.SailSourceBranch.flush(SailSourceBranch.java:275)
        at org.eclipse.rdf4j.sail.base.SailSourceBranch.autoFlush(SailSourceBranch.java:359)
        at org.eclipse.rdf4j.sail.base.SailSourceBranch$2.close(SailSourceBranch.java:221)
        at org.eclipse.rdf4j.sail.base.SailSourceConnection.endUpdateInternal(SailSourceConnection.java:611)
        at org.eclipse.rdf4j.sail.helpers.AbstractSailConnection.endUpdate(AbstractSailConnection.java:573)
        at org.eclipse.rdf4j.sail.helpers.AbstractSailConnection.flush(AbstractSailConnection.java:374)
        at org.eclipse.rdf4j.sail.helpers.SailConnectionWrapper.flush(SailConnectionWrapper.java:247)
        at org.eclipse.rdf4j.repository.sail.SailRepositoryConnection.commit(SailRepositoryConnection.java:167)
        at org.eclipse.rdf4j.http.server.repository.transaction.TransactionController.processModificationOperation(TransactionController.java:293)

The problem is caused by the fact that NativeSailStore keeps an internal ReentrantLock, and that (due to the way the TransactionController is set up) this lock may be released by a different thread than the thread that originally acquired it.

This bug was likely introduced by fixes for issue #293 and associated cleanup.
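
To illustrate the underlying JDK behaviour (a standalone sketch, not rdf4j code): a ReentrantLock that one thread acquires and another thread tries to release throws exactly this exception.

    import java.util.concurrent.locks.ReentrantLock;

    public class CrossThreadUnlock {

        public static void main(String[] args) throws InterruptedException {
            ReentrantLock lock = new ReentrantLock();

            // Thread A acquires the lock (analogous to the request thread
            // that starts the transaction)
            Thread locker = new Thread(lock::lock);
            locker.start();
            locker.join();

            // A different thread (here: main, analogous to the thread
            // handling the commit request) attempts to release it
            lock.unlock(); // throws java.lang.IllegalMonitorStateException
        }
    }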

@abrokenjester abrokenjester added the 🐞 bug issue is a bug label Oct 26, 2016
@abrokenjester
Contributor Author

@ansell in NativeSailStore you introduced, as part of #293, this use of a variable nextTxnLockAcquired as an additional check around the locking and unlocking of the ReentrantLock:

    public synchronized void close() {
        boolean nextTxnLockAcquired = txnLockAcquired;
        txnLockAcquired = false;
        if (nextTxnLockAcquired) {
            txnLockManager.unlock();
        }
    }

and

    private synchronized void acquireExclusiveTransactionLock()
        throws SailException
    {
        boolean nextTxnLockAcquired = txnLockAcquired;
        if (!nextTxnLockAcquired) {
            txnLockManager.lock();
            try {
                if (txnLockManager.getHoldCount() == 1) {
                    // first object
                    tripleStore.startTransaction();
                }
                nextTxnLockAcquired = txnLockAcquired = true;
            }
            catch (IOException e) {
                throw new SailException(e);
            }
            finally {
                if (!nextTxnLockAcquired) {
                    txnLockManager.unlock();
                }
            }
        }
    }

Can you briefly talk me through why this extra boolean is necessary? I get the feeling the bug we encounter is caused by this extra check: some thread doesn't release the lock, which causes another thread later on in the transaction to fail when trying to release it.

@ansell
Contributor

ansell commented Oct 26, 2016

The reasoning behind why I refactored it to match the pattern used in other cases is this: if you are going to guard on a volatile value locally in a threaded situation, you should take a local copy of it, to ensure the guard value matches the value you see inside the guarded block. That doesn't necessarily apply to synchronized methods, but there may still be side effects from non-synchronized methods that become more deterministic with a safe guard pattern.

I may have made a mistake but I can't see it right now.

The key to the pattern here appears to be that the unlock should only successfully complete if either getHoldCount was a value other than 1 (not sure what values those could be), or optionally if both getHoldCount was 1 and the transaction was started successfully.

I don't fully understand the significance of getHoldCount: why it should pass through to get an exclusive lock if getHoldCount was a value other than 1, and perhaps more importantly why it could be a value other than 1 at all, given that it is checked from within a synchronized method.

The fact that it was all occurring within a synchronized method seemed to mean that if there was a bug, it was likely to be outside of that method, and the tests never started to fail over the time period that I was working through it, or in the weeks since. The local variable assignments are no-ops within each synchronized method, but they fit the general pattern after the refactoring, so that their effects are easier to reason about if someone touched the variable outside of a synchronized method. That seems to point to outside interference with that variable, which was previously being allowed for in a part of the pattern that I don't understand.
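
For reference, ReentrantLock.getHoldCount() reports how many times the current thread has acquired the lock without releasing it, so a value other than 1 simply indicates a reentrant (nested) acquisition by the same thread. A standalone illustration:

    import java.util.concurrent.locks.ReentrantLock;

    public class HoldCountDemo {

        public static void main(String[] args) {
            ReentrantLock txnLock = new ReentrantLock();

            txnLock.lock();
            System.out.println(txnLock.getHoldCount()); // 1: outermost acquisition

            txnLock.lock();                             // reentrant acquisition by the same thread
            System.out.println(txnLock.getHoldCount()); // 2: nested acquisition

            txnLock.unlock();
            txnLock.unlock();                           // fully released only once the count drops to 0
        }
    }

In acquireExclusiveTransactionLock() above, the getHoldCount() == 1 check thus ensures startTransaction() runs only on the outermost acquisition.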

@abrokenjester
Contributor Author

I believe there may be quite a bit of legacy here confusing the issue. The variable txnLockAcquired was originally introduced by me in NativeStoreConnection, before the entire native store was refactored for optimistic locking. I get the feeling the mechanism as currently in place is redundant. I'll take a closer look.

@abrokenjester
Contributor Author

I've done some simplifying, essentially just ripping this boolean variable out completely, and instead relying on ReentrantLock.isHeldByCurrentThread() to verify locking state. Will put up a PR soon (currently waiting for the build tests to complete); can you eyeball it and see if it makes sense?
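
A rough sketch of that direction (illustrative only, not the actual patch; field names follow the snippets quoted above): the boolean is dropped and the lock itself is asked whether the current thread holds it.

    // Illustrative sketch, not the actual rdf4j change.
    public synchronized void close() {
        // only release if this thread actually holds the lock
        if (txnLockManager.isHeldByCurrentThread()) {
            txnLockManager.unlock();
        }
    }

    private synchronized void acquireExclusiveTransactionLock()
        throws SailException
    {
        if (!txnLockManager.isHeldByCurrentThread()) {
            txnLockManager.lock();
            try {
                tripleStore.startTransaction();
            }
            catch (IOException e) {
                txnLockManager.unlock();
                throw new SailException(e);
            }
        }
    }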

abrokenjester added a commit that referenced this issue Oct 26, 2016
…tLock object, removing separate boolean

Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
abrokenjester added a commit that referenced this issue Oct 26, 2016
Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
abrokenjester added a commit that referenced this issue Oct 26, 2016
Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
@abrokenjester
Contributor Author

I've decided to start again: instead of treating this as a bug in the native store, I should look at this as a fundamental flaw in the TransactionController. RepositoryConnections are not guaranteed to be thread-safe, yet the TransactionController still lets multiple threads operate on the same connection.

A possible fix is to let TransactionController create dedicated threads, one for each transaction, and ensure that that same thread is used for all incoming operations on that transaction.

@ansell
Contributor

ansell commented Oct 27, 2016

In the distant past I added synchronized statements in a few places where the repository connections were used, such as:

https://github.com/eclipse/rdf4j/blob/master/core/http/server-spring/src/main/java/org/eclipse/rdf4j/http/server/repository/RepositoryController.java#L155

Is it possible there are new locations that have popped up since then that also need synchronized statements around them?
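
The guard pattern referred to is, roughly, serializing access to the shared connection object (a sketch only; the linked RepositoryController line has the real code, and the names here are hypothetical):

    // Illustrative only: "repositoryCon" and "operation" are hypothetical.
    void handleRequest(RepositoryConnection repositoryCon, Runnable operation) {
        synchronized (repositoryCon) {
            // only one request thread at a time may use the
            // non-thread-safe connection
            operation.run();
        }
    }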

@abrokenjester
Contributor Author

Possibly, but the problem is more fundamental than that. The REST API supports transactions by means of individual PUT/POST operations against a single "transaction resource". Since these are multiple HTTP requests that need to be mapped to the same transaction, the TransactionController uses a cache (the ActiveTransactionRegistry) to get a RepositoryConnection object for the identified transaction.

This is all fine and dandy, but the problem is that each new HTTP request runs on a new thread! So we get that Thread A starts the transaction, Thread B then adds some data, and finally Thread C commits. It all happens sequentially (that is why there is so much locking going on in the ActiveTransactionRegistry), but at the end of the day they're different threads, and NativeStoreConnection can't handle that: internally it uses ReentrantLocks for the duration of a transaction, and those can only be unlocked by the same thread that locked them in the first place.

So I think the solution is that the TransactionController spawns a new, dedicated Thread for each new transaction, and keeps that Thread alive until the transaction is complete, making sure that each transaction operation is executed on that thread, instead of on the HTTP request worker thread itself.

I'm currently working on a simple redesign using an Executors.newSingleThreadExecutor() per transaction.
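
A minimal sketch of that design, assuming a hypothetical Transaction wrapper (the names are illustrative, not the eventual rdf4j implementation): every operation on a transaction is funneled through its own single-thread executor, so locks are always acquired and released by one and the same thread.

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical sketch of the proposed design, not the actual rdf4j code.
    class Transaction {

        // One dedicated thread per transaction: all operations run on it, so
        // the native store's ReentrantLocks are locked and unlocked by the
        // same thread.
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        <T> T execute(Callable<T> operation) throws Exception {
            Future<T> result = executor.submit(operation);
            return result.get(); // HTTP worker thread blocks until the operation completes
        }

        void close() {
            executor.shutdown();
        }
    }

An HTTP worker handling a commit would then call something like txn.execute(() -> { connection.commit(); return null; }) rather than invoking the connection directly.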

@abrokenjester abrokenjester self-assigned this Oct 27, 2016
abrokenjester added a commit that referenced this issue Oct 28, 2016
…d txn threads

Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
abrokenjester added a commit that referenced this issue Oct 28, 2016
Transaction.close now ensures a second call is silently ignored

Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
abrokenjester added a commit that referenced this issue Oct 28, 2016
…en checking for active operations

Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
abrokenjester added a commit that referenced this issue Oct 28, 2016
Signed-off-by: Jeen Broekstra <jeen.broekstra@gmail.com>
@abrokenjester abrokenjester added this to the 2.1.1 milestone Oct 28, 2016
abrokenjester added a commit that referenced this issue Oct 31, 2016
abrokenjester added a commit that referenced this issue Oct 31, 2016