Deadlocks between writing threads #329

Closed
JPMoresmau opened this issue Jan 19, 2019 · 21 comments

@JPMoresmau
Collaborator

Unfortunately we ran into deadlock issues again. The setup is as follows: we have multiple threads writing into the DB through the same SqlgGraph. Sometimes these write operations have to modify the underlying schema, and we end up in a deadlock: one thread has done some uncommitted writes, and therefore holds some Postgres locks, and then wants to make schema changes, so it requires the topology lock. At the same time another thread has made some topology changes and holds that lock, but it cannot write its data to Postgres because the table is locked by the first thread. So one thread holds the DB locks and the other the Java locks, and each waits for the other.

Tests to demonstrate the issues

g.traversal().V().hasLabel("s1.v1").drop().iterate();
g.tx().commit();
g.getTopology().getSchema("s1").ifPresent(s -> s.remove(false));
g.tx().commit();

Map<String, Object> m1 = new HashMap<>();
m1.put("name", "name1");
g.addVertex("s1.v1", m1);
g.tx().commit();

CountDownLatch t1Wrote = new CountDownLatch(1);
CountDownLatch t2Wrote = new CountDownLatch(1);

Thread t1 = new Thread(new Runnable() {
    @Override
    public void run() {
        Map<String, Object> m1 = new HashMap<>();
        m1.put("name", "name2");
        g.addVertex("s1.v1", m1);

        t1Wrote.countDown();
        try {
            t2Wrote.await(10, TimeUnit.SECONDS);

            Map<String, Object> m2 = new HashMap<>();
            m2.put("name", "name3");
            m2.put("att1", "val1");
            g.addVertex("s1.v1", m2);

            g.tx().commit();
        } catch (InterruptedException ie) {
            fail(ie.getMessage());
        }
    }
}, "First writer");

Thread t2 = new Thread(new Runnable() {
    @Override
    public void run() {
        try {
            t1Wrote.await();

            Map<String, Object> m1 = new HashMap<>();
            m1.put("name", "name4");
            m1.put("att2", "val2");
            g.addVertex("s1.v1", m1);

            t2Wrote.countDown();

            g.tx().commit();
        } catch (InterruptedException ie) {
            fail(ie.getMessage());
        }
    }
}, "Second writer");

t1.start();
t2.start();

t1.join();
t2.join();

// count().next() actually executes the traversal
assertEquals(4, g.traversal().V().hasLabel("s1.v1").count().next(), 0);

(vertex attributes)
and

g.traversal().V().hasLabel("s1.v1").drop().iterate();
g.traversal().V().hasLabel("s1.v2").drop().iterate();
g.traversal().V().hasLabel("s1.v3").drop().iterate();
g.traversal().V().hasLabel("s1.v4").drop().iterate();
g.tx().commit();
g.getTopology().getSchema("s1").ifPresent(s -> s.remove(false));
g.tx().commit();

Map<String, Object> m1 = new HashMap<>();
m1.put("name", "name1");
Vertex v0 = g.addVertex("s1.v1", m1);
m1 = new HashMap<>();
m1.put("name", "name1");
Vertex v1 = g.addVertex("s1.v2", m1);
v0.addEdge("edge1", v1);
g.tx().commit();

CountDownLatch t1Wrote = new CountDownLatch(1);
CountDownLatch t2Wrote = new CountDownLatch(1);

Thread t1 = new Thread(new Runnable() {
    @Override
    public void run() {
        Map<String, Object> m1 = new HashMap<>();
        m1.put("name", "name2");
        Vertex v2 = g.addVertex("s1.v2", m1);
        v0.addEdge("edge1", v2);

        t1Wrote.countDown();
        try {
            t2Wrote.await(10, TimeUnit.SECONDS);

            Map<String, Object> m2 = new HashMap<>();
            m2.put("name", "name3");
            Vertex v3 = g.addVertex("s1.v3", m2);
            v0.addEdge("edge1", v3);

            g.tx().commit();
        } catch (InterruptedException ie) {
            fail(ie.getMessage());
        }
    }
}, "First writer");

Thread t2 = new Thread(new Runnable() {
    @Override
    public void run() {
        try {
            t1Wrote.await();

            Map<String, Object> m1 = new HashMap<>();
            m1.put("name", "name4");
            Vertex v4 = g.addVertex("s1.v4", m1);
            v0.addEdge("edge1", v4);
            t2Wrote.countDown();

            g.tx().commit();
        } catch (InterruptedException ie) {
            fail(ie.getMessage());
        }
    }
}, "Second writer");

t1.start();
t2.start();

t1.join();
t2.join();

// count().next() actually executes the traversals
assertEquals(1, g.traversal().V().hasLabel("s1.v3").count().next(), 0);
assertEquals(1, g.traversal().V().hasLabel("s1.v4").count().next(), 0);

(edges between vertices)

One of my colleagues suggested replacing the topology lock with either immutable structures or transaction-specific topology items, so that no topology lock would be required during the write transaction; only when the transaction commits would the topology changes become visible to other threads (at that point we could take a lock to update the shared topology and ensure all changes that happened concurrently are properly merged). I'm not sure how workable that is, but I suppose between distributed graphs we have no topology locks anyway.

One workaround is to have several graphs (one graph per thread, no sharing). Is that the way forward, or should we try to fix the locking?
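For reference, a minimal sketch of that first workaround (one SqlgGraph per thread, so the Java topology lock is never shared). The configuration keys and credentials are placeholders, and the imports assume the commons-configuration flavour used by TinkerPop at the time; this is not how we actually wire it up, just the shape of the idea:

import org.apache.commons.configuration.BaseConfiguration;
import org.apache.commons.configuration.Configuration;
import org.umlg.sqlg.structure.SqlgGraph;

public class PerThreadGraphs {

    // Placeholder Sqlg configuration; adjust the jdbc settings to your environment.
    private static Configuration configuration() {
        Configuration c = new BaseConfiguration();
        c.setProperty("jdbc.url", "jdbc:postgresql://localhost:5432/yourdb");
        c.setProperty("jdbc.username", "postgres");
        c.setProperty("jdbc.password", "postgres");
        return c;
    }

    // Each worker thread opens its own SqlgGraph (and therefore its own Topology and
    // topology lock), so the Java lock is never contended between writer threads.
    private static final ThreadLocal<SqlgGraph> GRAPH =
            ThreadLocal.withInitial(() -> SqlgGraph.open(configuration()));

    public static SqlgGraph get() {
        return GRAPH.get();
    }
}

Note this only removes contention on the shared Java lock; the Postgres-level locks are of course still shared between the graphs.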
Another workaround is to make all schema changes upfront, before any write operation, but we won't always know the schema we need in every use case, and we may end up creating a lot of attributes or edges that are never needed (and edge tables that are much wider than necessary).

@pietermartin
Owner

Hi, I have not tried the above code yet, but does it consistently duplicate the deadlock?

@JPMoresmau
Collaborator Author

Yes, the countdown latches ensure we systematically replicate the deadlock: after 10 seconds thread 1 knows thread 2 has changed the topology and tries to acquire the lock.
Maybe we should keep the topology in a thread-local too and reread it from the DB when another thread has changed it...

@pietermartin
Owner

Ok, well it's great that we can duplicate the deadlock now.

I am not sure all sorts of tricks will help, as they might just push the deadlock down to Postgres itself. This is why I moved the lock to Java in the first place: to force topology change statements to only ever happen in one thread/transaction at a time.

I need to look a bit deeper at it. We might have to lock even more aggressively if the current strategy still leads to deadlocks.

Regarding rereading the topology: I'd be hesitant to do that as it is quite slow. Our production DBs are broad, 20 000 - 40 000 tables. It takes a while to populate the topology from the DB.

@pietermartin
Owner

I've had a chance to look at the deadlock a bit.
The way I see it, the second writer should not have been able to take the lock, as it then blocks on Postgres waiting for the first writer to finish.
So it seems to me that, without a complete rethink of the locking mechanism, the best option is some kind of write-thread count, such that the topology lock can only be taken when there are no write threads in progress.
My idea is that when a topology lock is requested, all new write threads block, existing write threads complete, and then the topology lock is granted. Once the topology lock is released, the new write threads continue.
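For concreteness, a minimal sketch of that gating; the class and method names are hypothetical, not Sqlg's actual code:

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate: data writers register before writing; a topology change waits
// for in-flight writers to drain and blocks new writers until it is done.
public class TopologyGate {

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition stateChanged = lock.newCondition();
    private int activeWriters = 0;
    private boolean topologyLocked = false;

    // Called before a data write; blocks while a topology change is in progress.
    public void startWrite() throws InterruptedException {
        lock.lock();
        try {
            while (topologyLocked) {
                stateChanged.await();
            }
            activeWriters++;
        } finally {
            lock.unlock();
        }
    }

    // Called when the write transaction commits or rolls back.
    public void endWrite() {
        lock.lock();
        try {
            activeWriters--;
            stateChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Called before a schema change; waits until all in-flight writers have finished.
    public void lockTopology() throws InterruptedException {
        lock.lock();
        try {
            while (topologyLocked || activeWriters > 0) {
                stateChanged.await();
            }
            topologyLocked = true;
        } finally {
            lock.unlock();
        }
    }

    public void unlockTopology() {
        lock.lock();
        try {
            topologyLocked = false;
            stateChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }
}

This naive version ignores the case, raised further down, where the thread requesting the topology lock is itself an in-flight writer; it is only meant to make the proposal concrete.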

Makes sense? Opinions, ideas?

@JPMoresmau
Collaborator Author

At first glance this seems to make perfect sense. Can we detect with 100% accuracy that we're writing something? I would suppose so. We also need to handle the case where one thread commits and then immediately continues writing, i.e. it should check whether there are pending topology lock requests. I suppose writing transactions would need to hold a read lock on the topology lock.

@pietermartin
Owner

Ok good, I'll give it a bash on the weekend. Fortunately we know when a write call is made via the API.

@pietermartin self-assigned this Jan 25, 2019
@pietermartin
Owner

I added a fix for this on the deadLock branch. Unfortunately, since ReentrantReadWriteLock does not allow upgrading a read lock to a write lock, it is not possible to simply protect topology changes with a read/write lock. I ended up adding two CountUpDownLatches (copied from HSQLDB) to do the magic.
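For reference, a minimal demonstration of the upgrade limitation mentioned above (this is documented ReentrantReadWriteLock behaviour, not Sqlg code):

import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadToWriteUpgrade {
    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        // Upgrading is not supported: the write lock cannot be acquired while any
        // read lock is held, including our own, so this call blocks forever.
        lock.writeLock().lock();
        System.out.println("never reached");
    }
}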

All the tests are passing including a new test in TestDeadLock that duplicated this issue.

Please have a look to see if you agree with the implementation. Concurrency is a tad complex, so I am not entirely convinced the current solution has no remaining deadlock scenarios.

I'll be doing some more tests during the week on my work code to see if all is good.

@JPMoresmau
Collaborator Author

Hi, my test now passes, but I'm not too sure about the code change. The await call has no timeout, so the thread will wait forever, no? Shouldn't there be a timeout? Also, if thread 1 starts writing some data, then thread 2 starts too but is locked out by Postgres and has to wait for thread 1 to commit, and thread 1 then decides to do a topology change, we'll have a deadlock again, no? Because thread 1 will wait for thread 2 to commit before taking the lock...

@JPMoresmau
Collaborator Author

Do we really need all this explicit locking? I mean, Postgres does some locking for us; couldn't we use the fact that topology changes affect Postgres data inside the sqlg schema to ensure nobody steps on anybody else's toes? I suppose this doesn't guarantee no deadlocks will happen in Postgres, which is what you were trying to avoid in the first place.

@pietermartin
Owner

I'll add the timeouts. Any opinion on a sensible default? For the topology lock I am currently going with a hardcoded 2 minutes; I'll make it configurable.

Regarding your example:
If both threads are writing and issuing topology change statements against the same table then it might deadlock in Postgres too. These things are hard to reason about unless I actually write a test duplicating the scenario. But yes, I'll try to simulate your scenario; I suspect we'll get a deadlock.

Regarding taking all these locks: I am not exactly happy with the recent refactor and the addition of two extra locks. However, before the first topology lock existed, Postgres would deadlock all the time. Multi-threaded topology change is not really a feature of Postgres. I suspect part of the reason it's more difficult on Postgres is that the topology changes are actually transactional. For HSQLDB, H2, MariaDB and MySQL schema changes are not transactional: they silently commit the transaction immediately after executing the statement, basically breaking the transaction contract.
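For illustration, a small JDBC sketch of that difference; the connection URL and credentials are placeholders. On Postgres the CREATE TABLE is rolled back with the transaction, while on MySQL/MariaDB the DDL statement implicitly commits, so the table survives the rollback:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DdlTransactionDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/test", "user", "password")) {
            connection.setAutoCommit(false);
            try (Statement statement = connection.createStatement()) {
                statement.execute("CREATE TABLE ddl_demo (id INT)");
            }
            // Postgres: the table is rolled back and does not exist afterwards.
            // MySQL/MariaDB (same code, different driver/url): the DDL implicitly
            // committed, so the table still exists despite the rollback.
            connection.rollback();
        }
    }
}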

Regarding "I mean Postgres does some locking for us, couldn't we use the fact that topology change impact postgres data inside the sqlg schema to ensure nobody writes on anybody's toes?"
Do you mean to have finer grained locks? Locks per table being written to changed?

@pietermartin
Owner

Btw I also spent some time reading https://github.com/npgall/concurrent-locks. However it won't work for us, as its update lock only allows one thread at a time.
It seems this is a non-trivial problem we are trying to solve.

@MAndreasT

MAndreasT commented Jan 29, 2019

The cause of the deadlock is the acquisition of a lock in Java (let's call it J) and of some lock in Postgres (call it P, acquired implicitly by Postgres write requests) in varying order:

  • An update request may acquire P first, and a later need for a schema extension can then cause the acquisition of J second.
  • It is also possible that the need for a schema extension is detected before the first update request in Postgres. In this case J is acquired first and P may be acquired second, by a subsequent update request to Postgres.

=> This pattern is known for causing deadlocks: T1 acquires J, T2 acquires P, T1 starts to wait for P, T2 starts to wait for J - deadlock.
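(A generic Java illustration of that lock-ordering pattern, with two plain ReentrantLocks standing in for J and P; this is not Sqlg code, just the textbook shape of the deadlock:)

import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDemo {
    public static void main(String[] args) {
        ReentrantLock j = new ReentrantLock(); // stands in for the Java topology lock
        ReentrantLock p = new ReentrantLock(); // stands in for the Postgres table lock

        Thread t1 = new Thread(() -> {
            j.lock();          // T1 acquires J
            sleep(100);
            p.lock();          // ...then waits for P, which T2 holds (never returns)
        }, "T1");
        Thread t2 = new Thread(() -> {
            p.lock();          // T2 acquires P
            sleep(100);
            j.lock();          // ...then waits for J, which T1 holds -> deadlock
        }, "T2");

        t1.start();
        t2.start();
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ignored) {
        }
    }
}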
How can it be avoided?
In principle the deadlock would be avoided by always acquiring J before any update request to Postgres is made. But this would serialize all update transactions, which is of course not desirable.
Updating the schema data structures in Java requires some synchronization between the threads accessing them, which means that you cannot get rid of J completely.
But it should be possible to avoid a thread holding J while it issues a Postgres request. This is possible if J is only used to update the schema data in Java and to get read access to a consistent state of that schema data, roughly like this:
Topology.java:

    private SchemaData schemaData;

    // get read access to a consistent schema:
    SchemaData getSchemaData() {
        synchronized (this) { // short-term acquisition of "J"
            return schemaData;
        }
    }

    void updateSchemaData(SchemaData modifiedSchema) {
        synchronized (this) { // short-term acquisition of "J"
            schemaData = modifiedSchema;
        }
    }

The assumption here is that a schema modification always creates a new SchemaData object which replaces the former one when the update completes. Because of this, the member Topology.schemaData always refers to a consistent schema data structure, and there is no need to hold a Java lock for any length of time.

In reality it is a bit more complicated, because updateSchemaData() also has to detect that concurrent schema updates have occurred and merge them appropriately into modifiedSchema before it is assigned to Topology.schemaData. Given that the notifications of schema updates from other JVMs can also be merged, this should be doable (within the synchronized block in updateSchemaData()).
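A hypothetical sketch of that merge step, continuing the pseudocode above; SchemaData, its mergeWith(...) method, and the base/modified parameters are illustrative stand-ins, not Sqlg's actual API:

    // Illustrative stand-in for an immutable schema snapshot; not Sqlg's API.
    interface SchemaData {
        // Folds another thread's (or JVM's) committed changes into this snapshot.
        SchemaData mergeWith(SchemaData other);
    }

    // base: the snapshot this writer started from; modified: base plus this writer's changes.
    void updateSchemaData(SchemaData base, SchemaData modified) {
        synchronized (this) { // short-term acquisition of "J"
            if (schemaData != base) {
                // Concurrent schema updates (local threads or notifications from other
                // JVMs) were published while we were working: merge them in first.
                modified = modified.mergeWith(schemaData);
            }
            schemaData = modified;
        }
    }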

@pietermartin
Owner

The problem I see with the above is that the J topology/schema lock was not so much there to protect the Java SchemaData as to ensure that schema changes in Postgres occur in only one thread at a time. Schema changes are SQL CREATE and ALTER statements; they tend to take table-wide locks and, if not serialized, cause Postgres deadlocks.

If I understand your solution correctly, Postgres schema changes would be able to occur simultaneously, thus causing deadlocks?

@MAndreasT

Yes, Postgres schema changes may then be done concurrently by multiple threads, and Postgres may detect deadlocks between the corresponding transactions. The question is whether this causes additional problems in the applications. I see the following reasons why it shouldn't:

  • Postgres also detects deadlocks between transactions executing only plain update requests and resolves them by "sacrificing" one of the involved transactions (SqlgGraph.tx().commit() throws an exception). This of course forces the application to handle the exception, e.g. by repeating the transaction. We observed this at least in our sqlg application, and I expect that our existing commit exception handling will also work if the deadlock is caused by a schema change.
  • As far as I understand the current locking approach, it does not prevent concurrent schema changes by threads running in different JVMs (using the notification mechanism), and if this is true, the issue of Postgres deadlocks already exists with the current solution when multiple JVMs are involved (which is the case for our application).

The need to write the application so that an sqlg transaction is repeated if the commit fails is certainly inconvenient, but we found it necessary even when the schema no longer needed updates. Deadlocks caused by schema updates may produce some additional commit exceptions, but this shouldn't really matter, given that schema updates are typically rare events.
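For illustration, a sketch of the kind of application-side retry being described, assuming the deadlock surfaces as a RuntimeException wrapping a SQLException with SQLState 40P01 (Postgres's deadlock_detected code); how Sqlg actually wraps the exception may differ:

import java.sql.SQLException;
import java.util.function.Supplier;

import org.umlg.sqlg.structure.SqlgGraph;

public final class DeadlockRetry {

    private DeadlockRetry() {
    }

    // Runs the unit of work and commits; on a Postgres deadlock the transaction is
    // rolled back and the whole unit of work is retried, up to maxAttempts times.
    public static <T> T withRetry(SqlgGraph sqlgGraph, Supplier<T> work, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                T result = work.get();
                sqlgGraph.tx().commit();
                return result;
            } catch (RuntimeException e) {
                sqlgGraph.tx().rollback();
                if (attempt >= maxAttempts || !isPostgresDeadlock(e)) {
                    throw e;
                }
            }
        }
    }

    // 40P01 is Postgres's SQLState for deadlock_detected.
    private static boolean isPostgresDeadlock(Throwable t) {
        while (t != null) {
            if (t instanceof SQLException && "40P01".equals(((SQLException) t).getSQLState())) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }
}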

My bottom line is: this bug is about an unexpected deadlock, but the occurrence of deadlocks (or at least conflicts) between concurrent threads can't be completely avoided; you have to accept the need to handle them in the application code. This raises the question:
What is the advantage of letting Postgres detect deadlocks compared to the current solution in sqlg?
My point is that the deadlock detection in sqlg cannot analyse the locks in Postgres and therefore depends on a timeout (2 minutes), whereas Postgres can analyse essentially all granted and requested locks and can detect and report deadlocks quicker and more efficiently (selecting the proper transaction to sacrifice), making it more likely that a deadlock is resolved by repeating the failing transaction, as our application code already does.
If we stay with the current deadlock detection by timeout in sqlg, we have to rethink our exception handling in the application code, and the primary reason for creating this ticket was to avoid that.

@pietermartin
Owner

Hmm, I suppose the initial topology lock was there precisely to prevent the application from having to code retry logic. With topology changes it happens so often that it is a glorious pain to deal with in the application.

Regarding multiple JVMs: when in distributed mode, Sqlg takes a LOCK TABLE sqlg_schema.V_log IN EXCLUSIVE MODE. That is the attempt to serialize schema change commands across multiple JVMs.

My strategy is/was to only serialize topology changes, with no further attempt at controlling Postgres locks. In a way it's an attempt to get TinkerPop's NoSql side to behave in an almost NoSql-like fashion.

Unfortunately it's throwing us a curve ball now.

Regarding Postgres detecting the deadlock: does it sacrifice a transaction immediately? I have not had this problem for a while, as the current Sqlg solution works for my particular use case. What I recall, however, is Postgres simply hanging with a bunch of pids waiting on other pids.

@pietermartin
Owner

I'll investigate removing the topology lock (probably making it optional), as too much of my own code, and probably others', depends on it (never having needed any retry logic). But yes, I agree, first prize is no attempt at locking from Sqlg's side.
I'll need to understand the behaviour of Postgres's deadlock detection better to understand the impact at the application level. Hopefully it does not hang; if it does, we'll be back to the beginning.

@pietermartin
Owner

pietermartin commented Jan 30, 2019

Did some testing; for context, here is the current behaviour.

The first test does, in pure SQL with no locking, what Sqlg does in the second test.
The second test does the same via Sqlg, with the topology lock.

    @Test
    public void testDeadLock5PureSql() throws InterruptedException {
        Connection connection = this.sqlgGraph.tx().getConnection();
        try (Statement statement = connection.createStatement()) {
            statement.execute("CREATE TABLE \"public\".\"V_A\" (\"ID\" BIGSERIAL PRIMARY KEY, \t\"nameA1\" TEXT);");
            statement.execute("INSERT INTO \"public\".\"V_A\" (\"nameA1\") VALUES ('haloA1');");
            statement.execute("CREATE TABLE \"public\".\"V_B\" (\"ID\" BIGSERIAL PRIMARY KEY, \t\"nameB1\" TEXT);");
            statement.execute("INSERT INTO \"public\".\"V_B\" (\"nameB1\") VALUES ('haloB1');");
        } catch (SQLException e) {
            this.sqlgGraph.tx().rollback();
            throw new RuntimeException(e);
        }
        this.sqlgGraph.tx().commit();

        Thread t1 = new Thread(() -> {
            Connection connection1 = this.sqlgGraph.tx().getConnection();
            try (Statement statement = connection1.createStatement()) {
                statement.execute("ALTER TABLE \"public\".\"V_A\" ADD \"nameA2\" TEXT;");
                statement.execute("INSERT INTO \"public\".\"V_A\" (\"nameA2\") VALUES ('haloA2');");
                statement.execute("UPDATE \"public\".\"V_B\" SET \"nameB1\" = 'haloAgainB2' WHERE \"ID\" = 1;");
            } catch (SQLException e) {
                this.sqlgGraph.tx().rollback();
                logger.error(e.getMessage(), e);
                throw new RuntimeException(e);
            }
            this.sqlgGraph.tx().commit();
        }, "First writer");

        Thread t2 = new Thread(() -> {
            Connection connection1 = this.sqlgGraph.tx().getConnection();
            try (Statement statement = connection1.createStatement()) {
                statement.execute("ALTER TABLE \"public\".\"V_B\" ADD \"nameB2\" TEXT;");
                statement.execute("INSERT INTO \"public\".\"V_B\" (\"nameB2\") VALUES ('haloB2');");
                statement.execute("UPDATE \"public\".\"V_A\" SET \"nameA1\" = 'haloAgainA1' WHERE \"ID\" = 1;");
            } catch (SQLException e) {
                this.sqlgGraph.tx().rollback();
                logger.error(e.getMessage(), e);
                throw new RuntimeException(e);
            }
            this.sqlgGraph.tx().commit();
        }, "Second writer");

        t1.start();
        t2.start();

        t1.join();
        t2.join();
        System.out.println("done");
    }

This test almost always throws a deadlock exception from Postgres.

    @Test
    public void testDeadLock4() throws InterruptedException {
        Vertex a1 = this.sqlgGraph.addVertex(T.label, "A", "nameA1", "haloA1");
        Vertex b1 = this.sqlgGraph.addVertex(T.label, "B", "nameB1", "haloB1");
        this.sqlgGraph.tx().commit();

        Thread t1 = new Thread(() -> {
            //Lock table A
            this.sqlgGraph.addVertex(T.label, "A", "nameA2", "haloA2");
            b1.property("nameB1", "haloAgainB2");
            this.sqlgGraph.tx().commit();
        }, "First writer");

        Thread t2 = new Thread(() -> {
            //Lock table B
            this.sqlgGraph.addVertex(T.label, "B", "nameB2", "haloB2");
            a1.property("nameA1", "haloAgainA1");
            this.sqlgGraph.tx().commit();
        }, "Second writer");

        t1.start();
        t2.start();

        t1.join();
        t2.join();

        Assert.assertEquals(2, this.sqlgGraph.traversal().V().hasLabel("A").count().next(), 0);
        Assert.assertEquals(2, this.sqlgGraph.traversal().V().hasLabel("B").count().next(), 0);
        Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("B").has("nameB1", "haloAgainB2").count().next(), 0);
        Assert.assertEquals(1, this.sqlgGraph.traversal().V().hasLabel("A").has("nameA1", "haloAgainA1").count().next(), 0);

    }

This one passes, as the topology lock prevents the deadlock.

It seems to me changing Sqlg to have no topology lock will have a serious impact on existing clients as chances are they are relying on the lock, having neither created schemas upfront nor coded retry logic.

I am first going to try to deadlock the current implementation and see if there is a way to detect it and throw a deadlock exception faster than the current 2 minute timeout. This should not happen too often, as it only occurs when there is a topology change while another write thread is still active.

Then I'll investigate making the topology lock optional, or perhaps removing it altogether.

Thoughts are most welcome.

@pietermartin
Owner

I added deadlock detection code to the topology. If the topology lock cannot be taken because a write thread is waiting on the current thread, a deadlock exception is thrown. This happens within milliseconds, which is better than waiting 2 minutes for a timeout.

I also fixed the handling of timeouts; it was not throwing exceptions.

@pietermartin
Owner

Say, is this fix working for you? Any more deadlocks?
If not, I'd like to merge this into master.

I'll create a separate ticket about refactoring the locking strategy to no longer take a schema change lock in Java.

@JPMoresmau
Collaborator Author

Yes, from our end we're fine with this fix; we understand what it fixes, what it doesn't, and how to work around the cases it does not cover, so it can be merged.

@pietermartin
Owner

Ok great, thanks. I'll create another ticket for removing the lock, but alas I won't be working on it anytime soon. I started to look at it and quickly stopped, as it affects the whole strategy.
