Race condition causing stale data in query last result cache #3894

pvlugter · 2023-09-18T00:40:33Z

We've been seeing an issue where a process does not always have read-your-writes consistency, and a previous version of a row was sometimes being returned.

The general scenario is events being processed to update a read-side projection. Events are processed one at a time, in order, retrieving the current value, and then applying a function to the event and the current value to create the next value. Sometimes, not often or consistently, it looked like a write was lost — the commit was successful but processing of the next event would not see the updated value, but be based on the value before.

We tracked this down to the query last result cache and a race condition around the modification id counters. The modification ids for tables are updated before a transaction is committed. The race condition is that a concurrent query can fall between the modification counter update and the commit to store, so that it associates the previous result with the next modification id, and on subsequent checks doesn't realise that the cache is outdated. It's slim, but possible in what we've seen.

It needed a particular combination to reproduce: simultaneous queries on the row being updated, and connection pooling (we're using R2DBC). We haven't created a reproducer for H2 directly, but here's the pattern we were seeing in more detail:

projected value will be a concatenation of the event values (A|B|C|D|...)
value for the projected row is currently A
modification id counter is currently N
process event B - update value from A to A|B and commit transaction (on connection 1)
modification counter is incremented to N+1, before the transaction commit to store
before value A|B is actually stored, there's a concurrent query for this row (on connection 2)
the query returns current value A, but associates this with the updated modification counter of N+1 <-- here's the bug
value A|B is committed to storage
process next event C - but on pooled connection 2 now, where the query last result cache is incorrect
query for the current value before update sees that the last modification id counter N+1 is up-to-date
query returns cached previous result of A instead of retrieving the updated value A|B
event processor upserts to A|C instead of A|B|C

This could potentially be resolved by simply reversing the order here:

h2database/h2/src/main/org/h2/engine/SessionLocal.java

Lines 682 to 683 in 4daeb2b

    
           markUsedTablesAsUpdated(); 
        
           transaction.commit();

so that the transaction is committed first, before the table modification ids are updated. That would avoid the stale data issue. Instead, the race condition would be flipped, and the last result cache could have an updated value associated with the previous modification id. That shouldn't be problematic for consistency — only that the cache would be considered stale on the next query, even though it has the newer value, and the query fetches the result again.

We've worked around this for now by setting OPTIMIZE_REUSE_RESULTS=FALSE.

The text was updated successfully, but these errors were encountered:

nPraml · 2024-03-20T15:07:52Z

Hello,

we hit the same bug. It seems that the result cache is hit before the update is completely done as @pvlugter described.

I could provide the following "reproducer":

package org.h2.test.unit;

import org.h2.test.TestBase;
import org.h2.test.TestDb;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestConcurrentCacheAccess extends TestDb {

	/**
	 * Run just this test.
	 *
	 * @param a ignored
	 */
	public static void main(String... a) throws Exception {
		TestBase.createCaller().init().test();
	}

	@Override
	public void test() throws Exception {
		deleteDb("statement");
		Connection conn1 = getConnection("statement");
		Connection conn2 = getConnection("statement");
		ExecutorService executor = Executors.newFixedThreadPool(2);
		try {
			conn1.setAutoCommit(false);
			conn2.setAutoCommit(false);
			try (Statement stat = conn1.createStatement()) {
				// This will fix the test
				// stat.execute("SET OPTIMIZE_REUSE_RESULTS 0");
				stat.execute("CREATE TABLE TEST(ID INT PRIMARY KEY, AGE integer)");
				conn1.commit();

				// insert something into test table
				stat.execute("INSERT INTO TEST VALUES(1, 0)");
				conn1.commit();
			}
			// DB is prepared

			for (int i = 0; i < 1000; i++) {
				// Normally, the test fails after 2-3 iterations
				System.out.println("Iteration: " + i);

				// Here we do an async update in an other thread...
				executor.execute(() -> setValue(conn1, 1));
				// and then we wait for the update
				expectValue(conn2,1);

				executor.execute(() -> setValue(conn1, 2));
				expectValue(conn2,2);

			}

		} finally {
			conn1.close();
			conn2.close();
			executor.shutdown();
		}
		deleteDb("statement");
	}

	private void setValue(Connection conn, int value) {
		try (PreparedStatement pstmt = conn.prepareStatement("UPDATE TEST SET AGE = ? WHERE ID = 1")) {
			pstmt.setInt(1, value);
			int count = pstmt.executeUpdate();
			assertEquals(1, count);
			conn.commit();
		} catch (SQLException e) {
			e.printStackTrace();
		}
	}

	/**
	 * polls the database until the expected value is seen.
	 *
	 * If value isn't found after 1s in DB, the test fails.
	 */
	private void expectValue(Connection conn, int value) {
		long endNano = System.nanoTime() + 1_000_000_000L; // One second
		while (endNano > System.nanoTime()) {
			try (PreparedStatement pstmt = conn.prepareStatement("SELECT AGE FROM TEST WHERE ID = 1");
				 ResultSet rs = pstmt.executeQuery();) {
				if (rs.next()) {
					if (rs.getInt(1) == value) {
						System.out.println("Success");
						return; // we've found the value in DB
					}
				}
			} catch (SQLException e) {
				e.printStackTrace();
			}
		}
		throw new AssertionError("Could not find value " + value + " in DB");
	}

}

I've tried the fix suggested above, and I also tried to add some more syncLastModificationIdWithDatabase in MVTable, but without success.

For now, we use OPTIMIZE_REUSE_RESULTS=FALSE as workaround

Noemi

pvlugter mentioned this issue Sep 18, 2023

Failed: R2dbcProjectionSpec akka/akka-projection#992

Closed

rPraml mentioned this issue May 6, 2024

Assertion error in shrinkStoreIfPossible #4048

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race condition causing stale data in query last result cache #3894

Race condition causing stale data in query last result cache #3894

pvlugter commented Sep 18, 2023 •

edited

nPraml commented Mar 20, 2024

Race condition causing stale data in query last result cache #3894

Race condition causing stale data in query last result cache #3894

Comments

pvlugter commented Sep 18, 2023 • edited

nPraml commented Mar 20, 2024

pvlugter commented Sep 18, 2023 •

edited