Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition causing stale data in query last result cache #3894

Open
pvlugter opened this issue Sep 18, 2023 · 1 comment
Open

Race condition causing stale data in query last result cache #3894

pvlugter opened this issue Sep 18, 2023 · 1 comment

Comments

@pvlugter
Copy link

pvlugter commented Sep 18, 2023

We've been seeing an issue where a process does not always have read-your-writes consistency, and a previous version of a row was sometimes being returned.

The general scenario is events being processed to update a read-side projection. Events are processed one at a time, in order, retrieving the current value, and then applying a function to the event and the current value to create the next value. Sometimes, not often or consistently, it looked like a write was lost — the commit was successful but processing of the next event would not see the updated value, but be based on the value before.

We tracked this down to the query last result cache and a race condition around the modification id counters. The modification ids for tables are updated before a transaction is committed. The race condition is that a concurrent query can fall between the modification counter update and the commit to store, so that it associates the previous result with the next modification id, and on subsequent checks doesn't realise that the cache is outdated. It's slim, but possible in what we've seen.

It needed a particular combination to reproduce: simultaneous queries on the row being updated, and connection pooling (we're using R2DBC). We haven't created a reproducer for H2 directly, but here's the pattern we were seeing in more detail:

  • projected value will be a concatenation of the event values (A|B|C|D|...)
  • value for the projected row is currently A
  • modification id counter is currently N
  • process event B - update value from A to A|B and commit transaction (on connection 1)
  • modification counter is incremented to N+1, before the transaction commit to store
  • before value A|B is actually stored, there's a concurrent query for this row (on connection 2)
  • the query returns current value A, but associates this with the updated modification counter of N+1 <-- here's the bug
  • value A|B is committed to storage
  • process next event C - but on pooled connection 2 now, where the query last result cache is incorrect
  • query for the current value before update sees that the last modification id counter N+1 is up-to-date
  • query returns cached previous result of A instead of retrieving the updated value A|B
  • event processor upserts to A|C instead of A|B|C

This could potentially be resolved by simply reversing the order here:

markUsedTablesAsUpdated();
transaction.commit();

so that the transaction is committed first, before the table modification ids are updated. That would avoid the stale data issue. Instead, the race condition would be flipped, and the last result cache could have an updated value associated with the previous modification id. That shouldn't be problematic for consistency — only that the cache would be considered stale on the next query, even though it has the newer value, and the query fetches the result again.

We've worked around this for now by setting OPTIMIZE_REUSE_RESULTS=FALSE.

@nPraml
Copy link

nPraml commented Mar 20, 2024

Hello,

we hit the same bug. It seems that the result cache is hit before the update is completely done as @pvlugter described.

I could provide the following "reproducer":

package org.h2.test.unit;

import org.h2.test.TestBase;
import org.h2.test.TestDb;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestConcurrentCacheAccess extends TestDb {

	/**
	 * Run just this test.
	 *
	 * @param a ignored
	 */
	public static void main(String... a) throws Exception {
		TestBase.createCaller().init().test();
	}

	@Override
	public void test() throws Exception {
		deleteDb("statement");
		Connection conn1 = getConnection("statement");
		Connection conn2 = getConnection("statement");
		ExecutorService executor = Executors.newFixedThreadPool(2);
		try {
			conn1.setAutoCommit(false);
			conn2.setAutoCommit(false);
			try (Statement stat = conn1.createStatement()) {
				// This will fix the test
				// stat.execute("SET OPTIMIZE_REUSE_RESULTS 0");
				stat.execute("CREATE TABLE TEST(ID INT PRIMARY KEY, AGE integer)");
				conn1.commit();

				// insert something into test table
				stat.execute("INSERT INTO TEST VALUES(1, 0)");
				conn1.commit();
			}
			// DB is prepared

			for (int i = 0; i < 1000; i++) {
				// Normally, the test fails after 2-3 iterations
				System.out.println("Iteration: " + i);

				// Here we do an async update in an other thread...
				executor.execute(() -> setValue(conn1, 1));
				// and then we wait for the update
				expectValue(conn2,1);

				executor.execute(() -> setValue(conn1, 2));
				expectValue(conn2,2);

			}

		} finally {
			conn1.close();
			conn2.close();
			executor.shutdown();
		}
		deleteDb("statement");
	}

	private void setValue(Connection conn, int value) {
		try (PreparedStatement pstmt = conn.prepareStatement("UPDATE TEST SET AGE = ? WHERE ID = 1")) {
			pstmt.setInt(1, value);
			int count = pstmt.executeUpdate();
			assertEquals(1, count);
			conn.commit();
		} catch (SQLException e) {
			e.printStackTrace();
		}
	}

	/**
	 * polls the database until the expected value is seen.
	 *
	 * If value isn't found after 1s in DB, the test fails.
	 */
	private void expectValue(Connection conn, int value) {
		long endNano = System.nanoTime() + 1_000_000_000L; // One second
		while (endNano > System.nanoTime()) {
			try (PreparedStatement pstmt = conn.prepareStatement("SELECT AGE FROM TEST WHERE ID = 1");
				 ResultSet rs = pstmt.executeQuery();) {
				if (rs.next()) {
					if (rs.getInt(1) == value) {
						System.out.println("Success");
						return; // we've found the value in DB
					}
				}
			} catch (SQLException e) {
				e.printStackTrace();
			}
		}
		throw new AssertionError("Could not find value " + value + " in DB");
	}

}

I've tried the fix suggested above, and I also tried to add some more syncLastModificationIdWithDatabase in MVTable, but without success.

For now, we use OPTIMIZE_REUSE_RESULTS=FALSE as workaround

Noemi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants