Race condition causing stale data in query last result cache #3894
Comments
Hello, we hit the same bug. It seems that the result cache is hit before the update is completely done, as @pvlugter described. I can provide the following reproducer:

```java
package org.h2.test.unit;

import org.h2.test.TestBase;
import org.h2.test.TestDb;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestConcurrentCacheAccess extends TestDb {

    /**
     * Run just this test.
     *
     * @param a ignored
     */
    public static void main(String... a) throws Exception {
        TestBase.createCaller().init().test();
    }

    @Override
    public void test() throws Exception {
        deleteDb("statement");
        Connection conn1 = getConnection("statement");
        Connection conn2 = getConnection("statement");
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            conn1.setAutoCommit(false);
            conn2.setAutoCommit(false);
            try (Statement stat = conn1.createStatement()) {
                // Uncommenting the next line fixes the test:
                // stat.execute("SET OPTIMIZE_REUSE_RESULTS 0");
                stat.execute("CREATE TABLE TEST(ID INT PRIMARY KEY, AGE INTEGER)");
                conn1.commit();
                // Insert a single row into the test table
                stat.execute("INSERT INTO TEST VALUES(1, 0)");
                conn1.commit();
            }
            // DB is prepared
            for (int i = 0; i < 1000; i++) {
                // Normally, the test fails after 2-3 iterations
                System.out.println("Iteration: " + i);
                // Run an async update in another thread...
                executor.execute(() -> setValue(conn1, 1));
                // ...and then wait until the update becomes visible
                expectValue(conn2, 1);
                executor.execute(() -> setValue(conn1, 2));
                expectValue(conn2, 2);
            }
        } finally {
            conn1.close();
            conn2.close();
            executor.shutdown();
        }
        deleteDb("statement");
    }

    private void setValue(Connection conn, int value) {
        try (PreparedStatement pstmt = conn.prepareStatement(
                "UPDATE TEST SET AGE = ? WHERE ID = 1")) {
            pstmt.setInt(1, value);
            int count = pstmt.executeUpdate();
            assertEquals(1, count);
            conn.commit();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

    /**
     * Polls the database until the expected value is seen.
     * If the value isn't seen within one second, the test fails.
     */
    private void expectValue(Connection conn, int value) {
        long endNano = System.nanoTime() + 1_000_000_000L; // one second
        while (endNano > System.nanoTime()) {
            try (PreparedStatement pstmt = conn.prepareStatement(
                    "SELECT AGE FROM TEST WHERE ID = 1");
                 ResultSet rs = pstmt.executeQuery()) {
                if (rs.next() && rs.getInt(1) == value) {
                    System.out.println("Success");
                    return; // found the expected value in the DB
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
        throw new AssertionError("Could not find value " + value + " in DB");
    }
}
```

I've tried the fix suggested above, and I also tried to add some more … For now, we use OPTIMIZE_REUSE_RESULTS=FALSE as a workaround.

Noemi
We've been seeing an issue where a process does not always have read-your-writes consistency, and a previous version of a row was sometimes being returned.
The general scenario is events being processed to update a read-side projection. Events are processed one at a time, in order: retrieve the current value, then apply a function to the event and the current value to create the next value. Sometimes, though not often or consistently, it looked like a write was lost: the commit was successful, but processing of the next event did not see the updated value and was instead based on the value before it.
We tracked this down to the query last result cache and a race condition around the modification id counters. The modification ids for tables are updated before a transaction is committed. The race condition is that a concurrent query can fall between the modification counter update and the commit to store, so that it associates the previous result with the next modification id, and on subsequent checks doesn't realise that the cache is outdated. The window is slim, but it matches what we've seen.
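The counter/commit ordering described above can be replayed deterministically in a single thread. The following is a minimal, illustrative sketch (class and field names are ours, not H2's actual code) of how stamping a cached result with a counter that was bumped before the store commit leads to a stale hit:

```java
// Toy, single-threaded model of the interleaving: the table modification
// counter is bumped BEFORE the new value reaches the store.
public class CacheRaceSketch {
    static long modificationId = 0;   // per-table modification counter
    static String storedValue = "A";  // value currently committed to the store

    // last result cache: remembered result plus the counter it was stamped with
    static String cachedResult = null;
    static long cachedModificationId = -1;

    static String query() {
        if (cachedResult != null && cachedModificationId == modificationId) {
            return cachedResult; // cache believed up-to-date: no re-fetch
        }
        cachedResult = storedValue;             // fetch from store
        cachedModificationId = modificationId;  // stamp with current counter
        return cachedResult;
    }

    /** Replays the interleaving and returns what the next query observes. */
    static String runScenario() {
        modificationId++;        // writer bumps the counter first (N -> N+1)
        query();                 // concurrent reader: fetches old A, stamps it N+1
        storedValue = "A|B";     // commit to store completes afterwards
        return query();          // counter unchanged since the stamp: stale A served
    }

    public static void main(String[] args) {
        System.out.println("observed after commit: " + runScenario()); // stale "A"
    }
}
```

Because the counter never changes again after the stamp, every later cache check passes and the pre-commit value keeps being served.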
It needed a particular combination to reproduce: simultaneous queries on the row being updated, and connection pooling (we're using R2DBC). We haven't created a reproducer for H2 directly, but here's the pattern we were seeing in more detail:
Events A, B, C, D, ... are folded into values A, A|B, A|B|C, A|B|C|D, ...

- Event A is processed: the value A is written and committed (on connection 1); the table's modification counter is N.
- Event B is processed:
  - update value from A to A|B and commit transaction (on connection 1); the modification counter is updated to N+1
  - before the transaction commit to store (before A|B is actually stored), there's a concurrent query for this row (on connection 2)
  - this query returns the previous value A, but associates this with the updated modification counter of N+1 <-- here's the bug
  - A|B is committed to storage
- Event C is processed - but on pooled connection 2 now, where the query last result cache is incorrect:
  - the cache check sees modification counter N+1 and believes the cached value A is up-to-date, instead of retrieving the updated value A|B
  - the next value created is A|C instead of A|B|C
This could potentially be resolved by simply reversing the order of the two calls at h2database/h2/src/main/org/h2/engine/SessionLocal.java, lines 682 to 683 (commit 4daeb2b),
so that the transaction is committed first, before the table modification ids are updated. That would avoid the stale data issue. Instead, the race condition would be flipped, and the last result cache could have an updated value associated with the previous modification id. That shouldn't be problematic for consistency — only that the cache would be considered stale on the next query, even though it has the newer value, and the query fetches the result again.
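Under the same toy model (again with illustrative names, not H2's actual code), reversing the order shows why the flipped race is benign: the reader can at worst stamp the new value with the old counter, so the next check merely re-fetches a value that was already correct:

```java
// Same toy model, but the store commit happens BEFORE the modification
// counter is bumped (the proposed ordering).
public class ReorderedCommitSketch {
    static long modificationId = 0;   // per-table modification counter
    static String storedValue = "A";  // value currently committed to the store
    static String cachedResult = null;
    static long cachedModificationId = -1;

    static String query() {
        if (cachedResult != null && cachedModificationId == modificationId) {
            return cachedResult; // cache believed up-to-date: no re-fetch
        }
        cachedResult = storedValue;
        cachedModificationId = modificationId;
        return cachedResult;
    }

    /** Replays the flipped interleaving and returns what the next query observes. */
    static String runScenario() {
        storedValue = "A|B";     // commit to store FIRST
        query();                 // concurrent reader: sees new A|B, stamps old counter N
        modificationId++;        // counter bumped afterwards (N -> N+1)
        // next check: stamped N != current N+1, so the cache is treated as
        // stale and re-fetched - one wasted fetch, but the value is correct
        return query();
    }

    public static void main(String[] args) {
        System.out.println("observed after commit: " + runScenario()); // "A|B"
    }
}
```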
We've worked around this for now by setting OPTIMIZE_REUSE_RESULTS=FALSE.
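For anyone else hitting this: the workaround can be applied per session via SQL, or appended to the JDBC URL as H2 allows for settings generally (the database path below is just an example):

```
-- per session:
SET OPTIMIZE_REUSE_RESULTS 0

-- or in the connection URL (example path):
jdbc:h2:./statement;OPTIMIZE_REUSE_RESULTS=0
```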