TestDiskFull: The file is locked #1212
Comments
|
But whenever I see this, it's always unstable:memFS:/diskFull18.mv.db, meaning in-memory. Maybe it really does get full: it throws an OOME and catches it, but the memFS file system gets corrupted? |
|
I'm guessing that one of our shutdown paths does not close the file channel, perhaps because of the OOME @andreitokar mentions. |
|
Then we would see that exception with a real file system (i.e. FilePathNio) too, not just with the in-memory one (FilePathMem). It might be that only the in-memory file system itself fails on OOME. |
|
If I run TestDiskFull with -Xmx64M, I can't reproduce this error; instead I get: Exception in thread "main" org.h2.jdbc.JdbcSQLException: General error: "java.lang.RuntimeException: Incompatible key type, expected org.h2.mvstore.db.ValueDataType@fffecd71 but got org.h2.mvstore.type.ObjectDataType@1bf8050 for index IDX_NAME" [50000-197] |
|
One more failure: |
|
This happens all the time (e.g. yesterday I saw it twice). Once an OOME has happened, all bets are off and anything is possible, including corruption of the in-memory file system, which would leave an outstanding lock. Unless we manage to pre-allocate enough memory for this "file system", this test is no good. |
|
I saw a lot of failures too in the last few days. |
|
I can reproduce this failure on my system if this test is invoked in a loop. There are 5 additional lines in modified Normal call sequence: Call sequence with a failure: It means that underlying storage was not properly closed in a first part of We either have a code path in this method that may lead to leaked connection or |
|
Before the failure Unfortunately, I cannot reproduce this problem when JVM is debugged. |
|
I can reproduce this issue on my system if TestDiskFull is invoked in a loop on Java 11 with |
|
However, this issue is not reproducible with a debugger attached. |
|
This issue is also not reproducible with debug print in It looks like this issue appears when store is not closed together with connection, but is closed later from background writer thread. When |
|
Why However, I don't see how background writer thread may prevent database shutdown process. I checked with debug print that previous connection is closed and background writer thread is still alive before the failure. It looks like only connection is closed, but database is not properly closed yet (probably because of write failure during initialization). |
I guess because it is called from different paths, and each path has its own locking regime.
Possibly the problem is then that the background writer thread should either not be started until later in init, or should be shut down more reliably if we get a write failure during init. |
|
I tried to add a separate flag to prevent its startup if the database is already in the shutdown process or past it, but it did not help. Also, this thread has nothing to do with the storage. There is a combination of the following factors:
Due to (2) and (3) I think that the database is not really closed. After (4) the database is closed completely by the exception handler. |
|
So the problem is that we are not properly closing the database when the last connection goes away, and we end up closing the database as a side effect of the exception handler when the background writer tries to write to the database? |
|
Something like it. However, database is already deregistered from the But with some additional debug print I found that this method did nothing in the large |
|
And |
|
|
|
MVStore is already closed somehow, so |
|
The bulk of MVStore's close logic is in closeStore. Possibly what we have here is some kind of race condition, where the "closed" flag is set because the close process is happening on thread A, but then thread B comes along, calls close(), does a short-circuit return, and then tries to re-open the DB while the close process is still ongoing. |
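The suspected race can be sketched in isolation. This is only an illustration of the pattern described above, not MVStore code: `Store`, `fileLocked` and the latches are hypothetical names, and the latches exist solely to force the unlucky interleaving deterministically.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical store, NOT H2's MVStore: only the shape of the suspected race is shown.
class ShortCircuitCloseDemo {
    static class Store {
        final AtomicBoolean closed = new AtomicBoolean();          // set before cleanup starts
        final AtomicBoolean fileLocked = new AtomicBoolean(true);  // stands in for the file lock
        final CountDownLatch flagSet = new CountDownLatch(1);      // test hook: flag has been set
        final CountDownLatch cleanupMayFinish = new CountDownLatch(1); // test hook: slow cleanup

        void close() {
            if (!closed.compareAndSet(false, true)) {
                return; // short-circuit: the caller assumes the store is fully closed
            }
            flagSet.countDown();
            awaitQuietly(cleanupMayFinish); // simulates cleanup that takes a while
            fileLocked.set(false);          // the lock is released only at the very end
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns whether the lock was still held right after the second close() returned. */
    static boolean lockHeldAfterSecondClose() {
        Store store = new Store();
        Thread a = new Thread(store::close, "thread-A");
        a.start();
        awaitQuietly(store.flagSet);   // thread A is now in the middle of closing
        store.close();                 // "thread B": returns immediately
        boolean stillLocked = store.fileLocked.get();
        store.cleanupMayFinish.countDown(); // let thread A finish
        try {
            a.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stillLocked;
    }

    public static void main(String[] args) {
        System.out.println("lock still held after second close(): " + lockHeldAfterSecondClose());
    }
}
```

If thread B re-opened the file at the point where `stillLocked` is read, it would find the lock still held, which matches the "file is locked" symptom.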
|
It looks like this issue disappears if both |
|
Maybe we need some kind of tri-state close flag and delayed return from
|
Sounds reasonable |
|
Actually it looks like 4 states are needed, with one additional state for termination of the background writer. Also I see an unlikely, but possible, race in the background writer thread startup code if simple
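A rough sketch of what such a four-state flag with delayed return might look like. Everything here is an assumption for illustration: the state names, `stopBackgroundWriter()` and `releaseResources()` are made up, and this is not the actual MVStore implementation.

```java
// Hypothetical sketch; class, states and methods are illustrative, NOT H2's MVStore.
class FourStateCloseSketch {
    enum State { OPEN, STOPPING_WRITER, CLOSING, CLOSED }

    private State state = State.OPEN;     // guarded by "this"
    private boolean resourceHeld = true;  // stands in for the file lock; guarded by "this"

    /** Returns only once the store is fully closed, whichever thread did the work. */
    void close() {
        synchronized (this) {
            if (state != State.OPEN) {
                awaitClosed(); // delayed return instead of a short-circuit
                return;
            }
            state = State.STOPPING_WRITER;
        }
        stopBackgroundWriter(); // done outside the monitor so the writer is not blocked by it
        synchronized (this) {
            state = State.CLOSING;
        }
        releaseResources();
        synchronized (this) {
            state = State.CLOSED;
            notifyAll(); // wake up callers parked in awaitClosed()
        }
    }

    // Caller must hold the monitor on "this".
    private void awaitClosed() {
        boolean interrupted = false;
        while (state != State.CLOSED) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    private void stopBackgroundWriter() { pause(10); } // placeholder for joining the writer

    private void releaseResources() {
        pause(20); // simulates slow cleanup
        synchronized (this) {
            resourceHeld = false;
        }
    }

    synchronized boolean isResourceHeld() { return resourceHeld; }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Two threads race to close; returns whether the resource is still held afterwards. */
    static boolean resourceHeldAfterConcurrentClose() {
        FourStateCloseSketch store = new FourStateCloseSketch();
        Thread a = new Thread(store::close, "thread-A");
        a.start();
        store.close();
        boolean held = store.isResourceHeld();
        try {
            a.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return held;
    }
}
```

One caveat with this shape: if the background writer thread itself calls `close()` (for example from its exception handler) while the closing thread is waiting for it to terminate, the two would wait on each other, so a real implementation would need to special-case that caller.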
|
No, |
|
It's possible to start two background writer threads due to lack of proper synchronization. |
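One way to rule out a double start, sketched with a compare-and-set on an `AtomicReference<Thread>`. This is an assumption about how it could be done, not a description of the existing code; `startWriterIfAbsent()` and the other names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, NOT H2 code: at most one writer thread is ever registered and started.
class SingleWriterStartSketch {
    private final AtomicReference<Thread> writer = new AtomicReference<>();
    private final CountDownLatch stop = new CountDownLatch(1);

    /** Starts the writer unless one is already registered; true if this call started it. */
    boolean startWriterIfAbsent() {
        Thread t = new Thread(() -> {
            try {
                stop.await(); // placeholder for the periodic write loop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "background-writer-sketch");
        t.setDaemon(true);
        if (!writer.compareAndSet(null, t)) {
            return false; // another caller registered its thread first; ours is never started
        }
        t.start();
        return true;
    }

    /** One-shot stop: signals the writer and waits for it to exit. */
    void stopWriter() {
        Thread t = writer.getAndSet(null);
        if (t != null) {
            stop.countDown();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Lets several callers race to start the writer; returns how many calls started one. */
    static int countStartsUnderContention(int callers) {
        SingleWriterStartSketch store = new SingleWriterStartSketch();
        AtomicInteger wins = new AtomicInteger();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await(); // line all callers up before racing
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.startWriterIfAbsent()) {
                    wins.incrementAndGet();
                }
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        store.stopWriter();
        return wins.get();
    }
}
```

The sketch creates a `Thread` object per attempt and discards the losers unstarted, which is wasteful but keeps the check-and-register step atomic without a lock.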
Sometimes TestDiskFull fails because the lock cannot be acquired. It's very strange; it should acquire an exclusive lock in FileMemData and release it when the connection is closed.