lob cleaner issue #3583
Comments
Something called interrupt() on the thread that was doing the cleanup. Perhaps you have some watchdog-type timer in your application?
Nope, I don't have anything like that. I'd rather guess it's related to the automatic DB compaction on close (200 ms by default, IIRC).
H2 never interrupts its own threads performing disk I/O (unless you modified the H2 sources in your custom build). So you need to figure out which part of your application (including libraries and the web/application server, if any) does that, and fix it. If you cannot prevent these interrupts, you can run a separate H2 Server process and connect to it remotely from your application, or you can use the
In the same environment, DB compaction via SHUTDOWN COMPACT always succeeds, even though it takes about 40 minutes. Doesn't this suggest that the interruption comes from H2? The crash occurs almost instantly after CLOSE if a large number of LOBs were removed before closing (and the time elapsed between the removal of the LOBs and CLOSE doesn't affect the occurrence of the crash).
No, H2 never interrupts its own threads, except the watchdog thread, which doesn't perform disk I/O.
First of all, you should test the original H2 instead of your own custom build. To debug this issue, you need to start your application with a debugger. Set up a breakpoint on
The build has not been recompiled. The only change is that it has been repackaged with JarJar. Could you share some insight on point 1?
Maybe your application (or web/application server, if you use one) waits for some time and interrupts threads only after that, when it tries to finish its work and close the database as part of application termination? When cleanup tasks are fast, connections are closed before the timeout, but when the database needs more time to finish its work, your application (or application server) tries to interrupt its threads? Anyway, you can find callers of
My environment is single-threaded. It waits 40 minutes for SHUTDOWN COMPACT to complete, and I see no reason why, in exactly the same setting, it would interrupt a normal close, but only after a massive LOB removal. I managed to remove all the LOBs by doing the operation in chunks: remove some LOBs (30000 of them; 50000 causes the problem to appear), close the DB, reopen the DB, repeat. A race condition is evident here. You suspect that it's caused by my environment; I suspect H2 itself. If SHUTDOWN COMPACT were failing in a similar manner, I'd agree with you. Unfortunately, the LOB subsystem has a long history of problems, and I believe this is yet another.
I think we're talking about different issues. If you complain about slow
Is it possible that the cleanup process takes about 10 seconds or more? It is asynchronous, and its thread (ThreadPoolExecutor) is forcibly shut down after 10 seconds (which may issue Thread.interrupt() in the process). Using such a procedure probably wasn't the best idea, or the timeout should be much higher. Some cooperation from LobStorageMap is required. Also, I just realized that BLOB cleanup completely ignores the time allotted for cleanup on shutdown.
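For anyone following along, here is a minimal sketch of the mechanism described above. It is not H2 code: the class name `InterruptedCleanerDemo`, the temp file, and the endless read loop are all made up for illustration. It only shows that a forced executor shutdown (`shutdownNow()`) interrupts the worker, and that a thread interrupted while reading from a `FileChannel` fails with `ClosedByInterruptException`, the same cause that appears in the reported stack trace.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of H2.
class InterruptedCleanerDemo {

    /** Returns the simple name of the exception that ended the worker, or "none". */
    static String runAndForceShutdown() {
        AtomicReference<String> outcome = new AtomicReference<>("none");
        CountDownLatch started = new CountDownLatch(1);
        ExecutorService cleaner = Executors.newSingleThreadExecutor();
        try {
            Path file = Files.createTempFile("lob-demo", ".bin");
            Files.write(file, new byte[4096]);
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                cleaner.execute(() -> {
                    ByteBuffer buf = ByteBuffer.allocate(384);
                    started.countDown();
                    try {
                        // Stand-in for a long cleanup: keep reading until something stops us.
                        while (true) {
                            buf.clear();
                            channel.read(buf, 0);
                        }
                    } catch (IOException e) {
                        outcome.set(e.getClass().getSimpleName());
                    }
                });
                started.await();
                // Forced shutdown: interrupts the worker thread in the middle of its I/O loop.
                cleaner.shutdownNow();
                cleaner.awaitTermination(5, TimeUnit.SECONDS);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndForceShutdown());
    }
}
```

Note that the interrupt also closes the channel, so every later read on it fails as well; this is why an interrupt during disk I/O is so disruptive for a file-backed store.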
Yes, it does take more than 10 seconds. After your remarks, I think I know why limiting the number of LOBs deleted before closing the DB to 30000 (on my quite old machine) remedies the issue (BTW, if the LOBs are smaller, more of them can be removed before causing a crash). As far as I'm concerned, I don't mind if all deleted LOBs are removed on normal shutdown. A periodic (say, once a week) SHUTDOWN COMPACT is acceptable. However, a crash on normal DB closing is a no-go.
@wburzyns @andreitokar |
@katzyn |
Stopping execution without an interrupt is the easy part. The problem is that, as of now, the fact of a LOB removal is not persisted, just queued in an in-memory queue for processing, so a premature interruption will leave the LOBs live forever.
I guess for now we should just eliminate the interrupt and live with a long shutdown time in case of massive LOB removal.
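As a rough illustration of "eliminate the interrupt and live with a long shutdown", the sketch below uses `shutdown()` instead of `shutdownNow()` and then waits for as long as it takes. This is not the actual H2 change; `GracefulCleanerShutdown`, `drainWithoutInterrupt`, and the 1 ms sleep standing in for a single LOB removal are assumptions made purely for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of H2.
class GracefulCleanerShutdown {

    /** Queues simulated removals, shuts down without interrupting, returns how many completed. */
    static int drainWithoutInterrupt(int tasks) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService cleaner = Executors.newSingleThreadExecutor();
        for (int i = 0; i < tasks; i++) {
            cleaner.execute(() -> {
                try {
                    Thread.sleep(1); // stand-in for removing one LOB
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    // Not expected here, since nothing interrupts the worker.
                    Thread.currentThread().interrupt();
                }
            });
        }
        // shutdown() rejects new tasks but lets queued ones run; no interrupt is issued.
        cleaner.shutdown();
        try {
            // Wait without an overall deadline, accepting a long close.
            while (!cleaner.awaitTermination(1, TimeUnit.SECONDS)) {
                // still draining the queue
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(drainWithoutInterrupt(50));
    }
}
```

The trade-off is the one already stated: close can take arbitrarily long after a massive removal, but no queued removal is dropped and no channel is closed underneath a reader.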
@andreitokar
Unfortunately this happened to me while closing my DB after removal of a large number of LOBs:
Exception in thread "H2-lob-cleaner": email.com.gmail.wburzyns.org.h2.mvstore.MVStoreException: Reading from file split:30:W:/__db/FinSys DB/storage/matfile.mv.db failed at 41526190657 (length 45768523776), read 384, remaining 0 [2.1.214/1]
at email.com.gmail.wburzyns.org.h2.mvstore.DataUtils.newMVStoreException(DataUtils.java:1004)
at email.com.gmail.wburzyns.org.h2.mvstore.DataUtils.readFully(DataUtils.java:470)
at email.com.gmail.wburzyns.org.h2.mvstore.FileStore.readFully(FileStore.java:98)
at email.com.gmail.wburzyns.org.h2.mvstore.Chunk.readBufferForPage(Chunk.java:422)
at email.com.gmail.wburzyns.org.h2.mvstore.MVStore.readPage(MVStore.java:2569)
at email.com.gmail.wburzyns.org.h2.mvstore.MVMap.readPage(MVMap.java:633)
at email.com.gmail.wburzyns.org.h2.mvstore.Page$NonLeaf.getChildPage(Page.java:1125)
at email.com.gmail.wburzyns.org.h2.mvstore.CursorPos.traverseDown(CursorPos.java:61)
at email.com.gmail.wburzyns.org.h2.mvstore.MVMap.operate(MVMap.java:1770)
at email.com.gmail.wburzyns.org.h2.mvstore.MVMap.remove(MVMap.java:518)
at email.com.gmail.wburzyns.org.h2.mvstore.StreamStore.remove(StreamStore.java:301)
at email.com.gmail.wburzyns.org.h2.mvstore.db.LobStorageMap.doRemoveLob(LobStorageMap.java:493)
at email.com.gmail.wburzyns.org.h2.mvstore.db.LobStorageMap.cleanup(LobStorageMap.java:443)
at email.com.gmail.wburzyns.org.h2.mvstore.db.LobStorageMap.lambda$null$0(LobStorageMap.java:122)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:740)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:718)
at email.com.gmail.wburzyns.org.h2.store.fs.split.FileSplit.read(FileSplit.java:63)
at email.com.gmail.wburzyns.org.h2.mvstore.DataUtils.readFully(DataUtils.java:456)
... 15 more
So it looks like a race condition is still lurking in the code. This is 100% reproducible but does not happen if the number of LOBs removed is greatly reduced. BTW, the removal goes fine as long as the DB is still open. This happens only on close.