Refactor SQLite usage #100
Comments
I'm not sure what we're thinking for this issue. Some ideas:
Which of these things make sense to do as part of this issue? (Are there things I didn't list that we should do?)
I think this issue came from a concern that writing to the same SQLite file from two separate threads could lead to corruption, because I have seen conflicting information around the interwebs about multi-threaded write support in SQLite. Because we perform large imports (e.g. 500 MB+ mbtiles files) in a separate thread, and those take some time, it's quite likely that a write could occur in the main thread while an import is running. I'm just not sure how SQLite handles that, specifically the build of SQLite that we are using with better-sqlite3. It may just be a case of making sure we correctly handle SQLITE_BUSY errors, and the risk is not corruption after all. I found this reference on StackOverflow: https://stackoverflow.com/questions/1680249/how-to-use-sqlite-in-a-multi-threaded-application. I can't find the reference I originally read about not writing to SQLite from two separate threads, so maybe that is not an issue? If we need to, we can move all SQLite operations to a single thread; this would be relatively simple to do in Drizzle with a custom proxy driver. But maybe I am over-complicating this, and the way we use SQLite currently is just fine.
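To make the "correctly handle SQLITE_BUSY errors" idea concrete, here is a minimal sketch of a synchronous retry wrapper. The names `isBusyError`, `withBusyRetry`, `maxAttempts`, and `onRetry` are hypothetical and not part of better-sqlite3 or this repo. It assumes the thrown error carries a string `code` property equal to `"SQLITE_BUSY"`, which should be checked against the better-sqlite3 version in use.

```typescript
// Hypothetical helper, not an API of better-sqlite3 or this project.
// Assumption: a busy database surfaces as a thrown error whose `code`
// property is the string "SQLITE_BUSY".
function isBusyError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "SQLITE_BUSY"
  );
}

// Runs `operation` and retries it while it fails with SQLITE_BUSY,
// up to `maxAttempts` total attempts. Any other error, or a busy error
// on the final attempt, is rethrown unchanged.
function withBusyRetry<T>(
  operation: () => T,
  maxAttempts: number = 5,
  onRetry: (attempt: number) => void = () => {}
): T {
  for (let attempt = 1; ; attempt++) {
    try {
      return operation();
    } catch (err) {
      if (!isBusyError(err) || attempt >= maxAttempts) {
        throw err;
      }
      // Hook for logging or a short back-off before the next attempt.
      onRetry(attempt);
    }
  }
}
```

A write on the main thread could then be wrapped as `withBusyRetry(() => statement.run(params))`, where `statement` stands in for whatever prepared statement is being executed. This only covers contention between connections; it says nothing about whether corruption is possible, which is the question the rest of this thread settles.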
EDIT: Much of this comment is wrong. See this response to my issue on better-sqlite3. SQLite can be safely used across multiple processes. The story is more complicated when using multiple threads, like in our case. SQLite has three threading modes, and only one of them is fully safe for multi-threaded access. Unfortunately, this safe option is not the default option in better-sqlite3. I think our options are as follows, assuming the consumer of this module passes in a database instance:
I don't think moving this repo's connection to a worker thread will help because it has the same problem: multi-threaded database access. It might not be multi-threaded in this repo, but any consumer will need to be super careful. We could simplify these problems by letting the map server manage its own database file. In that world, all of these options are much easier because we don't have to worry about the consumer's use of SQLite. If we keep the API such that the consumer provides the database, then I think option 3 is my favorite. If we let this module manage its own database, I think option 1 is my favorite (compile SQLite in thread-safe mode).
@EvanHahn Couple of thoughts based on your proposals:
I think this makes sense to do in general. Looking back on it, there hasn't been any noticeable benefit to having the consumer create the database. Let's say we do this, regardless of the options you listed to address concurrent write issues:
Done in #119.
I glanced at this and I don't think it's an issue, but good to call out—I wasn't sure things were okay there.
A lot of my comment was wrong (see WiseLibs/better-sqlite3#1138 (comment)). There's no safety difference between a separate process and a worker thread, unless we deliberately change SQLite's multi-threading mode (I see no reason to do this).
Once #119 merges, I don't think there's anything else we need to do here. We may want to file a new issue about I'm going to close this issue, but let me know if that's wrong and I'll reopen!
Refactor to avoid corruption in the database