Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor SQLite usage #100

Closed
ErikSin opened this issue Nov 30, 2023 · 6 comments
Closed

Refactor SQLite usage #100

ErikSin opened this issue Nov 30, 2023 · 6 comments
Assignees
Labels

Comments

@ErikSin
Copy link
Contributor

ErikSin commented Nov 30, 2023

Refactor to avoid corruption in the database

@achou11 achou11 changed the title Refactor SQLite Refactor SQLite usage Nov 30, 2023
@EvanHahn EvanHahn self-assigned this Jan 29, 2024
@EvanHahn
Copy link
Contributor

EvanHahn commented Feb 5, 2024

I'm not sure what we're thinking for this issue. Some ideas:

  • Audit our code for corruption. This probably means reading through "How To Corrupt An SQLite Database File" and making sure we aren't doing any of these things.

  • Audit for proper transaction usage. We may not be running operations in transactions correctly.

  • Audit for connection cleanups. We might not be closing the database if there are errors. For example, this could be problematic:

    const db = new Database(filePath)
    somethingThatMightThrow()
    db.close()

    Something like this is better:

    const db = new Database(filePath)
    try {
      somethingThatMightThrow()
    } finally {
      db.close()
    }

    I fixed at least one of these problems in fix: add fallback when MBTiles import lacks "name" in metadata #113, but there may be others. I don't expect this to cause corruption, but maybe?

  • Change database setup options. Some ideas:

    • Enable foreign keys, not enabled by default.
    • Set the synchronous option to something a little safer, like "FULL". I don't think this is necessary because we use WAL mode, and it's slower.
    • Set the fullfsync or checkpoint_fullfsync options for better reliability at the cost of performance. In my experience, checkpoint_fullfsync is worth it but fullfsync is not (it is significantly slower).
  • Periodic optimizations. I don't think any of these affect corruption. Some ideas:

  • Improve the experience when the database gets corrupted. We want to avoid this, but we can do some things if this happens:

  • Have a single worker responsible for SQLite interactions. I know we've discussed having a single "choke point" for the database. Admittedly, I don't understand the value of this as it relates to corruption, as SQLite typically1 allows multiple connections just fine. Maybe there's a deadlock bug we're worried about, or something else I don't understand.

Which of these things make sense to do as part of this issue? (Are there things I didn't list that we should do?)

Footnotes

  1. It wouldn't if we set the locking_mode to EXCLUSIVE, but I don't think we do that.

@gmaclennan
Copy link
Member

I think this issue came from a concern that writing to the same SQLite file from two separate threads could lead to corruption, because of conflicting information I have seen around the interwebs about multi-threaded write support in SQLite.

Because we perform the large imports in a separate thread, which take some time (e.g. 500Mb+ mbtiles files), it's quite likely that a write could occur in the main thread while that is happening. I'm just not sure how SQLite handles that, specifically the build of SQLite that we are using with better-sqlite3. It may be just a case of making sure we correctly handle SQLITE_BUSY errors, and the risk is not corruption after all. I found this reference on StackOverflow: https://stackoverflow.com/questions/1680249/how-to-use-sqlite-in-a-multi-threaded-application. I can't find the reference I originally read about not writing to SQLite from two separate threads, so maybe that is not an issue? If we need to, then we can move all SQLite operations to a single thread - this would be relatively simple to do in Drizzle with a custom proxy driver. But maybe I am over-complicating this, and the way we use SQLite currently is just fine.

@EvanHahn
Copy link
Contributor

EvanHahn commented Feb 8, 2024

EDIT: Much of this comment is wrong. See this response to my issue on better-sqlite3.

SQLite can be safely used across multiple processes.

The story is more complicated when using multiple threads, like in our case. SQLite has three threading modes, and only one of them is fully safe for multi-threaded access.

Unfortunately, this safe option is not the default option in better-sqlite-3.1

I think our options are as follows, assuming the consumer of this module passes in a database instance:

  1. Document that whoever creates the database should change to the thread-safe mode, which isn't the default in better-sqlite3.

    That punts this problem to the consumer of this module (CoMapeo? mapeo-core-next?), and probably requires making a custom SQLite build or forking better-sqlite3.

  2. better-sqlite3 uses a semi-safe threading mode. It allows multi-threaded database access as long as it isn't concurrent. We could build our own database locking system.

    Like the above, this probably needs some coordination with the consumers of this module. You could imagine the worker thread locking the map server's database connection but not CoMapeo's, which would cause problems.

  3. Stop using a worker thread and start using a completely separate process, letting us rely on SQLite's file locking mechanisms. This is probably slower and a little more complex, but should be safer.

  4. Move all database access to a single thread. This would likely block the database for a long time, but maybe there's something we can do to speed this up.

  5. Something else I'm not thinking of!

I don't think moving this repo's connection to a worker thread will help because it has the same problem: multi-threaded database access. It might not be multi-threaded in this repo, but any consumer will need to be super careful.

We could simplify these problems by letting the map server manage its own database file, which would make a lot of this stuff simpler. In that world, all of these options are much easier because we don't have to worry about the consumer's use of SQLite.

If we keep the API such that the consumer provides the database, then I think option 3 is my favorite. If we let this module manage its own database, I think option 1 is my favorite (compile SQLite in thread-safe mode).

Footnotes

  1. This might be intentional but it seems wrong to me; I've opened an issue.

@ximenabb ximenabb added the MVP label Feb 8, 2024
@achou11
Copy link
Member

achou11 commented Feb 8, 2024

@EvanHahn Couple of thoughts based on your proposals:

We could simplify these problems by letting the map server manage its own database file, which would make a lot of this stuff simpler. In that world, all of these options are much easier because we don't have to worry about the consumer's use of SQLite.

I think this makes sense to do in general. Looking back on it, there hasn't been any noticeable benefit to having the consumer create the database. Let's say we do this, regardless of the options you listed to address concurrent write issues:

  • I want to note that for the mobile app, we have to create native builds of better-sqlite3 in order for it to work on Android (see https://github.com/digidem/better-sqlite3-nodejs-mobile). not sure how that plays into your proposals (most likely relevant for (1)) - maybe it's a non-issue but wanted to highlight

  • re: (3), what's the difference between using a separate process vs a worker thread? (under the assumption that whichever is chosen is the sole manager of the db connection)

EvanHahn added a commit that referenced this issue Feb 11, 2024
The core of this change:

```diff
 const mapServer = createMapServer(
   {},
-  { database: new Database('./example.db') }
+  { storagePath: './map-server-example.db' }
 )
```

See also: [#92], [#93], [this comment][comment].

[#92]: #92
[#93]: #93
[comment]: #100 (comment)
@EvanHahn
Copy link
Contributor

I think this makes sense to do in general. Looking back on it, there hasn't been any noticeable benefit to having the consumer create the database. Let's say we do this [...]

Done in #119.

I want to note that for the mobile app, we have to create native builds of better-sqlite3 in order for it to work on Android (see https://github.com/digidem/better-sqlite3-nodejs-mobile). not sure how that plays into your proposals (most likely relevant for (1)) - maybe it's a non-issue but wanted to highlight

I glanced at this and I don't think it's an issue, but good to call out—I wasn't sure things were okay there.

re: (3), what's the difference between using a separate process vs a worker thread? (under the assumption that whichever is chosen is the sole manager of the db connection)

A lot of my comment was wrong (see WiseLibs/better-sqlite3#1138 (comment)). There's no safety difference between a separate process and a worker thread, unless we deliberately change SQLite's multi-threading mode (I see no reason to do this).

@EvanHahn
Copy link
Contributor

Once #119 merges, I don't think there's anything else we need to do here. We may want to file a new issue about SQLITE_BUSY errors, but that's separate and unrelated to corruption.

I'm going to close this issue, but let me know if that's wrong and I'll reopen!

@EvanHahn EvanHahn closed this as not planned Won't fix, can't repro, duplicate, stale Feb 11, 2024
EvanHahn added a commit that referenced this issue Feb 12, 2024
The core of this change:

```diff
 const mapServer = createMapServer(
   {},
-  { database: new Database('./example.db') }
+  { storagePath: './map-server-example.db' }
 )
```

See also: [#92], [#93], [this comment][comment].

[#92]: #92
[#93]: #93
[comment]: #100 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants