-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use better-sqlite3 as sqlite database driver (or provide the option) #2825
Comments
We use the Rust driver for SQLite under the hood: https://github.com/rusqlite/rusqlite . We interface with it via our quaint library: https://github.com/prisma/quaint We can't consume any Node.js library as a driver so for this feature request to be feasiable, there should be a Rust counter part and I also assume Rust SQLite might already be faster than better-sqlite3 so that would need some investigation. Prisma client is powered by a Rust engine under the hood. You can find the source code of that here: https://github.com/prisma/prisma-engines. We are not using any node js driver under the hood. |
Also I'm rewriting the whole stack for sqlite/pg/my in the next few weeks for many reasons. The library we'll be using is SQLx, and as you said, connections are stateful and they are used by one request at a time. What you can do is you can set How the asynchronous nature of SQLite in SQLx is implemented is they use the blocking API and use Tokio's You are right that SQLite allows only one writer at a time, and using multiple connections to the same file is waste of resources. I can also guarantee to you that with Prisma the speed of accessing SQLite will not be the first bottleneck. |
@pimeys interesting, will this replace Quaint? |
Quaint will be powered by SQLx under the hood. Please follow: prisma/quaint#138 |
ah. I see now. Apologies for spreading misinformation! I had assumed the driver was node-sqlite3 after poking around the prisma/prisma repo and seeing it in the cli's package.json. I will limit my question to the SQLx rewrite then:
Cool, the first part makes sense to me, the interface is a blocking api. Do you know how this will affect the node interface? Will it remain asynchronous to stay aligned with the pg/my drivers? At what point will the access become blocking? The second part of
Is slightly confusing because I dont know if we are talking about a thread inside a rust (a tokio async block?) or a libuv thread (the rust code is executed in an async manner from nodejs) E.g. accessing a sqlite database still looks like this: const allUsers = await prisma.user.findMany() Rust is not my strong suit, so apologies if my questions are missing the point. The bottom line question is: Will there be asynchronosity when using prisma with sqlite, where will it occur, and what are its implications? |
The node interface is asynchronous due to the rust implementation using a threadpool underneath to spread the cpu heavy stuff to all the available cores in your computer. The database interfaces will be asynchronous due to it being easy to plug with the rest of the system. What happens with pg/my and soon mssql is the async interface will wait for IO and let the threads to do other things until the response is ready. With sqlite, the right thing to do is to use one connection and share it with the threads, and queue up all waiting tasks until the mutex is free using more than one connections with it will just consume more RAM and the connections will still wait for others to finish. We're talking about a few megabytes maximum here, so nothing noticeable. In the end I wouldn't do really high performance things with SQLite. It's not meant for super fast processing, more of a nice tool to work with data in memory and for testing (and Android!). It has one mutex blocking all other writes, and although you can get read-only mode in it that allows others using it in parallel, none of the popular clients currently support that. Anyhow, play with the connection limits and see how fast you can go with Prisma and SQLite. |
Got it, I do think sqlite is a very fast database (just based on the fact that there are less hops that data needs to make to write to disk than say postgres or any other network-based database). It just happens to scale poorly with a large volume of requests. Its perfect for local web apps like I am describing. I will just assume that doing things in rust will be performant and thread-safe. However, there are a few concerns surrounding backpressure. If we were to use a fully sync database driver, and sqlite were to become blocked, then requests would start to queue up at the OS level on the TCP backlog. If it were to get overwhelmed, requests would start getting back messages like ECONNRESET (or whatever the browser equivalent is). In the case of an async database, if sqlite were to get blocked, requests would queue up inside the prisma database threadpool. If it were to get overwhelmed, what would happen? Would the nodejs process crash with a heap out of memory error? I understand we are dealing with extremes here, but I do think its worth knowing what it will do in those situations. E.g. the sync database server will stop accepting connections if it is blocked and therefore cannot overflow its memory. However the async database server could potentially keep swelling while its database is blocked until its memory overflows. By the way, thanks for indulging me here. I had to learn a bit more about how nodejs works to understand what else a node server would do under stressful conditions and learning about the TCP kernel queue is was pretty neat! |
Ok, let's think about this a bit :) First underneath we have SQLite, protected with a mutex (takes about max 20 nanoseconds to lock, queries what you push are hopefully done in a few microseconds). One request at a time is allowed for access, but it's quite fast due to the database being in the same memory space. We have an RPC server here that does a hop from Node to Rust and back, which makes sense to do in an asynchronous manner. Fine. We can saturate all CPUs and we can hit a limit with the database when having a crazy amount of requests. On top of this there's Node, a single-threaded system with a dynamic JavaScript or dynamic-ish TypeScript. What will happen here is that the node service will never be able to push enough traffic to the underlying rust system. The bottleneck will for sure be node's event loop. But. If you are able to eat all the RAM with back pressure. Or if you start getting system-level errors, such as ECONNRESET, I'm more than happy to come and debug this scenario! It sounds interesting... |
Great, thanks for the info. If you are interested, all I did to generate It is probably still possible to create a scenario where the database becomes a bottleneck before the server (e.g. sufficiently large datasets being written to the database, or an old, old HDD being written to), but these are definitely edge cases. I think what you said makes the most sense here:
Im going to close this now since I have a pretty good idea that I cant get into a situation where an async driver is going to harm me. Ill be curious to see what kind of benchmarks for prisma I can produce after this rewrite (though I think until prepared statements can be reused, its likely going to be lagging behind better-sqlite3) |
Funny that you posted that reusing of prepared statements link. I'm the engineer working on that, but focusing first on MySQL. The PR for SQLx was just accepted today. The plan is to have caching for MySQL for 2.2.0, but if everything goes well, I'd say all our databases should be using SQLx in 2.2.0. Hopefully. And with caching! |
🎉 how about that. Best of luck! |
Problem
sqlite is a synchronous access database, e.g. only one connection can read from the database at a time. So the async api that sqlite3 provides actually does more harm than good. This is better explained by the creator of better-sqlite3 than I can:
I am building a performance-heavy local web app. My gut feeling when building apps like this is that handwritten sql statements will always be more performant than ORM generated statements. Given that, I feel that picking prisma for sqlite has an extra hurdle to get over, that the driver under the hood isnt as performant as it could be.
Suggested solution
If support for this driver feels worthwhile, there are two options. Either provide a
driver
option in the schema datasource:or just replace the internal sqlite3 database with better-sqlite3.
Alternatives
Additional context
If sqlite3 was chosen because it is backed by mapbox and therefore has a better chance of being long-lived, it appears that sqlite3 is also maintained by a single person: https://kewde.github.io/sqlite
If this looks interesting to the contributors, I am willing to open a pr with some guidance!
The text was updated successfully, but these errors were encountered: