N+1 query problem on api/sync #1402
Comments
You're welcome to investigate if there's a clean way to reduce the number of queries, but otherwise it's a relatively niche use case that I don't think the main contributors are willing to spend much time on.
Yes, sure, I'll try to investigate this on my own! I just wanted to flag this problem so people who are used to coding in Rust can help me find a cleaner way of doing those requests!
I also encountered the same problem. Could we consider introducing a caching mechanism?
Feel free to propose something, but I suspect that there won't be a clean solution to this that isn't effectively a rewrite of the project.
Also, keep in mind that some people are running multiple vaultwarden instances with one database, which makes internal caching a bit of a pain. You would then need to reach for tools like Redis or Memcached, which I doubt would be beneficial to the project.
I'm not sure if I'm seeing this effect too, or if it's unrelated... I've just migrated from the official Bitwarden server to Vaultwarden (docker, v1.23.0, SQLite database). I have around 400 items in my vault, and the syncs are noticeably slower than before... I will try to dig further when I get the chance.
Keep in mind that it also matters what kind of hardware it is running on, what the network connection is like, etc. But any help with improving speed in any way would be great.
Ah, apologies, these are both self-hosted: Bitwarden in docker in a VM, Vaultwarden on native docker (no VM), both on the same physical hardware.
Apologies for my noise here; it seems that the performance issue I was seeing is fully addressed by moving away from SQLite to MariaDB (not totally unexpected).
I've investigated this a bit with some test data (200 collections, 600 ciphers). The rough estimate: between 1062 and 2122 SQL queries to prepare the full JSON blob, and that's not counting the queries for Folders, Policies, and Sends. The sync request takes about 6 seconds to load.
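To make the shape of the problem concrete, here is a self-contained toy sketch (no real database; the table names and the three follow-up queries per cipher are assumptions for illustration, not vaultwarden's actual code) of how per-row lookups multiply:

```rust
use std::cell::Cell;

// Fake "connection" that only counts round trips.
struct Db {
    queries: Cell<u32>,
}

impl Db {
    fn query(&self, _sql: &str) {
        self.queries.set(self.queries.get() + 1);
    }
}

fn main() {
    let db = Db { queries: Cell::new(0) };
    db.query("SELECT * FROM ciphers WHERE user_uuid = ?"); // one query for the list
    for _cipher in 0..600 {
        // Serializing each cipher to JSON triggers its own lookups:
        db.query("SELECT * FROM attachments WHERE cipher_uuid = ?");
        db.query("SELECT * FROM folders_ciphers WHERE cipher_uuid = ?");
        db.query("SELECT * FROM users_collections WHERE ...");
    }
    println!("{} queries for a single sync", db.queries.get()); // prints 1801
}
```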
@raphaelcoutu, @BlackDex: 30-50 ms of latency for a database connection is not a good thing in any case; usually you would want to keep it below 10 ms. However, as @bendem discovered, doing 1-2k SQL queries for a single API call is not ideal either, and will definitely cause issues on larger collections no matter the hardware or connectivity. This is definitely not a niche issue, unless the majority of users have fewer than 50 passwords. It can be avoided by making use of JOINs.
I can take a better look into the queries, but JOINs aren't that easy with all the different tables, Rust structs, etc.
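For context, the basic mechanics of a join in Diesel 1.4 look like this. The sketch below uses a made-up two-table schema, not vaultwarden's real one, and glosses over exactly the hard part (mapping the joined rows back onto the existing structs across all the tables involved):

```rust
#[macro_use]
extern crate diesel;

use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;

// Hypothetical schema, for illustration only.
table! {
    ciphers (uuid) {
        uuid -> Text,
        name -> Text,
    }
}

table! {
    folders_ciphers (cipher_uuid) {
        cipher_uuid -> Text,
        folder_uuid -> Text,
    }
}

joinable!(folders_ciphers -> ciphers (cipher_uuid));
allow_tables_to_appear_in_same_query!(ciphers, folders_ciphers);

// One LEFT JOIN replaces the per-cipher "which folder is this in?" query.
fn load_ciphers_with_folders(
    conn: &SqliteConnection,
) -> QueryResult<Vec<(String, Option<String>)>> {
    ciphers::table
        .left_join(folders_ciphers::table)
        .select((ciphers::uuid, folders_ciphers::folder_uuid.nullable()))
        .load(conn)
}
```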
We indeed need an optimised code path for that route. The danger lies in duplicating all the (already quite complex) code that ensures users only see what they are authorised to see. That duplication would mean there is a chance that part of the code gets updated and the optimised query does not (or the other way around).
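One way to contain that risk (a minimal sketch with hypothetical types, not a proposal for the actual code) is to keep a single function as the source of truth for visibility and have both the per-item path and the bulk sync path call it:

```rust
struct Cipher {
    uuid: String,
    owner: String,
}

// The one place that encodes the visibility rule; both code paths reuse it,
// so an update here cannot go out of sync with an "optimised" copy.
fn visible_to<'a>(user: &'a str, all: &'a [Cipher]) -> impl Iterator<Item = &'a Cipher> + 'a {
    all.iter().filter(move |c| c.owner == user)
}

fn main() {
    let all = vec![
        Cipher { uuid: "a".into(), owner: "alice".into() },
        Cipher { uuid: "b".into(), owner: "bob".into() },
    ];
    let synced: Vec<_> = visible_to("alice", &all).collect(); // bulk sync path
    let one = visible_to("alice", &all).find(|c| c.uuid == "a"); // single-item path
    println!("sync: {} item(s), single lookup found: {}", synced.len(), one.is_some());
}
```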
I just tried looking into this, but I can't for the life of me figure out how to add diesel_logger to the connection types.
It's relatively simple, actually. Also, do not forget to set the log_level to DEBUG, or else you will not see the output.

```diff
diff --git a/Cargo.toml b/Cargo.toml
index 46a7ca0..0432bf5 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -65,6 +65,7 @@ fern = { version = "0.6.0", features = ["syslog-4"] }
# A safe, extensible ORM and Query builder
diesel = { version = "1.4.8", features = [ "chrono", "r2d2"] }
diesel_migrations = "1.4.0"
+diesel_logger = "0.1.1"
# Bundled SQLite
libsqlite3-sys = { version = "0.22.2", features = ["bundled"], optional = true }
diff --git a/src/db/mod.rs b/src/db/mod.rs
index bcbb7ce..586f4f5 100644
--- a/src/db/mod.rs
+++ b/src/db/mod.rs
@@ -72,9 +72,9 @@ macro_rules! generate_connections {
}
generate_connections! {
- sqlite: diesel::sqlite::SqliteConnection,
- mysql: diesel::mysql::MysqlConnection,
- postgresql: diesel::pg::PgConnection
+ sqlite: diesel_logger::LoggingConnection<diesel::sqlite::SqliteConnection>,
+ mysql: diesel_logger::LoggingConnection<diesel::mysql::MysqlConnection>,
+ postgresql: diesel_logger::LoggingConnection<diesel::pg::PgConnection>
}
impl DbConnType {
```
Thanks, I'm sure it's easy if you know how macros work; that's why I asked :)
Improved sync speed by resolving the N+1 query issues. Solves dani-garcia#1402 and solves dani-garcia#1453. With this change there is just one query done to retrieve all the important data, and matching is done in-code/in-memory. With a very large database, the sync time went down about 3x. Also updated misc crates and GitHub Actions versions.
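In other words (a simplified sketch with made-up types, not the PR's actual code): load each table once, then index the child rows by cipher uuid in memory, so serialization never goes back to the database:

```rust
use std::collections::HashMap;

struct Attachment {
    cipher_uuid: String,
    file_name: String,
}

fn main() {
    // Imagine this Vec came from a single "SELECT ... WHERE user_uuid = ?".
    let attachments = vec![
        Attachment { cipher_uuid: "c1".into(), file_name: "a.txt".into() },
        Attachment { cipher_uuid: "c1".into(), file_name: "b.txt".into() },
        Attachment { cipher_uuid: "c2".into(), file_name: "c.txt".into() },
    ];

    // One pass to group by cipher; later to_json() calls hit the map, not the DB.
    let mut by_cipher: HashMap<&str, Vec<&Attachment>> = HashMap::new();
    for a in &attachments {
        by_cipher.entry(&a.cipher_uuid).or_default().push(a);
    }

    let names: Vec<&str> = by_cipher["c1"].iter().map(|a| a.file_name.as_str()).collect();
    println!("c1 attachments: {:?}", names);
}
```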
Solved via #2429
@BlackDex Amazing work! The user experience has improved so much!
Very nice, can't wait for the next release!
Subject of the issue
Syncing/loading ciphers against a remote PostgreSQL database can be really slow (~50-60 seconds) due to a very large number of SQL queries. The problem doesn't seem to affect users who are on SQLite or docker/local SQL databases, where round trips have no or minimal latency. The browser extension could simply fail to sync.
Your environment
I have ~300 ciphers in my Bitwarden vault. My remote PostgreSQL server is about 30-50 ms away from my local docker Bitwarden_rs setup (for testing purposes).
Steps to reproduce
I used different setups to identify the problem. I have no problem with local SQLite or a docker PostgreSQL.
I finally found the problem using a PostgreSQL logging configuration in postgresql.conf. A lot of queries were logged for a single sync (2400 log lines, so I estimate about 1000 queries). At 30-50 ms of latency per round trip, ~1000 sequential queries alone account for 30-50 seconds, which matches the slow syncs observed.
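(The exact logging configuration used isn't shown here; the standard postgresql.conf settings for surfacing every statement look like this:)

```
# postgresql.conf: log every statement, plus its duration
log_statement = 'all'
log_min_duration_statement = 0   # 0 = log the duration of all statements
```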
Docker-compose used:
Expected behaviour
It should load faster (up to ~5-10 seconds would be acceptable).
Actual behaviour
Really slow (~1 min).
Relevant code blocks
(I don't usually code in Rust, but I found the place where we might start looking.) Is there a way to use query joins instead of doing many requests?
https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/api/core/ciphers.rs#L86-L106
https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L60-L67
https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L227-L247
etc.
https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/cipher.rs#L82-L85