
N+1 query problem on api/sync #1402

Closed

raphaelcoutu opened this issue Feb 18, 2021 · 19 comments
Labels
enhancement (New feature or request), low priority (Won't fix anytime soon, but will accept PR if provided)

Comments

@raphaelcoutu

Subject of the issue

Syncing/loading ciphers with a remote PostgreSQL database can be really slow (~50-60 seconds) because of the very large number of SQL queries issued. The problem doesn't seem to affect users on SQLite or a Docker/local SQL database, since their round-trip latency is zero or minimal. The browser extension could simply fail to sync.

Your environment

  • Install method: docker image
  • Clients used: Web or Browser extension
  • Bitwarden_rs version: v1.19.0
  • Web-vault version: v2.18.1
  • Running within Docker: true
  • Internet access: true
  • Uses a proxy: false
  • DNS Check: true
  • Time Check: true
  • Domain Configuration Check: true
  • HTTPS Check: false
  • Database type: PostgreSQL
  • Version of mysql/postgresql: 12.6

I have about 300 ciphers in my Bitwarden vault. My remote PostgreSQL server is about 30-50 ms away from my local Docker Bitwarden_rs setup (for testing purposes).

Steps to reproduce

  1. Creating a docker container with bitwarden_rs
  2. Importing my Bitwarden vault (from the official self-hosted server) into bitwarden_rs
  3. Loading vault home page

I used different setups to identify the problem; there is no issue with local SQLite or a Docker-hosted PostgreSQL.

I finally found the problem using the PostgreSQL logging configuration in postgresql.conf. A single sync produced a huge number of logged statements (2400 lines, so I estimate about 1000 queries):

log_statement = 'all'
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
logging_collector = on

Docker-compose used:

version: '3'
services:
  # postgres:
  #   image: postgres:12
  #   restart: always
  #   volumes:
  #     - pgdata:/var/lib/postgresql/data
  #   environment:
  #     - POSTGRES_PASSWORD=admin
  #     - TZ=America/Montreal
  bitwarden:
    image: bitwardenrs/server
    restart: always
    volumes:
      - ./bw-data:/data
    ports:
      - 8080:80
      #- 3012:3012
    environment:
      - DOMAIN=http://localhost:8080
      - DATABASE_URL=postgresql://bitwardenrs:<<password>>@<<host>>:5432/bitwardenrs
      # (testing local) - DATABASE_URL=postgresql://postgres:admin@postgres
      - ENABLE_DB_WAL=false
      - SIGNUPS_ALLOWED=true
      #- WEBSOCKET_ENABLED=true
      - ADMIN_TOKEN=<<token>>
      - LOG_FILE=/data/bitwarden.log
      - LOG_LEVEL=warn
      - DATABASE_MAX_CONNS=25
volumes:
  pgdata:

Expected behaviour

Should load faster (up to 5-10 seconds would be acceptable).

Actual behaviour

Really slow (~1 minute).

Relevant code blocks

(I don't usually code in Rust, but I found the place where we might start looking.) Is there a way to use query joins instead of issuing many separate requests? See the sketch after the links below.

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/api/core/ciphers.rs#L86-L106

  • On api/core/ciphers.rs L95, each collection would generate a query for details.

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L60-L67

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L227-L247

  • On api/core/ciphers.rs L104, each cipher generates a query for its attachments, and so on.

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/cipher.rs#L82-L85
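
For illustration, here is a minimal sketch of how the per-cipher attachment lookup could be batched into a single query with Diesel's eq_any (Diesel 1.x style, as the project uses). The attachments table, its columns and the Attachment struct below are hypothetical stand-ins, not the actual vaultwarden schema or API:

#[macro_use]
extern crate diesel;

use std::collections::HashMap;
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;

// Hypothetical table definition for the sketch.
table! {
    attachments (id) {
        id -> Text,
        cipher_uuid -> Text,
        file_name -> Text,
    }
}

#[derive(Queryable)]
struct Attachment {
    id: String,
    cipher_uuid: String,
    file_name: String,
}

// One SELECT ... WHERE cipher_uuid IN (...) instead of one query per cipher,
// then group the rows in memory for cheap per-cipher lookups while building the JSON.
fn attachments_for_ciphers(
    conn: &SqliteConnection,
    cipher_uuids: &[String],
) -> QueryResult<HashMap<String, Vec<Attachment>>> {
    let rows: Vec<Attachment> = attachments::table
        .filter(attachments::cipher_uuid.eq_any(cipher_uuids))
        .load(conn)?;

    let mut by_cipher: HashMap<String, Vec<Attachment>> = HashMap::new();
    for a in rows {
        by_cipher.entry(a.cipher_uuid.clone()).or_default().push(a);
    }
    Ok(by_cipher)
}

The same pattern (one IN-list query, then in-memory grouping) could presumably also cover the per-collection access checks.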

@jjlin
Contributor

jjlin commented Feb 18, 2021

You're welcome to investigate if there's a clean way to reduce the number of queries, but otherwise it's a relatively niche use case that I don't think the main contributors are willing to spend much time on.

@raphaelcoutu
Author

Yes, sure, I'll try to investigate this on my own! I just wanted to flag the problem so that people who are used to coding in Rust can help me find a cleaner way of doing these requests!

BlackDex added the enhancement (New feature or request) and low priority (Won't fix anytime soon, but will accept PR if provided) labels on Feb 25, 2021
@dislazy

dislazy commented May 18, 2021

I also encountered the same problem. Could a caching mechanism be introduced?

@jjlin
Contributor

jjlin commented May 18, 2021

Feel free to propose something, but I suspect that there won't be a clean solution to this that isn't effectively a rewrite of the project.

@BlackDex
Collaborator

Also, keep in mind that some people run multiple Vaultwarden instances against a single database, which makes caching internally a bit of a pain. You would then need to reach for tools like Redis or memcached, which I doubt would be beneficial to the project.

@attie

attie commented Nov 3, 2021

I'm not sure if I'm seeing this effect too, or if it's unrelated... I've just migrated from the official Bitwarden server to Vaultwarden (docker, v1.23.0, SQLite database).

I have around 400 items in my vault, and the syncs are noticeably slower than before...

I will try to dig further when I get the chance.

@BlackDex
Collaborator

BlackDex commented Nov 3, 2021

Keep in mind that it also depends on the hardware it's running on, the network connection, etc.
The Bitwarden Cloud server probably runs on multiple machines, and if not, it is probably sized to handle a lot of load. Its network connection is probably large enough too.

But any help with improving speed in any way would be great.
The best thing to check is the server's response time, for example via the browser's developer console.
And to be fair, always test in a clean incognito/private window so that no previously cached data skews the results.

@attie

attie commented Nov 3, 2021

Ah, apologies, these are both self-hosted: Bitwarden in Docker inside a VM, Vaultwarden in native Docker (no VM), both on the same physical hardware.

@attie

attie commented Nov 9, 2021

Apologies for the noise here; it seems the performance issue I was seeing is fully addressed by moving away from SQLite to MariaDB (not totally unexpected).

More info

Test setup:

  • NUC (i7 8559U), 32GB RAM, NVMe local storage, NFS remote storage (Gigabit network, ZFS)
  • ~400 items in my vault / 207KiB compact JSON
  • Querying /api/ciphers, measuring TTFB (~1.15 MiB payload)

Tests:

  • Vaultwarden, Docker, SQLite, remote storage, ~3.65s
  • Vaultwarden, Docker, local MariaDB service, local storage, ~200ms
  • Vaultwarden, Docker, local MariaDB service, remote storage, ~225ms

@bendem

bendem commented Dec 2, 2021

I've investigated this a bit with some test data (200 collections, 600 ciphers), here is a rough estimate:

  • collection query: 1
    • to_json_details
      • is_writable_by_user: 200
        • find_by_user_and_org: 0-200
      • hide_passwords_for_user: 200
        • find_by_user_and_org: 0-200
  • cipher query: 1
    • to_json:
      • Attachment::find_by_cipher: 600
      • get_access_restrictions: 0-600
        • is_in_full_access_org: 0-600
      • get_collections: 600

That's between 1062 and 2122 SQL queries to prepare the full JSON blob, and that's not counting the queries for Folders, Policies and Sends. The sync request takes about 6 s to complete.

@lifeofguenter

lifeofguenter commented Jan 7, 2022

@raphaelcoutu, @BlackDex: 30-50 ms of latency to the database is not good in any case; usually you would want to keep it below 10 ms. However, as @bendem discovered, issuing 1-2k SQL queries for a single API call isn't ideal either, and it will definitely cause issues on larger collections no matter the hardware or connectivity. This is definitely not a niche issue, unless the majority of users have fewer than 50 passwords.

This can be avoided by using JOINs, SELECT ... IN (1, 2, 3, ...), or even just a SELECT by some parent id.
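
As a rough illustration of the JOIN variant, here is a minimal sketch using the Diesel 1.x API the project is on; the tables and columns (ciphers, attachments, user_uuid, ...) are hypothetical simplifications, not vaultwarden's actual schema:

#[macro_use]
extern crate diesel;

use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;

table! {
    ciphers (uuid) {
        uuid -> Text,
        user_uuid -> Text,
    }
}

table! {
    attachments (id) {
        id -> Text,
        cipher_uuid -> Text,
        file_name -> Text,
    }
}

joinable!(attachments -> ciphers (cipher_uuid));
allow_tables_to_appear_in_same_query!(ciphers, attachments);

// One LEFT JOIN returns every cipher together with its attachments (if any)
// in a single round trip, instead of one attachment query per cipher.
fn ciphers_with_attachments(
    conn: &SqliteConnection,
    user_uuid: &str,
) -> QueryResult<Vec<(String, Option<String>)>> {
    ciphers::table
        .left_join(attachments::table)
        .filter(ciphers::user_uuid.eq(user_uuid))
        .select((ciphers::uuid, attachments::file_name.nullable()))
        .load(conn)
}

Either way, the idea is the same: trade many small round trips for one larger result set that can be grouped in memory.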

@BlackDex
Collaborator

BlackDex commented Jan 7, 2022

I can take a closer look at the queries, but JOINs aren't that easy with all the different tables, Rust structs, etc.

@bendem

bendem commented Feb 1, 2022

We indeed need an optimised code path for that route. The danger lies in duplicating all the (already quite complex) code that ensures users only see what they are authorised to see. That duplication would mean there is a chance that part of the code gets updated and the optimised query does not (or the other way around).

@bendem

bendem commented Feb 1, 2022

I just tried looking into this, but I can't for the life of me figure out how to add diesel_logger. Where does DbConn get turned into an actual connection? It's just macros over macros, and I'm not a Rust dev.

@BlackDex
Collaborator

BlackDex commented Feb 1, 2022

It's relatively simple actually; see the diff below. I didn't have time myself to check out how to optimize this, unfortunately.

Also, do not forget to set the log level to DEBUG, or you will not see the output.

diff --git a/Cargo.toml b/Cargo.toml
index 46a7ca0..0432bf5 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -65,6 +65,7 @@ fern = { version = "0.6.0", features = ["syslog-4"] }
 # A safe, extensible ORM and Query builder
 diesel = { version = "1.4.8", features = [ "chrono", "r2d2"] }
 diesel_migrations = "1.4.0"
+diesel_logger = "0.1.1"
 
 # Bundled SQLite
 libsqlite3-sys = { version = "0.22.2", features = ["bundled"], optional = true }
diff --git a/src/db/mod.rs b/src/db/mod.rs
index bcbb7ce..586f4f5 100644
--- a/src/db/mod.rs
+++ b/src/db/mod.rs
@@ -72,9 +72,9 @@ macro_rules! generate_connections {
 }
 
 generate_connections! {
-    sqlite: diesel::sqlite::SqliteConnection,
-    mysql: diesel::mysql::MysqlConnection,
-    postgresql: diesel::pg::PgConnection
+    sqlite: diesel_logger::LoggingConnection<diesel::sqlite::SqliteConnection>,
+    mysql: diesel_logger::LoggingConnection<diesel::mysql::MysqlConnection>,
+    postgresql: diesel_logger::LoggingConnection<diesel::pg::PgConnection>
 }
 
 impl DbConnType {

@bendem

bendem commented Feb 8, 2022

Thanks, I'm sure it's easy if you know how macros work, that's why I asked :)

BlackDex added a commit to BlackDex/vaultwarden that referenced this issue May 4, 2022
Improved sync speed by resolving the N+1 query issues.
Solves dani-garcia#1402 and Solves dani-garcia#1453

With this change there is just one query done to retrieve all the
important data, and matching is done in-code/memory.

With a very large database the sync time went down about 3 times.

Also updated misc crates and GitHub Actions versions.
BlackDex added a commit to BlackDex/vaultwarden that referenced this issue May 6, 2022
@BlackDex
Collaborator

Solved via #2429

@jtsang4

jtsang4 commented May 12, 2022

@BlackDex Amazing work! The user experience is improved so much!

@bendem

bendem commented May 12, 2022

Very nice, can't wait for the next release!
