2 changes: 1 addition & 1 deletion README.md
@@ -137,6 +137,7 @@ While the main benchmark uses a specific machine configuration for reproducibili
- [x] VictoriaLogs
- [x] SingleStore
- [x] GreptimeDB
- [x] FerretDB
- [ ] Quickwit
- [ ] Meilisearch
- [ ] Sneller
@@ -146,7 +147,6 @@ While the main benchmark uses a specific machine configuration for reproducibili
- [ ] OpenText Vertica
- [ ] PartiQL
- [ ] FishStore
- [ ] FerretDB
- [ ] Apache Drill
- [ ] GlareDB

1 change: 1 addition & 0 deletions ferretdb/benchmark.sh
1 change: 1 addition & 0 deletions ferretdb/count.sh
1 change: 1 addition & 0 deletions ferretdb/create_and_load.sh
1 change: 1 addition & 0 deletions ferretdb/data_size.sh
1 change: 1 addition & 0 deletions ferretdb/ddl_snappy.js
1 change: 1 addition & 0 deletions ferretdb/ddl_zstd.js
1 change: 1 addition & 0 deletions ferretdb/drop_table.sh
1 change: 1 addition & 0 deletions ferretdb/index_size.sh
47 changes: 47 additions & 0 deletions ferretdb/index_usage.sh
@@ -0,0 +1,47 @@
#!/bin/bash

# If you change something in this file, please also change mongodb/index_usage.sh

# Check if the required arguments are provided
if [[ $# -lt 1 ]]; then
echo "Usage: $0 <DB_NAME>"
exit 1
fi

# Arguments
DB_NAME="$1"

QUERY_NUM=1

# File containing MongoDB queries (replace 'queries.js' with your file)
QUERY_FILE="queries.js"

# Check if the query file exists
if [[ ! -f "$QUERY_FILE" ]]; then
echo "Error: Query file '$QUERY_FILE' does not exist."
exit 1
fi

cat "$QUERY_FILE" | while read -r query; do

# Print the query number
echo "------------------------------------------------------------------------------------------------------------------------"
echo "Index usage for query Q$QUERY_NUM:"
echo

# Modify the query to include the explain option inside the aggregate call
MODIFIED_QUERY=$(echo "$query" | sed 's/]);$/], { explain: "queryPlanner" });/')

# Escape the modified query for safe passing to mongosh
ESCAPED_QUERY=$(echo "$MODIFIED_QUERY" | sed 's/\([\"\\]\)/\\\1/g' | sed 's/\$/\\$/g')

# The query planner output from PostgreSQL differs from MongoDB's, so the entire JSON is printed here.
mongosh --quiet --eval "
const db = db.getSiblingDB('$DB_NAME');
const result = eval(\"$ESCAPED_QUERY\");
printjson(result);
"

# Increment the query number
QUERY_NUM=$((QUERY_NUM + 1))
done;
30 changes: 30 additions & 0 deletions ferretdb/install.sh
@@ -0,0 +1,30 @@
#!/bin/bash

sudo snap install docker

sudo apt-get install -y gnupg curl
Member

Question: Why do we install MongoDB here? Is this a leftover from the corresponding MongoDB benchmark scripts? I am asking because 1. the uninstall script doesn't remove MongoDB (which supports the theory it's a leftover) and 2. all other scripts in this PR call mongosh. I have the suspicion that we are really benchmarking MongoDB and not FerretDB.

Member

Oh, I see that this PR does not start the MongoDB daemon (unlike the MongoDB scripts).

Is there any way to check we are really running queries against FerretDB?

Member

Since FerretDB is supposed to be a drop-in replacement for MongoDB and since almost all of the scripts are 100% copy-paste of the MongoDB scripts, it would be nice if you could delete them and create symbolic links to the MongoDB scripts instead.
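
A minimal sketch of the symlink approach suggested here, assuming `mongodb/` and `ferretdb/` are sibling directories under one repository root. The function name `link_shared_scripts` is hypothetical, and the script names passed to it are up to the caller:

```shell
#!/bin/bash
# Sketch: replace copied FerretDB scripts with symlinks to the MongoDB ones.
# Assumption: <root>/mongodb and <root>/ferretdb are sibling directories.

link_shared_scripts() {
    root="$1"
    shift
    mkdir -p "$root/ferretdb"
    for name in "$@"; do
        # Skip names that have no MongoDB counterpart to link to.
        if [ ! -e "$root/mongodb/$name" ]; then
            echo "skip: mongodb/$name not found" >&2
            continue
        fi
        # Use a relative target so the link still resolves after the
        # repository is cloned or moved. -f replaces an existing copy.
        ln -sf "../mongodb/$name" "$root/ferretdb/$name"
    done
}

# Usage (run from the repository root):
# link_shared_scripts . benchmark.sh count.sh create_and_load.sh
```

Scripts that genuinely differ between the two systems, such as `install.sh`, would stay as regular files.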

Contributor Author

I installed MongoDB because the benchmark scripts use mongosh and some other MongoDB-specific imports that require it. I do not actually start a MongoDB container; I only installed it so that mongosh is available to communicate with the FerretDB Docker container. Good call, I will add the uninstallation to uninstall.sh as well.

> Is there any way to check we are really running queries against FerretDB?

The FerretDB Docker container is mapped to MongoDB's default port, 27017.
One way to make sure that queries are not running on MongoDB is to look at the query planner output in the index_usage logs. It differs from MongoDB's because FerretDB uses PostgreSQL, so the output contains cost models.

While FerretDB is supposed to be a 100% drop-in replacement, a few configurations are not yet supported. For example, enabling covering indexes on FerretDB always fails. For this reason, I have enabled index-only scans on the PostgreSQL container.
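
For context, the planner output mentioned above comes from index_usage.sh, which rewrites each aggregate call so that it returns an explain plan instead of results. Here is a minimal sketch of that rewrite, using the same sed expression as the script. The function name `add_explain`, the collection name `bluesky`, and the sample pipeline are placeholders, not taken from queries.js:

```shell
#!/bin/bash
# Sketch of the explain rewrite performed by index_usage.sh.

add_explain() {
    # Turn `...aggregate([ ... ]);` into
    # `...aggregate([ ... ], { explain: "queryPlanner" });`.
    # Lines that do not end in "]);" pass through unchanged.
    printf '%s\n' "$1" | sed 's/]);$/], { explain: "queryPlanner" });/'
}

# Hypothetical query line; the real lines are read from queries.js.
add_explain 'db.bluesky.aggregate([ { $group: { _id: "$kind", count: { $sum: 1 } } } ]);'
```

Because the pattern is anchored at the end of the line, the rewrite only works when each query occupies a single line that ends in `]);`.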

Contributor Author

@rschu1ze I ran the benchmark again, and here are the logs from the postgres container.

PostgreSQL init process complete; ready for start up.

2025-03-23 14:46:29.397 UTC [1] LOG:  Initialized documentdb_core extension
2025-03-23 14:46:29.398 UTC [1] LOG:  Initialized pg_documentdb extension
2025-03-23 14:46:29.406 UTC [1] LOG:  starting PostgreSQL 17.4 (Debian 17.4-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
2025-03-23 14:46:29.406 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-03-23 14:46:29.406 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2025-03-23 14:46:29.412 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-03-23 14:46:29.419 UTC [78] LOG:  database system was shut down at 2025-03-23 14:46:29 UTC
2025-03-23 14:46:29.426 UTC [1] LOG:  database system is ready to accept connections
2025-03-23 14:46:29.430 UTC [81] LOG:  pg_cron scheduler started
2025-03-23 14:46:30.220 UTC [84] LOG:  database bluesky_1m_snappy has collections: false
2025-03-23 14:46:30.220 UTC [84] CONTEXT:  SQL statement "SELECT documentdb_api.create_collection($1, $2)"
2025-03-23 14:46:30.220 UTC [84] STATEMENT:  SELECT p_result::bytea, p_success FROM documentdb_api.insert($1, $2::bytea, $3::bytea)
2025-03-23 14:46:30.232 UTC [84] LOG:  Creating and returning documentdb_data.documents_1 for the sentinel database bluesky_1m_snappy
2025-03-23 14:46:30.232 UTC [84] CONTEXT:  SQL statement "SELECT documentdb_api.create_collection($1, $2)"
2025-03-23 14:46:30.232 UTC [84] STATEMENT:  SELECT p_result::bytea, p_success FROM documentdb_api.insert($1, $2::bytea, $3::bytea)
2025-03-23 14:47:00.007 UTC [81] LOG:  cron job 1 starting: CALL documentdb_api_internal.delete_expired_rows();
2025-03-23 14:47:00.027 UTC [81] LOG:  cron job 1 COMMAND completed: CALL 
2025-03-23 14:47:23.015 UTC [76] LOG:  checkpoint starting: wal
2025-03-23 14:48:00.010 UTC [81] LOG:  cron job 1 starting: CALL documentdb_api_internal.delete_expired_rows();
2025-03-23 14:48:00.026 UTC [81] LOG:  cron job 1 COMMAND completed: CALL 
2025-03-23 14:48:12.845 UTC [195] LOG:  database bluesky_1m_zstd has collections: false
2025-03-23 14:48:12.845 UTC [195] STATEMENT:  SELECT create_collection FROM documentdb_api.create_collection($1, $2)
2025-03-23 14:48:12.852 UTC [195] LOG:  Creating and returning documentdb_data.documents_3 for the sentinel database bluesky_1m_zstd
2025-03-23 14:48:12.852 UTC [195] STATEMENT:  SELECT create_collection FROM documentdb_api.create_collection($1, $2)

Just to be sure mongod wasn't running during benchmarking:

systemctl status mongod
○ mongod.service - MongoDB Database Server
     Loaded: loaded (/usr/lib/systemd/system/mongod.service; disabled; preset: enabled)
     Active: inactive (dead)
       Docs: https://docs.mongodb.org/manual
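
The manual check above could also be scripted as a guard before a benchmark run. This is a rough sketch, assuming mongod is managed by systemd under the unit name `mongod` (as in the status output above); the function name `mongod_is_stopped` is hypothetical:

```shell
#!/bin/bash
# Sketch: refuse to benchmark while a local mongod service is up.
# Assumption: mongod is managed by systemd. A mongod started by other
# means (for example by hand or in a container) is not detected.

mongod_is_stopped() {
    # `systemctl is-active` prints the unit state, e.g. "active",
    # "inactive" or "failed". If systemctl is missing, state is empty
    # and the service is treated as stopped.
    state=$(systemctl is-active mongod 2>/dev/null || true)
    [ "$state" != "active" ] && [ "$state" != "activating" ]
}

# Usage:
# mongod_is_stopped || { echo "mongod is running; stop it first" >&2; exit 1; }
```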

curl -fsSL https://www.mongodb.org/static/pgp/server-8.0.asc | \
sudo gpg --dearmor --yes -o /usr/share/keyrings/mongodb-server-8.0.gpg
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu noble/mongodb-org/8.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

# Run PostgreSQL with DocumentDB as storage extension
docker run -d --name postgres \
--platform linux/amd64 \
--restart on-failure \
-e POSTGRES_USER=username \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=postgres \
-v pgdata:/var/lib/postgresql/data \
ghcr.io/ferretdb/postgres-documentdb:17-0.102.0-ferretdb-2.0.0 \
-c enable_indexscan=on -c enable_indexonlyscan=on

# Run FerretDB
docker run -d --name ferretdb \
--restart on-failure \
--link postgres \
-p 27017:27017 \
-e FERRETDB_POSTGRESQL_URL=postgres://username:password@postgres:5432/postgres \
-e FERRETDB_AUTH=false \
ghcr.io/ferretdb/ferretdb:2.0.0
1 change: 1 addition & 0 deletions ferretdb/load_data.sh
1 change: 1 addition & 0 deletions ferretdb/main.sh
1 change: 1 addition & 0 deletions ferretdb/queries.js
1 change: 1 addition & 0 deletions ferretdb/queries_formatted.js
1 change: 1 addition & 0 deletions ferretdb/query_results.sh