
Benchmark Results: Quickwit vs. Loki

Goals

This benchmark highlights the differences and trade-offs between Quickwit and Loki for their users. Read more in our dedicated blog post.

Dataset

The dataset for this benchmark was generated by the Elastic Integration Corpus Generator Tool and is available at gs://quickwit-datasets-public/benchmarks/generated-logs/generated-logs-v1-*. We used the first 200 files, totalling 243,527,673 log lines representing 212.40 GB. The logs are structured as JSON documents, as shown in this example.
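As a rough sanity check on the dataset's scale, the average document size works out to about 870 bytes (assuming "GB" means decimal gigabytes):

```python
# Back-of-the-envelope check of the dataset figures above.
# Assumes "GB" means decimal gigabytes (10^9 bytes).
num_logs = 243_527_673
total_bytes = 212.40e9

avg_doc_size = total_bytes / num_logs
print(f"average JSON document size: {avg_doc_size:.0f} bytes")
# → average JSON document size: 872 bytes
```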

Metrics

Ingestion Metrics

We compared the following metrics during data ingestion:

  • Ingestion time
  • Total CPU time utilized
  • Size of the index created
  • Number of files stored on object storage

Query Benchmarks

We focused on query types commonly used in the Grafana Explore view:

  • Fetching the last 100 logs
  • Analyzing log volume by log level

The following table illustrates the query configurations tested:

| Query | Last 100 logs | Log volume per log level |
|---|---|---|
| Match all logs | [ ] | [x] |
| Logs with `queen` | [x] | [x] |
| Logs labeled `region: us-east-2` | [x] | [x] |
| Logs with `region: us-east-2` and `queen` | [x] | [x] |

Metrics recorded for each query include latency, CPU time, and the number of GET requests on the object storage.
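For illustration, the Loki side of these query shapes can be expressed in LogQL roughly as follows. This is a sketch, not the exact queries from the benchmark repository: the `region` label matches the Loki setup, while `service="bench"` is a hypothetical catch-all matcher (LogQL requires at least one label matcher even for a match-all query):

```python
# Illustrative LogQL for the benchmarked query shapes.
# NOTE: sketches only; the actual queries live in the benchmark repo.
# `region` matches the label used in the Loki setup; `service="bench"`
# is a hypothetical catch-all matcher.
last_100 = {
    "match_all": '{service="bench"}',
    "queen": '{service="bench"} |= "queen"',
    "us_east_2": '{region="us-east-2"}',
    "us_east_2_and_queen": '{region="us-east-2"} |= "queen"',
}

# Log volume per level over 1-minute buckets (Grafana-style volume panel).
volume_by_level = 'sum by (level) (count_over_time({service="bench"} [1m]))'
print(volume_by_level)
```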

Setup

Object storage

We used Google Cloud Storage (GCS).

Caching

All caching mechanisms within Quickwit and Loki were disabled to ensure a fair comparison.

Loki setup

Loki 2.9 was used.

Loki was configured with labels for region and log levels, creating up to 100 distinct streams. Configurations are detailed in the Loki configuration file. Vector was used to route logs to Loki.

Quickwit setup

The latest Quickwit build was used; it should be roughly equivalent to version 0.8.1.

Results

Ingestion

| Engine | Quickwit | Loki |
|---|---|---|
| Ingestion time (min) | 123 (+123%) | 55 |
| Mean vCPU | 2.2 | 2.75 |
| Total CPU time (min) | ~270 (+80%) | ~151 |
| Number of files on GCS | 25 | 145,756 (x5,829) |
| Bucket size (GiB) | 53 | 55 |
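The relative figures in the results tables (`+123%`, `x5,829`, etc.) are simple ratios against the better-performing engine. A quick sketch of how they are derived; small rounding differences against the tables are expected, since those were presumably computed from unrounded measurements:

```python
# How the relative columns are derived: percentage difference for values
# of similar magnitude, and a plain multiplier ("xN") for large gaps.
def pct_diff(value, baseline):
    """Relative difference vs. the better (smaller) value, in percent."""
    return (value - baseline) / baseline * 100

def multiplier(value, baseline):
    return value / baseline

# Ingestion time: Quickwit 123 min vs. Loki 55 min.
print(f"+{pct_diff(123, 55):.0f}%")        # ~ +124% (table shows +123%)
# Files on GCS: Loki 145,756 vs. Quickwit 25.
print(f"x{multiplier(145_756, 25):,.0f}")  # ~ x5,830 (table shows x5,829)
```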

Queries

Query last 100 logs

| Query | Metric | Quickwit | Loki |
|---|---|---|---|
| `queen` | Latency (s) | 0.6 | 9.3 (+1,425%) |
| | CPU time (s) | 2.7 | 146 (+5,270%) |
| | # of GET requests | 206 | 14,821* (+7,095%) |
| `us-east-2` (label) | Latency (s) | 0.6 | 1.0 (+74%) |
| | CPU time (s) | 2.7 (+35%) | 2 |
| | # of GET requests | 211* (+348%) | 47 |
| `us-east-2` (label) and `queen` | Latency (s) | 0.6 | 0.98 (+59%) |
| | CPU time (s) | 2.8 | 11 (+279%) |
| | # of GET requests | 255 | 561* (+120%) |

Query log volume by level

| Query | Metric | Quickwit | Loki |
|---|---|---|---|
| All dataset | Latency (s) | 2.1 | 90 (x42) |
| | CPU time (s) | 22 | 1,160 (x51) |
| | # of GET requests | 88 | 203,665 (x2,313) |
| `queen` | Latency (s) | 0.4 | 565 (x11,423) |
| | CPU time (s) | 3.2 | 8,713 (x12,688) |
| | # of GET requests | 132 | 204,622 (x1,549) |
| `us-east-2` (label) | Latency (s) | 0.6 | 4.8 (+685%) |
| | CPU time (s) | 2.8 | 40 (x13) |
| | # of GET requests | 211 | 6,163 (x28) |
| `us-east-2` (label) and `queen` | Latency (s) | 0.4 | 28 (x70) |
| | CPU time (s) | 2.9 | 337 (x115) |
| | # of GET requests | 176 | 5,596 (x31) |

Reproducing the benchmark

Note: this is still a work in progress; we need to re-run all the commands to make sure there are no typos and nothing is missing. Open an issue if something is broken.

Requirements

  • Python > 3.10
  • docker and docker-compose
  • gsutil
  • rustup to build qbench, a tool to ingest data into Quickwit.

Setup

git clone git@github.com:quickwit-oss/benchmarks.git
cd benchmarks
pip install -r requirements.txt

Start Prometheus and Grafana to store Loki metrics and visualize them.

docker compose up

Download datasets

# Download generated-logs-v1
mkdir /data/datasets
gsutil rsync -y ".*logs-\d{4}.*\.gz$" gs://quickwit-datasets-public/benchmarks/generated-logs/ /data/datasets

Run Loki benchmark

You need to have read/write permissions on a GCS bucket named bench202403-loki-100streams.

Ingest data

We will ingest logs with Vector; this takes a bit less than one hour.

cd engines/loki
## start Loki
make start
## start vector
docker run -i -v $(pwd)/:/etc/vector/ -v /data/datasets/generated-logs-v1/:/datasets/ --net benchmark --rm timberio/vector:0.36.0-debian --config /etc/vector/vector_100streams.yaml

Benchmark the queries

# Go back to the root directory of the repository
cd ../..
python3 run.py --engine loki --storage s3 --track generated-logs-for-loki --engine_specific_queries_subdir loki_100streams_100docs --instance n2-16vcpus --tags loki-benchmark --search-only

# Results are stored in a JSON file
cat results/generated-logs-for-loki.loki.loki-benchmark.n2-16vcpus/search-results.json
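To pull a quick summary out of that JSON, something like the following works. Note that the field names used here (`queries`, `name`, `latency`) are assumptions for illustration; check the actual schema of `search-results.json` before relying on them:

```python
import json

# Hypothetical summary of search-results.json. The field names used here
# ("queries", "name", "latency") are assumptions; adapt to the real schema.
sample = '{"queries": [{"name": "queen", "latency": 0.6}]}'
results = json.loads(sample)  # replace with open("search-results.json")

for q in results["queries"]:
    print(f'{q["name"]}: {q["latency"]}s')
```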

Command to get the number of files stored by Loki

gsutil ls -lR gs://bench20240414-loki-2-9-6--100streams | tail -n 1
TOTAL: 145756 objects, 55137804027 bytes (51.35 GiB)
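The `TOTAL` summary line emitted by `gsutil ls -lR` can be parsed directly; a small sketch using the output shown above:

```python
# Parse the summary line from `gsutil ls -lR ... | tail -n 1`.
total_line = "TOTAL: 145756 objects, 55137804027 bytes (51.35 GiB)"

parts = total_line.split()
objects = int(parts[1])   # object count
nbytes = int(parts[3])    # total size in bytes

print(f"{objects} objects, {nbytes / 2**30:.2f} GiB")
# → 145756 objects, 51.35 GiB
```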

Run Quickwit benchmark

You need to have read/write permissions on the GCS bucket named gs://quickwit-dev.

Compile qbench

cd qbench
cargo build --release
cd ..

Update quickwit.yaml

You need to edit the quickwit config file at engines/quickwit/configs/quickwit.yaml to specify the metastore URI, default index root URI and your access/secret keys.

+metastore_uri: s3://bench202403-loki-qwdata/indexes
+default_index_root_uri: s3://bench202403-loki-qwdata/indexes
+
+storage:
+  s3:
+    flavor: gcs
+    region: us-east1
+    endpoint: https://storage.googleapis.com
+    access_key_id: XXXX
+    secret_access_key: XXXX

Ingest data and run queries

We will ingest logs with qbench; this takes a bit less than one hour.

## First start Quickwit
cd engines/quickwit
make start
cd ../..
## Then run ingest + queries
python3 run.py --engine quickwit --storage s3 --track generated-logs-for-loki --engine_specific_queries_subdir qw_100docs --instance n2-16vcpus --tags ingest-v1

This will produce two files, indexing-results.json and search-results.json, in the directory results/generated-logs-for-loki.quickwit.ingest-v1.n2-16vcpus/.

You should now have all the results used in the benchmark.