
Commit a84deb9

remove outdated readme

1 parent 61518f3 commit a84deb9

11 files changed (+68, -156 lines)

README.md

Lines changed: 8 additions & 125 deletions
@@ -12,138 +12,21 @@ scenario against which it should be tested. A specific scenario may assume
 running the server in a single or distributed mode, a different client
 implementation and the number of client instances.
 
-## TL;DR
+## How to run a benchmark?
 
-First, we need to run the server instance:
+Benchmarks are implemented in server-client mode, meaning that the server is
+running in a single machine, and the client is running on another.
 
-```shell
-python main.py run-server qdrant-0.8.4
-```
+### Run the server
 
-After launching it, we can configure the engine to accept loading the data:
 
-```shell
-python main.py run-client qdrant-0.8.4 configure random-100
-```
+### Run the client
 
-Then, a client instance may be launched, or even several ones:
 
-```shell
-python main.py run-client qdrant-0.8.4 load random-100
-```
 
-- `qdrant-0.8.4` is an engine to launch
-- `random-100` defines a dataset
+## How to update benchmark parameters?
 
-Expected output should look like following:
+## How to register a dataset?
 
-```text
-sum(load::time) = 0.22590500000000002
-count(load::time) = 100
-mean(load::time) = 0.0022590500000000003
-```
+## How to implement a new engine?
 
-### Backend
-
-A specific way of managing the containers. Right now only Docker, but might be
-Docker Swarm, SSH or Kubernetes, so the benchmark is not executed on a single
-machine, but on several servers.
-
-### Engine
-
-There are various vector search projects available. Some of them are just pure
-libraries (like FAISS or Annoy) and they offer great performance, but doesn't
-fit well any production systems. Those could be also benchmarked, however the
-primary focus is on vector databases using client-server architecture.
-
-All the engine configurations are kept in `./engine` subdirectories.
-
-Each engine has its own configuration defined in `config.json` file:
-
-```json
-{
-  "server": {
-    "image": "qdrant/qdrant:v0.8.4",
-    "hostname": "qdrant_server",
-    "environment": {
-      "DEBUG": true
-    }
-  },
-  "client": {
-    "dockerfile": "client.Dockerfile",
-    "main": "python cmd.py"
-  }
-}
-```
-
-- Either `image` or `dockerfile` has to be defined, similar to
-  `docker-compose.yaml` file. The `dockerfile` has a precedence over `image`
-- The `main` parameter points to a main client script which takes parameters.
-  Those parameters define the operations to perform with a client library.
-
-##### Supported engines:
-
-- `qdrant-0.8.4`
-- `elasticsearch-8.3.1`
-
-#### Server
-
-The server is a process, or a bunch of processes, responsible for creating
-vector indexes and handling all the user requests. It may be run on a single
-machine, or in case of some engines using the distributed mode (**in the future**).
-
-#### Client
-
-A client process performing all the operations, as it would be typically done in
-any client-server based communication. There might be several clients launched
-in parallel and each of them might be using part of the data. The number of
-clients depends on the scenario.
-
-Each client has to define a main script which takes some parameters and allow
-performing typical CRUD-like operations:
-
-- `load [file]`
-- `search [file]`
-
-If the scenario attempts to load the data from a given file, then it will call
-the following command:
-
-`python cmd.py load vectors.jsonl`
-
-The main script has to handle the conversion and load operations.
-
-By introducing a main script, we can allow using different client libraries, if
-available, so there is no assumption about the language used, as long as it can
-accept parameters.
-
-### Dataset
-
-Consists of vectors and/or payloads. Scenario decides what to do with the data.
-
-## Metrics
-
-Metrics are being measured by the clients themselves and displayed on stdout.
-The benchmark will collect all the metrics and display some statistics at the
-end of each test.
-
-All the displayed metrics should be printed in the following way:
-
-```shell
-phase::kpi_name = 0.242142
-```
-
-Where `0.242142` is a numerical value specific for the `kpi_name`. In the
-simplest case that might be a time spent in a specific operation, like:
-
-```
-load::time = 0.0052424
-```
-
-## Open topics
-
-1. The list of supported KPIs should be still established and implemented by
-   every single engine, so can be tracked in all the benchmark scenarios.
-2. What should be the format supported in the datasets? JSON lines are cross
-   language and platform, what makes them easy to be parsed to whatever format
-   a specific engine support.
-3. How do we handle engine errors?
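
For reference, the removed section described each engine's client as a `main` script that accepts `load [file]` / `search [file]` arguments (invoked as `python cmd.py load vectors.jsonl`) and reports metrics as `phase::kpi_name = value` lines on stdout. Below is a minimal sketch of such an entry point, assuming JSON Lines input; the helper names `upload_to_engine` and `search_in_engine` are illustrative placeholders, not code from this repository.

```python
# Hypothetical sketch of the client entry point the removed README described:
# it accepts "load <file>" / "search <file>" and prints "phase::kpi = value" lines.
import json
import sys
import time


def upload_to_engine(record: dict) -> None:
    # Placeholder: a real client would call the engine-specific insert/upsert API here.
    pass


def search_in_engine(query: dict) -> None:
    # Placeholder: a real client would call the engine-specific search API here.
    pass


def load(path: str) -> None:
    with open(path) as f:
        for line in f:
            record = json.loads(line)  # JSON Lines: one vector (+ optional payload) per line
            start = time.perf_counter()
            upload_to_engine(record)
            print(f"load::time = {time.perf_counter() - start}")


def search(path: str) -> None:
    with open(path) as f:
        for line in f:
            query = json.loads(line)
            start = time.perf_counter()
            search_in_engine(query)
            print(f"search::time = {time.perf_counter() - start}")


if __name__ == "__main__":
    operation, filename = sys.argv[1], sys.argv[2]
    {"load": load, "search": search}[operation](filename)
```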

engine/base_client/client.py

Lines changed: 26 additions & 20 deletions
@@ -14,19 +14,19 @@
 
 class BaseClient:
     def __init__(
-        self,
-        name: str,  # name of the experiment
-        configurator: BaseConfigurator,
-        uploader: BaseUploader,
-        searchers: List[BaseSearcher],
+            self,
+            name: str,  # name of the experiment
+            configurator: BaseConfigurator,
+            uploader: BaseUploader,
+            searchers: List[BaseSearcher],
     ):
         self.name = name
         self.configurator = configurator
         self.uploader = uploader
         self.searchers = searchers
 
     def save_search_results(
-        self, dataset_name: str, results: dict, search_id: int, search_params: dict
+            self, dataset_name: str, results: dict, search_id: int, search_params: dict
     ):
         now = datetime.now()
         timestamp = now.strftime("%Y-%m-%d-%H-%M-%S")
@@ -49,23 +49,29 @@ def save_upload_results(self, dataset_name: str, results: dict, upload_params: d
             }
             out.write(json.dumps(upload_stats, indent=2))
 
-    def run_experiment(self, dataset: Dataset):
-        print("Experiment stage: Configure")
-        execution_params = self.configurator.configure(
+    def run_experiment(self, dataset: Dataset, skip_upload: bool = False):
+        execution_params = self.configurator.execution_params(
             distance=dataset.config.distance,
-            vector_size=dataset.config.vector_size,
-        )
+            vector_size=dataset.config.vector_size)
 
         reader = dataset.get_reader(execution_params.get("normalize", False))
-        print("Experiment stage: Upload")
-        upload_stats = self.uploader.upload(
-            distance=dataset.config.distance,
-            records=reader.read_data()
-        )
-        self.save_upload_results(dataset.config.name, upload_stats, upload_params={
-            **self.uploader.upload_params,
-            **self.configurator.collection_params
-        })
+
+        if not skip_upload:
+            print("Experiment stage: Configure")
+            self.configurator.configure(
+                distance=dataset.config.distance,
+                vector_size=dataset.config.vector_size,
+            )
+
+            print("Experiment stage: Upload")
+            upload_stats = self.uploader.upload(
+                distance=dataset.config.distance,
+                records=reader.read_data()
+            )
+            self.save_upload_results(dataset.config.name, upload_stats, upload_params={
+                **self.uploader.upload_params,
+                **self.configurator.collection_params
+            })
 
         print("Experiment stage: Search")
         for search_id, searcher in enumerate(self.searchers):
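
The new `skip_upload` flag allows repeating only the search stage against a collection that was already configured and uploaded by an earlier run. Below is a minimal usage sketch, assuming a `BaseClient` and a `Dataset` instance are constructed elsewhere; the wrapper function is illustrative, not part of this commit.

```python
from engine.base_client.client import BaseClient


def rerun_search_only(client: BaseClient, dataset) -> None:
    # `dataset` is the same Dataset object run_experiment already expects.
    # skip_upload=True bypasses the "Configure" and "Upload" stages, which are now
    # guarded by `if not skip_upload:`, and goes straight to "Experiment stage: Search".
    client.run_experiment(dataset, skip_upload=True)
```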

engine/base_client/configure.py

Lines changed: 3 additions & 0 deletions
@@ -18,3 +18,6 @@ def recreate(self, distance, vector_size, collection_params):
     def configure(self, distance, vector_size) -> Optional[dict]:
         self.clean()
         return self.recreate(distance, vector_size, self.collection_params) or {}
+
+    def execution_params(self, distance, vector_size) -> dict:
+        return {}

engine/clients/milvus/configure.py

Lines changed: 1 addition & 0 deletions
@@ -65,4 +65,5 @@ def recreate(
         for index in collection.indexes:
             index.drop()
 
+    def execution_params(self, distance, vector_size):
         return {"normalize": distance == Distance.COSINE}

engine/clients/qdrant/search.py

Lines changed: 5 additions & 2 deletions
@@ -1,7 +1,10 @@
+import multiprocessing
 from typing import List, Optional, Tuple
 
+import httpx
 from qdrant_client import QdrantClient
 from qdrant_client.http import models as rest
+from qdrant_client import grpc
 
 from engine.base_client.search import BaseSearcher
 from engine.clients.qdrant.config import QDRANT_COLLECTION_NAME
@@ -28,7 +31,7 @@ def search_one(cls, vector, meta_conditions, top) -> List[Tuple[int, float]]:
             query_vector=vector,
             query_filter=cls.conditions_to_filter(meta_conditions),
             limit=top,
-            **cls.search_params
+            search_params=rest.SearchParams(**cls.search_params.get("search_params", {})),
         )
-
         return [(hit.id, hit.score) for hit in res]
+
engine/servers/weaviate-single-node/docker-compose.yaml

Lines changed: 3 additions & 1 deletion
@@ -18,7 +18,9 @@ services:
       DEFAULT_VECTORIZER_MODULE: 'none'
       ENABLE_MODULES: ''
       CLUSTER_HOSTNAME: 'node1'
+      GOMEMLIMIT: 25GiB
+      GOGC: 50
     deploy:
       resources:
         limits:
-          memory: 25Gb
+          memory: 27Gb

experiments/configurations/qdrant-single-node-rps.json

Lines changed: 6 additions & 0 deletions
@@ -8,6 +8,7 @@
       "hnsw_config": { "m": 16, "ef_construct": 128 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
@@ -24,6 +25,7 @@
       "hnsw_config": { "m": 32, "ef_construct": 128 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
@@ -40,6 +42,7 @@
       "hnsw_config": { "m": 32, "ef_construct": 256 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
@@ -56,6 +59,7 @@
       "hnsw_config": { "m": 32, "ef_construct": 512 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
@@ -72,6 +76,7 @@
       "hnsw_config": { "m": 64, "ef_construct": 256 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
@@ -88,6 +93,7 @@
       "hnsw_config": { "m": 64, "ef_construct": 512 }
     },
     "search_params": [
+      { "parallel": 1, "search_params": { "hnsw_ef": 64 } }, { "parallel": 1, "search_params": { "hnsw_ef": 128 } }, { "parallel": 1, "search_params": { "hnsw_ef": 256 } }, { "parallel": 1, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 2, "search_params": { "hnsw_ef": 64 } }, { "parallel": 2, "search_params": { "hnsw_ef": 128 } }, { "parallel": 2, "search_params": { "hnsw_ef": 256 } }, { "parallel": 2, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 4, "search_params": { "hnsw_ef": 64 } }, { "parallel": 4, "search_params": { "hnsw_ef": 128 } }, { "parallel": 4, "search_params": { "hnsw_ef": 256 } }, { "parallel": 4, "search_params": { "hnsw_ef": 512 } },
       { "parallel": 8, "search_params": { "hnsw_ef": 64 } }, { "parallel": 8, "search_params": { "hnsw_ef": 128 } }, { "parallel": 8, "search_params": { "hnsw_ef": 256 } }, { "parallel": 8, "search_params": { "hnsw_ef": 512 } },
