Closed
Description
I ran a (personal) benchmark on a distributed HNSW index.
I deployed 5 pods in k8s; each pod has 2 GB RAM and 10 CPU cores.
During the benchmark I noticed spikes in query latency that I cannot explain.
Here are the configuration and script used in the benchmark:
import time

import numpy as np
import weaviate

DATA_SIZE = 2000
QUERY_SIZE = 100
DIM = 256
TOP_K = 100
client = weaviate.Client("http://localhost:8080")
samples = np.random.randn(DATA_SIZE, DIM).astype("float32")
....
for k in range(QUERY_SIZE):
    start_time = time.time()
    target_query = {"vector": samples[k]}
    result = (
        client.query.get(TARGET_CLASS, ["index_id"])
        .with_near_vector(target_query)
        .with_limit(TOP_K)
        .do()
    )
    end_time = time.time()
    elapsed = 1000.0 * (end_time - start_time)  # milliseconds
    print(f"{k}: Elapsed", elapsed)

trial (1)
....
91: Elapsed 8.621692657470703
92: Elapsed 8.2550048828125
93: Elapsed 7.802248001098633
94: Elapsed 86.1060619354248
95: Elapsed 7.91478157043457
96: Elapsed 81.93159103393555
97: Elapsed 78.98426055908203 # why did this happen?
....

trial (2)
91: Elapsed 9.263992309570312
92: Elapsed 87.82005310058594
93: Elapsed 8.857011795043945
94: Elapsed 88.12785148620605
95: Elapsed 9.41157341003418
96: Elapsed 87.73326873779297
97: Elapsed 8.994817733764648 # same vector as in the previous trial, but no spike this time.

schema
schema.get {
"class": "RandomVectors",
"invertedIndexConfig": {
"bm25": {
"b": 0.75,
"k1": 1.2
},
"cleanupIntervalSeconds": 60,
"stopwords": {
"additions": null,
"preset": "en",
"removals": null
}
},
"properties": [
{
"dataType": [
"int"
],
"indexInverted": false,
"name": "index_id"
}
],
"replicationConfig": {
"factor": 1
},
"shardingConfig": {
"virtualPerPhysical": 128,
"desiredCount": 1,
"actualCount": 1,
"desiredVirtualCount": 128,
"actualVirtualCount": 128,
"key": "_id",
"strategy": "hash",
"function": "murmur3"
},
"vectorIndexConfig": {
"skip": false,
"cleanupIntervalSeconds": 300,
"maxConnections": 48,
"efConstruction": 128,
"ef": -1,
"dynamicEfMin": 100,
"dynamicEfMax": 500,
"dynamicEfFactor": 8,
"vectorCacheMaxObjects": 1000000000000,
"flatSearchCutoff": 0,
"distance": "dot"
},
"vectorIndexType": "hnsw",
"vectorizer": "none"
}
I changed the shard count in shardingConfig from 1 to 5, but the spikes still occurred.
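To make the spikes in the trials above easier to quantify, the per-query timings could be collected in a list and compared against the median. This is only a minimal sketch: the helper name `find_spikes` and the 3x-median threshold are hypothetical choices for illustration, not part of the original benchmark script. The sample values are copied from trial (1).

```python
import statistics


def find_spikes(elapsed_ms, factor=3.0):
    """Return (index, value) pairs whose latency exceeds factor * median.

    `factor` is an arbitrary illustrative threshold, not a recommended value.
    """
    median = statistics.median(elapsed_ms)
    return [(i, t) for i, t in enumerate(elapsed_ms) if t > factor * median]


# Timings (ms) copied from trial (1), queries 91..97
trial_1 = [
    8.621692657470703,
    8.2550048828125,
    7.802248001098633,
    86.1060619354248,
    7.91478157043457,
    81.93159103393555,
    78.98426055908203,
]

for offset, t in find_spikes(trial_1):
    print(f"query {91 + offset}: {t:.1f} ms")
```

A median-based threshold only works while fewer than half of the queries are slow. If spikes became the majority, a fixed cutoff in milliseconds would be a more reliable way to separate the two groups.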