Oversizing vectors in faiss querying functionality #497

Closed
jmazanec15 opened this issue Aug 8, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@jmazanec15
Member

In our JNI layer, knn_jni::faiss_wrapper::QueryIndex executes a query against a faiss index.

There, we initialize two C++ vectors with k*dimension elements each:

    std::vector<float> dis(kJ * dim);
    std::vector<faiss::Index::idx_t> ids(kJ * dim);

"dis" holds the returned distances and "ids" holds the returned ids. Since the query returns "kJ" results and each result has exactly one distance and one id, multiplying by "dim" overallocates both vectors by a factor of "dim". This has no correctness implications, but it is an inefficiency that may impact performance.
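To make the size difference concrete, here is a minimal sketch of result buffers sized to k only. It is not the actual fix: `ResultBuffers`, `makeResultBuffers` and `bufferBytes` are hypothetical names, and `idx_t` is a local stand-in that assumes faiss::Index::idx_t is a 64-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for faiss::Index::idx_t (assumed to be a 64-bit integer).
using idx_t = std::int64_t;

// Output buffers for a single query (hypothetical type for illustration).
struct ResultBuffers {
    std::vector<float> dis;  // one distance per returned neighbor
    std::vector<idx_t> ids;  // one id per returned neighbor
};

// Allocate exactly k slots per buffer; the vector dimension plays no role
// in the size of the result set.
ResultBuffers makeResultBuffers(std::size_t k) {
    assert(k > 0);
    return ResultBuffers{std::vector<float>(k), std::vector<idx_t>(k)};
}

// Bytes of element storage used by both buffers when each holds `count` elements.
std::size_t bufferBytes(std::size_t count) {
    return count * (sizeof(float) + sizeof(idx_t));
}
```

As an illustration with an assumed k of 10 and the 128-dimensional vectors used in the tests, `bufferBytes(10 * 128)` is 128 times `bufferBytes(10)`, so each query allocates 128 times more result storage than it needs.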

This should be fixed, but I'd also like to compare metrics before and after the fix to see how much it impacts performance.

@jmazanec15 jmazanec15 added the bug Something isn't working label Aug 8, 2022
@jmazanec15
Member Author

Experimental setup

I ran performance tests on the SIFT data set, which has 1M 128-dimensional vectors and a 10K query set. I tested faiss HNSW with m=16, ef_search=64 and ef_construction=64. Each cluster ran version 2.1 with 3 c5.xlarge leader nodes and 1 r5.2xlarge data node. The control cluster came from the official 2.1 release. The test cluster was based on https://github.com/jmazanec15/k-NN-1/tree/faiss-oversize-fix. I ran 3 iterations on each cluster, using 1 primary shard and 0 replicas. Benchmark code can be found in https://github.com/opensearch-project/k-NN/tree/main/benchmarks/osb.

Results

Control

Experiment 1

|                                                 Min Throughput | knn-query-from-data-set |     6.47 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |    188.2 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |   211.49 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |   231.17 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |  39.0271 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |  52.0604 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |  81.5243 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |  133.603 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |  854.554 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |   867.14 |     ms |
|                                                     error rate | knn-query-from-data-set |        0 |      % |

Experiment 2

|                                                 Min Throughput | knn-query-from-data-set |      10.87 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |     234.99 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |     251.93 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |     260.83 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |    35.7589 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |    44.9456 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |    58.0603 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |    503.916 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |    517.919 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |    1057.44 |     ms |
|                                                     error rate | knn-query-from-data-set |          0 |      % |

Experiment 3

|                                                 Min Throughput | knn-query-from-data-set |    20.74 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |    232.5 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |   246.69 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |   253.55 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |  37.2449 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |  45.7525 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |   55.398 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |  100.796 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |  492.354 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |  493.705 |     ms |
|                                                     error rate | knn-query-from-data-set |        0 |      % |

Test

Experiment 1

|                                                 Min Throughput | knn-query-from-data-set |       6.69 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |     201.05 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |     226.86 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |     249.49 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |     36.028 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |    48.3404 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |     75.483 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |    173.583 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |    824.394 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |    824.904 |     ms |
|                                                     error rate | knn-query-from-data-set |          0 |      % |

Experiment 2

|                                                 Min Throughput | knn-query-from-data-set |    10.58 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |   262.89 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |   283.81 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |   295.38 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |  31.4916 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |  39.8768 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |  49.8694 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |  96.7512 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |  518.972 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |  522.429 |     ms |
|                                                     error rate | knn-query-from-data-set |        0 |      % |

Experiment 3

|                                                 Min Throughput | knn-query-from-data-set |    38.89 |  ops/s |
|                                                Mean Throughput | knn-query-from-data-set |   265.19 |  ops/s |
|                                              Median Throughput | knn-query-from-data-set |   281.91 |  ops/s |
|                                                 Max Throughput | knn-query-from-data-set |   288.52 |  ops/s |
|                                        50th percentile latency | knn-query-from-data-set |  32.4187 |     ms |
|                                        90th percentile latency | knn-query-from-data-set |  40.7416 |     ms |
|                                        99th percentile latency | knn-query-from-data-set |  50.1702 |     ms |
|                                      99.9th percentile latency | knn-query-from-data-set |  109.783 |     ms |
|                                     99.99th percentile latency | knn-query-from-data-set |  462.385 |     ms |
|                                       100th percentile latency | knn-query-from-data-set |  463.454 |     ms |
|                                                     error rate | knn-query-from-data-set |        0 |      % |

Conclusion

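From these experiments, we see that removing the overallocation might slightly improve performance at this scale: comparing runs pairwise, the test cluster had higher mean throughput and lower 50th percentile latency in all three experiments. Run-to-run variation is large, though, so three iterations give only a rough estimate.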
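As a rough way to quantify the difference, the sketch below averages the reported mean throughput over the three runs for each cluster and computes the relative change. The helper names are hypothetical; the numbers are copied from the tables above.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of samples.
double average(const std::vector<double>& xs) {
    assert(!xs.empty());
    return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// Relative change from control to test, in percent.
double percentChange(double control, double test) {
    return (test - control) / control * 100.0;
}

// Mean throughput (ops/s) of experiments 1-3, taken from the result tables.
const std::vector<double> kControlMeanThroughput = {188.2, 234.99, 232.5};
const std::vector<double> kTestMeanThroughput = {201.05, 262.89, 265.19};
```

With these inputs, `average` gives roughly 218.6 ops/s for control and 243.0 ops/s for test, and `percentChange` on those two averages comes out at about 11%. With only three runs per cluster, this should be read as an indication rather than a precise measurement.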

jmazanec15 added a commit to jmazanec15/k-NN-1 that referenced this issue Aug 8, 2022
Removes overallocation of 2 c++ vectors in faiss querying functionality.
Performance results can be viewed in [497](opensearch-project#497 (comment)).
 In general, this change could provide a small improvement in memory
 footprint during search workloads.

Signed-off-by: John Mazanec <jmazane@amazon.com>
jmazanec15 added a commit that referenced this issue Aug 8, 2022
opensearch-trigger-bot bot pushed a commit that referenced this issue Aug 8, 2022
(cherry picked from commit 507bafe)
jmazanec15 added a commit that referenced this issue Aug 9, 2022
(cherry picked from commit 507bafe)

Co-authored-by: John Mazanec <jmazane@amazon.com>