Make cosine similarity faster by storing magnitude and normalizing vectors #99445

Merged

benwtrent
Member

cosine is our default similarity, so it should be fast out of the box.

dot_product is faster than cosine because it doesn't need to calculate vector magnitudes inside the similarity comparison loop. Instead, it can assume every vector has a magnitude of 1 and use an optimized dot_product calculation.

However, cosine as it exists today accepts vectors of any magnitude and cannot take advantage of this.
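
To make the motivation concrete, here is a minimal sketch (plain Python, not the Elasticsearch or Lucene code) showing that cosine similarity on arbitrary vectors gives the same value as a plain dot product on their normalized copies. The helper names `dot`, `cosine`, and `normalize` are hypothetical and exist only for this illustration.

```python
import math


def dot(a, b):
    # Plain dot product: one multiply-add per dimension.
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Full cosine: both magnitudes must be computed for every comparison.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    # Scale v to magnitude 1 (assumes v is not the zero vector).
    magnitude = math.sqrt(dot(v, v))
    return [x / magnitude for x in v]


a = [3.0, 4.0]
b = [1.0, 2.0]

# Normalizing once up front lets the hot loop use dot() alone.
unit_a, unit_b = normalize(a), normalize(b)
same = math.isclose(cosine(a, b), dot(unit_a, unit_b))
```

The magnitude work moves out of the comparison loop and is paid once per vector, which is the saving described above.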

This commit addresses this by:

  • Normalizing all vectors passed when indexing via cosine
  • Storing the calculated magnitude in an additional field (only if it is != 1).
  • Using the dot_product Lucene calculation
  • Normalizing query vectors when used against these new cosine fields
  • De-normalizing vectors when accessed via scripts
  • Allowing scripts to access these stored magnitudes.
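
The steps above can be sketched roughly as follows. This is an illustrative Python model, not the actual mapper code: the function names, the dict-based "document", the `eps` tolerance used to decide whether a magnitude counts as 1, and the rejection of zero vectors are all assumptions made for the example.

```python
import math


def index_vector(vector, eps=1e-6):
    # Hypothetical index-time step: normalize and keep the magnitude
    # in a side field only when it differs from 1.
    magnitude = math.sqrt(sum(x * x for x in vector))
    if magnitude == 0.0:
        # Assumption: a zero vector cannot be normalized, so reject it.
        raise ValueError("cannot normalize a zero-magnitude vector")
    if abs(magnitude - 1.0) <= eps:
        # Already a unit vector: store as-is, no magnitude field.
        return {"vector": list(vector)}
    return {
        "vector": [x / magnitude for x in vector],
        "magnitude": magnitude,
    }


def script_vector(doc):
    # Hypothetical script-access step: de-normalize so scripts
    # see the vector as it was originally supplied.
    magnitude = doc.get("magnitude", 1.0)
    return [x * magnitude for x in doc["vector"]]


def score(doc, query):
    # Query vectors are normalized too, so a plain dot product
    # against the stored unit vector equals the cosine similarity.
    q_mag = math.sqrt(sum(x * x for x in query))
    unit_query = [x / q_mag for x in query]
    return sum(d * q for d, q in zip(doc["vector"], unit_query))


doc = index_vector([3.0, 4.0])
unit_doc = index_vector([1.0, 0.0])
```

In this sketch, already-normalized input pays no extra storage, while other vectors cost one extra float per document.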

@benwtrent added the >enhancement, test-full-bwc, :Search/Vectors, and v8.11.0 labels on Sep 11, 2023
@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

@elasticsearchmachine added the Team:Search label on Sep 11, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@benwtrent
Member Author

I compared this PR (Contender) against regular dot_product (Baseline).

It's weird that search is consistently slightly slower (I ran it multiple times to confirm).

Also, the 99.9th+ percentile latency on index-append is crazy!

All these vectors are already normalized, so I am not 100% sure what could be causing this. The comparison should be pretty near 1-to-1 when using this PR with cosine and already-normalized vectors.

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------

|                                                        Metric |                                          Task |       Baseline |      Contender |         Diff |   Unit |   Diff % |
|--------------------------------------------------------------:|----------------------------------------------:|---------------:|---------------:|-------------:|-------:|---------:|
|                    Cumulative indexing time of primary shards |                                               |   30.0224      |   30.151       |      0.12858 |    min |   +0.43% |
|             Min cumulative indexing time across primary shard |                                               |   14.8865      |   15.0412      |      0.15467 |    min |   +1.04% |
|          Median cumulative indexing time across primary shard |                                               |   15.0112      |   15.0755      |      0.06429 |    min |   +0.43% |
|             Max cumulative indexing time across primary shard |                                               |   15.1359      |   15.1098      |     -0.02608 |    min |   -0.17% |
|           Cumulative indexing throttle time of primary shards |                                               |    0           |    0           |      0       |    min |    0.00% |
|    Min cumulative indexing throttle time across primary shard |                                               |    0           |    0           |      0       |    min |    0.00% |
| Median cumulative indexing throttle time across primary shard |                                               |    0           |    0           |      0       |    min |    0.00% |
|    Max cumulative indexing throttle time across primary shard |                                               |    0           |    0           |      0       |    min |    0.00% |
|                       Cumulative merge time of primary shards |                                               |   31.829       |   31.4611      |     -0.3679  |    min |   -1.16% |
|                      Cumulative merge count of primary shards |                                               |    8           |    8           |      0       |        |    0.00% |
|                Min cumulative merge time across primary shard |                                               |   15.8994      |   15.6213      |     -0.27807 |    min |   -1.75% |
|             Median cumulative merge time across primary shard |                                               |   15.9145      |   15.7305      |     -0.18395 |    min |   -1.16% |
|                Max cumulative merge time across primary shard |                                               |   15.9296      |   15.8398      |     -0.08983 |    min |   -0.56% |
|              Cumulative merge throttle time of primary shards |                                               |    3.5286      |    3.38408     |     -0.14452 |    min |   -4.10% |
|       Min cumulative merge throttle time across primary shard |                                               |    1.6825      |    1.66343     |     -0.01907 |    min |   -1.13% |
|    Median cumulative merge throttle time across primary shard |                                               |    1.7643      |    1.69204     |     -0.07226 |    min |   -4.10% |
|       Max cumulative merge throttle time across primary shard |                                               |    1.8461      |    1.72065     |     -0.12545 |    min |   -6.80% |
|                     Cumulative refresh time of primary shards |                                               |    0.192633    |    0.210533    |      0.0179  |    min |   +9.29% |
|                    Cumulative refresh count of primary shards |                                               |   88           |   87           |     -1       |        |   -1.14% |
|              Min cumulative refresh time across primary shard |                                               |    0.0946      |    0.1025      |      0.0079  |    min |   +8.35% |
|           Median cumulative refresh time across primary shard |                                               |    0.0963167   |    0.105267    |      0.00895 |    min |   +9.29% |
|              Max cumulative refresh time across primary shard |                                               |    0.0980333   |    0.108033    |      0.01    |    min |  +10.20% |
|                       Cumulative flush time of primary shards |                                               |    1.7835      |    1.81745     |      0.03395 |    min |   +1.90% |
|                      Cumulative flush count of primary shards |                                               |   42           |   42           |      0       |        |    0.00% |
|                Min cumulative flush time across primary shard |                                               |    0.8826      |    0.90285     |      0.02025 |    min |   +2.29% |
|             Median cumulative flush time across primary shard |                                               |    0.89175     |    0.908725    |      0.01698 |    min |   +1.90% |
|                Max cumulative flush time across primary shard |                                               |    0.9009      |    0.9146      |      0.0137  |    min |   +1.52% |
|                                       Total Young Gen GC time |                                               |    2.271       |    2.608       |      0.337   |      s |  +14.84% |
|                                      Total Young Gen GC count |                                               |  231           |  230           |     -1       |        |   -0.43% |
|                                         Total Old Gen GC time |                                               |    0           |    0           |      0       |      s |    0.00% |
|                                        Total Old Gen GC count |                                               |    0           |    0           |      0       |        |    0.00% |
|                                                    Store size |                                               |    7.67249     |    7.84566     |      0.17317 |     GB |   +2.26% |
|                                                 Translog size |                                               |    1.02445e-07 |    1.02445e-07 |      0       |     GB |    0.00% |
|                                        Heap used for segments |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                      Heap used for doc values |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                           Heap used for terms |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                           Heap used for norms |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                          Heap used for points |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                   Heap used for stored fields |                                               |    0           |    0           |      0       |     MB |    0.00% |
|                                                 Segment count |                                               |    2           |    2           |      0       |        |    0.00% |
|                                   Total Ingest Pipeline count |                                               |    0           |    0           |      0       |        |    0.00% |
|                                    Total Ingest Pipeline time |                                               |    0           |    0           |      0       |     ms |    0.00% |
|                                  Total Ingest Pipeline failed |                                               |    0           |    0           |      0       |        |    0.00% |
|                                                Min Throughput |                                  index-append | 1654.81        | 1647.34        |     -7.47173 | docs/s |   -0.45% |
|                                               Mean Throughput |                                  index-append | 1719.16        | 1698.85        |    -20.312   | docs/s |   -1.18% |
|                                             Median Throughput |                                  index-append | 1704.63        | 1681.01        |    -23.6223  | docs/s |   -1.39% |
|                                                Max Throughput |                                  index-append | 1996.53        | 1947.94        |    -48.5886  | docs/s |   -2.43% |
|                                       50th percentile latency |                                  index-append |  288.811       |  291.058       |      2.24698 |     ms |   +0.78% |
|                                       90th percentile latency |                                  index-append |  314.591       |  315.889       |      1.29822 |     ms |   +0.41% |
|                                       99th percentile latency |                                  index-append |  342.74        |  349.585       |      6.84538 |     ms |   +2.00% |
|                                     99.9th percentile latency |                                  index-append | 2092.04        |  740.746       |  -1351.3     |     ms |  -64.59% |
|                                      100th percentile latency |                                  index-append | 2181.47        | 2359.55        |    178.083   |     ms |   +8.16% |
|                                  50th percentile service time |                                  index-append |  288.811       |  291.058       |      2.24698 |     ms |   +0.78% |
|                                  90th percentile service time |                                  index-append |  314.591       |  315.889       |      1.29822 |     ms |   +0.41% |
|                                  99th percentile service time |                                  index-append |  342.74        |  349.585       |      6.84538 |     ms |   +2.00% |
|                                99.9th percentile service time |                                  index-append | 2092.04        |  740.746       |  -1351.3     |     ms |  -64.59% |
|                                 100th percentile service time |                                  index-append | 2181.47        | 2359.55        |    178.083   |     ms |   +8.16% |
|                                                    error rate |                                  index-append |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                           refresh-after-index |    2.26249     |    1.35853     |     -0.90396 |  ops/s |  -39.95% |
|                                               Mean Throughput |                           refresh-after-index |    2.26249     |    1.35853     |     -0.90396 |  ops/s |  -39.95% |
|                                             Median Throughput |                           refresh-after-index |    2.26249     |    1.35853     |     -0.90396 |  ops/s |  -39.95% |
|                                                Max Throughput |                           refresh-after-index |    2.26249     |    1.35853     |     -0.90396 |  ops/s |  -39.95% |
|                                      100th percentile latency |                           refresh-after-index |  440.864       |  735.073       |    294.209   |     ms |  +66.73% |
|                                 100th percentile service time |                           refresh-after-index |  440.864       |  735.073       |    294.209   |     ms |  +66.73% |
|                                                    error rate |                           refresh-after-index |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                    knn-search-10-50-match-all |   15.5335      |   14.6522      |     -0.88127 |  ops/s |   -5.67% |
|                                               Mean Throughput |                    knn-search-10-50-match-all |   17.5069      |   16.668       |     -0.8389  |  ops/s |   -4.79% |
|                                             Median Throughput |                    knn-search-10-50-match-all |   17.5069      |   16.668       |     -0.8389  |  ops/s |   -4.79% |
|                                                Max Throughput |                    knn-search-10-50-match-all |   19.4802      |   18.6837      |     -0.79653 |  ops/s |   -4.09% |
|                                       50th percentile latency |                    knn-search-10-50-match-all |   17.6462      |   17.4476      |     -0.19858 |     ms |   -1.13% |
|                                       90th percentile latency |                    knn-search-10-50-match-all |   18.1759      |   17.9754      |     -0.20046 |     ms |   -1.10% |
|                                       99th percentile latency |                    knn-search-10-50-match-all |   18.7467      |   18.3087      |     -0.43792 |     ms |   -2.34% |
|                                      100th percentile latency |                    knn-search-10-50-match-all |   20.4538      |   19.0464      |     -1.40742 |     ms |   -6.88% |
|                                  50th percentile service time |                    knn-search-10-50-match-all |   17.6462      |   17.4476      |     -0.19858 |     ms |   -1.13% |
|                                  90th percentile service time |                    knn-search-10-50-match-all |   18.1759      |   17.9754      |     -0.20046 |     ms |   -1.10% |
|                                  99th percentile service time |                    knn-search-10-50-match-all |   18.7467      |   18.3087      |     -0.43792 |     ms |   -2.34% |
|                                 100th percentile service time |                    knn-search-10-50-match-all |   20.4538      |   19.0464      |     -1.40742 |     ms |   -6.88% |
|                                                    error rate |                    knn-search-10-50-match-all |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |             knn-search-10-50-acceptedAnswerId |   23.9651      |   27.0564      |      3.0913  |  ops/s |  +12.90% |
|                                               Mean Throughput |             knn-search-10-50-acceptedAnswerId |   25.7496      |   27.8561      |      2.10646 |  ops/s |   +8.18% |
|                                             Median Throughput |             knn-search-10-50-acceptedAnswerId |   25.9144      |   27.8561      |      1.94165 |  ops/s |   +7.49% |
|                                                Max Throughput |             knn-search-10-50-acceptedAnswerId |   27.3693      |   28.6557      |      1.28645 |  ops/s |   +4.70% |
|                                       50th percentile latency |             knn-search-10-50-acceptedAnswerId |   27.0287      |   26.49        |     -0.53865 |     ms |   -1.99% |
|                                       90th percentile latency |             knn-search-10-50-acceptedAnswerId |   27.57        |   26.9309      |     -0.63916 |     ms |   -2.32% |
|                                       99th percentile latency |             knn-search-10-50-acceptedAnswerId |   28.148       |   27.5053      |     -0.6427  |     ms |   -2.28% |
|                                      100th percentile latency |             knn-search-10-50-acceptedAnswerId |   29.0551      |   28.1325      |     -0.92258 |     ms |   -3.18% |
|                                  50th percentile service time |             knn-search-10-50-acceptedAnswerId |   27.0287      |   26.49        |     -0.53865 |     ms |   -1.99% |
|                                  90th percentile service time |             knn-search-10-50-acceptedAnswerId |   27.57        |   26.9309      |     -0.63916 |     ms |   -2.32% |
|                                  99th percentile service time |             knn-search-10-50-acceptedAnswerId |   28.148       |   27.5053      |     -0.6427  |     ms |   -2.28% |
|                                 100th percentile service time |             knn-search-10-50-acceptedAnswerId |   29.0551      |   28.1325      |     -0.92258 |     ms |   -3.18% |
|                                                    error rate |             knn-search-10-50-acceptedAnswerId |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                         knn-search-10-50-java |    4.47349     |    5.2768      |      0.8033  |  ops/s |  +17.96% |
|                                               Mean Throughput |                         knn-search-10-50-java |    5.57128     |    6.26206     |      0.69077 |  ops/s |  +12.40% |
|                                             Median Throughput |                         knn-search-10-50-java |    5.611       |    6.30372     |      0.69272 |  ops/s |  +12.35% |
|                                                Max Throughput |                         knn-search-10-50-java |    6.54278     |    7.14285     |      0.60007 |  ops/s |   +9.17% |
|                                       50th percentile latency |                         knn-search-10-50-java |   67.7199      |   65.8461      |     -1.87381 |     ms |   -2.77% |
|                                       90th percentile latency |                         knn-search-10-50-java |   69.0514      |   66.8893      |     -2.16206 |     ms |   -3.13% |
|                                       99th percentile latency |                         knn-search-10-50-java |   71.5516      |   68.6132      |     -2.93832 |     ms |   -4.11% |
|                                      100th percentile latency |                         knn-search-10-50-java |  113.31        |   72.984       |    -40.3261  |     ms |  -35.59% |
|                                  50th percentile service time |                         knn-search-10-50-java |   67.7199      |   65.8461      |     -1.87381 |     ms |   -2.77% |
|                                  90th percentile service time |                         knn-search-10-50-java |   69.0514      |   66.8893      |     -2.16206 |     ms |   -3.13% |
|                                  99th percentile service time |                         knn-search-10-50-java |   71.5516      |   68.6132      |     -2.93832 |     ms |   -4.11% |
|                                 100th percentile service time |                         knn-search-10-50-java |  113.31        |   72.984       |    -40.3261  |     ms |  -35.59% |
|                                                    error rate |                         knn-search-10-50-java |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                          knn-search-10-50-css |   22.9257      |   23.127       |      0.20128 |  ops/s |   +0.88% |
|                                               Mean Throughput |                          knn-search-10-50-css |   25.3103      |   25.6052      |      0.29495 |  ops/s |   +1.17% |
|                                             Median Throughput |                          knn-search-10-50-css |   25.588       |   25.897       |      0.30898 |  ops/s |   +1.21% |
|                                                Max Throughput |                          knn-search-10-50-css |   27.4171      |   27.7917      |      0.3746  |  ops/s |   +1.37% |
|                                       50th percentile latency |                          knn-search-10-50-css |   24.8951      |   24.4704      |     -0.42467 |     ms |   -1.71% |
|                                       90th percentile latency |                          knn-search-10-50-css |   26.1126      |   25.0664      |     -1.04628 |     ms |   -4.01% |
|                                       99th percentile latency |                          knn-search-10-50-css |   26.99        |   27.1562      |      0.16627 |     ms |   +0.62% |
|                                      100th percentile latency |                          knn-search-10-50-css |   32.9178      |   27.3402      |     -5.5775  |     ms |  -16.94% |
|                                  50th percentile service time |                          knn-search-10-50-css |   24.8951      |   24.4704      |     -0.42467 |     ms |   -1.71% |
|                                  90th percentile service time |                          knn-search-10-50-css |   26.1126      |   25.0664      |     -1.04628 |     ms |   -4.01% |
|                                  99th percentile service time |                          knn-search-10-50-css |   26.99        |   27.1562      |      0.16627 |     ms |   +0.62% |
|                                 100th percentile service time |                          knn-search-10-50-css |   32.9178      |   27.3402      |     -5.5775  |     ms |  -16.94% |
|                                                    error rate |                          knn-search-10-50-css |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                  knn-search-10-50-concurrency |  161.706       |  176.31        |     14.604   |  ops/s |   +9.03% |
|                                               Mean Throughput |                  knn-search-10-50-concurrency |  161.706       |  176.31        |     14.604   |  ops/s |   +9.03% |
|                                             Median Throughput |                  knn-search-10-50-concurrency |  161.706       |  176.31        |     14.604   |  ops/s |   +9.03% |
|                                                Max Throughput |                  knn-search-10-50-concurrency |  161.706       |  176.31        |     14.604   |  ops/s |   +9.03% |
|                                       50th percentile latency |                  knn-search-10-50-concurrency |    4.18992     |    4.02269     |     -0.16723 |     ms |   -3.99% |
|                                       90th percentile latency |                  knn-search-10-50-concurrency |    4.49825     |    4.54279     |      0.04454 |     ms |   +0.99% |
|                                       99th percentile latency |                  knn-search-10-50-concurrency |    4.66164     |    4.8439      |      0.18227 |     ms |   +3.91% |
|                                      100th percentile latency |                  knn-search-10-50-concurrency |    4.73279     |    7.37954     |      2.64675 |     ms |  +55.92% |
|                                  50th percentile service time |                  knn-search-10-50-concurrency |    4.18992     |    4.02269     |     -0.16723 |     ms |   -3.99% |
|                                  90th percentile service time |                  knn-search-10-50-concurrency |    4.49825     |    4.54279     |      0.04454 |     ms |   +0.99% |
|                                  99th percentile service time |                  knn-search-10-50-concurrency |    4.66164     |    4.8439      |      0.18227 |     ms |   +3.91% |
|                                 100th percentile service time |                  knn-search-10-50-concurrency |    4.73279     |    7.37954     |      2.64675 |     ms |  +55.92% |
|                                                    error rate |                  knn-search-10-50-concurrency |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |                                   force-merge |    0.00070864  |    0.000722399 |      1e-05   |  ops/s |   +1.94% |
|                                               Mean Throughput |                                   force-merge |    0.00070864  |    0.000722399 |      1e-05   |  ops/s |   +1.94% |
|                                             Median Throughput |                                   force-merge |    0.00070864  |    0.000722399 |      1e-05   |  ops/s |   +1.94% |
|                                                Max Throughput |                                   force-merge |    0.00070864  |    0.000722399 |      1e-05   |  ops/s |   +1.94% |
|                                      100th percentile latency |                                   force-merge |    1.41115e+06 |    1.38427e+06 | -26877.9     |     ms |   -1.90% |
|                                 100th percentile service time |                                   force-merge |    1.41115e+06 |    1.38427e+06 | -26877.9     |     ms |   -1.90% |
|                                                    error rate |                                   force-merge |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |        knn-search-10-50-match-all-force-merge |  261.137       |  248.653       |    -12.4838  |  ops/s |   -4.78% |
|                                               Mean Throughput |        knn-search-10-50-match-all-force-merge |  261.137       |  248.653       |    -12.4838  |  ops/s |   -4.78% |
|                                             Median Throughput |        knn-search-10-50-match-all-force-merge |  261.137       |  248.653       |    -12.4838  |  ops/s |   -4.78% |
|                                                Max Throughput |        knn-search-10-50-match-all-force-merge |  261.137       |  248.653       |    -12.4838  |  ops/s |   -4.78% |
|                                       50th percentile latency |        knn-search-10-50-match-all-force-merge |    1.88773     |    1.87592     |     -0.01181 |     ms |   -0.63% |
|                                       90th percentile latency |        knn-search-10-50-match-all-force-merge |    2.019       |    2.07412     |      0.05512 |     ms |   +2.73% |
|                                       99th percentile latency |        knn-search-10-50-match-all-force-merge |    2.08819     |    2.2044      |      0.11621 |     ms |   +5.56% |
|                                      100th percentile latency |        knn-search-10-50-match-all-force-merge |    2.09508     |    2.269       |      0.17392 |     ms |   +8.30% |
|                                  50th percentile service time |        knn-search-10-50-match-all-force-merge |    1.88773     |    1.87592     |     -0.01181 |     ms |   -0.63% |
|                                  90th percentile service time |        knn-search-10-50-match-all-force-merge |    2.019       |    2.07412     |      0.05512 |     ms |   +2.73% |
|                                  99th percentile service time |        knn-search-10-50-match-all-force-merge |    2.08819     |    2.2044      |      0.11621 |     ms |   +5.56% |
|                                 100th percentile service time |        knn-search-10-50-match-all-force-merge |    2.09508     |    2.269       |      0.17392 |     ms |   +8.30% |
|                                                    error rate |        knn-search-10-50-match-all-force-merge |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput | knn-search-10-50-acceptedAnswerId-force-merge |   96.6224      |   99.747       |      3.12453 |  ops/s |   +3.23% |
|                                               Mean Throughput | knn-search-10-50-acceptedAnswerId-force-merge |   96.6224      |   99.747       |      3.12453 |  ops/s |   +3.23% |
|                                             Median Throughput | knn-search-10-50-acceptedAnswerId-force-merge |   96.6224      |   99.747       |      3.12453 |  ops/s |   +3.23% |
|                                                Max Throughput | knn-search-10-50-acceptedAnswerId-force-merge |   96.6224      |   99.747       |      3.12453 |  ops/s |   +3.23% |
|                                       50th percentile latency | knn-search-10-50-acceptedAnswerId-force-merge |    9.25081     |    8.90498     |     -0.34583 |     ms |   -3.74% |
|                                       90th percentile latency | knn-search-10-50-acceptedAnswerId-force-merge |    9.49023     |    9.13729     |     -0.35294 |     ms |   -3.72% |
|                                       99th percentile latency | knn-search-10-50-acceptedAnswerId-force-merge |    9.74293     |    9.58257     |     -0.16036 |     ms |   -1.65% |
|                                      100th percentile latency | knn-search-10-50-acceptedAnswerId-force-merge |    9.84292     |   11.8045      |      1.96162 |     ms |  +19.93% |
|                                  50th percentile service time | knn-search-10-50-acceptedAnswerId-force-merge |    9.25081     |    8.90498     |     -0.34583 |     ms |   -3.74% |
|                                  90th percentile service time | knn-search-10-50-acceptedAnswerId-force-merge |    9.49023     |    9.13729     |     -0.35294 |     ms |   -3.72% |
|                                  99th percentile service time | knn-search-10-50-acceptedAnswerId-force-merge |    9.74293     |    9.58257     |     -0.16036 |     ms |   -1.65% |
|                                 100th percentile service time | knn-search-10-50-acceptedAnswerId-force-merge |    9.84292     |   11.8045      |      1.96162 |     ms |  +19.93% |
|                                                    error rate | knn-search-10-50-acceptedAnswerId-force-merge |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |             knn-search-10-50-java-force-merge |  122.746       |  125.876       |      3.12992 |  ops/s |   +2.55% |
|                                               Mean Throughput |             knn-search-10-50-java-force-merge |  122.746       |  125.876       |      3.12992 |  ops/s |   +2.55% |
|                                             Median Throughput |             knn-search-10-50-java-force-merge |  122.746       |  125.876       |      3.12992 |  ops/s |   +2.55% |
|                                                Max Throughput |             knn-search-10-50-java-force-merge |  122.746       |  125.876       |      3.12992 |  ops/s |   +2.55% |
|                                       50th percentile latency |             knn-search-10-50-java-force-merge |    3.17337     |    3.17144     |     -0.00194 |     ms |   -0.06% |
|                                       90th percentile latency |             knn-search-10-50-java-force-merge |    3.34029     |    3.42971     |      0.08942 |     ms |   +2.68% |
|                                       99th percentile latency |             knn-search-10-50-java-force-merge |    3.81214     |    3.96263     |      0.15049 |     ms |   +3.95% |
|                                      100th percentile latency |             knn-search-10-50-java-force-merge |    3.85512     |    4.00454     |      0.14942 |     ms |   +3.88% |
|                                  50th percentile service time |             knn-search-10-50-java-force-merge |    3.17337     |    3.17144     |     -0.00194 |     ms |   -0.06% |
|                                  90th percentile service time |             knn-search-10-50-java-force-merge |    3.34029     |    3.42971     |      0.08942 |     ms |   +2.68% |
|                                  99th percentile service time |             knn-search-10-50-java-force-merge |    3.81214     |    3.96263     |      0.15049 |     ms |   +3.95% |
|                                 100th percentile service time |             knn-search-10-50-java-force-merge |    3.85512     |    4.00454     |      0.14942 |     ms |   +3.88% |
|                                                    error rate |             knn-search-10-50-java-force-merge |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |              knn-search-10-50-css-force-merge |    8.74072     |    8.49841     |     -0.24231 |  ops/s |   -2.77% |
|                                               Mean Throughput |              knn-search-10-50-css-force-merge |   10.9285      |   10.7283      |     -0.20027 |  ops/s |   -1.83% |
|                                             Median Throughput |              knn-search-10-50-css-force-merge |   11.0567      |   10.8389      |     -0.21778 |  ops/s |   -1.97% |
|                                                Max Throughput |              knn-search-10-50-css-force-merge |   12.9882      |   12.8475      |     -0.1407  |  ops/s |   -1.08% |
|                                       50th percentile latency |              knn-search-10-50-css-force-merge |   25.1508      |   24.9297      |     -0.2211  |     ms |   -0.88% |
|                                       90th percentile latency |              knn-search-10-50-css-force-merge |   25.8575      |   25.8358      |     -0.02169 |     ms |   -0.08% |
|                                       99th percentile latency |              knn-search-10-50-css-force-merge |   27.027       |   26.8407      |     -0.18631 |     ms |   -0.69% |
|                                      100th percentile latency |              knn-search-10-50-css-force-merge |   27.4154      |   26.9383      |     -0.47713 |     ms |   -1.74% |
|                                  50th percentile service time |              knn-search-10-50-css-force-merge |   25.1508      |   24.9297      |     -0.2211  |     ms |   -0.88% |
|                                  90th percentile service time |              knn-search-10-50-css-force-merge |   25.8575      |   25.8358      |     -0.02169 |     ms |   -0.08% |
|                                  99th percentile service time |              knn-search-10-50-css-force-merge |   27.027       |   26.8407      |     -0.18631 |     ms |   -0.69% |
|                                 100th percentile service time |              knn-search-10-50-css-force-merge |   27.4154      |   26.9383      |     -0.47713 |     ms |   -1.74% |
|                                                    error rate |              knn-search-10-50-css-force-merge |    0           |    0           |      0       |      % |    0.00% |
|                                                Min Throughput |      knn-search-10-50-concurrency-force-merge |  225.414       |  229.221       |      3.80756 |  ops/s |   +1.69% |
|                                               Mean Throughput |      knn-search-10-50-concurrency-force-merge |  225.414       |  229.221       |      3.80756 |  ops/s |   +1.69% |
|                                             Median Throughput |      knn-search-10-50-concurrency-force-merge |  225.414       |  229.221       |      3.80756 |  ops/s |   +1.69% |
|                                                Max Throughput |      knn-search-10-50-concurrency-force-merge |  225.414       |  229.221       |      3.80756 |  ops/s |   +1.69% |
|                                       50th percentile latency |      knn-search-10-50-concurrency-force-merge |    2.64465     |    2.52746     |     -0.11719 |     ms |   -4.43% |
|                                       90th percentile latency |      knn-search-10-50-concurrency-force-merge |    2.80474     |    2.66522     |     -0.13952 |     ms |   -4.97% |
|                                       99th percentile latency |      knn-search-10-50-concurrency-force-merge |    3.03305     |    2.83597     |     -0.19708 |     ms |   -6.50% |
|                                      100th percentile latency |      knn-search-10-50-concurrency-force-merge |    3.03758     |    2.86587     |     -0.17171 |     ms |   -5.65% |
|                                  50th percentile service time |      knn-search-10-50-concurrency-force-merge |    2.64465     |    2.52746     |     -0.11719 |     ms |   -4.43% |
|                                  90th percentile service time |      knn-search-10-50-concurrency-force-merge |    2.80474     |    2.66522     |     -0.13952 |     ms |   -4.97% |
|                                  99th percentile service time |      knn-search-10-50-concurrency-force-merge |    3.03305     |    2.83597     |     -0.19708 |     ms |   -6.50% |
|                                 100th percentile service time |      knn-search-10-50-concurrency-force-merge |    3.03758     |    2.86587     |     -0.17171 |     ms |   -5.65% |
|                                                    error rate |      knn-search-10-50-concurrency-force-merge |    0           |    0           |      0       |      % |    0.00% |

@benwtrent benwtrent marked this pull request as draft September 12, 2023 11:39
@benwtrent benwtrent removed the v8.11.0 label Sep 12, 2023
@benwtrent benwtrent added cloud-deploy Publish cloud docker image for Cloud-First-Testing and removed test-full-bwc Trigger full BWC version matrix tests labels Oct 23, 2023
@benwtrent
Member Author

@elasticmachine update branch

Contributor

@jpountz jpountz left a comment

> Its weird how on search its continuously slightly slower

On which task in particular are you seeing a slowdown? From a quick look, performance looks similar between the baseline and the candidate.

if (magnitudeIn == null) {
  return false;
}
int currentDoc = magnitudeIn.docID();
Contributor

If we used advanceExact(), we could simplify this logic to something like:

if (magnitudeIn.advanceExact(docId)) {
  magnitude = Float.intBitsToFloat((int) magnitudeIn.longValue());
} else {
  magnitude = 1f;
}

In benchmarks I ran a few years ago, advanceExact() performed noticeably faster than advance() because it only needs to check whether a doc exists, not compute the next doc.
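
For illustration, here is a minimal, self-contained sketch of that lookup pattern. It does not use Lucene: `SparseValues` and `InMemoryValues` are hypothetical stand-ins that only mirror the `advanceExact(int)` and `longValue()` methods of Lucene's `NumericDocValues`, and `magnitudeFor` is an invented helper name. It also assumes that a missing magnitude field, like a missing per-document value, means a magnitude of 1, which matches the PR description (magnitudes are stored only when they are != 1) but is not the code in this PR.

```java
import java.util.Map;
import java.util.TreeMap;

public class MagnitudeLookup {

    /** Hypothetical stand-in for the two NumericDocValues methods used above. */
    interface SparseValues {
        boolean advanceExact(int target);
        long longValue();
    }

    /** In-memory sparse values: only docs whose magnitude != 1 get an entry. */
    static final class InMemoryValues implements SparseValues {
        private final Map<Integer, Long> bitsByDoc = new TreeMap<>();
        private Long current;

        void put(int docId, float magnitude) {
            // Store the float as its raw int bits, widened to long.
            bitsByDoc.put(docId, (long) Float.floatToIntBits(magnitude));
        }

        @Override
        public boolean advanceExact(int target) {
            // Only answers "does this exact doc have a value?"; it never
            // has to locate the next doc that does.
            current = bitsByDoc.get(target);
            return current != null;
        }

        @Override
        public long longValue() {
            return current;
        }
    }

    /** Returns the stored magnitude for docId, or 1 when none is stored (assumption). */
    static float magnitudeFor(SparseValues magnitudeIn, int docId) {
        if (magnitudeIn != null && magnitudeIn.advanceExact(docId)) {
            return Float.intBitsToFloat((int) magnitudeIn.longValue());
        }
        return 1f;
    }
}
```

With this shape, the caller no longer has to track the iterator's current doc ID itself; each lookup is a single `advanceExact` call followed by a default.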

@benwtrent
Member Author

@jpountz my main concern is index-append. Maybe there is a bug in my runner; I can try running these again.

Here is another run; the baseline is dot_product and the contender is cosine. Note that all the vectors are already normalized, so there shouldn't be any difference here. I don't know why I am seeing any difference.

------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------
            
|                                                        Metric |                                Task |       Baseline |       Contender |        Diff |   Unit |    Diff % |
|--------------------------------------------------------------:|------------------------------------:|---------------:|----------------:|------------:|-------:|----------:|
|                    Cumulative indexing time of primary shards |                                     |   61.0397      |    60.2555      |    -0.78423 |    min |    -1.28% |
|             Min cumulative indexing time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|          Median cumulative indexing time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|             Max cumulative indexing time across primary shard |                                     |   32.5594      |    30.6551      |    -1.90433 |    min |    -5.85% |
|           Cumulative indexing throttle time of primary shards |                                     |    0           |     0           |     0       |    min |     0.00% |
|    Min cumulative indexing throttle time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
| Median cumulative indexing throttle time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|    Max cumulative indexing throttle time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|                       Cumulative merge time of primary shards |                                     |   50.1438      |    50.6291      |     0.48525 |    min |    +0.97% |
|                      Cumulative merge count of primary shards |                                     |   35           |   292           |   257       |        |  +734.29% |
|                Min cumulative merge time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|             Median cumulative merge time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|                Max cumulative merge time across primary shard |                                     |   25.707       |    27.1952      |     1.4882  |    min |    +5.79% |
|              Cumulative merge throttle time of primary shards |                                     |    9.51352     |     9.59723     |     0.08372 |    min |    +0.88% |
|       Min cumulative merge throttle time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|    Median cumulative merge throttle time across primary shard |                                     |    0           |     0           |     0       |    min |     0.00% |
|       Max cumulative merge throttle time across primary shard |                                     |    5.1863      |     5.65005     |     0.46375 |    min |    +8.94% |
|                     Cumulative refresh time of primary shards |                                     |    0.324867    |     1.34748     |     1.02262 |    min |  +314.78% |
|                    Cumulative refresh count of primary shards |                                     |  832           | 23075           | 22243       |        | +2673.44% |
|              Min cumulative refresh time across primary shard |                                     |    0           |     0.000116667 |     0.00012 |    min |     0.00% |
|           Median cumulative refresh time across primary shard |                                     |    0           |     0.000191667 |     0.00019 |    min |     0.00% |
|              Max cumulative refresh time across primary shard |                                     |    0.169267    |     1.0429      |     0.87363 |    min |  +516.13% |
|                       Cumulative flush time of primary shards |                                     |    3.61322     |     5.78422     |     2.171   |    min |   +60.08% |
|                      Cumulative flush count of primary shards |                                     |  349           |  8319           |  7970       |        | +2283.67% |
|                Min cumulative flush time across primary shard |                                     |    3.33333e-05 |     3.33333e-05 |     0       |    min |     0.00% |
|             Median cumulative flush time across primary shard |                                     |    4.16667e-05 |     4.16667e-05 |     0       |    min |     0.00% |
|                Max cumulative flush time across primary shard |                                     |    1.82853     |     2.2348      |     0.40627 |    min |   +22.22% |
|                                       Total Young Gen GC time |                                     |    2.092       |     1.731       |    -0.361   |      s |   -17.26% |
|                                      Total Young Gen GC count |                                     |   94           |    94           |     0       |        |     0.00% |
|                                         Total Old Gen GC time |                                     |    0           |     0           |     0       |      s |     0.00% |
|                                        Total Old Gen GC count |                                     |    0           |     0           |     0       |        |     0.00% |
|                                                    Store size |                                     |   10.7354      |    10.3996      |    -0.33579 |     GB |    -3.13% |
|                                                 Translog size |                                     |    1.33179e-06 |     1.33179e-06 |     0       |     GB |     0.00% |
|                                        Heap used for segments |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                      Heap used for doc values |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                           Heap used for terms |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                           Heap used for norms |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                          Heap used for points |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                   Heap used for stored fields |                                     |    0           |     0           |     0       |     MB |     0.00% |
|                                                 Segment count |                                     |   70           |    91           |    21       |        |   +30.00% |
|                                   Total Ingest Pipeline count |                                     |    0           |     0           |     0       |        |     0.00% |
|                                    Total Ingest Pipeline time |                                     |    0           |     0           |     0       |     ms |     0.00% |
|                                  Total Ingest Pipeline failed |                                     |    0           |     0           |     0       |        |     0.00% |
|                                                Min Throughput |                        index-append |  505.775       |   514.35        |     8.57475 | docs/s |    +1.70% |
|                                               Mean Throughput |                        index-append |  515.394       |   528.457       |    13.0629  | docs/s |    +2.53% |
|                                             Median Throughput |                        index-append |  511.37        |   524.346       |    12.9757  | docs/s |    +2.54% |
|                                                Max Throughput |                        index-append |  641.352       |   680.372       |    39.0203  | docs/s |    +6.08% |
|                                       50th percentile latency |                        index-append |  943.516       |   910.452       |   -33.0644  |     ms |    -3.50% |
|                                       90th percentile latency |                        index-append | 1067           |  1063.73        |    -3.27117 |     ms |    -0.31% |
|                                       99th percentile latency |                        index-append | 1231.43        |  1313.84        |    82.4032  |     ms |    +6.69% |
|                                     99.9th percentile latency |                        index-append | 1405.15        |  1561.99        |   156.84    |     ms |   +11.16% |
|                                      100th percentile latency |                        index-append | 1474.08        |  2080.38        |   606.291   |     ms |   +41.13% |
|                                  50th percentile service time |                        index-append |  943.516       |   910.452       |   -33.0644  |     ms |    -3.50% |
|                                  90th percentile service time |                        index-append | 1067           |  1063.73        |    -3.27117 |     ms |    -0.31% |
|                                  99th percentile service time |                        index-append | 1231.43        |  1313.84        |    82.4032  |     ms |    +6.69% |
|                                99.9th percentile service time |                        index-append | 1405.15        |  1561.99        |   156.84    |     ms |   +11.16% |
|                                 100th percentile service time |                        index-append | 1474.08        |  2080.38        |   606.291   |     ms |   +41.13% |
|                                                    error rate |                        index-append |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |                 refresh-after-index |    0.799737    |     0.808216    |     0.00848 |  ops/s |    +1.06% |
|                                               Mean Throughput |                 refresh-after-index |    0.799737    |     0.808216    |     0.00848 |  ops/s |    +1.06% |
|                                             Median Throughput |                 refresh-after-index |    0.799737    |     0.808216    |     0.00848 |  ops/s |    +1.06% |
|                                                Max Throughput |                 refresh-after-index |    0.799737    |     0.808216    |     0.00848 |  ops/s |    +1.06% |
|                                      100th percentile latency |                 refresh-after-index | 1248.85        |  1235.88        |   -12.9689  |     ms |    -1.04% |
|                                 100th percentile service time |                 refresh-after-index | 1248.85        |  1235.88        |   -12.9689  |     ms |    -1.04% |
|                                                    error rate |                 refresh-after-index |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |          knn-search-10-50-match-all |   11.3318      |    11.3243      |    -0.0075  |  ops/s |    -0.07% |
|                                               Mean Throughput |          knn-search-10-50-match-all |   11.7378      |    11.721       |    -0.01678 |  ops/s |    -0.14% |
|                                             Median Throughput |          knn-search-10-50-match-all |   11.7733      |    11.7483      |    -0.02501 |  ops/s |    -0.21% |
|                                                Max Throughput |          knn-search-10-50-match-all |   12.0352      |    12.0315      |    -0.00372 |  ops/s |    -0.03% |
|                                       50th percentile latency |          knn-search-10-50-match-all |   76.5551      |    74.7179      |    -1.83715 |     ms |    -2.40% |
|                                       90th percentile latency |          knn-search-10-50-match-all |   77.1839      |    75.6568      |    -1.52704 |     ms |    -1.98% |
|                                       99th percentile latency |          knn-search-10-50-match-all |   81.1146      |    79.4676      |    -1.647   |     ms |    -2.03% |
|                                      100th percentile latency |          knn-search-10-50-match-all |   99.2469      |    84.3022      |   -14.9448  |     ms |   -15.06% |
|                                  50th percentile service time |          knn-search-10-50-match-all |   76.5551      |    74.7179      |    -1.83715 |     ms |    -2.40% |
|                                  90th percentile service time |          knn-search-10-50-match-all |   77.1839      |    75.6568      |    -1.52704 |     ms |    -1.98% |
|                                  99th percentile service time |          knn-search-10-50-match-all |   81.1146      |    79.4676      |    -1.647   |     ms |    -2.03% |
|                                 100th percentile service time |          knn-search-10-50-match-all |   99.2469      |    84.3022      |   -14.9448  |     ms |   -15.06% |
|                                                    error rate |          knn-search-10-50-match-all |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |        script-score-query-match-all |    1.52286     |     1.45914     |    -0.06372 |  ops/s |    -4.18% |
|                                               Mean Throughput |        script-score-query-match-all |    1.55278     |     1.47964     |    -0.07314 |  ops/s |    -4.71% |
|                                             Median Throughput |        script-score-query-match-all |    1.553       |     1.48164     |    -0.07136 |  ops/s |    -4.60% |
|                                                Max Throughput |        script-score-query-match-all |    1.57641     |     1.49341     |    -0.08299 |  ops/s |    -5.26% |
|                                       50th percentile latency |        script-score-query-match-all |  606.988       |   651.364       |    44.3755  |     ms |    +7.31% |
|                                       90th percentile latency |        script-score-query-match-all |  638.173       |   661.634       |    23.4607  |     ms |    +3.68% |
|                                       99th percentile latency |        script-score-query-match-all |  687.888       |   680.672       |    -7.21588 |     ms |    -1.05% |
|                                      100th percentile latency |        script-score-query-match-all |  692.299       |   688.061       |    -4.23756 |     ms |    -0.61% |
|                                  50th percentile service time |        script-score-query-match-all |  606.988       |   651.364       |    44.3755  |     ms |    +7.31% |
|                                  90th percentile service time |        script-score-query-match-all |  638.173       |   661.634       |    23.4607  |     ms |    +3.68% |
|                                  99th percentile service time |        script-score-query-match-all |  687.888       |   680.672       |    -7.21588 |     ms |    -1.05% |
|                                 100th percentile service time |        script-score-query-match-all |  692.299       |   688.061       |    -4.23756 |     ms |    -0.61% |
|                                                    error rate |        script-score-query-match-all |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |   knn-search-10-50-acceptedAnswerId |   11.5283      |    11.4587      |    -0.06959 |  ops/s |    -0.60% |
|                                               Mean Throughput |   knn-search-10-50-acceptedAnswerId |   11.6751      |    11.6585      |    -0.01654 |  ops/s |    -0.14% |
|                                             Median Throughput |   knn-search-10-50-acceptedAnswerId |   11.6891      |    11.6817      |    -0.00741 |  ops/s |    -0.06% |
|                                                Max Throughput |   knn-search-10-50-acceptedAnswerId |   11.7835      |    11.7955      |     0.012   |  ops/s |    +0.10% |
|                                       50th percentile latency |   knn-search-10-50-acceptedAnswerId |   81.3848      |    80.5352      |    -0.84963 |     ms |    -1.04% |
|                                       90th percentile latency |   knn-search-10-50-acceptedAnswerId |   85.3796      |    82.4673      |    -2.91223 |     ms |    -3.41% |
|                                       99th percentile latency |   knn-search-10-50-acceptedAnswerId |   87.9692      |    84.1178      |    -3.85139 |     ms |    -4.38% |
|                                      100th percentile latency |   knn-search-10-50-acceptedAnswerId |   89.0924      |    86.7282      |    -2.3642  |     ms |    -2.65% |
|                                  50th percentile service time |   knn-search-10-50-acceptedAnswerId |   81.3848      |    80.5352      |    -0.84963 |     ms |    -1.04% |
|                                  90th percentile service time |   knn-search-10-50-acceptedAnswerId |   85.3796      |    82.4673      |    -2.91223 |     ms |    -3.41% |
|                                  99th percentile service time |   knn-search-10-50-acceptedAnswerId |   87.9692      |    84.1178      |    -3.85139 |     ms |    -4.38% |
|                                 100th percentile service time |   knn-search-10-50-acceptedAnswerId |   89.0924      |    86.7282      |    -2.3642  |     ms |    -2.65% |
|                                                    error rate |   knn-search-10-50-acceptedAnswerId |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput | script-score-query-acceptedAnswerId |    1.8915      |     1.75887     |    -0.13262 |  ops/s |    -7.01% |
|                                               Mean Throughput | script-score-query-acceptedAnswerId |    1.90071     |     1.76242     |    -0.13829 |  ops/s |    -7.28% |
|                                             Median Throughput | script-score-query-acceptedAnswerId |    1.90097     |     1.7615      |    -0.13946 |  ops/s |    -7.34% |
|                                                Max Throughput | script-score-query-acceptedAnswerId |    1.90746     |     1.76651     |    -0.14095 |  ops/s |    -7.39% |
|                                       50th percentile latency | script-score-query-acceptedAnswerId |  518.126       |   560.893       |    42.767   |     ms |    +8.25% |
|                                       90th percentile latency | script-score-query-acceptedAnswerId |  530.127       |   569.624       |    39.4964  |     ms |    +7.45% |
|                                       99th percentile latency | script-score-query-acceptedAnswerId |  548.652       |   591.509       |    42.8567  |     ms |    +7.81% |
|                                      100th percentile latency | script-score-query-acceptedAnswerId |  564.902       |   675.545       |   110.644   |     ms |   +19.59% |
|                                  50th percentile service time | script-score-query-acceptedAnswerId |  518.126       |   560.893       |    42.767   |     ms |    +8.25% |
|                                  90th percentile service time | script-score-query-acceptedAnswerId |  530.127       |   569.624       |    39.4964  |     ms |    +7.45% |
|                                  99th percentile service time | script-score-query-acceptedAnswerId |  548.652       |   591.509       |    42.8567  |     ms |    +7.81% |
|                                 100th percentile service time | script-score-query-acceptedAnswerId |  564.902       |   675.545       |   110.644   |     ms |   +19.59% |
|                                                    error rate | script-score-query-acceptedAnswerId |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |               knn-search-10-50-java |   11.86        |    11.6922      |    -0.16782 |  ops/s |    -1.41% |
|                                               Mean Throughput |               knn-search-10-50-java |   11.9889      |    11.833       |    -0.15592 |  ops/s |    -1.30% |
|                                             Median Throughput |               knn-search-10-50-java |   11.99        |    11.8664      |    -0.12365 |  ops/s |    -1.03% |
|                                                Max Throughput |               knn-search-10-50-java |   12.0956      |    11.9091      |    -0.18654 |  ops/s |    -1.54% |
|                                       50th percentile latency |               knn-search-10-50-java |   78.668       |    80.0862      |     1.41822 |     ms |    +1.80% |
|                                       90th percentile latency |               knn-search-10-50-java |   81.1299      |    82.1098      |     0.97984 |     ms |    +1.21% |
|                                       99th percentile latency |               knn-search-10-50-java |   92.348       |    96.8455      |     4.49748 |     ms |    +4.87% |
|                                      100th percentile latency |               knn-search-10-50-java |  123.166       |   162.033       |    38.867   |     ms |   +31.56% |
|                                  50th percentile service time |               knn-search-10-50-java |   78.668       |    80.0862      |     1.41822 |     ms |    +1.80% |
|                                  90th percentile service time |               knn-search-10-50-java |   81.1299      |    82.1098      |     0.97984 |     ms |    +1.21% |
|                                  99th percentile service time |               knn-search-10-50-java |   92.348       |    96.8455      |     4.49748 |     ms |    +4.87% |
|                                 100th percentile service time |               knn-search-10-50-java |  123.166       |   162.033       |    38.867   |     ms |   +31.56% |
|                                                    error rate |               knn-search-10-50-java |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |             script-score-query-java |    7.69805     |     7.38493     |    -0.31312 |  ops/s |    -4.07% |
|                                               Mean Throughput |             script-score-query-java |    7.77661     |     7.43554     |    -0.34107 |  ops/s |    -4.39% |
|                                             Median Throughput |             script-score-query-java |    7.78847     |     7.44002     |    -0.34845 |  ops/s |    -4.47% |
|                                                Max Throughput |             script-score-query-java |    7.82127     |     7.47088     |    -0.35039 |  ops/s |    -4.48% |
|                                       50th percentile latency |             script-score-query-java |  123.851       |   131.058       |     7.20693 |     ms |    +5.82% |
|                                       90th percentile latency |             script-score-query-java |  130.048       |   134.854       |     4.80606 |     ms |    +3.70% |
|                                       99th percentile latency |             script-score-query-java |  135.062       |   139.128       |     4.06665 |     ms |    +3.01% |
|                                      100th percentile latency |             script-score-query-java |  141.156       |   140.048       |    -1.10806 |     ms |    -0.78% |
|                                  50th percentile service time |             script-score-query-java |  123.851       |   131.058       |     7.20693 |     ms |    +5.82% |
|                                  90th percentile service time |             script-score-query-java |  130.048       |   134.854       |     4.80606 |     ms |    +3.70% |
|                                  99th percentile service time |             script-score-query-java |  135.062       |   139.128       |     4.06665 |     ms |    +3.01% |
|                                 100th percentile service time |             script-score-query-java |  141.156       |   140.048       |    -1.10806 |     ms |    -0.78% |
|                                                    error rate |             script-score-query-java |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |                knn-search-10-50-css |   11.097       |    11.5912      |     0.49425 |  ops/s |    +4.45% |
|                                               Mean Throughput |                knn-search-10-50-css |   11.2567      |    11.7548      |     0.49805 |  ops/s |    +4.42% |
|                                             Median Throughput |                knn-search-10-50-css |   11.2735      |    11.7672      |     0.49376 |  ops/s |    +4.38% |
|                                                Max Throughput |                knn-search-10-50-css |   11.3721      |    11.8706      |     0.49855 |  ops/s |    +4.38% |
|                                       50th percentile latency |                knn-search-10-50-css |   84.0377      |    80.7433      |    -3.29441 |     ms |    -3.92% |
|                                       90th percentile latency |                knn-search-10-50-css |   85.5124      |    82.1293      |    -3.38307 |     ms |    -3.96% |
|                                       99th percentile latency |                knn-search-10-50-css |   88.0246      |    83.1954      |    -4.82915 |     ms |    -5.49% |
|                                      100th percentile latency |                knn-search-10-50-css |   89.4279      |    93.7963      |     4.36847 |     ms |    +4.88% |
|                                  50th percentile service time |                knn-search-10-50-css |   84.0377      |    80.7433      |    -3.29441 |     ms |    -3.92% |
|                                  90th percentile service time |                knn-search-10-50-css |   85.5124      |    82.1293      |    -3.38307 |     ms |    -3.96% |
|                                  99th percentile service time |                knn-search-10-50-css |   88.0246      |    83.1954      |    -4.82915 |     ms |    -5.49% |
|                                 100th percentile service time |                knn-search-10-50-css |   89.4279      |    93.7963      |     4.36847 |     ms |    +4.88% |
|                                                    error rate |                knn-search-10-50-css |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |              script-score-query-css |   10.7783      |    10.6817      |    -0.09664 |  ops/s |    -0.90% |
|                                               Mean Throughput |              script-score-query-css |   10.8981      |    10.7423      |    -0.15579 |  ops/s |    -1.43% |
|                                             Median Throughput |              script-score-query-css |   10.9122      |    10.7455      |    -0.16674 |  ops/s |    -1.53% |
|                                                Max Throughput |              script-score-query-css |   10.9768      |    10.774       |    -0.20286 |  ops/s |    -1.85% |
|                                       50th percentile latency |              script-score-query-css |   88.1633      |    89.8398      |     1.67646 |     ms |    +1.90% |
|                                       90th percentile latency |              script-score-query-css |   91.1757      |    94.9943      |     3.81866 |     ms |    +4.19% |
|                                       99th percentile latency |              script-score-query-css |   92.4962      |   103.803       |    11.3064  |     ms |   +12.22% |
|                                      100th percentile latency |              script-score-query-css |   92.979       |   114.415       |    21.4357  |     ms |   +23.05% |
|                                  50th percentile service time |              script-score-query-css |   88.1633      |    89.8398      |     1.67646 |     ms |    +1.90% |
|                                  90th percentile service time |              script-score-query-css |   91.1757      |    94.9943      |     3.81866 |     ms |    +4.19% |
|                                  99th percentile service time |              script-score-query-css |   92.4962      |   103.803       |    11.3064  |     ms |   +12.22% |
|                                 100th percentile service time |              script-score-query-css |   92.979       |   114.415       |    21.4357  |     ms |   +23.05% |
|                                                    error rate |              script-score-query-css |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |        knn-search-10-50-concurrency |   13.3577      |    13.2901      |    -0.06765 |  ops/s |    -0.51% |
|                                               Mean Throughput |        knn-search-10-50-concurrency |   13.5367      |    13.4865      |    -0.05023 |  ops/s |    -0.37% |
|                                             Median Throughput |        knn-search-10-50-concurrency |   13.5566      |    13.5084      |    -0.04821 |  ops/s |    -0.36% |
|                                                Max Throughput |        knn-search-10-50-concurrency |   13.6696      |    13.6277      |    -0.04196 |  ops/s |    -0.31% |
|                                       50th percentile latency |        knn-search-10-50-concurrency |   70.2237      |    70.1882      |    -0.03557 |     ms |    -0.05% |
|                                       90th percentile latency |        knn-search-10-50-concurrency |   71.2916      |    70.9705      |    -0.32106 |     ms |    -0.45% |
|                                       99th percentile latency |        knn-search-10-50-concurrency |   72.1722      |    75.7432      |     3.57101 |     ms |    +4.95% |
|                                      100th percentile latency |        knn-search-10-50-concurrency |   73.2519      |    75.8564      |     2.60444 |     ms |    +3.56% |
|                                  50th percentile service time |        knn-search-10-50-concurrency |   70.2237      |    70.1882      |    -0.03557 |     ms |    -0.05% |
|                                  90th percentile service time |        knn-search-10-50-concurrency |   71.2916      |    70.9705      |    -0.32106 |     ms |    -0.45% |
|                                  99th percentile service time |        knn-search-10-50-concurrency |   72.1722      |    75.7432      |     3.57101 |     ms |    +4.95% |
|                                 100th percentile service time |        knn-search-10-50-concurrency |   73.2519      |    75.8564      |     2.60444 |     ms |    +3.56% |
|                                                    error rate |        knn-search-10-50-concurrency |    0           |     0           |     0       |      % |     0.00% |
|                                                Min Throughput |      script-score-query-concurrency |   13.4423      |    13.2616      |    -0.18072 |  ops/s |    -1.34% |
|                                               Mean Throughput |      script-score-query-concurrency |   13.5947      |    13.4035      |    -0.19116 |  ops/s |    -1.41% |
|                                             Median Throughput |      script-score-query-concurrency |   13.6097      |    13.4263      |    -0.1834  |  ops/s |    -1.35% |
|                                                Max Throughput |      script-score-query-concurrency |   13.709       |    13.4936      |    -0.21537 |  ops/s |    -1.57% |
|                                       50th percentile latency |      script-score-query-concurrency |   70.2271      |    70.5552      |     0.32814 |     ms |    +0.47% |
|                                       90th percentile latency |      script-score-query-concurrency |   71.1898      |    71.6862      |     0.49643 |     ms |    +0.70% |
|                                       99th percentile latency |      script-score-query-concurrency |   72.5708      |    80.7532      |     8.18234 |     ms |   +11.27% |
|                                      100th percentile latency |      script-score-query-concurrency |   81.2887      |   179.037       |    97.7485  |     ms |  +120.25% |
|                                  50th percentile service time |      script-score-query-concurrency |   70.2271      |    70.5552      |     0.32814 |     ms |    +0.47% |
|                                  90th percentile service time |      script-score-query-concurrency |   71.1898      |    71.6862      |     0.49643 |     ms |    +0.70% |
|                                  99th percentile service time |      script-score-query-concurrency |   72.5708      |    80.7532      |     8.18234 |     ms |   +11.27% |
|                                 100th percentile service time |      script-score-query-concurrency |   81.2887      |   179.037       |    97.7485  |     ms |  +120.25% |
|                                                    error rate |      script-score-query-concurrency |    0           |     0           |     0       |      % |     0.00% |


-------------------------------
[INFO] SUCCESS (took 0 seconds)
-------------------------------

@benwtrent benwtrent marked this pull request as ready for review November 15, 2023 15:22
jpountz commented Nov 15, 2023

Thinking about next steps out loud: if we do that then we could deprecate dot_product and make it an alias of cosine?
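
For context, the alias idea rests on the identity this PR relies on: once vectors are normalized to unit length and the original magnitude is stored alongside, cosine similarity reduces to a plain dot product. A minimal sketch of that identity follows; the helper names `normalize`, `dot`, and `cosine` are hypothetical illustrations, not Elasticsearch or Lucene APIs, and the sample vectors are made up.

```python
import math


def normalize(vec):
    """Return (unit_vector, magnitude). Hypothetical helper for illustration."""
    mag = math.sqrt(sum(x * x for x in vec))
    if mag == 0.0:
        # Cosine similarity is undefined for a zero-length vector.
        raise ValueError("cannot normalize a zero vector")
    return [x / mag for x in vec], mag


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Textbook cosine similarity: magnitudes are computed on every call."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Made-up sample vectors.
doc = [3.0, 4.0]
query = [1.0, 2.0]

# "Index time": normalize once and keep the magnitude next to the vector.
doc_unit, doc_mag = normalize(doc)

# "Query time": normalize the query, then a dot product is enough.
query_unit, _ = normalize(query)
fast_score = dot(doc_unit, query_unit)

# Script access: multiply by the stored magnitude to recover the original.
restored = [x * doc_mag for x in doc_unit]
```

Here `fast_score` agrees with `cosine(doc, query)` up to floating-point rounding, and `restored` recovers `doc` from the unit vector and the stored magnitude.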

@jpountz jpountz left a comment

I'm not competent to review all the bits that this PR touches but the change makes sense to me.

@ChrisHegarty ChrisHegarty left a comment

LGTM

@mayya-sharipova mayya-sharipova left a comment

@benwtrent Thanks Ben, great work!

@benwtrent benwtrent added the auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Dec 1, 2023
@elasticsearchmachine elasticsearchmachine merged commit caec612 into elastic:main Dec 1, 2023
16 checks passed
@benwtrent benwtrent deleted the feature/make-cosine-faster branch December 1, 2023 18:46
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Jun 22, 2024
PR elastic#99445 introduced automatic normalization of dense vectors with
cosine similarity. This adds a note about this in the documentation.

Relates to elastic#99445
elasticsearchmachine pushed a commit that referenced this pull request Jun 22, 2024
PR #99445 introduced automatic normalization of dense vectors with
cosine similarity. This adds a note about this in the documentation.

Relates to #99445
Labels
auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) cloud-deploy Publish cloud docker image for Cloud-First-Testing >enhancement :Search/Vectors Vector search Team:Search Meta label for search team v8.12.0