Add byte quantization for float vectors in HNSW #102093
Conversation
Force-pushed from a950a4f to 2176ee2 (compare).
Pinging @elastic/es-search (Team:Search)
This looks awesome!!
As of 8.12 the default <<dense-vector-element-type,`element_type`>> is `float`. But this can be
automatically quantized during index time through the <<dense-vector-quantization,`quantization`>>. Quantization will
reduce the required memory by 4x, but it will also reduce the precision of the vectors. For `float` vectors with
Can we link to any information (blog maybe?) here that can explain the space/precision tradeoffs?
server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java (outdated comment; resolved)
similarity: l2_norm
index_options:
  type: hnsw
  quantization_options:
Do we want to add any tests validating errors if we try to create quantized indices with unsupported values?
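For illustration, one such test could assert that an unsupported combination is rejected at mapping time. A hypothetical mapping like the following (index and field names are made up; quantization is only meant for `float` vectors, so pairing it with `element_type: byte` would presumably fail with a mapper parsing error):

PUT bad-quantization
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "element_type": "byte",
        "dims": 2,
        "index": true,
        "index_options": {
          "type": "hnsw",
          "quantization_options": {
            "type": "byte"
          }
        }
      }
    }
  }
}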
…add-int8-quantization
That's great @benwtrent!
I left some minor comments.
We'll also need to update the tune-knn-search docs, but that can be done in a follow-up.
[discrete]
=== Reduce vector memory footprint

As of 8.12 the default <<dense-vector-element-type,`element_type`>> is `float`. But this can be
The docs are per-version, so I'm not sure it's worth mentioning 8.12?
DOH! for sure, docs are already versioned!
in the index. This allows you to use the original `float` vectors for re-scoring, but the `byte` vectors for
indexing.

To use quantization, you must provide a `quantization_params` object in the `dense_vector` mapping.
It should refer to `quantization_options`?
yep!
"element_type": "float", | ||
"dims": 2, | ||
"index": true, | ||
"quantization_params": { |
same here?
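For clarity, that hunk would presumably become (a sketch of the key rename only):

"element_type": "float",
"dims": 2,
"index": true,
"quantization_options": {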
 if (mNode == null) {
-    throw new MapperParsingException("[index_options] of type [hnsw] requires field [m] to be configured");
+    mNode = Lucene99HnswVectorsFormat.DEFAULT_MAX_CONN;
We probably need a test that configures `m` and not `ef_construction`, since it was not possible before this change?
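For illustration, such a test could index with a mapping along these lines (hypothetical index and field names; `ef_construction` deliberately omitted so it falls back to the new default):

PUT m-only
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 2,
        "index": true,
        "similarity": "l2_norm",
        "index_options": {
          "type": "hnsw",
          "m": 32
        }
      }
    }
  }
}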
Did some rally tests (this is without force-merging). [benchmark results omitted from this capture] You can see how script score over all the vectors hits many page faults as the data doesn't fit in memory. Also note all numbers reflect going from California (where the test cluster was located) to the east coast (where my rally machine is).
LGTM!
 @@ -67,6 +67,7 @@ public CartesianShapeValue() {
         super(CoordinateEncoder.CARTESIAN, CartesianPoint::new);
     }

+    @SuppressWarnings("this-escape")
looks unrelated?
 @@ -70,6 +70,7 @@ public GeoShapeValue() {
         this.tile2DVisitor = new Tile2DVisitor();
     }

+    @SuppressWarnings("this-escape")
same here?
Again, this is something that I fixed in main but is breaking my local build. For sanity, I fixed it here. It will be a no-op once main is merged in here.
},
"fields": [ "title" ],
"rescore": {
  "window_size": 10,
I think the `window_size` should be set to 15 to match the intent with the `k` value above?
It honestly doesn't matter. I would think you would get a larger k, and then rerank some subset of those.
 @@ -31,13 +31,15 @@ public class SimulateIndexResponse extends IndexResponse {
     private final BytesReference source;
     private final XContentType sourceXContentType;

+    @SuppressWarnings("this-escape")
Unrelated?
It will go away when we merge main, but it is breaking my local testing.
Very nice work!
Curiosity question: is there anything in the way of making this the default?
What would the API look like to disable it? If the user provides any HNSW params and doesn't specify quantization, does this mean we should default?
Ideally defaults would reflect the knn search tuning guide, so my current thinking is to have something like:

Alternatively, we could have a separate

PUT vectors
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "index": true,
        "index_options": {
          "type": "hnsw",
          "quantization": {
            "enabled": true,
            "type": "byte"
          }
        }
      }
    }
  }
}
I wonder if specialising
We already have parameters at this level such as
This is a great improvement.
I left some nit-picky comments on the docs, mostly to do with my own confusion: this is not an easy concept to grasp, and I believe a tightening of the wording will help folks with it.
What I struggle with is the use of `byte` while there is already a `byte` `element_type`. How do we want users to think about this: quantization is just an implementation detail when using the `float` `element_type` that improves runtime memory footprint, OR quantization is somehow coercing floats to bytes, and the user now needs to think of bytes? I would think the former, which kinda relates to other comments in this thread; maybe refer to the quantization as `int8` or a 1-byte integer value? Then users of `float` do not need to think of the `byte` `element_type`.
// TEST[s/"num_candidates": 100/"num_candidates": 3/]

Since the original `float` vectors are still retained in the index, you can optionally use them for re-scoring. This will
do the heavy query against the indexed vectors, and then you can get the absolute nearest neighbors by re-scoring.
"heavy query" implies no memory footprint improvements, right? The original float vectors are loaded.
Docs here are unclear, obviously. I mean to say that the expensive query is done via approximate search against the smaller footprint.
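For illustration, re-scoring quantized kNN hits against the original `float` vectors might look roughly like this (a sketch with hypothetical index/field names and values; the exact snippet from the docs is not reproduced in this capture):

POST my-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [ 0.5, 10 ],
    "k": 15,
    "num_candidates": 100
  },
  "fields": [ "title" ],
  "rescore": {
    "window_size": 15,
    "query": {
      "rescore_query": {
        "script_score": {
          "query": { "match_all": {} },
          "script": {
            "source": "1.0 / (1.0 + l2norm(params.query_vector, 'my_vector'))",
            "params": { "query_vector": [ 0.5, 10 ] }
          }
        }
      }
    }
  }
}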
I would not be comfortable making it the default right away. I really want more testing before we do this. But, we should indeed design the API in a way that makes this possible. For @jpountz 's option,
For @jimczi 's option, where we add a new index type, that could also work. For setting confidence interval, it would be
For things that don't support that parameter, we would throw. What do you think @jpountz ^
For things in the future, @ChrisHegarty 's comment about
Yes, I should update the quantization value name from
The question that this raises to me is whether "flat" should be considered as a special index type, or as a lack of index. I had initially assumed that flat storage would be enabled by setting
We need to support quantization over various indexing methodologies, including "flat". I think using "index: false" as a flat index is a mistake. It should only be useful for scripting. It is really weird to have "index: false" and then configure what you are "not" indexing 🤦. I think we need a new "flat" index kind that requires the similarity to be configured, and allows quantization. Admittedly, just "flat" isn't much better than "index: false", but it will clean up the query API really nicely (as we know the similarity that's configured). Another option is to have quantization live as a top-level thing. But all this configuration is getting unwieldy. I am starting to like @jimczi's suggestion more and more.
(Optional, object)
An optional section that configures quantization. Quantization is used to reduce the memory footprint of the
index. Only `byte` quantization is currently supported and can only be configured if
Do you think we should add a sentence about the disk usage increase with `quantization_options`?
@mayya-sharipova what do you think of @jimczi's suggested API? Adding a new "int8_hnsw" index type.
So if I read your suggestion correctly, we'd treat
Awesome, we are in agreement then. Moving to use `int8_hnsw`. I will update the PR when I can :D.
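For illustration, the agreed-upon mapping shape would presumably look something like this (a sketch only; the index name is hypothetical and the final parameter set was still being settled in this thread):

PUT vectors
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "index": true,
        "index_options": {
          "type": "int8_hnsw"
        }
      }
    }
  }
}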
+1 on @jimczi's proposal. +1 also to add

"similarity": "l2_norm",
"index_options": {
  "type": "flat"
}
LGTM, thanks for the iteration
Thanks @benwtrent, great work. New changes to `index_options` look good as well.
LGTM
Merged commit f00364a into elastic:lucene_snapshot.
Adds new `quantization_options` to `dense_vector`. This allows for vectors to be automatically quantized to `byte` when indexed.

Example: (the snippet was not captured in this copy; see the sketch below)

When querying, the query vector is automatically quantized and used when querying the HNSW graph. This reduces the memory required to only 25% of what was previously required for `float` vectors, at a slight loss of accuracy.

This is currently only available when `index: true` and when using `hnsw`.
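A sketch of what such a mapping might look like, assembled from the fragments quoted in the review threads above (the index name is hypothetical, and the exact option layout was still under discussion in this PR):

PUT my-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "element_type": "float",
        "dims": 2,
        "index": true,
        "index_options": {
          "type": "hnsw",
          "quantization_options": {
            "type": "byte"
          }
        }
      }
    }
  }
}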