Distance measures for dense and sparse vectors #37947

Merged
merged 11 commits into from Feb 20, 2019
+1,339 −74

Contributor

mayya-sharipova commented Jan 29, 2019 • edited

 Introduce painless functions of cosineSimilarity and dotProduct distance measures for dense and sparse vector fields.

```js
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.queryVector, doc['my_dense_vector'])",
        "params": {
          "queryVector": [4, 3.4, -1.2]
        }
      }
    }
  }
}
```

```js
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilaritySparse(params.queryVector, doc['my_sparse_vector'])",
        "params": {
          "queryVector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 140.8, "4545": 111.0}
        }
      }
    }
  }
}
```

Closes #31615
``` Distance measures for dense and sparse vectors ```
```Introduce painless functions of
cosineSimilarity and dotProduct distance
measures for dense and sparse vector fields.

```js
{
"query": {
"script_score": {
"query": {
"match_all": {}
},
"script": {
"source": "cosineSimilarity(params.queryVector, doc['my_dense_vector'].value)",
"params": {
"queryVector": [4, 3.4, -1.2]
}
}
}
}
}
```

```js
{
"query": {
"script_score": {
"query": {
"match_all": {}
},
"script": {
"source": "cosineSimilaritySparse(params.queryVector, doc['my_sparse_vector'].value)",
"params": {
"queryVector": {"2": -0.5, "10" : 111.3, "50": -13.0, "113": 14.8, "4545": -156.0}
}
}
}
}
}
```

Closes #31615```
``` 1f033c5 ```

Collaborator

elasticmachine commented Jan 29, 2019

 Pinging @elastic/es-search

Closed

Contributor

jpountz left a comment

 I only had a quick look, one concern that I have is that we are leaking the internal representation of vector fields. I believe we should instead expose vectors in scripts via a dedicated ScriptDocValues sub-class, like we are doing for dates for instance, or only give access to vector fields via functions, whose signature would look like `dotProduct(queryVector, fieldName)`.
 @@ -9,7 +9,8 @@ not exceed 500. The number of dimensions can be different across documents. A `dense_vector` field is a single-valued field. These vectors can be used for document scoring. These vectors can be used for {ref}/query-dsl-script-score-query.html#vector-functions[document scoring].

jpountz Jan 29, 2019

Contributor

Is there a reason not to use an internal link, e.g. `<<vector-functions,document scoring>>`?

mayya-sharipova Jan 30, 2019

Author Contributor

Thanks Adrien. I think we can use internal links only to reference within the same document. What I wanted to do here is reference a section of the external document

jpountz Jan 31, 2019

Contributor

I'm still a bit confused, this is the same document, isn't it?

mayya-sharipova Feb 5, 2019

Author Contributor

@jpountz Sorry Adrien, I meant that inside one asciidoc doc `dense-vector.asciidoc` we want to reference a section of another asciidoc doc `script-score-query.asciidoc`.

We can indeed use a simpler format: <<query-dsl-script-score-query,`document_scoring`>>, but this will link to the whole document. As I understood after talking with the documentation team, the only way to link to a section of another doc is to use this full HTML link.

Contributor

ok

mayya-sharipova Feb 12, 2019

Author Contributor

 this.queryVectorMagnitude = (float) Math.sqrt(dotProduct); } public float cosineSimilarity(BytesRef docVectorBR) {

jpountz Jan 29, 2019

Contributor

I would make these methods return a double. We only support floats at index time because of space constraints, but this isn't a problem here.

mayya-sharipova Jan 29, 2019

Author Contributor

@jpountz Thanks for the review, Adrien. I will change this to `double`. The main reason for `float` was that it is a document's score, and all other Scorers are returning floats.

mayya-sharipova added some commits Jan 30, 2019

``` Address Adrien's comments ```
``` 7075c03 ```
``` Merge branch 'master' into vector-fied-query ```
``` 0d4517f ```
``` Removes typed calls from YAML REST tests ```
``` 3535e48 ```
Contributor Author

mayya-sharipova commented Jan 30, 2019

 @jpountz Thanks for the initial review, Adrien. I have tried to address your comments, and this PR is ready for review when you have time:

- I have made the functions return `double` instead of `float`.
- I have changed the format of the functions from `dotProduct(queryVector, doc['my_dense_vector'].value)` to `dotProduct(queryVector, doc['my_dense_vector'])`. I could not make them exactly as you suggested, `dotProduct(queryVector, fieldName)`, inside the painless script.
- About exposing vectors in scripts via a dedicated ScriptDocValues sub-class: this was already implemented initially through `VectorScriptDocValues.java`.
- About leaking the internal representation of vector fields: I have made the `getValue()` method of `VectorScriptDocValues` package private, so that vector fields are NOT accessible in scripts, sorting, or aggs outside our distance functions. Or are you concerned that vector values are returned as part of the search response, as below?

```
"hits": [
  {
    "_index": "dindex",
    "_type": "_doc",
    "_id": "2",
    "_score": 1.0000001,
    "_source": {
      "my_text": "text2",
      "my_vector": [ 4.5, 3.4, -1.2 ]
    }
  },
```

LiuGangR commented Jan 31, 2019 • edited

 How do I use cosineSimilarity? I am using a snapshot built from your branch 'vector-fied-query', and it just tells me '"lang":"painless","caused_by":{"type":"illegal_argument_exception","reason":"Variable [my_feature] is not defined'.

The full error is:

`{"error":{"root_cause":[{"type":"script_exception","reason":"compile error","script_stack":["... (params.queryVector, doc[my_feature].value)"," ^---- HERE"],"script":"cosineSimilarity(params.queryVector, doc[my_feature].value)","lang":"painless"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"test_index","node":"mZKn55wJSSi-vs3hMxocbQ","reason":{"type":"query_shard_exception","reason":"script_score: the script could not be loaded","index_uuid":"Q8lJHJLLRIatXSOY1_2UJg","index":"test_index","caused_by":{"type":"script_exception","reason":"compile error","script_stack":["... (params.queryVector, doc[my_feature].value)"," ^---- HERE"],"script":"cosineSimilarity(params.queryVector, doc[my_feature].value)","lang":"painless","caused_by":{"type":"illegal_argument_exception","reason":"Variable [my_feature] is not defined."}}}}],"caused_by":{"type":"script_exception","reason":"compile error","script_stack":["... (params.queryVector, doc[my_feature].value)"," ^---- HERE"],"script":"cosineSimilarity(params.queryVector, doc[my_feature].value)","lang":"painless","caused_by":{"type":"illegal_argument_exception","reason":"Variable [my_feature] is not defined."}}},"status":400}`

And this is my mapping:

`{ "test_index": { "mappings": { "properties": { "my_feature": { "type": "dense_vector" } } } } }`

Contributor

jpountz left a comment

 Thanks Mayya, I like this approach much more. I left some minor comments. One additional thing that would be nice to address would be to make sure that users get a nice error if they call the sparse functions on dense vectors or vice-versa, I have the feeling that users would get cryptic decoding errors if they do that with the current state of your PR?
 @Override public SortedBinaryDocValues getBytesValues() { return null;

jpountz Jan 31, 2019

Contributor

can you throw an exception instead?

 @@ -74,6 +74,108 @@ to be the most efficient by using the internal mechanisms. -------------------------------------------------- // NOTCONSOLE [[vector-functions]] ===== Distance functions for vector fields These functions are used to calculate distances

jpountz Jan 31, 2019

Contributor

Let's maybe avoid mentioning "distance", since e.g. cosineSimilarity measures the similarity between two vectors rather than their distance?

 // NOTCONSOLE NOTE: If a document doesn't have a value for a vector field on which a distance function is executed, 0 will be returned as a result.

jpountz Jan 31, 2019

Contributor

Let's also clarify what happens for dense vectors if they don't have the same number of dimensions?

 public static int[] decodeSparseVectorDims(BytesRef vectorBR) { if (vectorBR == null) { throw new IllegalStateException("A document doesn't have a value for a vector field!"); }

jpountz Jan 31, 2019

Contributor

Shouldn't this be an illegal argument exception?

 int i = 0; for (Map.Entry dimValue : queryVector.entrySet()) { queryDims[i] = Integer.parseInt(dimValue.getKey()); queryValues[i] = dimValue.getValue().floatValue();

jpountz Jan 31, 2019

Contributor

s/floatValue/doubleValue/?

 double dotProduct = 0; int i = 0; for (Map.Entry dimValue : queryVector.entrySet()) { queryDims[i] = Integer.parseInt(dimValue.getKey());

jpountz Jan 31, 2019

Contributor

catch the NumberFormatException to return a more user-friendly exception?
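A minimal sketch of what this suggestion could look like, assuming the sparse query vector arrives as a `Map` with string keys as in the diff hunk above. The class name `SparseQueryVectorParser`, the method `parseDims`, and the error message wording are hypothetical and not taken from the PR.

```java
import java.util.Map;

final class SparseQueryVectorParser {
    // Parses the string keys of a sparse query vector into integer dimensions.
    // A malformed key is reported with the offending value instead of a bare
    // NumberFormatException surfacing from inside the script.
    static int[] parseDims(Map<String, Number> queryVector) {
        int[] dims = new int[queryVector.size()];
        int i = 0;
        for (String key : queryVector.keySet()) {
            try {
                dims[i] = Integer.parseInt(key);
            } catch (NumberFormatException e) {
                // Hypothetical message; the original exception is kept as the cause.
                throw new IllegalArgumentException(
                    "Failed to parse query vector dimension [" + key + "]: dimensions must be integers", e);
            }
            i++;
        }
        return dims;
    }
}
```

Keeping the `NumberFormatException` as the cause preserves the low-level detail for debugging while the top-level message names the bad dimension.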

 // calculate docVector magnitude double dotProduct = 0; for (float value : docValues) { dotProduct += value * value;

jpountz Jan 31, 2019

Contributor

cast one of the values to a double to have better accuracy and avoid overflows?
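To illustrate the point, here is a small sketch contrasting the two ways of accumulating the sum of squares. In Java, `float * float` is evaluated in `float`, so a large component can overflow to `Infinity` before it is added to the `double` accumulator. The class and method names are hypothetical.

```java
final class VectorMagnitude {
    // Widens each component to double before multiplying, so the product
    // keeps double range and precision.
    static double magnitude(float[] values) {
        double sumOfSquares = 0;
        for (float value : values) {
            sumOfSquares += (double) value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    // For contrast: the product is computed in float and only then widened,
    // so it overflows once a component exceeds roughly 1.8e19 in magnitude.
    static double magnitudeWithFloatProduct(float[] values) {
        double sumOfSquares = 0;
        for (float value : values) {
            sumOfSquares += value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        float[] large = {3e20f, 4e20f};
        System.out.println(magnitude(large));
        System.out.println(magnitudeWithFloatProduct(large));
    }
}
```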

 VectorDVAtomicFieldData(BinaryDocValues values) { super(); this.values = values;

jpountz Jan 31, 2019

Contributor

let's take a LeafReader and a String field like other impls do and re-pull binary doc values each time, this way calling getScriptDocValues() multiple times on the same AtomicFieldData instance will work as expected

 } // package private access only for {@link ScoreScriptUtils} BytesRef getValue() {

jpountz Jan 31, 2019

Contributor

let's call it something like `getEncodedValue` to clarify what it is about?

Contributor

jpountz commented Jan 31, 2019

 @LiuGangR You need to put quotes around the field name.

LiuGangR commented Jan 31, 2019 • edited

 @jpountz Thanks! This query is working. `{ "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "dotProduct(params.queryVector, doc[\"my_feature\"])", "params": { "queryVector": [4, 3.4, -1.2] } } } } }`

LiuGangR commented Jan 31, 2019

 But there is a new problem: the script score function must not produce negative scores.

`{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"script score function must not produce negative scores, but got: [-0.1967234265776135]"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"test_index","node":"mZKn55wJSSi-vs3hMxocbQ","reason":{"type":"illegal_argument_exception","reason":"script score function must not produce negative scores, but got: [-0.1967234265776135]"}}],"caused_by":{"type":"illegal_argument_exception","reason":"script score function must not produce negative scores, but got: [-0.1967234265776135]","caused_by":{"type":"illegal_argument_exception","reason":"script score function must not produce negative scores, but got: [-0.1967234265776135]"}}},"status":400}`
Contributor

jpountz commented Jan 31, 2019

 This is a good point, we should update examples so that they may only create positive scores, regardless of what vectors are indexed.
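One way to guarantee non-negative scores, sketched here in plain Java rather than Painless: cosine similarity lies in [-1, 1], so adding 1.0 maps it to [0, 2], and an unbounded dot product can be squashed with a logistic function. Both transforms are illustrative assumptions, not necessarily what the updated documentation examples use, and all names below are hypothetical.

```java
final class NonNegativeScoreSketch {
    static double dotProduct(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same number of dimensions");
        }
        double dot = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
        }
        return dot;
    }

    // Cosine similarity of two non-zero vectors, clamped to [-1, 1] so that
    // floating-point rounding cannot push it slightly out of range.
    static double cosineSimilarity(double[] a, double[] b) {
        double cosine = dotProduct(a, b) / Math.sqrt(dotProduct(a, a) * dotProduct(b, b));
        return Math.max(-1.0, Math.min(1.0, cosine));
    }

    // Shifts cosine similarity into [0, 2].
    static double shiftedCosine(double[] query, double[] doc) {
        return cosineSimilarity(query, doc) + 1.0;
    }

    // Squashes an unbounded dot product into [0, 1] with a logistic function.
    static double squashedDotProduct(double[] query, double[] doc) {
        return 1.0 / (1.0 + Math.exp(-dotProduct(query, doc)));
    }
}
```

In a script, the first transform would presumably correspond to a source such as `cosineSimilarity(params.queryVector, doc['my_dense_vector']) + 1.0`.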

LiuGangR commented Feb 1, 2019

 @jpountz That is cool. In which version do you plan to support this?
Contributor

jpountz commented Feb 1, 2019

 @LiuGangR Hopefully 7.1.

LiuGangR commented Feb 1, 2019

 @jpountz Another question: if I want to search a 'dense_vector' field, is 'cosineSimilarity' the only way? And is there a default vector query? Thanks!
Contributor Author

mayya-sharipova commented Feb 5, 2019

 @LiuGangR yes, the only way to use `dense_vector` or `sparse_vector` in queries is through `cosineSimilarity` and `dotProduct` functions
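For readers unfamiliar with the sparse variant, the following sketch shows one way a sparse dot product can be computed. It assumes each sparse vector has been decoded into parallel arrays of dimensions and values, with dimensions sorted in ascending order; that layout and all names here are assumptions for illustration, not the PR's actual implementation.

```java
final class SparseDotProductSketch {
    // Walks both sorted dimension arrays in step (a two-pointer merge) and
    // multiplies values only where the same dimension appears in both vectors.
    // Dimensions missing from either vector contribute 0.
    static double dotProductSparse(int[] queryDims, float[] queryValues,
                                   int[] docDims, float[] docValues) {
        double dot = 0;
        int q = 0;
        int d = 0;
        while (q < queryDims.length && d < docDims.length) {
            if (queryDims[q] == docDims[d]) {
                dot += (double) queryValues[q] * docValues[d];
                q++;
                d++;
            } else if (queryDims[q] < docDims[d]) {
                q++;
            } else {
                d++;
            }
        }
        return dot;
    }
}
```

The merge visits each entry at most once, so the cost is proportional to the number of non-zero entries rather than to the largest dimension index.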

mayya-sharipova added some commits Feb 5, 2019

``` Address Adrien's comments 2 ```
``` ac0205c ```
``` Merge branch 'master' into vector-fied-query ```
``` 608a1fb ```
Contributor Author

mayya-sharipova commented Feb 5, 2019

 @jpountz Thanks Adrien for another review. I have addressed all your feedback except one comment, and this PR is ready for another round of review whenever you have time.

Unaddressed feedback:

> One additional thing that would be nice to address would be to make sure that users get a nice error if they call the sparse functions on dense vectors or vice-versa, I have the feeling that users would get cryptic decoding errors if they do that with the current state of your PR?

Users can make two mistakes here:

- Provide a query vector in the wrong format. Here we have some safeguards for `parseInt`; otherwise the painless script engine will complain and I can't do anything (e.g. queryVector is expected to be a `Map` but an `Array` was provided).
- Provide a document vector in the wrong format (dense versus sparse). It looks like a user will not see failures here, but will see unexpected scores (either 0, or very large negative float numbers). The only way to prevent this is to use the first byte in the BytesRef as a special value that tells us whether the encoded vector is dense or sparse. What do you think?

About changing the encoding for vector fields: I was also thinking of possibly encoding the magnitude of a document vector, so as not to calculate it each time. What do you think about this?
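The two encoding ideas raised in this comment (a leading type byte, and a precomputed magnitude) can be sketched as follows. The byte layout, the tag values, and every name in this block are hypothetical; this is only an illustration of the idea under those assumptions, not the encoding Elasticsearch uses.

```java
import java.nio.ByteBuffer;

final class TaggedVectorEncoding {
    // Hypothetical tag values stored in the first byte.
    static final byte DENSE = 0;
    static final byte SPARSE = 1;

    // Assumed layout: [tag byte][float values...][float magnitude].
    static byte[] encodeDense(float[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(1 + Float.BYTES * values.length + Float.BYTES);
        buffer.put(DENSE);
        double sumOfSquares = 0;
        for (float value : values) {
            buffer.putFloat(value);
            sumOfSquares += (double) value * value;
        }
        buffer.putFloat((float) Math.sqrt(sumOfSquares));
        return buffer.array();
    }

    // Fails with a clear message when the tag says the vector is not dense,
    // instead of silently decoding garbage and producing odd scores.
    static float[] decodeDense(byte[] encoded) {
        if (encoded.length < 1 + Float.BYTES || encoded[0] != DENSE) {
            throw new IllegalArgumentException("expected an encoded dense vector");
        }
        ByteBuffer buffer = ByteBuffer.wrap(encoded, 1, encoded.length - 1);
        float[] values = new float[(encoded.length - 1 - Float.BYTES) / Float.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getFloat();
        }
        return values;
    }

    // Reads the magnitude stored in the last four bytes, so it does not
    // have to be recomputed for every scored document.
    static float storedMagnitude(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getFloat(encoded.length - Float.BYTES);
    }
}
```

The trade-off is five extra bytes per stored vector in exchange for a type check and one less pass over the values at query time.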
``` Correct check style ```
``` e00f7d5 ```

LiuGangR commented Feb 12, 2019

 @jpountz I built from source today, using the latest commit of vector-field-query, and ran the same data and search that succeeded with the previous build, but it now fails. This is the log:

`{"error":{"root_cause":[{"type":"script_exception","reason":"runtime error","script_stack":["cosineSimilarity(params.queryVector, doc[\"my_feature\"])"," ^---- HERE"],"script":"cosineSimilarity(params.queryVector, doc[\"my_feature\"])","lang":"painless"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"test_index","node":"1FxxsdfiRfGh_O_3OAe6ZA","reason":{"type":"script_exception","reason":"runtime error","script_stack":["cosineSimilarity(params.queryVector, doc[\"my_feature\"])"," ^---- HERE"],"script":"cosineSimilarity(params.queryVector, doc[\"my_feature\"])","lang":"painless","caused_by":{"type":"class_cast_exception","reason":"class org.elasticsearch.index.query.VectorScriptDocValues cannot be cast to class org.apache.lucene.util.BytesRef (org.elasticsearch.index.query.VectorScriptDocValues is in unnamed module of loader java.net.FactoryURLClassLoader @46046c06; org.apache.lucene.util.BytesRef is in unnamed module of loader 'app')"}}}]},"status":400}`

Contributor

jpountz left a comment

 +1 Thanks @mayya-sharipova.

mayya-sharipova merged commit `3260fd1` into elastic:master Feb 20, 2019 9 checks passed


weizijun added a commit to weizijun/elasticsearch that referenced this pull request Feb 20, 2019

``` Distance measures for dense and sparse vectors (elastic#37947) ```
``` b017f3d ```

Closed

mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Feb 22, 2019

``` Distance measures for dense and sparse vectors (elastic#37947) ```
``` 47594ac ```

Merged

mayya-sharipova added a commit that referenced this pull request Feb 23, 2019

``` Backport distance functions vectors (#39330) ```
```Distance functions for dense and sparse vectors

Backport for #37947, #39313```
``` e802842 ```

Closed

wmelton commented May 13, 2019

 @mayya-sharipova - For clarification, does this native vector function use `source` values for the computations or the document values? I only ask because there seem to be performance degradations anytime source values are accessed at query time for queries like this, and the documentation seems to suggest the use of _source values - https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl-script-score-query.html#vector-functions

Also - do you have any performance numbers you've run/tested? Someone mentioned this feature was being added and said a test with 5 million documents with vectors of dim=300 took 5 seconds to return results, which seems like a pretty anemic response time.
Contributor Author

mayya-sharipova commented May 17, 2019

 @wmelton Answering your questions:

> For clarification, does this native vector function use source values for the computations or the document values?

We use binary document values: we encode vectors as binaries during indexing, and decode them back to numeric vectors during search.

> do you have any performance numbers you've run/tested?

No, currently we don't, but we plan to work on adding some benchmarks. Vector functions use a linear scan over all matched docs, so the response time should increase linearly with the number of matched docs.

Also, I would like to note that vector fields are an experimental feature, and the APIs and the way vectors are indexed and encoded may change in a non-backward-compatible way.
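The linear-scan behaviour described in this reply can be pictured with a brute-force sketch: every matched document's vector is scored against the query once, and only the best `k` are kept. The in-memory `float[][]` of document vectors, the use of a plain dot product as the score, and all names are simplifying assumptions, not how Elasticsearch collects hits.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

final class LinearScanSketch {
    static double dotProduct(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same number of dimensions");
        }
        double dot = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
        }
        return dot;
    }

    // Returns the ids (row indexes) of the k best-scoring documents, best first.
    // Scoring costs O(n * dims) for n matched documents, which is why response
    // time grows linearly with the number of matches.
    static int[] topK(float[] query, float[][] docVectors, int k) {
        final double[] scores = new double[docVectors.length];
        for (int doc = 0; doc < docVectors.length; doc++) {
            scores[doc] = dotProduct(query, docVectors[doc]);
        }
        // Min-heap on score: the weakest retained hit sits at the head.
        Comparator<Integer> byScore = Comparator.comparingDouble(id -> scores[id]);
        PriorityQueue<Integer> weakestFirst = new PriorityQueue<>(byScore);
        for (int doc = 0; doc < docVectors.length; doc++) {
            weakestFirst.add(doc);
            if (weakestFirst.size() > k) {
                weakestFirst.poll();
            }
        }
        int[] result = new int[weakestFirst.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = weakestFirst.poll();
        }
        return result;
    }
}
```

Restricting the inner query so that fewer documents match is the main lever for keeping this cost down.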

wmelton commented May 19, 2019 • edited

 Thank you for your responses. Regarding "Vector functions use linear scan over all matched docs, so the response time should increase linearly with the number of matched docs." - I think taking the linear approach for this is a mistake, personally. The pL2AP algorithm and Facebook's open source FAISS (Facebook AI Similarity Search) both highlight ways to parallelize the search space. I think implementing a linear search approach will be frustrating to the type of users who are actually the most likely to want to use the dense or sparse vector field type you are proposing to add.
Contributor Author

mayya-sharipova commented May 22, 2019

 @wmelton Thanks for your comment. Indeed linear scan would not scale, and it is intended mostly to score a limited set of documents. About `FAISS` library, the speed ups there are based on the hardware acceleration and approximate knn algorithms. We currently don't have plans to employ hardware acceleration, but we are exploring algorithms for approximate knn.

ra1ski commented May 27, 2019 • edited

 @mayya-sharipova Hi! Is there any chance to use long dense vectors to compute cosine distance? I have these kinds of vectors `[0.7831882238388062, 0.8473913073539734, 0.6641695499420166...]` with 200 floating point numbers

ra1ski commented May 28, 2019

 @mayya-sharipova Hi! Is there any chance to use long dense vectors to compute cosine distance? I have these kinds of vectors `[0.7831882238388062, 0.8473913073539734, 0.6641695499420166...]` with 200 floating point numbers Your example above with "queryVector": [ 4.5, 3.4, -1.2] works fine, but when it comes to [0.7831882238388062, 0.8473913073539734, 0.6641695499420166...] vectors, I get an error: ` "caused_by": { "type": "script_exception", "reason": "compile error", "script_stack": [ "cosineSimilarity(params.q ...", "^---- HERE" ], "script": "cosineSimilarity(params.queryVector, doc['vector'])", "lang": "painless", "caused_by": { "type": "illegal_argument_exception", "reason": "Unknown call [cosineSimilarity] with [2] arguments." } }`
Contributor Author

mayya-sharipova commented May 28, 2019 • edited

 @ra1ski What do you mean by "long dense vectors"? Do you mean to use 200 dimensions? Yes, you can use up to 1024 dimensions. It should be fine. I am not sure why you are experiencing this error. Can you provide the whole query? Also are you testing this against the current master?

ra1ski commented May 29, 2019 • edited

 @ra1ski What do you mean by "long dense vectors"? Do you mean to use 200 dimensions? Yes, you can use up to 1024 dimensions. It should be fine. I am not sure why you are experiencing this error. Can you provide the whole query? Also are you testing this against the current master? Yes, 200 dimensions. I'm using it against 7.0.0. master, tried with 7.1.1 also Here is the query ```{ "query": { "script_score": { "query": { "match_all": {} }, "script": { "source": "cosineSimilarity(params.queryVector, doc['vector'])", "params": { "queryVector": [0.7831882238388062, 0.8473913073539734, 0.6641695499420166, -0.7800988554954529, 0.6427151560783386, 0.8618375062942505, -0.7508959174156189, 0.8940073251724243, -0.8382183313369751, -0.8465797305107117, 0.8887408375740051, 0.8348124623298645, 0.7685972452163696, -0.8586599230766296, 0.7378193140029907, -0.7119467854499817, -0.8077011108398438, 0.8601088523864746, 0.8935535550117493, 0.6392208337783813, 0.8716743588447571, -0.7871374487876892, 0.6682323217391968, -0.8151301145553589, -0.8227899670600891, -0.7399943470954895, -0.897373378276825, 0.8426622152328491, 0.8269796371459961, 0.8424233198165894, 0.8509830236434937, -0.7777097821235657, 0.8377213478088379, 0.9059052467346191, 0.7352653741836548, -0.7400990128517151, -0.8934587240219116, -0.9130118489265442, -0.8574285507202148, -0.8946468234062195, 0.8552821278572083, 0.8763160705566406, -0.7989016771316528, -0.642711341381073, -0.7476733922958374, -0.8486865758895874, 0.8278630971908569, -0.8525271415710449, -0.8806391954421997, -0.6730614304542542, -0.881908118724823, 0.7430080771446228, 0.7847618460655212, 0.8260719180107117, -0.8224948644638062, -0.7607067823410034, 0.8367544412612915, 0.20206642150878906, 0.7692943215370178, -0.8679789304733276, -0.7517973780632019, -0.8642300367355347, -0.7322789430618286, -0.8890762329101562, -0.8113778829574585, -0.8182528614997864, -0.8263254165649414, 0.8806875944137573, -0.8628260493278503, 0.838936984539032, 
0.8677369952201843, -0.776382565498352, 0.8289804458618164, 0.6592877507209778, -0.8425590395927429, -0.763074517250061, 0.8569432497024536, -0.7417001128196716, 0.8681409955024719, -0.8540714979171753, -0.8500930070877075, -0.8368064761161804, -0.8406449556350708, -0.8733716011047363, -0.8958595991134644, 0.8130819201469421, -0.8314911723136902, 0.8423287272453308, 0.8449920415878296, -0.8795095682144165, 0.7511520981788635, -0.8035956621170044, 0.7193001508712769, 0.7730565071105957, -0.857988178730011, 0.8187726140022278, 0.831302285194397, 0.8996239900588989, -0.863531231880188, 0.8358138799667358, -0.8426796197891235, 0.8390976190567017, 0.7986222505569458, -0.8568884134292603, 0.8369844555854797, 0.8447090983390808, 0.8311792612075806, -0.8208156824111938, -0.7700560092926025, -0.784808874130249, -0.874031662940979, 0.8473763465881348, 0.8083603978157043, 0.8634394407272339, 0.8724079132080078, -0.7952577471733093, 0.5091663599014282, 0.656829833984375, -0.8029653429985046, -0.8171727061271667, 0.8314194679260254, -0.8559287190437317, 0.8022019267082214, 0.7917070388793945, -0.8446627855300903, -0.7673274278640747, 0.832277774810791, -0.8024963140487671, 0.9498147964477539, -0.7452983856201172, 0.8978539705276489, 0.8834426999092102, 0.8543949127197266, 0.8466156721115112, -0.8207280039787292, 0.8191858530044556, -0.8309515118598938, 0.7519159317016602, 0.8341091275215149, -0.8656532168388367, 0.8573458790779114, -0.8247408866882324, 0.9135391116142273, -0.8272571563720703, -0.8448845148086548, -0.8408781290054321, -0.8409822583198547, -0.842566967010498, 0.7356223464012146, 0.8904960751533508, 0.8448322415351868, -0.8642748594284058, 0.8605462908744812, 0.8045945167541504, -0.8715876340866089, -0.8079540133476257, -0.8474785089492798, -0.8472393155097961, 0.8432945013046265, -0.8253397941589355, 0.7905577421188354, 0.7081928253173828, 0.6722716093063354, 0.8101333379745483, -0.8465112447738647, 0.8858150243759155, 0.8352972269058228, -0.7904651761054993, 
-0.8659583330154419, -0.8847810626029968, -0.762391209602356, -0.7752716541290283, -0.7860286831855774, -0.8350412249565125, -0.8377161026000977, -0.8326281309127808, 0.6579743027687073, -0.8490581512451172, 0.7932018041610718, 0.7292879819869995, 0.8307300806045532, 0.8333244323730469, -0.7778127193450928, -0.8621459007263184, -0.8240952491760254, 0.8149698376655579, 0.8036678433418274, 0.7759568691253662, -0.8074528574943542, -0.8319423794746399, -0.685379683971405, -0.6155311465263367, 0.771338701248169, 0.7577664256095886, 0.7837430238723755, -0.7604954838752747, 0.8120626211166382, -0.8959243893623352, -0.7081544995307922, 0.8636442422866821] } } } } }```
Contributor Author

mayya-sharipova commented Jun 4, 2019

 @ra1ski Vector functions are available starting from 7.2

mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Jul 9, 2019

``` Add l1norm and l2norm distances for vectors ```
```Add L1norm - Manhattan distance
relates to elastic#37947```
``` eb48e80 ```

Merged

pullbot pushed a commit to indux/elasticsearch that referenced this pull request Jul 11, 2019

``` Add l1norm and l2norm distances for vectors (elastic#44116) ```
```* Add l1norm and l2norm distances for vectors

relates to elastic#37947

- organize vector functions as a separate doc
- increase precision in tests calculations
- add a separate test when sparse doc dims
are bigger and less than query vector dims```

``` 16747f8 ```

mayya-sharipova added a commit that referenced this pull request Jul 11, 2019

``` Add l1norm and l2norm distances for vectors (#44116) ```
```Add L1norm - Manhattan distance
relates to #37947```
``` 32cb47b ```

skontos added a commit to skontos/elasticsearch that referenced this pull request Jul 13, 2019

``` Add l1norm and l2norm distances for vectors (elastic#44116) ```
```* Add l1norm and l2norm distances for vectors```

``` b00f161 ```