This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Support KNN Backward compatibility. Avoid KNNCodec in the index settings and use generic settings. #20

Closed
vamshin opened this issue Jan 23, 2020 · 1 comment

Comments


vamshin commented Jan 23, 2020

Problem Statement

Backward Compatibility

Lucene codecs are expected to be versioned to support backward compatibility. As part of the KNN plugin, we introduced a new codec (KNNCodec), and this codec is not versioned. When we create segments, we write the codec name to the SegmentInfo file. If we change KNNCodec from v1 to v2, we have no way to refer to the different versions: Lucene will always look up a codec named KNNCodec, which will always resolve to the latest implementation.

Customer Experience

To create a KNN index, customers have to pass the codec information in the index settings.
Example:

PUT /myindex
{
  "settings" : {
    "index.codec": "KNNCodec"   // Codec info
  },
  "mappings": {
      "properties": {
        "my_vector": { 
          "type": "knn_vector",
          "dimension": 2
        }
      }
  }
}
  • Once the codec is versioned, customers will have to keep track of which KNNCodec version goes with each Elasticsearch version, which can cause confusion and issues during ES version upgrades.
  • It will be a pain to update customer documentation for each ES release to point to the right codec version.

Solution

Version KNNCodec to support backward compatibility. If there are breaking changes with respect to an Elasticsearch version, KNNCodec can be versioned, for example KNNCodec70 and KNNCodec80. As part of segment creation, we write KNNCodec70 or KNNCodec80 to the SegmentInfo file, and Lucene can find the matching codec logic to read/write segments. This supports reads on old segments and reads/writes on new segments across ES version upgrades.
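To make the lookup idea concrete, here is a minimal sketch in Python (not the plugin's Java, and not the Lucene API). It assumes a toy registry keyed by codec name. The two payload formats and the helper names (register, write_segment, read_segment) are hypothetical and exist only to show why a versioned name stored in the segment lets old and new segments coexist.

```python
# Toy model of name-based codec lookup. This is NOT the Lucene API;
# the registry, helpers, and payload formats are illustrative assumptions.

CODECS = {}


def register(name, reader):
    """Register a reader function under a codec name."""
    CODECS[name] = reader


def read_v70(payload):
    # Hypothetical older on-disk format: comma-separated floats.
    return [float(x) for x in payload.split(",")]


def read_v80(payload):
    # Hypothetical newer on-disk format: pipe-separated floats.
    return [float(x) for x in payload.split("|")]


# Each version keeps its own name, so both stay resolvable after an upgrade.
register("KNNCodec70", read_v70)
register("KNNCodec80", read_v80)


def write_segment(codec_name, payload):
    """Store the codec name next to the data, like the SegmentInfo file does."""
    return {"codec": codec_name, "payload": payload}


def read_segment(segment):
    """Resolve the reader from the name recorded in the segment."""
    reader = CODECS[segment["codec"]]
    return reader(segment["payload"])


old_segment = write_segment("KNNCodec70", "1.0,2.0")
new_segment = write_segment("KNNCodec80", "1.0|2.0")

print(read_segment(old_segment))
print(read_segment(new_segment))
```

With a single unversioned name, the second register call would overwrite the first, and the old segment would be read with the newer, incompatible logic.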

Introduce a new index setting called "index.knn" to identify KNN indices, so that customers no longer pass KNNCodec information during index creation and can stay agnostic to the KNNCodec versions.
Example:

PUT /myindex
{
  "settings" : {
    "index.knn": true   //setting to identify knn index
  },
  "mappings": {
      "properties": {
        "my_vector": { 
          "type": "knn_vector",
          "dimension": 2
        }
      }
  }
} 
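One way the plugin side could use this setting is sketched below in Python. This is an assumption about the shape of the logic, not the plugin's actual implementation: resolve_codec, LATEST_KNN_CODEC, and DEFAULT_CODEC are hypothetical names, and the sketch treats index settings as a flat dictionary.

```python
# Sketch of choosing a codec from index settings. The function and constant
# names are hypothetical; only "index.knn" and "index.codec" come from the issue.

LATEST_KNN_CODEC = "KNNCodec80"  # assumed latest versioned codec name
DEFAULT_CODEC = "default"        # assumed fallback when nothing is specified


def resolve_codec(settings):
    """Return the codec name to use for an index with the given settings."""
    knn = settings.get("index.knn", False)
    # Settings often arrive as strings, so accept "true"/"false" as well.
    if isinstance(knn, str):
        knn = knn.strip().lower() == "true"
    if knn:
        # The customer never names the codec; the plugin picks the version.
        return LATEST_KNN_CODEC
    return settings.get("index.codec", DEFAULT_CODEC)


print(resolve_codec({"index.knn": True}))
print(resolve_codec({"index.codec": "best_compression"}))
```

The point of the design is that only the plugin knows the versioned codec name, so an upgrade changes one constant instead of every customer's index settings.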
vamshin changed the title from "Avoid KNNCodec in the index settings and use generic settings" to "Support KNN Backward compatibility. Avoid KNNCodec in the index settings and use generic settings." Jan 23, 2020

vamshin commented Jan 23, 2020

#21

@vamshin vamshin closed this as completed Feb 4, 2020