## Documentation

To read more about pre-filtering with kNN search, see the [docs](https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-knn-query#knn-query-filtering).

![pre_filtering_with_knn_search](../images/pre_filtering_with_knn_search.png)

## Connect to Elasticsearch

In [2]:
from pprint import pprint
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
client_info = es.info()
print("Connected to Elasticsearch!")
pprint(client_info.body)

Connected to Elasticsearch!
{'cluster_name': 'docker-cluster',
 'cluster_uuid': 'DlYG5m9gR3upn7qgaYyAJA',
 'name': 'df0334cb3063',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2024-08-05T10:05:34.233336849Z',
             'build_flavor': 'default',
             'build_hash': '1a77947f34deddb41af25e6f0ddb8e830159c179',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '9.11.1',
             'minimum_index_compatibility_version': '7.0.0',
             'minimum_wire_compatibility_version': '7.17.0',
             'number': '8.15.0'}}


## Preparing the index

We create the `apod` index with an `embedding` field of type `dense_vector` to store the embeddings.

In [9]:
es.indices.delete(index="apod", ignore_unavailable=True)
es.indices.create(
    index="apod",
    mappings={
        "properties": {
            "embedding": {
                "type": "dense_vector",
            }
        }
    },
)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'apod'})
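The mapping above leaves the vector settings at their defaults. As a minimal sketch, the same mapping can spell out the vector size and similarity, and map the `year` field that is added during indexing later in this notebook. `dims`, `index`, and `similarity` are documented `dense_vector` mapping parameters; the helper name `build_apod_mappings` and the choice of `integer` for `year` are assumptions made for illustration.

```python
def build_apod_mappings(dims=384, similarity="cosine"):
    """Return index mappings for the apod index (hypothetical helper)."""
    return {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": dims,  # must match the embedding model's output size
                "index": True,  # index the vectors so they can be searched with kNN
                "similarity": similarity,
            },
            # Assumption: store the year as an integer for term/range filters.
            "year": {"type": "integer"},
        }
    }


mappings = build_apod_mappings()
# es.indices.create(index="apod", mappings=mappings)  # needs a running cluster
```

Declaring `dims` up front means a vector of the wrong size should be rejected at index time rather than silently defining the field.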

## Embedding model

![all-MiniLM-L6-v2_model](../images/all-MiniLM-L6-v2_model.png)

I chose the `all-MiniLM-L6-v2` model for its speed, compact size, and versatility as a general-purpose model. It produces `384`-dimensional embeddings and truncates input longer than `256` word pieces (tokens, not words). This model is very popular in the community with almost `50M` downloads in one month.

To download and utilize this model, Hugging Face offers a Python package called `sentence-transformers`. This framework simplifies the process of computing dense vector representations.

In [10]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
model

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
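The printed pipeline shows a `Pooling` step with `pooling_mode_mean_tokens` enabled, followed by `Normalize()`. The sketch below illustrates those two steps on toy vectors in plain Python. It is a simplification, not the library's implementation: the real pooling also uses the attention mask to skip padding tokens, and the function names here are made up for illustration.

```python
import math


def mean_pool(token_vectors):
    """Average token vectors component-wise into one sentence vector."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


# Toy 3-dimensional token vectors; the real model uses 384 dimensions.
token_vectors = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]]
sentence_vector = l2_normalize(mean_pool(token_vectors))
print(sentence_vector)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.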

In [11]:
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

In [12]:
model = model.to(device)
model

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

## Index documents

Let's use the `APOD` dataset in this notebook.

In [13]:
import json

with open("../data/apod.json") as f:
    documents = json.load(f)

Let's use the embedding model to embed the `explanation` field of the `APOD` dataset.

Use the `bulk` API to index the documents in the `apod` index.

In [None]:
from tqdm import tqdm


def get_embedding(text):
    return model.encode(text)


operations = []
for document in tqdm(documents, total=len(documents), desc="Indexing documents"):
    year = document["date"].split("-")[0]
    document["year"] = int(year)

    operations.append({"index": {"_index": "apod"}})
    operations.append(
        {
            **document,
            "embedding": get_embedding(document["explanation"]),
        }
    )

response = es.bulk(operations=operations)

Indexing documents: 100%|██████████| 3333/3333 [00:09<00:00, 364.03it/s]


If the indexing is successful, you should see `response["errors"]` as `False`.

In [22]:
response["errors"]

False
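When `response["errors"]` is `True`, the per-document results in `response["items"]` show which actions failed. The following sketch collects them; `failed_items` is a hypothetical helper, and `sample_response` is a hand-written dictionary shaped like a bulk response, not real cluster output.

```python
def failed_items(bulk_response):
    """Return (position, error) pairs for bulk actions that failed."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item is keyed by its action type, e.g. "index".
        result = next(iter(item.values()))
        if "error" in result:
            failures.append((position, result["error"]))
    return failures


# Hand-written sample shaped like a bulk response, for illustration only.
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_index": "apod", "status": 201}},
        {
            "index": {
                "_index": "apod",
                "status": 400,
                "error": {"type": "mapper_parsing_exception"},
            }
        },
    ],
}
print(failed_items(sample_response))
```

The position in `items` matches the order in which the documents were submitted, so it can be used to look up the offending entry in `documents`.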

## Pre-filtering with kNN Search

### Regular kNN search

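A regular kNN search embeds the query, compares it with the stored document vectors, and returns the `k` most similar documents. With the `knn` option used here, Elasticsearch runs an approximate search: it gathers `num_candidates` candidates per shard instead of scoring every document, then returns the top `k`.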

In [39]:
query = "What is a black hole?"
embedded_query = get_embedding(query)

result = es.search(
    index="apod",
    knn={
        "field": "embedding",
        "query_vector": embedded_query,
        "num_candidates": 20,
        "k": 10,
    },
)

number_of_documents = result.body["hits"]["total"]["value"]
print(f"Found {number_of_documents} documents")

Found 10 documents


Here we got 10 documents that are most similar to the query "What is a black hole?". Let's print the first 3 documents.

In [40]:
for hit in result.body["hits"]["hits"][:3]:
    print(f"Score: {hit['_score']}")
    print(f"Title: {hit['_source']['title']}")
    print(f"Explanation: {hit['_source']['explanation']}")
    print("-" * 80)

Score: 0.80657506
Title: Black Hole Accreting with Jet
Explanation: Explanation: What happens when a black hole devours a star? Many details remain unknown, but observations are providing new clues. In 2014, a powerful explosion was recorded by the ground-based robotic telescopes of the All Sky Automated Survey for SuperNovae (Project ASAS-SN), with followed-up observations by instruments including NASA's Earth-orbiting Swift satellite. Computer modeling of these emissions fit a star being ripped apart by a distant supermassive black hole. The results of such a collision are portrayed in the featured artistic illustration. The black hole itself is a depicted as a tiny black dot in the center. As matter falls toward the hole, it collides with other matter and heats up. Surrounding the black hole is an accretion disk of hot matter that used to be the star, with a jet emanating from the black hole's spin axis.
-------------------------------------------------------------------------------
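To make the ranking above concrete, here is a sketch of exact (brute-force) kNN over a few toy vectors. It assumes the field uses `cosine` similarity, for which Elasticsearch documents the score as `(1 + cosine) / 2`. The functions and toy data are illustrative only; the `knn` search in this notebook is approximate and does not score every document this way.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query_vector, documents, k):
    """Score every document and keep the k best (brute force)."""
    scored = [
        ((1 + cosine(query_vector, doc["embedding"])) / 2, doc["title"])
        for doc in documents
    ]
    return sorted(scored, reverse=True)[:k]


# Toy 2-dimensional embeddings, not taken from the APOD index.
toy_documents = [
    {"title": "A", "embedding": [1.0, 0.0]},
    {"title": "B", "embedding": [0.0, 1.0]},
    {"title": "C", "embedding": [-1.0, 0.0]},
]
top = exact_knn([1.0, 0.0], toy_documents, k=2)
print(top)
```

Under this formula, a score around `0.8` corresponds to a cosine similarity of roughly `0.6` between the query and the document.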

In [41]:
for hit in result.body["hits"]["hits"]:
    print(f"Year: {hit['_source']['year']}")

Year: 2024
Year: 2017
Year: 2019
Year: 2022
Year: 2018
Year: 2020
Year: 2024
Year: 2024
Year: 2022
Year: 2020


The output above lists the year of each document returned by the regular kNN search. The results span several years, from 2017 to 2024. Let's see how pre-filtering can restrict them to a single year.

### Pre-filtering

Let's run the same query but this time we will use pre-filtering to filter the documents based on the year. Let's say we want to filter the documents to only include those from the year 2024.

We do this by adding a `filter` clause to the kNN query. The `filter` clause is a regular Query DSL query. Only documents that match it are considered as candidates, so the search can still return up to `k` matching documents.

In [42]:
query = "What is a black hole?"
embedded_query = get_embedding(query)

result = es.search(
    index="apod",
    knn={
        "field": "embedding",
        "query_vector": embedded_query,
        "num_candidates": 20,
        "k": 10,
        "filter": {"term": {"year": 2024}},
    },
)

number_of_documents = result.body["hits"]["total"]["value"]
print(f"Found {number_of_documents} documents")

Found 10 documents


As you can see, the documents returned are only from the year 2024.

In [43]:
for hit in result.body["hits"]["hits"]:
    print(f"Year: {hit['_source']['year']}")

Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
Year: 2024
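The reason to filter before ranking, rather than after, can be shown with a toy example. The sketch below uses made-up scores and plain Python; it is not how Elasticsearch implements filtering internally, only an illustration of why the order matters.

```python
def top_k(candidates, k):
    """Return the k highest-scoring candidates."""
    return sorted(candidates, key=lambda doc: doc["score"], reverse=True)[:k]


def post_filter(candidates, k, year):
    # Rank first, then drop documents from other years.
    return [doc for doc in top_k(candidates, k) if doc["year"] == year]


def pre_filter(candidates, k, year):
    # Restrict to the requested year first, then rank.
    return top_k([doc for doc in candidates if doc["year"] == year], k)


# Toy scores, not taken from the APOD index.
toy = [
    {"title": "A", "year": 2024, "score": 0.9},
    {"title": "B", "year": 2017, "score": 0.8},
    {"title": "C", "year": 2019, "score": 0.7},
    {"title": "D", "year": 2024, "score": 0.6},
]
print(len(post_filter(toy, k=2, year=2024)))  # 1
print(len(pre_filter(toy, k=2, year=2024)))  # 2
```

Post-filtering loses a result here because a document from another year took one of the two top slots before the filter ran.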


Let's look at the first 3 documents returned by the kNN search to confirm that they are similar to the query.

In [45]:
for hit in result.body["hits"]["hits"][:3]:
    print(f"Score: {hit['_score']}")
    print(f"Title: {hit['_source']['title']}")
    print(f"Explanation: {hit['_source']['explanation']}")
    print("-" * 80)

Score: 0.80657506
Title: Black Hole Accreting with Jet
Explanation: Explanation: What happens when a black hole devours a star? Many details remain unknown, but observations are providing new clues. In 2014, a powerful explosion was recorded by the ground-based robotic telescopes of the All Sky Automated Survey for SuperNovae (Project ASAS-SN), with followed-up observations by instruments including NASA's Earth-orbiting Swift satellite. Computer modeling of these emissions fit a star being ripped apart by a distant supermassive black hole. The results of such a collision are portrayed in the featured artistic illustration. The black hole itself is a depicted as a tiny black dot in the center. As matter falls toward the hole, it collides with other matter and heats up. Surrounding the black hole is an accretion disk of hot matter that used to be the star, with a jet emanating from the black hole's spin axis.
-------------------------------------------------------------------------------
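The `filter` clause is not limited to a single `term` query. As a sketch, several conditions can be combined with a `bool` query. `build_knn_clause` is a hypothetical helper, the placeholder vector stands in for `get_embedding(query)`, and the `range` and `match` conditions are examples that assume the `year` and `title` fields indexed earlier.

```python
def build_knn_clause(query_vector, k=10, num_candidates=20, filters=None):
    """Build the `knn` argument for es.search (hypothetical helper)."""
    clause = {
        "field": "embedding",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
    }
    if filters:
        # Several conditions are combined in a bool query's filter context.
        clause["filter"] = {"bool": {"filter": list(filters)}}
    return clause


knn = build_knn_clause(
    [0.1, 0.2, 0.3],  # placeholder vector; use get_embedding(query) in practice
    filters=[
        {"range": {"year": {"gte": 2020}}},
        {"match": {"title": "black hole"}},
    ],
)
# result = es.search(index="apod", knn=knn)  # needs a running cluster
```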