## Tutorial

To read more about embeddings, check out this [tutorial](https://www.elastic.co/search-labs/tutorials/search-tutorial/vector-search/store-embeddings).

![embedding_tutorial](../images/embedding_tutorial.png)

## Connect to Elasticsearch

In [1]:
from pprint import pprint
from elasticsearch import Elasticsearch

HOST = "http://localhost:9200"

es = Elasticsearch(HOST)
client_info = es.info()
print("Connected to Elasticsearch!")
pprint(client_info.body)

Connected to Elasticsearch!
{'cluster_name': 'docker-cluster',
 'cluster_uuid': '-y81CJwxSBmTHbJfPsdEAQ',
 'name': '6df474028da5',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2025-01-09T14:09:01.578835424Z',
             'build_flavor': 'default',
             'build_hash': '0f88dde84795b30ca0d2c0c4796643ec5938aeb5',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '8.11.3',
             'minimum_index_compatibility_version': '6.0.0-beta1',
             'minimum_wire_compatibility_version': '6.8.0',
             'number': '7.17.27'}}



## Preparing the index

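We add an `embedding` field of type `dense_vector` to store the embeddings. Its `dims` value must match the output dimension of the embedding model, which is `384` for the model used in this notebook.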

In [4]:
INDEX = "my_index"

mappings = {
    "properties": {
        "embedding": {
            "type": "dense_vector",
            "dims": 384
        }
    }
}

settings = {
    "index": {
        "number_of_shards": 1,
        "number_of_replicas": 0
    }
}

es.indices.delete(index=INDEX, ignore_unavailable=True)
es.indices.create(index=INDEX, mappings=mappings, settings=settings)


ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'my_index'})

## Embedding model

![all-MiniLM-L6-v2_model](../images/all-MiniLM-L6-v2_model.png)

I chose the `all-MiniLM-L6-v2` model because it is fast, compact, and works well as a general-purpose model. It produces `384`-dimensional embeddings and, by default, truncates input longer than `256` word pieces (tokens, not whole words). It is also very popular in the community, with almost `50M` downloads in one month.

To download and use this model, Hugging Face offers a Python package called `sentence-transformers`. This framework simplifies the process of computing dense vector representations.

In [5]:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
model


SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

In [6]:
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cpu')

In [7]:
model = model.to(device)
model

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
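
The module list printed above shows three stages: a Transformer, a Pooling layer with `pooling_mode_mean_tokens` enabled, and a `Normalize` step. As a rough illustration of the last two stages, here is a minimal pure-Python sketch. The helper names `mean_pool` and `l2_normalize` and the toy 3-dimensional token vectors are assumptions made for this example. The real model works on 384-dimensional tensors and also uses the attention mask to skip padding tokens, which this sketch omits.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equally sized token vectors element-wise."""
    count = len(token_vectors)
    dims = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dims)]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy token vectors with made-up values (hypothetical, for illustration only).
token_vectors = [[1.0, 2.0, 2.0], [3.0, 2.0, 4.0]]

pooled = mean_pool(token_vectors)
sentence_vector = l2_normalize(pooled)
squared_length = sum(x * x for x in sentence_vector)
```

Because the final vector has unit length, the dot product of two such vectors equals their cosine similarity.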

## Load documents

In [8]:
import json

# Use a context manager so the file handle is closed after loading
with open("../data/dummy_data.json", encoding="utf-8") as f:
    documents = json.load(f)
documents

[{'title': 'Sample Title 1',
  'text': 'This is the first sample document text.',
  'created_on': '2024-09-22'},
 {'title': 'Sample Title 2',
  'text': 'Here is another example of a document.',
  'created_on': '2024-09-24'},
 {'title': 'Sample Title 3',
  'text': 'The content of the third document goes here.',
  'created_on': '2024-09-24'}]

## Embed documents

In [10]:
from tqdm import tqdm
from pprint import pprint


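def get_embedding(text):
    # encode() returns a NumPy array by default; convert it to a plain list
    # so the vector is explicitly JSON-serializable
    return model.encode(text).tolist()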


operations = []
for document in tqdm(documents, total=len(documents)):
    operations.append({"index": { "_index": INDEX }})
    operations.append({
        **document,
        "embedding": get_embedding(document["text"])
    })

response = es.bulk(operations=operations)
pprint(response.body)

100%|██████████| 3/3 [00:00<00:00, 49.06it/s]


{'errors': False,
 'items': [{'index': {'_id': 'acfGKJUBoQAAZU0yx0zX',
                      '_index': 'my_index',
                      '_primary_term': 1,
                      '_seq_no': 0,
                      '_shards': {'failed': 0, 'successful': 1, 'total': 1},
                      '_type': '_doc',
                      '_version': 1,
                      'result': 'created',
                      'status': 201}},
           {'index': {'_id': 'asfGKJUBoQAAZU0yx0zX',
                      '_index': 'my_index',
                      '_primary_term': 1,
                      '_seq_no': 1,
                      '_shards': {'failed': 0, 'successful': 1, 'total': 1},
                      '_type': '_doc',
                      '_version': 1,
                      'result': 'created',
                      'status': 201}},
           {'index': {'_id': 'a8fGKJUBoQAAZU0yx0zX',
                      '_index': 'my_index',
                      '_primary_term': 1,
                      '
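
The bulk request in the cell above takes a flat list in which every document is preceded by an action entry, and the response reports a top-level `errors` flag plus a per-item `status`. The sketch below mirrors that structure without needing a running cluster. The helper names `build_bulk_operations` and `failed_items`, the `fake_embed` stand-in, and the sample response with a failed item are all hypothetical and exist only for illustration.

```python
def build_bulk_operations(documents, index_name, embed):
    """Return the flat [action, source, action, source, ...] list for a bulk call."""
    operations = []
    for document in documents:
        # Action line: index the next entry into index_name
        operations.append({"index": {"_index": index_name}})
        vector = embed(document["text"])
        # Array-like results (for example a NumPy array) become plain lists
        if hasattr(vector, "tolist"):
            vector = vector.tolist()
        # Source line: the original fields plus the embedding
        operations.append({**document, "embedding": list(vector)})
    return operations


def failed_items(response_body):
    """Collect the per-document entries of a bulk response that did not succeed."""
    if not response_body.get("errors"):
        return []
    return [
        item["index"]
        for item in response_body["items"]
        if item["index"].get("status", 0) >= 400
    ]


def fake_embed(text):
    # Hypothetical stand-in for a real model: NOT a meaningful embedding
    return [float(len(text)), 0.0, 1.0]


docs = [
    {"title": "Sample Title 1", "text": "abc"},
    {"title": "Sample Title 2", "text": "hello"},
]
ops = build_bulk_operations(docs, "my_index", fake_embed)

# Made-up response bodies shaped like the output shown above
ok_body = {"errors": False, "items": [{"index": {"status": 201}}]}
bad_body = {
    "errors": True,
    "items": [
        {"index": {"status": 201}},
        {"index": {"status": 400, "error": {"type": "mapper_parsing_exception"}}},
    ],
}
failures = failed_items(bad_body)
```

In the notebook, `embed` would be `get_embedding` and the resulting list would be passed as `operations` to the bulk call. Checking `failed_items(response.body)` afterwards is one way to notice documents that were rejected even though the request itself succeeded.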


We indexed all documents with an additional field `embedding`. Let's retrieve the documents to verify that the text was converted to a dense vector.

In [11]:
response = es.search(
    index=INDEX,
    body={
        "query": {
            "match_all": {}
        }
    }
)

pprint(response["hits"]["hits"])

[{'_id': 'acfGKJUBoQAAZU0yx0zX',
  '_index': 'my_index',
  '_score': 1.0,
  '_source': {'created_on': '2024-09-22',
              'embedding': [-0.04355228319764137,
                            0.06440839916467667,
                            -0.00508020119741559,
                            0.034451842308044434,
                            0.0406334288418293,
                            0.014603214338421822,
                            -0.019641714170575142,
                            0.049041081219911575,
                            0.035828832536935806,
                            0.011970660649240017,
                            0.041811347007751465,
                            0.08254101127386093,
                            -0.0003264865081291646,
                            -0.03726029768586159,
                            -0.009786701761186123,
                            0.039124712347984314,
                            0.030936717987060547,
                            -0.074


Awesome! We successfully inserted the documents with the additional `embedding` field. Now, let’s check the mapping to confirm that the dimension of the dense vector is 384.

In [13]:
response = es.indices.get_mapping(index=INDEX)
pprint(response.body)

{'my_index': {'mappings': {'properties': {'created_on': {'type': 'date'},
                                          'embedding': {'dims': 384,
                                                        'type': 'dense_vector'},
                                          'text': {'fields': {'keyword': {'ignore_above': 256,
                                                                          'type': 'keyword'}},
                                                   'type': 'text'},
                                          'title': {'fields': {'keyword': {'ignore_above': 256,
                                                                           'type': 'keyword'}},
                                                    'type': 'text'}}}}}
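
Reading the mapping by eye works for a small index, but the same check can be automated. The following is a minimal sketch that assumes the mapping body has the shape printed above. The helper name `mapped_dims` is hypothetical, and the dictionary is a trimmed copy of that output rather than a live response.

```python
def mapped_dims(mapping_body, index_name, field):
    """Read the `dims` value of a dense_vector field from a get-mapping body."""
    properties = mapping_body[index_name]["mappings"]["properties"]
    return properties[field]["dims"]


# Trimmed copy of the mapping printed above (only the embedding field is kept)
mapping_body = {
    "my_index": {
        "mappings": {
            "properties": {
                "embedding": {"dims": 384, "type": "dense_vector"},
            }
        }
    }
}

expected_dims = 384  # embedding dimension of all-MiniLM-L6-v2
dims_match = mapped_dims(mapping_body, "my_index", "embedding") == expected_dims
```

In the notebook, `mapping_body` would be `response.body`, and `expected_dims` could be taken from `model.get_sentence_embedding_dimension()` instead of being hard-coded.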


