# Semantic Search with Semantic Text

<a target="_blank" href="https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Learn how to use the [semantic_text](https://www.elastic.co/guide/en/elasticsearch/reference/master/semantic-text.html) field type to quickly get started with semantic search.

## Requirements

For this example, you will need:

- An Elastic deployment:
  - We'll be using [Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html) for this example (available with a [free trial](https://cloud.elastic.co/registration?onboarding_token=vectorsearch&utm_source=github&utm_content=elasticsearch-labs-notebook))

- Elasticsearch 8.15 or above, or [Elasticsearch serverless](https://www.elastic.co/elasticsearch/serverless)

## Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?onboarding_token=vectorsearch&utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial.

## Install packages and connect with Elasticsearch Client

To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.15.0 or above).
Because we're using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify our deployment.

First we need to `pip` install the following packages:

- `elasticsearch`
- `python-decouple`
- `icecream`

In [18]:
!pip install elasticsearch
!pip install python-decouple
!pip install icecream


Next, we import the modules we need.

🔐 NOTE: `getpass` lets us prompt the user for credentials without echoing them to the terminal.

In [22]:
from elasticsearch import Elasticsearch, exceptions
from urllib.request import urlopen
from getpass import getpass
import json
import time
from icecream import ic

Now we can instantiate the Python Elasticsearch client.

First we prompt the user for their Cloud ID and API key.
Then we create an `elastic_client` object, which is an instance of the `Elasticsearch` class.

In [6]:
# Prompt for credentials instead of hard-coding them in the notebook
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")

ELASTIC_API_KEY = getpass("Elastic API key: ")

# Create the client instance
elastic_client = Elasticsearch(
    # For local development
    # hosts=["http://localhost:9200"]
    cloud_id=ELASTIC_CLOUD_ID,
    api_key=ELASTIC_API_KEY,
)

### Test the Client
Before you continue, confirm that the client has connected with this test.

In [7]:
print(elastic_client.info())

{'name': 'instance-0000000000', 'cluster_name': '4be2fd5b01724947815a8992d9ea657a', 'cluster_uuid': 'Vt9__adQSzWxTkJuHjeMOA', 'version': {'number': '8.15.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'e84a0c8e4546f99e05cf8bfad923b0f2122afdf2', 'build_date': '2024-08-01T13:35:24.955648203Z', 'build_snapshot': False, 'lucene_version': '9.11.1', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}
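
The `version.number` value in this response can be compared against the 8.15 minimum listed in the requirements. Here is a minimal sketch, assuming a plain `major.minor.patch` version string. The helper name `meets_minimum_version` is hypothetical, and the check may not be meaningful for serverless projects.

```python
def meets_minimum_version(version_number, minimum=(8, 15)):
    """Compare a version string such as '8.15.0' against a (major, minor) minimum."""
    parts = version_number.split(".")
    major, minor = int(parts[0]), int(parts[1])
    # Tuples compare element by element, so (9, 0) >= (8, 15) is True.
    return (major, minor) >= minimum


# The version string below is the one reported in the output above.
print(meets_minimum_version("8.15.0"))
```

In the notebook, the argument would be `elastic_client.info()["version"]["number"]`.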


Refer to [the documentation](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect to a self-managed deployment.

Read [this page](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/connecting.html#connect-self-managed-new) to learn how to connect using API keys.

In [14]:
from decouple import config

elastic_cloud_id = config('ELASTIC_CLOUD_ID', default='none')
elastic_api_key = config('ELASTIC_API_KEY', default='none')
elastic_index_name = config('ELASTIC_INDEX_NAME', default='none')
raw_data = config('RAW_DATA', default='none')
elastic_synonym_fn = config('ELASTIC_SYNONYM_FILE', default='none')
elastic_synonym_id = config('ELASTIC_SYNONYM_ID', default='none')
elastic_pipeline_name = config('ELASTIC_PIPELINE_NAME', default='none')
elastic_sparse_model_name = config('ELASTIC_SPARSE_MODEL_NAME', default='none')
elastic_sparse_field_name = config('ELASTIC_SPARSE_FIELD_NAME', default='none')
elastic_dense_field_name = config('ELASTIC_DENSE_FIELD_NAME', default='none')
elastic_dense_field_dims = config('ELASTIC_DENSE_FIELD_DIMS', default=0, cast=int)
elastic_dense_field_model_name = config('ELASTIC_DENSE_FIELD_MODEL_NAME', default='none')
elastic_sparse_inference_endpoint_name = config('ELASTIC_INFERENCE_ENDPOINT_NAME', default='none')


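Every `config(...)` call above falls back to the string `'none'`, so a missing setting does not raise an error when it is read. It only shows up later as a failed request. Here is a minimal sketch of an early check, assuming the settings are first collected into a plain dictionary. The helper name `find_missing_settings` is hypothetical and is not part of `python-decouple`.

```python
def find_missing_settings(settings, placeholder="none"):
    """Return the names of settings that are empty or still hold the placeholder."""
    missing = []
    for name, value in settings.items():
        if value is None or str(value).strip() in ("", placeholder):
            missing.append(name)
    return sorted(missing)


# Example values only; in the notebook these would be the variables read by config().
example_settings = {
    "ELASTIC_INDEX_NAME": "acme",
    "ELASTIC_INFERENCE_ENDPOINT_NAME": "acme_inference",
    "ELASTIC_PIPELINE_NAME": "none",
}

missing = find_missing_settings(example_settings)
if missing:
    print("Missing settings:", ", ".join(missing))
```
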
## Create the Inference Endpoint

Let's create the inference endpoint by using the [Create inference API](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html).

For this example we'll use the [ELSER service](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html), but the inference API also supports [many other inference services](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html#put-inference-api-desc).

In [23]:
def create_inference_endpoint(inference_endpoint_name=elastic_sparse_inference_endpoint_name, 
                              client=elastic_client):
    """
    Create an inference endpoint in Elasticsearch.

    This code is lifted almost directly from Elasticsearch Labs: 
        https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb

    Args:
        inference_endpoint_name (str): The name of the inference endpoint to create.
        client (Elasticsearch): The Elasticsearch client.
    
    Raises:
        exceptions.BadRequestError: If creation fails for any reason other than
            the inference endpoint already existing.

    Returns:
        None. The function blocks until the ELSER model reports a deployed node.

    """
    
    ic("Creating inference endpoints", inference_endpoint_name, client)

    try:
        client.inference.delete_model(inference_id=inference_endpoint_name)
        ic("Deleted inference endpoint {}".format(inference_endpoint_name))
    except exceptions.NotFoundError:
        # Inference endpoint does not exist
        pass

    try:
        client.options(
            request_timeout=60, max_retries=3, retry_on_timeout=True
        ).inference.put_model(
            task_type="sparse_embedding",
            inference_id=inference_endpoint_name,
            body={
                "service": "elser",
                "service_settings": {"num_allocations": 1, "num_threads": 1},
            },
        )
        
        ic("Created inference endpoint {}".format(inference_endpoint_name))

    except exceptions.BadRequestError as e:
        if e.error == "resource_already_exists_exception":
            ic("Inference endpoint already exists {}".format(inference_endpoint_name))
        else:
            raise e
        
    inference_endpoint_info = client.inference.get_model(
        inference_id=inference_endpoint_name,
    )

    ic(dict(inference_endpoint_info))
    
    model_id = inference_endpoint_info["endpoints"][0]["service_settings"]["model_id"]

    # Wait until the ELSER model behind the endpoint has been deployed to a node
    while True:
        status = client.ml.get_trained_models_stats(
            model_id=model_id,
        )

        deployment_stats = status["trained_model_stats"][0].get("deployment_stats")
        if deployment_stats is None:
            ic("ELSER Model is currently being deployed.")
            time.sleep(5)
            continue

        nodes = deployment_stats.get("nodes")
        if nodes is not None and len(nodes) > 0:
            ic("ELSER Model has been successfully deployed.")
            break
        else:
            ic("ELSER Model is currently being deployed.")
        time.sleep(5)

create_inference_endpoint(inference_endpoint_name=elastic_sparse_inference_endpoint_name,
                            client=elastic_client)

ic| "Creating inference endpoints": 'Creating inference endpoints'
    inference_endpoint_name: 'acme_inference'
    client: <Elasticsearch(['https://4be2fd5b01724947815a8992d9ea657a.us-west2.gcp.elastic-cloud.com:443'])>
ic| "Deleted inference endpoint {}".format(inference_endpoint_name): 'Deleted inference endpoint acme_inference'
ic| "Created inference endpoint {}".format(inference_endpoint_name): 'Created inference endpoint acme_inference'
ic| dict(inference_endpoint_info): {'endpoints': [{'inference_id': 'acme_inference',
                                                   'service': 'elser',
                                                   'service_settings': {'model_id': '.elser_model_2_linux-x86_64',
                                                                        'num_allocations': 1,
                                                                        'num_threads': 1},
                                                   'task_settings': {},
                        

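The deployment loop in `create_inference_endpoint` polls every five seconds with no upper bound, so it never returns if the model fails to start. Here is a minimal sketch of the same polling idea with a timeout. The helper names `wait_until` and `has_allocated_nodes` are hypothetical, and the response shape is assumed to match the `trained_model_stats` structure read by the function above.

```python
import time


def wait_until(check, timeout_seconds=300.0, poll_seconds=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout elapses.

    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout_seconds
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


def has_allocated_nodes(stats):
    """Report whether a trained-model stats response lists at least one node."""
    deployment_stats = stats["trained_model_stats"][0].get("deployment_stats")
    if deployment_stats is None:
        return False
    return len(deployment_stats.get("nodes") or []) > 0
```

In the notebook, the check could be `lambda: has_allocated_nodes(elastic_client.ml.get_trained_models_stats(model_id=model_id))`. A `False` return value from `wait_until` then indicates that the deployment did not finish in time.
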
## Create the Index

Now we need to create an index with a `semantic_text` field. Let's create one that enables us to perform semantic search on movie plots.

In [24]:
def create_index_with_fields(client=elastic_client,
                             pipeline_name=elastic_pipeline_name,
                             inference_endpoint_name=elastic_sparse_inference_endpoint_name,
                             index_name=elastic_index_name,
                             model_name=elastic_sparse_model_name,
                             input_field_name=elastic_sparse_field_name,
                             sparse_field_name=elastic_sparse_field_name,
                             dense_field_name=elastic_dense_field_name,
                             dense_field_dims=elastic_dense_field_dims):
    """
    Create an Elasticsearch index with custom analysis settings and mappings.

    Any existing index with the same name is deleted first.

    Args:
        client (Elasticsearch): The Elasticsearch client.
        pipeline_name (str): The name of the pipeline (not used here).
        inference_endpoint_name (str): The name of the inference endpoint to use.
        index_name (str): The name of the index to create.
        model_name (str): The name of the model to use for embeddings (not used here).
        input_field_name (str): The name of the input field (not used here).
        sparse_field_name (str): The name of the output field (not used here).
        dense_field_name (str): The name of the dense field (not used here).
        dense_field_dims (int): The number of dimensions for the dense field (not used here).

    """

    # "autocomplete_filter" and "acme_synonym_filter" are not defined in this
    # notebook. Define them under "analysis" -> "filter" before adding them to
    # the filter lists below.
    settings = {
        "analysis": {
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
                "acme_synonym_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            }
        }
    }

    mappings = {
        "properties": {
            # Lexical field; its value is also copied into the semantic field
            "plot": {
                "type": "text",
                "copy_to": "plot_semantic",
            },
            # Embeddings are generated by the configured inference endpoint
            "plot_semantic": {
                "type": "semantic_text",
                "inference_id": inference_endpoint_name,
            },
        }
    }

    if client.indices.exists(index=index_name):
        client.indices.delete(index=index_name)
        ic("Deleted index {}".format(index_name))

    client.indices.create(index=index_name, mappings=mappings, settings=settings)
    ic("Created index {}".format(index_name))

create_index_with_fields(client=elastic_client, 
                            index_name=elastic_index_name,
                            pipeline_name=elastic_pipeline_name,
                            sparse_field_name=elastic_sparse_field_name)

ic| "Deleted index {}".format(index_name): 'Deleted index acme'
ic| "Created index {}".format(index_name): 'Created index acme'


Notice how we configured the mappings. We defined `plot_semantic` as a `semantic_text` field.
The `inference_id` parameter defines the inference endpoint that is used to generate the embeddings for the field.
Then we configured the `plot` field to [copy its value](https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html) to the `plot_semantic` field.
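
The mapping pattern described here, a text field that copies its value into a `semantic_text` field, can be sketched as a small builder. This is an illustration only: `build_semantic_mappings` is a hypothetical helper, and the endpoint name `acme_inference` is taken from the output shown earlier.

```python
def build_semantic_mappings(source_field, semantic_field, inference_id):
    """Build index mappings that copy a text field into a semantic_text field."""
    return {
        "properties": {
            # The original text stays searchable with lexical queries.
            source_field: {"type": "text", "copy_to": semantic_field},
            # Embeddings for this field are generated by the inference endpoint.
            semantic_field: {"type": "semantic_text", "inference_id": inference_id},
        }
    }


mappings = build_semantic_mappings("plot", "plot_semantic", "acme_inference")
print(mappings["properties"]["plot"])
```

Keeping the field names as arguments makes it easier to reuse the same pattern for other text fields, such as a description or a summary.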

## Populate the Index

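Let's populate the index with our example dataset of 12 movies.
The remaining cells refer to the index by the literal name `semantic-text-movies`, so set `ELASTIC_INDEX_NAME` to that value or substitute your own index name in the code below.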

In [8]:
url = "https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/notebooks/search/movies.json"
response = urlopen(url)
movies = json.loads(response.read())

operations = []
for movie in movies:
    operations.append({"index": {"_index": "semantic-text-movies"}})
    operations.append(movie)
elastic_client.bulk(index="semantic-text-movies", operations=operations, refresh=True)

ObjectApiResponse({'errors': False, 'took': 954232972, 'items': [{'index': {'_index': 'semantic-text-movies', '_id': '_eonFJEBDPh1LlhdYRJe', '_version': 1, 'result': 'created', 'forced_refresh': True, '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 0, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'semantic-text-movies', '_id': '_uonFJEBDPh1LlhdYRJe', '_version': 1, 'result': 'created', 'forced_refresh': True, '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 1, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'semantic-text-movies', '_id': '_-onFJEBDPh1LlhdYRJe', '_version': 1, 'result': 'created', 'forced_refresh': True, '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 2, '_primary_term': 1, 'status': 201}}, {'index': {'_index': 'semantic-text-movies', '_id': 'AOonFJEBDPh1LlhdYRNe', '_version': 1, 'result': 'created', 'forced_refresh': True, '_shards': {'total': 2, 'successful': 2, 'failed': 0}, '_seq_no': 3, '_pr
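
The bulk request alternates an action line with the document it applies to, and the response reports one result per document. Here is a minimal sketch of both halves, assuming the response has the `items` structure shown above. The helper names `build_bulk_operations` and `count_bulk_failures` are hypothetical.

```python
def build_bulk_operations(documents, index_name):
    """Interleave an "index" action line with each document, as the bulk API expects."""
    operations = []
    for document in documents:
        operations.append({"index": {"_index": index_name}})
        operations.append(document)
    return operations


def count_bulk_failures(bulk_response):
    """Count items in a bulk response whose action reported an error."""
    failures = 0
    for item in bulk_response["items"]:
        # Each item has a single key naming the action, for example "index".
        for result in item.values():
            if "error" in result:
                failures += 1
    return failures


operations = build_bulk_operations([{"title": "The Godfather"}], "semantic-text-movies")
print(len(operations))
```

Checking the per-item results is useful because a bulk request can succeed as a whole while individual documents are rejected.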

## Semantic Search

Now that our index is populated, we can query it using semantic search.

### Aside: Pretty printing Elasticsearch search results

Your `search` API calls will return hard-to-read nested JSON.
We'll create a little function called `pretty_search_response` to return nice, human-readable outputs from our examples.

In [9]:
def pretty_search_response(response):
    if len(response["hits"]["hits"]) == 0:
        print("Your search returned no results.")
    else:
        for hit in response["hits"]["hits"]:
            id = hit["_id"]
            score = hit["_score"]
            title = hit["_source"]["title"]
            runtime = hit["_source"]["runtime"]
            plot = hit["_source"]["plot"]
            keyScene = hit["_source"]["keyScene"]
            genre = hit["_source"]["genre"]
            released = hit["_source"]["released"]

            pretty_output = f"\nID: {id}\nScore: {score}\nTitle: {title}\nRuntime: {runtime}\nPlot: {plot}\nKey Scene: {keyScene}\nGenre: {genre}\nReleased: {released}"

            print(pretty_output)
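
`pretty_search_response` reads every field with direct key access, so a document that lacks one of them (for example `keyScene`) raises a `KeyError`. Here is a minimal sketch of a more tolerant variant. The helper name `format_hit` and the `n/a` placeholder are assumptions made for illustration.

```python
def format_hit(hit, fields=("title", "runtime", "plot", "keyScene", "genre", "released")):
    """Format one search hit as text, tolerating fields missing from _source."""
    source = hit.get("_source", {})
    lines = [f"ID: {hit.get('_id')}", f"Score: {hit.get('_score')}"]
    for field in fields:
        # Fall back to a placeholder instead of raising KeyError.
        lines.append(f"{field}: {source.get(field, 'n/a')}")
    return "\n".join(lines)


example_hit = {"_id": "1", "_score": 1.0, "_source": {"title": "The Godfather"}}
print(format_hit(example_hit))
```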

### Semantic Search with the `semantic` Query

We can use the [`semantic` query](https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl-semantic-query.html) to quickly and easily query the `semantic_text` field in our index.
Under the hood, an embedding is automatically generated for our query text using the `semantic_text` field's inference endpoint.

In [10]:
response = elastic_client.search(
    index="semantic-text-movies",
    query={"semantic": {"field": "plot_semantic", "query": "organized crime movies"}},
)

pretty_search_response(response)


ID: BuonFJEBDPh1LlhdYRNe
Score: 16.526878
Title: The Godfather
Runtime: 175
Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.
Key Scene: James Caan's character Sonny Corleone is shot to death at a toll booth by a number of machine gun toting enemies.
Genre: Crime, Drama
Released: 1972

ID: B-onFJEBDPh1LlhdYRNe
Score: 10.201027
Title: The Departed
Runtime: 151
Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.
Key Scene: Leonardo DiCaprio's character Billy Costigan is shot to death by Matt Damon's character Colin Sullivan.
Genre: Crime, Drama, Thriller
Released: 2006

ID: _eonFJEBDPh1LlhdYRJe
Score: 9.530054
Title: Pulp Fiction
Runtime: 154
Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.
Key Scene: John Travolta is forced to inject adrenaline

These results demonstrate the power of semantic search.
Our top results are all movies involving organized crime, even if the exact term "organized crime" doesn't appear in the plot description.
This works because the ELSER model understands the semantic similarity between terms like "organized crime" and "mob".

However, these results also show the weaknesses of semantic search.
Because semantic search is based on vector similarity, there is a long tail of results that are weakly related to our query vector.
That's why movies like _The Matrix_ are returned towards the tail end of our search results.
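
One way to see the long tail is to compare each score with the top score. The sketch below is a client-side illustration and is not part of the original example: it keeps only hits that score at least a chosen fraction of the best hit. The helper name `keep_relative_to_top` and the 0.6 ratio are assumptions, and the scores are the three shown in the output above. The search API also accepts a `min_score` parameter for an absolute cutoff.

```python
def keep_relative_to_top(hits, min_ratio=0.5):
    """Keep hits whose score is at least min_ratio times the best score."""
    if not hits:
        return []
    top_score = max(hit["_score"] for hit in hits)
    return [hit for hit in hits if hit["_score"] >= min_ratio * top_score]


example_hits = [
    {"_source": {"title": "The Godfather"}, "_score": 16.526878},
    {"_source": {"title": "The Departed"}, "_score": 10.201027},
    {"_source": {"title": "Pulp Fiction"}, "_score": 9.530054},
]

trimmed = keep_relative_to_top(example_hits, min_ratio=0.6)
print([hit["_source"]["title"] for hit in trimmed])
```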

### Hybrid Search with the `semantic` Query

We can address some of the issues with pure semantic search by combining it with lexical search techniques.
Here, we use a [boolean query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html) to require that all matches contain at least one term from the query text, in either the `plot` or `genre` fields.

In [11]:
response = elastic_client.search(
    index="semantic-text-movies",
    query={
        "bool": {
            "must": {
                "multi_match": {
                    "fields": ["plot", "genre"],
                    "query": "organized crime movies",
                    "boost": 1.5,
                }
            },
            "should": {
                "semantic": {
                    "field": "plot_semantic",
                    "query": "organized crime movies",
                    "boost": 3.0,
                }
            },
        }
    },
)

pretty_search_response(response)


ID: BuonFJEBDPh1LlhdYRNe
Score: 56.279396
Title: The Godfather
Runtime: 175
Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.
Key Scene: James Caan's character Sonny Corleone is shot to death at a toll booth by a number of machine gun toting enemies.
Genre: Crime, Drama
Released: 1972

ID: B-onFJEBDPh1LlhdYRNe
Score: 31.22543
Title: The Departed
Runtime: 151
Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.
Key Scene: Leonardo DiCaprio's character Billy Costigan is shot to death by Matt Damon's character Colin Sullivan.
Genre: Crime, Drama, Thriller
Released: 2006

ID: A-onFJEBDPh1LlhdYRNe
Score: 29.793152
Title: Goodfellas
Runtime: 146
Plot: The story of Henry Hill and his life in the mob, covering his relationship with his wife Karen Hill and his mob partners Jimmy Conway and Tommy DeVito in the Italian-American crime syndicate.
Key Scene

These results demonstrate that the application of lexical search techniques can help focus the results, while retaining many of the advantages of semantic search.
In this example, the top search results are all still movies involving organized crime, but the `multi_match` query keeps the long tail shorter and focused on movies in the crime genre.

Note the `boost` parameters applied to the `multi_match` and `semantic` queries.
Combining lexical and semantic search techniques in a boolean query like this is called "linear combination". When doing this, it is important to normalize the scores of the component queries.
This involves consideration of a few factors, including:

- The range of scores generated by the query
- The relative importance and accuracy of the query in the context of the dataset

In this example, the `multi_match` query is mostly used as a filter to constrain the search results' long tail, so we assign it a lower boost than the `semantic` query.
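
The arithmetic behind a linear combination is a weighted sum: a boolean query adds the scores of its matching clauses, and each `boost` multiplies its clause's score. Here is a minimal sketch using the scores for _The Godfather_ from the two outputs above. The helper names are hypothetical, min-max scaling is only one possible normalization, and the calculation assumes the `semantic` clause scored the same in both runs.

```python
def linear_combination(lexical_score, semantic_score, lexical_boost=1.5, semantic_boost=3.0):
    """Weighted sum of two component scores."""
    return lexical_boost * lexical_score + semantic_boost * semantic_score


def min_max_normalize(scores):
    """Rescale scores to the range [0, 1]; all-equal inputs map to 1.0."""
    low, high = min(scores), max(scores)
    if high == low:
        return [1.0 for _ in scores]
    return [(score - low) / (high - low) for score in scores]


# Scores for "The Godfather" taken from the semantic-only and hybrid outputs.
semantic_only = 16.526878
hybrid_total = 56.279396

# Assumption: the semantic clause scored the same in both runs, so whatever is
# left after removing its boosted share came from the multi_match clause.
boosted_lexical_share = hybrid_total - 3.0 * semantic_only
print(round(boosted_lexical_share, 3))

# Semantic-only scores of the first three hits, rescaled to a common range.
print(min_max_normalize([16.526878, 10.201027, 9.530054]))
```

Under that assumption, most of the hybrid score comes from the boosted `semantic` clause, which fits the role of `multi_match` as a constraint on which documents match.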

## Conclusion

The [semantic_text](https://www.elastic.co/guide/en/elasticsearch/reference/master/semantic-text.html) field type is a powerful tool that can help you quickly and easily integrate semantic search.
It can greatly improve the relevancy of your search results, particularly when combined with lexical search techniques.