### Adding ELSER

Elastic also offers an out-of-the-box semantic search model, Elastic Learned Sparse EncodeR (ELSER), which matches on meaning and context rather than on exact keywords.
Let's try this one as well.

A full example is available here: https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb


In [9]:
import eland as ed
import pandas as pd
from elasticsearch import Elasticsearch
from getpass import getpass  # For securely getting user input

# Prompt the user to enter their Elastic Cloud ID and API Key securely
ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")
ELASTIC_API_KEY = getpass("Elastic API Key: ")

# Create an Elasticsearch client using the provided credentials.
# The variable is named `client` because every later cell calls `client.*`.
client = Elasticsearch(
    cloud_id=ELASTIC_CLOUD_ID,  # cloud id can be found under deployment management
    api_key=ELASTIC_API_KEY,  # API keys can be generated under management / security
)


client.info()

ObjectApiResponse({'name': 'instance-0000000000', 'cluster_name': 'fdcc4e10e5a34385884a3eda9350099a', 'cluster_uuid': '1v8os-EZTPmrZoF6uXeWKA', 'version': {'number': '8.9.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '8aa461beb06aa0417a231c345a1b8c38fb498a0d', 'build_date': '2023-07-19T14:43:58.555259655Z', 'build_snapshot': False, 'lucene_version': '9.7.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})

In [None]:
# Creates the ELSER model configuration. Automatically downloads the model if it doesn't exist.
client.ml.put_trained_model(
  model_id=".elser_model_1",
  input={
    "field_names": ["text_field"]
  }
)
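
The model download triggered above runs in the background, and starting the deployment before it finishes is likely to fail. A minimal sketch of a wait loop follows, assuming the `definition_status` include option and the `fully_defined` flag of the get trained models API behave as in the linked ELSER notebook. The helper name `wait_for_model_download` and its timeout parameters are hypothetical.

```python
import time


def wait_for_model_download(client, model_id=".elser_model_1", timeout_s=600, poll_s=5):
    """Return True once the model definition is fully downloaded, False on timeout.

    Hypothetical helper: polls the get trained models API and checks the
    `fully_defined` flag (assumed response shape, see the lead-in).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.ml.get_trained_models(
            model_id=model_id, include="definition_status"
        )
        if status["trained_model_configs"][0]["fully_defined"]:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Calling `wait_for_model_download(client)` before the deployment cell would block until the model is ready or the timeout expires.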

In [None]:
client.ml.start_trained_model_deployment(
  model_id=".elser_model_1",
  number_of_allocations=1,
)
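
Starting a deployment is also asynchronous, so the ingest pipeline may fail if documents arrive before the model is allocated. The sketch below polls the trained model stats until the deployment reports `started`. The helper name `wait_for_deployment` is hypothetical, and the `deployment_stats.state` field is an assumption about the stats response that is worth checking against your Elasticsearch version.

```python
import time


def wait_for_deployment(client, model_id=".elser_model_1", timeout_s=300, poll_s=5):
    """Return True once the deployment state is 'started', False on timeout.

    Hypothetical helper; assumes the stats response exposes
    trained_model_stats[0]["deployment_stats"]["state"].
    """
    deadline = time.monotonic() + timeout_s
    while True:
        stats = client.ml.get_trained_models_stats(model_id=model_id)
        entries = stats["trained_model_stats"]
        state = None
        if entries:
            # deployment_stats is absent until a deployment has been requested
            state = entries[0].get("deployment_stats", {}).get("state")
        if state == "started":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```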

In [None]:
client.ingest.put_pipeline(
    id="elser-ingest-pipeline",
    description="Ingest pipeline for ELSER",
    processors=[
        {
            "inference": {
                "model_id": ".elser_model_1",
                "target_field": "ml",
                # Feed the document's "Sentence" field to the model's input field
                "field_map": {"Sentence": "text_field"},
                "inference_config": {
                    # The expanded tokens are written to ml.tokens
                    "text_expansion": {"results_field": "tokens"}
                },
            }
        }
    ],
)

As before, we build a new index, this time with an additional `ml.tokens` field that will receive the ELSER-generated tokens.

In [None]:
client.indices.create(
    index="hp_elser",
    mappings={
        "properties": {
            "text_embedding.predicted_value": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
            "Character": {"type": "text"},
            "Line_number": {"type": "long"},
            "Sentence": {"type": "text"},
            "sentiment": {
                "properties": {
                    "model_id": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                    "predicted_value": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                    "prediction_probability": {"type": "float"},
                }
            },
            # Weighted tokens produced by the ELSER ingest pipeline
            "ml.tokens": {"type": "rank_features"},
        }
    },
)

In [None]:
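# Reindex through the ELSER pipeline. With wait_for_completion=False the call
# returns a task ID immediately instead of blocking until the reindex finishes.
client.reindex(
    source={"index": "hp_scripts_final"},
    dest={"index": "hp_elser", "pipeline": "elser-ingest-pipeline"},
    wait_for_completion=False,
)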
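
Because the reindex above runs with `wait_for_completion=False`, searching `hp_elser` right away may return partial results. One way to wait is to poll the task management API with the task ID that the reindex call returns under the `task` key. The helper name `wait_for_task` and its timeout values below are hypothetical.

```python
import time


def wait_for_task(client, task_id, timeout_s=1800, poll_s=10):
    """Poll the tasks API until the task completes, then return its final status.

    Hypothetical helper; raises TimeoutError if the task is still running
    after timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = client.tasks.get(task_id=task_id)
        if task["completed"]:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not complete in {timeout_s}s")
        time.sleep(poll_s)
```

A possible usage would be to keep the reindex response, for example `response = client.reindex(...)`, and then call `wait_for_task(client, response["task"])` before running the searches.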

In [10]:
result = client.search(
    index='hp_elser', 
    size=5,
    query={
        "text_expansion": {
            "ml.tokens": {
                "model_id":".elser_model_1",
                "model_text":"brave"
            }
        }
    }
)

In [11]:
for element in result["hits"]["hits"]:
    print("{}: {}, score {}".format(element["_source"]["Character"], element["_source"]["Sentence"], element["_score"]))

LUPIN: Be brave, score 12.215565
LUCIUS MALFOY: You must be very brave to mention his name, score 7.7868624
Sorting Hat: Plenty of courage I see, score 7.384922
Dumbledore: And finally it takes a great deal of bravery to stand up to your enemies but a great deal more to stand up to your friends, score 6.109682
Voldemort: Haha Bravery Your parents had it too, score 5.6515555
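
To build some intuition for why "Plenty of courage I see" ranks for the query "brave" even though the word never appears, here is a toy sketch of sparse retrieval. Both the query and each document are expanded into weighted tokens, and the score is taken as the sum of the products of the weights on shared tokens. The token weights are made up for illustration, and this is a simplification rather than Elasticsearch's exact scoring.

```python
def sparse_score(query_tokens, doc_tokens):
    """Sum of weight products over tokens that appear in both expansions."""
    return sum(
        (weight * doc_tokens[token] for token, weight in query_tokens.items() if token in doc_tokens),
        0.0,
    )


# Hypothetical expansions: a learned sparse model would add related tokens
# such as "courage" to the query "brave".
query = {"brave": 2.0, "courage": 1.0}
docs = {
    "Be brave": {"brave": 1.5, "be": 0.5},
    "Plenty of courage I see": {"courage": 1.0, "plenty": 0.5},
    "What if the other dragons are mean to him": {"dragon": 1.0},
}

# Rank the documents from highest to lowest score
ranked = sorted(docs, key=lambda name: sparse_score(query, docs[name]), reverse=True)
for name in ranked:
    print(name, sparse_score(query, docs[name]))
```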


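Now let's combine the sentiment with the semantic search: every hit must have a NEGATIVE sentiment, and the ELSER match on "brave" contributes to the ranking.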

In [12]:
result = client.search(
    index='hp_elser', 
    size=5,
    query={
        "bool": {
            "should": [{
                "text_expansion": {
                    "ml.tokens": {
                        "model_id":".elser_model_1",
                        "model_text":"brave"
                    }
                },
            }],
            "must":[
            {
                "match" : {
                    "sentiment.predicted_value": "NEGATIVE"
                }
            }]}})
for element in result["hits"]["hits"]:
    print("{}: {}".format(element["_source"]["Character"], element["_source"]["Sentence"]))

LUPIN: Dumbledore has already risked enough on my behalf
LUPIN: That suggests what you fear the most is fear itself
Hagrid: Fine Just so you know hes a bloody coward 
SNAPE: Do I detect a flicker of fear
MCGONAGALL: Our worst fear has been realized

Most of these hits are about fear rather than bravery, so let's exclude sentences that contain the token `fear`. Note that a `term` query matches the exact token only, so variants such as "fears" or "afraid" are not excluded.


In [13]:
result = client.search(
    index='hp_elser', 
    size=5,
    query={
        "bool": {
            "should": [{
                "text_expansion": {
                    "ml.tokens": {
                        "model_id":".elser_model_1",
                        "model_text":"brave"
                    }
                },
            }],
            "must":[
            {
                "match" : {
                    "sentiment.predicted_value": "NEGATIVE"
                }
            }],
            "must_not":[
                    {"term":{
                        "Sentence":"fear"
                 }}]
        }
    })

for element in result["hits"]["hits"]:
    print("{}: {}".format(element["_source"]["Character"], element["_source"]["Sentence"]))

LUPIN: Dumbledore has already risked enough on my behalf
Hagrid: Fine Just so you know hes a bloody coward 
GILDEROY LOCKHART: You may find yourselves facing your worst fears in this room
Hagrid: What if the other dragons are mean to him
TOM RIDDLE: Im afraid I cant do that
