<a href="https://colab.research.google.com/github/jeffvestal/elastic_jupyter_notebooks/blob/main/load_embedding_model_from_hf_to_elastic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading a Sentence Transformer model from Hugging Face into Elastic

This code will show you how to set up an ingest pipeline to generate vectors for documents on ingest.

Overview of steps:
1. Set up our Python environment
2. Set up the index mapping
3. Configure the ingest pipeline
4. Index a couple of test documents

### Requirements
This notebook assumes you have already loaded an embedding model into Elasticsearch. If you haven't, please start with [this notebook example](https://github.com/jeffvestal/elastic_jupyter_notebooks/blob/main/load_embedding_model_from_hf_to_elastic.ipynb).


### Elastic version support
Requires Elastic version 8.0+ with a Platinum or Enterprise license (or a trial license).

You can set up a [free trial Elasticsearch deployment in Elastic Cloud](https://cloud.elastic.co/registration).

# Setup
This section will set up the python environment with the required libraries

## Install and import required python libraries

Elastic uses the [eland Python library](https://github.com/elastic/eland) to download models from the Hugging Face Hub and load them into Elasticsearch.

In [None]:
pip install eland

In [None]:
pip install elasticsearch

In [None]:
pip install transformers

In [None]:
pip install sentence_transformers

In [None]:
pip install torch==1.11

In [None]:
from pathlib import Path
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel
from elasticsearch import Elasticsearch
from elasticsearch.client import MlClient

## Configure Elasticsearch authentication
The recommended authentication approach is to use the [Elastic Cloud ID](https://www.elastic.co/guide/en/cloud/current/ec-cloud-id.html) and a [cluster level API key](https://www.elastic.co/guide/en/kibana/current/api-keys.html).

You can use any method you wish to set the required credentials. This example uses `getpass` to prompt for them so that they are not stored in GitHub.

In [None]:
import getpass

In [None]:
es_cloud_id = getpass.getpass('Enter Elastic Cloud ID:  ')
es_api_id = getpass.getpass('Enter cluster API key ID:  ') 
es_api_key = getpass.getpass('Enter cluster API key:  ')
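If you want to run the notebook without interactive prompts, one option is to read each credential from an environment variable and fall back to `getpass` only when it is missing. The sketch below is an illustration, not part of the original notebook: the helper `get_secret` and the variable names `ES_CLOUD_ID`, `ES_API_ID` and `ES_API_KEY` are hypothetical.

```python
import getpass
import os


def get_secret(env_name, prompt):
    """Return the value of env_name if it is set and non-empty, else prompt for it."""
    value = os.environ.get(env_name)
    if value:
        return value
    # Nothing in the environment: fall back to an interactive prompt.
    return getpass.getpass(prompt)


# Hypothetical environment variable names; adjust them to your own setup.
# es_cloud_id = get_secret("ES_CLOUD_ID", "Enter Elastic Cloud ID:  ")
# es_api_id = get_secret("ES_API_ID", "Enter cluster API key ID:  ")
# es_api_key = get_secret("ES_API_KEY", "Enter cluster API key:  ")
```

The prompt is only shown when the variable is unset or empty, so the same cell works both interactively and in automated runs.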

## Connect to Elastic Cloud

In [None]:
es = Elasticsearch(cloud_id=es_cloud_id, 
                   api_key=(es_api_id, es_api_key)
                   )
es.info() # should return cluster info
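Since this notebook requires version 8.0+, it can be useful to check the version reported by `es.info()` before going further. The following is a minimal sketch; the helper name `version_at_least` is hypothetical, and it assumes the response exposes the version string under `["version"]["number"]`.

```python
def version_at_least(info, major, minor=0):
    """Compare the cluster version in an es.info() response with a minimum."""
    number = info["version"]["number"]  # a string such as "8.6.2"
    parts = number.split(".")
    found = (int(parts[0]), int(parts[1]))
    return found >= (major, minor)


# Hand-written response fragment for illustration, not real cluster output.
sample_info = {"version": {"number": "8.6.2"}}
meets_requirement = version_at_least(sample_info, 8, 0)
```

With a live connection you would pass the result of `es.info()` instead of `sample_info`.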

# Model Information and Status

## View information about the model
This step is optional, but it lets us look up the model_id as it is stored in Elasticsearch and verify that the model is deployed and ready to use in our ingest pipeline.

In [None]:
# es_model_id is not set yet, so list all trained models and pick the model_id from the output
m = MlClient.get_trained_models(es)
m.body

## Set the model_id for ease of reference later

In [None]:
es_model_id = "<set from model_id value above>"  # replace with the model_id string shown above

### *If* the model is not started we will need to deploy the model

You only need to run this if the model hasn't been deployed yet.

It loads the model onto the ML nodes and starts the process(es) that make it available for the NLP task.

Uncomment the code below to run it.

In [None]:
#s = MlClient.start_trained_model_deployment(es, model_id=es_model_id)
#s.body

#### Verify the model started without issue

In [None]:
#stats = MlClient.get_trained_models_stats(es, model_id=es_model_id)
#stats.body['trained_model_stats'][0]['deployment_stats']['nodes'][0]['routing_state']
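The check above reads only the first node of the first model. A small helper can collect the `routing_state` of every node instead. This is a sketch under assumptions: the helper name `deployment_routing_states` is hypothetical, the `model_id` key on each stats entry is assumed, and it only follows the same keys used in the cell above.

```python
def deployment_routing_states(stats_body):
    """Map each model_id to the list of routing_state values of its nodes."""
    states = {}
    for model in stats_body.get("trained_model_stats", []):
        # Models that are not deployed have no deployment_stats entry.
        deployment = model.get("deployment_stats", {})
        nodes = deployment.get("nodes", [])
        states[model.get("model_id")] = [node.get("routing_state") for node in nodes]
    return states


# Hand-written, simplified example data; a real response contains many more fields.
sample_stats = {
    "trained_model_stats": [
        {
            "model_id": "example-model",
            "deployment_stats": {"nodes": [{"routing_state": "started"}]},
        },
        {"model_id": "not-deployed-model"},
    ]
}
states = deployment_routing_states(sample_stats)
```

An empty list for a model means no deployment was found for it, which is the case where the deployment cell above needs to be run.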

# Elasticsearch index setup
Here we will configure an index template with settings and mappings to store our vectors and text data.

The **important** part here is setting our vector field to the `dense_vector` type. When the field is indexed, Elasticsearch builds the HNSW graph for the vectors so we can use kNN search later. Depending on your Elasticsearch version, this may require setting `index` and `similarity` explicitly in the mapping.

## Define the index template
We will have the following fields

- `vectors` of type `dense_vector`
- `title` of type `text`
- `summary` of type `text`

The template applies to new indices whose names match the pattern `jupyter-vector-demo*`.

In [None]:
template = {
    # Legacy index templates (put_template) expect "index_patterns" as a list
    "index_patterns": ["jupyter-vector-demo*"],
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1
    },
    "mappings": {
        "properties": {
            "vectors": {
                "type": "dense_vector",
                "dims": 512
            },
            "title": {
                "type": "text"
            },
            "summary": {
                "type": "text"
            }
        }
    }
}
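The `dims` value in the mapping has to match the length of the vectors the model produces, otherwise documents will be rejected at index time. A minimal sketch of a check you could run once you have a vector from the model; the helper name `vector_fits_mapping` is hypothetical and the sample values are made up for illustration.

```python
def vector_fits_mapping(template, field, vector):
    """Return True if len(vector) equals the dims declared for field."""
    properties = template["mappings"]["properties"]
    return len(vector) == properties[field]["dims"]


# Made-up, reduced example: a 4-dimensional field and two candidate vectors.
sample_template = {
    "mappings": {
        "properties": {
            "vectors": {"type": "dense_vector", "dims": 4}
        }
    }
}
fits = vector_fits_mapping(sample_template, "vectors", [0.1, 0.2, 0.3, 0.4])
too_short = vector_fits_mapping(sample_template, "vectors", [0.1, 0.2])
```

If the check fails for your model, adjust `dims` in the template to the model's embedding size before creating any index.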

## Apply the template
Here we apply the template and give it the name `jupyter-vector-demo-template`. This name only identifies the template in case we need to modify it later on.

In [None]:
es.indices.put_template(name="jupyter-vector-demo-template", body=template)

---
---
# Working with Vectors
---






## Generate Vector for Query

Before we can run a kNN query, we need to convert our query string to a vector.

First, create a sample query sentence:

In [None]:
docs =  [
    {
      "text_field": "Last week I upgraded my iOS version and ever since then my phone has been overheating whenever I use your app."
    }
  ]

We call the `_infer` endpoint, supplying the model_id and the doc[s] we want to vectorize.

In [None]:
z = MlClient.infer_trained_model(es, model_id=es_model_id, docs=docs)

The vector for the first doc can be accessed in the response dict as shown below

In [None]:
doc_0_vector = z['inference_results'][0]['predicted_value']
doc_0_vector
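With the query vector in hand, the next step is to place it in a kNN clause. The sketch below builds such a clause as a plain dict; the helper name `build_knn_clause` and the default values for `k` and `num_candidates` are assumptions for illustration. How the clause is sent (the `knn` option of the search API or the older `_knn_search` endpoint) depends on your Elasticsearch version, so check the documentation for the release you are running.

```python
def build_knn_clause(query_vector, field="vectors", k=10, num_candidates=100):
    """Build a kNN clause for the given query vector as a plain dict."""
    if k > num_candidates:
        # Elasticsearch rejects requests where num_candidates is smaller than k.
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "field": field,                      # the dense_vector field from the template
        "query_vector": list(query_vector),  # the vector returned by the model
        "k": k,                              # number of nearest neighbours to return
        "num_candidates": num_candidates,    # candidates considered per shard
    }


# Made-up short vector for illustration; in the notebook you would pass doc_0_vector.
clause = build_knn_clause([0.12, -0.03, 0.44], k=5, num_candidates=50)

# Hypothetical usage against a live cluster (index name is an assumption):
# es.search(index="jupyter-vector-demo", knn=build_knn_clause(doc_0_vector))
```

Keeping the clause as a separate dict makes it easy to inspect before sending it and to reuse it with different values of `k`.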