<a href="https://colab.research.google.com/github/jeffvestal/elastic_jupyter_notebooks/blob/main/load_embedding_model_from_hf_to_elastic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading an NLP Embedding model from Hugging Face into Elastic

This notebook shows how to load a supported embedding model from Hugging Face into an Elasticsearch cluster in [Elastic Cloud](https://cloud.elastic.co/).

### Elastic version support
Requires Elastic version 8.0+ with a Platinum or Enterprise license (or a trial license).

You can set up a [free trial Elasticsearch deployment in Elastic Cloud](https://cloud.elastic.co/registration).

The example here loads the [sentence-transformers/msmarco-MiniLM-L-12-v3](https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-12-v3) embedding model.

[Elastic NLP Model Support Docs](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html) 


Disclaimer: presented as is with no guarantee.

# Install and import required python libraries

Elastic uses the [eland Python library](https://github.com/elastic/eland) to download models from the Hugging Face Hub and load them into Elasticsearch.

In [1]:
pip install eland

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
pip install elasticsearch

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [3]:
pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [4]:
pip install sentence_transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [5]:
pip install torch==1.11

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting torch==1.11
  Downloading torch-1.11.0-cp38-cp38-manylinux1_x86_64.whl (750.6 MB)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 1.13.1+cu116
    Uninstalling torch-1.13.1+cu116:
      Successfully uninstalled torch-1.13.1+cu116
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.14.1+cu116 requires torch==1.13.1, but you have torch 1.11.0 which is incompatible.
torchtext 0.14.1 requires torch==1.13.1, but you have torch 1.11.0 which is incompatible.
torchaudio 0.13.1+cu116 requires torch==1.13.1, but you have torch 1.11.0 which is incompatible.

In [6]:
from pathlib import Path
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel
from elasticsearch import Elasticsearch
from elasticsearch.client import MlClient


# Configure Elasticsearch authentication
This example uses the [Elastic Cloud ID](https://www.elastic.co/guide/en/cloud/current/ec-cloud-id.html) and a [cluster API key](https://www.elastic.co/guide/en/kibana/current/api-keys.html).

You can use any method you wish to set the required credentials. We are using getpass in this example to prompt for credentials.

In [7]:
import getpass

In [8]:
es_cloud_id = getpass.getpass('Enter Elastic Cloud ID:  ')
es_api_id = getpass.getpass('Enter cluster API key ID:  ') 
es_api_key = getpass.getpass('Enter cluster API key:  ')

Enter Elastic Cloud ID:  ··········
Enter cluster API key ID:  ··········
Enter cluster API key:  ··········
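
As one alternative to always prompting, the credentials could be read from environment variables, falling back to `getpass` when a variable is not set. This is a minimal sketch: the helper name `read_secret` and the variable name `ES_CLOUD_ID` are hypothetical and not part of the original notebook.

```python
import getpass
import os


def read_secret(env_var: str, prompt: str) -> str:
    """Return the value of `env_var` if it is set, otherwise prompt for it."""
    value = os.environ.get(env_var)
    if value:
        return value
    # Fall back to an interactive prompt that does not echo the input
    return getpass.getpass(prompt)
```

With this helper, a call such as `es_cloud_id = read_secret("ES_CLOUD_ID", "Enter Elastic Cloud ID:  ")` would only prompt when the variable is missing.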


# Connect to Elastic and Load a Hugging Face Model

In [9]:
es = Elasticsearch(cloud_id=es_cloud_id, 
                   api_key=(es_api_id, es_api_key)
                   )
es.info() # should return cluster info

ObjectApiResponse({'name': 'instance-0000000001', 'cluster_name': 'a7bf48bf42ad403ab45dd6b90b860f85', 'cluster_uuid': 'gEbjuhUOSyCVzG4Gz2SQ2w', 'version': {'number': '8.6.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'f67ef2df40237445caa70e2fef79471cc608d70d', 'build_date': '2023-01-04T09:35:21.782467981Z', 'build_snapshot': False, 'lucene_version': '9.4.2', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})
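
The 8.0+ version requirement stated earlier can be checked against the `version.number` field of this response. The sketch below assumes the version string has the `major.minor.patch` form shown above; the helper name `meets_min_version` is hypothetical, and it does not check the license level.

```python
def meets_min_version(version_number: str, minimum: tuple = (8, 0)) -> bool:
    """Return True if a 'major.minor.patch' version string is at least `minimum`."""
    # Keep only the major and minor parts, e.g. '8.6.0' -> (8, 6)
    parts = version_number.split(".")
    major_minor = tuple(int(p) for p in parts[:2])
    return major_minor >= minimum


print(meets_min_version("8.6.0"))   # True
print(meets_min_version("7.17.0"))  # False
```

In the notebook this could be called as `meets_min_version(es.info()['version']['number'])`.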

[Supported `task_type` values](https://github.com/elastic/eland/blob/15a300728876022b206161d71055c67b500a0192/eland/ml/pytorch/transformers.py#L41)

# Download an embedding model from Hugging Face using the model ID copied from its Hugging Face page

[sentence-transformers/msmarco-MiniLM-L-12-v3](https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-12-v3)


In [10]:
hf_model_id='sentence-transformers/msmarco-MiniLM-L-12-v3'
tm = TransformerModel(hf_model_id, "text_embedding")

In [11]:
es_model_id = tm.elasticsearch_model_id()
es_model_id

'sentence-transformers__msmarco-minilm-l-12-v3'
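
The ID printed above looks like the Hugging Face ID with `/` replaced by `__` and everything lowercased. The sketch below reproduces that mapping for this model only. It is not eland's implementation, which may apply further rules (for example, length limits), and the function name `sketch_es_model_id` is hypothetical.

```python
def sketch_es_model_id(hf_model_id: str) -> str:
    """Approximate the Elasticsearch model ID for a Hugging Face model ID."""
    # Replace the org/model separator and lowercase the result
    return hf_model_id.replace("/", "__").lower()


print(sketch_es_model_id("sentence-transformers/msmarco-MiniLM-L-12-v3"))
# prints: sentence-transformers__msmarco-minilm-l-12-v3
```

In practice, prefer `tm.elasticsearch_model_id()` so the ID always matches what eland expects.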

### Export the model in a TorchScript representation, which Elasticsearch uses

In [12]:
tmp_path = "models"
Path(tmp_path).mkdir(parents=True, exist_ok=True)
model_path, config, vocab_path = tm.save(tmp_path)

### Import model into Elasticsearch
The model must not already exist in Elasticsearch.
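
One possible way to satisfy this precondition is to look the model up first and delete it if it is found. The sketch below is an assumption-laden illustration, not part of the original notebook: the helper name `ensure_model_absent` is hypothetical, it relies on the 8.x Python client exposing the HTTP status of an API error as `status_code`, and it catches a broad `Exception` only to stay free of imports. Deleting a model is destructive, so use it with care.

```python
def ensure_model_absent(es_client, model_id: str) -> bool:
    """Delete `model_id` if it exists. Return True if a model was deleted."""
    try:
        es_client.ml.get_trained_models(model_id=model_id)
    except Exception as exc:
        # Assumption: API errors carry the HTTP status in `status_code`;
        # 404 means the model does not exist, so there is nothing to delete
        if getattr(exc, "status_code", None) == 404:
            return False
        raise
    # force=True also deletes a model that has a started deployment
    # or is referenced by ingest pipelines
    es_client.ml.delete_trained_model(model_id=model_id, force=True)
    return True
```

In the notebook this would be called as `ensure_model_absent(es, es_model_id)` before the import cell.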

In [14]:
ptm = PyTorchModel(es, es_model_id)
ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config) 

  0%|          | 0/32 [00:00<?, ? parts/s]

# Operationalizing the Model

## Deploy the model
This loads the model onto the ML nodes so it can be used for inference.

In [15]:
# List the model in Elasticsearch
m = es.ml.get_trained_models(model_id=es_model_id)
m.body

{'count': 1,
 'trained_model_configs': [{'model_id': 'sentence-transformers__msmarco-minilm-l-12-v3',
   'model_type': 'pytorch',
   'created_by': 'api_user',
   'version': '8.6.0',
   'create_time': 1675731184156,
   'model_size_bytes': 0,
   'estimated_operations': 0,
   'license_level': 'platinum',
   'description': "Model sentence-transformers/msmarco-MiniLM-L-12-v3 for task type 'text_embedding'",
   'tags': [],
   'input': {'field_names': ['text_field']},
   'inference_config': {'text_embedding': {'vocabulary': {'index': '.ml-inference-native-000001'},
     'tokenization': {'bert': {'do_lower_case': True,
       'with_special_tokens': True,
       'max_sequence_length': 512,
       'truncate': 'first',
       'span': -1}}}},
   'location': {'index': {'name': '.ml-inference-native-000001'}}}]}

In [16]:
# Start the trained model deployment
s = es.ml.start_trained_model_deployment(model_id=es_model_id)
s.body

# You can see model state in Kibana -> Machine Learning -> Model Management -> Trained Models

{'assignment': {'task_parameters': {'model_id': 'sentence-transformers__msmarco-minilm-l-12-v3',
   'model_bytes': 132923217,
   'threads_per_allocation': 1,
   'number_of_allocations': 1,
   'queue_capacity': 1024,
   'cache_size': '132923217b',
   'priority': 'normal'},
  'routing_table': {'dDqmmPySSKuH8d7HcF54uA': {'current_allocations': 1,
    'target_allocations': 1,
    'routing_state': 'started',
    'reason': ''}},
  'assignment_state': 'started',
  'start_time': '2023-02-07T00:53:53.570574077Z',
  'max_assigned_allocations': 1}}

In [17]:
# Check the routing state of the deployment on the first node
stats = es.ml.get_trained_models_stats(model_id=es_model_id)
stats.body['trained_model_stats'][0]['deployment_stats']['nodes'][0]['routing_state']

{'routing_state': 'started'}
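
A deployment may take a moment to reach the `started` state, so a small polling loop can be useful before running inference. This is a generic sketch under stated assumptions: the helper name `wait_for_state` and its default timeout and interval are hypothetical, and the caller supplies a function that returns the current state string.

```python
import time


def wait_for_state(get_state, target="started", timeout_s=120.0, interval_s=2.0):
    """Call `get_state()` repeatedly until it returns `target` or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == target:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"state is {state!r}, expected {target!r}")
        # Pause between checks to avoid hammering the cluster
        time.sleep(interval_s)
```

Based on the output above, `get_state` could be a function that fetches the stats again and returns `['deployment_stats']['nodes'][0]['routing_state']['routing_state']` from the first entry of `trained_model_stats`.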

## Generate Vector for Query


In [18]:
#{
#  "docs": [{"text_field": "What was Jean Valjean prisoner number?"}]
#}

docs =  [
    {
      "text_field": "Last week I upgraded my iOS version and ever since then my phone has been overheating whenever I use your app."
    }
  ]
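
When embedding several texts, the `docs` payload can be built from a plain list of strings. In this sketch the helper name `build_docs` is hypothetical; the default field name `text_field` matches the `input.field_names` value reported for the model earlier in this notebook.

```python
def build_docs(texts, field_name="text_field"):
    """Wrap each text in a dict keyed by the model's input field name."""
    return [{field_name: text} for text in texts]


docs_example = build_docs(["first query", "second query"])
print(docs_example)
# prints: [{'text_field': 'first query'}, {'text_field': 'second query'}]
```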

In [19]:
# For future reference only (do not use yet):
# z = es.ml.infer_trained_model_deployment(model_id=es_model_id, docs=docs)
# Run inference on the deployed model to generate the embedding
z = es.ml.infer_trained_model(model_id=es_model_id, docs=docs)

In [23]:
z['inference_results'][0]['predicted_value']

[-0.003885856131091714,
 0.10948595404624939,
 0.4242307245731354,
 -0.1088743731379509,
 0.25633704662323,
 -0.29534631967544556,
 0.0018348372541368008,
 -0.04669986665248871,
 0.08677243441343307,
 -0.10970480740070343,
 -0.27509960532188416,
 0.33432483673095703,
 0.2643325924873352,
 -0.19775789976119995,
 0.26366937160491943,
 0.12308581918478012,
 -0.45062845945358276,
 0.05021745711565018,
 -0.20362886786460876,
 0.11058223992586136,
 -0.4123060405254364,
 -0.10743958503007889,
 -0.14408890902996063,
 0.12179391086101532,
 -0.2939355969429016,
 0.26750823855400085,
 0.21124641597270966,
 0.12771840393543243,
 -0.3403322398662567,
 0.2825608551502228,
 -0.15237951278686523,
 0.28149640560150146,
 0.030429411679506302,
 0.10954266041517258,
 -0.09875885397195816,
 -0.11424262076616287,
 -0.5897471904754639,
 -0.17613087594509125,
 -0.40651747584342957,
 -0.236677348613739,
 -0.03501296788454056,
 0.47361087799072266,
 0.0849834755063057,
 0.008568909019231796,
 0.6042559742927551