<a href="https://colab.research.google.com/github/jeffvestal/elastic_jupyter_notebooks/blob/main/load_embedding_model_from_hf_to_elastic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading an NLP Embedding model from Hugging Face into Elastic

This notebook shows how to load a supported embedding model from Hugging Face into an Elasticsearch cluster in [Elastic Cloud](https://cloud.elastic.co/).

### Elastic version support
Requires Elastic version 8.0+ with a Platinum or Enterprise license (or a trial license).

You can set up a [free trial Elasticsearch deployment in Elastic Cloud](https://cloud.elastic.co/registration).

## Deleteme below

The example here loads a [Zero Shot model](https://huggingface.co/typeform/distilbert-base-uncased-mnli).

[Elastic NLP Model Support Docs](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html) 


Disclaimer: presented as is with no guarantee.

# Install and import required python libraries

Elastic uses the [eland Python library](https://github.com/elastic/eland) to download models from the Hugging Face Hub and load them into Elasticsearch.

In [1]:
pip install eland

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting eland
  Downloading eland-8.3.0-py3-none-any.whl (143 kB)
Collecting elasticsearch<9,>=8.3
  Downloading elasticsearch-8.6.1-py3-none-any.whl (385 kB)
Collecting elastic-transport<9,>=8
  Downloading elastic_transport-8.4.0-py3-none-any.whl (59 kB)
Collecting urllib3<2,>=1.26.2
  Downloading urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
Installing collected packages: urllib3, elastic-transport, ela

In [2]:
pip install elasticsearch

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [4]:
pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.26.0-py3-none-any.whl (6.3 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.12.0-py3-none-any.whl (190 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.12.0 tokenizers-0.13.2 transformers-4.26.0


In [6]:
pip install sentence_transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting sentence_transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
  Preparing metadata (setup.py) ... done
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Building wheels for collected packages: sentence_transformers
  Building wheel for sentence_transformers (setup.py) ... done
  Created wheel for sentence_transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125938 sha256=c8ee7abe71bd203971a09c508a080912c39f61b2f2e80e47ff67f2d0de465229
  Stored in directory: /root/.cache/pip/wheels/5e/6f/8c/d88aec621f3f5

In [None]:
pip install torch==1.11
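
Before running the import cells, it can help to confirm that the packages installed above are importable in the current kernel. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list uses import names, which can differ from pip package names.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Import names of the packages installed in the cells above
required = ["eland", "elasticsearch", "transformers", "sentence_transformers", "torch"]
print("Missing modules:", missing_modules(required))
```

An empty list suggests the installs succeeded; it does not check package versions.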

In [7]:
from pathlib import Path
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel
from elasticsearch import Elasticsearch

In [None]:
import getpass
from pathlib import Path
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel
from elasticsearch import Elasticsearch
from elasticsearch.client import MlClient


# Configure Elasticsearch authentication

For this example we are using the [Elastic Cloud ID](https://www.elastic.co/guide/en/cloud/current/ec-cloud-id.html) and a [cluster API key](https://www.elastic.co/guide/en/kibana/current/api-keys.html).

In [8]:
import getpass

In [15]:
es_cloud_id = getpass.getpass('Enter Elastic Cloud ID:  ')
es_api_id = getpass.getpass('Enter cluster API key ID:  ') 
es_api_key = getpass.getpass('Enter cluster API key:  ')

Enter Elastic Cloud ID:  ··········
Enter cluster API key ID:  ··········
Enter cluster API key:  ··········
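
The Cloud ID also encodes the cluster's HTTPS endpoint, which is handy when a plain URL is needed for raw HTTP calls. The sketch below assumes the commonly documented layout `<label>:<base64 of "host$es_id$kibana_id">`; the helper name `cloud_id_to_es_url` and the example Cloud ID are made up for illustration, so verify the result against the endpoint shown in the Elastic Cloud console.

```python
import base64


def cloud_id_to_es_url(cloud_id):
    """Derive the Elasticsearch HTTPS URL from a Cloud ID (assumed format)."""
    # Assumed layout: "<label>:<base64 of 'host$es_id$kibana_id'>"
    _, _, encoded = cloud_id.rpartition(":")
    decoded = base64.b64decode(encoded).decode("utf-8")
    parts = decoded.split("$")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("Cloud ID does not match the assumed format")
    host, es_id = parts[0], parts[1]
    # The host part may carry an explicit port, e.g. "example.host.io:9243"
    host, sep, port = host.partition(":")
    return f"https://{es_id}.{host}{sep}{port}"


# Made-up Cloud ID, built here only to demonstrate the decoding
example = "demo:" + base64.b64encode(b"example.host.io$esid$kbid").decode("ascii")
print(cloud_id_to_es_url(example))  # https://esid.example.host.io
```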


# Connect to Elastic and Load a Hugging Face Model

In [16]:
es = Elasticsearch(cloud_id=es_cloud_id, 
                   api_key=(es_api_id, es_api_key)
                   )
es.info() # should return cluster info

ObjectApiResponse({'name': 'instance-0000000001', 'cluster_name': 'a7bf48bf42ad403ab45dd6b90b860f85', 'cluster_uuid': 'gEbjuhUOSyCVzG4Gz2SQ2w', 'version': {'number': '8.6.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': 'f67ef2df40237445caa70e2fef79471cc608d70d', 'build_date': '2023-01-04T09:35:21.782467981Z', 'build_snapshot': False, 'lucene_version': '9.4.2', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'})
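
The `es.info()` response includes `version.number`, so the 8.0+ requirement can be checked in code. This is a minimal sketch; the helper name `meets_minimum_version` is hypothetical, and it only compares the major and minor parts of the version string.

```python
def meets_minimum_version(info, minimum=(8, 0)):
    """Return True if the version in an es.info()-style mapping is >= minimum."""
    number = info["version"]["number"]  # e.g. "8.6.0"
    # Keep only major and minor; drop any suffix such as "-SNAPSHOT"
    major, minor = (int(part) for part in number.split("-")[0].split(".")[:2])
    return (major, minor) >= tuple(minimum)


# Trimmed-down copy of the response shown above
sample = {"version": {"number": "8.6.0"}}
print(meets_minimum_version(sample))  # True
```

With a live connection, `meets_minimum_version(es.info())` should work the same way, since the response object supports dictionary-style access.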

[Supported `task_type` values](https://github.com/elastic/eland/blob/15a300728876022b206161d71055c67b500a0192/eland/ml/pytorch/transformers.py#L41)

In [None]:
# Download a Hugging Face model directly from the model hub.
# The second argument is the task type.

# Zero Shot model referenced in the introduction:
# https://huggingface.co/typeform/distilbert-base-uncased-mnli
# Alternative: a text embedding model
#tm = TransformerModel("sentence-transformers/all-MiniLM-L12-v2", "text_embedding")
tm = TransformerModel("distilbert-base-cased-distilled-squad", "question_answering")

In [None]:
# Export the model in a TorchScript representation, which Elasticsearch uses
tmp_path = "models"
Path(tmp_path).mkdir(parents=True, exist_ok=True)
# model_path, config_path, vocab_path = tm.save(tmp_path) #pre 8.2.0
model_path, config, vocab_path = tm.save(tmp_path)
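
After the export, a quick look at what was written to the `models` directory can confirm that the save step produced files. This is a minimal sketch with a hypothetical helper name, `list_exported_files`; it makes no assumption about which file names eland writes.

```python
from pathlib import Path


def list_exported_files(directory):
    """Return (file name, size in bytes) pairs for files directly under directory."""
    return sorted(
        (p.name, p.stat().st_size) for p in Path(directory).iterdir() if p.is_file()
    )


# Only list the export directory if it exists in the current working directory
if Path("models").is_dir():
    for name, size in list_exported_files("models"):
        print(f"{name}: {size} bytes")
```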

In [None]:
# Import model into Elasticsearch
ptm = PyTorchModel(es, tm.elasticsearch_model_id())
# ptm.import_model(model_path, config_path, vocab_path) # pre 8.2.0
ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config) 
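
The ID passed to `PyTorchModel` comes from `tm.elasticsearch_model_id()`. The sketch below approximates that mapping for illustration: replacing `/` with `__` is consistent with the ID `typeform__distilbert-base-uncased-mnli` that this notebook lists for `typeform/distilbert-base-uncased-mnli`, while lowercasing is an assumption about eland's behaviour. The helper name `to_es_model_id` is hypothetical; prefer the eland method in real code.

```python
def to_es_model_id(hub_model_id):
    """Approximate the Elasticsearch model ID for a Hugging Face Hub model ID."""
    # "/" -> "__" matches the ID listed in this notebook;
    # lowercasing is an assumption, not confirmed by the notebook.
    return hub_model_id.replace("/", "__").lower()


print(to_es_model_id("typeform/distilbert-base-uncased-mnli"))
# typeform__distilbert-base-uncased-mnli
```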

# Deploy the model

In [None]:
# List models in elasticsearch
m = es.ml.get_trained_models()
m.body
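
The full response body can be long, so pulling out just the model IDs makes it easier to confirm the import. This sketch assumes the body has a `trained_model_configs` list whose entries carry a `model_id`; the helper name `trained_model_ids` and the sample body are hypothetical and trimmed down for illustration.

```python
def trained_model_ids(body):
    """Extract model IDs from a get-trained-models style response body."""
    return [cfg["model_id"] for cfg in body.get("trained_model_configs", [])]


# Hypothetical, trimmed-down body for illustration only
sample_body = {
    "count": 1,
    "trained_model_configs": [{"model_id": "distilbert-base-cased-distilled-squad"}],
}
print(trained_model_ids(sample_body))
```

In the notebook this would be called as `trained_model_ids(m.body)`.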

In [None]:
# Deploy the model

# Models are listed by ID, e.g. 'model_id': 'typeform__distilbert-base-uncased-mnli'.
# Use the ID of the model imported in the previous step.
model_id = tm.elasticsearch_model_id()

# Start the trained model deployment
s = es.ml.start_trained_model_deployment(model_id=model_id)
s.body

# You can see model state in Kibana -> Machine Learning -> Model Management -> Trained Models
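
Instead of watching Kibana, the deployment state can be polled from code. The following is a minimal, generic polling sketch; `wait_for_state` and `get_state` are hypothetical names, and the stats response layout in the commented usage is an assumption that should be checked against the cluster's actual response.

```python
import time


def wait_for_state(get_state, wanted="started", timeout=120.0, interval=2.0):
    """Poll get_state() until it returns wanted; True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage with the client from this notebook
# (the response layout below is an assumption):
# def get_state():
#     stats = es.ml.get_trained_models_stats(model_id=model_id)
#     return stats["trained_model_stats"][0]["deployment_stats"]["state"]
#
# wait_for_state(get_state)
```

Passing the state lookup as a callable keeps the loop independent of the client, which also makes it easy to try out without a cluster.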

# Zero Shot Time!

In [None]:
# future reference do not use yet
#z = MlClient.infer_trained_model_deployment(es, model_id =model_id, docs=docs, )

In [None]:
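# Using requests until MlClient.infer_trained_model_deployment is updated to accept inference extra configs
import getpass
import requests
from requests.auth import HTTPBasicAuth
import urllib.parse

# es_url, es_user and es_pass are not set earlier in this notebook, so prompt for them here
es_url = input('Enter Elasticsearch endpoint URL:  ')
es_user = input('Enter Elasticsearch username:  ')
es_pass = getpass.getpass('Enter Elasticsearch password:  ')

endpoint = '_ml/trained_models/%s/deployment/_infer' % model_id
url = urllib.parse.urljoin(es_url.rstrip('/') + '/', endpoint)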

body = {
  "docs": [
    {
      "text_field": "Last week I upgraded my iOS version and ever since then my phone has been overheating whenever I use your app."
    }
  ],
  "inference_config": {
    "zero_shot_classification": {
      "labels": [
        "mobile",
        "website",
        "billing",
        "account access"
      ],
      "multi_label": True
    }
  }
}

resp = requests.post(url, auth=HTTPBasicAuth(es_user, es_pass), json=body)
r = resp.json()
print('Predicted value is: %s with a probability of %0.2f%%' % (r['predicted_value'], r['prediction_probability'] * 100))
print('=-=-=-=')
print('Full Probability output:')
for c in r['top_classes']:
    print('%s probability of %0.5f%%' % (c['class_name'], c['class_probability'] * 100))

In [None]:
# Just to see the full doc
resp.json()
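
To try other texts or label sets, the request body can be built by a small function rather than edited by hand. This sketch mirrors the body structure used in the request cell; the helper name `build_zero_shot_body` and the shortened example text are hypothetical.

```python
def build_zero_shot_body(text, labels, multi_label=True):
    """Build an inference request body shaped like the one used in this notebook."""
    if not labels:
        raise ValueError("at least one label is required")
    return {
        "docs": [{"text_field": text}],
        "inference_config": {
            "zero_shot_classification": {
                "labels": list(labels),
                "multi_label": multi_label,
            }
        },
    }


body = build_zero_shot_body(
    "My phone has been overheating whenever I use your app.",
    ["mobile", "website", "billing", "account access"],
)
print(body["inference_config"]["zero_shot_classification"]["labels"])
```

The resulting dictionary can be passed as `json=body` in the same `requests.post` call.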