# Loading Models into Elasticsearch from Hugging Face

<a target="_blank" href="https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/colab-notebooks-examples/integrations/hugging-face/loading-model-from-hugging-face.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This interactive notebook uses eland to load machine learning models from the Hugging Face hub into an Elasticsearch deployment.

## Install dependencies

To get started, we'll need to connect to our Elasticsearch deployment using the Python client.

First, we need to install our dependencies:

In [11]:
!pip install -qU elasticsearch eland torch transformers sentence_transformers

## Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?fromURI=%2Fhome) for a free trial.

- Go to the [Create deployment](https://cloud.elastic.co/deployments/create) page
- Select **Create deployment**

Now we can instantiate the [Elasticsearch Python client](https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html), providing the Cloud ID and password for your deployment.

In [34]:
import json
from elasticsearch import Elasticsearch

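es_client = Elasticsearch(
    # "http://deployment_url",  # alternative: connect by URL instead of cloud_id
    cloud_id="CLOUD_ID",
    basic_auth=("elastic", "PASSWORD"),
    # api_key="API_KEY",  # alternative: authenticate with an API key instead of basic_auth
)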

# Test client to ensure you can connect
print(es_client.info())

{'name': 'e5f4d36cef4a', 'cluster_name': 'docker-cluster', 'cluster_uuid': 'pvSmoDFzQA26HNRZe__RTw', 'version': {'number': '8.10.0-SNAPSHOT', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '92440687c5c05040596ad6c0383e0d91d42765a9', 'build_date': '2023-07-31T11:19:15.256884445Z', 'build_snapshot': True, 'lucene_version': '9.7.0', 'minimum_wire_compatibility_version': '7.17.0', 'minimum_index_compatibility_version': '7.0.0'}, 'tagline': 'You Know, for Search'}
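To keep credentials out of the notebook, the connection arguments can be assembled from environment variables. The following is a minimal sketch, not part of the original notebook: the helper `build_client_kwargs` and the variable names `ES_CLOUD_ID`, `ES_URL`, `ES_API_KEY`, `ES_USER`, and `ES_PASSWORD` are hypothetical, chosen only for illustration.

```python
import os
from typing import Any, Dict, Mapping, Optional


def build_client_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for Elasticsearch(...) from environment-style settings.

    The variable names used here are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}

    # Where to connect: prefer a Cloud ID, fall back to a plain URL.
    if env.get("ES_CLOUD_ID"):
        kwargs["cloud_id"] = env["ES_CLOUD_ID"]
    elif env.get("ES_URL"):
        kwargs["hosts"] = [env["ES_URL"]]
    else:
        raise ValueError("Set ES_CLOUD_ID or ES_URL")

    # How to authenticate: prefer an API key, fall back to basic auth.
    if env.get("ES_API_KEY"):
        kwargs["api_key"] = env["ES_API_KEY"]
    elif env.get("ES_PASSWORD"):
        kwargs["basic_auth"] = (env.get("ES_USER", "elastic"), env["ES_PASSWORD"])
    else:
        raise ValueError("Set ES_API_KEY or ES_PASSWORD")

    return kwargs


# Hypothetical usage with the client from this notebook:
# es_client = Elasticsearch(**build_client_kwargs())
```

Raising early when a setting is missing makes a misconfigured environment fail before any network call is attempted.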


## Importing a Model from Hugging Face

To import a model from Hugging Face, you will first need to find the model's ID and task type so that it can be loaded into your Elasticsearch cluster.

In [39]:
from eland.ml.pytorch.transformers import TransformerModel

# Load model from Hugging Face
model_name = "elastic/distilbert-base-cased-finetuned-conll03-english"
model_type = "ner"  # eland task type for named entity recognition
# Keyword arguments are used because newer eland releases may not accept these positionally
tm = TransformerModel(model_id=model_name, task_type=model_type)

## Export model

We will create a directory to hold the export, then convert the Hugging Face model to a TorchScript representation that can be imported into Elasticsearch.

In [40]:
from pathlib import Path

# Export the model to a TorchScript representation which Elasticsearch uses
tmp_path = "models"
Path(tmp_path).mkdir(parents=True, exist_ok=True)
model_path, config, vocab_path = tm.save(tmp_path)
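Before importing, it can help to confirm what the export wrote to disk. The sketch below uses only the standard library and makes no assumption about the file names eland produces; `list_exported_files` is a hypothetical helper, not an eland function.

```python
from pathlib import Path
from typing import List, Tuple


def list_exported_files(directory: str) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each file directly inside `directory`."""
    return sorted(
        (path.name, path.stat().st_size)
        for path in Path(directory).iterdir()
        if path.is_file()
    )


# Hypothetical usage with the export directory from this notebook:
# for name, size in list_exported_files(tmp_path):
#     print(f"{name}: {size / 1_000_000:.1f} MB")
```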

## Import model

We will now import the model into Elasticsearch from the saved TorchScript file.

In [41]:
from eland.ml.pytorch import PyTorchModel

# Import model into Elasticsearch
model_id = tm.elasticsearch_model_id()
ptm = PyTorchModel(es_client, model_id)

# You can also give the model a custom model id like
# ptm = PyTorchModel(es_client, "my_model_id")

ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)

  0%|          | 0/63 [00:00<?, ? parts/s]
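The default ID returned by `tm.elasticsearch_model_id()` is derived from the Hugging Face model name. The sketch below is an approximation inferred from the ID that appears in this notebook's output (`elastic__distilbert-base-cased-finetuned-conll03-english`); it is not eland's implementation, which may apply additional rules such as a length limit. `approximate_model_id` is a hypothetical helper.

```python
import re


def approximate_model_id(hub_model_id: str) -> str:
    """Approximate the Hugging Face name -> Elasticsearch model ID mapping.

    Assumption: slashes and whitespace become a double underscore and the
    result is lowercased. This mirrors the ID seen in this notebook only.
    """
    return re.sub(r"[\s\\/]", "__", hub_model_id).lower()


print(approximate_model_id("elastic/distilbert-base-cased-finetuned-conll03-english"))
```

Knowing the ID in advance is useful when a script needs to check for an existing model before running the import.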

## Get trained models

Next, let's retrieve the list of trained models from Elasticsearch to confirm that the model we imported is available.

In [50]:
models_resp = es_client.ml.get_trained_models()

for model_config in models_resp.body["trained_model_configs"]:
    if model_config["model_id"] == model_id:
        print(json.dumps(model_config, indent=2))

{
  "model_id": "elastic__distilbert-base-cased-finetuned-conll03-english",
  "model_type": "pytorch",
  "created_by": "api_user",
  "version": "8.10.0",
  "create_time": 1690992040049,
  "model_size_bytes": 0,
  "estimated_operations": 0,
  "license_level": "platinum",
  "description": "Model elastic/distilbert-base-cased-finetuned-conll03-english for task type 'ner'",
  "tags": [],
  "input": {
    "field_names": [
      "text_field"
    ]
  },
  "inference_config": {
    "ner": {
      "vocabulary": {
        "index": ".ml-inference-native-000001"
      },
      "tokenization": {
        "bert": {
          "do_lower_case": false,
          "with_special_tokens": true,
          "max_sequence_length": 512,
          "truncate": "first",
          "span": -1
        }
      },
      "classification_labels": [
        "O",
        "B_PER",
        "I_PER",
        "B_ORG",
        "I_ORG",
        "B_LOC",
        "I_LOC",
        "B_MISC",
        "I_MISC"
      ]
    }
  },
  "locat

## Start model deployment

Finally, we will start a deployment of the trained model. You can [learn more](https://www.elastic.co/guide/en/elasticsearch/reference/current/start-trained-model-deployment.html) about starting a deployment in the machine learning API documentation.

In [51]:
start_resp = es_client.ml.start_trained_model_deployment(
    model_id=model_id,
    priority="normal",
    number_of_allocations=1,
    threads_per_allocation=1,  # must be a power of 2, e.g. 1, 2, 4, 8, 16
    wait_for="started",
    timeout="1m",
)

print(json.dumps(start_resp.body, indent=2))

BadRequestError: BadRequestError(400, 'status_exception', 'Could not start model deployment because an existing deployment with the same id [elastic__distilbert-base-cased-finetuned-conll03-english] exist')
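The error above is raised when the cell runs while a deployment for the same model ID already exists, for example when the notebook is re-run. One way to avoid it is to check the deployment state first. The sketch below works on the body returned by `es_client.ml.get_trained_models_stats`; it assumes that each entry in `trained_model_stats` carries a `deployment_stats` object with a `state` field only when the model is deployed. `deployment_state` and `needs_start` are hypothetical helpers.

```python
from typing import Any, Mapping, Optional


def deployment_state(stats_body: Mapping[str, Any], model_id: str) -> Optional[str]:
    """Return the deployment state for `model_id`, or None if it is not deployed.

    Assumed response shape: {"trained_model_stats": [{"model_id": ...,
    "deployment_stats": {"state": ...}}, ...]}.
    """
    for stats in stats_body.get("trained_model_stats", []):
        if stats.get("model_id") == model_id:
            deployment = stats.get("deployment_stats")
            return deployment.get("state") if deployment else None
    return None


def needs_start(stats_body: Mapping[str, Any], model_id: str) -> bool:
    """True when no deployment exists yet for `model_id`."""
    return deployment_state(stats_body, model_id) is None


# Hypothetical usage with the client from this notebook:
# stats = es_client.ml.get_trained_models_stats(model_id=model_id)
# if needs_start(stats.body, model_id):
#     es_client.ml.start_trained_model_deployment(model_id=model_id, wait_for="started")
```

If the existing deployment should be replaced rather than reused, it can be stopped first with `es_client.ml.stop_trained_model_deployment(model_id=model_id)`.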