# Demo Notebook for Tracing Sentence Transformers Models

This notebook walks through tracing models from Sentence Transformers in TorchScript and ONNX format. After tracing a model, you can upload it to OpenSearch and generate embeddings.

Remember, tracing a model in TorchScript and tracing it in ONNX are just two alternative options. We don't need to trace a model in both ways. This notebook shows both only for demonstration.

Step 0: Import packages and set up client

Step 1: Save model in TorchScript format

Step 2: Upload the saved TorchScript model to OpenSearch

[The following steps are optional; they show the ONNX path and compare the embedding output of both models]

Step 3: Save model in ONNX format

Step 4: Upload the saved ONNX model to OpenSearch

Step 5: Generate sentence embeddings with the uploaded models




## Step 0: Import packages and set up client
Install the packages required by `opensearch_py_ml.sentence_transformer_model`:
install `opensearch-py` and `opensearch-py-ml` from PyPI.


In [1]:
#!pip install opensearch-py opensearch-py-ml
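Before running the remaining cells, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library; the helper name `find_missing_modules` is hypothetical and is not part of `opensearch-py-ml`.

```python
import importlib.util

# Import names (not pip names) of the two packages this notebook relies on.
REQUIRED_MODULES = ["opensearchpy", "opensearch_py_ml"]


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules, install them before continuing:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If anything is reported as missing, uncomment the `pip install` line above and run it.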

In [2]:
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings("ignore", message="Unverified HTTPS request")
import opensearch_py_ml as oml
from opensearchpy import OpenSearch
from opensearch_py_ml.ml_models import SentenceTransformerModel
# import mlcommon to later upload the model to OpenSearch Cluster
from opensearch_py_ml.ml_commons import MLCommonClient

In [2]:
CLUSTER_URL = 'https://localhost:9200'

In [3]:
def get_os_client(cluster_url = CLUSTER_URL,
                  username='admin',
                  password='admin'):
    '''
    Get OpenSearch client
    :param cluster_url: cluster URL like https://ml-te-netwo-1s12ba42br23v-ff1736fa7db98ff2.elb.us-west-2.amazonaws.com:443
    :return: OpenSearch client
    '''
    client = OpenSearch(
        hosts=[cluster_url],
        http_auth=(username, password),
        verify_certs=False
    )
    return client 
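The function above defaults to the `admin`/`admin` credentials of a local test cluster. For any other cluster, one option is to read the connection settings from environment variables instead of hard-coding them. This is only a sketch: the variable names `OPENSEARCH_URL`, `OPENSEARCH_USER` and `OPENSEARCH_PASSWORD` are hypothetical and are not defined by `opensearch-py`.

```python
import os


def read_cluster_settings(env=None):
    """Read connection settings from environment variables.

    The variable names used here are hypothetical; the fallbacks are the
    same defaults the notebook uses for a local cluster.
    """
    env = os.environ if env is None else env
    return {
        "cluster_url": env.get("OPENSEARCH_URL", "https://localhost:9200"),
        "username": env.get("OPENSEARCH_USER", "admin"),
        "password": env.get("OPENSEARCH_PASSWORD", "admin"),
    }


settings = read_cluster_settings()
# The keys match the parameters of get_os_client, so the result can be unpacked:
# client = get_os_client(**settings)
print("Connecting to", settings["cluster_url"], "as", settings["username"])
```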

In [4]:
client = get_os_client()

# Connect to the ml-commons client through the OpenSearch client
# (MLCommonClient was already imported in the first cell)
ml_client = MLCommonClient(client)

## Step 1: Save model in TorchScript format

`opensearch-py-ml` provides the `save_as_pt` method, which traces a model in TorchScript format and saves it as a zip file in your file system.

Detailed documentation: https://opensearch-project.github.io/opensearch-py-ml/reference/api/sentence_transformer.save_as_pt.html#opensearch_py_ml.ml_models.SentenceTransformerModel.save_as_pt


You need to provide a Sentence Transformers model ID (for example, `sentence-transformers/all-MiniLM-L6-v2`). This is a Hugging Face model ID. Example: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

`save_as_pt` downloads the model to the file system and then traces it with the given input strings.

For more guidance on dummy input strings, see: https://huggingface.co/docs/transformers/torchscript#dummy-inputs-and-standard-lengths

After tracing the model (which generates a `.pt` file), `save_as_pt` zips `tokenizer.json` and the TorchScript (`.pt`) file and saves the archive in the file system.

You can then upload that model to OpenSearch to generate embeddings.
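To confirm what ended up in the archive, the zip file can be inspected with the standard `zipfile` module. The sketch below builds a small stand-in archive so that it runs without tracing a real model; the file names inside it are placeholders. With the notebook's own output you would pass the path returned by `save_as_pt` instead.

```python
import os
import tempfile
import zipfile


def list_zip_members(zip_path):
    """Return the file names stored in the zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Stand-in archive (placeholder contents) so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
demo_zip = os.path.join(demo_dir, "demo-model.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("demo-model.pt", b"placeholder for traced weights")
    archive.writestr("tokenizer.json", "{}")

members = list_zip_members(demo_zip)
print(members)

# A TorchScript upload needs a .pt file in the archive.
has_traced_model = any(name.endswith(".pt") for name in members)
print("contains a .pt file:", has_traced_model)
```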

In [5]:
pre_trained_model = SentenceTransformerModel(folder_path = '/Volumes/workplace/upload_content/', overwrite = True)
model_path = pre_trained_model.save_as_pt(model_id = "sentence-transformers/all-MiniLM-L6-v2", sentences=["for example providing a small sentence", "we can add multiple sentences"])

model file is saved to  /Volumes/workplace/upload_content/all-MiniLM-L6-v2.pt
zip file is saved to  /Volumes/workplace/upload_content/all-MiniLM-L6-v2.zip 



## Step 2: Upload the saved TorchScript model to OpenSearch

In the last step we saved a sentence transformer model in TorchScript format. Now we will upload that model to the OpenSearch cluster using the `upload_model` method in `opensearch-py-ml`.

To upload the model, we need the zip file we saved in the last step and a model config file. An example model config file:

{
  "name": "all-MiniLM-L6-v2",
  "version": "1.0.0",
  "description": "test model",
  "model_format": "TORCH_SCRIPT",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers"
  }
}

`model_format` must be `TORCH_SCRIPT` so that the system looks for the corresponding `.pt` file in the zip archive.
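The config file can be written by hand or generated from Python. Here is a minimal sketch that writes the example config above with the `json` module and reads it back. A temporary directory is used only so the sketch runs anywhere; the file name mirrors the one used in the upload cell.

```python
import json
import os
import tempfile

# Same fields as the example config above.
model_config = {
    "name": "all-MiniLM-L6-v2",
    "version": "1.0.0",
    "description": "test model",
    "model_format": "TORCH_SCRIPT",
    "model_config": {
        "model_type": "bert",
        "embedding_dimension": 384,
        "framework_type": "sentence_transformers",
    },
}

# Temporary location for illustration; the notebook keeps its config
# next to the model zip.
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "model_config_torchscript.json")
with open(config_path, "w", encoding="utf-8") as f:
    json.dump(model_config, f, indent=2)

# Read the file back to confirm it is valid JSON with the expected format.
with open(config_path, encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded["model_format"])
```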

For more model config examples, see: https://github.com/opensearch-project/ml-commons/blob/2.x/docs/model_serving_framework/text_embedding_model_examples.md


Documentation for the method: https://opensearch-project.github.io/opensearch-py-ml/reference/api/ml_commons_upload_api.html#opensearch_py_ml.ml_commons.MLCommonClient.upload_model

Related demo notebook about ml-commons plugin integration: https://opensearch-project.github.io/opensearch-py-ml/examples/demo_ml_commons_integration.html



In [7]:

model_config_path = '/Volumes/workplace/upload_content/model_config_torchscript.json'
ml_client.upload_model( model_path, model_config_path, isVerbose=True)


Total number of chunks 10
Sha1 value of the model file:  0a8eabed8c09b09b588e9d4c3d42e61c90e524d014be5547b73db75e77185576
Model meta data was created successfully. Model Id:  SGHQ9IUBTo3f8n5RC3ZH
uploading chunk 1 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 2 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 3 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 4 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 5 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 6 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 7 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 8 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 9 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 10 of 10
Model id: {'status': 'Uploaded'}
Model uploaded successfully


'SGHQ9IUBTo3f8n5RC3ZH'
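The log above reports a chunk count and a hash of the model file. The hash shown is 64 hexadecimal characters long, which is the length of a SHA-256 digest (a SHA-1 digest has 40), even though the log labels it "Sha1". As a rough sketch of how such values can be computed locally, assuming a chunk size of 10,000,000 bytes (an assumption for illustration, not a value confirmed by this notebook):

```python
import hashlib
import math
import os
import tempfile

CHUNK_SIZE = 10_000_000  # assumed chunk size in bytes, for illustration only


def describe_file(path, chunk_size=CHUNK_SIZE):
    """Return (number_of_chunks, sha256_hex_digest) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB blocks so a large model file is never fully held in memory.
        for block in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(block)
    total_chunks = math.ceil(os.path.getsize(path) / chunk_size)
    return total_chunks, digest.hexdigest()


# Small stand-in file; with the notebook's output you would pass `model_path`.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as f:
    f.write(b"\0" * 2500)

chunks, sha256_hex = describe_file(demo_path, chunk_size=1000)
print(chunks, len(sha256_hex))
```

Comparing a locally computed digest with the one in the log is a quick way to confirm that the file you uploaded is the file you intended to upload.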

## Step 3: Save model in ONNX format

`opensearch-py-ml` provides the `save_as_onnx` method, which traces a model in ONNX format and saves it as a zip file in your file system.

Detailed documentation: https://opensearch-project.github.io/opensearch-py-ml/reference/api/sentence_transformer.save_as_onnx.html#opensearch_py_ml.ml_models.SentenceTransformerModel.save_as_onnx


You need to provide a Sentence Transformers model ID (for example, `sentence-transformers/all-MiniLM-L6-v2`). `save_as_onnx` downloads the model to the file system and then traces it.

After tracing the model (which generates a `.onnx` file), `save_as_onnx` zips `tokenizer.json` and the ONNX (`.onnx`) file and saves the archive in the file system.

You can then upload that model to OpenSearch to generate embeddings.


In [10]:
pre_trained_model = SentenceTransformerModel(folder_path = '/Volumes/workplace/upload_content/', overwrite = True)
model_path_onnx = pre_trained_model.save_as_onnx(model_id = "sentence-transformers/all-MiniLM-L6-v2")

ONNX opset version set to: 16
Loading pipeline (model: sentence-transformers/all-MiniLM-L6-v2, tokenizer: sentence-transformers/all-MiniLM-L6-v2)
Creating folder /Volumes/workplace/upload_content/onnx
Using framework PyTorch: 1.13.1
Found input input_ids with shape: {0: 'batch', 1: 'sequence'}
Found input token_type_ids with shape: {0: 'batch', 1: 'sequence'}
Found input attention_mask with shape: {0: 'batch', 1: 'sequence'}
Found output output_0 with shape: {0: 'batch', 1: 'sequence'}
Found output output_1 with shape: {0: 'batch'}
Ensuring inputs are in correct order
position_ids is not present in the generated input list.
Generated inputs order: ['input_ids', 'attention_mask', 'token_type_ids']
zip file is saved to  /Volumes/workplace/upload_content/all-MiniLM-L6-v2.zip 



## Step 4: Upload the saved ONNX model to OpenSearch

In the last step we saved a sentence transformer model in ONNX format. Now we will upload that model to the OpenSearch cluster using the `upload_model` method in `opensearch-py-ml`.

To upload the model, we need the zip file we saved in the last step and a model config file. An example model config file:

{
  "name": "all-MiniLM-L6-v2",
  "version": "1.0.0",
  "description": "test model",
  "model_format": "ONNX",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "pooling_mode":"mean",
    "normalize_result":"true"
  }
}

`model_format` must be `ONNX` so that the system looks for the corresponding `.onnx` file in the zip archive.
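Because the format in the config has to match the file inside the archive, a quick local check before uploading can catch a mismatched pair. The sketch below is hypothetical: the extension mapping is inferred from the two formats described in this notebook, and `config_matches_zip` is not part of `opensearch-py-ml`.

```python
import io
import zipfile

# Assumed mapping from `model_format` to the file extension expected in the zip.
EXPECTED_EXTENSION = {"TORCH_SCRIPT": ".pt", "ONNX": ".onnx"}


def config_matches_zip(config, zip_file):
    """Return True if the zip holds a file with the extension the config implies."""
    extension = EXPECTED_EXTENSION.get(config.get("model_format"))
    if extension is None:
        return False
    with zipfile.ZipFile(zip_file) as archive:
        return any(name.endswith(extension) for name in archive.namelist())


# In-memory stand-in for the zip produced in Step 3 (placeholder contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo-model.onnx", b"placeholder")
    archive.writestr("tokenizer.json", "{}")

onnx_config = {"name": "demo-model", "model_format": "ONNX"}
torch_config = {"name": "demo-model", "model_format": "TORCH_SCRIPT"}

buffer.seek(0)
onnx_ok = config_matches_zip(onnx_config, buffer)
buffer.seek(0)
torch_ok = config_matches_zip(torch_config, buffer)
print("ONNX config matches:", onnx_ok)
print("TorchScript config matches:", torch_ok)
```

`config_matches_zip` also accepts a file path, so the same check could be applied to `model_path_onnx` together with the parsed config file.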

For more model config examples, see: https://github.com/opensearch-project/ml-commons/blob/2.x/docs/model_serving_framework/text_embedding_model_examples.md


Documentation for the method: https://opensearch-project.github.io/opensearch-py-ml/reference/api/ml_commons_upload_api.html#opensearch_py_ml.ml_commons.MLCommonClient.upload_model

Related demo notebook about ml-commons plugin integration: https://opensearch-project.github.io/opensearch-py-ml/examples/demo_ml_commons_integration.html

In [11]:

model_config_path = '/Volumes/workplace/upload_content/model_config.json'
ml_client.upload_model( model_path_onnx, model_config_path, isVerbose=True)

Total number of chunks 10
Sha1 value of the model file:  8faa075a1693f3734f956a8c8e0de755ad4142b59df6d605298879b0dab31308
Model meta data was created successfully. Model Id:  YWHS9IUBTo3f8n5ReXas
uploading chunk 1 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 2 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 3 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 4 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 5 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 6 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 7 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 8 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 9 of 10
Model id: {'status': 'Uploaded'}
uploading chunk 10 of 10
Model id: {'status': 'Uploaded'}
Model uploaded successfully


'YWHS9IUBTo3f8n5ReXas'

## Step 5: Generate Sentence Embedding

Once the uploaded models are loaded in memory, we can generate embeddings for sentences. We provide a list of sentences and get back a list of embeddings, one per sentence.
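The cell that follows reads one embedding from the response as `["inference_results"][0]["output"][0]["data"]`. Assuming the response holds one `inference_results` entry per input sentence with that same shape (an assumption based on how the notebook indexes it), all embeddings can be collected with a small helper. The response below is a hand-written stand-in, not real model output.

```python
def extract_embeddings(response):
    """Return one embedding (list of floats) per entry in `inference_results`."""
    return [result["output"][0]["data"] for result in response["inference_results"]]


# Stand-in response with 3-dimensional vectors; the model config in this
# notebook declares an embedding dimension of 384.
mock_response = {
    "inference_results": [
        {"output": [{"data": [0.1, 0.2, 0.3]}]},
        {"output": [{"data": [0.4, 0.5, 0.6]}]},
    ]
}

embeddings = extract_embeddings(mock_response)
print(len(embeddings), len(embeddings[0]))
```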

In [13]:
# Now using this model we can generate sentence embedding.

import numpy as np

input_sentences = ["first sentence", "second sentence"]

# generated embedding from torchScript

embedding_output_torch = ml_client.generate_embedding("SGHQ9IUBTo3f8n5RC3ZH", input_sentences)

#just taking embedding for the first sentence
data_torch = embedding_output_torch["inference_results"][0]["output"][0]["data"]

# generated embedding from onnx

embedding_output_onnx = ml_client.generate_embedding("YWHS9IUBTo3f8n5ReXas", input_sentences)

#just taking embedding for the first sentence
data_onnx = embedding_output_onnx["inference_results"][0]["output"][0]["data"]

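# Check whether there is any significant difference between the two outputs.
# assert_allclose raises an AssertionError if the values differ beyond the
# tolerances and returns None otherwise, so a printed None means the check passed.

print(np.testing.assert_allclose(data_torch, data_onnx, rtol=1e-03, atol=1e-05))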


None
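To make the tolerance check less opaque: `assert_allclose` passes when every element satisfies `|actual - desired| <= atol + rtol * |desired|`. The sketch below reproduces that rule in plain Python and adds a cosine similarity as a second way to compare two embeddings. The vectors are made up for illustration and stand in for `data_torch` and `data_onnx`.

```python
import math


def all_close(actual, desired, rtol=1e-03, atol=1e-05):
    """Element-wise check: |actual - desired| <= atol + rtol * |desired|."""
    if len(actual) != len(desired):
        return False
    return all(abs(a - d) <= atol + rtol * abs(d) for a, d in zip(actual, desired))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up vectors standing in for data_torch and data_onnx.
vec_a = [0.1000, -0.2000, 0.3000]
vec_b = [0.1001, -0.2001, 0.3002]

print("within tolerance:", all_close(vec_a, vec_b))
print("cosine similarity:", cosine_similarity(vec_a, vec_b))
```

The element-wise check is stricter than cosine similarity: two vectors can point in almost the same direction and still fail it if individual values drift.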
