![JohnSnowLabs](https://sparknlp.org/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/transformers/openvino/HuggingFace_OpenVINO_in_Spark_NLP_Nomic.ipynb)

# Import OpenVINO Nomic models from HuggingFace 🤗 into Spark NLP 🚀

This notebook provides a detailed walkthrough on optimizing and importing Nomic models from HuggingFace for use in Spark NLP with the [Intel OpenVINO toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html). The focus is on converting the model to the OpenVINO format and reducing its precision (FP16 weight compression in this notebook) to improve performance and efficiency on CPU platforms using [Optimum Intel](https://huggingface.co/docs/optimum/main/en/intel/inference).

Let's keep in mind a few things before we start 😊

- OpenVINO support was introduced in `Spark NLP 5.4.0`, enabling high-performance CPU inference for models. Please make sure you are running Spark NLP 5.4.0 or later, ideally the latest release.
- Model quantization is a computationally expensive process, so it is recommended to use a runtime with more than 32GB memory for exporting the quantized model from HuggingFace.
- You can import Nomic models via `NomicEmbeddings`. These models are usually under the `Sentence Similarity` or `Feature Extraction` category and have `nomic` in their names.
- Some [example models](https://huggingface.co/models?search=Nomic)

## 1. Export and Save the HuggingFace model

- Let's install the `transformers` and `openvino` packages along with their dependencies. You don't need `openvino` installed to use Spark NLP, but we need it to load and save models from HuggingFace.
- We pin `transformers` to version `4.52.4`. Later releases may work as well, but this is the version that has been tested successfully.

In [None]:
!pip install -q --upgrade transformers[onnx]==4.52.4 optimum openvino

[Optimum Intel](https://github.com/huggingface/optimum-intel?tab=readme-ov-file#openvino) is the interface between the Transformers library and the various model optimization and acceleration tools provided by Intel. HuggingFace models loaded with optimum-intel are automatically optimized for OpenVINO, while remaining compatible with the Transformers API.

- We first use the `optimum-cli` tool to export the [nomic-ai/nomic-embed-text-v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1) model to ONNX format for the `feature-extraction` task.
- Then, we use `ov.convert_model()` to convert the exported ONNX model into OpenVINO Intermediate Representation (IR) format (`.xml` and `.bin`) directly in Python.
- The resulting OpenVINO model is saved in the specified directory (`models/nomic-ai/nomic-embed-text-v1`).


Export ONNX model using Optimum CLI

In [None]:
!optimum-cli export onnx --trust-remote-code --task feature-extraction --model nomic-ai/nomic-embed-text-v1 ./onnx_models/nomic-ai/nomic-embed-text-v1

Convert ONNX to OpenVINO IR with FP16 compression


In [3]:
import openvino as ov

MODEL_NAME = "nomic-ai/nomic-embed-text-v1"
!mkdir -p models/$MODEL_NAME

ov_model = ov.convert_model(f"./onnx_models/{MODEL_NAME}/model.onnx")
ov.save_model(ov_model, f"models/{MODEL_NAME}/openvino_model.xml", compress_to_fp16=True)

Save tokenizer vocabulary to assets folder


In [None]:
from transformers import AutoTokenizer

!mkdir -p models/nomic-ai/nomic-embed-text-v1/assets
AutoTokenizer.from_pretrained("bert-base-uncased").save_vocabulary("models/nomic-ai/nomic-embed-text-v1/assets")
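
Before moving on, it can help to confirm that the export directory contains what the later import step will look for. The snippet below is a minimal sketch using only the standard library; `missing_export_files` and `EXPECTED_FILES` are hypothetical names introduced here, and the file list assumes that `save_vocabulary` writes a `vocab.txt` for this BERT-style tokenizer.

```python
from pathlib import Path

# Files the export cells above are expected to leave behind.
# Assumption: save_vocabulary() writes "vocab.txt" into the assets folder.
EXPECTED_FILES = ("openvino_model.xml", "openvino_model.bin", "assets/vocab.txt")


def missing_export_files(model_dir):
    """Return the expected files that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# An empty list means the directory looks ready to be loaded.
missing = missing_export_files("models/nomic-ai/nomic-embed-text-v1")
print("Missing files:", missing or "none")
```

If anything is reported missing, re-running the corresponding export cell is usually quicker than debugging a failed load later.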

## 2. Import and Save Nomic in Spark NLP

- Install and set up Spark NLP in Google Colab
- This example uses specific versions of `pyspark` and `spark-nlp` that have been tested with the transformer model to ensure everything runs smoothly.

In [5]:
!pip install -q pyspark==3.5.4 spark-nlp==5.5.3



Let's start Spark with Spark NLP included via our simple `start()` function

In [6]:
import sparknlp

spark = sparknlp.start()

print("Spark NLP version: ", sparknlp.version())
print("Apache Spark version: ", spark.version)

Spark NLP version:  5.5.3
Apache Spark version:  3.5.4
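
Since OpenVINO support requires Spark NLP 5.4.0 or later, the printed version can also be checked programmatically. This is a small sketch in plain Python; `parse_version` and `supports_openvino` are hypothetical helpers written for this example, not part of Spark NLP. In the notebook you would pass `sparknlp.version()` instead of a literal string.

```python
import re

# Minimum Spark NLP release with OpenVINO support, as stated at the top of this notebook.
MIN_OPENVINO_VERSION = (5, 4, 0)


def parse_version(version):
    """Turn a string such as '5.5.3' into (5, 5, 3); suffixes like 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))  # pad short versions such as '6.0'
    return tuple(parts)


def supports_openvino(version):
    """True when the given Spark NLP version is at least 5.4.0."""
    return parse_version(version) >= MIN_OPENVINO_VERSION


print(supports_openvino("5.5.3"))  # True
```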


- Let's use the `loadSavedModel` function in `NomicEmbeddings`, which allows us to load the OpenVINO model.
- Most params will be set automatically. They can also be set later, after loading the model in `NomicEmbeddings`, so don't worry about setting them now.
- `loadSavedModel` accepts two params: the first is the path to the exported model, and the second is the SparkSession, that is, the `spark` variable we started earlier via `sparknlp.start()`.
- NOTE: `loadSavedModel` accepts local paths in addition to distributed file systems such as `HDFS`, `S3`, `DBFS`, etc. This feature was introduced in the Spark NLP 4.2.2 release. Keep in mind that the best and recommended way to move/share/reuse Spark NLP models is to use `write.save`, so you can use `.load()` from any file system natively.

In [7]:
from sparknlp.annotator import NomicEmbeddings

Nomic = NomicEmbeddings \
    .loadSavedModel("models/nomic-ai/nomic-embed-text-v1", spark) \
    .setInputCols(["documents"]) \
    .setOutputCol("nomic")

Let's save it to disk so it is easier to move around and can be loaded later via the `.load` function

In [8]:
Nomic.write().overwrite().save(f"{MODEL_NAME}_spark_nlp")

Let's clean up stuff we don't need anymore

In [9]:
!rm -rf onnx_models/{MODEL_NAME} models/{MODEL_NAME}

Awesome  😎 !

This is your OpenVINO Nomic model from HuggingFace 🤗  loaded and saved by Spark NLP 🚀

In [10]:
! ls -l {MODEL_NAME}_spark_nlp

total 267996
drwxr-xr-x 3 root root      4096 Jun 23 07:09 fields
drwxr-xr-x 2 root root      4096 Jun 23 07:09 metadata
-rw-r--r-- 1 root root 274413907 Jun 23 07:09 nomic_openvino
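
The size of `nomic_openvino` gives a rough sanity check on the FP16 compression applied earlier. The arithmetic below is only a back-of-the-envelope sketch: it assumes the file is dominated by FP16 weights (2 bytes each) and ignores graph metadata, and `estimate_weight_count` is a hypothetical helper.

```python
FILE_SIZE_BYTES = 274_413_907  # size of nomic_openvino in the listing above
BYTES_PER_FP16_WEIGHT = 2      # FP16 stores each weight in 2 bytes


def estimate_weight_count(size_bytes, bytes_per_weight=BYTES_PER_FP16_WEIGHT):
    """Rough weight count, assuming the file holds almost nothing but weights."""
    return size_bytes / bytes_per_weight


approx_weights = estimate_weight_count(FILE_SIZE_BYTES)
approx_fp32_bytes = approx_weights * 4  # the same weights stored as FP32

print(f"~{approx_weights / 1e6:.0f}M weights")        # ~137M weights
print(f"~{approx_fp32_bytes / 1e6:.0f} MB at FP32")   # ~549 MB at FP32
```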


Now let's see how we can use it on other machines, clusters, or any place you wish to use your new and shiny Nomic model 😊

In [12]:
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

test_data = spark.createDataFrame([
    [1, "query: how much protein should a female eat"],
    [2, "query: summit define"],
    [3, "passage: As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 "
        "is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're "
        "expecting or training for a marathon. Check out the chart below to see how much protein you should "
        "be eating each day."],
    [4, "passage: Definition of summit for English Language Learners. : 1  the highest point of a mountain :"
        " the top of a mountain. : 2  the highest level. : 3  a meeting or series of meetings between the "
        "leaders of two or more governments."]
]).toDF("id", "text")

document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("documents")

nomic = NomicEmbeddings \
    .load(f"{MODEL_NAME}_spark_nlp") \
    .setInputCols(["documents"]) \
    .setOutputCol("nomic")

pipeline = Pipeline().setStages([
    document_assembler,
    nomic
])

model = pipeline.fit(test_data)
results = model.transform(test_data)

results.select("nomic.embeddings").show()

+--------------------+
|          embeddings|
+--------------------+
|[[0.055686906, 0....|
|[[-0.0036336272, ...|
|[[0.004018774, 0....|
|[[-0.018702844, 0...|
+--------------------+
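
The embeddings are typically compared with cosine similarity, for example to rank the passages against a query. The sketch below uses plain Python and toy 3-dimensional vectors, not real model output; `cosine_similarity` and `rank_passages` are hypothetical helpers. With real results, you would first collect the vectors on the driver, for instance via `results.select("id", "nomic.embeddings").collect()`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passages):
    """Return (passage_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in passages.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for one query embedding and two passage embeddings.
query = [1.0, 0.0, 1.0]
passages = {3: [0.9, 0.1, 0.8], 4: [-0.2, 1.0, 0.0]}

for passage_id, score in rank_passages(query, passages):
    print(passage_id, round(score, 3))
```

For large collections this comparison is better done inside Spark rather than on collected rows; the plain-Python version is only meant to show the calculation.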



That's it! You can now go wild and use hundreds of Nomic models from HuggingFace 🤗 in Spark NLP 🚀
