![JohnSnowLabs](https://sparknlp.org/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/transformers/onnx/HuggingFace_ONNX_in_Spark_NLP_BartForZeroShot.ipynb)

# Import ONNX BartTransformer models from HuggingFace 🤗 into Spark NLP 🚀

Let's keep in mind a few things before we start 😊

- ONNX support was introduced in `Spark NLP 5.0.0`, enabling high performance inference for models. Please make sure you have upgraded to the latest Spark NLP release.
- The BartForZeroShot model was introduced in `Spark NLP 5.1.0` and requires Spark version `3.4.1` and up.
- Official models are supported, but not all custom models may work.

## Export and Save HuggingFace model

- Let's install the `transformers` package with the `onnx` extension and its dependencies. You don't need `onnx` to be installed for Spark NLP; however, we need it to load and save models from HuggingFace.
- This workflow was tested with `transformers` version `4.31.0`. The install command below does not pin a version, so it pulls the latest release. Newer releases may well work, but we want you to know which version has been tested successfully.

In [None]:
!pip install -q --upgrade transformers optimum  onnx onnxruntime

- HuggingFace has an extension called Optimum which offers specialized model inference, including ONNX. We can use this to import and export ONNX models with `from_pretrained` and `save_pretrained`.
- We'll use the [sshleifer/distilbart-xsum-12-6](https://huggingface.co/sshleifer/distilbart-xsum-12-6) model from HuggingFace as an example and export it with the `optimum-cli`.
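
The `from_pretrained` / `save_pretrained` route mentioned above can be sketched with Optimum's Python API instead of the CLI. This is a minimal sketch rather than the path this notebook was tested with: the helper name `export_bart_to_onnx` is hypothetical, and the exact set of ONNX files written may differ from what `optimum-cli` produces, depending on the installed Optimum version.

```python
def export_bart_to_onnx(model_name: str, export_path: str) -> str:
    """Export a HuggingFace seq2seq checkpoint to ONNX and save it with its tokenizer."""
    # Imported lazily so that defining this helper does not require the packages.
    from optimum.onnxruntime import ORTModelForSeq2SeqLM
    from transformers import AutoTokenizer

    # export=True converts the PyTorch weights to ONNX while loading.
    model = ORTModelForSeq2SeqLM.from_pretrained(model_name, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Both calls write into the same directory.
    model.save_pretrained(export_path)
    tokenizer.save_pretrained(export_path)
    return export_path
```

Calling `export_bart_to_onnx(MODEL_NAME, EXPORT_PATH)` would download the checkpoint and write the exported files to `EXPORT_PATH`.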

In [2]:
MODEL_NAME = "sshleifer/distilbart-xsum-12-6"
EXPORT_PATH = f"export_onnx/{MODEL_NAME}"

In [3]:
!optimum-cli export onnx --task text2text-generation-with-past --model {MODEL_NAME} {EXPORT_PATH}

2025-05-13 14:31:06.182461: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1747146666.544555    1430 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1747146666.638646    1430 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-13 14:31:07.367006: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
config.json: 100% 1.59k/1.59k [00:00<00:00, 9.61MB/s]
pytorch_model.bin: 100% 611M/611M [00:02<00:00, 249MB/s]
model.safetens

We have to move additional model assets into a separate folder so that Spark NLP can load them properly.

Let's create an `assets` directory, write the vocabulary into it as `vocab.txt`, move `merges.txt` next to it, and then look at what the export directory contains:

In [4]:
!mkdir {EXPORT_PATH}/assets

In [9]:
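import json

# Load the tokenizer vocabulary (token -> id mapping) written by the export
with open(f"{EXPORT_PATH}/vocab.json") as f:
    output_json = json.load(f)

# Write one token per line, in the order the keys appear in vocab.json
with open(f"{EXPORT_PATH}/assets/vocab.txt", "w") as f:
    for key in output_json.keys():
        print(key, file=f)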
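
The vocabulary conversion relies on the key order of `vocab.json`. If the line number in `vocab.txt` is meant to match the token id (an assumption about how the file is consumed, not something this notebook states), sorting by id makes that explicit. The helper name `write_vocab_txt` is hypothetical.

```python
import json
from pathlib import Path


def write_vocab_txt(vocab_json_path: str, vocab_txt_path: str) -> int:
    """Write one token per line, ordered by token id. Returns the token count."""
    with open(vocab_json_path, encoding="utf-8") as f:
        vocab = json.load(f)  # maps token -> integer id

    # Sort by id so that line N holds the token with id N
    # (assumes ids are contiguous and start at 0).
    tokens = [token for token, _ in sorted(vocab.items(), key=lambda kv: kv[1])]

    Path(vocab_txt_path).parent.mkdir(parents=True, exist_ok=True)
    with open(vocab_txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens) + "\n")
    return len(tokens)
```

For a `vocab.json` whose keys are already stored in id order, this writes the same file as the loop over `keys()`.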

In [10]:
!mv {EXPORT_PATH}/merges.txt {EXPORT_PATH}/assets

mv: cannot stat 'export_onnx/sshleifer/distilbart-xsum-12-6/merges.txt': No such file or directory
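
`mv` prints an error like this when `merges.txt` is not at the top level of the export directory, for example when the cell is re-run after the file has already been moved. Whether that is what happened here is an assumption. A re-runnable version of the move can be sketched in Python; the helper name `move_to_assets` is hypothetical.

```python
import shutil
from pathlib import Path


def move_to_assets(export_path: str, filename: str) -> bool:
    """Move `filename` into `<export_path>/assets`; safe to call more than once.

    Returns True if the file ends up in the assets folder, False if it
    cannot be found in either location.
    """
    export_dir = Path(export_path)
    assets_dir = export_dir / "assets"
    assets_dir.mkdir(parents=True, exist_ok=True)

    source = export_dir / filename
    target = assets_dir / filename
    if source.exists():
        shutil.move(str(source), str(target))
    return target.exists()
```

A `False` result from `move_to_assets(EXPORT_PATH, "merges.txt")` would mean the export did not produce the file at all, which is worth checking before loading the model.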


In [11]:
!ls -l {EXPORT_PATH}

total 3152352
drwxr-xr-x 2 root root      4096 May 13 14:34 assets
-rw-r--r-- 1 root root      1662 May 13 14:31 config.json
-rw-r--r-- 1 root root 819866018 May 13 14:33 decoder_model_merged.onnx
-rw-r--r-- 1 root root 819603498 May 13 14:33 decoder_model.onnx
-rw-r--r-- 1 root root 769174126 May 13 14:33 decoder_with_past_model.onnx
-rw-r--r-- 1 root root 814962649 May 13 14:31 encoder_model.onnx
-rw-r--r-- 1 root root       329 May 13 14:31 generation_config.json
-rw-r--r-- 1 root root       957 May 13 14:31 special_tokens_map.json
-rw-r--r-- 1 root root      1243 May 13 14:31 tokenizer_config.json
-rw-r--r-- 1 root root   3558642 May 13 14:31 tokenizer.json
-rw-r--r-- 1 root root    798293 May 13 14:31 vocab.json
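
Before handing the directory to Spark NLP, a quick layout check can catch a missing file early. This is a sketch: the file names come from the listing above plus the two asset files, and treating every one of them as required is an assumption. `missing_export_files` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the directory listing plus the two asset files;
# treating all of them as required is an assumption.
EXPECTED_FILES = (
    "encoder_model.onnx",
    "decoder_model.onnx",
    "decoder_with_past_model.onnx",
    "assets/vocab.txt",
    "assets/merges.txt",
)


def missing_export_files(export_path: str, expected=EXPECTED_FILES) -> list:
    """Return the expected files that are absent from the export directory."""
    root = Path(export_path)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `missing_export_files(EXPORT_PATH)` means every listed file is in place.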


## Import and Save BartTransformer in Spark NLP


- Let's install and set up Spark NLP in Google Colab
- This part is pretty easy via our simple script

In [28]:
! wget -q http://setup.johnsnowlabs.com/colab.sh -O - | bash
!pip install pyspark==3.4.0

Installing PySpark 3.2.3 and Spark NLP 6.0.0
setup Colab for PySpark 3.2.3 and Spark NLP 6.0.0
Collecting pyspark==3.4.0
  Downloading pyspark-3.4.0.tar.gz (310.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 310.8/310.8 MB 4.9 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting py4j==0.10.9.7 (from pyspark==3.4.0)
  Downloading py4j-0.10.9.7-py2.py3-none-any.whl.metadata (1.5 kB)
Downloading py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.5/200.5 kB 13.9 MB/s eta 0:00:00
Building wheels for collected packages: pyspark
  Building wheel for pyspark (setup.py) ... done
  Created wheel for pyspark: filename=pyspark-3.4.0-py2.py3-none-any.whl size=311317124 sha256=91bca7965dd64906d3a58c9b58f1f64b0b02c8104e1b12e87ab21bb943352206
  Stored in directory: /root/.cache/pip/wheels/fc/49/ad/5c21e362b2cc9fb6785cdf03f7864b

Let's start Spark with Spark NLP included via our simple `start()` function

In [1]:
import sparknlp
# let's start Spark with Spark NLP
spark = sparknlp.start()


- Let's use the `loadSavedModel` function in `BartTransformer`, which allows us to load the ONNX model we just exported
- Most params can be set later when you load this model in `BartTransformer` at runtime, like `setMaxOutputLength`, so don't worry about what you set now
- `loadSavedModel` accepts two params: the first is the path to the exported model directory, the second is the SparkSession, i.e. the `spark` variable we previously started via `sparknlp.start()`
- NOTE: `loadSavedModel` accepts local paths in addition to distributed file systems such as `HDFS`, `S3`, `DBFS`, etc. This feature was introduced in the Spark NLP 4.2.2 release. Keep in mind the best and recommended way to move/share/reuse Spark NLP models is to use `write.save` so you can use `.load()` from any file system natively.
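
Since generation parameters can be changed after loading, they can be grouped in a small helper and applied to either the freshly imported or the reloaded annotator. This is a sketch under assumptions: it relies on the `setTask` and `setMaxOutputLength` setters of `BartTransformer` (check the API reference for your Spark NLP version), the task prefix and default length are illustrative, and `configure_for_summarization` is a hypothetical name.

```python
def configure_for_summarization(bart, max_output_length: int = 200):
    """Apply generation settings to a BartTransformer-like annotator.

    `bart` is expected to expose Spark NLP's chainable setters; the task
    prefix and the default length here are illustrative assumptions.
    """
    return (
        bart
        .setTask("summarize:")
        .setMaxOutputLength(max_output_length)
    )
```

The helper returns the same annotator, so it can be dropped into an existing setter chain.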

In [18]:
from sparknlp.annotator import *
from sparknlp.base import *

EXPORT_PATH = f"export_onnx/{MODEL_NAME}"

zero_shot_classifier = BartTransformer.loadSavedModel(
    EXPORT_PATH,
    spark
    )\
    .setInputCols(["document"]) \
    .setOutputCol("generation")

- Let's save it on disk so it is easier to move around and can be used later via the `.load` function

In [19]:
zero_shot_classifier.write().overwrite().save("./{}_spark_nlp".format(EXPORT_PATH))
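
Because `EXPORT_PATH` includes the `export_onnx/` prefix, the model is saved under that folder rather than under `{MODEL_NAME}_spark_nlp`. The snippet below only evaluates the path expression from the save call, so the location is easy to see.

```python
MODEL_NAME = "sshleifer/distilbart-xsum-12-6"
EXPORT_PATH = f"export_onnx/{MODEL_NAME}"

# Same expression as in the save call.
saved_model_path = "./{}_spark_nlp".format(EXPORT_PATH)
print(saved_model_path)
# ./export_onnx/sshleifer/distilbart-xsum-12-6_spark_nlp
```

This is the path to use when listing or reloading the saved model.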

Let's clean up stuff we don't need anymore

In [20]:
!rm -rf {EXPORT_PATH}

Awesome 😎  !

This is your BartTransformer model from HuggingFace 🤗  loaded and saved by Spark NLP 🚀

In [21]:
! ls -l {EXPORT_PATH}_spark_nlp


Now let's see how we can use it on other machines, clusters, or any place you wish to use your new and shiny BartTransformer model 😊

In [3]:
from sparknlp.annotator import *
from sparknlp.base import *
import sparknlp

zero_shot_classifier_loaded = BartTransformer.load("./{}_spark_nlp".format(EXPORT_PATH))\
    .setInputCols(["document"]) \
    .setOutputCol("generation")

This is how you can use your loaded BartTransformer model in a Spark NLP 🚀 pipeline:

In [4]:
from pyspark.ml import Pipeline, PipelineModel

document_assembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

pipeline = Pipeline(stages=[
    document_assembler,
    zero_shot_classifier_loaded
])

test_data = spark.createDataFrame([
    ["Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a " +
       "downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness" +
       " of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this " +
       "paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework " +
       "that converts all text-based language problems into a text-to-text format. Our systematic study compares " +
       "pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens " +
       "of language understanding tasks. By combining the insights from our exploration with scale and our new " +
       "Colossal Clean Crawled Corpus, we achieve state-of-the-art results on many benchmarks covering " +
       "summarization, question answering, text classification, and more. To facilitate future work on transfer " +
       "learning for NLP, we release our data set, pre-trained models, and code."]
]).toDF("text")

model = pipeline.fit(test_data)
model.transform(test_data).select("generation.result").show()

+--------------------+
|              result|
+--------------------+
|[In this paper, w...|
+--------------------+



That's it! You can now go wild and use `BartTransformer` models from HuggingFace 🤗 for text generation tasks such as summarization in Spark NLP 🚀