
Spark NLP 5.1.2: Unveiling the First Image-to-Text VisionEncoderDecoder, Over 3,000 ONNX state-of-the-art Transformer Models, Overhaul update in documentation, and bug fixes!

@maziyarpanahi maziyarpanahi released this 26 Sep 07:46
· 174 commits to master since this release
6919f5e

📢 Overview

For the first time, Spark NLP 5.1.2 🚀 proudly presents a new image-to-text annotator designed for captioning images. Additionally, we've added over 3,000 state-of-the-art transformer models in ONNX format to speed up inference in RAG pipelines that rely on LLMs.

We're pleased to announce that our Models Hub now boasts 21,000+ free and truly open-source models & pipelines 🎉. Our deepest gratitude goes out to our community for their invaluable feedback, feature suggestions, and contributions.


🔥 New Features & Enhancements

  • NEW: We're excited to introduce the VisionEncoderDecoderForImageCaptioning annotator, designed specifically for image-to-text captioning. We used Hugging Face's VisionEncoderDecoderModel to import models fine-tuned for automatic image captioning.

The VisionEncoderDecoder can be employed to set up an image-to-text model. The encoding part can utilize any pretrained Transformer-based vision model, such as ViT, BEiT, DeiT, or Swin. Meanwhile, for the decoding part, it can make use of any pretrained language model like RoBERTa, GPT2, BERT, or DistilBERT.

The efficacy of using pretrained checkpoints to initialize image-to-text-sequence models is evident in the study titled TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, and Furu Wei.

Image Captioning Using Hugging Face Vision Encoder Decoder - Step2Step Guide (Part 2)

  • NEW: We've added cutting-edge transformer models in ONNX format for seamless integration. Our annotators will automatically recognize and utilize these models, streamlining your LLM pipelines without any additional setup.

  • We have documented all previously missing features and added examples to the Python and Scala APIs for:

    • E5Embeddings
    • InstructorEmbeddings
    • MPNetEmbeddings
    • OpenAICompletion
    • VisionEncoderDecoderForImageCaptioning
    • DocumentSimilarityRanker
    • BartForZeroShotClassification
    • XlmRoBertaForZeroShotClassification
    • CamemBertForQuestionAnswering
    • DeBertaForSequenceClassification
    • DeBertaForTokenClassification
    • Date2Chunk
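As a rough illustration of how the new VisionEncoderDecoderForImageCaptioning annotator might be wired into a pipeline, here is a minimal Python sketch. It assumes the default model returned by pretrained() fits your use case and that images are loaded with Spark's built-in image data source. The function names build_captioning_pipeline and caption_images, the column names, and the beam size are illustrative choices, not part of this release.

```python
def build_captioning_pipeline(beam_size=2):
    """Return an unfitted Spark ML Pipeline that captions images.

    Hypothetical helper for illustration; beam_size=2 is an assumed default.
    """
    # Imported lazily so this module can be loaded without Spark installed.
    from pyspark.ml import Pipeline
    from sparknlp.base import ImageAssembler
    from sparknlp.annotator import VisionEncoderDecoderForImageCaptioning

    # Converts Spark's image struct into the annotation format Spark NLP expects.
    image_assembler = (
        ImageAssembler()
        .setInputCol("image")
        .setOutputCol("image_assembler")
    )

    # Loads the default pretrained captioning model (downloaded on first use).
    captioner = (
        VisionEncoderDecoderForImageCaptioning.pretrained()
        .setBeamSize(beam_size)
        .setDoSample(False)
        .setInputCols(["image_assembler"])
        .setOutputCol("caption")
    )

    return Pipeline(stages=[image_assembler, captioner])


def caption_images(spark, image_dir):
    """Caption every readable image under image_dir (hypothetical helper)."""
    # dropInvalid skips files that Spark cannot decode as images.
    image_df = (
        spark.read.format("image")
        .option("dropInvalid", True)
        .load(image_dir)
    )
    pipeline = build_captioning_pipeline()
    result = pipeline.fit(image_df).transform(image_df)
    # One row per image: its source path and the generated caption(s).
    return result.selectExpr("image.origin AS image_path", "caption.result AS caption")
```

With an active Spark NLP session, calling caption_images(spark, "path/to/images").show(truncate=False) should print one caption per image; the directory path here is a placeholder.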

πŸ› Bug Fixes

  • We've made a minor adjustment to the beam search algorithm, enhancing the quality of the BART Transformer results.

📓 New Notebooks

Notebooks (each with an Open In Colab link):

  • Vision Encoder Decoder: Image Captioning at Scale in Spark NLP
  • Import Whisper models (ONNX)

📖 Documentation


❤️ Community support

  • Slack: For live discussion with the Spark NLP community and the team
  • GitHub: Bug reports, feature requests, and contributions
  • Discussions: Engage with other community members, share ideas, and show off how you use Spark NLP!
  • Medium: Spark NLP articles
  • YouTube: Spark NLP video tutorials

Installation

Python

#PyPI

pip install spark-nlp==5.1.2

Spark Packages

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.1.2

GPU

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.1.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:5.1.2

Apple Silicon (M1 & M2)

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.1.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-silicon_2.12:5.1.2

AArch64

spark-shell --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.1.2

pyspark --packages com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12:5.1.2
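The same package coordinates can also be passed to a SparkSession from Python instead of on the command line. The sketch below derives the coordinate for each hardware target from the package names listed above and hands it to Spark through the standard spark.jars.packages setting. The helper names spark_nlp_coordinate and start_session, the target labels, and the local[*] master are assumptions made for illustration.

```python
SPARK_NLP_VERSION = "5.1.2"
SCALA_VERSION = "2.12"

# Artifact suffix per hardware target, mirroring the package names in this release.
# The keys ("cpu", "gpu", ...) are illustrative labels, not official identifiers.
_ARTIFACT_SUFFIX = {
    "cpu": "",
    "gpu": "-gpu",
    "silicon": "-silicon",
    "aarch64": "-aarch64",
}


def spark_nlp_coordinate(target="cpu", version=SPARK_NLP_VERSION):
    """Build the Maven coordinate for the given hardware target."""
    if target not in _ARTIFACT_SUFFIX:
        raise ValueError(f"unknown target {target!r}; expected one of {sorted(_ARTIFACT_SUFFIX)}")
    suffix = _ARTIFACT_SUFFIX[target]
    return f"com.johnsnowlabs.nlp:spark-nlp{suffix}_{SCALA_VERSION}:{version}"


def start_session(target="cpu"):
    """Start a local SparkSession with the matching Spark NLP package (hypothetical helper)."""
    # Imported lazily so the coordinate helper works without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName("Spark NLP")
        .master("local[*]")  # assumed local mode; adjust for a cluster
        .config("spark.jars.packages", spark_nlp_coordinate(target))
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
    )
```

For example, spark_nlp_coordinate("gpu") yields the same coordinate used in the GPU commands above, and start_session("gpu") would request that package when the session is created.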

Maven

spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp_2.12</artifactId>
    <version>5.1.2</version>
</dependency>

spark-nlp-gpu:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-gpu_2.12</artifactId>
    <version>5.1.2</version>
</dependency>

spark-nlp-silicon:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-silicon_2.12</artifactId>
    <version>5.1.2</version>
</dependency>

spark-nlp-aarch64:

<dependency>
    <groupId>com.johnsnowlabs.nlp</groupId>
    <artifactId>spark-nlp-aarch64_2.12</artifactId>
    <version>5.1.2</version>
</dependency>

FAT JARs

What's Changed

Full Changelog: 5.1.1...5.1.2