# Run inference in Keras 3 with the OpenVINO™ IR backend

Starting with release 3.8, [Keras](https://github.com/keras-team/keras) provides native integration with the OpenVINO backend for accelerated inference. This integration lets you apply OpenVINO performance optimizations directly within the Keras workflow, giving faster inference on OpenVINO-supported hardware.


In this tutorial, we show how to run inference with an end-to-end [BERT model for classification tasks](https://www.kaggle.com/models/keras/bert/) using the OpenVINO backend.


>**Note**: The OpenVINO backend may currently lack support for some operations. This will be addressed in upcoming Keras releases as operation coverage is being expanded.

>**Note**: The `tensorflow-text` package [isn't provided for Windows after version 2.10](https://github.com/tensorflow/text#a-note-about-different-operating-system-packages). `tensorflow-text==2.16.1` is the last version that supports `macOS x86_64`, but it supports neither `macOS arm` nor `python3.12`. `macOS arm` is supported starting with `tensorflow-text==2.17.0`, and `python3.12` starting with `2.18.1`. This package is required for `BertTokenizer`.


#### Table of contents:

- [Prerequisites](#Prerequisites)
- [Load the model with the OpenVINO backend and inference](#Load-the-model-with-the-OpenVINO-backend-and-inference)
- [Sentiment Classification Example](#Sentiment-Classification-Example)


### Installation Instructions

This is a self-contained example that relies solely on its own code.

We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/README.md#-installation-guide).


## Prerequisites
[back to top ⬆️](#Table-of-contents:)

In [1]:
%pip install -q "openvino>=2025.0.0"
%pip install -q "keras>=3.8" "keras-hub"

Note: you may need to restart the kernel to use updated packages.


In [2]:
from pathlib import Path
import requests


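if not Path("notebook_utils.py").exists():
    r = requests.get(
        url="https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/latest/utils/notebook_utils.py",
    )
    # Fail early on a bad download instead of saving an error page as a module.
    r.raise_for_status()
    # write_text opens, writes, and closes the file in one call.
    Path("notebook_utils.py").write_text(r.text)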


# Read more about telemetry collection at https://github.com/openvinotoolkit/openvino_notebooks?tab=readme-ov-file#-telemetry
from notebook_utils import collect_telemetry

collect_telemetry("keras-with-openvino-backend.ipynb")



## Load the model with the OpenVINO backend and inference
[back to top ⬆️](#Table-of-contents:)

Keras provides a list of general-purpose pretrained models that can be fine-tuned for a specific task.

We will load the BERT model through the [`BertTextClassifier`](https://keras.io/keras_hub/api/base_classes/text_classifier/#textclassifier-class) class. The OpenVINO API provides only inference capabilities, which means that before moving to the OpenVINO backend, you need to train the model on your own data using one of the backends that supports training. Once training is finished, you can move to OpenVINO to speed up inference. The general steps are:

1. Specify the backend using an environment variable.
2. Create a model instance.
3. Run model prediction.

To switch to the OpenVINO backend in Keras 3, set the `KERAS_BACKEND` environment variable to `openvino` or specify the backend in the local configuration file at `~/.keras/keras.json`.
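
As a rough illustration of the configuration-file route, the sketch below writes a `keras.json` with the backend set to `openvino`. Only the `backend` key comes from the text above; the other keys (`floatx`, `epsilon`, `image_data_format`) and their values are assumed to be the usual Keras defaults. The file is written to a temporary directory rather than `~/.keras`, so running the snippet does not overwrite an existing configuration.

```python
import json
import tempfile
from pathlib import Path

# Only "backend" is taken from the tutorial; the remaining values are assumed defaults.
config = {
    "backend": "openvino",
    "floatx": "float32",
    "epsilon": 1e-07,
    "image_data_format": "channels_last",
}

# Write to a scratch directory instead of ~/.keras to avoid clobbering a real config.
config_dir = Path(tempfile.mkdtemp()) / ".keras"
config_dir.mkdir(parents=True, exist_ok=True)
config_path = config_dir / "keras.json"
config_path.write_text(json.dumps(config, indent=4))

# Read the file back to confirm what Keras would find there.
loaded = json.loads(config_path.read_text())
print(loaded["backend"])
```

To apply the setting for real, the same content would go into `~/.keras/keras.json`. In this tutorial the environment variable is used instead, which keeps the choice local to the notebook.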

In [5]:
import os

# Set the backend before importing Keras or KerasHub: it is read at import time.
os.environ["KERAS_BACKEND"] = "openvino"
import numpy as np
import keras_hub

Create a model instance, using a model preset from [KerasHub](https://keras.io/keras_hub/presets/).

In [6]:
bert = keras_hub.models.BertTextClassifier.from_preset(
    "bert_base_en_uncased",
    num_classes=4,
)

TypeError: <class 'keras_hub.src.models.bert.bert_tokenizer.BertTokenizer'> could not be deserialized properly. Please ensure that components that are Python object instances (layers, models, etc.) returned by `get_config()` are explicitly deserialized in the model's `from_config()` method.

config={'module': 'keras_hub.src.models.bert.bert_tokenizer', 'class_name': 'BertTokenizer', 'config': {'name': 'bert_tokenizer', 'trainable': True, 'dtype': {'module': 'keras', 'class_name': 'DTypePolicy', 'config': {'name': 'int32'}, 'registered_name': None}, 'config_file': 'tokenizer.json', 'vocabulary': None, 'sequence_length': None, 'lowercase': True, 'strip_accents': False, 'split': True, 'suffix_indicator': '##', 'oov_token': '[UNK]', 'special_tokens': None, 'special_tokens_in_strings': False}, 'registered_name': 'keras_hub>BertTokenizer'}.

Exception encountered: Error when deserializing class 'BertTokenizer' using config={'name': 'bert_tokenizer', 'trainable': True, 'dtype': 'int32', 'config_file': 'tokenizer.json', 'vocabulary': None, 'sequence_length': None, 'lowercase': True, 'strip_accents': False, 'split': True, 'suffix_indicator': '##', 'oov_token': '[UNK]', 'special_tokens': None, 'special_tokens_in_strings': False}.

Exception encountered: BertTokenizer requires `tensorflow` and `tensorflow-text` for text processing. Run `pip install tensorflow-text` to install both packages or visit https://www.tensorflow.org/install

If `tensorflow-text` is already installed, try importing it in a clean python session. Your installation may have errors.

KerasHub uses `tf.data` and `tensorflow-text` to preprocess text on all Keras backends. If you are running on Jax or Torch, this installation does not need GPU support.

Run model prediction for raw string data.

In [None]:
features = ["The quick brown fox jumped.", "I forgot my homework."]

bert.predict(x=features, batch_size=2)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 398ms/step


array([[ 0.14641605,  0.33292952, -0.07132149,  0.2362039 ],
       [ 0.14057046,  0.2972596 , -0.02436665,  0.29821312]],
      dtype=float32)

Run model prediction for preprocessed integer data. You can obtain this data with any tokenizer; in the previous example, the default tokenizer performed this step automatically.

In [7]:
features = {
    "token_ids": np.ones(shape=(2, 12), dtype="int32"),
    "segment_ids": np.array([[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0]] * 2),
    "padding_mask": np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]] * 2),
}

bert = keras_hub.models.BertTextClassifier.from_preset(
    "bert_base_en_uncased",
    num_classes=4,
    preprocessor=None,
)

predictions = bert.predict(x=features, batch_size=2)

predictions

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 446ms/step


array([[-0.03224922,  0.09847151,  0.32198498,  0.09585449],
       [-0.03224922,  0.09847151,  0.32198498,  0.09585449]],
      dtype=float32)
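
To make the three input arrays less opaque, here is a minimal sketch of how two tokenized segments could be packed into the `token_ids` / `segment_ids` / `padding_mask` layout used above. `pack_inputs` is a hypothetical helper, not a KerasHub function, and the special-token ids (`0`, `101`, `102`) are assumed values for the uncased BERT vocabulary that should be checked against the actual tokenizer. The token ids in the example are made up.

```python
# Assumed special-token ids for the uncased BERT vocabulary; verify against your tokenizer.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102


def pack_inputs(first, second, sequence_length):
    """Pack two tokenized segments into BERT-style inputs (hypothetical helper)."""
    # Layout: [CLS] first... [SEP] second... [SEP] followed by padding.
    tokens = [CLS_ID] + first + [SEP_ID] + second + [SEP_ID]
    # Segment 0 covers [CLS], the first segment and its [SEP]; segment 1 covers the rest.
    segments = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    if len(tokens) > sequence_length:
        raise ValueError("segments do not fit into sequence_length")
    pad = sequence_length - len(tokens)
    return {
        "token_ids": tokens + [PAD_ID] * pad,
        "segment_ids": segments + [0] * pad,
        "padding_mask": [1] * len(tokens) + [0] * pad,
    }


# Made-up token ids, used purely for illustration.
packed = pack_inputs([2023, 2003, 1037], [2307, 3185, 999, 1012], sequence_length=12)
for name, values in packed.items():
    print(name, values)
```

With these segment lengths, the result has the same `segment_ids` and `padding_mask` pattern as the cell above. Before calling `bert.predict`, each list would need to be wrapped into a batched integer array, for example `np.asarray([packed["token_ids"]], dtype="int32")`.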

## Sentiment Classification Example
[back to top ⬆️](#Table-of-contents:)

This example demonstrates how to use the pre-trained BERT model [bert_tiny_en_uncased_sst2](https://www.kaggle.com/models/keras/bert/keras/bert_tiny_en_uncased_sst2) from [KerasHub](https://keras.io/keras_hub/presets/) to perform sentiment classification on a set of sentences. The model predicts whether each sentence expresses a positive or negative sentiment.

In [12]:
import tensorflow as tf


bert = keras_hub.models.BertTextClassifier.from_preset(
    "bert_tiny_en_uncased_sst2",
    num_classes=2,
)

sentences = [
    "the movie was a complete waste of time.",
    "the plot was predictable and boring.",
    "i absolutely loved this movie, it was fantastic!",
    "an excellent movie that i would highly recommend.",
]


def get_sentiment(text):
    predictions = bert.predict([text])
    probabilities = tf.nn.softmax(predictions, axis=1).numpy()
    sentiment = np.argmax(probabilities, axis=1)[0]
    sentiment_label = "positive" if sentiment == 1 else "negative"

    return sentiment_label


def display_results(results):
    max_sentence_length = max(len(result["Sentence"]) for result in results)
    max_sentiment_length = max(len(result["Sentiment"]) for result in results)

    print(f"{'Sentence':<{max_sentence_length}}  {'Sentiment':<{max_sentiment_length}}")
    print("-" * (max_sentence_length + max_sentiment_length + 4))

    for result in results:
        sentence = result["Sentence"]
        sentiment = result["Sentiment"]
        print(f"{sentence:<{max_sentence_length}}  {sentiment:<{max_sentiment_length}}")


results = []
for sentence in sentences:
    sentiment = get_sentiment(sentence)
    results.append({"Sentence": sentence, "Sentiment": sentiment})


display_results(results)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 234ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 170ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 165ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 165ms/step
Sentence                                           Sentiment
-------------------------------------------------------------
the movie was a complete waste of time.            negative
the plot was predictable and boring.               negative
i absolutely loved this movie, it was fantastic!   positive
an excellent movie that i would highly recommend.  positive
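
`get_sentiment` uses TensorFlow only for the softmax step. As a sketch of what that step computes, the standard-library version below converts logits into probabilities and picks a label. The logit values are invented for illustration and are not model output; the label order (index 1 is positive) follows `get_sentiment`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits for two sentences; real values would come from bert.predict(...).
batch_logits = [[2.1, -1.3], [-0.7, 1.9]]
labels = ["negative", "positive"]  # index 1 means positive, as in get_sentiment

predicted = []
for logits in batch_logits:
    probabilities = softmax(logits)
    # Index of the largest probability, i.e. a plain-Python argmax.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    predicted.append(labels[best])
    print(labels[best], round(probabilities[best], 3))
```

Because softmax preserves the ordering of its inputs, taking the argmax of the raw logits gives the same label; the softmax is only needed when the probabilities themselves are of interest.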


In [None]:
# Cleanup
# %pip uninstall -q -y "tensorflow-cpu" tensorflow keras