# Keras NLP - BERT Base Multi Sentiment Analysis example

An example of sentiment analysis with the BERT Base Multi model using Keras NLP / Keras Hub. This notebook is written for Google Colab; use a GPU runtime with at least a T4 (tested on a T4).

First, install keras_nlp (or keras_hub, which is currently the same thing).

In [None]:
!pip install keras_nlp



#### Imports

In [None]:
import os
import keras
import tensorflow as tf
import numpy as np
from keras import layers
import keras_nlp
import keras_hub


#### Setup

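KERAS_BACKEND specifies which backend is used for computation; the valid values are tensorflow, torch (PyTorch) and jax. The variable is read when keras is first imported, so it only takes effect if it is set before that import. The second line sets the precision policy.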

In [None]:
os.environ["KERAS_BACKEND"] = "tensorflow"
keras.mixed_precision.set_global_policy("mixed_float16")

The dtype policy mixed_float16 may run slowly because this machine does not have a GPU. Only Nvidia GPUs with compute capability of at least 7.0 run quickly with mixed_float16.


#### Loading data

First, unzip the data (or load it in a different way). The file is available at https://github.com/immm00/diplomka/blob/main/datasets/splits/extracted/train-validation-test.zip.

In [None]:
!unzip train-validation-test.zip

Data is loaded using text_dataset_from_directory, which expects a specific directory structure. The data is split into three folders: train, validation and test. Each of these contains one subfolder per class, in this case positive, negative and neutral. The class subfolders hold individual text files, and each text file is one instance (one line of text).
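As a rough illustration of that layout, the sketch below builds a miniature copy of the tree in a temporary directory and lists what it finds. The file name `0001.txt` and the placeholder sentence are made up for the example; only the folder names come from the description above.

```python
import tempfile
from pathlib import Path

SPLITS = ("train", "validation", "test")
CLASSES = ("negative", "neutral", "positive")


def build_example_tree(root: Path) -> None:
    """Create a tiny directory tree in the layout described above."""
    for split in SPLITS:
        for cls in CLASSES:
            class_dir = root / split / cls
            class_dir.mkdir(parents=True, exist_ok=True)
            # One text file per instance; the file name is arbitrary (hypothetical here).
            (class_dir / "0001.txt").write_text("placeholder sentence\n", encoding="utf-8")


def list_classes(split_dir: Path) -> list:
    """Return the class subfolder names of one split, sorted alphabetically."""
    return sorted(p.name for p in split_dir.iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "train-validation-test"
    build_example_tree(root)
    found_splits = sorted(p.name for p in root.iterdir())
    found_classes = list_classes(root / "train")
    n_files = len(list(root.glob("*/*/*.txt")))
    print(found_splits, found_classes, n_files)
```

Running a check like `list_classes` on the real unzipped folder is a quick way to confirm that every split contains the same three class subfolders before loading.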

Data will be processed in batches. Batch size is set to 32 here.

In [None]:

batch_size = 32
raw_train_ds = keras.utils.text_dataset_from_directory(
    "train-validation-test/train",
    batch_size=batch_size
)
raw_val_ds = keras.utils.text_dataset_from_directory(
    "train-validation-test/validation",
    batch_size=batch_size
)
raw_test_ds = keras.utils.text_dataset_from_directory(
    "train-validation-test/test",
    batch_size=batch_size
)

print(f"Number of batches in raw_train_ds: {raw_train_ds.cardinality()}")
print(f"Number of batches in raw_val_ds: {raw_val_ds.cardinality()}")
print(f"Number of batches in raw_test_ds: {raw_test_ds.cardinality()}")


Found 4059 files belonging to 3 classes.
Found 871 files belonging to 3 classes.
Found 871 files belonging to 3 classes.
Number of batches in raw_train_ds: 127
Number of batches in raw_val_ds: 28
Number of batches in raw_test_ds: 28
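The batch counts reported above follow from dividing the number of files by the batch size and rounding up, since the last batch may be only partly filled. A minimal sketch of that arithmetic, using the file counts from the output (the helper name `num_batches` is invented for illustration):

```python
import math


def num_batches(num_files: int, batch_size: int) -> int:
    """Number of batches needed when the final batch may be smaller."""
    return math.ceil(num_files / batch_size)


batch_size = 32
train_batches = num_batches(4059, batch_size)
val_batches = num_batches(871, batch_size)

# Size of the final, partly filled training batch.
last_train_batch = 4059 - (train_batches - 1) * batch_size

print(train_batches, val_batches, last_train_batch)
```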


#### Initializing the model

Using keras_nlp, a specific pretrained model is loaded as a part of a classifier. Available pretrained models are listed here: https://keras.io/keras_hub/presets/.

The multilingual model bert_base_multi is used, as no pretrained model specifically for Czech is available. It is pretrained on the Wikipedias of different languages.

The number of classes is set to 3 (positive, negative, neutral).

The summary shows the layers, parameter counts, and so on. The classifier includes a preprocessor (BertTokenizer), so tokenization happens automatically and there is no need to preprocess the data beforehand.

On the extracted economics dataset, fine-tuning for 3 epochs takes around 10 minutes on a T4 GPU.

In [None]:
classifier = keras_nlp.models.BertClassifier.from_preset(
    "bert_base_multi",
    num_classes=3,
)

classifier.summary()



#### Fine-tuning

The pretrained model needs to be fine-tuned for the sentiment analysis task, using the training and validation data. The number of epochs is how many times the model passes through the entire training dataset during training. It is set to 3 here.

The output shows the step number, elapsed time, and evaluation metrics (loss and sparse categorical accuracy for both the training and validation data).

In [None]:
classifier.fit(
    raw_train_ds,
    validation_data=raw_val_ds,
    epochs=3,
)

127/127 ━━━━━━━━━━━━━━━━━━━━ 244s 1s/step - loss: 1.0402 - sparse_categorical_accuracy: 0.4593 - val_loss: 0.8545 - val_sparse_categorical_accuracy: 0.6498


<keras.src.callbacks.history.History at 0x79aff1358760>

#### Evaluation

Model performance on the test data can then be checked with the evaluate function. For the pretrained models, it reports only the loss and sparse categorical accuracy. Other metrics either have to be defined manually and set on the classifier with the compile function, or calculated separately (see the notebook for cross-validation).

In [None]:
classifier.evaluate(raw_test_ds)
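As a sketch of the "calculated separately" route, the example below computes per-class and macro-averaged F1 from two plain lists of integer labels. The lists `y_true` and `y_pred` are made-up toy values, not results from this model. In the notebook they would have to be collected from the test dataset and from the argmax of the model's predictions, taking care to gather labels and predictions in the same pass over the dataset, because text_dataset_from_directory shuffles by default.

```python
def per_class_f1(y_true, y_pred, num_classes):
    """F1 score for each class, computed from true/false positives and negatives."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that never occurs and is never predicted gets 0.0 by convention here.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, num_classes)
    return sum(scores) / len(scores)


# Toy labels for illustration only (0 = negative, 1 = neutral, 2 = positive).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

class_scores = per_class_f1(y_true, y_pred, num_classes=3)
macro = macro_f1(y_true, y_pred, num_classes=3)
print(class_scores, macro)
```

Macro averaging weights the three classes equally, which can be more informative than accuracy when the classes are imbalanced.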

#### Prediction example

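To predict the sentiment of a text, the predict function is used. Since it takes batches of data, the example string is passed inside a list. The output of the predict function is three numbers per input, one score for each class. These are raw scores (logits) rather than probabilities, so they can be negative. NumPy's argmax is used to get the most probable class.

The class labels are 0 for negative, 1 for neutral and 2 for positive. text_dataset_from_directory assigns them from the alphabetical order of the class subfolder names, so the mapping depends on how the data is loaded and can be changed before the model is fine-tuned.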

In [None]:
example_string = "Dnes je venku velmi hezky."

predictions = classifier.predict([example_string])

predicted_class = np.argmax(predictions, axis=-1)

print(predictions)
print(f"Predicted class: {predicted_class[0]}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 344ms/step
[[-1.408   -0.01519  1.582  ]]
[2]
Predicted class: 2
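The scores printed above are easier to read as probabilities with a class name attached. The sketch below applies a softmax to the logits copied from that output and maps the winning index to a label. It assumes the label order is the alphabetical order of the class folder names, as described earlier; the names `softmax` and `CLASS_NAMES` are introduced here for illustration only.

```python
import math

# Assumed label order: alphabetical order of the class subfolder names.
CLASS_NAMES = sorted(["positive", "negative", "neutral"])


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the prediction output above.
logits = [-1.408, -0.01519, 1.582]
probs = softmax(logits)

predicted_index = max(range(len(probs)), key=probs.__getitem__)
predicted_label = CLASS_NAMES[predicted_index]

print([round(p, 3) for p in probs], predicted_label)
```

Softmax does not change which class wins, so the argmax over the logits and over the probabilities agree; the conversion only makes the model's confidence easier to interpret.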
