# Multi-Modal: Natural Language for MultiClass

Let's explore how to combine tabular data with another modality, in this case natural-language review text. We'll try to predict a customer's 1-5 star rating of a book.

We'll be using the data set from: https://www.kaggle.com/code/meetnagadia/amazon-kindle-book-sentiment-analysis

We'll treat each star rating as its own class.

In [None]:
!pip install autogluon

## Imports

In [1]:
from autogluon.tabular import TabularDataset, TabularPredictor

## Data

In [4]:
data = TabularDataset("/content/review.csv")

In [5]:
data.columns

Index(['asin', 'rating', 'reviewText', 'reviewerID', 'reviewerName'], dtype='object')

In [6]:
data.head()

| | asin | rating | reviewText | reviewerID | reviewerName |
|---|---|---|---|---|---|
| 0 | B0033UV8HI | 3 | Jace Rankin may be short, but he's nothing to ... | A3HHXRELK8BHQG | Ridley |
| 1 | B002HJV4DE | 5 | Great short read. I didn't want to put it dow... | A2RGNZ0TRF578I | Holly Butler |
| 2 | B002ZG96I4 | 3 | I'll start by saying this is the first of four... | A3S0H2HV6U1I7F | Merissa |
| 3 | B002QHWOEU | 3 | Aggie is Angela Lansbury who carries pocketboo... | AC4OQW3GZ919J | Cleargrace |
| 4 | B001A06VJ8 | 4 | I did not expect this type of book to be in li... | A3C9V987IQHOQD | Rjostler |


Note that we won't clean up this data or remove the unique identifiers. AutoGluon engineers features automatically based on the natural-language text and unique values it detects.

## Train Test Split

In [7]:
train_size = int(len(data) * 0.8)
seed = 42
train_data = data.sample(train_size, random_state=seed)
test_data = data.drop(train_data.index)
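
The random split above does not guarantee that every rating appears in the same proportion in both sets. As a possible alternative, here is a minimal sketch of a stratified 80/20 split using pandas' `groupby(...).sample`. The frame `toy` and the names `train_toy` and `test_toy` are hypothetical stand-ins for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the review data (hypothetical values, not the real dataset).
toy = pd.DataFrame({
    "rating": [1] * 10 + [2] * 10 + [3] * 20 + [4] * 30 + [5] * 30,
    "reviewText": [f"review {i}" for i in range(100)],
})

seed = 42
# Sample 80% of the rows within each rating so class proportions are preserved.
train_toy = toy.groupby("rating", group_keys=False).sample(frac=0.8, random_state=seed)
# Everything that was not sampled becomes the test set, as in the cell above.
test_toy = toy.drop(train_toy.index)

# Compare the class counts in both splits.
print(train_toy["rating"].value_counts().sort_index())
print(test_toy["rating"].value_counts().sort_index())
```

With the real data, the same two lines would apply to `data` instead of `toy`; whether stratification changes the results noticeably here is not something the notebook measures.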

In [8]:
save_path = 'book_rating'

In [9]:
predictor = TabularPredictor(label="rating", path=save_path)


Let's pass **multimodal** to the *hyperparameters* argument to tell AutoGluon that we want it to use its multimodal tuning procedure.
This additionally trains a Transformer network.

Note that you need at least one NVIDIA GPU to train a transformer in AutoGluon.
Otherwise, just remove the *hyperparameters* argument and AutoGluon trains as usual.
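
The fallback described above can be sketched as a small helper that builds the keyword arguments for `fit()`. The function name `build_fit_kwargs` is hypothetical, and checking for the `nvidia-smi` tool is only a rough heuristic assumed here for illustration; it is not how AutoGluon itself detects GPUs.

```python
import shutil


def build_fit_kwargs(gpu_available: bool) -> dict:
    """Return keyword arguments for predictor.fit() depending on GPU availability."""
    # With a GPU, request the multimodal configuration. Without one, pass
    # nothing extra so that fit() uses its default tabular models.
    return {"hyperparameters": "multimodal"} if gpu_available else {}


# Heuristic (assumption): treat the presence of nvidia-smi as "GPU available".
gpu_available = shutil.which("nvidia-smi") is not None
fit_kwargs = build_fit_kwargs(gpu_available)
print(fit_kwargs)

# Intended use in this notebook:
# predictor.fit(train_data, **fit_kwargs)
```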

In [10]:
predictor.fit(train_data, hyperparameters="multimodal")

Beginning AutoGluon training ...
AutoGluon will save models to "book_rating/"
AutoGluon Version:  0.7.0
Python Version:     3.10.11
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Sat Apr 29 09:15:28 UTC 2023
Train Data Rows:    9600
Train Data Columns: 4
Label Column: rating
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	5 unique label values:  [3, 1, 2, 4, 5]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 5
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    12126.25 MB
	Train Data (Original)  Memory Usage: 8.62 MB (0.1% of available memory)
	Inferring data type of each feature based o

INFO:pytorch_lightning.utilities.rank_zero:Using 16bit None Automatic Mixed Precision (AMP)
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name              | Type                         | Params
-------------------------------------------------------------------
0 | model             | HFAutoModelForTextPrediction | 108 M 
1 | validation_metric | Accuracy                     | 0     
2 | loss_func         | CrossEntropyLoss             | 0     
-------------------------------------------------------------------
108 M     Trainable params
0         Non-trainable params
108 M     T

<autogluon.tabular.predictor.predictor.TabularPredictor at 0x7f89b5636fe0>

In [11]:
predictor.get_model_best()

'WeightedEnsemble_L2'

In [12]:
predictor.fit_summary();

loading file vocab.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json
loading configuration file /content/book_rating/models/MultiModalPredictor/automm_model/hf_text/config.json
Model config ElectraConfig {
  "_name_or_path": "/content/book_rating/models/MultiModalPredictor/automm_model/hf_text",
  "architectures": [
    "ElectraForPreTraining"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "embedding_size": 768,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "electra",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "summary_activation": "gelu",
  "summary_last_dropout": 0.1,
  "summary_type": "first",
  "summary_use_proj": true,
  "transform

*** Summary of fit() ***
Estimated performance of each model:
                 model  score_val  pred_time_val     fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0  WeightedEnsemble_L2   0.681250      13.010919  4362.976965                0.001352           0.759977            2       True          8
1  MultiModalPredictor   0.664583      11.874377  3081.556531               11.874377        3081.556531            1       True          7
2             CatBoost   0.607292       0.296715   710.614187                0.296715         710.614187            1       True          3
3        LightGBMLarge   0.552083       0.885358   426.364580                0.885358         426.364580            1       True          6
4           LightGBMXT   0.542708       0.454408    92.971035                0.454408          92.971035            1       True          2
5              XGBoost   0.538542       0.355685   461.968846                0.355685         461.

## Validation on Test Set

In [13]:
y_test = test_data["rating"]
test_features = test_data.drop(columns=["rating"])

In [14]:
y_pred = predictor.predict(test_features)

In [15]:
metrics = predictor.evaluate_predictions(y_true=y_test, y_pred=y_pred, auxiliary_metrics=True)

Evaluation: accuracy on test data: 0.6808333333333333
Evaluations on test data:
{
    "accuracy": 0.6808333333333333,
    "balanced_accuracy": 0.6664195815842446,
    "mcc": 0.5966166438068452
}


In [16]:
metrics

{'accuracy': 0.6808333333333333,
 'balanced_accuracy': 0.6664195815842446,
 'mcc': 0.5966166438068452}

Notice that the accuracy is about 68%. This is actually very good for 5 classes: a uniformly random guess would be only 20% accurate! It is also a difficult problem, since neighboring ratings such as 3 and 4 stars are hard to tell apart.

We can use sklearn to obtain more in-depth metrics, such as a confusion matrix:

In [17]:
from sklearn.metrics import confusion_matrix

In [18]:
confusion_matrix(y_test, y_pred)

array([[294,  92,  14,   3,   5],
       [100, 219,  49,   9,   5],
       [ 12,  55, 231, 103,  23],
       [  6,   4,  46, 367, 149],
       [  0,   0,   4,  87, 523]])
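
The raw array carries no labels. In sklearn's convention, rows are the true ratings and columns the predicted ratings, both in sorted label order. As a minimal sketch, `pd.crosstab` produces the same counts with labeled axes. The series `y_true_toy` and `y_pred_toy` are hypothetical toy values standing in for `y_test` and `y_pred`.

```python
import pandas as pd

# Hypothetical toy labels standing in for y_test and y_pred.
y_true_toy = pd.Series([1, 1, 2, 3, 3, 4, 5, 5], name="actual")
y_pred_toy = pd.Series([1, 2, 2, 3, 4, 4, 5, 4], name="predicted")

# Rows are actual ratings, columns are predicted ratings; the series names
# become the axis labels of the resulting table.
labeled_cm = pd.crosstab(y_true_toy, y_pred_toy)
print(labeled_cm)
```

In the notebook this would presumably be called as `pd.crosstab(y_test, y_pred)`, which relies on both series sharing the same index.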

We can see that most errors fall between adjacent ratings: 1 vs. 2, 3 vs. 4, and 4 vs. 5. That is very reasonable, as these can also be hard for humans to discern and it's very subjective!
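
To make this reading of the confusion matrix concrete, the sketch below recomputes a few quantities directly from the counts printed above, assuming rows and columns follow sklearn's sorted label order (ratings 1 to 5). The variable names are illustrative, and the majority-class baseline is an extra comparison point that the original notebook does not compute.

```python
# Confusion matrix copied from the output above.
# Rows are true ratings 1-5, columns are predicted ratings 1-5.
cm = [
    [294, 92, 14, 3, 5],
    [100, 219, 49, 9, 5],
    [12, 55, 231, 103, 23],
    [6, 4, 46, 367, 149],
    [0, 0, 4, 87, 523],
]
n = len(cm)
total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(n))

accuracy = correct / total
# Baseline: always predict the most frequent true rating.
majority_baseline = max(sum(row) for row in cm) / total
# Recall per rating: share of each true rating that was predicted correctly.
recall = [cm[i][i] / sum(cm[i]) for i in range(n)]
balanced_accuracy = sum(recall) / n
# Number of errors that miss the true rating by exactly one star.
off_by_one = sum(cm[i][j] for i in range(n) for j in range(n) if abs(i - j) == 1)
off_by_one_share = off_by_one / (total - correct)

print(f"accuracy:          {accuracy:.4f}")
print(f"majority baseline: {majority_baseline:.4f}")
print(f"balanced accuracy: {balanced_accuracy:.4f}")
print(f"off-by-one share:  {off_by_one_share:.4f}")
for stars, r in enumerate(recall, start=1):
    print(f"recall for {stars} stars: {r:.4f}")
```

The recomputed accuracy and balanced accuracy agree with the values reported by `evaluate_predictions`. Always predicting the most common rating would score about 26%, and roughly 89% of the model's errors are off by a single star.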