*Named entity recognition (NER) identifies and categorizes key information (entities) in unstructured text. An entity can be a word or a sequence of words that corresponds to a category such as a city, time expression, monetary value, facility, person, or organization. An NER model usually takes an unannotated block of text as input and outputs an annotated block of text that highlights the named entities with their predefined categories.*

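*As with other tasks in AutoMM, you only need to prepare your data as data tables (i.e., dataframes) that contain a text column and an annotation column. The text column stores the raw text containing the entities you want to identify. The annotation column stores the annotations as a JSON string: a list of entities, each with an entity_group label and start and end character offsets into the text, as in the example below.*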

In [1]:
!pip install autogluon.multimodal


In [1]:
import json
json.dumps([
    {"entity_group": "PERSON", "start": 0, "end": 15},
    {"entity_group": "LOCATION", "start": 28, "end": 35}
])

'[{"entity_group": "PERSON", "start": 0, "end": 15}, {"entity_group": "LOCATION", "start": 28, "end": 35}]'
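*The offsets in the annotation above can be read as character positions into the text, with start inclusive and end exclusive. The following is a minimal sketch, not part of AutoMM, that builds such a JSON annotation from known entity strings. The helper name make_annotation is hypothetical, and locating entities with str.find (which only finds the first occurrence) is an assumption made for illustration.*

```python
import json


def make_annotation(text, entities):
    """Build a JSON annotation string from (entity_text, entity_group) pairs.

    Hypothetical helper for illustration: it locates each entity with
    str.find, so only the first occurrence of each entity string is used.
    """
    spans = []
    for entity_text, entity_group in entities:
        start = text.find(entity_text)
        if start == -1:
            raise ValueError(f"{entity_text!r} not found in text")
        # "end" is exclusive, so text[start:end] == entity_text
        spans.append({"entity_group": entity_group, "start": start, "end": start + len(entity_text)})
    return json.dumps(spans)


sentence = "Albert Einstein was born in Germany and is widely acknowledged to be one of the greatest physicists."
annotation_json = make_annotation(sentence, [("Albert Einstein", "PERSON"), ("Germany", "LOCATION")])
print(annotation_json)
```

*For this sentence the sketch should reproduce the JSON string shown above. Real datasets with repeated entity strings would need explicit offsets instead of a string search.*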

*The following example visualizes the annotations with the visualize_ner utility.*

In [3]:
!pip install torch==2.0.0+cu117 torchaudio==2.0.0 torchvision==0.15.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117



In [1]:
from autogluon.multimodal.utils import visualize_ner

sentence = "Albert Einstein was born in Germany and is widely acknowledged to be one of the greatest physicists."
annotation = [{"entity_group": "PERSON", "start": 0, "end": 15},
              {"entity_group": "LOCATION", "start": 28, "end": 35}]

visualize_ner(sentence, annotation)

*The dataset used here is converted from the MIT movies corpus, which provides annotations for entity groups such as actor, character, director, genre, song, title, trailer, and year.*

In [2]:
from autogluon.core.utils.loaders import load_pd
train_data = load_pd.load('https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/train_v2.csv')
test_data = load_pd.load('https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/test_v2.csv')
train_data.head(5)

Unnamed: 0,text_snippet,entity_annotations
0,what movies star bruce willis,"[{""entity_group"": ""ACTOR"", ""start"": 17, ""end"":..."
1,show me films with drew barrymore from the 1980s,"[{""entity_group"": ""ACTOR"", ""start"": 19, ""end"":..."
2,what movies starred both al pacino and robert ...,"[{""entity_group"": ""ACTOR"", ""start"": 25, ""end"":..."
3,find me all of the movies that starred harold ...,"[{""entity_group"": ""ACTOR"", ""start"": 39, ""end"":..."
4,find me a movie with a quote about baseball in it,[]


In [3]:
print(f"text_snippet: {train_data['text_snippet'][1]}")
print(f"entity_annotations: {train_data['entity_annotations'][1]}")
visualize_ner(train_data['text_snippet'][1], train_data['entity_annotations'][1])

text_snippet: show me films with drew barrymore from the 1980s
entity_annotations: [{"entity_group": "ACTOR", "start": 19, "end": 33}, {"entity_group": "YEAR", "start": 43, "end": 48}]
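*Before training, it can help to confirm that each annotation actually points at the intended text. The sketch below is an illustration only, not an AutoMM utility: the helper name check_annotation is hypothetical, and treating overlapping spans as an error is an assumption of this sketch rather than a documented requirement.*

```python
import json


def check_annotation(text, annotation_json):
    """Return (entity_group, span_text) pairs after basic sanity checks.

    Hypothetical helper: verifies that every span lies inside the text and
    that spans, sorted by start, do not overlap (an assumption of this sketch).
    """
    spans = sorted(json.loads(annotation_json), key=lambda s: s["start"])
    previous_end = 0
    result = []
    for span in spans:
        start, end = span["start"], span["end"]
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span {span} is out of bounds")
        if start < previous_end:
            raise ValueError(f"span {span} overlaps the previous span")
        result.append((span["entity_group"], text[start:end]))
        previous_end = end
    return result


# Row 1 of the training data, copied from the output above.
text = "show me films with drew barrymore from the 1980s"
annotation = '[{"entity_group": "ACTOR", "start": 19, "end": 33}, {"entity_group": "YEAR", "start": 43, "end": 48}]'
checked = check_annotation(text, annotation)
print(checked)
```

*Slicing the text with each start and end offset recovers the annotated entity, which makes misaligned offsets easy to spot.*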


*Training: Now, let's create a predictor for named entity recognition by setting problem_type to ner and specifying the label column. We then call predictor.fit() to train the model for five minutes. To achieve reasonable performance in your applications, we recommend setting a sufficiently long time_limit (e.g., 30 to 60 minutes). You can also specify the backbone model and other hyperparameters using the hyperparameters argument. Here, we save the model to a uniquely named directory under ./tmp whose name ends in "automm_ner".*

In [4]:
from autogluon.multimodal import MultiModalPredictor
import uuid

label_col = "entity_annotations"
model_path = f"./tmp/{uuid.uuid4().hex}-automm_ner"  # You can rename it to the model path you like
predictor = MultiModalPredictor(problem_type="ner", label=label_col, path=model_path)
predictor.fit(
    train_data=train_data,
    hyperparameters={'model.ner_text.checkpoint_name': 'google/electra-small-discriminator'},
    time_limit=300,  # in seconds
)

AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          12
Pytorch Version:    2.0.0+cu117
CUDA Version:       11.7
Memory Avail:       50.24 GB / 52.96 GB (94.9%)
Disk Space Avail:   64.75 GB / 112.64 GB (57.5%)

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner
    ```

INFO: Seed set to 0

GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: NVIDIA L4
GPU 0 Memory: 0.33GB/22.49GB (Used/Total)

INFO: Using 16bit Automatic Mixed Precision (AMP)
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs
INFO: You are using a CUDA device ('NVIDIA L4') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO: 
  | Name              | Type              | Params | Mode 
----------------------------------------------------------------
0 | model             | HFAutoModelForNER | 13.5 M | train
1 | validation_metric | MulticlassF1Score | 0      | train
2 | loss_func         | CrossEntropyLoss  | 0      | train
--

INFO: Epoch 0, global step 34: 'val_ner_token_f1' reached 0.00102 (best 0.00102), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=0-step=34.ckpt' as top 3


INFO: Epoch 0, global step 69: 'val_ner_token_f1' reached 0.56102 (best 0.56102), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=0-step=69.ckpt' as top 3


INFO: Epoch 1, global step 103: 'val_ner_token_f1' reached 0.80229 (best 0.80229), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=1-step=103.ckpt' as top 3


INFO: Epoch 1, global step 138: 'val_ner_token_f1' reached 0.83159 (best 0.83159), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=1-step=138.ckpt' as top 3


INFO: Epoch 2, global step 172: 'val_ner_token_f1' reached 0.86420 (best 0.86420), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=2-step=172.ckpt' as top 3


INFO: Epoch 2, global step 207: 'val_ner_token_f1' reached 0.86217 (best 0.86420), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=2-step=207.ckpt' as top 3


INFO: Epoch 3, global step 241: 'val_ner_token_f1' reached 0.87083 (best 0.87083), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=3-step=241.ckpt' as top 3


INFO: Epoch 3, global step 276: 'val_ner_token_f1' reached 0.87312 (best 0.87312), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=3-step=276.ckpt' as top 3
INFO: Time limit reached. Elapsed time is 0:05:00. Signaling Trainer to stop.


INFO: Epoch 4, global step 281: 'val_ner_token_f1' reached 0.87287 (best 0.87312), saving model to '/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/epoch=4-step=281.ckpt' as top 3
Start to fuse 3 checkpoints via the greedy soup algorithm.


AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).




<autogluon.multimodal.predictor.MultiModalPredictor at 0x791da649b130>

*Evaluation*

In [5]:
predictor.evaluate(test_data, metrics=["overall_recall", "overall_precision", "overall_f1", "actor"])

{'overall_recall': 0.8569020415808204,
 'overall_precision': 0.8377586522614906,
 'overall_f1': 0.8472222222222222,
 'actor': {'precision': 0.8435754189944135,
  'recall': 0.9298029556650246,
  'f1': 0.8845928529584065,
  'number': 812}}
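*The F1 values in this output are the harmonic mean of the corresponding precision and recall. As a quick check, the sketch below recomputes them from the numbers printed above. The function name f1_score is chosen here for illustration and is not an AutoMM API.*

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the evaluation output above.
overall_f1 = f1_score(0.8377586522614906, 0.8569020415808204)
actor_f1 = f1_score(0.8435754189944135, 0.9298029556650246)
print(overall_f1, actor_f1)
```

*Up to floating-point rounding, the results should agree with the overall_f1 and the actor f1 reported by predictor.evaluate().*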

*Prediction and Visualization:*

In [6]:
from autogluon.multimodal.utils import visualize_ner

sentence = "Game of Thrones is an American fantasy drama television series created by David Benioff"
predictions = predictor.predict({'text_snippet': [sentence]})
print('Predicted entities:', predictions[0])

# Visualize
visualize_ner(sentence, predictions[0])

Predicted entities: [{'entity_group': 'TITLE', 'start': 0, 'end': 15}, {'entity_group': 'GENRE', 'start': 22, 'end': 55}, {'entity_group': 'DIRECTOR', 'start': 74, 'end': 87}]
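*The predicted spans use the same offset format as the training annotations, so the entity text can be recovered by slicing the input sentence. The following sketch groups the predicted entity strings by entity group. It uses the predictions copied from the output above rather than calling the predictor, and the variable names are illustrative.*

```python
from collections import defaultdict

sentence = "Game of Thrones is an American fantasy drama television series created by David Benioff"
# Predictions copied from the output above.
predicted = [
    {"entity_group": "TITLE", "start": 0, "end": 15},
    {"entity_group": "GENRE", "start": 22, "end": 55},
    {"entity_group": "DIRECTOR", "start": 74, "end": 87},
]

# Collect the text of each predicted span under its entity group.
entities_by_group = defaultdict(list)
for span in predicted:
    entities_by_group[span["entity_group"]].append(sentence[span["start"]:span["end"]])

entities_by_group = dict(entities_by_group)
print(entities_by_group)
```

*A dictionary keyed by entity group is a convenient shape for downstream use, for example when filling fields of a search query.*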


*Prediction Probabilities*

In [7]:
predictions = predictor.predict_proba({'text_snippet': [sentence]})
print(predictions[0][0]['probability'])

{'O': 0.1783, 'I-TRAILER': 0.0004838, 'B-PLOT': 0.3281, 'B-YEAR': 0.001078, 'I-ACTOR': 0.00366, 'B-ACTOR': 0.006012, 'B-TITLE': 0.3896, 'I-REVIEW': 0.003305, 'I-SONG': 0.0003932, 'I-CHARACTER': 0.003107, 'I-DIRECTOR': 0.000993, 'B-GENRE': 0.02013, 'B-SONG': 0.00853, 'I-TITLE': 0.0047, 'B-DIRECTOR': 0.002625, 'B-CHARACTER': 0.007687, 'B-RATING': 0.00543, 'I-YEAR': 0.00642, 'I-RATING': 0.001756, 'I-GENRE': 0.003925, 'B-RATINGS_AVERAGE': 0.004536, 'B-TRAILER': 0.001556, 'B-REVIEW': 0.001716, 'I-RATINGS_AVERAGE': 0.01084, 'I-PLOT': 0.003775}
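*The labels in this dictionary follow the BIO tagging scheme: a B- prefix marks the beginning of an entity, an I- prefix marks its continuation, and O marks text outside any entity. The sketch below ranks the most likely labels and splits each tag into its prefix and entity group. It works on an excerpt of the probabilities printed above, and the helper name split_tag is hypothetical.*

```python
# Excerpt of the per-label probabilities printed above (not the full dictionary).
token_probs = {
    "O": 0.1783,
    "B-PLOT": 0.3281,
    "B-TITLE": 0.3896,
    "B-GENRE": 0.02013,
    "I-RATINGS_AVERAGE": 0.01084,
    "I-TITLE": 0.0047,
}


def split_tag(tag):
    """Split a BIO tag such as 'B-TITLE' into (prefix, entity_group)."""
    if tag == "O":
        return "O", None
    prefix, entity_group = tag.split("-", 1)
    return prefix, entity_group


# Sort labels by probability, highest first, and keep the top three.
top3 = sorted(token_probs.items(), key=lambda item: item[1], reverse=True)[:3]
for tag, prob in top3:
    print(tag, split_tag(tag), prob)
```

*Inspecting the runner-up labels in this way shows how confident the model is: here B-TITLE leads, but B-PLOT and O still carry substantial probability.*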


*Reloading and Continuous Training: The trained predictor is saved automatically, and you can reload it from its path. If you are not satisfied with the current model performance, you can continue training the loaded model with new data. The example below reuses train_data for illustration.*

In [8]:
new_predictor = MultiModalPredictor.load(model_path)
new_model_path = f"./tmp/{uuid.uuid4().hex}-automm_ner_continue_train"
new_predictor.fit(train_data, time_limit=60, save_path=new_model_path)
test_score = new_predictor.evaluate(test_data, metrics=['overall_f1', 'ACTOR'])
print(test_score)

Load pretrained checkpoint: /content/tmp/56cdde896a6247d8a22d03d08132bea5-automm_ner/model.ckpt
AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          12
Pytorch Version:    2.0.0+cu117
CUDA Version:       11.7
Memory Avail:       47.30 GB / 52.96 GB (89.3%)
Disk Space Avail:   64.65 GB / 112.64 GB (57.4%)

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /content/tmp/e32607fcc2b745bb92cf067df5a6f75c-automm_ner_continue_train
    ```

INFO: Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: NVIDIA L4
GPU 0 Memory: 0.94GB/22.49GB (Used/Total)

INFO: Using 16bit Automatic Mixed Precision (AMP)
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores

INFO: Epoch 0, global step 34: 'val_ner_token_f1' reached 0.87516 (best 0.87516), saving model to '/content/tmp/e32607fcc2b745bb92cf067df5a6f75c-automm_ner_continue_train/epoch=0-step=34.ckpt' as top 3
INFO: Time limit reached. Elapsed time is 0:01:00. Signaling Trainer to stop.


INFO: Epoch 0, global step 54: 'val_ner_token_f1' reached 0.87516 (best 0.87516), saving model to '/content/tmp/e32607fcc2b745bb92cf067df5a6f75c-automm_ner_continue_train/epoch=0-step=54.ckpt' as top 3
Start to fuse 2 checkpoints via the greedy soup algorithm.


AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/content/tmp/e32607fcc2b745bb92cf067df5a6f75c-automm_ner_continue_train")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).




{'overall_f1': 0.8445883441258094, 'ACTOR': {'precision': 0.8467650397275823, 'recall': 0.9187192118226601, 'f1': 0.8812758417011223, 'number': 812}}
