<a href="https://colab.research.google.com/github/jazu1412/LOW_CODE_AUTOML_AUTOGLUON/blob/master/TEXT%20CLASSIFICATION/ner_tutorial_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AutoMM for Named Entity Recognition - Tutorial

## Introduction
Welcome to this tutorial on Named Entity Recognition (NER) using AutoGluon's AutoMM!

Named Entity Recognition is like teaching a computer to be a super-efficient highlighter. Imagine you're reading a long article and you want to quickly identify all the people, places, and organizations mentioned. NER does this automatically, saving you time and effort.

In this tutorial, we'll walk through the process of training a model to recognize named entities in text. We'll use a dataset about movies, where our model will learn to identify things like actors, directors, and movie titles.

Let's get started!

## Setup
First, we need to install the necessary libraries. We'll be using AutoGluon's multimodal package.

In [1]:
import sys
!{sys.executable} -m pip install autogluon.multimodal

print("AutoGluon installation complete!")

Collecting autogluon.multimodal
  Downloading autogluon.multimodal-1.1.1-py3-none-any.whl.metadata (12 kB)
Collecting scipy<1.13,>=1.5.4 (from autogluon.multimodal)
  Downloading scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting Pillow<11,>=10.0.1 (from autogluon.multimodal)
  Downloading pillow-10.4.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting boto3<2,>=1.10 (from autogluon.multimodal)
  Downloading boto3-1.35.19-py3-none-any.whl.metadata (6.6 kB)
Collecting torch<2.4,>=2.2 (from autogluon.multimodal)
  Downloading torch-2.3.1-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting lightning<2.4,>=2.2 (from autogluon.multimodal)
  Downloading lightning-2.3.3-py3-none-any.whl.metadata (35 kB)
Collecting transformers<4.41.0,>=4.38.0 (from transformers[sentencepiece]<4.41.0,>=4.38.0->autog

In [None]:
!pip uninstall -y torchaudio
!pip install torchaudio

Found existing installation: torchaudio 2.4.0+cu121
Uninstalling torchaudio-2.4.0+cu121:
  Successfully uninstalled torchaudio-2.4.0+cu121
Collecting torchaudio
  Downloading torchaudio-2.4.1-cp310-cp310-manylinux1_x86_64.whl.metadata (6.4 kB)
Collecting torch==2.4.1 (from torchaudio)
  Downloading torch-2.4.1-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch==2.4.1->torchaudio)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting triton==3.0.0 (from torch==2.4.1->torchaudio)
  Downloading triton-3.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)
Downloading torchaudio-2.4.1-cp310-cp310-manylinux1_x86_64.whl (3.4 MB)
Downloading torch-2.4.1-cp310-cp310-manylinux1_x86_64.whl (797.1 MB)

In [2]:
!pip install --upgrade torchvision

Collecting torchvision
  Downloading torchvision-0.19.1-cp310-cp310-manylinux1_x86_64.whl.metadata (6.0 kB)
Collecting torch==2.4.1 (from torchvision)
  Downloading torch-2.4.1-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch==2.4.1->torchvision)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting triton==3.0.0 (from torch==2.4.1->torchvision)
  Downloading triton-3.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (1.3 kB)
Downloading torchvision-0.19.1-cp310-cp310-manylinux1_x86_64.whl (7.0 MB)
Downloading torch-2.4.1-cp310-cp310-manylinux1_x86_64.whl (797.1 MB)
Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-man

## Importing Required Libraries
Now, let's import the libraries we'll need for this tutorial.

In [3]:
import json
import uuid
from autogluon.multimodal import MultiModalPredictor
from autogluon.core.utils.loaders import load_pd
from autogluon.multimodal.utils import visualize_ner

print("Libraries imported successfully!")

Libraries imported successfully!


## Understanding the Data Format
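Before we dive into the actual data, let's understand the format AutoMM expects for NER tasks: the label for each text is a JSON-encoded list of entities, each with a category and character offsets into the text.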

In [4]:
# Example of the required JSON format for annotations
example_annotation = json.dumps([
    {"entity_group": "PERSON", "start": 0, "end": 15},
    {"entity_group": "LOCATION", "start": 28, "end": 35}
])

print("Example annotation format:")
print(example_annotation)

# Explanation
print("\nExplanation:")
print("- 'entity_group' is the category of the entity (e.g., PERSON, LOCATION)")
print("- 'start' is the character position where the entity begins")
print("- 'end' is the character position where the entity ends")

Example annotation format:
[{"entity_group": "PERSON", "start": 0, "end": 15}, {"entity_group": "LOCATION", "start": 28, "end": 35}]

Explanation:
- 'entity_group' is the category of the entity (e.g., PERSON, LOCATION)
- 'start' is the character position where the entity begins
- 'end' is the character position where the entity ends
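
The offsets in the examples follow Python slice semantics: `start` is the index of the entity's first character and `end` is one past its last character, so `text[start:end]` returns the entity. Below is a minimal sketch of building annotations in this format from known entity phrases. `build_annotations` is a hypothetical helper, not part of AutoGluon, and it assumes each phrase occurs exactly once in the sentence.

```python
import json


def build_annotations(text, entities):
    """Return a JSON string of annotations for (label, phrase) pairs.

    Hypothetical helper: uses the first occurrence of each phrase in the text.
    """
    annotations = []
    for label, phrase in entities:
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"{phrase!r} not found in text")
        # 'end' is exclusive: one past the last character of the entity.
        annotations.append(
            {"entity_group": label, "start": start, "end": start + len(phrase)}
        )
    return json.dumps(annotations)


sentence = "Albert Einstein was born in Germany and is widely acknowledged to be one of the greatest physicists."
annotation_json = build_annotations(
    sentence, [("PERSON", "Albert Einstein"), ("LOCATION", "Germany")]
)
print(annotation_json)
```

Searching for phrases is only practical for small hand-made examples; for real datasets the offsets normally come from an annotation tool.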


## Visualizing NER Annotations
Let's see how we can visualize these annotations. This is like using a digital highlighter on our text!

In [5]:
sentence = "Albert Einstein was born in Germany and is widely acknowledged to be one of the greatest physicists."
annotation = [
    {"entity_group": "PERSON", "start": 0, "end": 15},
    {"entity_group": "LOCATION", "start": 28, "end": 35}
]

print("Original sentence:")
print(sentence)
print("\nVisualized NER annotations:")
visualize_ner(sentence, annotation)

Original sentence:
Albert Einstein was born in Germany and is widely acknowledged to be one of the greatest physicists.

Visualized NER annotations:


## Loading the Dataset
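Now, let's load our actual dataset. We're using the MIT Movies dataset, a collection of movie-related queries annotated with entities such as actors, directors, genres, and years. Each row pairs a `text_snippet` with its `entity_annotations`.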

In [6]:
train_data = load_pd.load('https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/train_v2.csv')
test_data = load_pd.load('https://automl-mm-bench.s3.amazonaws.com/ner/mit-movies/test_v2.csv')

print("Dataset loaded successfully!")
print("\nTraining data shape:", train_data.shape)
print("Test data shape:", test_data.shape)

print("\nFirst few rows of the training data:")
print(train_data.head())

Dataset loaded successfully!

Training data shape: (9775, 2)
Test data shape: (2443, 2)

First few rows of the training data:
                                        text_snippet  \
0                      what movies star bruce willis   
1   show me films with drew barrymore from the 1980s   
2  what movies starred both al pacino and robert ...   
3  find me all of the movies that starred harold ...   
4  find me a movie with a quote about baseball in it   

                                  entity_annotations  
0  [{"entity_group": "ACTOR", "start": 17, "end":...  
1  [{"entity_group": "ACTOR", "start": 19, "end":...  
2  [{"entity_group": "ACTOR", "start": 25, "end":...  
3  [{"entity_group": "ACTOR", "start": 39, "end":...  
4                                                 []  


## Examining a Single Data Point
Let's take a closer look at one of our data points to understand what we're working with.

In [7]:
example_index = 1  # You can change this to look at different examples
print(f"Text snippet: {train_data['text_snippet'][example_index]}")
print(f"\nEntity annotations: {train_data['entity_annotations'][example_index]}")

print("\nVisualized annotations:")
visualize_ner(train_data['text_snippet'][example_index], train_data['entity_annotations'][example_index])

Text snippet: show me films with drew barrymore from the 1980s

Entity annotations: [{"entity_group": "ACTOR", "start": 19, "end": 33}, {"entity_group": "YEAR", "start": 43, "end": 48}]

Visualized annotations:
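
The printed label above appears to be stored as a JSON string rather than a Python list, so it needs decoding before the offsets can be used. A minimal sketch, using the row shown above; `extract_entities` is a hypothetical helper name, not an AutoGluon function.

```python
import json

text = "show me films with drew barrymore from the 1980s"
raw_label = '[{"entity_group": "ACTOR", "start": 19, "end": 33}, {"entity_group": "YEAR", "start": 43, "end": 48}]'


def extract_entities(text, raw_label):
    """Decode a JSON label and return (entity_group, surface text) pairs."""
    return [
        (entity["entity_group"], text[entity["start"]:entity["end"]])
        for entity in json.loads(raw_label)
    ]


entities = extract_entities(text, raw_label)
print(entities)
```

Rows without entities carry an empty list (`[]`), which decodes to no pairs at all.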


## Training the NER Model
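Now comes the exciting part: training our NER model! This is like teaching our computer to recognize and categorize important words and phrases in movie-related text. To keep the run short, we fine-tune the small `google/electra-small-discriminator` checkpoint and cap training at 5 minutes.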

In [8]:
label_col = "entity_annotations"
model_path = f"./tmp/{uuid.uuid4().hex}-automm_ner"

predictor = MultiModalPredictor(problem_type="ner", label=label_col, path=model_path)

print("Starting model training. This may take a few minutes...")
predictor.fit(
    train_data=train_data,
    hyperparameters={'model.ner_text.checkpoint_name':'google/electra-small-discriminator'},
    time_limit=300  # 5 minutes
)
print("Model training complete!")

AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          8
Pytorch Version:    2.4.1+cu121
CUDA Version:       12.1
Memory Avail:       48.27 GB / 50.99 GB (94.7%)
Disk Space Avail:   193.03 GB / 235.68 GB (81.9%)

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner
    ```

INFO: Seed set to 0


Starting model training. This may take a few minutes...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: Tesla T4
GPU 0 Memory: 0.25GB/15.0GB (Used/Total)

INFO: Using 16bit Automatic Mixed Precision (AMP)
/usr/local/lib/python3.10/dist-packages/lightning/pytorch/plugins/precision/amp.py:52: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO: 
  | Name              | Type              | Params | Mode 
----------------------------------------------------------------
0 | model             | HFAutoModelForNER | 13.5 M | train
1 | validation_metric | MulticlassF1Score | 0      | train
2 | loss_func         | CrossEntropyLoss  | 0      | train
----------------------------------------------------------------
13.5 M    Trainable params
0         Non-trainable params
13.5 M    Total params
53.959    Total

INFO: Epoch 0, global step 34: 'val_ner_token_f1' reached 0.00076 (best 0.00076), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=0-step=34.ckpt' as top 3
INFO: Epoch 0, global step 69: 'val_ner_token_f1' reached 0.60153 (best 0.60153), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=0-step=69.ckpt' as top 3
INFO: Epoch 1, global step 103: 'val_ner_token_f1' reached 0.81019 (best 0.81019), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=1-step=103.ckpt' as top 3
INFO: Epoch 1, global step 138: 'val_ner_token_f1' reached 0.83516 (best 0.83516), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=1-step=138.ckpt' as top 3
INFO: Epoch 2, global step 172: 'val_ner_token_f1' reached 0.85656 (best 0.85656), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=2-step=172.ckpt' as top 3
INFO: Epoch 2, global step 207: 'val_ner_token_f1' reached 0.86395 (best 0.86395), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=2-step=207.ckpt' as top 3
INFO: Time limit reached. Elapsed time is 0:05:00. Signaling Trainer to stop.
INFO: Epoch 3, global step 240: 'val_ner_token_f1' reached 0.87236 (best 0.87236), saving model to '/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/epoch=3-step=240.ckpt' as top 3
Start to fuse 3 checkpoints via the greedy soup algorithm.

AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).




Model training complete!


## Evaluating the Model
After training, we need to see how well our model performs. This is like giving our model a test after its training.

In [9]:
evaluation_metrics = ['overall_recall', "overall_precision", "overall_f1", "actor"]
evaluation_results = predictor.evaluate(test_data, metrics=evaluation_metrics)

print("Evaluation results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value}")

Evaluation results:
overall_recall: 0.8512830117999626
overall_precision: 0.8340979996329602
overall_f1: 0.842602892102336
actor: {'precision': 0.825414364640884, 'recall': 0.9199507389162561, 'f1': 0.8701223063482818, 'number': 812}
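
The F1 values reported above are the harmonic mean of precision and recall. As a quick sanity check, this sketch recomputes them from the printed precision and recall; `f1_score` is a helper defined here for illustration, not the function AutoGluon uses internally.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation output above.
overall_f1 = f1_score(0.8340979996329602, 0.8512830117999626)
actor_f1 = f1_score(0.825414364640884, 0.9199507389162561)

print(f"overall_f1 recomputed: {overall_f1:.6f}")
print(f"actor f1 recomputed:   {actor_f1:.6f}")
```

For the `actor` entry, recall is higher than precision: the model finds most actor mentions but also labels some non-actor spans as actors.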


## Making Predictions
Now that our model is trained and evaluated, let's use it to make some predictions!

In [10]:
sentence = "Game of Thrones is an American fantasy drama television series created by David Benioff"
predictions = predictor.predict({'text_snippet': [sentence]})

print("Input sentence:")
print(sentence)
print("\nPredicted entities:")
print(predictions[0])

print("\nVisualized predictions:")
visualize_ner(sentence, predictions[0])

Input sentence:
Game of Thrones is an American fantasy drama television series created by David Benioff

Predicted entities:
[{'entity_group': 'PLOT', 'start': 0, 'end': 4}, {'entity_group': 'TITLE', 'start': 5, 'end': 15}, {'entity_group': 'GENRE', 'start': 22, 'end': 44}, {'entity_group': 'DIRECTOR', 'start': 74, 'end': 87}]

Visualized predictions:
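
`visualize_ner` renders highlighted output in the notebook, which does not survive in a plain-text copy. As a rough text-only alternative, the sketch below wraps each predicted span in brackets with its label. `tag_inline` is a hypothetical helper, not part of AutoGluon, and it assumes the predicted spans do not overlap.

```python
def tag_inline(text, entities):
    """Wrap each entity span as [span|LABEL]; assumes spans do not overlap."""
    tagged = text
    # Work from right to left so that earlier offsets stay valid after each edit.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        start, end = ent["start"], ent["end"]
        tagged = f"{tagged[:start]}[{tagged[start:end]}|{ent['entity_group']}]{tagged[end:]}"
    return tagged


sentence = "Game of Thrones is an American fantasy drama television series created by David Benioff"
# Predictions copied from the output above.
predicted = [
    {"entity_group": "PLOT", "start": 0, "end": 4},
    {"entity_group": "TITLE", "start": 5, "end": 15},
    {"entity_group": "GENRE", "start": 22, "end": 44},
    {"entity_group": "DIRECTOR", "start": 74, "end": 87},
]
tagged_sentence = tag_inline(sentence, predicted)
print(tagged_sentence)
```

Seen this way, it is easy to spot that the model split "Game of Thrones" into a PLOT span ("Game") and a TITLE span ("of Thrones"), the kind of error longer training might reduce.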


## Prediction Probabilities
Sometimes, we want to know how confident our model is in its predictions. Let's look at the probabilities for each prediction.

In [11]:
prob_predictions = predictor.predict_proba({'text_snippet': [sentence]})

print("Prediction probabilities:")
# prob_predictions[0] holds the entries for the first (and only) input sentence.
# Each entry is a dict with the predicted tag and the probabilities of all tags.
for entity in prob_predictions[0]:
    print(f"{entity['entity_group']}: {entity['probability']}")

Prediction probabilities:
B-PLOT: {'O': 0.06384, 'B-RATING': 0.002237, 'B-GENRE': 0.00815, 'B-ACTOR': 0.00211, 'B-DIRECTOR': 0.001024, 'B-CHARACTER': 0.004242, 'B-TRAILER': 0.0002968, 'I-DIRECTOR': 0.000213, 'B-RATINGS_AVERAGE': 0.003777, 'B-REVIEW': 0.00337, 'B-YEAR': 0.000515, 'I-TRAILER': 0.000715, 'B-SONG': 0.001265, 'B-PLOT': 0.79, 'I-RATINGS_AVERAGE': 0.000493, 'I-SONG': 0.001869, 'I-GENRE': 0.001959, 'I-TITLE': 0.006104, 'I-REVIEW': 0.0001924, 'I-PLOT': 0.007275, 'B-TITLE': 0.07837, 'I-RATING': 0.000343, 'I-CHARACTER': 9.674e-05, 'I-ACTOR': 0.01933, 'I-YEAR': 0.001682}
I-TITLE: {'O': 0.0923, 'B-RATING': 0.011, 'B-GENRE': 0.002287, 'B-ACTOR': 0.005886, 'B-DIRECTOR': 0.00183, 'B-CHARACTER': 0.00946, 'B-TRAILER': 0.002243, 'I-DIRECTOR': 0.000892, 'B-RATINGS_AVERAGE': 0.011, 'B-REVIEW': 0.00566, 'B-YEAR': 0.002972, 'I-TRAILER': 0.002003, 'B-SONG': 0.002686, 'B-PLOT': 0.00965, 'I-RATINGS_AVERAGE': 0.0845, 'I-SONG': 0.03061, 'I-GENRE': 0.003542, 'I-TITLE': 0.3381, 'I-REVIEW': 0.004173
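
The tags in this output use the common BIO scheme: `B-` marks the beginning of an entity, `I-` marks a continuation, and `O` means no entity. The full dictionaries are hard to read, so the sketch below ranks the most likely tags for one entry. It uses a hand-picked subset of the first dictionary above, and `top_labels` is a hypothetical helper, not an AutoGluon function.

```python
def top_labels(probabilities, k=3):
    """Return the k most probable (tag, probability) pairs, highest first."""
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)[:k]


# Subset of the probabilities printed for the first entry above.
first_entry = {
    "O": 0.06384,
    "B-GENRE": 0.00815,
    "B-PLOT": 0.79,
    "B-TITLE": 0.07837,
    "I-ACTOR": 0.01933,
}

ranked = top_labels(first_entry)
for tag, probability in ranked:
    print(f"{tag}: {probability:.3f}")
```

Here `B-TITLE` is only the runner-up for the first span, which is consistent with the model labelling "Game" as PLOT rather than as the start of the title.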

## Reloading and Continuous Training
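One of the great things about AutoGluon is that we can save our model and come back to it later. We can even continue training it with new data! For illustration, the cell below reloads the model and trains it for one more minute on the same training set.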

In [12]:
# Reloading the model
new_predictor = MultiModalPredictor.load(model_path)

# Continuing training
new_model_path = f"./tmp/{uuid.uuid4().hex}-automm_ner_continue_train"
print("Continuing training for 1 minute...")
new_predictor.fit(train_data, time_limit=60, save_path=new_model_path)

# Evaluating the updated model
test_score = new_predictor.evaluate(test_data, metrics=['overall_f1', 'ACTOR'])
print("\nUpdated model evaluation results:")
for metric, value in test_score.items():
    print(f"{metric}: {value}")

Load pretrained checkpoint: /content/tmp/d723c4c232af44f58116f4f0fbf459ec-automm_ner/model.ckpt
AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          8
Pytorch Version:    2.4.1+cu121
CUDA Version:       12.1
Memory Avail:       45.33 GB / 50.99 GB (88.9%)
Disk Space Avail:   192.93 GB / 235.68 GB (81.9%)

AutoMM starts to create your model. ✨✨✨

To track the learning progress, you can open a terminal and launch Tensorboard:
    ```shell
    # Assume you have installed tensorboard
    tensorboard --logdir /content/tmp/3c51ad0f487947adb51dc9fb6d07d3f1-automm_ner_continue_train
    ```

INFO: Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: Tesla T4
GPU 0 Memory: 0.39GB/15.0GB (Used/Total)

INFO: Using 16bit Automatic Mixed Precision (AMP)
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores

Continuing training for 1 minute...


INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO: 
  | Name              | Type              | Params | Mode 
----------------------------------------------------------------
0 | model             | HFAutoModelForNER | 13.5 M | train
1 | validation_metric | MulticlassF1Score | 0      | train
2 | loss_func         | CrossEntropyLoss  | 0      | train
----------------------------------------------------------------
13.5 M    Trainable params
0         Non-trainable params
13.5 M    Total params
53.959    Total estimated model params size (MB)


INFO: Epoch 0, global step 34: 'val_ner_token_f1' reached 0.86369 (best 0.86369), saving model to '/content/tmp/3c51ad0f487947adb51dc9fb6d07d3f1-automm_ner_continue_train/epoch=0-step=34.ckpt' as top 3
INFO: Time limit reached. Elapsed time is 0:01:00. Signaling Trainer to stop.
INFO: Epoch 0, global step 43: 'val_ner_token_f1' reached 0.87083 (best 0.87083), saving model to '/content/tmp/3c51ad0f487947adb51dc9fb6d07d3f1-automm_ner_continue_train/epoch=0-step=43.ckpt' as top 3
Start to fuse 2 checkpoints via the greedy soup algorithm.

AutoMM has created your model. 🎉🎉🎉

To load the model, use the code below:
    ```python
    from autogluon.multimodal import MultiModalPredictor
    predictor = MultiModalPredictor.load("/content/tmp/3c51ad0f487947adb51dc9fb6d07d3f1-automm_ner_continue_train")
    ```

If you are not satisfied with the model, try to increase the training time, 
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).





Updated model evaluation results:
overall_f1: 0.8426499122239675
ACTOR: {'precision': 0.8224195338512763, 'recall': 0.9125615763546798, 'f1': 0.8651488616462346, 'number': 812}
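
To judge whether the extra minute of training helped, it is useful to put the two evaluation runs side by side. A small sketch using the F1 values printed in this tutorial; the dictionary keys are names chosen here for illustration.

```python
# F1 scores copied from the two evaluation outputs in this tutorial.
before = {"overall_f1": 0.842602892102336, "actor_f1": 0.8701223063482818}
after = {"overall_f1": 0.8426499122239675, "actor_f1": 0.8651488616462346}

# Positive delta means the continued training improved the metric.
deltas = {name: after[name] - before[name] for name in before}

for name, delta in deltas.items():
    print(f"{name}: {before[name]:.4f} -> {after[name]:.4f} ({delta:+.4f})")
```

In this run the overall F1 is essentially unchanged and the ACTOR F1 drops slightly, so one more minute on the same data brought no clear gain.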


## Conclusion
Congratulations! You've just walked through the entire process of training, evaluating, and using a Named Entity Recognition model with AutoGluon.

Remember, NER is like teaching a computer to be a smart highlighter, automatically identifying and categorizing important information in text. This has numerous applications, from organizing large text datasets to powering intelligent search features in applications.

Feel free to experiment with different datasets, longer training times, or different model architectures to see how you can improve the performance even further!