# **Named Entity Recognition (NER) using BERT on CoNLL-2003 Dataset**  

##  **Introduction**
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP), aimed at identifying key entities such as **persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC)** within a given text. It serves as a cornerstone for various downstream NLP applications, including **information extraction, question answering, and knowledge graph construction**.  

This notebook presents an end-to-end pipeline for training a **BERT-based NER model** using the **CoNLL-2003 dataset**, a widely recognized benchmark for entity recognition tasks. Our objective is to fine-tune a **pretrained BERT model** from **Hugging Face Transformers** to achieve **high accuracy and robust generalization** on named entity recognition tasks.  

---

### **Dataset: CoNLL-2003**  
The **CoNLL-2003 dataset** is a widely used benchmark dataset for **NER tasks in the English language**, consisting of **news articles from Reuters**. It includes four entity types:  

- **PER** (Person) → e.g., *Elon Musk, Serena Williams*  
- **ORG** (Organization) → e.g., *Google, NASA, FIFA*  
- **LOC** (Location) → e.g., *Paris, Mount Everest, Amazon River*  
- **MISC** (Miscellaneous) → e.g., *Olympics, Grammy Awards*  

Each word in a sentence is assigned a tag that follows the **BIO (Beginning-Inside-Outside) tagging scheme**. For the person type, for example:
- **B-PER** (Beginning of a Person entity)  
- **I-PER** (Inside a Person entity)  
- **O** (Outside any entity)  
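
To make the BIO scheme concrete, here is a minimal sketch that groups BIO-tagged words back into entities. The helper name `group_bio_entities` and the example sentence are our own illustrations and are not part of the dataset or of any library.

```python
from typing import List, Tuple


def group_bio_entities(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged words into (entity_type, entity_text) pairs."""
    entities = []
    current_type, current_words = None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type extends the open entity
            current_words.append(word)
        else:
            # "O" (or an I- tag with no matching open entity) closes any open entity
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_words:
        entities.append((current_type, " ".join(current_words)))
    return entities


words = ["Serena", "Williams", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
print(group_bio_entities(words, tags))
# [('PER', 'Serena Williams'), ('LOC', 'Paris')]
```

The `B-` prefix is what lets two adjacent entities of the same type be told apart: a second `B-PER` directly after an `I-PER` starts a new person instead of extending the previous one.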



# **1. Setup & Imports**

In [1]:
#Install the necessary NLP libraries
!pip install transformers datasets tokenizers seqeval evaluate -q



In [2]:
# Import necessary libraries for dataset handling, tokenization, model training, and evaluation
import datasets
import numpy as np
from transformers import BertTokenizerFast
from transformers import DataCollatorForTokenClassification
from transformers import AutoModelForTokenClassification
from transformers import TrainingArguments, Trainer, pipeline

In [3]:
## Load the CoNLL-2003 dataset and display a sample training example
conll_dataset = datasets.load_dataset("conll2003")

The repository for conll2003 contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/conll2003.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y



In [4]:
conll_dataset

DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'pos_tags', 'chunk_tags', 'ner_tags'],
        num_rows: 14041
    })
    validation: Dataset({
        features: ['id', 'tokens', 'pos_tags', 'chunk_tags', 'ner_tags'],
        num_rows: 3250
    })
    test: Dataset({
        features: ['id', 'tokens', 'pos_tags', 'chunk_tags', 'ner_tags'],
        num_rows: 3453
    })
})

In [5]:
conll_dataset["train"][0]

{'id': '0',
 'tokens': ['EU',
  'rejects',
  'German',
  'call',
  'to',
  'boycott',
  'British',
  'lamb',
  '.'],
 'pos_tags': [22, 42, 16, 21, 35, 37, 16, 21, 7],
 'chunk_tags': [11, 21, 11, 12, 21, 22, 11, 12, 0],
 'ner_tags': [3, 0, 7, 0, 0, 0, 7, 0, 0]}

# **2. Data Preprocessing**
### **2.1. Cleaning up the dataset**

In [6]:
# Since we are working on NER, we can remove the pos_tags and chunk_tags columns from the dataset

ner_dataset = conll_dataset.remove_columns(["pos_tags", "chunk_tags"])
ner_dataset

DatasetDict({
    train: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 14041
    })
    validation: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 3250
    })
    test: Dataset({
        features: ['id', 'tokens', 'ner_tags'],
        num_rows: 3453
    })
})

In [7]:
ner_dataset.shape

{'train': (14041, 3), 'validation': (3250, 3), 'test': (3453, 3)}

In [8]:
ner_dataset["train"][0]

{'id': '0',
 'tokens': ['EU',
  'rejects',
  'German',
  'call',
  'to',
  'boycott',
  'British',
  'lamb',
  '.'],
 'ner_tags': [3, 0, 7, 0, 0, 0, 7, 0, 0]}

In [9]:
# Get the label names from the dataset features, so that the integer values (ner_tags) can be mapped to actual labels.
ner_features = ner_dataset["train"].features["ner_tags"]
ner_features

Sequence(feature=ClassLabel(names=['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC'], id=None), length=-1, id=None)

In [10]:
labels = ner_features.feature.names
labels

['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']

### **2.2. Tokenization**

To ensure consistency with BERT's pretraining, we must tokenize the dataset using the same WordPiece tokenizer that was used during BERT’s original training. This maintains alignment in token representation, handles subword tokenization (e.g., breaking rare words into smaller units like "un##seen"), and ensures that our fine-tuned model benefits from BERT’s pretrained knowledge. Using Hugging Face’s BertTokenizerFast, we efficiently tokenize text while preserving entity-label alignment.
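
As a rough illustration of how a WordPiece-style tokenizer splits a word, the sketch below implements greedy longest-match-first lookup against a vocabulary. The function `wordpiece_tokenize` and the tiny `toy_vocab` are hypothetical; the real tokenizer uses BERT's full vocabulary and also applies normalization and punctuation splitting that are not shown here.

```python
from typing import List, Set


def wordpiece_tokenize(word: str, vocab: Set[str], unk_token: str = "[UNK]") -> List[str]:
    """Split one word into sub-tokens by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it from the right
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry the ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole word becomes unknown
            return [unk_token]
        pieces.append(piece)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only
toy_vocab = {"un", "##seen", "seen", "is"}
print(wordpiece_tokenize("unseen", toy_vocab))  # ['un', '##seen']
print(wordpiece_tokenize("is", toy_vocab))      # ['is']
print(wordpiece_tokenize("xyz", toy_vocab))     # ['[UNK]']
```

Because one word can become several sub-tokens, the number of tokens no longer matches the number of word-level labels, which is why the labels have to be realigned after tokenization.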

In [11]:
# Specify the pre-trained BERT model checkpoint to use
model_checkpoint = "bert-base-cased"

# Load the corresponding tokenizer to ensure consistency with the model's pretraining
tokenizer = BertTokenizerFast.from_pretrained(model_checkpoint)


#### **2.2.1. Preprocessing and Applying Tokenizer on a single datapoint to understand the output**

In [12]:
datapoint = ner_dataset["train"][100]
datapoint

{'id': '100',
 'tokens': ['Rabinovich',
  'is',
  'winding',
  'up',
  'his',
  'term',
  'as',
  'ambassador',
  '.'],
 'ner_tags': [1, 0, 0, 0, 0, 0, 0, 0, 0]}

In [13]:
datapoint["tokens"]

['Rabinovich', 'is', 'winding', 'up', 'his', 'term', 'as', 'ambassador', '.']

In [14]:
tokenized_datapoint = tokenizer(datapoint["tokens"], is_split_into_words=True)
tokenized_datapoint

{'input_ids': [101, 16890, 25473, 11690, 1110, 14042, 1146, 1117, 1858, 1112, 9088, 119, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

In [15]:
tokens = tokenizer.convert_ids_to_tokens(tokenized_datapoint["input_ids"])
tokens

['[CLS]',
 'Ra',
 '##bino',
 '##vich',
 'is',
 'winding',
 'up',
 'his',
 'term',
 'as',
 'ambassador',
 '.',
 '[SEP]']

In [16]:
tokenized_datapoint.word_ids()

[None, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, None]

In [17]:
len(datapoint["ner_tags"]), len(tokenized_datapoint["input_ids"])

(9, 13)

##### Aligning Labels with Tokens

We observe that the BERT tokenizer has added special tokens ([CLS], [SEP]) and has split certain word tokens into sub-tokens, and the labels don't match the new tokens anymore. Consider the example: "Fischler proposed EU-wide measures after reports". The word tokens with labels before tokenization and the new tokens after tokenization are shown below:

![image.png](https://i.postimg.cc/NjdtrrdF/1.png)

Here, the word IDs provided by the tokenizer help match each new token to the word it belongs to, which in turn lets us map each new token to its correct label.

![image.png](https://i.postimg.cc/kGyj2W7g/2.png)

Here, each token gets the same label as the token that started the word it’s inside, since they are part of the same entity. For tokens inside a word but not at the beginning, we replace the B- with I- (since the token does not begin the entity).

We will assign the special tokens a label of -100, because -100 is the index that the loss function (cross-entropy) ignores.

![image.png](https://i.postimg.cc/6QNYfHTp/3.png)
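
To see what "ignored in the loss" means in practice, here is a small NumPy sketch of a cross-entropy that skips every position labelled -100. It mirrors the default `ignore_index=-100` of PyTorch's cross-entropy loss, but the function `cross_entropy_ignoring` and the toy logits are our own illustration, not the code the model runs.

```python
import numpy as np


def cross_entropy_ignoring(logits: np.ndarray, labels: np.ndarray, ignore_index: int = -100) -> float:
    """Mean cross-entropy over the positions whose label is not ignore_index.

    logits has shape (seq_len, num_classes); labels has shape (seq_len,).
    """
    # Numerically stable log-softmax over the class dimension
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    keep = labels != ignore_index
    kept_log_probs = log_probs[keep]
    kept_labels = labels[keep]

    # Pick the log-probability of the true class at each kept position
    picked = kept_log_probs[np.arange(len(kept_labels)), kept_labels]
    return float(-picked.mean())


# Toy example: [CLS], two real tokens, [SEP], with 3 hypothetical classes
logits = np.array([[5.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 9.0]])
labels = np.array([-100, 1, 0, -100])

loss = cross_entropy_ignoring(logits, labels)

# Changing the logits at the ignored positions leaves the loss unchanged
changed = logits.copy()
changed[0] = [0.0, 0.0, 100.0]
changed[3] = [7.0, 7.0, 7.0]
print(np.isclose(loss, cross_entropy_ignoring(changed, labels)))  # True
```

Whatever the model predicts for `[CLS]` and `[SEP]` therefore contributes nothing to the gradient.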

In [18]:
def align_labels_with_tokens(labels, word_ids):
    """
    Aligns NER labels with tokenized word pieces.

    Parameters:
    - labels (List[int]): Original NER labels for each word.
    - word_ids (List[Optional[int]]): Mapping of tokens to their original word index (None for special tokens).

    Returns:
    - List[int]: Aligned labels for each token, ensuring subwords inherit the correct entity label.
    """
    new_labels = []
    current_word = None  # Track the current word ID

    for word_id in word_ids:
        if word_id != current_word:
            # Start of a new word or special token
            current_word = word_id
            label = -100 if word_id is None else labels[word_id]  # Assign -100 to special tokens
            new_labels.append(label)
        elif word_id is None:
            # Special token (e.g., [CLS], [SEP]), ignored in loss computation
            new_labels.append(-100)
        else:
            # Continuation of the same word
            label = labels[word_id]

            # Convert "B-" (begin) entity labels to "I-" (inside) for subword tokens
            if label % 2 == 1:  # Assuming B-XXX labels have odd indices
                label += 1

            new_labels.append(label)

    return new_labels


In [19]:
labels = ner_dataset["train"][100]["ner_tags"]
word_ids = tokenized_datapoint.word_ids()
print(labels)
print(align_labels_with_tokens(labels, word_ids))

[1, 0, 0, 0, 0, 0, 0, 0, 0]
[-100, 1, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, -100]
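
The function above relies on every `B-XXX` label having an odd ID with its `I-XXX` counterpart at the next ID. That holds for the CoNLL-2003 label list shown earlier, but it is an assumption about label ordering. As a hedged alternative, the sketch below derives the same mapping from the label names; `build_b_to_i_map` is a hypothetical helper of ours, not part of the notebook's pipeline.

```python
from typing import Dict, List


def build_b_to_i_map(label_names: List[str]) -> Dict[int, int]:
    """Map the ID of every B-XXX label to the ID of the matching I-XXX label."""
    name_to_id = {name: idx for idx, name in enumerate(label_names)}
    mapping = {}
    for name, idx in name_to_id.items():
        if name.startswith("B-"):
            inside_name = "I-" + name[2:]
            if inside_name in name_to_id:
                mapping[idx] = name_to_id[inside_name]
    return mapping


label_names = ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
b_to_i = build_b_to_i_map(label_names)
print(b_to_i)  # {1: 2, 3: 4, 5: 6, 7: 8}

# For this label list the name-based mapping agrees with the odd-index rule
print(all(b % 2 == 1 and i == b + 1 for b, i in b_to_i.items()))  # True
```

With such a mapping, the continuation branch could use `b_to_i.get(label, label)` instead of the parity check, which would keep working if the label list were ordered differently.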


#### **2.2.2. Tokenizing the entire dataset**

In [20]:
# We can pre-process the entire dataset using the map function with batched=True
# to leverage the efficiency of fast tokenizers.

def tokenize_and_align_labels(examples):
    """
    Tokenizes input text while aligning NER labels to subword tokens.

    Parameters:
    - examples (Dict[str, List]): A dictionary containing:
        - "tokens": List of tokenized words for each sentence.
        - "ner_tags": Corresponding NER labels for each word.

    Returns:
    - Dict[str, List]: Tokenized inputs with aligned NER labels.
    """

    # Tokenize input tokens while preserving word boundaries
    tokenized_inputs = tokenizer(
        examples["tokens"], truncation=True, is_split_into_words=True
    )

    all_labels = examples["ner_tags"]  # Extract corresponding NER labels
    new_labels = []

    # Align NER labels with tokenized subwords
    for i, labels in enumerate(all_labels):
        word_ids = tokenized_inputs.word_ids(i)  # Retrieve word IDs per token
        new_labels.append(align_labels_with_tokens(labels, word_ids))

    tokenized_inputs["labels"] = new_labels  # Attach aligned labels
    return tokenized_inputs

In [21]:
tokenized_datasets = ner_dataset.map(
    tokenize_and_align_labels,
    batched=True,
    remove_columns=ner_dataset["train"].column_names,
)


In [22]:
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 14041
    })
    validation: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 3250
    })
    test: Dataset({
        features: ['input_ids', 'token_type_ids', 'attention_mask', 'labels'],
        num_rows: 3453
    })
})

In [23]:
tokenized_datasets["train"][100]

{'input_ids': [101,
  16890,
  25473,
  11690,
  1110,
  14042,
  1146,
  1117,
  1858,
  1112,
  9088,
  119,
  102],
 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 'labels': [-100, 1, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, -100]}

### **2.3. Data Collation**

The last step of pre-processing is to batch inputs together.

However, sentences differ in length, so they cannot be stacked into a batch directly. We need to apply padding so that every sentence in a batch is as long as the longest sentence in that batch. The labels must be padded in the same way, and the **data collator** designed for token classification takes care of both.

Data collators are objects that form a batch from a list of dataset elements. To build batches, they may apply pre-processing such as padding, or data augmentation such as random masking.

Consider the example below:
- Before padding
![image.png](https://i.postimg.cc/Vvkrxsq5/4.png)

- After padding using the data collator
![image.png](https://i.postimg.cc/rwcMhZMY/5.png)
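
The padding step can be sketched in plain Python as follows. This is only an approximation of what a token-classification collator does: `pad_batch` is a hypothetical helper, the token IDs other than 101 and 102 are made up, and we assume a padding token ID of 0 and right-side padding. The real collator also returns tensors rather than lists.

```python
from typing import Dict, List


def pad_batch(
    features: List[Dict[str, List[int]]],
    pad_token_id: int = 0,     # assumed ID of the [PAD] token
    label_pad_id: int = -100,  # padded label positions are ignored by the loss
) -> Dict[str, List[List[int]]]:
    """Right-pad every example to the length of the longest one in the batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_len - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        batch["attention_mask"].append(f["attention_mask"] + [0] * n_pad)
        batch["labels"].append(f["labels"] + [label_pad_id] * n_pad)
    return batch


# Two toy examples of different lengths (token IDs 11 to 13 and 21 are made up)
features = [
    {"input_ids": [101, 11, 12, 13, 102], "attention_mask": [1, 1, 1, 1, 1], "labels": [-100, 3, 0, 7, -100]},
    {"input_ids": [101, 21, 102], "attention_mask": [1, 1, 1], "labels": [-100, 1, -100]},
]
batch = pad_batch(features)
print(batch["input_ids"][1])       # [101, 21, 102, 0, 0]
print(batch["attention_mask"][1])  # [1, 1, 1, 0, 0]
print(batch["labels"][1])          # [-100, 1, -100, -100, -100]
```

Padding the labels with -100 rather than 0 matters here, because 0 is the ID of the real label `O`.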




In [24]:
# Initialize a data collator for token classification tasks
# This collator dynamically pads input sequences to the longest sequence in a batch
# Ensuring that all inputs in the batch have the same length is essential for efficient training
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)

In [25]:
# Create a batch of two tokenized examples from the training dataset to check the results
batch = data_collator([tokenized_datasets["train"][i] for i in range(2)])

batch["labels"]


tensor([[-100,    3,    0,    7,    0,    0,    0,    7,    0,    0,    0, -100],
        [-100,    1,    2, -100, -100, -100, -100, -100, -100, -100, -100, -100]])

In [26]:
# compare this to the labels for the first and second elements in our dataset
for i in range(2):
    print(tokenized_datasets["train"][i]["labels"])

[-100, 3, 0, 7, 0, 0, 0, 7, 0, 0, 0, -100]
[-100, 1, 2, -100]


# **3. Model Training & Evaluation**

We will use the Hugging Face high-level Trainer API to fine-tune the BERT model.



## **3.1. Defining the Model**

As NER is a token classification problem, we will be using the AutoModelForTokenClassification class.


### 3.1.1. Create label dictionaries

In order to pass the label information to the model, set two dictionaries, id2label and label2id, which contain the mappings from ID to label and vice versa.

In [27]:
example = ner_dataset["train"][0]
label_list = ner_dataset["train"].features["ner_tags"].feature.names

label_list

['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']

In [28]:
# Create a mapping from label indices to label names
id2label = {i: label for i, label in enumerate(label_list)}

# Create a reverse mapping from label names to label indices
label2id = {v: k for k, v in id2label.items()}

### **3.1.2. Load the model**

In [29]:
#from transformers import AutoModelForTokenClassification

# Load a pre-trained BERT model for token classification
# This initializes BERT with a classification head for NER tasks
model = AutoModelForTokenClassification.from_pretrained(
    model_checkpoint,  # Pre-trained model checkpoint (e.g., "bert-base-cased")
    id2label=id2label,  # Mapping from numerical class IDs to entity labels (e.g., 1 → "B-PER")
    label2id=label2id,  # Reverse mapping from entity labels to numerical class IDs (e.g., "B-PER" → 1)
)



Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [31]:
model.config.num_labels

9

## **3.2. Metrics**

To have the Hugging Face Trainer API compute metrics every epoch, we need to define a compute_metrics() function that takes the arrays of predictions and labels, and returns a dictionary with the metric names and values.

The standard framework for evaluating token classification predictions is seqeval, which scores complete entities rather than individual tokens. The seqeval library was installed in Section 1, and we load the metric through the evaluate library.

In [32]:
import evaluate

metric = evaluate.load("seqeval")


### 3.2.1. Testing Metric on an example

In [33]:
example = ner_dataset["train"][0]
label_list = ner_dataset["train"].features["ner_tags"].feature.names
print(label_list)
labels = [label_list[i] for i in example["ner_tags"]]
print(labels)

['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'B-MISC', 'O', 'O']


In [34]:
metric.compute(predictions=[labels], references=[labels])

{'MISC': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 2},
 'ORG': {'precision': 1.0, 'recall': 1.0, 'f1': 1.0, 'number': 1},
 'overall_precision': 1.0,
 'overall_recall': 1.0,
 'overall_f1': 1.0,
 'overall_accuracy': 1.0}
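
To show why entity-level scores differ from token accuracy, here is a simplified, self-contained sketch of entity-level precision, recall, and F1. It is not seqeval's implementation: the helpers `extract_spans` and `entity_prf` are our own, and stray `I-` tags that do not follow a matching `B-` tag are simply ignored. The reference tags are the ones from the example above; the prediction is a made-up one that misses the second MISC entity.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity_type, start_index, end_index_exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans = set()
    start, ent_type = None, None
    for i, tag in enumerate(tags + ["O"]):  # the sentinel "O" closes a trailing entity
        continues = tag.startswith("I-") and ent_type == tag[2:]
        if not continues:
            if ent_type is not None:
                spans.add((ent_type, start, i))
            if tag.startswith("B-"):
                start, ent_type = i, tag[2:]
            else:
                start, ent_type = None, None
    return spans


def entity_prf(reference: List[str], prediction: List[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over exactly matching entity spans."""
    ref_spans, pred_spans = extract_spans(reference), extract_spans(prediction)
    correct = len(ref_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(ref_spans) if ref_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


reference = ['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'B-MISC', 'O', 'O']
prediction = ['B-ORG', 'O', 'B-MISC', 'O', 'O', 'O', 'O', 'O', 'O']  # hypothetical model output

precision, recall, f1 = entity_prf(reference, prediction)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=1.00 recall=0.67 f1=0.80
```

Eight of the nine tokens are tagged correctly in this example, yet recall is only 2/3 because one of the three reference entities was missed. This is the reason the F1 column is usually the more informative one for NER.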

Define a compute_metrics() function, which first takes the argmax of the logits to convert them to predictions (as usual, the logits and the probabilities are in the same order, so we don’t need to apply the softmax). Then we have to convert both labels and predictions from integers to strings. We remove all the values where the label is -100, then pass the results to the metric.compute() method:

In [35]:
def compute_metrics(eval_preds):
    logits, labels = eval_preds

    # The logits and the probabilities are in the same order,
    # so we don't need to apply the softmax before taking the argmax
    pred_ids = np.argmax(logits, axis=2)

    # Convert predicted IDs to label names, skipping positions where the label is -100
    predictions = [
        [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(pred_ids, labels)
    ]

    # Convert true IDs to label names, skipping the same -100 positions
    true_labels = [
        [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(pred_ids, labels)
    ]

    results = metric.compute(predictions=predictions, references=true_labels)

    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
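
The two ideas used in compute_metrics() can be checked on toy data without the model or the metric: the argmax of the logits equals the argmax of the softmax probabilities, and positions labelled -100 are dropped before scoring. In this sketch the logits, the three-label list, and the helper `decode_predictions` are all hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def decode_predictions(logits, labels, label_names):
    """Turn logits and label IDs into label-name lists, skipping -100 positions."""
    pred_ids = np.argmax(logits, axis=2)
    predictions, references = [], []
    for pred_row, label_row in zip(pred_ids, labels):
        kept = [(p, l) for p, l in zip(pred_row, label_row) if l != -100]
        predictions.append([label_names[p] for p, _ in kept])
        references.append([label_names[l] for _, l in kept])
    return predictions, references


# One toy sentence: [CLS], two real tokens, [SEP]; 3 hypothetical classes
toy_label_list = ["O", "B-PER", "I-PER"]
logits = np.array([[[2.0, 0.1, -1.0],
                    [0.3, 1.5, 0.2],
                    [-0.5, 0.0, 3.0],
                    [1.0, 0.9, 0.8]]])
labels = np.array([[-100, 1, 0, -100]])

# Softmax is monotonic, so it never changes which class has the highest score
print(np.array_equal(np.argmax(logits, axis=2), np.argmax(softmax(logits), axis=2)))  # True

predictions, references = decode_predictions(logits, labels, toy_label_list)
print(predictions)  # [['B-PER', 'I-PER']]
print(references)   # [['B-PER', 'O']]
```

The predictions for the two special-token positions never reach the metric, which matches how those positions were excluded from the loss.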

## **3.3. Fine-tuning the model**

### 3.3.1. Model Parameters

Set the hyperparameters like the learning rate, the number of epochs to train for, and the weight decay.

In [36]:
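from transformers import TrainingArguments

args = TrainingArguments(
    "test-ner",                      # Output directory for checkpoints
    evaluation_strategy="epoch",     # Evaluate at the end of every epoch (recent transformers releases name this eval_strategy)
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    report_to="none",                # Disable logging to external experiment trackers
)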




## **3.4. Model Training**

In [37]:
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    # tokenizer=tokenizer,
)
trainer.train()

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| 1 | 0.223 | 0.062853 | 0.904723 | 0.93487 | 0.91955 | 0.981986 |
| 2 | 0.0461 | 0.059733 | 0.927591 | 0.944295 | 0.935869 | 0.985209 |
| 3 | 0.0257 | 0.055446 | 0.9321 | 0.949512 | 0.940725 | 0.986401 |


TrainOutput(global_step=2634, training_loss=0.07640768790118334, metrics={'train_runtime': 538.2768, 'train_samples_per_second': 78.255, 'train_steps_per_second': 4.893, 'total_flos': 1050534559887048.0, 'train_loss': 0.07640768790118334, 'epoch': 3.0})
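
As a quick sanity check, the `global_step` value can be reproduced from the training set size and the batch size. The sketch assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; the helper `expected_training_steps` is our own.

```python
import math


def expected_training_steps(num_examples: int, batch_size: int, num_epochs: int) -> int:
    """Optimizer steps when the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


steps = expected_training_steps(num_examples=14041, batch_size=16, num_epochs=3)
print(steps)  # 2634, the same as global_step in the TrainOutput

train_runtime = 538.2768  # seconds, copied from the TrainOutput
print(round(steps / train_runtime, 3))      # optimizer steps per second
print(round(14041 * 3 / train_runtime, 3))  # training samples per second
```

The two throughput figures should come out close to the `train_steps_per_second` and `train_samples_per_second` values reported in the TrainOutput.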

# **4. Save the Model**

In [38]:
model.save_pretrained("ner_bert_model")

In [39]:
tokenizer.save_pretrained("tokenizer")

('tokenizer/tokenizer_config.json',
 'tokenizer/special_tokens_map.json',
 'tokenizer/vocab.txt',
 'tokenizer/added_tokens.json',
 'tokenizer/tokenizer.json')

### 4.1. Loading the Model and Making Predictions

In [40]:
model_fine_tuned = AutoModelForTokenClassification.from_pretrained("ner_bert_model")

In [41]:
from transformers import pipeline

nlp = pipeline("ner", model=model_fine_tuned, tokenizer=tokenizer)

example = "Elon Musk founded SpaceX in California in 2002."

ner_results = nlp(example)

print(ner_results)

Device set to use cuda:0


[{'entity': 'B-PER', 'score': 0.99245775, 'index': 1, 'word': 'El', 'start': 0, 'end': 2}, {'entity': 'I-PER', 'score': 0.9940584, 'index': 2, 'word': '##on', 'start': 2, 'end': 4}, {'entity': 'I-PER', 'score': 0.9973279, 'index': 3, 'word': 'Mu', 'start': 5, 'end': 7}, {'entity': 'I-PER', 'score': 0.9964563, 'index': 4, 'word': '##sk', 'start': 7, 'end': 9}, {'entity': 'B-ORG', 'score': 0.98846966, 'index': 6, 'word': 'Space', 'start': 18, 'end': 23}, {'entity': 'I-ORG', 'score': 0.9865747, 'index': 7, 'word': '##X', 'start': 23, 'end': 24}, {'entity': 'B-LOC', 'score': 0.9983222, 'index': 9, 'word': 'California', 'start': 28, 'end': 38}]
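
The raw output has one entry per sub-token (`El`, `##on`, `Mu`, `##sk`), which is awkward to consume downstream. The pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that groups tokens for us. As a library-free illustration, the sketch below merges the predictions shown above by following the B-/I- prefixes, the token `index`, and the character offsets. The helper `merge_token_predictions` is hypothetical and only handles the simple case of well-formed tags.

```python
from typing import Dict, List


def merge_token_predictions(text: str, token_preds: List[Dict]) -> List[Dict]:
    """Merge per-token NER predictions into one entry per entity."""
    groups = []
    for pred in token_preds:
        prefix, ent_type = pred["entity"].split("-", 1)
        last = groups[-1] if groups else None
        extends_last = (
            prefix == "I"
            and last is not None
            and last["entity_group"] == ent_type
            and pred["index"] == last["last_index"] + 1  # must be the very next token
        )
        if extends_last:
            last["end"] = pred["end"]
            last["last_index"] = pred["index"]
            last["scores"].append(pred["score"])
        else:
            groups.append({
                "entity_group": ent_type,
                "start": pred["start"],
                "end": pred["end"],
                "last_index": pred["index"],
                "scores": [pred["score"]],
            })
    return [
        {
            "entity_group": g["entity_group"],
            "word": text[g["start"]:g["end"]],  # recover the surface text from offsets
            "start": g["start"],
            "end": g["end"],
            "score": sum(g["scores"]) / len(g["scores"]),  # mean of the token scores
        }
        for g in groups
    ]


text = "Elon Musk founded SpaceX in California in 2002."
# Token-level predictions copied from the pipeline output above ('word' omitted)
token_preds = [
    {"entity": "B-PER", "score": 0.99245775, "index": 1, "start": 0, "end": 2},
    {"entity": "I-PER", "score": 0.9940584, "index": 2, "start": 2, "end": 4},
    {"entity": "I-PER", "score": 0.9973279, "index": 3, "start": 5, "end": 7},
    {"entity": "I-PER", "score": 0.9964563, "index": 4, "start": 7, "end": 9},
    {"entity": "B-ORG", "score": 0.98846966, "index": 6, "start": 18, "end": 23},
    {"entity": "I-ORG", "score": 0.9865747, "index": 7, "start": 23, "end": 24},
    {"entity": "B-LOC", "score": 0.9983222, "index": 9, "start": 28, "end": 38},
]

for entity in merge_token_predictions(text, token_preds):
    print(entity["entity_group"], entity["word"], entity["start"], entity["end"])
# PER Elon Musk 0 9
# ORG SpaceX 18 24
# LOC California 28 38
```

Note that "2002" receives no label: CoNLL-2003 has no date type, so the model treats it as `O`.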


### 4.2. Download Model Files for FastAPI Integration

In [43]:
!zip -r ner_bert_model.zip ner_bert_model

  adding: ner_bert_model/ (stored 0%)
  adding: ner_bert_model/model.safetensors (deflated 7%)
  adding: ner_bert_model/config.json (deflated 53%)


In [44]:
import shutil

# Zip both model and tokenizer
shutil.make_archive("ner_bert_model", 'zip', "ner_bert_model")
shutil.make_archive("tokenizer", 'zip', "tokenizer")


'/content/tokenizer.zip'

In [45]:
from google.colab import files

files.download("ner_bert_model.zip")
files.download("tokenizer.zip")
