# Named Entity Recognition (NER) with spaCy

In this notebook, I implement a Named Entity Recognition (NER) model using spaCy. I start from the blank English spaCy model and show how to build and train a custom NER component.

## Installation and Setup

First, make sure spaCy is installed by running the following command in a Jupyter Notebook cell:

%pip install spacy



In [1]:
# Import necessary libraries
import spacy
from spacy.training import Example
import random

In [2]:
# Initializing a blank English model
nlp = spacy.blank("en")

# Creating the NER pipeline component
ner = nlp.add_pipe("ner")

# Defining the labels
labels = ["ORG", "GPE", "MONEY", "FAC", "PERSON", "LOC"]
for label in labels:
    ner.add_label(label)

print("NER Component Labels:", ner.labels)

NER Component Labels: ('FAC', 'GPE', 'LOC', 'MONEY', 'ORG', 'PERSON')


In [3]:
# Creating the training data (text, annotations)
# Each entity is (start_char, end_char, label); end_char is exclusive and
# both offsets must fall on token boundaries.
TRAIN_DATA = [
    ("Apple is looking at buying U.K. startup for $1 billion", {
        "entities": [(0, 5, "ORG"), (27, 31, "GPE"), (44, 54, "MONEY")]
    }),
    ("San Francisco considers banning sidewalk delivery robots", {
        "entities": [(0, 13, "GPE")]
    }),
    ("Uber is hiring a new data scientist", {
        "entities": [(0, 4, "ORG")]
    }),
    ("The Golden Gate Bridge ", {
        "entities": [(4, 22, "FAC")]
    }),
    ("Microsoft acquired LinkedIn for $26.2 billion", {
        "entities": [(0, 9, "ORG"), (19, 27, "ORG"), (32, 45, "MONEY")]
    }),
    ("NASA plans new mission to Mars", {
        "entities": [(0, 4, "ORG"), (26, 30, "LOC")]
    }),
    ("Elon Musk is the CEO of SpaceX", {
        "entities": [(0, 9, "PERSON"), (24, 30, "ORG")]
    }),
    ("Amazon is expanding its operations in India", {
        "entities": [(0, 6, "ORG"), (38, 43, "GPE")]
    }),
    ("$2 billion ", {
        "entities": [(0, 10, "MONEY")]
    }),
    ("India ", {
        "entities": [(0, 5, "GPE")]
    })
]
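
Character offsets counted by hand are easy to get wrong, and spaCy ignores entity spans that do not line up with token boundaries during training. The following is a minimal sketch of one way to avoid manual counting. The helper `annotate` is hypothetical (it is not part of spaCy) and assumes the mentions are listed in the order they occur in the text.

```python
def annotate(text, mentions):
    """Build a (text, {"entities": [...]}) pair by locating each mention.

    `mentions` is a list of (substring, label) pairs, given in the order
    they appear in `text`. Hypothetical helper, not a spaCy function.
    """
    entities = []
    cursor = 0
    for mention, label in mentions:
        start = text.find(mention, cursor)
        if start == -1:
            raise ValueError(f"{mention!r} not found in {text!r} from position {cursor}")
        end = start + len(mention)  # end offset is exclusive
        entities.append((start, end, label))
        cursor = end  # keep searching to the right of the previous match
    return text, {"entities": entities}


example = annotate(
    "Apple is looking at buying U.K. startup for $1 billion",
    [("Apple", "ORG"), ("U.K.", "GPE"), ("$1 billion", "MONEY")],
)
print(example[1]["entities"])
# [(0, 5, 'ORG'), (27, 31, 'GPE'), (44, 54, 'MONEY')]
```

The returned tuple has the same shape as an entry of `TRAIN_DATA`, so a list of such tuples could be used in its place.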

In [5]:
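# Initializing the model weights and getting the optimizer
# (nlp.initialize() replaces the deprecated nlp.begin_training() in spaCy v3)
optimizer = nlp.initialize()

# Training loop
for epoch in range(25):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for text, annotations in TRAIN_DATA:
        doc = nlp.make_doc(text)
        example = Example.from_dict(doc, annotations)
        # Update on one example at a time with a dropout rate of 0.3
        nlp.update([example], sgd=optimizer, drop=0.3, losses=losses)
    print(f"Epoch {epoch} - Losses: {losses}")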

Epoch 0 - Losses: {'ner': 40.65151465497911}
Epoch 1 - Losses: {'ner': 23.448045816505328}
Epoch 2 - Losses: {'ner': 13.299232352014542}
Epoch 3 - Losses: {'ner': 12.874518994062331}
Epoch 4 - Losses: {'ner': 9.561684780608573}
Epoch 5 - Losses: {'ner': 6.453637711702084}
Epoch 6 - Losses: {'ner': 5.655721923847545}
Epoch 7 - Losses: {'ner': 5.098236141501728}
Epoch 8 - Losses: {'ner': 4.666470472041175}
Epoch 9 - Losses: {'ner': 5.2324986751278555}
Epoch 10 - Losses: {'ner': 25.301663676451966}
Epoch 11 - Losses: {'ner': 7.955636018965231}
Epoch 12 - Losses: {'ner': 3.367103139413463}
Epoch 13 - Losses: {'ner': 3.152870580628284}
Epoch 14 - Losses: {'ner': 3.033407881785544}
Epoch 15 - Losses: {'ner': 2.331697389661361}
Epoch 16 - Losses: {'ner': 1.416596895025783}
Epoch 17 - Losses: {'ner': 2.7620315660385444}
Epoch 18 - Losses: {'ner': 0.7548047652242961}
Epoch 19 - Losses: {'ner': 0.28635609202960477}
Epoch 20 - Losses: {'ner': 0.0051820196905337365}
Epoch 21 - Losses: {'ner': 0.00
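
The training loop shuffles `TRAIN_DATA` in place with an unseeded generator and updates on one example at a time, so the order of examples, and with it the loss curve, differs between runs. As a rough sketch of an alternative, assuming reproducible epoch order and small mini-batches are wanted, the generator below shuffles a copy of the data with a seeded `random.Random`. The name `epoch_batches`, the seed, and the batch size are illustrative choices, and the placeholder list stands in for `TRAIN_DATA`.

```python
import random


def epoch_batches(data, batch_size, rng):
    """Yield shuffled mini-batches without mutating the original list."""
    shuffled = list(data)  # copy so the caller's list keeps its order
    rng.shuffle(shuffled)
    for i in range(0, len(shuffled), batch_size):
        yield shuffled[i:i + batch_size]


rng = random.Random(0)  # ASSUMPTION: arbitrary seed, for illustration only
data = [f"example_{i}" for i in range(10)]  # placeholder for TRAIN_DATA

for epoch in range(2):
    batches = list(epoch_batches(data, batch_size=4, rng=rng))
    print(epoch, [len(batch) for batch in batches])
# 0 [4, 4, 2]
# 1 [4, 4, 2]
```

Inside the real loop, each batch would be turned into a list of `Example` objects and passed to `nlp.update` in a single call.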

In [6]:
test_texts = [
    "Twitter, Apple is looking at buying a startup in San Francisco for $1 billion",
    "Microsoft is launching a new product",
]

# Testing the model
for test_text in test_texts:
    doc = nlp(test_text)
    print(f"\nText: {test_text}")
    print("Recognized Entities:")
    for ent in doc.ents:
        print(f"{ent.text} ({ent.label_})")


Text: Twitter, Apple is looking at buying a startup in San Francisco for $1 billion
Recognized Entities:
Twitter (ORG)
Apple (ORG)
San Francisco (GPE)

Text: Microsoft is launching a new product
Recognized Entities:
Microsoft (ORG)
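
Reading the printed entities by eye works for two sentences, but a numeric score makes it easier to compare runs. Here is a small sketch of entity-level precision, recall, and F1 computed over sets of `(text, label)` pairs. The function `prf` is a hypothetical helper, and the gold annotations are hand-written assumptions that are not part of the notebook; the predicted set is copied from the output for the first test sentence.

```python
def prf(predicted, gold):
    """Return (precision, recall, F1) for two sets of (text, label) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Entities the model printed for the first test sentence
predicted = {("Twitter", "ORG"), ("Apple", "ORG"), ("San Francisco", "GPE")}
# ASSUMPTION: gold annotations written by hand for illustration
gold = predicted | {("$1 billion", "MONEY")}

precision, recall, f1 = prf(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=1.00 recall=0.75 f1=0.86
```

Comparing sets of `(text, label)` pairs ignores position, so a mention that occurs twice in a sentence counts once; comparing `(start_char, end_char, label)` triples instead would be stricter.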
