#🧪 Practical: Perform Named Entity Recognition with spaCy

#✅ Requirements:
Python 3.x

spaCy library

en_core_web_md model

#✅ Step-by-step Code with Comments
🔹 1. Install spaCy and the medium model

In [5]:
# 🧩 Step 1: Install spaCy and the medium model
# Run this in a Jupyter cell (not in a .py script)

%pip install -q spacy
!python -m spacy download en_core_web_md


Collecting en-core-web-md==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.8.0/en_core_web_md-3.8.0-py3-none-any.whl (33.5 MB)
     33.5/33.5 MB 33.0 MB/s eta 0:00:00
Installing collected packages: en-core-web-md
Successfully installed en-core-web-md-3.8.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [6]:
# 🧩 Step 2: Import and load the spaCy medium English model

import spacy

# Load the medium model
nlp = spacy.load("en_core_web_md")
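
If the kernel was not restarted after Step 1, or the download failed, `spacy.load` raises an error. A minimal sketch of a pre-flight check follows. The helper names `model_is_installed` and `load_model` are hypothetical (they are not part of spaCy), and the check relies only on the standard library, on the assumption that the model was installed as a regular Python package, as the download output above indicates.

```python
import importlib.util

MODEL_NAME = "en_core_web_md"


def model_is_installed(name: str) -> bool:
    """Return True if a top-level package called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def load_model(name: str = MODEL_NAME):
    """Load a spaCy pipeline, failing with a readable hint if it is missing."""
    if not model_is_installed(name):
        raise RuntimeError(
            f"Model '{name}' not found. Run: python -m spacy download {name} "
            "and restart the kernel."
        )
    import spacy  # imported lazily so the check above works without spaCy

    return spacy.load(name)


if model_is_installed("spacy") and model_is_installed(MODEL_NAME):
    nlp = load_model()
    # Lists the pipeline components; "ner" is the one used in Step 4
    print(nlp.pipe_names)
else:
    print(f"{MODEL_NAME} is not available in this environment yet.")
```

Checking before loading turns a long traceback into a one-line instruction, which is handy when the notebook is shared with others.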


In [7]:
# 🧩 Step 3: Create a short synthetic text sample for NER

text = """
In June 2022, Emma Johnson, a data scientist from IBM, presented a keynote speech at the AI Summit in New York.
The conference was attended by professionals from Google, Microsoft, and OpenAI.
Dr. Alan Smith from Stanford University discussed the future of AGI.
Meanwhile, Elon Musk tweeted about the progress at Tesla and SpaceX.
Back in Europe, Anna Müller from the University of Berlin led a study on multilingual NLP.
The report was later published in Nature and covered by BBC News and The Guardian.
"""

print("📄 Sample Text:\n")
print(text)


📄 Sample Text:


In June 2022, Emma Johnson, a data scientist from IBM, presented a keynote speech at the AI Summit in New York.
The conference was attended by professionals from Google, Microsoft, and OpenAI.
Dr. Alan Smith from Stanford University discussed the future of AGI.
Meanwhile, Elon Musk tweeted about the progress at Tesla and SpaceX.
Back in Europe, Anna Müller from the University of Berlin led a study on multilingual NLP.
The report was later published in Nature and covered by BBC News and The Guardian.



In [8]:
# 🧩 Step 4: Process the text and extract named entities

# Apply the NLP pipeline to the text
doc = nlp(text)

# Display named entities with labels
print("\n🔍 Named Entities Detected:\n")
for ent in doc.ents:
    print(f"{ent.text:<30} --> {ent.label_:<10} ({spacy.explain(ent.label_)})")



🔍 Named Entities Detected:

June 2022                      --> DATE       (Absolute or relative dates or periods)
Emma Johnson                   --> PERSON     (People, including fictional)
IBM                            --> ORG        (Companies, agencies, institutions, etc.)
the AI Summit                  --> ORG        (Companies, agencies, institutions, etc.)
New York                       --> GPE        (Countries, cities, states)
Google                         --> ORG        (Companies, agencies, institutions, etc.)
Microsoft                      --> ORG        (Companies, agencies, institutions, etc.)
OpenAI                         --> GPE        (Countries, cities, states)
Alan Smith                     --> PERSON     (People, including fictional)
Stanford University            --> ORG        (Companies, agencies, institutions, etc.)
AGI                            --> ORG        (Companies, agencies, institutions, etc.)
Elon Musk                      --> PERSON     (People, in
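
Once the entities are printed, a common next step is to group them by label. The sketch below works on plain `(text, label)` pairs copied from the first rows of the output above, so it runs without the model; with a live `doc`, the same list could be built as `[(ent.text, ent.label_) for ent in doc.ents]`. The helper name `group_by_label` is hypothetical.

```python
from collections import defaultdict

# A few (text, label) pairs copied from the output above.
# With a live `doc`: [(ent.text, ent.label_) for ent in doc.ents]
entity_pairs = [
    ("June 2022", "DATE"),
    ("Emma Johnson", "PERSON"),
    ("IBM", "ORG"),
    ("New York", "GPE"),
    ("Google", "ORG"),
    ("Alan Smith", "PERSON"),
]


def group_by_label(pairs):
    """Collect entity texts under their label, keeping first-seen order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


grouped = group_by_label(entity_pairs)
for label, texts in grouped.items():
    print(f"{label:<8} {', '.join(texts)}")
```

Grouping makes it easier to spot questionable predictions, for example an organization that the model has tagged as GPE.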

In [9]:
# 🧩 Step 5 (Optional): Visualize entities using displaCy

from spacy import displacy

# Visualize named entities in a Jupyter cell
displacy.render(doc, style="ent", jupyter=True)
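
Where HTML output is not convenient (for example, a plain terminal or a log file), a text-only fallback can be sketched from the character offsets that each entity exposes as `ent.start_char` and `ent.end_char`. The helper `mark_entities` below is hypothetical and is not part of spaCy or displaCy. It assumes the spans do not overlap, and the demo offsets are typed in by hand rather than produced by the model.

```python
def mark_entities(text, spans):
    """Wrap each (start, end, label) character span as [text|LABEL].

    Assumes spans do not overlap. With a live `doc`, spans could be built as
    [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents].
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])               # untouched text before the entity
        pieces.append(f"[{text[start:end]}|{label}]")   # bracketed entity
        cursor = end
    pieces.append(text[cursor:])                        # trailing text after the last entity
    return "".join(pieces)


sample = "Emma Johnson works at IBM."
sample_spans = [(0, 12, "PERSON"), (22, 25, "ORG")]  # hand-written offsets for the demo
marked = mark_entities(sample, sample_spans)
print(marked)
```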


#✅ Learning Outcomes
Understand how to set up and run spaCy’s NER pipeline.

Practice on a short synthetic text sample.

Identify PERSON, ORG, GPE, DATE, and other entity types.

Visualize NER with displaCy.