# Named Entity Recognition (NER) with spaCy

In this notebook, I use spaCy's pre-trained English model to perform Named Entity Recognition (NER) and show how to process text and extract named entities. The notebook also includes short examples of part-of-speech tagging with spaCy and sentiment analysis with TextBlob.

## 1. Installation and Setup

First, ensure you have spaCy installed and the English model downloaded. Run the following commands in your Jupyter Notebook:



In [1]:
%pip install spacy textblob

Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip available: 22.3.1 -> 24.1.2
[notice] To update, run: python.exe -m pip install --upgrade pip


In [11]:
!python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
     --------------------------------------- 12.8/12.8 MB 13.9 MB/s eta 0:00:00
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')





In [14]:
import spacy
from spacy import displacy
from textblob import TextBlob

In [2]:
# Loading pre-trained spaCy model
nlp = spacy.load("en_core_web_sm")


In [3]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [4]:
# Sample text
text = "Elon Musk is looking to buy Twitter .Tim Cook is the CEO of Apple. Apple is looking at buying U.K. startup for $1 billion.  Amazon, founded by Jeff Bezos in 1994, announced a partnership with Microsoft to enhance cloud computing services. The collaboration aims to compete with Google's Cloud platform. Meanwhile, Tesla's CEO Elon Musk tweeted about launching a new electric truck in 2025. In Europe, the European Central Bank (ECB) is considering digital euro adoption, which could impact the cryptocurrency market significantly. Additionally, the new series of the popular Netflix show 'Stranger Things' is set to release next month, sparking excitement among fans worldwide. In sports, Serena Williams announced her retirement after the US Open, a tournament she has won six times. The COVID-19 pandemic continues to affect global markets, with companies like Pfizer and Moderna are working on updated vaccines. In Asia, Alibaba's shares surged after a record-breaking Singles' Day sales event, generating over $30 billion in revenue. At the same time, climate activists in Australia are protesting against the Adani coal mine project, citing environmental concerns."

# Processing the text with spaCy pipeline
doc = nlp(text)
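The sample text contains `Twitter .Tim Cook`, with the space on the wrong side of the period. A minimal sketch of a cleanup step that could run before the text is passed to `nlp` is shown below. The helper name `fix_space_before_period` is hypothetical, and the regular expression is an assumption that only targets this specific pattern (it would also rewrite something like ` .5`, so treat it as illustrative).

```python
import re


def fix_space_before_period(raw: str) -> str:
    """Move a period that follows whitespace back onto the preceding word."""
    # "Twitter .Tim" -> "Twitter. Tim"; "U.K." is left alone because
    # no whitespace precedes its periods.
    return re.sub(r"\s+\.(?=\S)", ". ", raw)


sample = (
    "Elon Musk is looking to buy Twitter .Tim Cook is the CEO of Apple. "
    "Apple is looking at buying U.K. startup."
)
cleaned = fix_space_before_period(sample)
print(cleaned)
```

Whether this changes the entities spaCy finds depends on the model, so it is worth comparing the output with and without the cleanup.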

In [5]:
# Extracting and displaying named entities
for ent in doc.ents:
    print(f"Entity: {ent.text}, Label: {ent.label_}")

Entity: Elon Musk, Label: PERSON
Entity: Cook, Label: PERSON
Entity: Apple, Label: ORG
Entity: Apple, Label: ORG
Entity: U.K., Label: GPE
Entity: $1 billion, Label: MONEY
Entity: Amazon, Label: ORG
Entity: Jeff Bezos, Label: PERSON
Entity: 1994, Label: DATE
Entity: Microsoft, Label: ORG
Entity: Google, Label: ORG
Entity: Tesla, Label: ORG
Entity: Elon Musk, Label: PERSON
Entity: 2025, Label: DATE
Entity: Europe, Label: LOC
Entity: the European Central Bank, Label: ORG
Entity: Netflix, Label: ORG
Entity: Stranger Things', Label: WORK_OF_ART
Entity: next month, Label: DATE
Entity: Serena Williams, Label: PERSON
Entity: the US Open, Label: EVENT
Entity: six, Label: CARDINAL
Entity: COVID-19, Label: ORG
Entity: Pfizer, Label: ORG
Entity: Moderna, Label: PERSON
Entity: Asia, Label: LOC
Entity: Alibaba, Label: GPE
Entity: over $30 billion, Label: MONEY
Entity: Australia, Label: GPE
Entity: Adani, Label: ORG
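With this many entities, a grouped view can be easier to scan than a flat list. The sketch below groups distinct entity texts by label. It works on plain `(text, label)` pairs, here a few copied from the output above, and the helper name `group_by_label` is hypothetical.

```python
from collections import defaultdict

# A few (text, label) pairs copied from the output above.
pairs = [
    ("Elon Musk", "PERSON"),
    ("Apple", "ORG"),
    ("Apple", "ORG"),
    ("U.K.", "GPE"),
    ("Jeff Bezos", "PERSON"),
    ("Elon Musk", "PERSON"),
]


def group_by_label(entity_pairs):
    """Map each label to its distinct entity texts, in first-seen order."""
    grouped = defaultdict(list)
    for text, label in entity_pairs:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


grouped = group_by_label(pairs)
for label, texts in grouped.items():
    print(f"{label}: {', '.join(texts)}")
```

In the notebook, the same helper could be fed with `[(ent.text, ent.label_) for ent in doc.ents]`.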


In [6]:
# Visualize the named entities inline with displaCy
displacy.render(doc, style="ent", jupyter=True)
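The entity output above contains a few questionable labels, for example `Moderna` as `PERSON` and `Alibaba` as `GPE`. One possible way to handle known names is spaCy's rule-based `entity_ruler` component. The sketch below uses a blank English pipeline so that it runs without the statistical model. The two patterns and the variable names `nlp_rules`, `doc_rules`, and `corrected` are assumptions chosen for illustration.

```python
import spacy

# Blank English pipeline: tokenizer only, no statistical components.
nlp_rules = spacy.blank("en")

# Rule-based matcher that assigns fixed labels to known names.
ruler = nlp_rules.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Moderna"},
    {"label": "ORG", "pattern": "Alibaba"},
])

doc_rules = nlp_rules(
    "Pfizer and Moderna are working on updated vaccines. "
    "Alibaba's shares surged."
)
corrected = [(ent.text, ent.label_) for ent in doc_rules.ents]
print(corrected)
```

To combine rules with the pre-trained model, the ruler can be added with `nlp.add_pipe("entity_ruler", before="ner")`, so that the statistical `ner` component keeps the spans the rules have already set.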

## Part-of-Speech (POS) Tagging

In [13]:
print("Part-of-Speech Tags:")
for token in doc:
    print(f"Token: {token.text}, POS: {token.pos_}, Tag: {token.tag_}")

Part-of-Speech Tags:
Token: Elon, POS: PROPN, Tag: NNP
Token: Musk, POS: PROPN, Tag: NNP
Token: is, POS: AUX, Tag: VBZ
Token: looking, POS: VERB, Tag: VBG
Token: to, POS: PART, Tag: TO
Token: buy, POS: VERB, Tag: VB
Token: Twitter, POS: PROPN, Tag: NNP
Token: .Tim, POS: PUNCT, Tag: .
Token: Cook, POS: PROPN, Tag: NNP
Token: is, POS: AUX, Tag: VBZ
Token: the, POS: DET, Tag: DT
Token: CEO, POS: NOUN, Tag: NN
Token: of, POS: ADP, Tag: IN
Token: Apple, POS: PROPN, Tag: NNP
Token: ., POS: PUNCT, Tag: .
Token: Apple, POS: PROPN, Tag: NNP
Token: is, POS: AUX, Tag: VBZ
Token: looking, POS: VERB, Tag: VBG
Token: at, POS: ADP, Tag: IN
Token: buying, POS: VERB, Tag: VBG
Token: U.K., POS: PROPN, Tag: NNP
Token: startup, POS: NOUN, Tag: NN
Token: for, POS: ADP, Tag: IN
Token: $, POS: SYM, Tag: $
Token: 1, POS: NUM, Tag: CD
Token: billion, POS: NUM, Tag: CD
Token: ., POS: PUNCT, Tag: .
Token:  , POS: SPACE, Tag: _SP
Token: Amazon, POS: PROPN, Tag: NNP
Token: ,, POS: PUNCT, Tag: ,
Token: founded, POS
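Consecutive `PROPN` tokens in the output above often form a single name. The sketch below joins such runs using `itertools.groupby`. It operates on `(token, pos)` pairs copied from the first lines of the output, and the helper name `proper_noun_runs` is hypothetical.

```python
from itertools import groupby

# (token, POS) pairs copied from the first lines of the output above.
tagged = [
    ("Elon", "PROPN"), ("Musk", "PROPN"), ("is", "AUX"),
    ("looking", "VERB"), ("to", "PART"), ("buy", "VERB"),
    ("Twitter", "PROPN"), (".Tim", "PUNCT"), ("Cook", "PROPN"),
]


def proper_noun_runs(token_pos_pairs):
    """Join each run of adjacent PROPN tokens into one string."""
    runs = []
    for is_propn, group in groupby(token_pos_pairs, key=lambda pair: pair[1] == "PROPN"):
        if is_propn:
            runs.append(" ".join(text for text, _ in group))
    return runs


runs = proper_noun_runs(tagged)
print(runs)
```

The result also shows a side effect of the `.Tim` token: because it is tagged `PUNCT`, `Cook` ends up as a run of its own. In the notebook, the pairs could come from `[(token.text, token.pos_) for token in doc]`.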

## Sentiment Analysis

In [8]:
text1 = "I love programming in Python. It's such a versatile language!"

blob = TextBlob(text1)
sentiment = blob.sentiment

print("Sentiment Analysis:")
print(f"Polarity: {sentiment.polarity}")
print(f"Subjectivity: {sentiment.subjectivity}")

Sentiment Analysis:
Polarity: 0.25
Subjectivity: 0.55
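TextBlob reports polarity on a scale from -1.0 (negative) to 1.0 (positive) and subjectivity from 0.0 (objective) to 1.0 (subjective). A minimal sketch for turning the polarity score into a coarse label follows. The helper name `label_polarity` and the `0.05` threshold are assumptions, not part of TextBlob.

```python
def label_polarity(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label."""
    # The threshold defines a small neutral band around zero (assumed value).
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# 0.25 is the polarity printed in the output above.
print(label_polarity(0.25))
```

In the notebook, this could be called as `label_polarity(sentiment.polarity)`.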
