## NER (Named Entity Recognition)
Named Entity Recognition (NER) is a Natural Language Processing (NLP) subtask that identifies spans of text naming real-world entities and classifies them into predefined categories such as persons, organizations, locations, dates, and monetary values. It is widely used in information extraction, question answering, and text summarization.


In [1]:
import spacy

nlp = spacy.load('en_core_web_sm')

In [2]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [8]:
# Variants to try; casing may change which entities the model picks up:
# doc = nlp("Tesla inc is going to acquire twitter inc for $45 billion")
# doc = nlp("Tesla inc is going to acquire twitter for $45 billion")

doc = nlp("Tesla inc is going to acquire Twitter inc for $45 billion")


for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Tesla inc | ORG | Companies, agencies, institutions, etc.
Twitter inc | ORG | Companies, agencies, institutions, etc.
$45 billion | MONEY | Monetary values, including unit


In [9]:
nlp.pipe_labels['ner']

['CARDINAL',
 'DATE',
 'EVENT',
 'FAC',
 'GPE',
 'LANGUAGE',
 'LAW',
 'LOC',
 'MONEY',
 'NORP',
 'ORDINAL',
 'ORG',
 'PERCENT',
 'PERSON',
 'PRODUCT',
 'QUANTITY',
 'TIME',
 'WORK_OF_ART']
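
The label names above are terse, so it can help to print each one next to its description. This is a minimal sketch, assuming the label list is copied by hand from the output above so that no model has to be loaded; `spacy.explain` is the same glossary lookup used in the earlier cell, and the variable names are illustrative.

```python
import spacy

# Labels copied from the nlp.pipe_labels['ner'] output above.
ner_labels = [
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE",
    "LAW", "LOC", "MONEY", "NORP", "ORDINAL", "ORG",
    "PERCENT", "PERSON", "PRODUCT", "QUANTITY", "TIME", "WORK_OF_ART",
]

# spacy.explain is a plain glossary lookup; it returns None for unknown terms.
descriptions = {label: spacy.explain(label) for label in ner_labels}

for label, description in descriptions.items():
    print(f"{label:<12} {description}")
```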

In [19]:
doc = nlp('Apple is a mobile company. Apple is good for health, we should eat it daily.')

for ent in doc.ents:
    print(ent, "|", ent.label_)

Apple | ORG
Apple | ORG
daily | DATE

Both mentions come back as ORG, although the second one refers to the fruit. The model assigns labels from statistical context, so ambiguous words like this can be mislabeled, and its label set has no category for food.


### Span

A `Span` is a slice of a `Doc`: a contiguous run of tokens addressed by token index, with an exclusive end. Each entity in `doc.ents` is itself a `Span`.

In [12]:
doc

Apple is a mobile company. Apple is good for health, we should eat it daily.

In [14]:
print(doc[2:6])
print(type(doc[2:6]))

a mobile company.
<class 'spacy.tokens.span.Span'>
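
A span also records where it sits in its parent document, both in token indices and in character offsets. The sketch below assumes a blank English pipeline (`spacy.blank("en")`, tokenizer only) rather than `en_core_web_sm`, since slicing needs nothing beyond tokenization; `blank_nlp` and `demo_doc` are illustrative names chosen so the notebook's own `nlp` and `doc` stay untouched.

```python
import spacy

# Assumption: a blank pipeline (tokenizer only) is enough to demonstrate slicing.
blank_nlp = spacy.blank("en")
demo_doc = blank_nlp(
    "Apple is a mobile company. Apple is good for health, we should eat it daily."
)

span = demo_doc[2:6]  # tokens 2, 3, 4 and 5; the end index is exclusive

print(span.text)                        # the covered text
print(span.start, span.end)             # token indices within demo_doc
print(span.start_char, span.end_char)   # character offsets within demo_doc.text
print([token.text for token in span])   # the individual tokens

# Character offsets index straight into the original string.
same_text = demo_doc.text[span.start_char:span.end_char]
```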


In [16]:
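from spacy.tokens import Span

# Tokens 0 and 6 are the two "Apple" tokens; Span takes (doc, start, end) with an exclusive end.
s1 = Span(doc, 0, 1, label="ORG")
s2 = Span(doc, 6, 7, label="ORG")

# default="unmodified" keeps every existing entity the new spans do not overlap (here, "daily").
doc.set_ents([s1, s2], default="unmodified")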

In [17]:
for ent in doc.ents:
    print(ent, "|", ent.label_)

Apple | ORG
Apple | ORG
daily | DATE
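
Because the model had already tagged both mentions as ORG, the output above is the same as it was before the `set_ents` call. A variation that makes the effect visible is sketched below, under two assumptions: a blank English pipeline, which predicts no entities of its own, and a hypothetical custom label `FRUIT`, which is not one of the model's labels.

```python
import spacy
from spacy.tokens import Span

# Assumption: a blank pipeline, so every entity below is one we set by hand.
blank_nlp = spacy.blank("en")
demo_doc = blank_nlp(
    "Apple is a mobile company. Apple is good for health, we should eat it daily."
)

company = Span(demo_doc, 0, 1, label="ORG")
fruit = Span(demo_doc, 6, 7, label="FRUIT")  # hypothetical label, not in the model's label set

# Entities outside the two spans are left as they were (none, in a blank pipeline).
demo_doc.set_ents([company, fruit], default="unmodified")

labelled = [(ent.text, ent.label_) for ent in demo_doc.ents]
for text, label in labelled:
    print(text, "|", label)
```

Manually set spans like these are stored on the `Doc` only; they do not teach the model anything, so a new call to the pipeline on the same text would start from its own predictions again.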
