#### Named Entity Recognition (NER)

- NER means finding the names of people, places, organizations, etc. in a text.
- NER is a subtask of information extraction that locates and classifies named entities in a text.
- The named entities could be organizations, persons, locations, times, etc.

NER is used in many areas of Natural Language Processing (NLP), and it can help answer many real-world questions, such as:
1. Which companies were mentioned in the news article?
2. Were specified products mentioned in complaints or reviews?
3. Does the tweet contain the name of a person? Does the tweet contain this person’s location?
4. Which people are mentioned in the blog posts?
5. Which geographic locations are talked about in the tweets?
6. Does the text contain any dates or times?
7. What are the person’s name and location?
8. Which products and services are mentioned?
9. Which organizations, percentages, time references, numbers, dates, events, and locations are mentioned?

**Important Use Cases**
1. Search
2. Recommendation Systems
3. Customer Care

In [1]:
import spacy
nlp = spacy.load('en_core_web_sm')

In [2]:
nlp.pipe_names

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

In [7]:
doc = nlp("Tesla Inc is going to acquire Twitter Inc for $45 billion")

for ent in doc.ents:
    print(ent.text, " | ", ent.label_, " | ", str(spacy.explain(ent.label_)))

Tesla Inc  |  ORG  |  Companies, agencies, institutions, etc.
Twitter Inc  |  ORG  |  Companies, agencies, institutions, etc.
$45 billion  |  MONEY  |  Monetary values, including unit


In [8]:
from spacy import displacy
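# In Jupyter, render() draws the highlighted entities inline and returns None,
# so there is nothing useful to assign (and `display` would shadow IPython's helper).
displacy.render(doc, style='ent')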

In [10]:
nlp.pipe_labels["ner"]

['CARDINAL',
 'DATE',
 'EVENT',
 'FAC',
 'GPE',
 'LANGUAGE',
 'LAW',
 'LOC',
 'MONEY',
 'NORP',
 'ORDINAL',
 'ORG',
 'PERCENT',
 'PERSON',
 'PRODUCT',
 'QUANTITY',
 'TIME',
 'WORK_OF_ART']

The list of entity labels is also documented on this page: https://spacy.io/models/en

In [12]:
doc = nlp("Michael Bloomberg founded Bloomberg in 1982")
for ent in doc.ents:
    print(ent.text, "|", ent.label_, "|", spacy.explain(ent.label_))

Bloomberg | PERSON | People, including fictional
Bloomberg | PERSON | People, including fictional
1982 | DATE | Absolute or relative dates or periods


**Setting custom entities**

In [19]:
doc = nlp("Tesla is going to acquire Twitter for $45 billion")
for ent in doc.ents:
    print(ent.text, " | ", ent.label_)

type(doc.ents)
type(doc.ents[0])

Tesla  |  ORG
$45 billion  |  MONEY


spacy.tokens.span.Span

In [21]:
from spacy.tokens import Span

s1 = Span(doc, 0, 1, label="ORG")
s2 = Span(doc, 5, 6, label="ORG")

doc.set_ents([s1, s2], default="unmodified")

for ent in doc.ents:
    print(ent.text, " | ", ent.label_)

Tesla  |  ORG
Twitter  |  ORG
$45 billion  |  MONEY


However, setting spans by hand like this is not the standard way to add or correct entities in most NER work.

**Approaches:**
1. Simple lookup: manually maintain a database of known entities and tag the text wherever one is found.
2. Rule-based NER
3. Machine learning: CRF (Conditional Random Fields), BERT
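
The first two approaches can be sketched without any NLP library. The block below is only an illustration: the `GAZETTEER` entries, the `RULES` patterns, and the function names are hypothetical and chosen to match the example sentences above. It is not how spaCy implements NER, and real systems need far larger lists and more careful rules.

```python
import re

# Approach 1 (simple lookup): a hypothetical hand-maintained "database"
# mapping known surface forms to entity labels.
GAZETTEER = {
    "tesla": "ORG",
    "twitter": "ORG",
    "michael bloomberg": "PERSON",
}

# Approach 2 (rule-based): hypothetical regex rules, one per label.
RULES = [
    ("MONEY", re.compile(r"\$\d+(?:\.\d+)?(?: (?:thousand|million|billion))?")),
    ("DATE", re.compile(r"\b(?:19|20)\d{2}\b")),  # four-digit years only
]


def lookup_entities(text):
    """Return (start, end, surface, label) for every gazetteer phrase found."""
    found = []
    for phrase, label in GAZETTEER.items():
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), m.group(), label))
    return found


def rule_entities(text):
    """Return (start, end, surface, label) for every rule match."""
    found = []
    for label, pattern in RULES:
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), m.group(), label))
    return found


def extract_entities(text):
    """Combine both approaches and return (surface, label) in text order."""
    spans = sorted(lookup_entities(text) + rule_entities(text))
    return [(surface, label) for _, _, surface, label in spans]


for surface, label in extract_entities(
    "Tesla is going to acquire Twitter for $45 billion"
):
    print(surface, " | ", label)
```

Note that this sketch does not resolve overlapping matches (for example, a year-like number inside a money amount would be reported twice), and a lookup can only find names that are already in the list. Those limits are part of why learned models such as CRF or BERT are used for the third approach.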
