__Named Entity Recognition__

Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP) that identifies named entities in unstructured text, such as people, organizations, locations, and dates, and classifies them into predefined categories.

NER is a building block of many NLP applications, including chatbots, sentiment analysis, recommender systems, and machine translation. Knowing which parts of a text refer to real-world entities helps these systems interpret the text and extract useful information from it.

For example, in a sentence like "John works at Google and lives in San Francisco," NER can extract the entities "John" as a person, "Google" as an organization, and "San Francisco" as a location.

NER is typically performed with machine learning models such as Conditional Random Fields (CRFs) and Recurrent Neural Networks (RNNs), which label the entities in text. These models are trained on large datasets that have been manually annotated with named entities.
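
Sequence labellers such as CRFs and RNNs usually emit one tag per token, and those tags then have to be grouped into entity spans. The sketch below assumes the common BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside any entity), which the text above does not specify. The function name `bio_to_spans`, the tag names, and the hand-written tags are illustrative and are not the output of a trained model.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) pairs."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: close any entity that is still open
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag, treated as outside): close the open entity
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hand-written tags for the example sentence (assumed, not predicted)
tokens = ["John", "works", "at", "Google", "and", "lives", "in", "San", "Francisco"]
tags = ["B-PER", "O", "O", "B-ORG", "O", "O", "O", "B-LOC", "I-LOC"]

entities = bio_to_spans(tokens, tags)
print(entities)
# [('John', 'PER'), ('Google', 'ORG'), ('San Francisco', 'LOC')]
```

Multi-token entities such as "San Francisco" are the reason for the `B-`/`I-` distinction: without it, two adjacent entities of the same type could not be told apart.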

Overall, NER is a key tool for extracting structured information from unstructured text, with applications across many industries.

In [1]:
# Perform standard imports
import spacy
nlp = spacy.load('en_core_web_sm')

In [2]:
# Write a function to display basic entity info
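def show_ents(doc):
    if doc.ents:
        for ent in doc.ents:
            # Print the entity text, its label, and spaCy's description of the label
            print(ent.text+' - '+ent.label_+' - '+str(spacy.explain(ent.label_)))
    else:
        # Only reached when the doc contains no entities
        print('No named entities found.')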

In [3]:
doc = nlp(u'Hi, everyone in Ashiqur Rahman')

In [4]:
show_ents(doc)

Ashiqur Rahman - GPE - Countries, cities, states


In [5]:
doc2 = nlp(u'May i go to Bangladesh or Canada, next month to see the virous report?')
show_ents(doc2)

Bangladesh - GPE - Countries, cities, states
Canada - GPE - Countries, cities, states
next month - DATE - Absolute or relative dates or periods


### Adding a named entity to a span

In [7]:
doc = nlp(u'Ashiqur to build a u.k factory for $6 million')
show_ents(doc)

u.k - GPE - Countries, cities, states
$6 million - MONEY - Monetary values, including unit


In [8]:
from spacy.tokens import Span

# Get the hash value of the PERSON entity label
person_label = doc.vocab.strings[u'PERSON']

# Create a span covering token 0 ("Ashiqur"); the end index is exclusive
new_ent = Span(doc, 0, 1, label=person_label)

# Add the entity to the existing doc object
doc.ents = list(doc.ents) + [new_ent]

In [9]:
show_ents(doc)

Ashiqur - PERSON - People, including fictional
u.k - GPE - Countries, cities, states
$6 million - MONEY - Monetary values, including unit
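
When an entity's position is known as character offsets rather than token indices, `Doc.char_span` can build the span instead. The following is a minimal sketch, assuming a recent spaCy version in which labels can be passed as plain strings. It uses `spacy.blank("en")`, a tokenizer-only pipeline, so every entity here is added by hand and nothing is predicted by a model. The helper name `add_entity` is hypothetical.

```python
import spacy

# Tokenizer-only pipeline: no trained components, so doc.ents starts out empty
blank_nlp = spacy.blank("en")
doc = blank_nlp("Ashiqur to build a u.k factory for $6 million")


def add_entity(doc, phrase, label):
    """Hypothetical helper: label the first occurrence of `phrase` in the doc."""
    start = doc.text.index(phrase)
    span = doc.char_span(start, start + len(phrase), label=label)
    if span is None:
        # char_span returns None when the offsets do not fall on token boundaries
        raise ValueError(f"{phrase!r} does not align with token boundaries")
    # doc.ents is a tuple, so build a new list and assign it back
    doc.ents = list(doc.ents) + [span]


add_entity(doc, "Ashiqur", "PERSON")
add_entity(doc, "$6 million", "MONEY")

for ent in doc.ents:
    # start and end are token indices; end is exclusive
    print(ent.text, ent.label_, ent.start, ent.end)
```

Working from character offsets avoids counting tokens by hand, which is easy to get wrong for strings such as "$6", where the tokenizer may split the currency symbol from the number.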


### Visualizing named entities


In [10]:
import spacy
nlp = spacy.load('en_core_web_sm')

In [11]:
from spacy import displacy

In [12]:
doc = nlp(u'Over the last quarter Apple sold nearly 20 thousand iPhone for a profit of $10 million.')
displacy.render(doc, style='ent', jupyter=True)
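
Outside a notebook, `displacy.render` can return the markup as a string when `jupyter=False` is passed, and the `options` dictionary can restrict which labels are drawn and change their colors. The sketch below uses displaCy's manual mode (`manual=True`), so it needs no trained model. The entity offsets and labels are written by hand for illustration and are assumptions, not model predictions; the helper `char_ent` and the color value are hypothetical.

```python
from spacy import displacy

text = "Over the last quarter Apple sold nearly 20 thousand iPhone for a profit of $10 million."


def char_ent(text, phrase, label):
    """Hypothetical helper: build a manual-mode entity dict from a phrase."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase), "label": label}


# Manual mode expects the raw text plus entities as character offsets, in order
manual_doc = {
    "text": text,
    "ents": [char_ent(text, "Apple", "ORG"), char_ent(text, "$10 million", "MONEY")],
    "title": None,
}

# Show only ORG and MONEY, and override the color used for ORG
options = {"ents": ["ORG", "MONEY"], "colors": {"ORG": "#aa9cfc"}}

html = displacy.render(manual_doc, style="ent", manual=True, jupyter=False, options=options)
print(html[:200])
```

The returned `html` string can be written to a file or embedded in a web page. The same `options` dictionary can also be passed when rendering a `Doc` produced by a trained pipeline.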