# 1) Basics of Named Entity Recognition

Named Entity Recognition (NER) is a subtask of information extraction that locates named entities in text and classifies them into predefined categories such as persons, organizations, and locations.

spaCy features an extremely fast statistical entity recognition system that assigns labels to contiguous spans of tokens.

The default model identifies a variety of named and numeric entities, including companies, locations, organizations, and products.

In [None]:
# Official documentation
# https://spacy.io/usage/linguistic-features/#named-entities


In [None]:
# Import spaCy
import spacy

In [None]:
# load the English language library
nlp = spacy.load(name="en_core_web_sm")

In [None]:
# create a document object
document = nlp("Apple is looking at buying U.K. startup for $1 billion")

In [None]:
for entity in document.ents:
  print(entity.text, entity.start_char, entity.end_char, entity.label_, str(spacy.explain(entity.label_)))

Apple 0 5 ORG Companies, agencies, institutions, etc.
U.K. 27 31 GPE Countries, cities, states
$1 billion 44 54 MONEY Monetary values, including unit
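
The `start_char` and `end_char` values printed above are character offsets into the original string, with the end offset exclusive. A minimal sketch of that relationship, using plain string slicing on the sentence and offsets copied from the output above (the `slice_entities` helper is a hypothetical name used only for illustration):

```python
# Sentence and offsets are copied from the cell output above.
text = "Apple is looking at buying U.K. startup for $1 billion"
reported = [("Apple", 0, 5), ("U.K.", 27, 31), ("$1 billion", 44, 54)]


def slice_entities(text, offsets):
    """Recover each entity's text by slicing with its (start, end) offsets."""
    return [text[start:end] for _, start, end in offsets]


recovered = slice_entities(text, reported)
print(recovered)
```

Because the offsets index the raw text, they can be used to highlight or replace entities in the original string without going through the tokens.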


In [None]:
# Create another doc object
documentTwo = nlp("San Francisco considers banning sidewalk delivery robots")

In [None]:
for entity in documentTwo.ents:
  print(entity.text, entity.start_char, entity.end_char, entity.label_, str(spacy.explain(entity.label_)))

San Francisco 0 13 GPE Countries, cities, states


# 2) Adding a Named Entity with a Span

In [None]:
documentThree = nlp("facebook is hiring a new vice president in U.S.")

In [None]:
for entity in documentThree.ents:
  print(entity.text, entity.label_, str(spacy.explain(entity.label_)))

U.S. GPE Countries, cities, states


In [None]:
# The model missed the lowercase "facebook", so we add it as a named entity manually

# Import Span from spacy.tokens
from spacy.tokens import Span

In [None]:
# Look up the integer ID of the ORG entity label in the vocabulary's string store
ORG = documentThree.vocab.strings["ORG"]
print(ORG)

383
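
The string store maps in both directions, so the integer can be turned back into its label text. A small sketch, assuming spaCy v3 is installed; it uses `spacy.blank("en")` so no trained model is needed, and `MY_LABEL` is a made-up label used only for illustration:

```python
import spacy

# A blank English pipeline is enough here: only the vocabulary is used.
blank_nlp = spacy.blank("en")
strings = blank_nlp.vocab.strings

# Built-in label: string -> integer ID -> string
org_id = strings["ORG"]
org_text = strings[org_id]

# Hypothetical custom label: it must be added before the reverse lookup works
custom_id = strings.add("MY_LABEL")
custom_text = strings[custom_id]

print(org_text, custom_text)
```

The exact integer returned for a label may differ between spaCy versions, so it is safer to look it up than to hard-code it.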


In [None]:
# Create a Span for the new entity
# Token indices 0 to 1 (end index is exclusive) cover only the first token, "facebook"
newEntity = Span(documentThree, 0, 1, label=ORG)

# Add the entity to the existing Doc object
documentThree.ents = list(documentThree.ents) + [newEntity]

In [None]:
for entity in documentThree.ents:
  print(entity.text, entity.label_, str(spacy.explain(entity.label_)))

facebook ORG Companies, agencies, institutions, etc.
U.S. GPE Countries, cities, states
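
As an alternative sketch, recent spaCy versions also accept the label as a plain string and provide `Doc.set_ents` for adding entities without rebuilding the whole tuple. This assumes spaCy v3; it runs on a blank pipeline with no NER component, so only the manually added entity appears (`U.S.` is not detected here):

```python
import spacy
from spacy.tokens import Span

# Blank pipeline: tokenizer only, no trained NER (assumption for a self-contained demo)
blank_nlp = spacy.blank("en")
doc = blank_nlp("facebook is hiring a new vice president in U.S.")

# The label can be passed as a string instead of an integer ID
new_entity = Span(doc, 0, 1, label="ORG")

# default="unmodified" keeps the annotation of all other tokens as it was
doc.set_ents([new_entity], default="unmodified")

labels = [(ent.text, ent.label_) for ent in doc.ents]
print(labels)
```

With a trained pipeline such as `en_core_web_sm`, the same `set_ents` call would keep the entities the model already found, provided the new span does not overlap them.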


# 3) Visualizing Named Entities

In [None]:
# Import the displaCy visualizer
from spacy import displacy

In [None]:
displacy.render(docs=document, style="ent", jupyter=True)

In [None]:
# Show only specific entity types
options = {"ents": ["ORG", "MONEY"]}
displacy.render(docs=document, style="ent", jupyter=True, options=options)
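
Outside a notebook, `displacy.render` can return the markup as a string when `jupyter=False` is passed, and the `colors` option sets a background color per label. A minimal sketch, assuming spaCy v3; the entities are set by hand on a blank pipeline so that no trained model is required, and the color value is an arbitrary choice:

```python
import spacy
from spacy import displacy

# Blank pipeline plus hand-set entities (an assumption to keep the demo self-contained)
blank_nlp = spacy.blank("en")
doc = blank_nlp("Apple is looking at buying U.K. startup for $1 billion")
org = doc.char_span(0, 5, label="ORG")       # "Apple"
money = doc.char_span(44, 54, label="MONEY")  # "$1 billion"
doc.ents = [org, money]

# Highlight only ORG entities and give them a custom background color
options = {"ents": ["ORG"], "colors": {"ORG": "#ffd966"}}
html = displacy.render(doc, style="ent", jupyter=False, options=options)

print(html[:80])
```

The returned string can be written to an `.html` file or embedded in a web page.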