## Named Entity Recognition

Named Entity Recognition (NER) is a fundamental task in natural language processing (NLP): finding the spans of text that refer to named entities and assigning each one a category. Named entities are real-world objects that can be referred to by name, such as persons, organizations, and locations. In practice, most NER schemes also cover temporal and numeric expressions such as dates and monetary values. NER plays a crucial role in many NLP applications, including information extraction, question answering, and text summarization.

## Basics of NER:

<h3>Definition:</h3> 

NER is the process of identifying and classifying named entities in text into predefined categories. The commonly recognized entity types include person names, organizations, locations, date expressions, time expressions, monetary values, percentages, and more.

<h3>Entity Types:</h3> 
    
Named entities can be categorized into various types, depending on the domain and application. Common entity types include:

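Person: Names of individual people, real or fictional. Pronouns such as "she" or "they" are not named entities; linking them to a person is a separate task (coreference resolution).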

Organization: Company names, institutions, or other organized groups.

Location: Names of cities, countries, regions, or specific places.

Date: Expressions representing dates or periods.

Time: Expressions representing specific times or durations.

Money: Monetary values, including currencies.

Percentage: Numeric expressions representing percentages.

Miscellaneous: Other named entities that do not fit into the above categories.
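Concrete label sets vary by toolkit. As an illustration, the sketch below maps the categories above to the labels used by the English pipelines of spaCy, a Python NLP library, which follow the OntoNotes scheme. The names `CATEGORY_TO_SPACY_LABELS` and `category_of` are made up for this example, and the grouping is an approximation rather than an official table.

```python
# Approximate mapping from the generic categories above to spaCy's
# OntoNotes-style labels. The grouping is an illustrative assumption.
CATEGORY_TO_SPACY_LABELS = {
    "Person": ["PERSON"],
    "Organization": ["ORG"],
    "Location": ["GPE", "LOC", "FAC"],
    "Date": ["DATE"],
    "Time": ["TIME"],
    "Money": ["MONEY"],
    "Percentage": ["PERCENT"],
    "Miscellaneous": ["NORP", "PRODUCT", "EVENT", "WORK_OF_ART"],
}


def category_of(label):
    """Return the generic category for a spaCy label, or None if unmapped."""
    for category, labels in CATEGORY_TO_SPACY_LABELS.items():
        if label in labels:
            return category
    return None


print(category_of("GPE"))      # Location
print(category_of("PERCENT"))  # Percentage
```

A lookup like this is mainly useful when comparing output from tools that use different label names for the same kind of entity.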

<h3>Annotation:</h3> 

NER typically involves annotating a dataset with labeled named entities. Human annotators manually label the named entities in the text and assign appropriate entity types. This annotated data is then used to train NER models.
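One common way to store such annotations is as character offsets into the raw text, each paired with a label. The sketch below assumes that representation; the helper names `spans_to_text` and `has_overlap` are hypothetical, and the sentence and spans are illustrative rather than taken from a real dataset.

```python
# One annotated example: raw text plus (start, end, label) character spans.
# End offsets are exclusive, as in Python slicing.
text = "Apple Inc. is headquartered in Cupertino, California."
annotations = [(0, 10, "ORG"), (31, 40, "GPE"), (42, 52, "GPE")]


def spans_to_text(text, spans):
    """Return the (surface string, label) pair for each annotated span."""
    return [(text[start:end], label) for start, end, label in spans]


def has_overlap(spans):
    """Return True if any two spans overlap; flat NER spans should not."""
    ordered = sorted(spans)
    return any(prev[1] > cur[0] for prev, cur in zip(ordered, ordered[1:]))


print(spans_to_text(text, annotations))
print(has_overlap(annotations))
```

Checking that every span slices out the intended string and that no spans overlap is a cheap sanity test to run on annotated data before training.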

<h3> Applications:</h3> 

NER is used in a wide range of applications, including:

<h4>Information Extraction:</h4>
Extracting structured information from unstructured text, such as person names and organizations from news articles.

<h4>Question Answering:</h4>
Identifying the entities that can answer a specific question, such as finding the location entity that answers "Where was the conference held?"

<h4>Text Summarization:</h4>
Recognizing important named entities to generate informative summaries.

<h4>Document Classification:</h4>
Augmenting document classification models with named entity information for improved performance.
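To make the question answering case concrete, here is a minimal sketch that narrows candidate answers by entity type. It assumes the entities of a passage have already been extracted as (text, label) pairs. The mapping `EXPECTED_LABELS`, the function `candidate_answers`, and the sample entities are all hypothetical, and the label names follow spaCy's English pipelines (`GPE`, `PERSON`, `DATE`).

```python
# Hypothetical mapping from a question word to the entity labels that
# could plausibly answer it.
EXPECTED_LABELS = {
    "where": {"GPE", "LOC", "FAC"},
    "who": {"PERSON", "ORG"},
    "when": {"DATE", "TIME"},
}


def candidate_answers(question, entities):
    """Keep only entities whose label fits the question's first word."""
    words = question.lower().split()
    wanted = EXPECTED_LABELS.get(words[0], set()) if words else set()
    return [text for text, label in entities if label in wanted]


# Entities assumed to come from a passage about the conference (made up).
passage_entities = [
    ("Lisbon", "GPE"),
    ("March 2023", "DATE"),
    ("Ana Silva", "PERSON"),
]

print(candidate_answers("Where was the conference held?", passage_entities))
```

A real system would rank the remaining candidates using the passage context; filtering by type only shrinks the search space.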

```python
import spacy

# Load the pre-trained English model
# (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
```

```python
# Text to be processed
text = "Apple Inc. is headquartered in Cupertino, California."

# Process the text with NER
doc = nlp(text)

# Iterate over the entities in the document
for entity in doc.ents:
    print(entity.text, entity.label_)
```

Output:

```
Apple Inc. ORG
Cupertino GPE
California GPE
```


```python
# Print the entities as a list of (text, label) tuples
print([(entity.text, entity.label_) for entity in doc.ents])
```

Output:

```
[('Apple Inc.', 'ORG'), ('Cupertino', 'GPE'), ('California', 'GPE')]
```


The model labels "Apple Inc." as an organization (ORG) and both "Cupertino" and "California" as geopolitical entities (GPE), the label spaCy's English models use for countries, cities, and states. All three entities in the sentence are recognized correctly.
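As a small step toward information extraction, the (text, label) pairs can be grouped into a structured record. The sketch below works on the pairs printed above, so it runs without spaCy; the function name `group_by_label` is an assumption made for this example.

```python
from collections import defaultdict


def group_by_label(pairs):
    """Group entity texts by label, keeping the order in which they appear."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


# The (text, label) pairs printed by the spaCy example above.
pairs = [("Apple Inc.", "ORG"), ("Cupertino", "GPE"), ("California", "GPE")]

print(group_by_label(pairs))
# {'ORG': ['Apple Inc.'], 'GPE': ['Cupertino', 'California']}
```

With a live pipeline, the same function could be fed `[(ent.text, ent.label_) for ent in doc.ents]` directly.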