**Named Entity Recognition (NER)** is a natural language processing (NLP) technique that locates mentions of entities in text and classifies them into predefined categories such as **names of people, organizations, locations, dates,** and **quantities**. Extracting these "named entities" turns unstructured text into structured information, which supports tasks such as information retrieval, question answering, and text summarization.

### Categories Commonly Detected in NER:
- **Person**: Names of people (e.g., "Elon Musk")
- **Organization**: Companies, agencies, institutions (e.g., "Google", "United Nations")
- **Location**: Geographical locations (e.g., "Paris", "Mount Everest")
- **Date/Time**: Specific dates or times (e.g., "January 1, 2022", "10:00 AM")
- **Money/Percent/Quantity**: Financial values, percentages, or measurements (e.g., "$1 million", "20%")

[...] for more advanced entity recognition and custom entity training.
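
One common way to represent NER results is as a list of labeled character spans. The sketch below is illustrative only: `Entity` and `annotate` are hypothetical names, the sentence is made up from the examples in the list above, and the labels are assigned by hand rather than predicted by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A labeled span of text (hypothetical container, not a library type)."""
    text: str
    label: str
    start: int  # index of the first character in the source text
    end: int    # index one past the last character


def annotate(text, mentions):
    """Locate each hand-labeled (mention, label) pair in `text`.

    Only the first occurrence of each mention is recorded; mentions
    that do not occur in the text are skipped.
    """
    entities = []
    for mention, label in mentions:
        start = text.find(mention)
        if start != -1:
            entities.append(Entity(mention, label, start, start + len(mention)))
    return entities


# Made-up sentence built from the examples in the category list.
text = "Elon Musk visited Paris on January 1, 2022."
# Labels assigned by hand for illustration.
mentions = [
    ("Elon Musk", "Person"),
    ("Paris", "Location"),
    ("January 1, 2022", "Date/Time"),
]

for entity in annotate(text, mentions):
    print(f"{entity.label:<10} {entity.text!r} [{entity.start}:{entity.end}]")
```

A real NER system predicts the mentions and labels itself; this only shows the shape of the output that such a system typically produces.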

In [4]:
import nltk
# Pretrained data required by ne_chunk: the NE chunker model and a word list
nltk.download('maxent_ne_chunker_tab')
nltk.download('words')

[nltk_data] Downloading package maxent_ne_chunker_tab to
[nltk_data]     C:\Users\sayan\AppData\Roaming\nltk_data...
[nltk_data]   Package maxent_ne_chunker_tab is already up-to-date!
[nltk_data] Downloading package words to
[nltk_data]     C:\Users\sayan\AppData\Roaming\nltk_data...
[nltk_data]   Package words is already up-to-date!


True

In [8]:
import nltk
from nltk import word_tokenize, pos_tag, ne_chunk

# Sample text
text = "Apple Inc. was founded by Steve Jobs in Cupertino, California, in 1976. In 2021, the company was valued at over $2 trillion."

# Tokenize and tag parts of speech (requires NLTK's tokenizer and POS-tagger data)
words = word_tokenize(text)
tagged_words = pos_tag(words)

# Perform Named Entity Recognition
named_entities = ne_chunk(tagged_words)

# Display named entities
print(named_entities)

(S
  (PERSON Apple/NNP)
  (ORGANIZATION Inc./NNP)
  was/VBD
  founded/VBN
  by/IN
  (PERSON Steve/NNP Jobs/NNP)
  in/IN
  (GPE Cupertino/NNP)
  ,/,
  (GPE California/NNP)
  ,/,
  in/IN
  1976/CD
  ./.
  In/IN
  2021/CD
  ,/,
  the/DT
  company/NN
  was/VBD
  valued/VBN
  at/IN
  over/IN
  $/$
  2/CD
  trillion/CD
  ./.)
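
The printed tree is awkward to use programmatically. As a minimal sketch, the entity chunks can be flattened into `(label, text)` pairs. The helper name `extract_entities` is an assumption, not part of NLTK. It relies only on the structure visible in the output above: entity chunks are subtrees with a `label()` method, and every other token is a plain `(word, tag)` tuple.

```python
def extract_entities(tree):
    """Collect (label, text) pairs from a chunk tree such as ne_chunk's result.

    Hypothetical helper: assumes entity chunks are flat subtrees whose
    children are (word, tag) tuples, as in the output shown above.
    """
    entities = []
    for node in tree:
        # Entity chunks have a label() method; untagged tokens are tuples.
        if hasattr(node, "label"):
            words = [word for word, _tag in node]
            entities.append((node.label(), " ".join(words)))
    return entities


# Usage with the tree from the previous cell:
# print(extract_entities(named_entities))
```

Applied to the tree printed above, this would yield `[('PERSON', 'Apple'), ('ORGANIZATION', 'Inc.'), ('PERSON', 'Steve Jobs'), ('GPE', 'Cupertino'), ('GPE', 'California')]`. In this output "Apple" is labeled `PERSON` and split from "Inc.", while 1976, 2021, and $2 trillion carry no entity label, so this pretrained chunker's results are best treated as approximate.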
