## Named Entity Recognition
- Named Entity Recognition **identifies and classifies key entities** in text into predefined categories such as:

    - Person (e.g., "Albert Einstein"),
    - Organization (e.g., "Google"),
    - Location (e.g., "New York"),
    - Date/Time (e.g., "2024-11-23"),
    - Monetary Values (e.g., "$100"),
    - Miscellaneous (custom categories like product names, scientific terms, etc.).

- Helps to extract **structured information from unstructured text**, enabling systems to understand the 'who', 'what', and 'where' in data.
- Used in **question answering, chatbots, machine translation, and summarization**.

### How Does NER Work?

NER combines:

1. **Tokenization**: Splitting text into words or phrases.
2. **Part-of-Speech (POS) Tagging**: Labeling each token with its syntactic role (noun, verb, adjective, and so on).
3. **Entity Recognition Models**:
    - Rule-Based: Uses handcrafted rules and regular expressions.
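    - Statistical: Uses machine learning techniques such as conditional random fields (CRFs), hidden Markov models (HMMs), or neural networks.
    - Pre-trained Models: Modern deep learning models such as BERT, or ready-made pipelines such as those shipped with spaCy.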
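
To make the rule-based approach concrete, here is a minimal sketch that uses only regular expressions. The pattern set and the `rule_based_ner` helper are hypothetical, chosen for illustration; they cover only a few well-formatted entity types and are not how NLTK or spaCy work internally.

```python
import re

# Hypothetical, deliberately small rule set; real systems use far richer rules.
PATTERNS = {
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),            # e.g. 2024-11-23
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),      # e.g. $100 or $1,250.50
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?%"),              # e.g. 20%
}


def rule_based_ner(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append((match.start(), match.end(), label, match.group()))
    return sorted(entities)


sample = "On 2024-11-23 the ticket price rose 20% to $100."
for start, end, label, span in rule_based_ner(sample):
    print(f"{label:8} {span!r} at [{start}, {end})")
```

Rules like these are precise for rigid formats such as dates and amounts, but they cannot recognize names of people or organizations, which is where the statistical and pre-trained approaches come in.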


In [2]:
sentence = "The Eiffel Tower was built from 1887 to 1889 by French engineer Gustave Eiffel, whose company specialized in building metal frameworks and structures."

# Entity types to look for, with examples:
#   Person:        Shubham Prajapati
#   Location:      India
#   Date:          September, 24-09-1989
#   Time:          4:30pm
#   Money:         1 million dollars
#   Organization:  Ideyalabs
#   Percent:       20%, twenty percent

In [6]:
import nltk

# word_tokenize needs NLTK's Punkt tokenizer data, installed once via nltk.download()
words = nltk.word_tokenize(sentence)
words

['The',
 'Eiffel',
 'Tower',
 'was',
 'built',
 'from',
 '1887',
 'to',
 '1889',
 'by',
 'French',
 'engineer',
 'Gustave',
 'Eiffel',
 ',',
 'whose',
 'company',
 'specialized',
 'in',
 'building',
 'metal',
 'frameworks',
 'and',
 'structures',
 '.']

In [4]:
# Tag each token with its Penn Treebank part-of-speech label
tag_elements = nltk.pos_tag(words)
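
To see why POS tags help with entity recognition, the sketch below groups consecutive proper-noun tokens into candidate entity strings. The `proper_noun_runs` helper is hypothetical, and the tags in `example` are hand-written Penn Treebank tags for illustration, not the actual output of `nltk.pos_tag`.

```python
def proper_noun_runs(tagged_tokens):
    """Group consecutive NNP/NNPS tokens into candidate entity strings."""
    runs, current = [], []
    for word, tag in tagged_tokens:
        if tag in ("NNP", "NNPS"):
            current.append(word)
        elif current:
            # A non-proper-noun token ends the current run.
            runs.append(" ".join(current))
            current = []
    if current:
        runs.append(" ".join(current))
    return runs


# Hand-written tags for illustration only (not real tagger output).
example = [
    ("engineer", "NN"),
    ("Gustave", "NNP"),
    ("Eiffel", "NNP"),
    (",", ","),
    ("whose", "WP$"),
    ("company", "NN"),
]
print(proper_noun_runs(example))  # ['Gustave Eiffel']
```

This heuristic finds candidate spans but says nothing about their type (person, place, organization); assigning the type is the job of the entity recognition step.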

In [7]:
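# Named-entity chunking: ne_chunk groups the tagged tokens into a tree of labeled entities
import numpy  # not referenced directly in this cell
nltk.ne_chunk(tag_elements).draw()  # opens the tree in a separate GUI window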
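
Because `draw()` only shows the tree in a window, it can be useful to get the entities as plain data instead. One option is to flatten the chunk tree into (word, POS, IOB-tag) triples, which NLTK can produce with `nltk.chunk.tree2conlltags`, and then group the tagged tokens. The sketch below shows only the grouping step; `group_iob` is a hypothetical helper, and the triples are hand-written for illustration rather than actual chunker output.

```python
def group_iob(triples):
    """Collect (entity_text, label) pairs from (word, pos, iob) triples."""
    entities, words, label = [], [], None
    for word, _pos, iob in triples:
        # A B- tag, or an I- tag with a different label, starts a new entity.
        starts_new = iob.startswith("B-") or (
            iob.startswith("I-") and iob[2:] != label
        )
        if iob == "O" or starts_new:
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
        if iob != "O":
            words.append(word)
            label = iob[2:]
    if words:
        entities.append((" ".join(words), label))
    return entities


# Hand-written IOB triples for illustration only (not real chunker output).
triples = [
    ("by", "IN", "O"),
    ("French", "JJ", "B-GPE"),
    ("engineer", "NN", "O"),
    ("Gustave", "NNP", "B-PERSON"),
    ("Eiffel", "NNP", "I-PERSON"),
    (",", ",", "O"),
]
print(group_iob(triples))  # [('French', 'GPE'), ('Gustave Eiffel', 'PERSON')]
```

In the IOB scheme, `B-` marks the first token of an entity, `I-` a continuation, and `O` a token outside any entity, so a multi-word name such as "Gustave Eiffel" comes back as a single span.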