# 🌍 Named Entity Recognition (NER) - Beyond Grammar

**Goal**: Find real-world entities in text (people, places, organizations, dates)

## Why Is NER Powerful?

### POS vs NER:

| Aspect | POS Tagging | NER |
|--------|-------------|-----|
| **Focus** | Grammar structure | Real-world meaning |
| **Example** | "Elon" = NNP (Proper Noun) | "Elon Musk" = PERSON |
| **Scope** | Word-level | Phrase-level |
| **Output** | Grammatical tag | Entity type |
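
To make the word-level vs phrase-level row concrete, here is a small sketch in plain Python. The POS tags are hand-written Penn Treebank tags for illustration (not tagger output), and `proper_noun_runs` is a hypothetical helper, not an NLTK function. It merges consecutive `NNP` tokens into candidate phrases.

```python
from itertools import groupby

# Hand-tagged example (assumption: tags written by hand, not produced by a tagger).
tagged = [("Elon", "NNP"), ("Musk", "NNP"), ("founded", "VBD"), ("SpaceX", "NNP")]

def proper_noun_runs(tagged_tokens):
    """Join runs of consecutive NNP tokens into candidate phrases."""
    phrases = []
    for is_nnp, group in groupby(tagged_tokens, key=lambda pair: pair[1] == "NNP"):
        if is_nnp:
            phrases.append(" ".join(word for word, _ in group))
    return phrases

candidates = proper_noun_runs(tagged)
print(candidates)  # ['Elon Musk', 'SpaceX']
```

POS tags are enough to find both spans, but they carry the same tag. Deciding that the first is a PERSON and the second an ORGANIZATION is the part NER adds.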

---

## Real-World Applications:

### 📰 News Analysis
Extract who, where, when from articles:
- "**Elon Musk** announced **SpaceX** launch on **January 15**"
- PERSON: Elon Musk
- ORGANIZATION: SpaceX  
- DATE: January 15
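
Once entities are extracted, the who/where/when view is a grouping by label. A minimal sketch, assuming the entities arrive as `(text, label)` pairs; the list is copied by hand from the example sentence rather than produced by a model, and `group_by_label` is a hypothetical helper.

```python
from collections import defaultdict

# Entities from the example sentence, written out by hand.
entities = [
    ("Elon Musk", "PERSON"),
    ("SpaceX", "ORGANIZATION"),
    ("January 15", "DATE"),
]

def group_by_label(pairs):
    """Map each entity label to the list of entity strings carrying it."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)

summary = group_by_label(entities)
print(summary["PERSON"])  # ['Elon Musk']
```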

### 🔍 Search Engines
Understand queries better:
- "Tesla cars in India" → ORGANIZATION + GPE

### 💼 Resume Parsing
Extract:
- Names (PERSON)
- Companies worked at (ORGANIZATION)
- Locations (GPE)

### 🤖 Chatbots
- "Book flight from **Mumbai** to **New York** on **Dec 20**"
- Extract: From_location, To_location, Date
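
Both cities in that request are GPEs, so the entity type alone cannot tell origin from destination. One simple approach, sketched here under assumptions, is to look at the word just before each entity. The entity list is hand-written for the example utterance, and `fill_slots` is a hypothetical helper.

```python
# Hand-written entities for the example utterance: (text, label, preceding word).
entities = [
    ("Mumbai", "GPE", "from"),
    ("New York", "GPE", "to"),
    ("Dec 20", "DATE", "on"),
]

def fill_slots(found):
    """Assign entities to booking slots using the label and the word before it."""
    slots = {"from_location": None, "to_location": None, "date": None}
    for text, label, prev_word in found:
        if label == "GPE" and prev_word == "from":
            slots["from_location"] = text
        elif label == "GPE" and prev_word == "to":
            slots["to_location"] = text
        elif label == "DATE":
            slots["date"] = text
    return slots

booking = fill_slots(entities)
print(booking)
# {'from_location': 'Mumbai', 'to_location': 'New York', 'date': 'Dec 20'}
```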

### 📊 Knowledge Graphs
Build relationships: *Elon Musk → founded → SpaceX*
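
NER supplies the nodes of such a graph; the relation ("founded") has to come from a separate relation-extraction step. A minimal sketch of storing the result, assuming facts arrive as `(subject, relation, object)` triples. `build_graph` is a hypothetical helper.

```python
from collections import defaultdict

# One hand-written triple; NER finds the two entities, not the relation.
triples = [("Elon Musk", "founded", "SpaceX")]

def build_graph(facts):
    """Index triples as subject -> relation -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in facts:
        graph[subject][relation].append(obj)
    return graph

graph = build_graph(triples)
print(graph["Elon Musk"]["founded"])  # ['SpaceX']
```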

---

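## ⚠️ Important Setup Note

NLTK's NER chunker needs two extra downloads:
```python
import nltk

nltk.download('maxent_ne_chunker')  # Pretrained named-entity chunker model
nltk.download('words')              # English word list the chunker relies on
```

If `ne_chunk()` raises a `LookupError`, one of these is missing; the error message names the resource to download. Newer NLTK releases may ask for `maxent_ne_chunker_tab` instead.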

# 🏷️ Named Entity Recognition (NER)

**Goal**: Identify real-world entities in text

## NER vs POS Tagging

| Aspect | POS Tagging | NER |
|--------|-------------|-----|
| Focus | Grammar | Meaning |
| Output | Noun, Verb, Adj | Person, Location, Org |
| Example | NNP | PERSON |

## Common Entity Types

- **PERSON** - Human names (Elon Musk)
- **GPE** - Geo-Political Entity (France, India)
- **ORGANIZATION** - Companies (Google, Microsoft)
- **LOCATION** - Places (Eiffel Tower)
- **DATE** - Dates (2024, January 1st)
- **MONEY** - Currency ($1 million)

## NER Workflow

```
Text → Words → POS Tags → Named Entities
```

**Requirements**: Download the chunker model and the word list before calling `ne_chunk()`:
```python
nltk.download('maxent_ne_chunker')
nltk.download('words')
```

```python
import nltk

# One-time downloads; each call returns True on success
nltk.download('maxent_ne_chunker')
nltk.download('words')
```

```
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     C:\Users\win10\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping chunkers\maxent_ne_chunker.zip.
[nltk_data] Downloading package words to
[nltk_data]     C:\Users\win10\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\words.zip.
```

```python
sentence = "The Eiffel Tower was built from 1887 to 1889 by Gustave Eiffel, whose company specialized in building metal frameworks and structures."

# Step 1: split the sentence into word tokens
# (the tokenizer and tagger data are assumed to be downloaded already)
words = nltk.word_tokenize(sentence)

# Step 2: attach a POS tag to every token
tag_elements = nltk.pos_tag(words)

# Step 3: group the tagged tokens into named-entity chunks and
# display the tree (draw() opens a separate Tkinter window)
nltk.ne_chunk(tag_elements).draw()
```
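
`draw()` is handy for inspection, but it opens a GUI window and returns nothing you can process. To get entities as data, one option is to flatten the tree with `nltk.chunk.tree2conlltags`, which yields `(word, pos, iob)` triples, and then group the `B-`/`I-` runs. The sketch below is plain Python: the sample triples are hand-written to show the format and are not real chunker output, and `entities_from_iob` is a hypothetical helper.

```python
def entities_from_iob(triples):
    """Group B-/I- tagged tokens into (entity_text, label) pairs."""
    entities, words, label = [], [], None
    for word, _pos, iob in triples:
        if iob.startswith("B-") or (iob.startswith("I-") and iob[2:] != label):
            # A new entity starts: flush the one in progress, if any.
            if words:
                entities.append((" ".join(words), label))
            words, label = [word], iob[2:]
        elif iob.startswith("I-"):
            # Continuation of the current entity.
            words.append(word)
        else:
            # "O": outside any entity, so close the current one.
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
    if words:
        entities.append((" ".join(words), label))
    return entities

# With NLTK the input would come from:
#   triples = nltk.chunk.tree2conlltags(nltk.ne_chunk(tag_elements))
# Hand-written sample in the same (word, pos, iob) format; labels are illustrative.
sample = [
    ("built", "VBN", "O"),
    ("by", "IN", "O"),
    ("Gustave", "NNP", "B-PERSON"),
    ("Eiffel", "NNP", "I-PERSON"),
    (",", ",", "O"),
]
found = entities_from_iob(sample)
print(found)  # [('Gustave Eiffel', 'PERSON')]
```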

---

## ✅ Checkpoint: What You Learned

- ✅ What NER is (finding real-world entities)
- ✅ Common entity types (PERSON, GPE, ORGANIZATION, DATE)
- ✅ Difference between POS and NER
- ✅ How to use `ne_chunk()` with POS tags
- ✅ Real-world applications (news, chatbots, search)

---

## 🎯 Next Step - THE BIG SHIFT!

Text is clean and understood. Now the crucial step: **Convert text to numbers!**

**Next Notebook**: `7-Bag+Of+Words+Practical's.ipynb`

**What you'll learn**: How to convert text into numerical vectors using word frequencies

---

💡 **This is where ML begins!** Everything before was preparation. Now we make text ML-ready! 🚀