# Module 11: Named Entity Recognition

**Difficulty**: ⭐⭐⭐ Advanced  
**Estimated Time**: 110 minutes  
**Prerequisites**: Module 10

## Learning Objectives

1. Understand token classification tasks
2. Implement NER with transformers
3. Use the BIO tagging scheme
4. Fine-tune for custom entity types
5. Evaluate with entity-level metrics

## Named Entity Recognition

**Task**: Identify and classify entities in text.

**Examples**:
- **Person**: Barack Obama, Marie Curie
- **Location**: Paris, Mount Everest
- **Organization**: Google, United Nations
- **Date**: July 4th, 1776

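### BIO Tagging

Each token receives exactly one tag, so a multi-token entity is marked by a `B-` tag followed by one or more `I-` tags:

- **B**-PER: Beginning of a person entity (its first token)
- **I**-PER: Inside a person entity (any later token of the same entity)
- **O**: Outside any entity

The same `B-`/`I-` prefixes apply to every entity type, for example `B-LOC` and `I-LOC` for locations.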
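To make the scheme concrete, here is a minimal sketch of turning a BIO tag sequence back into entity spans. The function name `bio_to_spans` and the example sentence are illustrative assumptions, and treating a stray `I-` tag as the start of a new entity is one lenient convention among several.

```python
def bio_to_spans(tags):
    """Convert BIO tags into (entity_type, start, end) spans; end is exclusive."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        # A span continues only on an I- tag whose type matches the open span.
        if tag.startswith("I-") and current == tag[2:]:
            continue
        # Any other tag closes the open span, if there is one.
        if current is not None:
            spans.append((current, start, i))
            start, current = None, None
        # B- opens a new span; a stray I- is also treated as a beginning
        # (an assumption made here for leniency).
        if tag != "O":
            start, current = i, tag[2:]
    # Close a span that runs to the end of the sentence.
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


tokens = ["Marie", "Curie", "moved", "to", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "O"]

for label, start, end in bio_to_spans(tags):
    print(label, " ".join(tokens[start:end]))
```

The `B-` tag is what lets two adjacent entities of the same type stay separate: `["B-PER", "B-PER"]` decodes to two one-token spans rather than one two-token span.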

## Setup

In [None]:
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
from datasets import load_dataset

print('✓ Ready!')

## 1. Using Pre-trained NER

In [None]:
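# Load an NER pipeline. No model is named here, so the library falls back to its
# default checkpoint for the 'ner' task and downloads it on first use.
# aggregation_strategy='simple' merges sub-word tokens into whole entities.
ner = pipeline('ner', aggregation_strategy='simple')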

# Test
text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."

entities = ner(text)

print(f"Text: {text}\n")
for ent in entities:
    print(f"{ent['entity_group']:12} {ent['word']:20} (score: {ent['score']:.2f})")
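With an aggregation strategy set, each result is a dictionary that also carries `start` and `end` character offsets into the input text. The sketch below shows one way to post-process such results: filter by score and group the surface strings by entity type. The helpers `group_entities` and `mock_entity`, the 0.5 threshold, and the scores are all hypothetical; the entities are hand-written stand-ins, not real model output, so the cell runs without downloading a model.

```python
from collections import defaultdict


def group_entities(text, entities, min_score=0.5):
    """Collect entity surface strings by type, skipping low-confidence spans."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] < min_score:
            continue
        # Slice the original text so casing and spacing are preserved exactly.
        grouped[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


def mock_entity(text, group, surface, score):
    """Build a hand-written stand-in for one pipeline result (not model output)."""
    start = text.index(surface)
    return {"entity_group": group, "word": surface, "score": score,
            "start": start, "end": start + len(surface)}


text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
mock_entities = [
    mock_entity(text, "ORG", "Apple Inc", 0.99),
    mock_entity(text, "PER", "Steve Jobs", 0.98),
    mock_entity(text, "LOC", "Cupertino", 0.97),
    mock_entity(text, "LOC", "California", 0.40),  # below the threshold
]

print(group_entities(text, mock_entities))
```

To use it on real predictions, pass the `entities` list returned by `ner(text)` in place of `mock_entities`.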

## 2. Fine-Tuning for Custom Entities

In [None]:
# Load the CoNLL-2003 dataset (English news text tagged with PER, ORG, LOC, and MISC entities)
dataset = load_dataset('conll2003')

print(f'Label names: {dataset["train"].features["ner_tags"].feature.names}')
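CoNLL-2003 provides one label per word, but a BERT-style tokenizer may split a word into several sub-tokens, so the labels have to be re-aligned before fine-tuning. A common convention, sketched below, is to label only the first sub-token of each word and mark special tokens and continuation sub-tokens with `-100`, the index that PyTorch's cross-entropy loss ignores by default. The function name `align_labels` and the example split are assumptions; in practice `word_ids` would come from calling `.word_ids()` on the output of a fast tokenizer run with `is_split_into_words=True`.

```python
def align_labels(word_ids, word_labels, ignore_index=-100):
    """Map word-level label ids onto sub-token positions."""
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] belong to no word.
            aligned.append(ignore_index)
        elif word_id != previous:
            # First sub-token of a word carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later sub-tokens of the same word are ignored by the loss.
            aligned.append(ignore_index)
        previous = word_id
    return aligned


# Hypothetical example: three words, the first split into three sub-tokens.
word_labels = [5, 0, 0]                      # one label id per word
word_ids = [None, 0, 0, 0, 1, 2, None]       # one entry per sub-token

print(align_labels(word_ids, word_labels))
```

Labeling every sub-token instead of only the first is also a reasonable choice; whichever convention is used in training should be mirrored when decoding predictions.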

**Exercise**: Custom NER

1. Fine-tune BERT for NER
2. Add custom entity types
3. Evaluate with F1, precision, recall
4. Visualize entity predictions

In [None]:
# YOUR CODE HERE
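As a starting point for step 3, entity-level metrics count a prediction as correct only when both the span boundaries and the entity type match the gold annotation exactly. The following is a minimal pure-Python sketch under that assumption; the names `extract_spans` and `entity_prf` and the toy tags are illustrative. Libraries such as `seqeval` offer a more complete implementation of the same idea.

```python
def extract_spans(tags):
    """Return the set of (entity_type, start, end) spans in a BIO sequence."""
    spans, start, label = set(), None, None
    # A trailing "O" sentinel flushes a span that ends at the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        if tag.startswith("I-") and tag[2:] == label:
            continue
        if label is not None:
            spans.add((label, start, i))
            label = None
        if tag != "O":
            start, label = i, tag[2:]
    return spans


def entity_prf(gold_sentences, pred_sentences):
    """Micro-averaged entity-level precision, recall, and F1 (exact match)."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: the person span is truncated, the location is correct.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "O", "O", "B-LOC"]]

precision, recall, f1 = entity_prf(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of four tokens are tagged correctly, yet only one of two entities matches, which is why token accuracy tends to overstate NER quality.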

## Summary

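NER is a core step in information extraction: it turns raw text into typed entity spans. Transformers fine-tuned for token classification perform strongly on this task, and that performance is best judged with entity-level precision, recall, and F1 rather than per-token accuracy.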