# Advanced NER: Entity Types and Real-World Use Cases

In this notebook, we'll explore the different types of entities that Named Entity Recognition (NER) can identify and see how it's used in real-world scenarios.

## NER in the Real World

Companies across many industries use NER to extract structured information from text data. The image below shows a global network in which different entity types are connected across industries:

![Global network with different entity types connected across industries](images/global_network_entities.png)

*How Fortune 500 companies use NER to gain competitive advantages*

## Complete Entity Type Library

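NER can identify many different types of entities in text. The list below covers common entity types, using the label names from spaCy's English models (which also define others, such as PRODUCT and EVENT):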

### People & Organizations
- **PERSON**: Individual names
- **ORG**: Companies, agencies
- **NORP**: Nationalities, religious or political groups

### Places & Locations
- **GPE**: Countries, cities, states
- **LOC**: Mountains, rivers, regions
- **FAC**: Buildings, airports, highways

### Numbers & Measurements
- **MONEY**: Monetary values
- **PERCENT**: Percentage values
- **QUANTITY**: Measurements, weights
- **ORDINAL**: "First", "second", "third"
- **CARDINAL**: Numerals that do not fall under another type

### Time & Dates
- **DATE**: Absolute or relative dates and periods
- **TIME**: Times shorter than a day
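
To make the grouping above concrete, here is a minimal sketch that sorts extracted entities into the four categories. The `ENTITY_GROUPS` mapping and the `group_entities` helper are illustrative names, not part of spaCy, and the sample pairs are hand-written stand-ins for real model output.

```python
from collections import defaultdict

# Label-to-category mapping copied from the lists above (illustrative, not a spaCy API)
ENTITY_GROUPS = {
    "PERSON": "People & Organizations",
    "ORG": "People & Organizations",
    "NORP": "People & Organizations",
    "GPE": "Places & Locations",
    "LOC": "Places & Locations",
    "FAC": "Places & Locations",
    "MONEY": "Numbers & Measurements",
    "PERCENT": "Numbers & Measurements",
    "QUANTITY": "Numbers & Measurements",
    "ORDINAL": "Numbers & Measurements",
    "CARDINAL": "Numbers & Measurements",
    "DATE": "Time & Dates",
    "TIME": "Time & Dates",
}

def group_entities(entities):
    """Group (text, label) pairs by category; unlisted labels go to 'Other'."""
    grouped = defaultdict(list)
    for text, label in entities:
        grouped[ENTITY_GROUPS.get(label, "Other")].append(text)
    return dict(grouped)

# Hand-written sample pairs standing in for real model output
sample = [("Apple", "ORG"), ("Paris", "GPE"), ("$5 million", "MONEY"), ("iPhone", "PRODUCT")]
grouped = group_entities(sample)
print(grouped)
```

Labels that are not in the lists above, such as `PRODUCT`, fall into an `Other` bucket rather than raising an error, which keeps the helper usable with models that emit additional labels.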

## Industry Success Stories

Many industries benefit from NER:

- **📰 News & Media:** Reuters uses NER to auto-tag articles, reducing manual work by 80%
- **🏦 Banking:** JPMorgan extracts entities from contracts for risk assessment
- **🏥 Healthcare:** Hospitals use NER to structure patient records and research
- **⚖️ Legal:** Law firms process thousands of documents for case research

## Enterprise-Grade NER

Building production-ready NER systems involves handling large volumes of documents efficiently.

*Let's explore how to build such systems.*

*Welcome to the big leagues! 💼*

## Production NER Pipeline

Below is an example of how to create a scalable NER processing pipeline using spaCy and pandas:

In [None]:
import spacy
import pandas as pd

class ProductionNER:
    def __init__(self, model_path="en_core_web_sm"):
        # Load the spaCy pipeline once and reuse it for every batch
        self.nlp = spacy.load(model_path)
        self.results = []

    def process_batch(self, texts):
        """Process multiple texts efficiently; results accumulate across calls"""
        # nlp.pipe processes texts in batches, which is faster than calling nlp() per text
        for doc in self.nlp.pipe(texts, batch_size=50):
            entities = {
                "text": doc.text,
                "entities": [(ent.text, ent.label_, ent.start_char, ent.end_char) for ent in doc.ents],
                "entity_count": len(doc.ents)
            }
            self.results.append(entities)

        return self.results

    def get_statistics(self):
        """Count entities per label, most frequent first"""
        # Each entity is a (text, label, start_char, end_char) tuple; keep only the label
        labels = [label for result in self.results for _, label, _, _ in result["entities"]]
        return pd.Series(labels, dtype="object").value_counts()

# Usage example (news_articles is a list of strings):
# ner_processor = ProductionNER()
# results = ner_processor.process_batch(news_articles)
# stats = ner_processor.get_statistics()

### Try it yourself

Click the link below to try this pipeline in Google Colab:

[🚀 Try in Colab](https://colab.research.google.com/github/Roopesht/codeexamples/blob/main/genai/python_easy/3/advanced.ipynb)

## Advanced NER Applications

NER is used in many advanced scenarios, including:

- 🔍 **Competitive Intelligence:** Track competitor mentions across media
- 📊 **Risk Management:** Identify potential risks in financial documents
- 🤖 **Chatbot Enhancement:** Extract names, dates, and places from user messages to complement intent detection
- 📰 **Content Curation:** Automatically categorize and route content
- 🔒 **Data Privacy:** Automatically detect and mask sensitive information
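
As one illustration of the data privacy use case, the sketch below masks sensitive entities in a text. It assumes entities arrive as `(text, label, start_char, end_char)` tuples, the same shape the pipeline above stores; the `mask_entities` helper, the choice of sensitive labels, and the sample offsets are assumptions made for this example rather than output from a real model.

```python
def mask_entities(text, entities, sensitive_labels=("PERSON", "GPE")):
    """Replace each sensitive entity span with its label in square brackets."""
    # Work from right to left so earlier character offsets stay valid
    # after each replacement changes the length of the text.
    for _, label, start, end in sorted(entities, key=lambda e: e[2], reverse=True):
        if label in sensitive_labels:
            text = text[:start] + f"[{label}]" + text[end:]
    return text

# Hand-written sample with character offsets, standing in for model output
text = "Alice Smith moved to Berlin in 2021."
entities = [
    ("Alice Smith", "PERSON", 0, 11),
    ("Berlin", "GPE", 21, 27),
    ("2021", "DATE", 31, 35),
]
masked = mask_entities(text, entities)
print(masked)
```

Which labels count as sensitive depends on the application, so they are passed as a parameter rather than hard-coded. A statistical model will miss some entities, so masking like this is best treated as one layer of protection rather than a guarantee.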

## Enterprise Thinking

A production NER system can process very large document collections, in batches or as documents arrive, and turn unstructured text into structured data that supports business decisions.
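
One way to keep memory use bounded at that scale is to stream documents in fixed-size chunks instead of loading everything at once. The sketch below shows the idea using only the standard library. `iter_chunks`, `process_stream`, and `fake_process_batch` are hypothetical names, and `process_batch` is assumed to be any callable that returns one result per input text for the chunk it receives.

```python
from itertools import islice

def iter_chunks(documents, chunk_size=1000):
    """Yield lists of up to chunk_size items from any iterable."""
    iterator = iter(documents)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk

def process_stream(documents, process_batch, chunk_size=1000):
    """Feed chunks to process_batch and yield results one document at a time."""
    for chunk in iter_chunks(documents, chunk_size):
        yield from process_batch(chunk)

# Stand-in for a real NER call so the sketch runs without a model
def fake_process_batch(texts):
    return [{"text": t, "entity_count": 0} for t in texts]

chunks = list(iter_chunks(range(5), chunk_size=2))
results = list(process_stream(["a", "b", "c"], fake_process_batch, chunk_size=2))
print(chunks)
print(len(results))
```

Because both functions are generators, `documents` can be a file reader or database cursor, and only one chunk is held in memory at a time.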

**Question:**
If you could implement NER across your entire organization's documents, what business problem would you solve first and what ROI would you expect?