# Entity Extractor Tool
Welcome to this beginner-friendly tutorial on building an Entity Extractor Tool using Python and spaCy.
In this notebook, we'll learn how to identify and visualize named entities, such as people, organizations, and places, in text.

Let's get started!

## 1. Setup: Installing spaCy and Downloading the Model
First, we need to install spaCy, an open-source NLP library, and download its small English pipeline, `en_core_web_sm`, which includes the named entity recognizer used below.

You can do this in your environment by running:

In [None]:
!pip install spacy
!python -m spacy download en_core_web_sm
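
Before moving on, it can help to confirm that both installs are visible to the notebook kernel. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of spaCy. It relies on the fact that models fetched with `spacy download` are installed as regular Python packages.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


# Check both the library and the downloaded model by package name.
for name in ("spacy", "en_core_web_sm"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports `missing`, rerun the install cell and restart the kernel before continuing.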

## 2. Import Required Libraries
Let's import spaCy and its displaCy visualizer, along with `json` for saving results and `Counter` for counting entity types.

In [None]:
import spacy
from spacy import displacy
import json
from collections import Counter

## 3. Building the Entity Extractor Class
We'll create a class to process text, extract entities, visualize them, summarize, and save results.

In [None]:
class EntityExtractor:
    def __init__(self):
        self.nlp = spacy.load("en_core_web_sm")

    def extract_entities(self, text):
        """Process text to extract named entities."""
        doc = self.nlp(text)
        entities = [(ent.text, ent.label_) for ent in doc.ents]
        return entities, doc

    def visualize_entities(self, doc):
        """Render the text inline in the notebook with entities highlighted by label."""
        displacy.render(doc, style='ent', jupyter=True)

    def get_summary(self, entities):
        """Count entities by type."""
        entity_types = [label for _, label in entities]
        counts = Counter(entity_types)
        return counts

    def export_results(self, entities, filename):
        """Save entities and counts to a JSON file."""
        results = {
            "entities": [{'text': text, 'label': label} for text, label in entities],
            "summary": self.get_summary(entities)
        }
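        # Write as UTF-8 explicitly so the file encoding does not depend on the platform default
        with open(filename, 'w', encoding='utf-8') as f: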
            json.dump(results, f, indent=4)
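
The file written by `export_results` has two top-level keys, `"entities"` and `"summary"`. As a rough sketch of how such a file could be read back later, the example below first writes a small file in that shape and then reloads it. The sample values are hand-written for illustration and are not actual model output; the temporary path is also just an assumption for the demo.

```python
import json
import os
import tempfile
from collections import Counter

# Hand-written sample data in the same shape that export_results writes.
# These values are illustrative only, not real spaCy predictions.
sample_results = {
    "entities": [
        {"text": "Microsoft", "label": "ORG"},
        {"text": "London", "label": "GPE"},
        {"text": "Stanford", "label": "ORG"},
    ],
    "summary": {"ORG": 2, "GPE": 1},
}

# Write the sample to a temporary directory so the demo leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "entity_results.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample_results, f, indent=4)

# Load the file and rebuild the (text, label) tuples and the Counter.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

entities = [(item["text"], item["label"]) for item in loaded["entities"]]
summary = Counter(loaded["summary"])

print(entities)
print(summary.most_common(1))
```

Rebuilding the summary as a `Counter` restores helpers such as `most_common`, which a plain dictionary loaded from JSON does not have.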

## 4. Using the EntityExtractor
Now, let's try processing some sample text and see the results.

In [None]:
# Create an instance of the extractor
extractor = EntityExtractor()

# Sample user input
user_input = "Microsoft CEO Satya Nadella announced the opening of a new office in London. The hiring will focus on AI researchers from Stanford and MIT."

# Extract entities
entities, doc = extractor.extract_entities(user_input)

# Visualize entities
extractor.visualize_entities(doc)

# Get summary counts
summary = extractor.get_summary(entities)
print("Entities found:")
for text, label in entities:
    print(f"{text} ({label})")
print("\nSummary:")
print(summary)

# Export results to JSON
extractor.export_results(entities, 'entity_results.json')
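
The summary only reports how many entities of each type were found. To see which texts fall under each label, the `(text, label)` tuples can be grouped. This is a small sketch, assuming the same tuple format that `extract_entities` returns; `group_by_label` is a hypothetical helper, and the sample list is hand-written rather than real model output.

```python
from collections import defaultdict


def group_by_label(entities):
    """Map each label to its unique entity texts, kept in first-seen order."""
    grouped = defaultdict(list)
    for text, label in entities:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# Illustrative input in the (text, label) format; not actual spaCy predictions.
sample_entities = [
    ("Microsoft", "ORG"),
    ("Satya Nadella", "PERSON"),
    ("London", "GPE"),
    ("Microsoft", "ORG"),
]

grouped = group_by_label(sample_entities)
for label, texts in grouped.items():
    print(f"{label}: {', '.join(texts)}")
```

In the notebook, passing the `entities` list from the cell above in place of `sample_entities` would group whatever the model actually found.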

## 5. Summary
In this notebook, you've learned how to:
- Install and load spaCy
- Build an EntityExtractor class for processing text
- Visualize entities with displaCy
- Generate summaries of entity counts
- Save extracted data into a JSON file

This is the foundation for building more advanced NLP tools! Keep experimenting with different texts and entity types.