# 📧 Zero Inbox Agent - Step-by-Step Testing

Test each phase of the Zero Inbox email categorization system.

**Goal**: Automatically categorize emails into 3 MVP categories:
- Other/Advertising
- Other/Rest  
- Review/Job search

---

## 🔧 Section 1: Setup

**What this does**: Initialize database and verify Gmail connection

**Test goal**: ✅ Database ready, Gmail credentials working

In [None]:
# Import required libraries
import sys
import os
import yaml
import logging
from datetime import datetime, timedelta
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Setup logging for notebook
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger(__name__)

print("📚 Libraries imported successfully")

📚 Libraries imported successfully


In [None]:
# Load configuration
with open("config/config.yaml", "r") as f:
    config = yaml.safe_load(f)

print("⚙️  Configuration loaded")
print(f"Gmail credentials file: {config['gmail']['credentials_file']}")
print(f"Claude model: {config['claude']['model']}")

⚙️  Configuration loaded
Gmail credentials file: config/gmail_credentials.json
Claude model: claude-3-5-sonnet-20241022
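A missing key in `config.yaml` only surfaces later as a `KeyError`, so a quick check up front can help. The sketch below is a hypothetical helper, not part of the project. It assumes the only required keys are the ones this notebook actually reads (`gmail.credentials_file`, `gmail.token_file`, `gmail.scopes`, `claude.model`).

```python
# Hypothetical helper: the key list is an assumption based on what this notebook reads.
REQUIRED_KEYS = {
    "gmail": ["credentials_file", "token_file", "scopes"],
    "claude": ["model"],
}


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return dotted names of required keys that are absent from config."""
    missing = []
    for section, keys in required.items():
        # Treat a missing or empty section as an empty mapping
        section_values = config.get(section) or {}
        for key in keys:
            if key not in section_values:
                missing.append(f"{section}.{key}")
    return missing


sample = {"gmail": {"credentials_file": "config/gmail_credentials.json"},
          "claude": {"model": "claude-3-5-sonnet-20241022"}}
print(missing_config_keys(sample))
```

Calling `missing_config_keys(config)` right after loading the YAML file would list any gaps before the Gmail or database cells run.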


In [None]:
# Initialize Zero Inbox database
from models.zero_inbox_models import DatabaseManager

db_manager = DatabaseManager("sqlite:///data/zero_inbox.db")
success = db_manager.initialize_database()

if success:
    print("✅ Database initialized successfully")

    # Show current database status
    success, table_info = db_manager.verify_schema()
    if success:
        print("\n📊 Database status:")
        for info in table_info:
            print(f"  - {info}")
else:
    print("❌ Database initialization failed")

INFO: ✅ Zero Inbox database initialized: sqlite:///data/zero_inbox.db
INFO: ✅ Database schema verification successful:
INFO:   - emails: 0 records
INFO:   - email_categories: 3 records
INFO:   - agent_actions: 0 records
INFO:   - human_reviews: 0 records


✅ Database initialized successfully

📊 Database status:
  - emails: 0 records
  - email_categories: 3 records
  - agent_actions: 0 records
  - human_reviews: 0 records


In [None]:
# Test Gmail credentials (without fetching emails)
try:
    from gmail_server import GmailServer

    gmail_server = GmailServer(
        config["gmail"]["credentials_file"],
        config["gmail"]["token_file"],
        config["gmail"]["scopes"],
        config,
    )

    print("✅ Gmail authentication successful")
    print("Ready to fetch emails!")

except Exception as e:
    print(f"❌ Gmail authentication failed: {e}")
    print("Please check your Gmail credentials setup")

INFO: file_cache is only supported with oauth2client<4.0.0
INFO: Gmail authentication successful


✅ Gmail authentication successful
Ready to fetch emails!


---

## 📧 Section 2: Fetch Emails

**What this does**: Fetch emails from Gmail and store in database

**Test goal**: ✅ X emails fetched and stored (no duplicates)

In [5]:
# Configure email fetching parameters
DAYS_BACK = 1  # Change this to fetch more/fewer days
MAX_EMAILS = 20  # Limit for testing

print(f"📅 Will fetch emails from last {DAYS_BACK} day(s)")
print(f"📊 Maximum emails to fetch: {MAX_EMAILS}")

# Alternative: Use specific date range
# FROM_DATE = "2025-08-28"  # YYYY-MM-DD format
# TO_DATE = "2025-08-29"    # YYYY-MM-DD format
# print(f"📅 Date range: {FROM_DATE} to {TO_DATE}")

📅 Will fetch emails from last 1 day(s)
📊 Maximum emails to fetch: 20
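For reference, a days-back window can be turned into a Gmail `after:` search term roughly as follows. This is a minimal sketch with a hypothetical function name. How `ZeroInboxEmailFetcher` builds its query internally is not shown in this notebook, so treat this as an illustration only.

```python
from datetime import date, timedelta
from typing import Optional


def build_after_query(days_back: int, today: Optional[date] = None) -> str:
    """Return a Gmail search fragment of the form 'after:YYYY/MM/DD'."""
    if days_back < 0:
        raise ValueError("days_back must be non-negative")
    today = today or date.today()
    start = today - timedelta(days=days_back)
    # Gmail's after: operator accepts dates written as YYYY/MM/DD
    return f"after:{start:%Y/%m/%d}"


# The fixed 'today' value is only there to make the example reproducible
print(build_after_query(1, today=date(2025, 8, 29)))
```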


In [None]:
# Initialize email fetcher and fetch emails
from zero_inbox_fetcher import ZeroInboxEmailFetcher

print("🔄 Starting email fetch...")

# Create fetcher
email_fetcher = ZeroInboxEmailFetcher(config, db_manager)

# Fetch emails
emails_fetched, emails_stored = email_fetcher.fetch_and_store_emails(
    days_back=DAYS_BACK,
    max_emails=MAX_EMAILS,
    # Alternatively, use date range:
    # from_date=FROM_DATE,
    # to_date=TO_DATE
)

print(f"\n📊 Results:")
print(f"  - Emails fetched from Gmail: {emails_fetched}")
print(f"  - Emails stored in database: {emails_stored}")
print(f"  - Duplicates skipped: {emails_fetched - emails_stored}")

INFO: file_cache is only supported with oauth2client<4.0.0
INFO: Gmail authentication successful
INFO: ✅ Zero Inbox Email Fetcher initialized
INFO: 🔄 Starting Zero Inbox email fetch and store process
INFO: 📅 Using days back: 1
INFO: Fetching emails with query: after:2025/08/28 ((subject:faktura OR faktura) OR (subject:räkning OR räkning) OR (subject:förfallodag OR förfallodag) OR (subject:förfallodatum OR förfallodatum) OR (subject:betalning OR betalning) OR (subject:att betala OR att betala) OR (subject:totalt belopp OR totalt belopp) OR (subject:slutsumma OR slutsumma) OR (subject:ocr OR ocr) OR (subject:bankgiro OR bankgiro) OR (subject:plusgiro OR plusgiro) OR (subject:invoice OR invoice) OR (subject:bill OR bill) OR (subject:statement OR statement) OR (subject:due OR due) OR (subject:payment due OR payment due) OR (subject:amount due OR amount due) OR (subject:total amount OR total amount) OR (subject:balance due OR balance due) OR (from:Vattenfall OR Vattenfall) OR (from:Telia OR

🔄 Starting email fetch...


INFO: Found 15 potential invoice emails
INFO: Processed email 1/15: Transform your images into unique content...
INFO: Processed email 2/15: “project manager”: Kambi - Delivery Manager and mo...
INFO: Processed email 3/15: Nytt ränteråd för brf:er...
INFO: Processed email 4/15: Publication editors, welcome to your new submissio...
INFO: Processed email 5/15: ~ 4 unique approaches to using diptychs in photogr...
INFO: Processed email 6/15: Just nu – ränterabatt för bostadsrätter...
INFO: Processed email 7/15: Påminnelse: Har du 1 minut över? Hjälp oss att bli...
INFO: Processed email 8/15: Du har fått en ny faktura från Bangerhead.se...
INFO: Processed email 9/15: Bangerhead, order 5703156...
INFO: Processed email 10/15: You may be a fit for HEMKÖP’s Produktägare kundpro...
INFO: Processed email 11/15: ByteDance passes Meta 📈, Google's manager purge 💼,...
INFO: Processed email 12/15: “project manager”: Sun IP - Patent Prosecution Pro...
INFO: Processed email 13/15: “consultant”: Billenn


📊 Results:
  - Emails fetched from Gmail: 15
  - Emails stored in database: 15
  - Duplicates skipped: 0
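The "no duplicates" goal comes down to skipping any message whose ID has already been stored. The sketch below shows that idea with an in-memory set standing in for the database. The `message_id` field and the function name are assumptions for illustration, not the fetcher's actual schema or API.

```python
def store_new_emails(fetched, seen_ids):
    """Count emails whose ID is not yet in seen_ids and record their IDs.

    Returns (fetched_count, stored_count), mirroring the tuple printed above.
    """
    stored = 0
    for email in fetched:
        msg_id = email["message_id"]  # hypothetical field name
        if msg_id in seen_ids:
            continue  # already stored in this run or a previous one
        seen_ids.add(msg_id)
        stored += 1
    return len(fetched), stored


seen = set()
batch = [{"message_id": "a"}, {"message_id": "b"}, {"message_id": "a"}]
fetched_count, stored_count = store_new_emails(batch, seen)
print(f"Duplicates skipped: {fetched_count - stored_count}")
```

Re-running the fetch cell on the same day should therefore report the same fetched count with a stored count of zero.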


In [None]:
# Show sample stored emails
from models.zero_inbox_models import Email

session = db_manager.get_session()
recent_emails = (
    session.query(Email).order_by(Email.date_processed.desc()).limit(5).all()
)

print(f"📧 Sample of {len(recent_emails)} most recent emails:")
for i, email in enumerate(recent_emails, 1):
    print(f"\n{i}. From: {email.sender[:50]}...")
    print(f"   Subject: {email.subject[:50]}...")
    print(f"   Date: {email.date_received}")
    print(f"   Body preview: {email.body[:100]}...")
    if email.pdf_content:
        print(f"   📎 Has PDF content ({len(email.pdf_content)} chars)")

session.close()

📧 Sample of 5 most recent emails:

1. From: Medium Daily Digest <noreply@medium.com>...
   Subject: The Claude Code Workflow You Can Copy | Chris Dunl...
   Date: 2025-08-28 05:10:00
   Body preview: Stories for Christian Wahlström
@christian.wahlstrom (https://medium.com/@christian.wahlstrom?source...

2. From: Adobe Firefly <mail@mail.adobe.com>...
   Subject: How to make your social posts stand out, fast...
   Date: 2025-08-28 00:06:22
   Body preview: ------------------------------------------------------------------------

View web version:
https://...

3. From: LinkedIn Job Alerts <jobalerts-noreply@linkedin.co...
   Subject: “consultant”: Billennium - 🌍 SAP Consultant – Tale...
   Date: 2025-08-28 07:44:15
   Body preview: Your job alert for consultant in Greater Stockholm Metropolitan Area
26 new jobs match your preferen...

4. From: LinkedIn Job Alerts <jobalerts-noreply@linkedin.co...
   Subject: “project manager”: Sun IP - Patent Prosecution Pro...
   Date: 2025-08-28 09:44:

---

## 🏷️ Section 3: Categorize Emails

**What this does**: Analyze emails and assign categories using AI

**Test goal**: ✅ X emails categorized into Other/Advertising, Other/Rest, Review/Job search

⚠️ **Note**: This section is a placeholder - categorization agents will be implemented in Phase 3

In [None]:
# Show emails that need categorization
from models.zero_inbox_models import Email, EmailCategory

session = db_manager.get_session()

# Find emails without categories
uncategorized_emails = (
    session.query(Email)
    .outerjoin(EmailCategory)
    .filter(EmailCategory.email_id.is_(None))
    .all()
)

print(f"📊 Emails ready for categorization: {len(uncategorized_emails)}")

if uncategorized_emails:
    print("\n📧 Sample emails to categorize:")
    for i, email in enumerate(uncategorized_emails[:3], 1):
        print(f"\n{i}. From: {email.sender[:40]}...")
        print(f"   Subject: {email.subject[:60]}...")
        print(f"   Preview: {email.body[:80]}...")
else:
    print("ℹ️ All emails are already categorized")

session.close()

📊 Emails ready for categorization: 15

📧 Sample emails to categorize:

1. From: Adobe Creative Cloud for Photographers <...
   Subject: Transform your images into unique content...
   Preview: ------------------------------------------------------------------------

View w...

2. From: LinkedIn Job Alerts <jobalerts-noreply@l...
   Subject: “project manager”: Kambi - Delivery Manager and more...
   Preview: Your job alert for project manager in Greater Stockholm Metropolitan Area
23 new...

3. From: SEB <noreply@newsletter.seb.se>...
   Subject: Nytt ränteråd för brf:er...
   Preview: SEB

Få koll på den senaste utvecklingen. Dessutom: tips om brf-mässa i Stockhol...


In [9]:
# PLACEHOLDER: Categorization will be implemented in Phase 3
print("🚧 PLACEHOLDER: Email Categorization Agent")
print("")
print("This section will include:")
print("- EmailCategorizer agent using Claude API")
print("- Category rules from CSV configuration")
print("- Confidence scoring for each categorization")
print("- Storage of results in email_categories table")
print("")
print("Target categories:")
print("  - Other/Advertising: Promotional/sales emails")
print("  - Other/Rest: Uncategorized emails")
print("  - Review/Job search: Job opportunities")

🚧 PLACEHOLDER: Email Categorization Agent

This section will include:
- EmailCategorizer agent using Claude API
- Category rules from CSV configuration
- Confidence scoring for each categorization
- Storage of results in email_categories table

Target categories:
  - Other/Advertising: Promotional/sales emails
  - Other/Rest: Uncategorized emails
  - Review/Job search: Job opportunities
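Until the Claude-based agent exists, a keyword baseline can exercise the rest of the pipeline. The sketch below is a stand-in technique, not the planned `EmailCategorizer`. The keyword lists are invented for illustration (the real rules are meant to come from a CSV file), and the confidence is simply the share of a category's keywords found in the message.

```python
from dataclasses import dataclass, field

# Hypothetical rules; the planned system loads these from a CSV configuration.
RULES = {
    "Review/Job search": ["job alert", "new jobs", "you may be a fit"],
    "Other/Advertising": ["sale", "discount", "offer", "unsubscribe"],
}
FALLBACK = "Other/Rest"


@dataclass
class Categorization:
    category: str
    confidence: float
    matched: list = field(default_factory=list)


def categorize(subject: str, body: str) -> Categorization:
    """Pick the category with the highest share of matching keywords."""
    text = f"{subject}\n{body}".lower()
    best = Categorization(FALLBACK, 0.0)
    for category, keywords in RULES.items():
        hits = [kw for kw in keywords if kw in text]
        confidence = len(hits) / len(keywords)
        if confidence > best.confidence:
            best = Categorization(category, confidence, hits)
    return best


result = categorize("Job alert", "26 new jobs match your preferences")
print(result.category, round(result.confidence, 2))
```

Plain substring matching is crude (for example, "sale" also matches "wholesale"), so this is only useful as a baseline to compare the AI categorizer against.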


---

## ⚡ Section 4: Execute Actions

**What this does**: Run specific analysis on categorized emails

**Test goal**: ✅ X actions completed (summaries, job analysis, etc.)

⚠️ **Note**: This section is a placeholder - action agents will be implemented in Phase 4

In [None]:
# Show categorized emails ready for actions
session = db_manager.get_session()

categorized_emails = session.query(Email).join(EmailCategory).all()

print(f"📊 Categorized emails ready for actions: {len(categorized_emails)}")

if categorized_emails:
    # Group by category
    from collections import defaultdict

    by_category = defaultdict(list)

    for email in categorized_emails:
        for category in email.categories:
            key = f"{category.category}/{category.subcategory}"
            by_category[key].append(email)

    print("\n📈 Breakdown by category:")
    for category, emails in by_category.items():
        print(f"  - {category}: {len(emails)} emails")
else:
    print("ℹ️ No categorized emails found")

session.close()

📊 Categorized emails ready for actions: 0
ℹ️ No categorized emails found


In [11]:
# PLACEHOLDER: Action agents will be implemented in Phase 4
print("🚧 PLACEHOLDER: Action Agents")
print("")
print("This section will include:")
print("")
print("1. AdvertisingAnalyzer:")
print("   - Analyze advertising emails")
print("   - Summarize categorization reasoning")
print("   - Identify key indicators")
print("")
print("2. RestCategorizer:")
print("   - Process uncategorized emails")
print("   - Create sender/subject summaries")
print("   - Explain why emails didn't fit other categories")
print("")
print("3. JobSearchAnalyzer:")
print("   - Scan for target companies (MUST, Polisen, Ework)")
print("   - Identify relevant roles and domains")
print("   - Generate interest level and recommendations")
print("")
print("Results will be stored in agent_actions table")

🚧 PLACEHOLDER: Action Agents

This section will include:

1. AdvertisingAnalyzer:
   - Analyze advertising emails
   - Summarize categorization reasoning
   - Identify key indicators

2. RestCategorizer:
   - Process uncategorized emails
   - Create sender/subject summaries
   - Explain why emails didn't fit other categories

3. JobSearchAnalyzer:
   - Scan for target companies (MUST, Polisen, Ework)
   - Identify relevant roles and domains
   - Generate interest level and recommendations

Results will be stored in agent_actions table
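The target-company scan planned for `JobSearchAnalyzer` could start from whole-word matching. The sketch below is one hypothetical approach, not the planned agent. It treats all-caps names such as MUST as case-sensitive so that the ordinary English word "must" does not trigger a match.

```python
import re

TARGET_COMPANIES = ["MUST", "Polisen", "Ework"]


def find_target_companies(text, companies=TARGET_COMPANIES):
    """Return the target companies mentioned in text as whole words."""
    found = []
    for name in companies:
        # All-caps names are matched case-sensitively; others ignore case
        flags = 0 if name.isupper() else re.IGNORECASE
        if re.search(rf"\b{re.escape(name)}\b", text, flags):
            found.append(name)
    return found


print(find_target_companies("You must apply: Ework Group is hiring"))
```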


---

## 📊 Section 5: View Results

**What this does**: Display processing summary and export for review

**Test goal**: ✅ Results summary displayed, JSON file exported

In [12]:
# Generate processing summary
from models.zero_inbox_models import AgentAction, Email, EmailCategory

session = db_manager.get_session()

# Get counts
total_emails = session.query(Email).count()
categorized_count = session.query(Email).join(EmailCategory).count()
actions_count = session.query(AgentAction).count()

print("📊 ZERO INBOX PROCESSING SUMMARY")
print("=" * 40)
print(f"📧 Total emails in database: {total_emails}")
print(f"🏷️ Emails categorized: {categorized_count}")
print(f"⚡ Actions completed: {actions_count}")
print(f"⏳ Pending categorization: {total_emails - categorized_count}")

session.close()

📊 ZERO INBOX PROCESSING SUMMARY
========================================
📧 Total emails in database: 15
🏷️ Emails categorized: 0
⚡ Actions completed: 0
⏳ Pending categorization: 15


In [None]:
# Show category breakdown (when categories exist)
from sqlalchemy import func

session = db_manager.get_session()

# Count rows per (category, subcategory) in a single GROUP BY query.
# Query.count() returns a plain int, so it cannot be labeled or used as a column;
# func.count() is the SQL aggregate that belongs here.
category_stats = (
    session.query(
        EmailCategory.category,
        EmailCategory.subcategory,
        func.count().label("count"),
    )
    .group_by(EmailCategory.category, EmailCategory.subcategory)
    .all()
)

if category_stats:
    print("\n🏷️ CATEGORY BREAKDOWN:")
    for category, subcategory, count in category_stats:
        if category != "system_template":  # Skip template records
            print(f"  - {category}/{subcategory}: {count} emails")
else:
    print("\nℹ️ No categorized emails yet")

session.close()

In [None]:
# Export results for human review (when data exists)
import json
import os
from datetime import datetime

# Create output directory
output_dir = "output/human_review"
os.makedirs(output_dir, exist_ok=True)

# Generate export filename
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
export_filename = f"zero_inbox_review_{timestamp}.json"
export_path = os.path.join(output_dir, export_filename)

# Create export data structure
export_data = {
    "export_metadata": {
        "export_date": datetime.now().isoformat(),
        "total_emails": total_emails,
        "categorized_emails": categorized_count,
        "pending_review": categorized_count,
    },
    "emails": [],
}

# Add a structure template while no emails are categorized yet
# (real email data will replace it once categorization is implemented)
if categorized_count == 0:
    export_data["emails"].append(
        {
            "note": "Actual email data will be populated when categorization is implemented",
            "structure": {
                "email_id": "database_id",
                "sender": "email_sender",
                "subject": "email_subject",
                "date": "email_date",
                "original_category": "AI_category",
                "original_subcategory": "AI_subcategory",
                "action_result": "agent_analysis",
                "confidence": "0.0-1.0",
                "review_fields": {
                    "approved": "null (to be filled by human)",
                    "corrected_category": "null (if correction needed)",
                    "corrected_subcategory": "null (if correction needed)",
                    "human_reasoning": "null (explanation from human)",
                },
            },
        }
    )

# Write export file
with open(export_path, "w", encoding="utf-8") as f:
    json.dump(export_data, f, indent=2, ensure_ascii=False)

print(f"\n📄 Export file created: {export_path}")
print(f"📊 Ready for human review: {categorized_count} emails")

if categorized_count == 0:
    print(
        "ℹ️ Export contains structure template - will have real data after categorization"
    )

---

## 👤 Section 6: Import Feedback (Optional)

**What this does**: Load human corrections and update database

**Test goal**: ✅ Human feedback integrated, corrections stored

⚠️ **Note**: Run this only after manually reviewing and correcting the exported JSON file

In [None]:
# PLACEHOLDER: Human feedback import
print("🚧 PLACEHOLDER: Human Feedback Import")
print("")
print("This section will include:")
print("- Load reviewed JSON files from output/human_review/")
print("- Validate human corrections")
print("- Update human_reviews table with feedback")
print("- Generate feedback integration summary")
print("- Log corrections for future learning")
print("")
print("Instructions for human reviewers:")
print("1. Open the exported JSON file")
print("2. For each email, fill in review_fields:")
print("   - approved: true/false")
print("   - corrected_category: (if approved=false)")
print("   - corrected_subcategory: (if approved=false)")
print("   - human_reasoning: explanation")
print("3. Save the file and run this section")
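The reviewer instructions above imply a few checks that the import step could run before touching the database. The sketch below is a hypothetical validator for one `review_fields` object. It assumes, following the instructions, that both corrected fields are required whenever `approved` is false.

```python
def validate_review(review_fields):
    """Return a list of problems found in one review_fields dict (empty if valid)."""
    errors = []
    approved = review_fields.get("approved")
    if not isinstance(approved, bool):
        # Covers both an unfilled (null) value and a non-boolean value
        errors.append("approved must be true or false")
    elif not approved:
        for key in ("corrected_category", "corrected_subcategory"):
            if not review_fields.get(key):
                errors.append(f"{key} is required when approved is false")
    return errors


print(validate_review({"approved": False, "corrected_category": "Other"}))
```

Collecting all problems instead of raising on the first one lets a reviewer fix the whole file in one pass.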

---

## 🎯 Next Steps

**Current Status**: Phase 1 ✅ Database + Phase 2 ✅ Email Fetching

**To implement next**:
1. **Phase 3**: Email Categorization (Section 3 above)
2. **Phase 4**: Action Agents (Section 4 above)  
3. **Phase 5**: Results Summary (enhance Section 5)
4. **Phase 6**: Human Feedback Loop (Section 6 above)

**How to use this notebook**:
- Run sections 1-2 to test current functionality
- Sections 3-6 are placeholders for future phases
- Run Section 1 before any other section: later cells reuse `config`, `db_manager`, and names defined in earlier cells
- Clear status messages show what's working vs. placeholder