# OutreachAI Accelerator: A Notebook Walkthrough
**Author:** Malini
**For:** Caprae Capital Partners - Intern Pre-Work

**Objective:** This notebook deconstructs the core logic of the OutreachAI application. It demonstrates how a raw lead list is systematically enriched with a quantitative priority score and qualitative, AI-powered outreach messaging to create a high-impact tool for sales and M&A teams.

---

## Part 1: The Problem - Unlocking Value from Flat Lead Lists

The starting point is usually a flat file: a CSV of potential leads. In its raw state, this data offers little immediate value. Sales and M&A teams must first work through three tasks:
1.  Identifying which leads are genuinely high-potential.
2.  Researching each promising lead.
3.  Crafting personalized outreach messages – a crucial step for cutting through the noise.

Doing these steps manually is time-consuming and doesn't scale. The alternative, generic mass outreach, typically yields low engagement and conversion rates. My OutreachAI application tackles this by automating the intelligence gathering and the first draft of each message.

---

## Part 2: The OutreachAI Solution - A Step-by-Step Breakdown

OutreachAI transforms the raw CSV data into an actionable dashboard with enriched insights. Let's walk through the key processing stages:

### Step 2.1: Data Ingestion and Parsing

The process starts when the user uploads their SaaSquatch Enrichment CSV file. My application's backend then performs the following parsing and initial transformation:

*   **File Reading:** The CSV content is read.
*   **Header Identification:** Key headers (e.g., `Company`, `Website`, `Industry`, `Revenue`, `Employee Count`, `Owner's Title`, `Owner's Email`, `BBB Rating`, `City`, `State`) are identified.
*   **Data Cleaning & Formatting:**
    *   Revenue strings (e.g., "4.7M", "500k") are converted into numerical values expressed in millions, so "4.7M" becomes 4.7 and "500k" becomes 0.5.
    *   Employee count is cleaned to a numerical value.
    *   Location is derived by combining `City` and `State` if a direct `Location` column isn't present or is empty.
*   **Unique ID Assignment:** Each lead is assigned a unique internal ID for tracking.

In [ ]:
# Conceptual Python sketch of the parsing logic. The actual application implements it
# in TypeScript with custom CSV parsing logic.
import re

def parse_revenue_to_millions(revenue_str):
    """Convert a revenue string such as '4.7M' or '500k' to a float in millions."""
    if revenue_str is None:
        return None
    text = str(revenue_str).strip().lower().replace(',', '').lstrip('$')
    # Match the leading number and an optional 'm'/'k' suffix. Anything else
    # (empty string, 'N/A', NaN) returns None instead of raising ValueError.
    match = re.match(r'(\d*\.?\d+)\s*([mk]?)', text)
    if not match:
        return None
    value, suffix = float(match.group(1)), match.group(2)
    if suffix == 'k':
        return value / 1000
    return value  # 'm' suffix, or no suffix (assumed to already be in millions)

# Example data structure after parsing a row
raw_lead_data_example = {
    'Company': 'Catalytic', 'Website': 'https://catalytic.com/', 'Industry': 'Productivity/tech',
    'Employee Count': '14', 'Revenue': '450.8k', 'BBB Rating': 'A+',
    'City': 'Chicago', 'State': 'IL', "Owner's Title": 'Senior Designer', "Owner's Email": 'example@example.com'
}

transformed_lead = {
    'id': 'unique_lead_id_123',
    'companyName': raw_lead_data_example['Company'],
    'website': raw_lead_data_example['Website'],
    'industry': raw_lead_data_example['Industry'],
    'location': f"{raw_lead_data_example['City']}, {raw_lead_data_example['State']}",
    'revenue': parse_revenue_to_millions(raw_lead_data_example['Revenue']),  # About 0.4508 (millions)
    'employeeCount': int(raw_lead_data_example['Employee Count']),  # Becomes 14
    'title': raw_lead_data_example["Owner's Title"],
    'ownerEmail': raw_lead_data_example["Owner's Email"],
    'bbbRating': raw_lead_data_example['BBB Rating']
}
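
The remaining ingestion steps from the list in Step 2.1 (reading rows by header, cleaning the employee count, falling back to `City` and `State` for the location, and assigning an ID) could look roughly like the sketch below. It is only an illustration: the helper names `clean_employee_count`, `derive_location`, and `ingest_csv` are hypothetical, Python's standard `csv` module stands in for the app's custom TypeScript parser, and the use of `uuid4` for IDs is an assumption, since the notebook does not say how IDs are generated.

```python
import csv
import io
import re
import uuid

def clean_employee_count(raw):
    """Return the first integer found in a value like '1,200' or '14 staff', else None."""
    digits = re.search(r'\d+', str(raw or '').replace(',', ''))
    return int(digits.group()) if digits else None

def derive_location(row):
    """Prefer a non-empty 'Location' column; otherwise join 'City' and 'State'."""
    location = (row.get('Location') or '').strip()
    if location:
        return location
    parts = [(row.get(key) or '').strip() for key in ('City', 'State')]
    return ', '.join(part for part in parts if part)

def ingest_csv(csv_text):
    """Parse CSV text into a list of lead dicts, each with a unique internal ID."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first row is treated as headers
    leads = []
    for row in reader:
        leads.append({
            'id': str(uuid.uuid4()),  # assumed ID scheme, not confirmed by the app
            'companyName': (row.get('Company') or '').strip(),
            'location': derive_location(row),
            'employeeCount': clean_employee_count(row.get('Employee Count')),
        })
    return leads

sample_csv = "Company,City,State,Employee Count\nCatalytic,Chicago,IL,14\n"
leads = ingest_csv(sample_csv)
print(leads[0]['location'], leads[0]['employeeCount'])
```

Looking up columns with `row.get(...)` means a missing header produces an empty value instead of an exception, which suits exports whose column sets vary.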

### Step 2.2: Priority Scoring

Once parsed, each lead is assigned a Priority Score (ranging from 0 to 100). This score is a quantitative measure of a lead's potential, calculated based on a weighted model:

*   **Revenue:** Up to 30 points (higher revenue = more points).
*   **Employee Count:** Up to 30 points (larger companies = more points).
*   **Owner's Title:** Up to 20 points (C-suite/VP/Director titles score higher).
*   **BBB Rating:** Up to 20 points (better ratings = more points).

This scoring system allows users to quickly sort leads and focus on those with the highest potential. For transparency, the UI includes a tooltip that appears when hovering over the score, detailing the specific factors that contributed to that lead's score.

**Conceptual Python-like Representation of Scoring Logic:**

In [ ]:
# Conceptual Python sketch of the scoring logic. Tiers elided in this walkthrough
# are marked with "..." comments.
def calculate_priority_score(lead):
    score = 0
    explanations = []

    # Revenue scoring (revenue is expressed in millions)
    revenue = lead.get('revenue')
    if revenue:
        if revenue > 50:
            score += 30
            explanations.append(f"+30: High Revenue (${revenue}M)")
        elif revenue >= 10:
            score += 20
            explanations.append(f"+20: Significant Revenue (${revenue}M)")
        elif revenue >= 1:
            score += 10
            explanations.append(f"+10: Notable Revenue (${revenue}M)")

    # Employee Count scoring
    employee_count = lead.get('employeeCount')
    if employee_count:
        if employee_count > 500:
            score += 30
            explanations.append(f"+30: Large Employee Count ({employee_count})")
        # ... other employee tiers ...

    # Title scoring (substring match, so "Vice President" also matches 'president')
    title = lead.get('title')
    if title:
        title_lower = title.lower()
        if any(t in title_lower for t in ['ceo', 'founder', 'president']):
            score += 20
            explanations.append(f"+20: Key Title ({title})")
        # ... other title tiers ...

    # BBB Rating scoring
    bbb_rating = lead.get('bbbRating')
    if bbb_rating and bbb_rating not in ['N/A', '']:
        if bbb_rating == 'A+':
            score += 20
            explanations.append(f"+20: Excellent BBB Rating ({bbb_rating})")
        # ... other BBB tiers ...

    if not explanations:
        explanations.append("No specific scoring factors met.")
    return {"total": min(score, 100), "breakdown": {"explanations": explanations}}

# Example usage:
# for lead in all_transformed_leads:
#     lead['priorityScoreInfo'] = calculate_priority_score(lead)
#
# Leads are then sorted by 'priorityScoreInfo.total' in descending order in the UI.
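
The sorting and tooltip behaviour described in this step happens in the UI, but the idea can be sketched in Python. The sketch below assumes each lead already carries a `priorityScoreInfo` dict in the shape returned by `calculate_priority_score`. The function names `rank_leads` and `tooltip_text`, the sample leads, and the tooltip wording are hypothetical.

```python
def rank_leads(leads):
    """Return a new list of leads sorted by total priority score, highest first."""
    return sorted(leads, key=lambda lead: lead['priorityScoreInfo']['total'], reverse=True)

def tooltip_text(lead):
    """Build a plain-text score breakdown, one contributing factor per line."""
    info = lead['priorityScoreInfo']
    lines = [f"Priority Score: {info['total']}/100"]
    lines.extend(info['breakdown']['explanations'])
    return '\n'.join(lines)

# Hypothetical leads with scores already attached
scored_leads = [
    {'companyName': 'Alpha Co', 'priorityScoreInfo': {
        'total': 30,
        'breakdown': {'explanations': ['+10: Notable Revenue ($2.0M)',
                                       '+20: Excellent BBB Rating (A+)']}}},
    {'companyName': 'Beta Inc', 'priorityScoreInfo': {
        'total': 50,
        'breakdown': {'explanations': ['+30: High Revenue ($75.0M)',
                                       '+20: Key Title (CEO)']}}},
]

ranked = rank_leads(scored_leads)
print([lead['companyName'] for lead in ranked])
print(tooltip_text(ranked[0]))
```

Because `sorted` is stable, leads with equal scores keep their original CSV order, which keeps the dashboard ordering predictable.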

### Step 2.3: AI-Powered Qualitative Enrichment

Beyond quantitative scoring, OutreachAI leverages Google's Gemini 2.0 Flash model (via Genkit) to provide qualitative enrichment in the form of personalized outreach message components.

#### Customizable AI Persona

Before generating messages, the user can specify their "Sales Goal / Product Focus" (e.g., "Selling an AI-powered accounting automation tool"). This user-defined context is then dynamically included in the prompts sent to the AI, ensuring the generated content is tailored and relevant to the user's specific campaign.
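
As a rough illustration of how the user's sales goal might be folded into a prompt, here is a minimal Python sketch. It only assembles a prompt string and does not call Genkit or Gemini. The function name `build_outreach_prompt`, the prompt wording, and the fallback goal are all assumptions and are not taken from the application's actual prompts.

```python
def build_outreach_prompt(lead, sales_goal):
    """Assemble a prompt string that embeds the user's sales goal and lead details."""
    # Assumed fallback for when the user leaves the goal field blank
    goal = sales_goal.strip() or "General business outreach"
    return (
        "You are drafting a personalized outreach message.\n"
        f"Sales goal / product focus: {goal}\n"
        f"Company: {lead['companyName']}\n"
        f"Industry: {lead['industry']}\n"
        f"Recipient title: {lead['title']}\n"
        "Write a short, relevant opening line for this recipient."
    )

example_lead = {
    'companyName': 'Catalytic',
    'industry': 'Productivity/tech',
    'title': 'Senior Designer',
}
prompt = build_outreach_prompt(example_lead, "Selling an AI-powered accounting automation tool")
print(prompt)
```

Keeping the goal as a separate parameter means the same lead data can be reused across campaigns by changing only the goal text.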