
In [None]:
"""
Data Models for LangChain Sales Pipeline

This file demonstrates:
- Pydantic models for data validation
- Structured data for LangChain chains
- Type safety and validation
"""

from typing import List, Optional, Dict, Any
from pydantic import BaseModel, Field
from enum import Enum

class CompanySize(str, Enum):
    """Company size enumeration"""
    STARTUP = "startup"
    MID_MARKET = "mid-market"
    ENTERPRISE = "enterprise"

class CompanyInfo(BaseModel):
    """Company information model"""
    name: str = Field(description="Company name")
    industry: str = Field(description="Industry sector")
    size: CompanySize = Field(description="Company size")
    location: str = Field(description="Company location")
    website: str = Field(description="Company website")
    description: str = Field(description="Company description")
    recent_news: List[str] = Field(description="Recent company news")
    key_contacts: List[Dict[str, str]] = Field(description="Key company contacts")

class PainPointSeverity(str, Enum):
    """Pain point severity levels"""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class PainPoint(BaseModel):
    """Pain point model"""
    category: str = Field(description="Pain point category")
    description: str = Field(description="Pain point description")
    severity: PainPointSeverity = Field(description="Pain point severity")
    evidence: List[str] = Field(description="Supporting evidence")
    potential_solution: str = Field(description="Potential solution")

class OpportunityPriority(str, Enum):
    """Opportunity priority levels"""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    URGENT = "urgent"

class Opportunity(BaseModel):
    """Opportunity model"""
    category: str = Field(description="Opportunity category")
    description: str = Field(description="Opportunity description")
    priority: OpportunityPriority = Field(description="Opportunity priority")
    evidence: List[str] = Field(description="Supporting evidence")
    potential_value: str = Field(description="Potential value")

class AnalysisResult(BaseModel):
    """Analysis result model"""
    company_name: str = Field(description="Company name")
    pain_points: List[PainPoint] = Field(description="Identified pain points")
    opportunities: List[Opportunity] = Field(description="Identified opportunities")
    industry_insights: List[str] = Field(description="Industry insights")
    recommended_approach: str = Field(description="Recommended sales approach")
    confidence_score: float = Field(description="Analysis confidence score", ge=0.0, le=1.0)

class MessageChannel(str, Enum):
    """Message channel types"""
    EMAIL = "email"
    LINKEDIN = "linkedin"
    PHONE = "phone"
    SOCIAL = "social"

class MessageTone(str, Enum):
    """Message tone types"""
    PROFESSIONAL = "professional"
    CASUAL = "casual"
    URGENT = "urgent"
    CONSULTATIVE = "consultative"

class OutreachMessage(BaseModel):
    """Outreach message model"""
    channel: MessageChannel = Field(description="Communication channel")
    subject: str = Field(description="Message subject")
    body: str = Field(description="Message body")
    tone: MessageTone = Field(description="Message tone")
    personalization_elements: List[str] = Field(description="Personalization elements")
    call_to_action: str = Field(description="Call to action")

class PersonalizationStrategy(str, Enum):
    """Personalization strategy types"""
    PAIN_POINT_FOCUSED = "pain_point_focused"
    OPPORTUNITY_FOCUSED = "opportunity_focused"
    RELATIONSHIP_BUILDING = "relationship_building"

class PersonalizationResult(BaseModel):
    """Personalization result model"""
    company_name: str = Field(description="Company name")
    primary_contact: Optional[Dict[str, str]] = Field(default=None, description="Primary contact")
    messages: List[OutreachMessage] = Field(description="Generated messages")
    personalization_strategy: PersonalizationStrategy = Field(description="Strategy used")
    recommended_sequence: List[str] = Field(description="Recommended outreach sequence")

class WorkflowStatus(str, Enum):
    """Workflow status enumeration"""
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    RETRYING = "retrying"

class AgentStatus(str, Enum):
    """Agent status enumeration"""
    PENDING = "pending"
    READY = "ready"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"

class WorkflowStep(BaseModel):
    """Workflow step model"""
    step_id: str = Field(description="Step identifier")
    agent_name: str = Field(description="Agent name")
    status: AgentStatus = Field(default=AgentStatus.PENDING, description="Step status")
    start_time: Optional[str] = Field(default=None, description="Step start time")
    end_time: Optional[str] = Field(default=None, description="Step end time")
    error_message: Optional[str] = Field(default=None, description="Error message")
    retry_count: int = Field(default=0, description="Retry count")
    max_retries: int = Field(default=3, description="Maximum retries")
    input_data: Optional[Dict[str, Any]] = Field(default=None, description="Input data")
    output_data: Optional[Dict[str, Any]] = Field(default=None, description="Output data")

class WorkflowState(BaseModel):
    """Workflow state model"""
    workflow_id: str = Field(description="Workflow identifier")
    company_name: str = Field(description="Company name")
    status: WorkflowStatus = Field(default=WorkflowStatus.PENDING, description="Workflow status")
    start_time: Optional[str] = Field(default=None, description="Workflow start time")
    end_time: Optional[str] = Field(default=None, description="Workflow end time")
    steps: List[WorkflowStep] = Field(default_factory=list, description="Workflow steps")
    current_step_index: int = Field(default=0, description="Current step index")
    error_message: Optional[str] = Field(default=None, description="Error message")
    human_intervention_required: bool = Field(default=False, description="Human intervention required")
    human_intervention_reason: Optional[str] = Field(default=None, description="Human intervention reason")

# Example usage
if __name__ == "__main__":
    # Test the models
    company = CompanyInfo(
        name="Acme Corporation",
        industry="Manufacturing",
        size=CompanySize.MID_MARKET,
        location="Chicago, IL",
        website="https://acmecorp.com",
        description="Leading manufacturer of industrial equipment",
        recent_news=["Expansion into European markets", "New sustainability initiative"],
        key_contacts=[{"name": "Sarah Johnson", "title": "CEO", "email": "sarah@acmecorp.com"}]
    )

    print(f"Company: {company.name}")
    print(f"Industry: {company.industry}")
    print(f"Size: {company.size.value}")  # .value prints "mid-market" on every Python version
    print(f"Contacts: {len(company.key_contacts)}")

    # Test validation
    # Note: name="" would pass, because the str fields carry no min_length constraint.
    try:
        invalid_company = CompanyInfo(
            name="Acme Corporation",
            industry="Manufacturing",
            size="gigantic",  # Not a CompanySize value, so this fails validation
            location="Chicago, IL",
            website="https://acmecorp.com",
            description="Leading manufacturer",
            recent_news=[],
            key_contacts=[]
        )
    except ValueError as e:  # pydantic's ValidationError subclasses ValueError
        print(f"Validation error (expected): {e}")


This `langchain_models.py` file is the **data backbone** of your pipeline: it defines the structured inputs, outputs, and workflow states that the agents share. Let's break it down section by section 👇

---

## 🟦 Company Information

* **`CompanyInfo`**: Represents a researched company.

  * Fields: `name`, `industry`, `size`, `location`, `website`, `description`, `recent_news`, `key_contacts`.
  * Uses **`CompanySize` enum**: `"startup" | "mid-market" | "enterprise"`.
* **Why**: Keeps company data structured so downstream agents (analysis, personalization) get predictable inputs.
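
As a minimal sketch of what "predictable inputs" means in practice, the snippet below re-declares a trimmed copy of `CompanyInfo` (three of its fields, so it runs on its own) and shows two behaviours: a raw string such as `"mid-market"` is coerced to the `CompanySize` enum, and an unknown size or a missing field raises `ValidationError`. The `invalid_fields` helper is hypothetical and not part of the file.

```python
from enum import Enum
from typing import List

from pydantic import BaseModel, Field, ValidationError


class CompanySize(str, Enum):
    STARTUP = "startup"
    MID_MARKET = "mid-market"
    ENTERPRISE = "enterprise"


class CompanyInfo(BaseModel):
    """Trimmed copy of the model above, kept short for illustration."""
    name: str = Field(description="Company name")
    size: CompanySize = Field(description="Company size")
    recent_news: List[str] = Field(description="Recent company news")


def invalid_fields(raw: dict) -> List[str]:
    """Hypothetical helper: return the names of fields that fail validation."""
    try:
        CompanyInfo(**raw)
    except ValidationError as exc:
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []


# A plain string (from a scraper or an LLM reply) is coerced to the enum member.
company = CompanyInfo(name="Acme Corporation", size="mid-market", recent_news=[])
print(company.size is CompanySize.MID_MARKET)  # True

# An unknown size and a missing list are both reported, not silently accepted.
problems = invalid_fields({"name": "Acme Corporation", "size": "gigantic"})
print(problems)  # ['recent_news', 'size']
```

Collecting field names instead of letting the exception propagate is one option when a research step should report what to re-fetch; raising immediately is just as reasonable.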

---

## 🟦 Analysis Models

* **`PainPoint`**: Defines a business problem.

  * Fields: `category`, `description`, `severity`, `evidence`, `potential_solution`.
  * Severity comes from **`PainPointSeverity` enum**: `"low" | "medium" | "high" | "critical"`.

* **`Opportunity`**: Defines a business opportunity.

  * Fields: `category`, `description`, `priority`, `evidence`, `potential_value`.
  * Priority uses **`OpportunityPriority` enum**: `"low" | "medium" | "high" | "urgent"`.

* **`AnalysisResult`**: Output of your **AnalysisAgent**.

  * Contains: `company_name`, `pain_points`, `opportunities`, `industry_insights`, `recommended_approach`, and a `confidence_score` constrained to 0.0–1.0.

👉 Validating the agent's output against these models gives you structured analysis instead of raw free-text, and rejects replies that don't fit the schema.
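
To make the analysis models concrete, here is a rough sketch of turning a JSON reply into an `AnalysisResult`. The models are trimmed copies (opportunities and insights are left out), `parse_analysis` is a hypothetical helper, and the payload is made-up data standing in for an LLM reply. It avoids version-specific Pydantic methods by passing the decoded dict to the constructor.

```python
import json
from enum import Enum
from typing import List

from pydantic import BaseModel, Field, ValidationError


class PainPointSeverity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


class PainPoint(BaseModel):
    category: str
    description: str
    severity: PainPointSeverity
    evidence: List[str]
    potential_solution: str


class AnalysisResult(BaseModel):
    """Trimmed copy: opportunities and industry_insights omitted for brevity."""
    company_name: str
    pain_points: List[PainPoint]
    recommended_approach: str
    confidence_score: float = Field(ge=0.0, le=1.0)


def parse_analysis(raw_json: str) -> AnalysisResult:
    """Hypothetical helper: decode a JSON reply and validate it."""
    return AnalysisResult(**json.loads(raw_json))


# Made-up payload standing in for what an analysis agent might return.
payload = {
    "company_name": "Acme Corporation",
    "pain_points": [{
        "category": "operations",
        "description": "Manual order tracking",
        "severity": "high",
        "evidence": ["Example evidence item"],
        "potential_solution": "Workflow automation",
    }],
    "recommended_approach": "Lead with automation savings",
    "confidence_score": 0.8,
}

result = parse_analysis(json.dumps(payload))
print(result.pain_points[0].severity.value, result.confidence_score)  # high 0.8

# A score outside 0.0-1.0 violates the ge/le constraints and is rejected.
try:
    parse_analysis(json.dumps(dict(payload, confidence_score=1.8)))
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```

Nested dicts are validated recursively, so a bad `severity` inside one pain point fails the whole result.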

---

## 🟦 Personalization Models

* **`OutreachMessage`**: A single message for one channel: an email, a LinkedIn message, a phone script, or a social post.

  * Fields: `channel`, `subject`, `body`, `tone`, `personalization_elements`, `call_to_action`.
  * Channels come from **`MessageChannel` enum**: `"email" | "linkedin" | "phone" | "social"`.
  * Tones come from **`MessageTone` enum**: `"professional" | "casual" | "urgent" | "consultative"`.

* **`PersonalizationResult`**: What your **PersonalizationAgent** produces.

  * Includes: `company_name`, `primary_contact`, list of `messages`, chosen `personalization_strategy`, and a `recommended_sequence`.
  * Strategy is from **`PersonalizationStrategy` enum**: `"pain_point_focused" | "opportunity_focused" | "relationship_building"`.

👉 Every message carries the same fields whatever its channel, so one result can hold a consistent, multi-channel outreach sequence.
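
The file does not say what the strings in `recommended_sequence` contain. Purely as an illustration, the sketch below assumes each entry is a `MessageChannel` value and uses a hypothetical `order_by_sequence` helper to put the generated messages in that order. `OutreachMessage` is a trimmed copy without the tone and personalization fields.

```python
from enum import Enum
from typing import List

from pydantic import BaseModel


class MessageChannel(str, Enum):
    EMAIL = "email"
    LINKEDIN = "linkedin"
    PHONE = "phone"
    SOCIAL = "social"


class OutreachMessage(BaseModel):
    """Trimmed copy: tone and personalization_elements are omitted."""
    channel: MessageChannel
    subject: str
    body: str
    call_to_action: str


def order_by_sequence(
    messages: List[OutreachMessage], sequence: List[str]
) -> List[OutreachMessage]:
    """Hypothetical helper. ASSUMPTION: each sequence entry is a channel value.

    If several messages share a channel, only the last one is kept.
    Channels in the sequence that have no message are skipped.
    """
    by_channel = {m.channel.value: m for m in messages}
    return [by_channel[name] for name in sequence if name in by_channel]


messages = [
    OutreachMessage(
        channel="linkedin",
        subject="Connecting",
        body="Short intro note",
        call_to_action="Accept the invite",
    ),
    OutreachMessage(
        channel="email",
        subject="Cutting manual order tracking",
        body="Longer pitch",
        call_to_action="Book a 20-minute call",
    ),
]

ordered = order_by_sequence(messages, ["email", "linkedin", "phone"])
print([m.channel.value for m in ordered])  # ['email', 'linkedin']
```

If the sequence turns out to hold free-text steps rather than channel names, this lookup would not apply.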

---

## 🟦 Workflow Orchestration

* **`WorkflowStep`**: One step in the pipeline (e.g., run ResearchAgent).

  * Tracks: `step_id`, `agent_name`, `status`, `start_time`, `end_time`, `error_message`, `retry_count`, `max_retries` (default 3), plus optional `input_data` and `output_data`.
  * Uses **`AgentStatus` enum**: `"pending" | "ready" | "running" | "completed" | "failed" | "skipped"`.

* **`WorkflowState`**: The overall pipeline state.

  * Fields: `workflow_id`, `company_name`, `status`, `start_time`, `end_time`, `steps`, `current_step_index`, `error_message`, `human_intervention_required`, `human_intervention_reason`.
  * Uses **`WorkflowStatus` enum**: `"pending" | "in_progress" | "completed" | "failed" | "retrying"`.

👉 This gives your orchestrator structured places to record **audit trails, retries, and errors**; the retry and error-handling logic itself lives in the orchestrator, not in these models.
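
The models only store state, so here is one possible sketch of how an orchestrator might update a `WorkflowStep` while retrying. `run_step` and `flaky_agent` are hypothetical names and the retry policy (one initial attempt plus up to `max_retries` retries, no backoff) is an assumption. The model is a trimmed copy of the one in the file.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Callable, Dict, Optional

from pydantic import BaseModel


class AgentStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class WorkflowStep(BaseModel):
    """Trimmed copy (READY/SKIPPED statuses and input_data omitted)."""
    step_id: str
    agent_name: str
    status: AgentStatus = AgentStatus.PENDING
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    error_message: Optional[str] = None
    retry_count: int = 0
    max_retries: int = 3
    output_data: Optional[Dict[str, Any]] = None


def _now() -> str:
    """Current UTC time as an ISO 8601 string (the model stores times as str)."""
    return datetime.now(timezone.utc).isoformat()


def run_step(step: WorkflowStep, agent: Callable[[], Dict[str, Any]]) -> WorkflowStep:
    """Hypothetical runner: call `agent`, retrying up to step.max_retries times."""
    step.start_time = _now()
    while True:
        step.status = AgentStatus.RUNNING
        try:
            step.output_data = agent()
            step.status = AgentStatus.COMPLETED
            step.error_message = None
            break
        except Exception as exc:  # broad on purpose: record any agent failure
            step.error_message = str(exc)
            if step.retry_count >= step.max_retries:
                step.status = AgentStatus.FAILED
                break
            step.retry_count += 1
    step.end_time = _now()
    return step


attempts = {"count": 0}


def flaky_agent() -> Dict[str, Any]:
    """Stand-in agent that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return {"summary": "research complete"}


step = run_step(WorkflowStep(step_id="research", agent_name="ResearchAgent"), flaky_agent)
print(step.status.value, step.retry_count)  # completed 2
```

Because every attempt is written back to the step, the final object doubles as a log entry: status, retry count, timestamps, and the last error (if any).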

---

## 🟦 Why This Matters

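* **Pydantic validation**: Rejects missing or wrongly typed fields when an object is built, before they reach an agent. The models set no length constraints, so an empty string still counts as a valid `str`.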
* **Schema-driven outputs**: Ensures every agent produces predictable, machine-readable results.
* **Auditability**: Workflow states and step logs give you observability and debugging power.
* **Interoperability**: All agents “speak the same language” (CompanyInfo in, AnalysisResult out, etc.).
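
"Schema-driven" can be taken literally: Pydantic can export each model as a JSON Schema, and the `Field(description=...)` text from the file appears in it. The sketch below uses a copy of `PainPoint`; the `getattr` fallback is only there so the snippet runs on both Pydantic v2 (`model_json_schema`) and v1 (`schema`). Whether your chain embeds this schema in a prompt is up to the chain; nothing in the file does so on its own.

```python
from enum import Enum
from typing import List

from pydantic import BaseModel, Field


class PainPointSeverity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


class PainPoint(BaseModel):
    category: str = Field(description="Pain point category")
    description: str = Field(description="Pain point description")
    severity: PainPointSeverity = Field(description="Pain point severity")
    evidence: List[str] = Field(description="Supporting evidence")
    potential_solution: str = Field(description="Potential solution")


# Pydantic v2 exposes model_json_schema(); v1 exposes schema(). Use whichever exists.
schema_fn = getattr(PainPoint, "model_json_schema", None) or PainPoint.schema
schema = schema_fn()

# Field names and their descriptions travel with the schema.
print(sorted(schema["properties"]))
print(schema["properties"]["category"]["description"])  # Pain point category

# Fields declared without a default are listed as required.
print(sorted(schema["required"]) == sorted(schema["properties"]))  # True
```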

---

✅ Big Picture: This file is like your **contract layer**. It defines the grammar of your sales pipeline so agents, orchestrators, and LangChain chains can interoperate cleanly.


