# Insight Generation Agent Description
Produces business insights and pipeline analysis from CRM data by computing metrics and passing them to Gemini in structured prompts that cover conversion rates, deal performance, and strategic recommendations for sales optimization.

Available Insight Types:
- **pipeline_analysis** - Pipeline health, stages, bottlenecks
- **conversion_analysis** - Conversion rates, funnel analysis
- **performance_analysis** - Revenue, deals, efficiency
- **forecasting** - Revenue predictions, trends
- **recommendations** - Strategic action plans
- **agent_performance** - Individual agent analysis
- **product_analysis** - Product portfolio analysis

## Step 1: Import Packages

In [1]:
import os
import json
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from typing import Dict, List, Optional, Tuple
from datetime import datetime, timedelta

from dotenv import load_dotenv

import google.generativeai as genai



## Step 2: Setup for Gemini and API Key

In [2]:
# Load environment variables .env
load_dotenv()

# Get the Gemini API key from .env
api_key = os.getenv('GEMINI_API_KEY')
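
`os.getenv` returns `None` when the variable is missing, so a missing key would otherwise only surface at the first API call. A minimal sketch of an early check, assuming a hypothetical helper named `require_env` that is not part of the original notebook:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail early.

    `require_env` is a hypothetical helper for illustration; it is not
    part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        # os.getenv returns None for a missing variable; an empty string
        # is treated as missing too.
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or environment."
        )
    return value


# Possible usage, after load_dotenv() has run:
# api_key = require_env("GEMINI_API_KEY")
```

Failing at this point keeps the error message close to its cause instead of deep inside the Gemini client.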

In [3]:
# Configure Gemini with API key
genai.configure(api_key=api_key)

# Debug: Print available models
print("Available models:", [m.name for m in genai.list_models()])

# Create the model
model = genai.GenerativeModel('models/gemini-pro-latest')


Available models: ['models/embedding-gecko-001', 'models/gemini-2.5-pro-preview-03-25', 'models/gemini-2.5-flash', 'models/gemini-2.5-pro-preview-05-06', 'models/gemini-2.5-pro-preview-06-05', 'models/gemini-2.5-pro', 'models/gemini-2.0-flash-exp', 'models/gemini-2.0-flash', 'models/gemini-2.0-flash-001', 'models/gemini-2.0-flash-exp-image-generation', 'models/gemini-2.0-flash-lite-001', 'models/gemini-2.0-flash-lite', 'models/gemini-2.0-flash-lite-preview-02-05', 'models/gemini-2.0-flash-lite-preview', 'models/gemini-2.0-pro-exp', 'models/gemini-2.0-pro-exp-02-05', 'models/gemini-exp-1206', 'models/gemini-2.0-flash-thinking-exp-01-21', 'models/gemini-2.0-flash-thinking-exp', 'models/gemini-2.0-flash-thinking-exp-1219', 'models/gemini-2.5-flash-preview-tts', 'models/gemini-2.5-pro-preview-tts', 'models/learnlm-2.0-flash-experimental', 'models/gemma-3-1b-it', 'models/gemma-3-4b-it', 'models/gemma-3-12b-it', 'models/gemma-3-27b-it', 'models/gemma-3n-e4b-it', 'models/gemma-3n-e2b-it', 'mo

## Step 3: Load Data

In [4]:
# Load clean data
base_path = "data_directory/clean_data"

# Read CSV files
accounts = pd.read_csv(os.path.join(base_path, "Accounts.csv"))
pipeline = pd.read_csv(os.path.join(base_path, "Pipeline.csv"))
teams = pd.read_csv(os.path.join(base_path, "Teams.csv"))
products = pd.read_csv(os.path.join(base_path, "Products.csv"))

In [5]:
won_deals = pipeline[pipeline['deal_stage'].str.lower() == 'won']
engaging_deals = pipeline[pipeline['deal_stage'].str.lower() == 'engaging']

print(f"Total Revenue (Won): ${won_deals['close_value'].sum():,.2f}")
print(f"Pipeline Value (Engaging): ${engaging_deals['close_value'].sum():,.2f}")
print(f"Deal Stages: {pipeline['deal_stage'].unique().tolist()}")
print(f"Products: {pipeline['product'].nunique()}")
print(f"Sales Agents: {pipeline['sales_agent'].nunique()}")


Total Revenue (Won): $10,005,534.00
Pipeline Value (Engaging): $750,008.00
Deal Stages: ['won', 'engaging', 'lost', 'prospecting']
Products: 7
Sales Agents: 30
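
The summary above reads `deal_stage` and `close_value` directly, so a renamed or missing column would raise a `KeyError`. A small sketch of a pre-flight check, assuming a hypothetical helper `missing_columns` and a made-up sample frame (neither is in the original notebook):

```python
import pandas as pd

# Columns that the revenue and stage summaries read without checking.
REQUIRED_COLUMNS = ["deal_stage", "close_value"]


def missing_columns(df: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Tiny stand-in for Pipeline.csv (values are made up for illustration)
sample = pd.DataFrame({"deal_stage": ["won", "lost"], "product": ["a", "b"]})
print(missing_columns(sample))
```

An empty list means the frame has everything the summary needs.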


## Step 4: InsightGenerationAgent Class

This is the main class that analyzes CRM data and generates business insights.

**What it does:**
- Calculates metrics automatically from data
- Builds sophisticated prompts for each insight type
- Generates actionable recommendations with priorities
- Supports filtering and comparative analysis

**Main Methods:**
- `generate_insight()` - Generate specific insight types
- `_calculate_metrics()` - Compute business metrics from data
- `_build_insight_prompt()` - Construct AI prompts with context
- `generate_comparative_insight()` - Compare performance across dimensions

In [6]:
class InsightGenerationAgent:
    """
    Generates intelligent business insights from CRM data using advanced prompt engineering.
    Produces actionable recommendations for sales optimization.
    """
    
    def __init__(self, model_name="gemini-2.5-flash"):
        """
        Initialize the Insight Generation Agent.
        
        Args:
            model_name (str): The Gemini model to use for generation
        """
        self.model = genai.GenerativeModel(model_name)
        
        # Define supported insight types
        # Each type has specific analysis focus and prompt instructions
        self.insight_types = {
            "pipeline_analysis": "Pipeline Health & Stage Analysis",
            "conversion_analysis": "Conversion Rate & Funnel Analysis",
            "performance_analysis": "Sales Performance & Deal Metrics",
            "forecasting": "Revenue Forecasting & Predictions",
            "recommendations": "Strategic Recommendations",
            "agent_performance": "Sales Agent Performance Analysis",
            "product_analysis": "Product Performance Analysis"
        }
        


In [7]:
# Test initialization
agent = InsightGenerationAgent()

print("\nAvailable Insight Types:")
for key, value in agent.insight_types.items():
    print(f"  • {key}: {value}")


Available Insight Types:
  • pipeline_analysis: Pipeline Health & Stage Analysis
  • conversion_analysis: Conversion Rate & Funnel Analysis
  • performance_analysis: Sales Performance & Deal Metrics
  • forecasting: Revenue Forecasting & Predictions
  • recommendations: Strategic Recommendations
  • agent_performance: Sales Agent Performance Analysis
  • product_analysis: Product Performance Analysis


### InsightGenerationAgent Class - Main generate_insight method
The cell below defines `generate_insight` as a standalone function and then attaches it to the class.

In [8]:
def generate_insight(
    self,
    insight_type: str,
    pipeline_data: pd.DataFrame,
    accounts_data: Optional[pd.DataFrame] = None,
    teams_data: Optional[pd.DataFrame] = None,
    products_data: Optional[pd.DataFrame] = None,
    time_period: Optional[str] = None,
    filters: Optional[Dict] = None,
    focus_area: Optional[str] = None
) -> str:
    """
    Generate a specific type of business insight from CRM data.
    
    This is the main method that orchestrates the insight generation process:
    1. Validates the insight type
    2. Calculates metrics from the data
    3. Builds a sophisticated prompt
    4. Generates the insight using AI
    
    Args:
        insight_type (str): Type of insight ('pipeline_analysis', 'conversion_analysis', etc.)
        pipeline_data (pd.DataFrame): Main opportunities/pipeline data
        accounts_data (pd.DataFrame, optional): Account information
        teams_data (pd.DataFrame, optional): Sales team information
        products_data (pd.DataFrame, optional): Product catalog
        time_period (str, optional): Time frame for analysis (e.g., 'Q4 2024')
        filters (dict, optional): Filters to apply (e.g., {'product': 'GTXPro'})
        focus_area (str, optional): Specific area to focus on
        
    Returns:
        str: Generated insight report with analysis and recommendations
        
    Raises:
        ValueError: If insight_type is not supported
    """
    
    # Validate that the requested insight type is supported
    if insight_type not in self.insight_types:
        raise ValueError(
            f"Invalid insight type: '{insight_type}'. "
            f"Choose from: {list(self.insight_types.keys())}"
        )
    
    print(f"Generating {self.insight_types[insight_type]}...")
    
    # Calculate all relevant metrics from the data
    metrics = self._calculate_metrics(
        pipeline_data, 
        accounts_data, 
        teams_data, 
        products_data,
        filters
    )
    
    print(f"Calculated {len(metrics)} metric categories")
    
    # Build the prompt with type-specific instructions and context
    prompt = self._build_insight_prompt(
        insight_type,
        metrics,
        time_period,
        focus_area
    )
    
    print(f"Built prompt with {len(prompt)} characters")
    
    # Generate the insight using Gemini
    print("Generating insight with AI...")
    response = self.model.generate_content(prompt)
    
    print("Insight generated successfully!\n")
    
    return response.text

# Add method to class
InsightGenerationAgent.generate_insight = generate_insight
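
`generate_insight` only needs two things from its model: a `generate_content(prompt)` method and a `.text` attribute on the result. That makes it possible to exercise the flow without an API key or network access. A minimal sketch, assuming a hypothetical `StubModel` class that is not part of the notebook or of the Gemini SDK:

```python
from types import SimpleNamespace


class StubModel:
    """Hypothetical stand-in for genai.GenerativeModel (illustration only).

    It mimics what generate_insight relies on: a generate_content(prompt)
    method whose return value exposes a .text attribute.
    """

    def __init__(self):
        self.prompts = []  # record prompts so they can be inspected later

    def generate_content(self, prompt: str):
        self.prompts.append(prompt)
        return SimpleNamespace(
            text=f"[stub report for a {len(prompt)}-character prompt]"
        )


stub = StubModel()
reply = stub.generate_content("Analyze the pipeline.")
print(reply.text)

# In the notebook this could be wired in with:
# agent.model = StubModel()
```

Recording the prompts also gives a cheap way to review exactly what would have been sent to Gemini.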

### InsightGenerationAgent Class - Metrics calculation method

In [9]:
# Add the metrics calculation method to the class

def _calculate_metrics(
    self,
    pipeline_data: pd.DataFrame,
    accounts_data: Optional[pd.DataFrame],
    teams_data: Optional[pd.DataFrame],
    products_data: Optional[pd.DataFrame],
    filters: Optional[Dict]
) -> Dict:
    """
    Calculate comprehensive business metrics from CRM data.
    
    This method extracts and computes all the numbers needed for insights:
    - Summary metrics (total opps, conversion rates, revenue)
    - Stage distribution (how many deals in each stage)
    - Product performance (revenue, win rate by product)
    - Agent performance (revenue, conversion by agent)
    - Time metrics (days to close, sales cycle)
    
    Args:
        pipeline_data: Main opportunities dataset
        accounts_data: Account information (optional)
        teams_data: Sales team data (optional)
        products_data: Product catalog (optional)
        filters: Dict of filters to apply (e.g., {'product': 'GTXPro'})
        
    Returns:
        Dict containing all calculated metrics organized by category
    """
    
    # Start with a copy of the data to avoid modifying original
    df = pipeline_data.copy()
    
    # Apply filters if provided
    # Example: filter to just one product or one sales agent
    if filters:
        for key, value in filters.items():
            if key in df.columns:
                df = df[df[key] == value]
                print(f"  Applied filter: {key} = {value}")
    
    # BASIC COUNTS
    total_opps = len(df)
    
    # STAGE DISTRIBUTION
    # Count how many deals are in each stage
    stage_dist = df['deal_stage'].value_counts().to_dict()
    
    # WIN/LOSS ANALYSIS
    # Filter to won, lost, and engaging deals (case-insensitive)
    won_deals = df[df['deal_stage'].str.lower() == 'won']
    lost_deals = df[df['deal_stage'].str.lower() == 'lost']
    engaging_deals = df[df['deal_stage'].str.lower() == 'engaging']
    
    won_count = len(won_deals)
    lost_count = len(lost_deals)
    engaging_count = len(engaging_deals)
    
    # Calculate conversion rate: wins / total closed deals
    closed_deals = won_count + lost_count
    conversion_rate = (won_count / closed_deals * 100) if closed_deals > 0 else 0

    # REVENUE METRICS
    total_revenue = won_deals['close_value'].sum()
    avg_deal_size = won_deals['close_value'].mean() if won_count > 0 else 0
    median_deal_size = won_deals['close_value'].median() if won_count > 0 else 0
    
    # Pipeline value = sum of all engaging deals
    pipeline_value = engaging_deals['close_value'].sum()
    
    # PRODUCT PERFORMANCE
    product_performance = {}
    if 'product' in df.columns:
        for product in df['product'].unique():
            # Get all deals for this product
            prod_df = df[df['product'] == product]
            prod_won = prod_df[prod_df['deal_stage'].str.lower() == 'won']
            
            product_performance[str(product)] = {
                'total_opps': int(len(prod_df)),
                'won': int(len(prod_won)),
                'revenue': float(prod_won['close_value'].sum()),
                'avg_deal_size': float(prod_won['close_value'].mean()) if len(prod_won) > 0 else 0
            }
    

    # SALES AGENT PERFORMANCE
    agent_performance = {}
    if 'sales_agent' in df.columns:
        for agent in df['sales_agent'].unique():
            # Get all deals for this agent
            agent_df = df[df['sales_agent'] == agent]
            agent_won = agent_df[agent_df['deal_stage'].str.lower() == 'won']
            agent_closed = len(agent_df[agent_df['deal_stage'].str.lower().isin(['won', 'lost'])])
            
            agent_performance[str(agent)] = {
                'total_opps': int(len(agent_df)),
                'won': int(len(agent_won)),
                'revenue': float(agent_won['close_value'].sum()),
                'conversion_rate': float((len(agent_won) / agent_closed * 100) if agent_closed > 0 else 0)
            }
    

    # TIME-BASED METRICS
    time_metrics = {}
    if 'engage_date' in df.columns and 'close_date' in df.columns:
        # Filter to rows with both dates; copy so the column assignments
        # below do not trigger pandas' SettingWithCopyWarning
        df_with_dates = df.dropna(subset=['engage_date', 'close_date']).copy()
        
        if not df_with_dates.empty:
            # Convert to datetime
            df_with_dates['engage_date'] = pd.to_datetime(df_with_dates['engage_date'])
            df_with_dates['close_date'] = pd.to_datetime(df_with_dates['close_date'])
            
            # Calculate days between engagement and close
            df_with_dates['days_to_close'] = (
                df_with_dates['close_date'] - df_with_dates['engage_date']
            ).dt.days
            
            time_metrics = {
                'avg_days_to_close': float(df_with_dates['days_to_close'].mean()),
                'median_days_to_close': float(df_with_dates['days_to_close'].median())
            }
    

    # COMPILE ALL METRICS
    metrics = {
        'summary': {
            'total_opportunities': int(total_opps),
            'won_deals': int(won_count),
            'lost_deals': int(lost_count),
            'engaging_deals': int(engaging_count),
            'conversion_rate': float(conversion_rate),
            'total_revenue': float(total_revenue),
            'avg_deal_size': float(avg_deal_size),
            'median_deal_size': float(median_deal_size),
            'pipeline_value': float(pipeline_value)
        },
        'stage_distribution': stage_dist,
        'product_performance': product_performance,
        'agent_performance': agent_performance,
        'time_metrics': time_metrics
    }
    
    return metrics

# Add method to class
InsightGenerationAgent._calculate_metrics = _calculate_metrics
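
To make the conversion-rate definition concrete, here is a worked example on a made-up six-deal frame. The column names match the notebook's pipeline data; the values are invented for illustration:

```python
import pandas as pd

# Made-up deals; open deals have no close_value yet.
toy = pd.DataFrame({
    "deal_stage": ["won", "won", "won", "lost", "engaging", "prospecting"],
    "close_value": [1000.0, 3000.0, 5000.0, 0.0, None, None],
})

stage = toy["deal_stage"].str.lower()
won = toy[stage == "won"]
lost = toy[stage == "lost"]

# Conversion rate counts only closed deals (won + lost); open deals in
# 'engaging' or 'prospecting' are excluded from the denominator.
closed = len(won) + len(lost)
conversion_rate = (len(won) / closed * 100) if closed > 0 else 0

total_revenue = won["close_value"].sum()
avg_deal_size = won["close_value"].mean()

print(f"Conversion rate: {conversion_rate:.1f}%")
print(f"Total revenue: ${total_revenue:,.2f}")
print(f"Average deal size: ${avg_deal_size:,.2f}")
```

With three wins and one loss the rate is 3 / 4 = 75%, even though only half of the six opportunities were won. This is why the reported rate can look high when many deals are still open.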

### InsightGenerationAgent Class - Prompt Building Methods

In [10]:
# Add prompt building methods to the class

def _build_insight_prompt(
    self,
    insight_type: str,
    metrics: Dict,
    time_period: Optional[str],
    focus_area: Optional[str]
) -> str:
    """
    Build a sophisticated prompt for generating insights.
    
    This creates a comprehensive prompt that includes:
    - Expert role definition
    - Type-specific instructions
    - All calculated metrics
    - Time period context
    - Focus area guidance
    - Structured output format
    
    The prompt engineering here is critical - it determines the quality
    of the insights generated by the AI.
    """
    
    current_date = datetime.now().strftime("%B %d, %Y")
    
    # Get insight-type-specific instructions
    # These tell the AI how to analyze this particular type of insight
    type_instructions = self._get_insight_type_instructions(insight_type)
    
    # Convert metrics to JSON for the prompt
    # The AI will analyze these specific numbers
    metrics_json = json.dumps(metrics, indent=2, default=str)
    
    # Build time period context if provided
    time_context = ""
    if time_period:
        time_context = f"\n**Analysis Period:** {time_period}\n"
    
    # Build focus area context if provided
    focus_context = ""
    if focus_area:
        focus_context = f"\n**Specific Focus:** {focus_area}\n"
    
    # Construct the complete prompt
    prompt = f"""
You are an expert business intelligence analyst and sales strategist with 20+ years of experience in CRM analytics, revenue optimization, and sales performance management.

**YOUR TASK:**
Generate a comprehensive {self.insight_types[insight_type]} report based on the provided CRM metrics.

**INSIGHT TYPE INSTRUCTIONS:**
{type_instructions}

**DATA METRICS:**
```json
{metrics_json}
```
{time_context}{focus_context}

**CRITICAL REQUIREMENTS:**

1. **Data-Driven Analysis:**
   - Base ALL conclusions on the provided metrics
   - Use specific numbers, percentages, and dollar amounts
   - Compare metrics against industry benchmarks where relevant
   - Identify trends, patterns, and anomalies

2. **Structure Your Report:**
   - **Executive Summary** (3-4 sentences with key findings)
   - **Key Metrics** (bullet points with actual numbers)
   - **Detailed Analysis** (2-3 paragraphs interpreting the data)
   - **Critical Insights** (3-5 key findings from the data)
   - **Strategic Recommendations** (prioritized by impact: High/Medium/Long-term)
   - **Risk Factors** (potential concerns or red flags)
   - **Success Opportunities** (areas of high potential)

3. **Make it Actionable:**
   - Provide specific, concrete recommendations
   - Prioritize actions by impact and urgency
   - Include owner assignments (e.g., "Sales Manager", "Reps")
   - Suggest timelines for implementation
   - Identify quick wins vs. strategic initiatives

4. **Business Context:**
   - Frame insights in business terms (revenue, growth, efficiency)
   - Address both short-term tactics and long-term strategy
   - Highlight ROI and business impact
   - Consider competitive positioning

5. **Quality Standards:**
   - Be specific and quantitative, not vague
   - Use professional business language
   - Support claims with data
   - Acknowledge data limitations where relevant

**IMPORTANT:**
- All numbers must come from the provided metrics
- Use actual dollar amounts (e.g., "$125,000" not "significant revenue")
- Current date is {current_date}
- If data is insufficient for certain analysis, note it explicitly

Generate the insight report now:
"""
    
    return prompt
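
The `default=str` argument passed to `json.dumps` above matters whenever a metric value is not a plain Python type. A short standard-library sketch with made-up values shows the difference:

```python
import json
from datetime import date
from decimal import Decimal

# Made-up metrics containing values the json module cannot encode natively.
metrics = {
    "summary": {"total_revenue": Decimal("1250.50")},
    "as_of": date(2017, 12, 31),
}

try:
    json.dumps(metrics)
except TypeError as exc:
    print(f"Without default=str: {exc}")

# default=str is called only for objects json cannot encode itself,
# so regular ints, floats, and strings are unaffected.
metrics_json = json.dumps(metrics, indent=2, default=str)
print(metrics_json)
```

Values converted this way reach the prompt as quoted strings, which is usually acceptable for text sent to a language model but would need care if the JSON were parsed back into numbers.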


In [11]:
def _get_insight_type_instructions(self, insight_type: str) -> str:
    """
    Get specific analysis instructions for each insight type.
    
    Each insight type has unique focus areas and analysis requirements.
    These instructions guide the AI on what to emphasize.
    """
    
    instructions = {
        "pipeline_analysis": """
Analyze the health and composition of the sales pipeline:
- Evaluate the distribution of opportunities across deal stages
- Assess pipeline value and coverage (3x coverage is healthy benchmark)
- Identify bottlenecks where deals are getting stuck
- Analyze pipeline velocity and flow
- Highlight deals at risk of stalling
- Recommend actions to improve pipeline health and flow
""",
        
        "conversion_analysis": """
Analyze conversion rates throughout the sales funnel:
- Calculate and evaluate overall conversion rate (won / total closed)
- Identify which stages have highest drop-off rates
- Compare conversion rates across products, agents, or segments
- Benchmark against industry standards (B2B SaaS: 20-30% is typical)
- Identify factors contributing to wins vs. losses
- Recommend specific strategies to improve conversion
""",
        
        "performance_analysis": """
Analyze sales performance and deal metrics:
- Evaluate revenue achievement and deal sizes
- Assess sales efficiency and productivity
- Compare performance across products, agents, or accounts
- Identify top performers and what makes them successful
- Highlight underperforming areas needing attention
- Calculate key ratios (win rate, avg deal size, sales cycle)
- Recommend performance improvement strategies
""",
        
        "forecasting": """
Generate revenue forecasts and predictive insights:
- Project future revenue based on current pipeline
- Assess pipeline coverage for upcoming periods
- Identify trends in deal velocity and close rates
- Flag risks to forecast accuracy
- Consider seasonality and historical patterns
- Provide confidence intervals on forecasts
- Recommend actions to achieve or exceed forecast
""",
        
        "recommendations": """
Generate strategic recommendations for sales optimization:
- Identify the biggest opportunities for improvement
- Prioritize recommendations by potential impact
- Balance quick wins with long-term strategic initiatives
- Consider resource constraints and feasibility
- Address both process improvements and skill development
- Include specific KPIs to track success
- Provide implementation roadmap with timelines
""",
        
        "agent_performance": """
Analyze sales agent performance:
- Rank agents by key metrics (revenue, conversion, deal count)
- Identify top performers and their success patterns
- Highlight agents needing coaching or support
- Compare individual performance to team averages
- Assess workload distribution and capacity
- Identify skill gaps and training opportunities
- Recommend agent-specific development plans
""",
        
        "product_analysis": """
Analyze product performance:
- Compare revenue and deal counts across products
- Evaluate product win rates and average deal sizes
- Identify most/least profitable products
- Assess product-market fit signals
- Highlight cross-sell and upsell opportunities
- Recommend product portfolio optimization
- Suggest pricing or positioning adjustments
"""
    }
    
    return instructions.get(insight_type, "")


In [12]:
# Add methods to class
InsightGenerationAgent._build_insight_prompt = _build_insight_prompt
InsightGenerationAgent._get_insight_type_instructions = _get_insight_type_instructions

### InsightGenerationAgent Class - Comparative Analysis Method

In [13]:
# Add comparative analysis method to the class

def generate_comparative_insight(
    self,
    pipeline_data: pd.DataFrame,
    comparison_field: str,
    insight_focus: str = "performance"
) -> str:
    """
    Generate comparative insights across a dimension (products, agents, etc.).
    
    This method is useful for questions like:
    - "Which products perform best?"
    - "How do sales agents compare?"
    - "What sectors have highest conversion?"
    
    Args:
        pipeline_data: Pipeline dataset
        comparison_field: Field to compare across (e.g., 'product', 'sales_agent')
        insight_focus: What to focus on ('performance', 'conversion', 'revenue')
        
    Returns:
        str: Comparative analysis report with rankings and recommendations
    """
    
    # Validate that the comparison field exists
    if comparison_field not in pipeline_data.columns:
        raise ValueError(
            f"Field '{comparison_field}' not found in pipeline data. "
            f"Available fields: {pipeline_data.columns.tolist()}"
        )
    
    print(f"Generating comparative analysis for: {comparison_field}")
    print(f"Focus: {insight_focus}\n")
    
    # Calculate metrics for each value in the comparison field
    comparison_data = {}
    
    for value in pipeline_data[comparison_field].unique():
        # Filter to this specific value
        value_df = pipeline_data[pipeline_data[comparison_field] == value]
        
        # Calculate win/loss metrics
        won = value_df[value_df['deal_stage'].str.lower() == 'won']
        lost = value_df[value_df['deal_stage'].str.lower() == 'lost']
        closed = len(won) + len(lost)
        
        # Store metrics for this value
        comparison_data[str(value)] = {
            'total_opps': int(len(value_df)),
            'won': int(len(won)),
            'lost': int(len(lost)),
            'conversion_rate': float((len(won) / closed * 100) if closed > 0 else 0),
            'revenue': float(won['close_value'].sum()),
            'avg_deal_size': float(won['close_value'].mean()) if len(won) > 0 else 0
        }
    
    print(f"Calculated metrics for {len(comparison_data)} {comparison_field} values")
    
    # Create prompt for comparative analysis
    prompt = f"""
You are a business intelligence analyst performing comparative analysis.

**COMPARISON TASK:**
Compare performance across different {comparison_field} values.

**COMPARISON DATA:**
```json
{json.dumps(comparison_data, indent=2)}
```

**Analysis Focus:** {insight_focus}

**YOUR TASK:**
Generate a comprehensive comparative analysis that:

1. **Rankings:** Rank all {comparison_field} values by {insight_focus}
   - Show top 3 and bottom 3 performers
   - Include specific metrics for each

2. **Performance Gaps:** Identify and explain differences
   - What separates high performers from low performers?
   - Are there clear patterns or success factors?

3. **Success Patterns:** What do top performers have in common?
   - Similar strategies, approaches, or characteristics?
   - Lessons that can be replicated?

4. **Improvement Opportunities:** Specific recommendations for low performers
   - Actionable steps to improve
   - Resources or support needed

5. **Strategic Insights:** Overall patterns and trends
   - Portfolio balance
   - Resource allocation recommendations

**REQUIREMENTS:**
- Use specific numbers and percentages from the data
- Be objective and data-driven
- Provide actionable recommendations
- Highlight both successes and improvement areas

Generate the comparative analysis now:
"""
    
    print("Generating comparative insight with AI...\n")
    
    response = self.model.generate_content(prompt)
    return response.text

# Add method to class
InsightGenerationAgent.generate_comparative_insight = generate_comparative_insight
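
The per-value loop in `generate_comparative_insight` filters the frame once for every distinct value. The same metrics can be computed in one pass with `groupby`. This is a sketch of an alternative, not the notebook's method, and `compare_by` is a hypothetical helper name:

```python
import pandas as pd


def compare_by(df: pd.DataFrame, field: str) -> pd.DataFrame:
    """Per-group win/loss metrics, computed with groupby instead of a loop.

    `compare_by` is a hypothetical helper, not part of the notebook.
    """
    stage = df["deal_stage"].str.lower()
    work = df.assign(
        is_won=stage.eq("won"),
        is_lost=stage.eq("lost"),
        # Keep close_value only for won deals so sums and means ignore the rest
        won_value=df["close_value"].where(stage.eq("won")),
    )
    out = work.groupby(field).agg(
        total_opps=("deal_stage", "size"),
        won=("is_won", "sum"),
        lost=("is_lost", "sum"),
        revenue=("won_value", "sum"),
        avg_deal_size=("won_value", "mean"),
    )
    # Groups without any won deal have no mean; report 0 like the loop does
    out["avg_deal_size"] = out["avg_deal_size"].fillna(0.0)
    closed = out["won"] + out["lost"]
    # Avoid division by zero for groups with no closed deals
    out["conversion_rate"] = (out["won"] / closed.where(closed > 0) * 100).fillna(0.0)
    return out


# Made-up deals for illustration
toy = pd.DataFrame({
    "product": ["a", "a", "a", "b"],
    "deal_stage": ["Won", "won", "lost", "engaging"],
    "close_value": [100.0, 300.0, 0.0, None],
})
result = compare_by(toy, "product")
print(result)
```

Calling `result.to_dict(orient="index")` would give a nested dictionary shaped like `comparison_data`, although the values would still need converting to plain `int` and `float` before `json.dumps`.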

## Step 5: Testing

### Test 1 - Pipeline Analysis

In [14]:
print("TEST 1: PIPELINE ANALYSIS")

print("This test analyzes the overall health of your sales pipeline,")
print("including stage distribution, bottlenecks, and flow.\n")
print("-"*80 + "\n")

# Generate pipeline analysis
insight = agent.generate_insight(
    insight_type="pipeline_analysis",
    pipeline_data=pipeline,
    accounts_data=accounts,
    time_period="2016-2017 Data",
    focus_area="Identifying bottlenecks and improving pipeline flow"
)

print(insight)

TEST 1: PIPELINE ANALYSIS
This test analyzes the overall health of your sales pipeline,
including stage distribution, bottlenecks, and flow.

--------------------------------------------------------------------------------

Generating Pipeline Health & Stage Analysis...
Calculated 5 metric categories
Built prompt with 8419 characters
Generating insight with AI...
Insight generated successfully!

## Pipeline Health & Stage Analysis Report (2016-2017 Data & Current Snapshot)

**Analysis Period:** 2016-2017 Data (Historical)
**Current Snapshot Date:** November 18, 2025

### Executive Summary

The sales pipeline currently exhibits critical health issues, most notably a severe lack of coverage for future revenue targets and an alarming decline in the average deal size within active opportunities. While historical conversion rates (63.15%) are strong and certain products and agents demonstrate high performance, the current pipeline value of $750,008 provides only a fraction of the necessary 

### Test 2 - Conversion Analysis

In [15]:
print("TEST 2: CONVERSION RATE ANALYSIS")

print("This test analyzes conversion rates throughout your sales funnel,")
print("identifies drop-off points, and suggests improvements.\n")
print("-"*80 + "\n")

# Generate conversion analysis
insight = agent.generate_insight(
    insight_type="conversion_analysis",
    pipeline_data=pipeline,
    accounts_data=accounts,
    focus_area="Improving win rates and reducing losses"
)

print(insight)

TEST 2: CONVERSION RATE ANALYSIS
This test analyzes conversion rates throughout your sales funnel,
identifies drop-off points, and suggests improvements.

--------------------------------------------------------------------------------

Generating Conversion Rate & Funnel Analysis...
Calculated 5 metric categories
Built prompt with 8407 characters
Generating insight with AI...
Insight generated successfully!

## Conversion Rate & Funnel Analysis Report (November 18, 2025)

**Prepared for:** Sales Leadership
**From:** [Your Name/Title - Expert BI Analyst & Sales Strategist]

---

### Executive Summary

This report provides a comprehensive analysis of the current sales conversion rates and funnel performance. We observe an exceptionally high overall conversion rate of 63.15%, significantly exceeding typical B2B SaaS benchmarks, which merits further investigation into our sales model or definition of "closed-won." While pipeline value is healthy at $750,008, a substantial 28.10% of all op

### Test 3 - Performance Analysis

In [16]:
print("TEST 3: SALES PERFORMANCE ANALYSIS")

print("This test evaluates overall sales performance including")
print("revenue metrics, deal sizes, and efficiency indicators.\n")
print("-"*80 + "\n")

# Generate performance analysis
insight = agent.generate_insight(
    insight_type="performance_analysis",
    pipeline_data=pipeline,
    accounts_data=accounts,
    teams_data=teams,
    focus_area="Identifying top performers and growth opportunities"
)

print(insight)

TEST 3: SALES PERFORMANCE ANALYSIS
This test evaluates overall sales performance including
revenue metrics, deal sizes, and efficiency indicators.

--------------------------------------------------------------------------------

Generating Sales Performance & Deal Metrics...
Calculated 5 metric categories
Built prompt with 8414 characters
Generating insight with AI...
Insight generated successfully!

## Sales Performance & Deal Metrics Report - November 2025

**Prepared For:** Sales Leadership
**Date:** November 18, 2025
**Analyst:** Expert Business Intelligence & Sales Strategist

### Executive Summary

Overall sales performance demonstrates solid revenue achievement of over $10 million and a healthy win rate of 63.15% for closed deals, with a remarkably fast median sales cycle of 15 days. However, a significant disparity exists between average and median deal sizes, pointing to potential segmentation issues or an over-reliance on a few large deals. Critical areas for immediate atten

### Test 4 - Product Comparison

In [17]:
print("TEST 4: PRODUCT PERFORMANCE COMPARISON")

print("This test compares performance across all products in your portfolio")
print("to identify winners and areas for improvement.\n")
print("-"*80 + "\n")

# Get product list for context
print(f"Comparing {pipeline['product'].nunique()} products:")
print(f"  {', '.join(pipeline['product'].unique()[:5])}...")
print()

# Generate product comparison
comparison = agent.generate_comparative_insight(
    pipeline_data=pipeline,
    comparison_field="product",
    insight_focus="revenue and conversion rate"
)

print(comparison)

TEST 4: PRODUCT PERFORMANCE COMPARISON
This test compares performance across all products in your portfolio
to identify winners and areas for improvement.

--------------------------------------------------------------------------------

Comparing 7 products:
  gtx_plus_basic, gtxpro, mg_special, gtx_basic, mg_advanced...

Generating comparative analysis for: product
Focus: revenue and conversion rate

Calculated metrics for 7 product values
Generating comparative insight with AI...

## Comparative Product Performance Analysis

This analysis compares the performance of different product values, focusing on revenue generation and conversion rates to identify key trends, strengths, and areas for improvement.

### 1. Rankings by Revenue and Conversion Rate

#### Revenue Ranking (Highest to Lowest)

**Top 3 Performers:**
1.  **GTXPro:** $3,510,578
2.  **GTX Plus Pro:** $2,629,651
3.  **MG Advanced:** $2,216,387

**Bottom 3 Performers:**
5.  **GTX Basic:** $499,263
6.  **GTK 500:** $400,612

### Test 5 - Sales Agent Comparison

In [18]:
print("TEST 5: SALES AGENT PERFORMANCE COMPARISON")

print("This test compares individual sales agent performance")
print("to identify top performers and coaching opportunities.\n")
print("-"*80 + "\n")

# Get agent count for context
print(f"Comparing {pipeline['sales_agent'].nunique()} sales agents")
print(f"Total deals analyzed: {len(pipeline)}\n")

# Generate agent comparison
comparison = agent.generate_comparative_insight(
    pipeline_data=pipeline,
    comparison_field="sales_agent",
    insight_focus="overall performance and conversion"
)

print(comparison)

TEST 5: SALES AGENT PERFORMANCE COMPARISON
This test compares individual sales agent performance
to identify top performers and coaching opportunities.

--------------------------------------------------------------------------------

Comparing 30 sales agents
Total deals analyzed: 8800

Generating comparative analysis for: sales_agent
Focus: overall performance and conversion

Calculated metrics for 30 sales_agent values
Generating comparative insight with AI...

## Comparative Sales Agent Performance Analysis

This analysis compares the performance of sales agents based on total opportunities, won/lost deals, conversion rate, total revenue, and average deal size, focusing on overall performance and conversion efficiency.

### 1. Rankings

#### Overall Performance (Ranked by Revenue):

##### Top 3 Performers:

*   **Darcel Schlecht**:
    *   Total Opps: 747
    *   Won Deals: 349
    *   Lost Deals: 204
    *   Conversion Rate: 63.11%
    *   Revenue: $1,153,214
    *   Avg Deal Size
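
The figures reported for Darcel Schlecht are consistent with the conversion rate being computed over decided deals only (won plus lost), not over all opportunities. This is inferred from the numbers in the output, not from the agent's code. A quick check, using a hypothetical helper `conversion_rate`:

```python
def conversion_rate(won: int, lost: int) -> float:
    """Win rate over decided deals (won + lost), as a percentage."""
    decided = won + lost
    if decided == 0:
        return 0.0  # no decided deals yet; avoid dividing by zero
    return round(won / decided * 100, 2)


# Counts copied from the report above
total_opps, won, lost = 747, 349, 204

rate = conversion_rate(won, lost)
open_deals = total_opps - won - lost

print(rate)        # 63.11
print(open_deals)  # 194
```

Dividing by all 747 opportunities instead would give roughly 46.7%, so the 194 deals that are still open appear to be excluded from the denominator.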

### Test 6 - Strategic Recommendations

In [19]:
print("TEST 6: STRATEGIC RECOMMENDATIONS")

print("This test generates prioritized strategic recommendations")
print("for sales optimization and revenue growth.\n")
print("-"*80 + "\n")

# Generate strategic recommendations
insight = agent.generate_insight(
    insight_type="recommendations",
    pipeline_data=pipeline,
    accounts_data=accounts,
    products_data=products,
    teams_data=teams,
    focus_area="Revenue growth and sales process optimization"
)

print(insight)
print("\n" + "="*80 + "\n")

TEST 6: STRATEGIC RECOMMENDATIONS
This test generates prioritized strategic recommendations
for sales optimization and revenue growth.

--------------------------------------------------------------------------------

Generating Strategic Recommendations...
Calculated 5 metric categories
Built prompt with 8405 characters
Generating insight with AI...
Insight generated successfully!

## Strategic Recommendations Report: Sales Optimization - November 18, 2025

**Prepared by:** Expert Business Intelligence Analyst & Sales Strategist

---

### Executive Summary

This report analyzes recent CRM metrics to identify critical areas for sales optimization and revenue growth. While the overall conversion rate of 63.15% is strong, significant opportunities exist to boost total revenue by focusing on high-value products, improving the performance of specific sales agents, and streamlining the sales cycle for larger deals. Prioritized recommendations include targeted training, product portfolio rev

### Test 7 - Filtered Analysis (Specific Product)

In [20]:
print("TEST 7: FILTERED ANALYSIS (SPECIFIC PRODUCT)")

# Select the product with the most opportunities to analyze
sample_product = pipeline['product'].value_counts().index[0]
print(f"Analyzing specific product: {sample_product}")
print(f"Total opportunities for this product: {len(pipeline[pipeline['product'] == sample_product])}\n")
print("-"*80 + "\n")

# Generate filtered analysis
insight = agent.generate_insight(
    insight_type="performance_analysis",
    pipeline_data=pipeline,
    filters={'product': sample_product},  # Filter to just this product
    focus_area=f"Deep dive performance analysis for {sample_product}"
)

print(insight)

TEST 7: FILTERED ANALYSIS (SPECIFIC PRODUCT)
Analyzing specific product: gtx_basic
Total opportunities for this product: 1866

--------------------------------------------------------------------------------

Generating Sales Performance & Deal Metrics...
  Applied filter: product = gtx_basic
Calculated 5 metric categories
Built prompt with 7298 characters
Generating insight with AI...
Insight generated successfully!

## Sales Performance & Deal Metrics Report: GTX_Basic

**Current Date:** November 18, 2025

### Executive Summary

This report provides a comprehensive analysis of sales performance and deal metrics for the `gtx_basic` product line. Overall, the company demonstrates a remarkably strong win rate of 63.72% for decided deals, coupled with an efficient median sales cycle of 20 days, contributing to a total revenue of $499,263.0. While agent performance shows significant variability in conversion rates, the average deal size remains consistent, indicating strong product-market
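
The `filters` argument and the log line `Applied filter: product = gtx_basic` indicate that the pipeline is narrowed to matching rows before metrics are calculated. How the agent does this is not shown here; the following is a minimal sketch of equality filtering with pandas, where the helper name `apply_filters` is hypothetical.

```python
from typing import Dict, Optional

import pandas as pd


def apply_filters(df: pd.DataFrame,
                  filters: Optional[Dict[str, object]] = None) -> pd.DataFrame:
    """Keep only rows where every filter column equals the given value."""
    if not filters:
        return df
    mask = pd.Series(True, index=df.index)
    for column, value in filters.items():
        if column not in df.columns:
            raise KeyError(f"Unknown filter column: {column}")
        mask &= df[column] == value
    return df[mask]


# Toy data, not taken from the real pipeline
toy = pd.DataFrame({
    "product": ["gtx_basic", "gtxpro", "gtx_basic"],
    "sales_agent": ["agent_a", "agent_a", "agent_b"],
})
subset = apply_filters(toy, {"product": "gtx_basic"})
print(len(subset))  # 2
```

Because the filter runs before aggregation, any generated report describes only the filtered subset, which is worth keeping in mind when reading statements about "the company" or "the team" in filtered reports.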

### Test 8 - Agent-Specific Analysis

In [21]:
print("TEST 8: AGENT-SPECIFIC PERFORMANCE ANALYSIS")

# Select the agent with the most opportunities (top by volume, not by revenue)
agent_opps = pipeline.groupby('sales_agent').size()
top_agent = agent_opps.idxmax()

print(f"Analyzing top agent: {top_agent}")
print(f"Total opportunities: {agent_opps[top_agent]}\n")
print("-"*80 + "\n")

# Generate agent-specific analysis
insight = agent.generate_insight(
    insight_type="agent_performance",
    pipeline_data=pipeline,
    teams_data=teams,
    filters={'sales_agent': top_agent},
    focus_area=f"Individual performance review for {top_agent}"
)

print(insight)

TEST 8: AGENT-SPECIFIC PERFORMANCE ANALYSIS
Analyzing top agent: darcel_schlecht
Total opportunities: 747

--------------------------------------------------------------------------------

Generating Sales Agent Performance Analysis...
  Applied filter: sales_agent = darcel_schlecht
Calculated 5 metric categories
Built prompt with 4160 characters
Generating insight with AI...
Insight generated successfully!

## Sales Agent Performance Analysis Report: Darcel Schlecht

**Reporting Period:** Data Snapshot as of November 18, 2025

**Prepared For:** Sales Leadership

**Expert Analyst:** [Your Name/Title - Expert Business Intelligence Analyst & Sales Strategist]

---

### Executive Summary

Darcel Schlecht exhibits a strong overall sales performance, contributing significantly to the company's revenue with $1,153,214 and an impressive conversion rate of 63.11%. As the sole agent represented in the provided data, her performance defines the current organizational sales baseline. While demons

## Save Agent
Save the agent configuration so that other notebooks can reuse it.

In [22]:
import pickle

In [23]:
# Save agent settings
agent_config = {
    'model_name': 'gemini-2.5-flash',
    'insight_types': agent.insight_types,
    'creation_date': datetime.now().isoformat(),
    'total_insights_generated': 8  # from our tests
}

# Save to use in other notebooks
config_filename = 'insight_agent_config.pkl'
with open(config_filename, 'wb') as f:
    pickle.dump(agent_config, f)

print("Configuration saved to:", config_filename)

Configuration saved to: insight_agent_config.pkl
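
To reuse the saved settings in another notebook, the file can be read back with `pickle.load`. The sketch below shows a round trip in a temporary directory; the helper name `load_agent_config` and the demo values are assumptions for illustration. Note that `pickle` can execute arbitrary code while loading, so only unpickle files you created yourself.

```python
import pickle
import tempfile
from pathlib import Path


def load_agent_config(path) -> dict:
    """Load a pickled config dict; only use on files from a trusted source."""
    with open(path, "rb") as f:
        config = pickle.load(f)
    if not isinstance(config, dict):
        raise TypeError(f"Expected a dict, got {type(config).__name__}")
    return config


# Round trip in a temporary directory so no real file is touched
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "insight_agent_config.pkl"
    demo_config = {
        "model_name": "gemini-2.5-flash",
        "total_insights_generated": 8,
    }
    with open(demo_path, "wb") as f:
        pickle.dump(demo_config, f)
    restored = load_agent_config(demo_path)

print(restored["model_name"])  # gemini-2.5-flash
```

If the configuration only ever holds strings, numbers, lists, and dicts, the `json` module imported in Step 1 would be a safer, human-readable alternative.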
