# Building a lead scoring agent with KumoRFM MCP and OpenAI Agents SDK

[[**KumoRFM**](https://kumorfm.ai) | [**KumoRFM MCP**](https://github.com/kumo-ai/kumo-rfm-mcp/) | [**OpenAI Agents SDK**](https://openai.github.io/openai-agents-python/)]

This notebook demonstrates how to build a **lead scoring agent** powered by KumoRFM, using the KumoRFM MCP server and the OpenAI Agents SDK, in just a few lines of code!


In [None]:
# Install required libraries
!pip install kumo-rfm-mcp
!pip install openai-agents==0.2.9
!pip install fsspec
!pip install s3fs
!pip install pandas

In [1]:
# Import required libraries
import asyncio
import os
from datetime import datetime, timedelta
from typing import List

from agents import Agent, Runner, gen_trace_id, trace
from agents.mcp import MCPServer, MCPServerStdio
import pandas as pd

print("✅ Libraries imported successfully")

✅ Libraries imported successfully


## Set API Keys
To run this example notebook you will need both a KumoRFM and an OpenAI API key. You can generate the keys by visiting [kumorfm.ai](https://kumorfm.ai/api-keys) and [platform.openai.com](https://platform.openai.com/api-keys) respectively. Make sure to update the values in the cell below! 

In [None]:
# API KEYS
os.environ['KUMO_API_KEY'] = "<YOUR_KUMO_API_KEY>"
os.environ['OPENAI_API_KEY'] = "<YOUR_OPENAI_API_KEY>"
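
Before moving on, it can help to confirm that both keys were actually replaced. The sketch below is one way to do that. `find_unset_keys` is a hypothetical helper (not part of either SDK), and it assumes an unedited value still looks like the `<YOUR_...>` placeholders in the cell above.

```python
from typing import List, Mapping

REQUIRED_KEYS = ("KUMO_API_KEY", "OPENAI_API_KEY")


def find_unset_keys(env: Mapping[str, str], required=REQUIRED_KEYS) -> List[str]:
    """Return the required variables that are missing, empty, or still placeholders."""
    unset = []
    for name in required:
        value = env.get(name, "").strip()
        # Assumption: an unedited value still looks like "<YOUR_..._API_KEY>".
        if not value or (value.startswith("<") and value.endswith(">")):
            unset.append(name)
    return unset


# Stand-in environment for illustration; pass os.environ in the notebook.
example_env = {"KUMO_API_KEY": "<YOUR_KUMO_API_KEY>", "OPENAI_API_KEY": "sk-example"}
print(find_unset_keys(example_env))
```

In the notebook, calling `find_unset_keys(os.environ)` right after the API key cell would flag a forgotten key before the MCP server or the agent is started.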

## Setting up the agent

Our agent will use past data about leads to help our sales team reach out to prospects in an optimal way. We have a dataset publicly available at `s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv`. It consists of a single table holding ~8000 past leads (both converted and unconverted).

We will use our sales agent to help prioritize work for our team. As part of our daily sync, it will take the leads submitted yesterday (one day before `MEETING_DATE`) and give us a ranked list of leads to reach out to!

In [3]:
# Configuration
DATA_SOURCE = "s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv"
MEETING_DATE = "2025-05-31"  # Demo date - in practice this would be today's date

print(f"📊 Data Source: {DATA_SOURCE}")
print(f"📅 Meeting Date: {MEETING_DATE}")
print(f"🔍 Will analyze leads from: {(datetime.strptime(MEETING_DATE, '%Y-%m-%d') - timedelta(days=1)).strftime('%Y-%m-%d')}")


📊 Data Source: s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv
📅 Meeting Date: 2025-05-31
🔍 Will analyze leads from: 2025-05-30
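
Outside the demo, `MEETING_DATE` would come from the calendar rather than a constant. Here is a minimal sketch, assuming the same `YYYY-MM-DD` string format. `meeting_date_for` and `lead_day_for` are illustrative names, not part of the notebook.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def meeting_date_for(today: Optional[date] = None) -> str:
    """Format the meeting date as YYYY-MM-DD, defaulting to the current local date."""
    return (today or date.today()).strftime("%Y-%m-%d")


def lead_day_for(meeting_date: str) -> str:
    """Return the calendar day before the meeting date, in the same format."""
    meeting_dt = datetime.strptime(meeting_date, "%Y-%m-%d")
    return (meeting_dt - timedelta(days=1)).strftime("%Y-%m-%d")


# Pin the date to the demo value so the example is reproducible.
demo_date = meeting_date_for(date(2025, 5, 31))
print(demo_date, lead_day_for(demo_date))
```

Note that this uses the calendar day before the meeting, so a Monday meeting would look at Sunday's leads. Whether weekends need special handling depends on how your team schedules its syncs.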


Here's a simple helper to find the leads which came in yesterday!

In [None]:
def get_leads_from_previous_day(meeting_date: str, data_source: str) -> List[int]:
    """
    Get lead IDs from the day before the meeting date.
    
    Args:
        meeting_date: Format "YYYY-MM-DD" (e.g., "2025-06-02")
        data_source: Data source URL
    Returns:
        List of lead IDs from the previous day
    """
    # Get leads data from data source
    leads_data = pd.read_csv(data_source)
    
    # Parse meeting date
    meeting_dt = datetime.strptime(meeting_date, "%Y-%m-%d")
    previous_day = meeting_dt - timedelta(days=1)
    
    print(f"📅 Meeting date: {meeting_date}")
    print(f"🔍 Looking for leads from previous day: {previous_day.strftime('%Y-%m-%d')}")
    
    leads_data['contact_date'] = pd.to_datetime(leads_data['contact_date'])
    
    filtered_leads = leads_data[
        leads_data['contact_date'].dt.date == previous_day.date()
    ]
    
    previous_day_leads = filtered_leads['lead_id'].tolist()
    
    print(f"📊 Found {len(previous_day_leads)} leads from previous day: {previous_day_leads}")
    return previous_day_leads

# Test the function
test_leads = get_leads_from_previous_day(MEETING_DATE, DATA_SOURCE)
print(f"\n✅ Function test complete - found {len(test_leads)} leads for demo")


📅 Meeting date: 2025-05-31
🔍 Looking for leads from previous day: 2025-05-30
📊 Found 30 leads from previous day: [42, 322, 495, 834, 954, 1370, 1376, 1524, 2141, 2202, 2236, 2255, 2838, 2882, 2991, 3167, 3912, 3928, 4336, 4891, 5301, 5693, 5709, 5779, 5866, 6022, 6052, 6377, 6869, 7213]

✅ Function test complete - found 30 leads for demo
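
The date filter inside the helper can also be checked offline, without reading from S3. The sketch below applies the same comparison to a few made-up rows. Only the column names `lead_id` and `contact_date` come from the helper above; the rows and the `leads_on_day` name are invented for illustration.

```python
import io
from datetime import date

import pandas as pd

# Made-up rows for illustration only; they are not taken from the real dataset.
SAMPLE_CSV = """lead_id,contact_date
1,2025-05-29 16:40:00
2,2025-05-30 09:15:00
3,2025-05-30 23:59:59
4,2025-05-31 00:00:00
"""


def leads_on_day(leads: pd.DataFrame, day: date) -> list:
    """Return lead IDs whose contact_date falls on the given calendar day."""
    contact_dates = pd.to_datetime(leads["contact_date"])
    # Comparing on .dt.date ignores the time of day, as the helper does.
    return leads.loc[contact_dates.dt.date == day, "lead_id"].tolist()


sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(leads_on_day(sample, date(2025, 5, 30)))
```

Because the comparison uses `.dt.date`, a lead logged at 23:59:59 still counts toward that day, while one logged at midnight belongs to the next.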


We're ready to create our agent function. The OpenAI Agents SDK makes this very easy; we just need to:
1. Define `lead_scoring_agent`, an `Agent` object with the system prompt and available tools
2. Get the leads we want to inspect with our helper function
3. Write our specific `request` (prompt) for the agent
4. Run the agent with `result = await Runner.run(starting_agent=lead_scoring_agent, input=request)`

In [5]:
async def run_daily_lead_scoring_demo(mcp_server: MCPServer):
    """
    Run the daily lead scoring agent demo
    """
    # Daily Lead Scoring Agent
    lead_scoring_agent = Agent(
        name="Daily Lead Scoring Agent",
        model="gpt-5",
        instructions="""You are a lead scoring agent for the daily sales team meeting. Your task is to:

        1. Set up the KumoRFM model with the S3 lead scoring data
        2. Make predictions for the provided lead IDs from yesterday
        3. Create a prioritized outreach list for SDRs
        4. Provide actionable insights

        Workflow:
        - add_table: s3://kumo-sdk-public/rfm-datasets/lead_scoring/lead_scoring.csv (name: "leads")
        - inspect_table: Check the data structure
        - finalize_graph: Prepare for predictions
        - predict: Use query "PREDICT leads.converted=1 FOR leads.lead_id IN (<lead_ids>)"
        - lookup_table_rows: Get lead details for the high-priority leads

        Output format:
        🚀 HIGH PRIORITY (>15% conversion probability)
        🔥 MEDIUM PRIORITY (10-15% conversion probability)  
        ⏳ LOW PRIORITY (<10% conversion probability)

        For each lead, show: ID, probability, business segment, origin, lead type
        Focus on actionable SDR guidance.""",
        mcp_servers=[mcp_server],
    )

    # Get leads from previous day
    previous_day_leads = get_leads_from_previous_day(MEETING_DATE, DATA_SOURCE)
    
    if not previous_day_leads:
        print("❌ No leads found from previous day")
        return
    
    # Create the daily sales meeting request
    lead_ids_str = ",".join(map(str, previous_day_leads))
    request = f"""
    📋 DAILY SALES TEAM MEETING - {MEETING_DATE}

    We need to prioritize outreach for {len(previous_day_leads)} leads that came in yesterday.

    Lead IDs to score: {lead_ids_str}

    Please:
    1. Set up the lead scoring model 
    2. Score these leads for conversion probability
    3. Create a prioritized outreach list for our SDRs
    4. Provide insights on which leads to focus on first

    Data source: {DATA_SOURCE}
    """
    
    print("🎯 Daily Lead Scoring Meeting")
    print("=" * 50)
    print(f"Request: {request.strip()}")
    print("-" * 50)
    
    # Run the agent
    result = await Runner.run(starting_agent=lead_scoring_agent, input=request, max_turns=20)
    print("\n📊 SDR Prioritization Results:")
    print(result.final_output)
    
    return result

print("✅ Agent function defined and ready to run")


✅ Agent function defined and ready to run
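
In the notebook, the agent itself fills in the `<lead_ids>` placeholder and sorts leads into priority tiers. For spot-checking its output, the same two steps can be written deterministically. The sketch below is an assumption-laden illustration: `build_predict_query` and `priority_bucket` are hypothetical helpers, the probabilities in the example are made up, and the handling of the exact 10% and 15% boundaries is a guess, since the prompt does not specify it.

```python
from typing import Iterable


def build_predict_query(lead_ids: Iterable[int]) -> str:
    """Fill the <lead_ids> placeholder of the query template from the agent instructions."""
    ids = ", ".join(str(int(lead_id)) for lead_id in lead_ids)
    if not ids:
        raise ValueError("at least one lead ID is required")
    return f"PREDICT leads.converted=1 FOR leads.lead_id IN ({ids})"


def priority_bucket(probability: float) -> str:
    """Map a conversion probability in [0, 1] to the tiers named in the prompt."""
    # Assumption: the 10% and 15% boundaries both belong to MEDIUM.
    if probability > 0.15:
        return "HIGH"
    if probability >= 0.10:
        return "MEDIUM"
    return "LOW"


print(build_predict_query([42, 322, 495]))
# Made-up probabilities, used only to exercise each tier.
print([priority_bucket(p) for p in (0.22, 0.15, 0.12, 0.04)])
```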


The only remaining thing to do is initialize the KumoRFM MCP server and provide it to the agent! We can do so with:
```
async with MCPServerStdio(
    name="KumoRFM Server",
    params={
        "command": "python",
        "args": ["-m", "kumo_rfm_mcp.server"],
        "env": {
            "KUMO_API_KEY": os.getenv("KUMO_API_KEY"),
            "KUMO_API_URL": "https://kumorfm.ai/api",
        },
    },
) as server:
    ...
```
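
The `params` dictionary can also be built by a small function, which makes it easier to fail early when the key is missing. This is a sketch under stated assumptions: `build_server_params` is a hypothetical helper, and using `sys.executable` instead of `"python"` is a suggestion that assumes `kumo-rfm-mcp` is installed in the interpreter running the notebook. The snippet above sets `KUMO_API_URL`, so the sketch treats it as optional rather than guessing whether the server requires it.

```python
import sys
from typing import Any, Dict, Optional


def build_server_params(api_key: Optional[str], api_url: Optional[str] = None) -> Dict[str, Any]:
    """Build the `params` dict for MCPServerStdio, failing early on a missing key."""
    if not api_key:
        # os.getenv returns None for an unset variable, which is not a usable env value.
        raise ValueError("KUMO_API_KEY is not set")
    env = {"KUMO_API_KEY": api_key}
    if api_url:
        env["KUMO_API_URL"] = api_url
    return {
        # Assumption: sys.executable is the interpreter where kumo-rfm-mcp is installed.
        "command": sys.executable,
        "args": ["-m", "kumo_rfm_mcp.server"],
        "env": env,
    }


# "example-key" is a stand-in; pass os.getenv("KUMO_API_KEY") in the notebook.
params = build_server_params("example-key", api_url="https://kumorfm.ai/api")
print(params["args"], sorted(params["env"]))
```

The returned dictionary can then be passed as `MCPServerStdio(name="KumoRFM Server", params=params)`.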

The OpenAI Agents SDK also provides a tracing tool, which is very useful for inspecting and debugging agent runs!

In [None]:
async def main():
    """
    Main demo function - connects to MCP server and runs the agent
    """
    print("🔌 Connecting to KumoRFM MCP Server...")
    
    async with MCPServerStdio(
        name="KumoRFM Server",
        params={
            "command": "python",
            "args": ["-m", "kumo_rfm_mcp.server"],
            "env": {
                "KUMO_API_KEY": os.getenv("KUMO_API_KEY"),
            }
        },
    ) as server:
        trace_id = gen_trace_id()
        with trace(workflow_name="Daily Lead Scoring Meeting", trace_id=trace_id):
            print(f"📊 View trace: https://platform.openai.com/traces/trace?trace_id={trace_id}\n")
            print("✅ MCP Server connected successfully!")
            print("🤖 Running Daily Lead Scoring Agent...\n")
            
            # Run the demo
            result = await run_daily_lead_scoring_demo(server)
            
            print("\n" + "="*60)
            print("🎉 DEMO COMPLETE!")
            print("="*60)
            return result

# Run the demo
result = await main()


🔌 Connecting to KumoRFM MCP Server...
📊 View trace: https://platform.openai.com/traces/trace?trace_id=trace_9d28a61df30145ad8897be3369b579cd

✅ MCP Server connected successfully!
🤖 Running Daily Lead Scoring Agent...

📅 Meeting date: 2025-05-31
🔍 Looking for leads from previous day: 2025-05-30
📊 Found 30 leads from previous day: [42, 322, 495, 834, 954, 1370, 1376, 1524, 2141, 2202, 2236, 2255, 2838, 2882, 2991, 3167, 3912, 3928, 4336, 4891, 5301, 5693, 5709, 5779, 5866, 6022, 6052, 6377, 6869, 7213]
🎯 Daily Lead Scoring Meeting
Request: 📋 DAILY SALES TEAM MEETING - 2025-05-31

    We need to prioritize outreach for 30 leads that came in yesterday.

    Lead IDs to score: 42,322,495,834,954,1370,1376,1524,2141,2202,2236,2255,2838,2882,2991,3167,3912,3928,4336,4891,5301,5693,5709,5779,5866,6022,6052,6377,6869,7213

    Please:
    1. Set up the lead scoring model 
    2. Score these leads for conversion probability
    3. Create a prioritized outreach list for our SDRs
    4. Provide in
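
The top-level `await main()` used in the cell above works in Jupyter and IPython, which already run an event loop. In a plain Python script, the coroutine needs to be started explicitly. Here is a minimal sketch, in which `main` is a stand-in for the notebook's function so the example runs on its own:

```python
import asyncio


async def main() -> str:
    """Stand-in for the notebook's main(); the real one connects to the MCP server."""
    await asyncio.sleep(0)  # placeholder for the awaited agent run
    return "demo finished"


# In a plain Python script there is no running event loop, so start one explicitly.
result = asyncio.run(main())
print(result)
```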

## We'd love to hear from you! ❤️

1. **Found a bug or have a feature request?**  
   Submit issues directly on [GitHub](https://github.com/kumo-ai/kumo-rfm). Your feedback helps us improve RFM for everyone.

2. **Built something cool with RFM? We'd love to see it!**  
   Share your project on LinkedIn and tag @kumo. We regularly spotlight community projects on our official channels, and yours could be next!

<div align="left">
  <img src="https://kumo-sdk-public.s3.us-west-2.amazonaws.com/rfm-colabs/kumo_ai_logo.jpeg" width="30" />
</div>