# LangGraph Agent for Call Transcription Processing

This notebook demonstrates how to use a LangGraph agent to ingest call transcription files into Qdrant and then list, summarize, and query them with natural-language instructions.

In [1]:
from dotenv import load_dotenv
_ = load_dotenv()

import os
from langgraph_agent import run_transcription_agent, create_agent



In [2]:
# Check configuration
print("Configuration:")
print(f"OPENAI_API_KEY: {'✅ Set' if os.getenv('OPENAI_API_KEY') else '❌ Not set'}")
print(f"QDRANT_URL: {os.getenv('QDRANT_URL', 'localhost')}")
print(f"QDRANT_PORT: {os.getenv('QDRANT_PORT', '6333')}")

Configuration:
OPENAI_API_KEY: ✅ Set
QDRANT_URL: 127.0.0.1
QDRANT_PORT: 6333
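
The check above only prints the settings. If you would rather fail fast when something is missing, a small helper along these lines could be used. This is a minimal sketch: `load_config`, `REQUIRED_VARS`, and `DEFAULTS` are hypothetical names introduced here, not part of `langgraph_agent`, and the defaults simply mirror the ones printed in the cell above.

```python
import os

# Hypothetical helper, not part of langgraph_agent.
# Variables that must be present for the agent to work.
REQUIRED_VARS = ("OPENAI_API_KEY",)
# Fallbacks mirroring the defaults used in the configuration cell.
DEFAULTS = {"QDRANT_URL": "localhost", "QDRANT_PORT": "6333"}


def load_config(env=None):
    """Return the Qdrant settings, raising if a required variable is unset."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    # Ports arrive as strings from the environment; convert once here.
    config["QDRANT_PORT"] = int(config["QDRANT_PORT"])
    return config
```

Calling `load_config()` after `load_dotenv()` would raise a `RuntimeError` naming the missing variable instead of letting the agent fail later, mid-request.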


## Basic Usage

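The agent accepts natural-language instructions and maps each one to the matching operation. The examples below cover summarizing a transcript, ingesting a file, listing stored calls, and answering a question about call content.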

In [5]:
# Example 1: Summarize the most recent call transcript
result = run_transcription_agent(
    "show me the last call transcript and summarize it"
)

**File Information:**
- Filename: 4_negotiation_call.txt
- Processed: 2025-06-28T14:53:44.384361
- Total Chunks: 12

**Summary:**
The call focused on finalizing commercial terms, legal redlines, and sign-off details for a contract. Key participants included the Account Executive (AE), prospects Neha, Raghav, Sara, Priya, and Arjun, as well as Maya from Pricing and Elena from Security. Important decisions included agreeing on pricing, SLA terms, legal clauses, and onboarding details. Action items included updating the contract, sending security documentation, scheduling a DocuSign call, and preparing for onboarding. The call concluded with plans to mark the deal as CLOSED-WON.


In [None]:
# Example 2: Ingest a transcription file by path
result = run_transcription_agent(
    "Load and process the transcription file at /home/bhayabhai.sisodiya/Documents/retriever-chatbot/4_negotiation_call.txt"
)

Fetching 30 files: 100%|██████████| 30/30 [00:00<00:00, 88988.06it/s]
You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Processing transcription file: /home/ad.rapidops.com/bhayabhai.sisodiya/Documents/retriever-chatbot/4_negotiation_call.txt
{'text': '[00:00] AE (Jordan):  Hi everyone—great to see a full house for what we hope is the final stretch.  On our side: I’m Jordan, Maya is back from Pricing, Elena from Security for any last clarifications, and our VP of Customer Success, Asha, in case onboarding timelines come up.  \n[00:15] AE:  From your side I see Neha from Procurement, Raghav your CFO, Sara from Legal, Priya from RevOps, and Arjun from Security.  Did I miss anyone?  \n[00:24] Prospect (Neha – Procurement):  That’s everyone, thanks Jordan.  Let’s get this wrapped.  \n[00:28] AE:  Agenda in three parts: (1) confirm commercial terms, (2) legal redlines and SLA language, (3) sign-off path and timing.  We’ve blocked 75 minutes.  We can finish sooner if decisions fall quickly.  Sound good?  \n[00:42] Prospect (Raghav – CFO):  Works.  I have another meeting at the top of the hour, so let’s keep p

100%|██████████| 1/1 [00:00<00:00, 36.86it/s]

Successfully upserted 12 points
Successfully processed /home/ad.rapidops.com/bhayabhai.sisodiya/Documents/retriever-chatbot/4_negotiation_call.txt
Collection: call_transcriptions
Total chunks: 12





✅ Successfully ingested transcription from /home/ad.rapidops.com/bhayabhai.sisodiya/Documents/retriever-chatbot/4_negotiation_call.txt into collection 'call_transcriptions'


In [4]:
# Example 3: List all ingested call transcripts
result = run_transcription_agent(
    "list all the calls"
)

📋 **All Call Transcripts (4 files)**

**File List:**
1. **4_negotiation_call.txt**
   - Processed: 2025-06-28 14:53
   - Chunks: 12

2. **2_pricing_call.txt**
   - Processed: 2025-06-28 14:39
   - Chunks: 11

3. **3_objection_call.txt**
   - Processed: 2025-06-28 14:01
   - Chunks: 8

4. **1_demo_call.txt**
   - Processed: 2025-06-28 11:20
   - Chunks: 10

---
✅ Total: 4 transcript files in collection 'call_transcriptions'


In [4]:
# Example 4: Ask a question about call content
result = run_transcription_agent(
    "tell me what CRO said about our pricing?"
)



**Question:** tell me what CRO said about our pricing?

**Answer:**
During the pricing call, the CRO (Chief Revenue Officer) discussed various discounts and pricing strategies to align with the prospect's needs. The CRO proposed a 20% discount for a 25-seat pilot, a 15% discount for a year-one commit of 120 seats, and a 4% renewal uplift cap. The total contract value (TCV) was calculated to be ₹67,73,760 for the prospect. The CRO also mentioned the possibility of seeking exceptions for slightly higher discounts. Additionally, the CRO addressed technical requests such as guaranteeing Slack early-access by pilot and Hindi diarization GA by Q4, which required a side-letter "Service Level Release" with penalties.

Quotes from the transcript:
- "Let’s take a short sidebar. Maya and I can propose a logo-plus-case-study discount of 15% and stretch volume to 12% for year-one commit of 120 seats. That reaches 20% net, slightly above policy but we can seek exception." (Chunk 2)
- "That’d put TCV

## Advanced Usage

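You can also build the agent with `create_agent()` and invoke it yourself on a message state. This gives you access to every message produced during the run, not only the final output.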

In [None]:
from langchain_core.messages import HumanMessage

# Create the agent
agent = create_agent()

# Create the initial state: a single user message

initial_state = {
    "messages": [
        HumanMessage(content="Process the transcription file at ./sample_call.txt")
    ]
}

# Run the agent
result = agent.invoke(initial_state)

# Print results
for message in result["messages"]:
    print(f"Type: {type(message).__name__}")
    print(f"Content: {message.content}")
    print("-" * 40)

## Batch Processing

To ingest several transcription files, call the agent once per file in a loop. The paths below are placeholders; replace them with your own files.

In [None]:
# List of transcription files to process
transcription_files = [
    "/path/to/call_001.txt",
    "/path/to/call_002.json",
    "/path/to/call_003.csv"
]

# Process each file
results = []
for file_path in transcription_files:
    print(f"\n📁 Processing: {file_path}")
    print("=" * 50)
    
    result = run_transcription_agent(
        f"Ingest the call transcription from {file_path}",
        verbose=True
    )
    results.append(result)

print(f"\n✅ Processed {len(results)} transcription files")