# **Meeting Summarizer Agent**

This notebook implements an automated meeting summarization pipeline: OpenAI's Whisper transcribes the meeting audio, and a CrewAI agent backed by Google's Gemini turns the transcript into a structured summary.

## Dependencies
The following packages are required:
- `openai-whisper`: For audio transcription
- `crewai`: For agent-based interactions
- `crewai-tools`: Additional tools for CrewAI
- `google-generativeai`: For Gemini AI integration
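- `ffmpeg-python`: Python bindings for FFmpeg
- `python-dotenv`: For loading the API key from a `.env` file (provides the `dotenv` module imported below)

Whisper also needs the `ffmpeg` executable itself on the system `PATH`; the `ffmpeg-python` package does not install it.

Run the installation cell below to set up these dependencies.

In [3]:
# 1. INSTALLATION GUIDE (uncomment and run once, then comment it out again)
# !pip install openai-whisper crewai crewai-tools google-generativeai ffmpeg-python python-dotenv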
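
A minimal pre-flight check for the `ffmpeg` executable, which Whisper invokes to decode audio files. This is a sketch using only the standard library; `find_ffmpeg` is a hypothetical helper name, not part of Whisper or `ffmpeg-python`.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path of the ffmpeg binary, or None if it is not on PATH.

    Illustrative helper only; it is not provided by any of the packages above.
    """
    return shutil.which(executable)


path = find_ffmpeg()
if path is None:
    print("ffmpeg not found on PATH; install it before transcribing audio.")
else:
    print(f"ffmpeg found at: {path}")
```

Running this before loading the model surfaces a missing system dependency early, instead of partway through transcription.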

## Library Imports
This cell imports all necessary Python libraries:
- `whisper`: For audio transcription capabilities
- `crewai`: For agent-based interactions (Agent, Task, Crew, Process, LLM)
- `dotenv`: For environment variable management
- `os`: For file and environment operations
- `datetime`: For timestamp generation

In [4]:
# 2. IMPORTS
import whisper
from crewai import Agent, Task, Crew, Process, LLM
from dotenv import load_dotenv
import os
from datetime import datetime

## API Configuration
This cell sets up the necessary API configuration:
1. Loads environment variables from a `.env` file
2. Retrieves the Gemini API key for AI-powered summarization
3. Validates the API key presence to ensure proper setup

Make sure to have a `.env` file with `GEMINI_API_KEY` before running this notebook.

In [5]:
# 3. CONFIGURATION: load the Gemini API key from the .env file
load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")

if not GEMINI_API_KEY:
    raise ValueError("GEMINI_API_KEY not found in .env file. Please add it before running.")
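
The same validation can be written as a small reusable function. The sketch below assumes a hypothetical helper named `require_env`; it accepts an optional mapping so it can be exercised without touching the real environment, and it also rejects values that are only whitespace.

```python
import os


def require_env(name, environ=None):
    """Return the value of `name`, raising ValueError when it is missing or blank.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    value = (environ.get(name) or "").strip()
    if not value:
        raise ValueError(
            f"{name} is not set. Add a line like {name}=<your key> to the .env file."
        )
    return value


# Example with an explicit mapping instead of the real environment.
# The key below is a placeholder, not a real credential.
print(require_env("GEMINI_API_KEY", {"GEMINI_API_KEY": "  placeholder-key  "}))
```

In the notebook, calling `require_env("GEMINI_API_KEY")` after `load_dotenv()` would play the same role as the `if not GEMINI_API_KEY` check above.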

## Audio Transcription Component
The `AudioTranscriber` class handles the speech-to-text conversion using OpenAI's Whisper model:

### Features
1. **Model Loading**
   - Supports different model sizes (base, small, medium, large)
   - Automatically uses CUDA if available
   
2. **Transcription**
   - Processes audio files in multiple formats
   - Supports multiple languages
   - Returns structured output with:
     - Full text transcription
     - Language code (auto-detected by Whisper only when `language=None`; otherwise the language passed in)
     - Time-stamped segments

In [6]:
# 4. AUDIO TRANSCRIBER COMPONENT (Whisper)
class AudioTranscriber:
    """Handles audio transcription using Whisper"""
    def __init__(self, model_size="base"):
        print(f"Loading Whisper {model_size} model...")
        # With no explicit device, whisper.load_model() uses CUDA when it is
        # available and falls back to CPU otherwise.
        self.model = whisper.load_model(model_size)
        print(f"Using device: {self.model.device}")
        print("✓ Whisper model loaded")

    def transcribe(self, audio_path, language="en"):
        print(f"Transcribing: {audio_path}...")
        result = self.model.transcribe(
            audio_path,
            language=language,
            verbose=False
        )
        print("✓ Transcription complete")
        return {
            'text': result['text'],
            'language': result['language'],
            'segments': result['segments']
        }
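
The `segments` list returned by `transcribe` is not used later in this notebook, but each segment carries `start` and `end` times in seconds along with its `text`. The sketch below shows one way to render them as a timestamped transcript. The helper names `format_timestamp` and `format_segments` are hypothetical, and the sample segments are hand-written stand-ins, not real Whisper output.

```python
def format_timestamp(seconds):
    """Convert seconds (float) to an HH:MM:SS string, truncating fractions."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_segments(segments):
    """Render one '[start - end] text' line per segment."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


# Hand-written stand-in for the 'segments' value returned by transcribe().
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": " Welcome everyone."},
    {"start": 4.2, "end": 65.9, "text": " First agenda item."},
]
print(format_segments(sample_segments))
# [00:00:00 - 00:00:04] Welcome everyone.
# [00:00:04 - 00:01:05] First agenda item.
```

A timestamped transcript like this can make it easier to trace a summary item back to the point in the recording where it was discussed.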

## Meeting Summarizer Component
The `MeetingSummarizerAgent` class implements the AI-powered summarization using CrewAI and Gemini:

### Components
1. **LLM Setup**
   - Uses the `gemini/gemini-2.0-flash-exp` model through CrewAI's `LLM` class
   - Temperature set to 0.3 for more consistent output
   
2. **Agent Configuration**
   - Role: Meeting Minutes Specialist
   - Specialized in transcript analysis
   - Focused on structured information extraction
   
3. **Summary Structure**
   - Meeting Overview
   - Key Discussion Points
   - Decisions Made
   - Action Items
   - Open Questions

In [7]:
# 5. MEETING SUMMARIZER AGENT (CrewAI + Gemini API)
class MeetingSummarizerAgent:
    """CrewAI agent for meeting summarization"""
    def __init__(self, gemini_api_key=None):
        self.llm = LLM(
            model="gemini/gemini-2.0-flash-exp",
            api_key=gemini_api_key or os.getenv("GEMINI_API_KEY"),
            temperature=0.3
        )
        self.agent = Agent(
            role='Meeting Minutes Specialist',
            goal='Create comprehensive, well-structured meeting summaries with key points, decisions, and action items',
            backstory="""You are an expert at analyzing meeting transcripts and extracting
the most important information. You organize information clearly with proper
headings and bullet points. You identify key decisions, action items, and
important discussions.""",
            llm=self.llm,
            verbose=True,
            allow_delegation=False
        )

    def summarize(self, transcript):
        # Customize this prompt as needed!
        task = Task(
            description=f"""Analyze the following meeting transcript and create a comprehensive summary.

TRANSCRIPT:
{transcript}

Your summary must include:
1. **Meeting Overview**: Brief description of the meeting purpose
2. **Key Discussion Points**: Main topics discussed (bullet points)
3. **Decisions Made**: Important decisions and conclusions
4. **Action Items**: Tasks assigned with responsible persons (if mentioned)
5. **Open Questions**: Unresolved issues or questions raised

Format the output with clear headings and bullet points for easy reading.
""",
            expected_output="A well-structured meeting summary with all key information organized under clear headings",
            agent=self.agent
        )
        crew = Crew(
            agents=[self.agent],
            tasks=[task],
            process=Process.sequential,
            verbose=True
        )
        result = crew.kickoff()
        return result
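
The prompt asks for five sections, but a language model is not guaranteed to produce all of them. The sketch below is one simple way to check a generated summary for the expected headings. `missing_sections` is a hypothetical helper, and the check is a plain case-insensitive substring match, so it only confirms that each heading text appears somewhere in the output.

```python
REQUIRED_SECTIONS = [
    "Meeting Overview",
    "Key Discussion Points",
    "Decisions Made",
    "Action Items",
    "Open Questions",
]


def missing_sections(summary_text, required=REQUIRED_SECTIONS):
    """Return the required headings that do not appear in the summary text."""
    lowered = summary_text.lower()
    return [name for name in required if name.lower() not in lowered]


# Hand-written stand-in for a generated summary with sections missing.
sample_summary = "## Meeting Overview\n...\n## Action Items\n- ..."
print(missing_sections(sample_summary))
# ['Key Discussion Points', 'Decisions Made', 'Open Questions']
```

With the class above, the text to check would be `str(result)` from `summarize`, the same conversion the pipeline applies before writing the summary to disk.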

## Complete Pipeline Implementation
The `MeetingSummarizationPipeline` class combines all components into a seamless workflow:

### Pipeline Steps
1. **Initialization**
   - Sets up Whisper transcriber with specified model size
   - Configures Gemini-powered summarizer
   
2. **Meeting Processing**
   - Audio transcription with progress tracking
   - Transcript summarization
   - Optional output saving with timestamps
   
3. **Output Generation**
   - Full transcript
   - Structured summary
   - Metadata (language, timestamp)

In [8]:

# 6. COMPLETE PIPELINE CLASS
class MeetingSummarizationPipeline:
    """End-to-end pipeline for meeting summarization"""
    def __init__(self, whisper_model="base", gemini_api_key=None):
        self.transcriber = AudioTranscriber(model_size=whisper_model)
        self.summarizer = MeetingSummarizerAgent(gemini_api_key=gemini_api_key)
    
    def process_meeting(self, audio_path, language="en", save_output=True):
        print("\n" + "="*60)
        print("MEETING SUMMARIZATION PIPELINE")
        print("="*60 + "\n")

        # Step 1: Transcribe audio
        print("[1/2] Transcribing audio...")
        transcription_result = self.transcriber.transcribe(audio_path, language)
        transcript = transcription_result['text']
        print(f"\nTranscript length: {len(transcript)} characters\n")

        # Step 2: Summarize transcript
        print("[2/2] Generating summary...")
        summary = self.summarizer.summarize(transcript)

        # Save transcripts and summaries if desired
        if save_output:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            transcript_file = f"meeting_transcript_{timestamp}.txt"
            with open(transcript_file, 'w', encoding='utf-8') as f:
                f.write(transcript)
            print(f"\n✓ Transcript saved: {transcript_file}")

            summary_file = f"meeting_summary_{timestamp}.txt"
            with open(summary_file, 'w', encoding='utf-8') as f:
                f.write(str(summary))
            print(f"✓ Summary saved: {summary_file}")
        
        print("\n" + "="*60)
        print("PROCESSING COMPLETE")
        print("="*60 + "\n")

        return {
            'transcript': transcript,
            'summary': summary,
            'metadata': {
                'language': transcription_result['language'],
                'timestamp': datetime.now().isoformat()
            }
        }
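
The pipeline writes the transcript and summary to text files, but the returned metadata is not saved. As a sketch of one possible extension, the dictionary returned by `process_meeting` can be serialized to JSON. `result_to_json` is a hypothetical helper, and the sample values below are placeholders rather than real pipeline output.

```python
import json


def result_to_json(result):
    """Serialize a pipeline result dict to a JSON string.

    The summary is passed through str() because the object returned by
    crew.kickoff() may not be a plain string.
    """
    payload = {
        "transcript": result["transcript"],
        "summary": str(result["summary"]),
        "metadata": result["metadata"],
    }
    return json.dumps(payload, ensure_ascii=False, indent=2)


# Placeholder values shaped like the dict returned by process_meeting().
sample_result = {
    "transcript": "Welcome everyone.",
    "summary": "## Meeting Overview\n...",
    "metadata": {"language": "en", "timestamp": "2025-01-01T00:00:00"},
}
print(result_to_json(sample_result))
```

Writing this string to a `.json` file next to the two text files would keep the transcript, summary, and metadata together in one machine-readable record.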

## Usage Example
This cell demonstrates how to use the complete pipeline:

1. **Setup**
   - Specify the audio file path
   - Initialize pipeline with desired model size
   - Provide Gemini API key
   
2. **Execution**
   - Process meeting audio
   - Set language (default: English)
   - Enable/disable output saving
   
3. **Output Display**
   - View full transcript
   - Review generated summary
   
Note: Replace `sample_audio.mp3` with your actual audio file path before running.

In [None]:
# 7. USAGE
if __name__ == "__main__":
    AUDIO_FILE_PATH = "sample_audio.mp3"  # <-- Replace with your file!
    pipeline = MeetingSummarizationPipeline(
        whisper_model="base",
        gemini_api_key=GEMINI_API_KEY
    )
    result = pipeline.process_meeting(
        audio_path=AUDIO_FILE_PATH,
        language="en",
        save_output=True
    )
    print("="*60)
    print("TRANSCRIPT")
    print("="*60)
    print(result['transcript'])
    print("\n" + "="*60)
    print("SUMMARY")
    print("="*60)
    print(result['summary'])

Loading Whisper base model...
Using device: cuda:0
✓ Whisper model loaded

MEETING SUMMARIZATION PIPELINE

[1/2] Transcribing audio...
Transcribing: sample_audio.mp3...


100%|██████████| 256062/256062 [01:12<00:00, 3515.51frames/s]

✓ Transcription complete

Transcript length: 34245 characters

[2/2] Generating summary...

✓ Transcript saved: meeting_transcript_20251021_012912.txt
✓ Summary saved: meeting_summary_20251021_012912.txt

PROCESSING COMPLETE

TRANSCRIPT
 can record. And we don't have a ton of items to get to. And I might be able to do one that might be fun if we have a little bit of time. So corporate events, they, I think I saw a little, I put this in Slack and I saw a little bit of kind of noise around it, which was good. The nutshell here is as we've kind of restructured and tried different things. The event support that we need isn't as nailed down as it needs to be. So the current tactic that we're going with is go-to-market team, signs up and kind of sponsors that event. So you support as a PMM, your campaign manager does the campaigns for that event, etc, etc. I don't see anyone in the, maybe there are comments in the issue. I don't see the header updated yet. I thought we had in Slack sort of farm to each one of them out. So I guess the next, so it looks like tie put in some folks. I