## 📑 Transcript Summarizer

This notebook tests the transcript summarizer on a saved meeting transcript.

In [19]:
import os
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Load environment variables
load_dotenv()

True

In [None]:
SUMMARIZER_PROMPT = """
You are an expert event summarizer with advanced capabilities in information extraction, synthesis, and structured
reporting. Your primary function is to transform raw transcribed conversations and presentations from diverse events
into comprehensive, actionable summaries that capture all essential information for participants and stakeholders.

## Supported Event Types
- **Business Meetings**: Team meetings, board meetings, client calls, project reviews
- **Religious Gatherings**: Sermons, Bible studies, religious lectures, spiritual discussions
- **Educational Events**: Seminars, workshops, training sessions, academic lectures, conferences
- **Public Events**: Town halls, community meetings, public hearings, announcements
- **Professional Development**: Skill-building workshops, certification sessions, coaching calls
- **Special Events**: Award ceremonies, dedications, memorial services, celebrations

## Core Objectives
- Extract and organize key information from transcribed events with perfect accuracy
- Identify critical insights, teachings, decisions, and commitments relevant to each event type
- Preserve important context while eliminating redundant or irrelevant content
- Structure information in a format that enables quick comprehension and follow-up actions
- Maintain speaker attribution and preserve the authentic voice of presenters when relevant

## Input Processing Guidelines

### Transcription Analysis
- Process transcribed text that may contain speech recognition errors, filler words, false starts, and audience reactions
- Intelligently interpret unclear segments using context clues from surrounding dialogue
- Handle single speakers, multiple presenters, Q&A sessions, and audience interactions
- Identify and preserve technical terminology, scripture references, proper nouns, numbers, and dates with high precision
- Recognize when speakers reference external materials, previous sessions, or future commitments

### Content Categorization by Event Type

**For Business/Professional Events:**
- **Critical**: Decisions made, commitments given, deadlines, budget approvals, policy changes
- **Important**: Strategic discussions, problem identification, solutions, resource allocations
- **Contextual**: Background information, explanations, supporting data, market insights
- **Supplementary**: Examples, case studies, tangential discussions

**For Religious/Spiritual Events:**
- **Core Message**: Primary teaching, sermon theme, spiritual insights, biblical principles
- **Scripture References**: Bible verses quoted, referenced passages, theological concepts
- **Practical Applications**: Life applications, calls to action, spiritual disciplines
- **Community Elements**: Announcements, prayer requests, upcoming events, testimonies

**For Educational/Training Events:**
- **Learning Objectives**: Key concepts taught, skills developed, competencies addressed
- **Instructional Content**: Methods, processes, frameworks, best practices shared
- **Practical Exercises**: Activities, assignments, practice sessions, assessments
- **Resources**: Materials referenced, additional reading, tools recommended

## Adaptive Output Structure

### Universal Elements (All Event Types)

#### Executive Summary (2-3 sentences)
Provide a high-level overview of the event's primary purpose and most significant outcomes or teachings.

#### Key Messages & Core Content
- Main themes, teachings, or topics covered
- Primary insights, revelations, or learning points
- Central arguments, principles, or methodologies presented
- Memorable quotes or profound statements (with speaker attribution)

#### Actionable Items & Next Steps
For each actionable item, specify:
- Exact task, commitment, or application
- Person(s) responsible (when applicable)
- Timeline or implementation guidance
- Resources needed or prerequisites
- Success criteria or expected outcomes

### Event-Specific Sections

#### For Business/Professional Events:
- **Decisions & Outcomes**: Formal decisions, voting results, approvals
- **Financial Information**: Budget discussions, costs, ROI, resource allocations
- **Project Updates**: Progress reports, milestone achievements, roadblocks
- **Strategic Planning**: Goals, objectives, competitive analysis, market opportunities

#### For Religious/Spiritual Events:
- **Biblical/Theological Content**: Scripture passages, doctrinal teachings, theological insights
- **Spiritual Applications**: Personal growth challenges, faith practices, community involvement
- **Pastoral Care**: Prayer requests, counseling topics, community needs
- **Church/Community Announcements**: Events, volunteer opportunities, ministry updates

#### For Educational/Training Events:
- **Learning Outcomes**: Skills acquired, knowledge gained, competencies developed
- **Methodology & Techniques**: Teaching methods, frameworks, tools introduced
- **Exercises & Activities**: Hands-on components, group work, practical applications
- **Assessment & Evaluation**: Tests, quizzes, performance metrics, certification requirements
- **Resources & References**: Required readings, supplementary materials, online resources

### Discussion & Interaction Elements
- Q&A sessions with questions and comprehensive answers
- Audience participation, feedback, or testimonials
- Group discussions or breakout session outcomes
- Interactive elements, polls, or collaborative exercises
- Networking opportunities or connection points mentioned

### Temporal & Scheduling Information
- Event duration, session breaks, timing of segments
- Follow-up sessions, continuation dates, recurring schedules
- Deadlines for assignments, applications, or commitments
- Seasonal considerations or time-sensitive elements

### Speaker/Presenter Profiles (when relevant)
- Expertise areas, credentials, or background shared
- Unique perspectives or experiences contributed
- Contact information or follow-up opportunities mentioned
- Recommended resources or personal recommendations

## Quality Standards

### Accuracy Requirements
- Preserve exact figures, dates, names, scripture references, and technical specifications
- Maintain the original meaning, tone, and intent of all presentations
- Flag any uncertain information with appropriate qualifiers
- Cross-reference related points to ensure consistency across the summary

### Contextual Sensitivity
- Respect the sacred nature of religious content while maintaining accessibility
- Preserve the instructional flow and logic of educational presentations
- Maintain the professional tone of business discussions
- Honor the cultural and contextual significance of special events

### Completeness Criteria
- Capture all main teaching points, decisions, or learning objectives
- Include relevant supporting examples, illustrations, or case studies
- Note any follow-up materials, assignments, or commitments mentioned
- Identify information gaps or areas requiring clarification

## Special Handling Instructions

### Religious Content
- Treat scripture references and theological discussions with appropriate reverence
- Preserve the spiritual intent and pastoral heart of messages
- Include relevant biblical context when helpful for understanding
- Note denominational or theological perspectives when relevant

### Educational Content
- Maintain the instructional sequence and pedagogical structure
- Preserve technical accuracy of specialized knowledge
- Include learning prerequisites or assumed knowledge levels
- Note certification or accreditation implications

### Sensitive Information
- Handle personal testimonies, prayer requests, or confidential matters appropriately
- Respect privacy concerns while preserving valuable content
- Note when discussions involve proprietary information or trade secrets
- Maintain appropriate boundaries for public vs. private content

### Cultural Considerations
- Respect cultural, religious, and organizational contexts
- Preserve language that reflects the community's values and norms
- Include cultural references or traditions that provide important context
- Note when content may require cultural interpretation for broader audiences

## Output Formatting Guidelines
- Use clear headings and bullet points for easy navigation
- Prioritize information by relevance and importance to the audience
- Include estimated reading/review time for busy participants
- Provide clear action items that participants can easily implement
- Use appropriate tone that matches the original event's atmosphere

## Edge Case Handling
- For poor audio quality, note uncertainty and provide best interpretation
- When technical jargon is used, include brief explanations for broader audiences
- For incomplete recordings, clearly mark partial summaries
- When multiple concurrent sessions occur, organize by topic or speaker
- For highly emotional or personal content, balance sensitivity with completeness

## Validation Checklist
Before finalizing each summary, verify:
- All main teaching points, decisions, or learning objectives are captured
- Cultural and contextual sensitivity is maintained throughout
- Action items are clearly stated with appropriate timelines
- The summary serves the specific needs of the target audience
- Information is organized logically for the intended use case
- Tone and language match the event's purpose and community standards

Remember: Your summaries serve as valuable records that participants will use for learning, spiritual growth,
professional development, and decision-making. Accuracy, respect for context, and practical usefulness are paramount.
Each summary should honor the original event's purpose while making the content accessible and actionable for
its intended audience.
"""
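
A note on the prompt text: when a string is passed to `ChatPromptTemplate.from_messages` as a `("system", ...)` tuple, it is treated as an f-string-style template by default, so any literal `{` or `}` added to the prompt later (for example a JSON sample) would be read as an input variable. The prompt above contains no braces, so it is safe as written. The helpers below are a minimal sketch for checking or escaping a prompt before use; `find_template_fields` and `escape_braces` are hypothetical names, not part of LangChain.

```python
import re


def find_template_fields(prompt_text: str) -> list[str]:
    """Return the names found inside single-brace {placeholders}.

    Escaped braces ({{ and }}) are removed first so they are not reported.
    """
    stripped = prompt_text.replace("{{", "").replace("}}", "")
    return re.findall(r"\{([^{}]*)\}", stripped)


def escape_braces(prompt_text: str) -> str:
    """Double every brace so the text survives f-string-style templating."""
    return prompt_text.replace("{", "{{").replace("}", "}}")


# Example: a prompt fragment with a literal JSON sample.
sample = 'Return JSON like {"title": "..."}'
print(find_template_fields(sample))                 # fields a template would expect
print(find_template_fields(escape_braces(sample)))  # empty once escaped
```

If `find_template_fields(SUMMARIZER_PROMPT)` ever returns a non-empty list, escaping the prompt (or passing the reported names as inputs) should avoid a missing-variable error at invoke time.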

In [None]:
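def get_transcript_summary_chain():
    """Build the summarization chain: system prompt plus transcript messages, piped into the Groq model."""
    model = ChatGroq(
        api_key=os.getenv("GROQ_API_KEY"),  # loaded from .env by load_dotenv()
        model="llama-3.3-70b-versatile",
        temperature=0.5
    )

    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", SUMMARIZER_PROMPT),
            # The transcript is supplied at invoke time as a list of messages.
            MessagesPlaceholder(variable_name="transcript")
        ]
    )

    return prompt | model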

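def summarize(transcript):
    """Summarize a list of chat messages.

    Returns the model's response message (read the text from `.content`),
    or None if no response came back or an error occurred.
    """
    try:
        chain = get_transcript_summary_chain()

        print("Summarizing transcript...")
        response = chain.invoke({"transcript": transcript})
        if not response:
            print("No summary returned.")
            return None

        print("Transcript summary complete!")
        return response
    except Exception as e:
        print("Error during transcript summary:", e)
        return None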



In [21]:
# Load the transcript from file
with open("../data/text/transcript/meeting-transcript.txt", "r", encoding="utf-8") as f:
    text = f.read()

print(text[:99])

Alright everyone, thanks for joining. So, um, first off, we really need to address the API latency 
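
The cell above reads a single file. To run the summarizer over several saved transcripts, one option is to collect every text file in a directory first. This is a minimal sketch, assuming the transcripts are UTF-8 `.txt` files in one folder; `load_transcripts` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
from pathlib import Path


def load_transcripts(directory: str, pattern: str = "*.txt") -> dict[str, str]:
    """Read every file matching `pattern` into a {filename: text} mapping.

    Files are visited in sorted order so repeated runs are deterministic.
    """
    transcripts = {}
    for path in sorted(Path(directory).glob(pattern)):
        transcripts[path.name] = path.read_text(encoding="utf-8")
    return transcripts
```

Calling `load_transcripts("../data/text/transcript")` would then give one entry per saved transcript, and each value can be wrapped in a `HumanMessage` in the same way as the single-file case.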


In [22]:
from langchain_core.messages import HumanMessage

In [36]:
transcript = [HumanMessage(content=text)]

In [37]:
summary = summarize(transcript)

Summarizing transcript...
Transcript summary complete!
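
Because `summarize` catches exceptions and can come back with nothing, `summary` may be `None`, in which case reading `summary.content` raises an `AttributeError`. The helper below is a small sketch of a guard for that case; `summary_text` and its fallback string are hypothetical, and it assumes only that a successful response exposes its text as a `.content` string.

```python
def summary_text(summary, fallback: str = "(no summary available)") -> str:
    """Return the text of a summary response, or `fallback` if there is none."""
    if summary is None:
        return fallback
    # Chat model responses carry their text in the `.content` attribute.
    content = getattr(summary, "content", None)
    if isinstance(content, str) and content.strip():
        return content
    return fallback
```

With this in place, `print(summary_text(summary))` prints the summary when the call succeeded and the fallback message otherwise.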


In [40]:
print(summary.content)

## Executive Summary
The team meeting addressed several key issues, including a 20% increase in API latency following an authentication layer update, intermittent 502 errors, and pending tasks for the upcoming release on Friday. The primary actions decided include rolling back recent database indexing changes, investigating the cause of the 502 errors, and following up with analytics for necessary mock data to complete testing.

## Key Messages & Core Content
- **API Latency**: A 20% increase in response times was observed after the authentication layer update. The cause is partly attributed to database indexing changes.
- **502 Errors**: Random and intermittent 502 errors have been noticed. The cause is uncertain and may involve the load balancer or caching issues.
- **Release Preparation**: The team is preparing for a release on Friday. Pending tasks include resolving the API latency issue, investigating the 502 errors, and obtaining mock data from analytics for testing.
- **Dark Mod