# Meeting Minutes Generator

## Problem Statement

### Background
Traditional meeting documentation is time-consuming and often misses critical details. Manually transcribing a meeting can take three to four times as long as the meeting itself, which leads to:
- Reduced productivity
- Inconsistent documentation
- Missing important action items
- Difficulty in knowledge sharing

### Objective
Create an automated meeting minutes generator that will:
1. Transcribe audio recordings to text using state-of-the-art speech recognition models
2. Extract key information including:
   - Main discussion points
   - Action items and assignments
   - Decisions made
   - Follow-up tasks
3. Generate structured, easily readable meeting minutes
4. Support multiple audio formats and meeting types
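
One way to make objectives 2 and 3 concrete is to fix a small data structure for the extracted information and render it to Markdown. The sketch below is illustrative only: the class names `ActionItem` and `MeetingMinutes`, their fields, and the section layout are assumptions, not part of the notebook's code.

```python
from dataclasses import dataclass, field


@dataclass
class ActionItem:
    """A single action item; owner and due date are optional (hypothetical fields)."""
    task: str
    owner: str = "Unassigned"
    due: str = "TBD"


@dataclass
class MeetingMinutes:
    """Hypothetical container for the four kinds of information listed above."""
    title: str
    discussion_points: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    action_items: list[ActionItem] = field(default_factory=list)
    follow_ups: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the minutes as a Markdown document with one section per category."""
        lines = [f"# {self.title}", "", "## Discussion Points"]
        lines += [f"- {point}" for point in self.discussion_points]
        lines += ["", "## Decisions"]
        lines += [f"- {decision}" for decision in self.decisions]
        lines += ["", "## Action Items"]
        lines += [
            f"- {item.task} (owner: {item.owner}, due: {item.due})"
            for item in self.action_items
        ]
        lines += ["", "## Follow-up Tasks"]
        lines += [f"- {task}" for task in self.follow_ups]
        return "\n".join(lines)
```

A structure like this could also be the target of a JSON-producing prompt, so that the model output is parsed into fields instead of being printed as free text.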

### Technical Requirements
- Speech-to-text conversion using OpenAI Whisper or AssemblyAI
- Natural Language Processing for information extraction
- Secure handling of sensitive meeting data
- Support for multiple speakers (diarization)
- Export functionality to common formats (PDF, DOCX, etc.)
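
Before sending a recording to a hosted speech-to-text service, it can help to validate the file locally. The following is a minimal sketch, assuming a 25 MB upload limit and a small set of accepted extensions. Both constants and the helper name `check_audio_file` are assumptions for illustration and should be checked against the current documentation of whichever provider is used.

```python
from pathlib import Path

# Assumed limits: verify against the provider's current documentation.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
ASSUMED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm"}


def check_audio_file(audio_path: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the file looks fine."""
    path = Path(audio_path)
    if not path.is_file():
        return [f"{audio_path} does not exist or is not a file"]

    problems = []
    if path.suffix.lower() not in ASSUMED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix or '(none)'}")
    size = path.stat().st_size
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes, above the assumed {MAX_UPLOAD_BYTES}-byte limit")
    return problems
```

Returning a list of problems instead of raising on the first one lets an upload form report every issue at once.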

### Success Metrics
- Transcription accuracy > 95%
- Processing time < 0.5x meeting duration
- Correct identification of key points and action items
- User satisfaction with generated minutes format
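
The first two metrics can be measured directly. The sketch below assumes that "accuracy" means one minus the word error rate (WER) against a human reference transcript, so the 95% target corresponds to a WER below 0.05. That interpretation, the lowercase and whitespace normalization, and the function names are assumptions and not definitions taken from this document.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def processing_ratio(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time as a fraction of meeting duration (the target above is below 0.5)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

For example, a 15-minute recording processed in 5 minutes gives `processing_ratio(300, 900)`, roughly 0.33, which meets the target.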

### Scope
Phase 1:
- Audio file upload and transcription
- Basic text processing and formatting
- Simple export functionality

Future Phases:
- Real-time transcription
- Speaker identification
- Integration with calendar and task management systems
- Custom templates for different meeting types

In [2]:
import os
from dotenv import load_dotenv
from openai import OpenAI
from pathlib import Path
import soundfile as sf
import datetime

In [3]:
# Load environment variables
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

In [4]:
class MeetingTranscriber:
    def __init__(self):
        self.client = client
        
    def transcribe_audio(self, audio_path: str) -> dict:
        """
        Transcribe audio file using OpenAI's Whisper model
        
        Args:
            audio_path (str): Path to the audio file
            
        Returns:
            dict: Transcription result containing text and metadata
        """
        try:
            with open(audio_path, "rb") as audio_file:
                transcript = self.client.audio.transcriptions.create(
                    model="whisper-1",
                    file=audio_file,
                    response_format="verbose_json"
                )
            
            return {
                'text': transcript.text,
                'timestamp': datetime.datetime.now().isoformat(),
                'duration': self._get_audio_duration(audio_path),
                'status': 'success'
            }
            
        except Exception as e:
            return {
                'text': None,
                'timestamp': datetime.datetime.now().isoformat(),
                'status': 'error',
                'error': str(e)
            }
    
    def _get_audio_duration(self, audio_path: str) -> float:
        """Get duration of audio file in seconds"""
        audio_info = sf.info(audio_path)
        return audio_info.duration

In [None]:
if __name__ == "__main__":
    transcriber = MeetingTranscriber()

    # Keep only the first 15 minutes of the recording
    # (reading and writing MP3 with soundfile needs a recent libsndfile build)
    audio_data, sample_rate = sf.read("denver.mp3")
    audio_data = audio_data[:15 * 60 * sample_rate]
    sf.write("denver_audio_trimmed.mp3", audio_data, sample_rate)

    # Transcribe the trimmed audio file
    result = transcriber.transcribe_audio("denver_audio_trimmed.mp3")

    if result['status'] != 'success':
        # Stop here: without a transcript there is nothing to summarize
        raise RuntimeError(f"Error during transcription: {result['error']}")

    print(f"Transcription completed! Duration: {result['duration']:.2f} seconds")

    # system prompt
    system_prompt = """
    You are a highly skilled AI assistant tasked with generating concise and informative meeting minutes from a provided meeting transcript. 
    Extract key elements like the meeting title, date and time, attendees, absentees, agenda items, decisions made, action items, and next steps.
    Structure the minutes with a header, attendee list, agenda items and discussion summaries, decisions, action items with assigned individuals and 
    deadlines, and next steps. Write in a professional tone, using clear and concise language, and maintain a neutral and objective perspective. 
    Focus on brevity, summarizing key information without unnecessary detail, and proofread carefully to ensure the minutes are free of errors.
    """
    
    # user prompt
    user_prompt = f"""
    Here is the transcript of the meeting: `{result['text']}`
    """

    messages_chat = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_prompt}]
    
    # generate the meeting minutes
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages_chat
    )
    
    # print the response
    print(response.choices[0].message.content)

In [None]:
# Use the bitsandbytes integration in transformers for 4-bit quantization
# (BitsAndBytesConfig is exported by transformers, not by the bitsandbytes package)
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "meta-llama/Llama-3.3-70B-Instruct"

# quantization config
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_quant_storage=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matrix multiplications
)

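# Tokenizer (it is not a torch module, so it is never moved to the GPU)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Model: device_map="auto" places the quantized weights on the available devices;
# calling .to("cuda") on a 4-bit model is not supported
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Generate the meeting minutes
inputs = tokenizer.apply_chat_template(
    messages_chat,
    add_generation_prompt=True,  # append the assistant header so the model starts its reply
    return_tensors="pt",
    return_dict=True,
).to(quantized_model.device)
# max_new_tokens bounds only the reply; max_length would also count the long transcript prompt
output = quantized_model.generate(**inputs, max_new_tokens=1000)
# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[1]
print(tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True))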



