![Meeting Minutes Cover](https://media.licdn.com/dms/image/D4D12AQHDwTFupp2TTA/article-cover_image-shrink_720_1280/0/1701168770707?e=2147483647&v=beta&t=iPZMm8gUXWO3NvMRaoxNTFkvEYVXP3SyHLtNpw41nw8)


# Audio to Meeting Minutes Tool

## Functionality

This tool automates the process of generating professional meeting minutes from audio recordings. It takes an audio file stored on Google Drive as input, transcribes it into text using OpenAI's Whisper model, and summarizes the transcription into structured meeting minutes. The meeting minutes include:

- **Summary**: Key details such as attendees, location, and date.
- **Discussion Points**: Highlights of the conversations.
- **Takeaways**: Key insights and conclusions.
- **Action Items**: Specific tasks with assigned owners.

The minutes are generated as markdown and rendered in the interface. They can be downloaded as a `.txt`, `.pdf`, or `.docx` file.
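To make the layout above concrete, here is a minimal sketch of how the four sections might be assembled into markdown. The `render_minutes` helper, its parameters, the heading names, and the sample meeting data are all illustrative assumptions. The tool itself does not build the minutes this way; it asks the language model to write the markdown directly.

```python
def render_minutes(summary, discussion_points, takeaways, action_items):
    """Assemble meeting minutes as markdown (hypothetical helper, for illustration only).

    action_items is a list of (task, owner) pairs.
    """
    lines = ["## Summary", summary, ""]
    lines.append("## Discussion Points")
    lines += [f"- {point}" for point in discussion_points]
    lines.append("")
    lines.append("## Takeaways")
    lines += [f"- {item}" for item in takeaways]
    lines.append("")
    lines.append("## Action Items")
    lines += [f"- {task} (owner: {owner})" for task, owner in action_items]
    return "\n".join(lines)


# Made-up sample data, not taken from any real meeting
example = render_minutes(
    summary="Attendees: A. Lee, B. Cruz. Location: Room 2. Date: 2024-01-15.",
    discussion_points=["Budget review", "Hiring plan"],
    takeaways=["Budget is on track"],
    action_items=[("Draft hiring plan", "B. Cruz")],
)
print(example)
```

Each action item carries its owner on the same line, which mirrors the "tasks with assigned owners" requirement in the list above.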

## How It Functions

1. **Input**: The user specifies the Google Drive path to the audio file and provides API credentials for OpenAI and Hugging Face.
2. **Transcription**: The tool processes the audio file and converts it to text using OpenAI's Whisper model.
3. **Summarization**: A pre-trained language model (Llama 3.1 8B Instruct, loaded with 4-bit quantization) generates meeting minutes from the transcription.
4. **Output**: The minutes are displayed as markdown and can be downloaded as a `.txt`, `.pdf`, or `.docx` file.
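The steps above can be sketched as a small pipeline that takes each stage as a plain function. This is a simplified illustration, not the notebook's actual code: `run_pipeline`, `fake_transcribe`, and `fake_summarize` are hypothetical names, and the two stand-in stages only mimic the shape of the Whisper and Llama calls so the flow can be run without API keys or a GPU.

```python
from typing import Callable, Tuple


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> Tuple[str, str]:
    """Run transcription, then summarization, and return both results."""
    transcript = transcribe(audio_path)   # step 2: audio -> text
    minutes = summarize(transcript)       # step 3: text -> minutes
    return transcript, minutes


# Stand-in stages (assumptions for illustration; the real tool calls
# OpenAI's Whisper API and a Llama model here).
def fake_transcribe(path: str) -> str:
    return f"transcript of {path}"


def fake_summarize(text: str) -> str:
    return f"# Minutes\n\n{text}"


transcript, minutes = run_pipeline("meeting.mp3", fake_transcribe, fake_summarize)
print(minutes)
```

Passing the stages in as arguments keeps the control flow testable on its own, separate from the models behind each step.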

## Why It Can Be Useful

The tool offers several benefits:
- **Efficiency**: Automating the labor-intensive task of manually transcribing and summarizing meeting audio.
- **Consistency**: Producing well-structured and professional meeting minutes.
- **Accessibility**: Saving minutes as plain text, PDF, or Word documents keeps them compatible with common text-processing tools.
- **Collaboration**: Simplifying sharing and editing of meeting records among team members.

By combining Whisper for transcription with a Llama 3.1 model for summarization, the tool aims for accurate transcripts and coherent minutes, which makes it useful for professionals who handle frequent meetings.

In [2]:
!pip install -q gradio fpdf python-docx requests torch bitsandbytes transformers sentencepiece accelerate openai httpx==0.27.2

import gradio as gr
import os
from google.colab import drive
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer, BitsAndBytesConfig
import torch
from openai import OpenAI
from IPython.display import Markdown
from fpdf import FPDF
from docx import Document
from time import sleep

# Constants
AUDIO_MODEL = "whisper-1"
LLAMA = "meta-llama/Meta-Llama-3.1-8B-Instruct"

# Mount Google Drive
drive.mount("/content/drive")

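def save_to_file(content, fmt):
    """Write the minutes to meeting_minutes.<fmt> and return the file path."""
    file_path = f"meeting_minutes.{fmt}"
    if fmt == "txt":
        with open(file_path, "w", encoding="utf-8") as file:
            file.write(content)
    elif fmt == "pdf":
        pdf = FPDF()
        pdf.add_page()
        pdf.set_font("Arial", size=12)
        for line in content.splitlines():
            # The built-in Arial font only covers Latin-1, so replace anything else
            safe_line = line.encode("latin-1", "replace").decode("latin-1")
            # multi_cell wraps long lines instead of letting them run off the page
            pdf.multi_cell(0, 10, safe_line, align="L")
        pdf.output(file_path)
    elif fmt == "docx":
        doc = Document()
        doc.add_paragraph(content)
        doc.save(file_path)
    else:
        raise ValueError(f"Unsupported output format: {fmt}")
    return file_path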

def transcribe_and_summarize(audio_path, hf_token, openai_api_key, output_format, progress=gr.Progress()):
    try:
        # Initialize APIs
        login(hf_token, add_to_git_credential=True)
        openai = OpenAI(api_key=openai_api_key)

        # Step 1: Transcribe audio using OpenAI Whisper
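        # gr.Progress expects a fraction between 0 and 1
        progress(0.1, desc="Starting transcription...")
        # Open the audio in a context manager so the file handle is always closed
        with open(audio_path, "rb") as audio_file:
            transcription = openai.audio.transcriptions.create(
                model=AUDIO_MODEL, file=audio_file, response_format="text"
            )
        progress(0.5, desc="Transcription complete. Generating meeting minutes...")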

        # Step 2: Prepare LLM prompt
        system_message = "You are an assistant that produces minutes of meetings from transcripts, with summary, key discussion points, takeaways and action items with owners, in markdown."
        user_prompt = (
            f"Below is the transcript of a meeting. Please write minutes in markdown, including a summary with attendees, location and date; discussion points; takeaways; and action items with owners.\n{transcription}"
        )

        messages = [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ]

        # Step 3: Load model and tokenizer
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
            bnb_4bit_quant_type="nf4",
        )

        tokenizer = AutoTokenizer.from_pretrained(LLAMA)
        tokenizer.pad_token = tokenizer.eos_token
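        # add_generation_prompt=True appends the assistant header so the model starts its reply
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to("cuda")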
        streamer = TextStreamer(tokenizer)

        model = AutoModelForCausalLM.from_pretrained(
            LLAMA, device_map="auto", quantization_config=quant_config
        )

        # Step 4: Generate outputs
        outputs = model.generate(inputs, max_new_tokens=2000, streamer=streamer)
        progress(0.9, desc="Meeting minutes generation nearly complete...")

        # Decode only the newly generated tokens, dropping the prompt and special tokens
        response = tokenizer.decode(
            outputs[0][inputs.shape[-1]:], skip_special_tokens=True
        )

        # Save to file
        file_path = save_to_file(response, output_format)
        progress(1.0, desc="Meeting minutes generation complete.")

        return response, file_path

    except Exception as e:
        return f"An error occurred: {str(e)}", None

# Gradio Interface
def interface():
    with gr.Blocks() as demo:
        gr.Markdown("### Audio to Meeting Minutes")

        audio_path = gr.Textbox(label="Google Drive Path for Audio")
        hf_token = gr.Textbox(label="Hugging Face Token", type="password")
        openai_api_key = gr.Textbox(label="OpenAI API Key", type="password")
        output_format = gr.Radio(["txt", "pdf", "docx"], label="Output Format", value="txt")

        output_text = gr.Markdown()
        output_file = gr.File(label="Download Meeting Minutes")

        transcribe_button = gr.Button("Generate Minutes")

        transcribe_button.click(
            transcribe_and_summarize,
            inputs=[audio_path, hf_token, openai_api_key, output_format],
            outputs=[output_text, output_file],
            show_progress=True
        )

    return demo

# Launch the interface
demo = interface()
demo.launch()


Mounted at /content/drive
Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://d390306ff0795fcd41.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
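
The cell above builds the chat prompt inline inside `transcribe_and_summarize`. As a sketch under stated assumptions, the same step could be pulled into a small pure function so it can be checked without loading a model or calling an API. `build_messages` and `SYSTEM_MESSAGE` are hypothetical names, the user prompt is generalized to any meeting, and the sample transcript is made up.

```python
# System prompt copied from the notebook cell
SYSTEM_MESSAGE = (
    "You are an assistant that produces minutes of meetings from transcripts, "
    "with summary, key discussion points, takeaways and action items with owners, "
    "in markdown."
)


def build_messages(transcription: str) -> list:
    """Return the system and user messages for the summarization model.

    Hypothetical helper; the prompt wording is an assumption modeled on the cell above.
    """
    user_prompt = (
        "Below is the transcript of a meeting. Please write minutes in markdown, "
        "including a summary with attendees, location and date; discussion points; "
        "takeaways; and action items with owners.\n"
        f"{transcription}"
    )
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": user_prompt},
    ]


# Made-up transcript fragment, for illustration only
messages = build_messages("Chair: Let's begin.")
print(messages[1]["content"])
```

The returned list has the same shape as the `messages` variable in the cell, so it could be passed to `tokenizer.apply_chat_template` unchanged.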


