<a href="https://colab.research.google.com/github/axiom19/AI-Meeting-Summarizer/blob/main/Meeting_Summarizer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Meeting Summarizer / Speech Analyzer
<ul>
<li>In this notebook, we first set up a large language model (LLM) instance: a Hugging Face model wrapped in a pipeline.</li>
<li>Then we define a prompt template. Prompt templates are structured guides for generating prompts for language models and help organize the output (see the LangChain prompt template documentation for more information).</li>
<li>Next, we write a transcription function that uses the OpenAI Whisper model to convert speech to text. The function takes an audio file uploaded through a Gradio app interface (preferably in .mp3 format).</li>
<li>The transcribed text is then passed to an LLMChain, which inserts the text into the prompt template and forwards the result to the chosen LLM.</li>
<li>The final output from the LLM is displayed in the Gradio app's output textbox.</li>
</ul>

In [9]:
# !pip install gradio
# !pip install transformers
# !pip install langchain
# !pip install torch

In [10]:
# !pip install langchain-huggingface

## Import Statements

In [1]:
import torch
import os
import gradio as gr
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM, AutoModelForCausalLM
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_huggingface import HuggingFacePipeline

In [11]:
#os.environ["HF_TOKEN"] = "KEY"

## Step 1: Setting up LLM

In [5]:
#######------------- LLM-------------####
# Set the device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

# Create a summarization pipeline
text_generation_pipeline = pipeline(
    "summarization",
    model=model,
    tokenizer=tokenizer,
    max_length=800,
    temperature=0.1,
    device=device
)

llm = HuggingFacePipeline(pipeline=text_generation_pipeline)
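
Long meetings can produce transcripts that exceed what the summarizer accepts in one pass (BART-based models such as `facebook/bart-large-cnn` are limited to roughly 1024 input tokens). The following is a minimal sketch of splitting a transcript into word-based chunks before summarizing. The helper `chunk_words` and the 600-word chunk size are illustrative assumptions, not part of this notebook, and word counts only approximate token counts.

```python
def chunk_words(text: str, max_words: int = 600) -> list[str]:
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper: the chunk size is an assumption chosen to stay
    comfortably below the model's input limit, not a tuned value.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Stand-in transcript of 1300 words, used only to show the chunk sizes.
transcript = " ".join(f"word{i}" for i in range(1300))
chunks = chunk_words(transcript)
print([len(c.split()) for c in chunks])
```

Each chunk could then be summarized separately and the partial summaries joined, at the cost of losing some context across chunk boundaries.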

## Step 2: Setting Prompt Template and LLMChain

In [6]:
#######------------- Prompt Template-------------####

temp = """
List the key points with details from the context:
The context : {context}
"""
pt = PromptTemplate(
    input_variables=["context"],
    template= temp)

prompt_to_LLAMA2 = LLMChain(llm=llm, prompt=pt)
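
To see what text the model receives, it can help to render the template by hand. The sketch below uses plain `str.format` as a stand-in for the prompt template's placeholder substitution; `render_prompt` and the sample sentence are hypothetical and only illustrate the idea.

```python
# Same template text as in the cell above.
temp = """
List the key points with details from the context:
The context : {context}
"""


def render_prompt(template: str, context: str) -> str:
    """Fill the {context} placeholder (plain-Python stand-in for the template)."""
    return template.format(context=context)


# Made-up transcript sentence, for illustration only.
rendered = render_prompt(temp, "Alice will send the budget by Friday.")
print(rendered)
```

The rendered string, instruction line included, is what gets forwarded to the summarization pipeline.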


## Step 3: Speech to text transcription function

In [7]:
#######------------- Speech2text-------------####

def transcript_audio(audio_file):
    # Initialize the speech recognition pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-tiny.en",
        chunk_length_s=30
    )

    # Transcribe the audio file to text
    transcript_txt = pipe(audio_file, batch_size=8)["text"]
    # run the chain to merge transcript text with the template and send it to the LLM
    result = prompt_to_LLAMA2.run(transcript_txt)
    return result
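
The function above builds the Whisper pipeline on every call, so the model is reloaded for each upload. One possible alternative is to build it once and reuse it. The sketch below assumes a small module-level cache; `get_asr_pipeline`, `_ASR_CACHE`, and the `factory` parameter are hypothetical names introduced here so the caching logic can be tried without downloading a model.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical module-level cache: model name -> loaded pipeline.
_ASR_CACHE: Dict[str, Any] = {}


def get_asr_pipeline(
    model_name: str = "openai/whisper-tiny.en",
    factory: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Return a cached speech-recognition pipeline, building it on first use."""
    if model_name not in _ASR_CACHE:
        if factory is None:
            # Imported lazily so the cache can be exercised without transformers.
            from transformers import pipeline

            _ASR_CACHE[model_name] = pipeline(
                "automatic-speech-recognition",
                model=model_name,
                chunk_length_s=30,
            )
        else:
            _ASR_CACHE[model_name] = factory(model_name)
    return _ASR_CACHE[model_name]


# Stand-in factory that records how often a "model" is built.
calls = []


def fake_factory(name: str) -> Callable[..., dict]:
    calls.append(name)
    return lambda audio_file, **kwargs: {"text": f"transcript of {audio_file}"}


first = get_asr_pipeline("demo-model", factory=fake_factory)
second = get_asr_pipeline("demo-model", factory=fake_factory)
print(first is second, len(calls))
```

Inside `transcript_audio`, the `pipeline(...)` call could then be replaced by `get_asr_pipeline()`, leaving the rest of the function unchanged.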


## Step 4: Set up Gradio Interface

In [8]:
#######------------- Gradio-------------####
audio_input = gr.Audio(sources="upload", type="filepath")
output_text = gr.Textbox()
# Create the Gradio interface with the function, inputs, and outputs
iface = gr.Interface(
    fn=transcript_audio,
    inputs=audio_input,
    outputs=output_text,
)
iface.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://cadeccaf6829e9f610.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
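
The function passed to `gr.Interface` may be called without an uploaded file, in which case the argument would be `None` and the transcription step would fail. A minimal sketch of a guard, assuming that behaviour: `guarded` and the messages it returns are hypothetical, and `fake_summarize` stands in for `transcript_audio` so the example runs on its own.

```python
from typing import Callable, Optional


def guarded(fn: Callable[[str], str]) -> Callable[[Optional[str]], str]:
    """Wrap fn so a missing upload or a failure yields a readable message."""

    def wrapper(audio_file: Optional[str]) -> str:
        if not audio_file:
            return "Please upload an audio file first."
        try:
            return fn(audio_file)
        except Exception as exc:
            # Show the failure in the output textbox instead of raising.
            return f"Could not process the audio: {exc}"

    return wrapper


# Stand-in for transcript_audio, for illustration only.
def fake_summarize(path: str) -> str:
    return f"summary of {path}"


safe_fn = guarded(fake_summarize)
print(safe_fn(None))
print(safe_fn("meeting.mp3"))
```

In the notebook, `fn=guarded(transcript_audio)` could be passed to `gr.Interface` in place of `fn=transcript_audio`.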


