## LangChain Agent + Speech Recognition
* **Speech-to-Text Model:** Whisper (transcribes audio files)
* **LLM:** gpt-3.5-turbo
* **Tool used for Deployment:** Gradio
* **Chatbot:** ServiceNow QA Agent - Text and Audio support

## Step 1: Import Libraries

In [1]:
import whisper
from langchain.tools import Tool
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
import gradio as gr

## Step 2: Whisper Model for Transcriptions

In [2]:
whisper_model = whisper.load_model("base")
def transcribe_audio(file_path):
    print("DEBUG: file_path =", file_path, type(file_path))
    result = whisper_model.transcribe(file_path)
    return result["text"]

## Step 3: RAG QA Tool Definition

In [None]:
def rag_qa(query):
    # rag_pipeline is not defined in this notebook; it has to come from an earlier cell or module.
    return rag_pipeline(query) # type: ignore
qa_tool = Tool(
    name="YouTubeQA",
    func=rag_qa,
    description="Answer questions about YouTube videos using RAG."
)
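
The tool above delegates to `rag_pipeline`, which this notebook never defines. As a minimal stand-in, assuming you only need something callable to exercise the agent wiring, the sketch below does plain word-overlap retrieval over a tiny in-memory list. The `DOCUMENTS` list, the `_tokens` helper, and the fallback message are all hypothetical; a real pipeline would retrieve from your own indexed content and pass the result to an LLM.

```python
import re

# Hypothetical placeholder corpus; swap in your own indexed passages.
DOCUMENTS = [
    "CMDB is the Configuration Management Database that stores configuration items.",
    "Incident management restores normal service operation after an interruption.",
    "A knowledge base article documents a known fix for a recurring problem.",
]

def _tokens(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rag_pipeline(query, documents=DOCUMENTS):
    """Return the passage sharing the most words with the query.

    Placeholder retrieval only: there is no embedding model and no generation step.
    """
    query_tokens = _tokens(query)
    best_doc, best_score = None, 0
    for doc in documents:
        score = len(query_tokens & _tokens(doc))
        if score > best_score:
            best_doc, best_score = doc, score
    if best_doc is None:
        return "No relevant passage found."
    return best_doc
```

With this defined before the `Tool(...)` cell, `rag_qa` resolves its dependency and the agent can call the tool end to end.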

## Step 4: LLM Model with Agent

In [4]:
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
tools = [qa_tool]
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)



## Step 5: Deploy Interface for Audio

In [5]:
def process_audio_with_agent(audio_file):
    text_query = transcribe_audio(audio_file)
    answer = agent.run(input=text_query)
    return f"Q: {text_query}\n\nA: {answer}"

demo = gr.Interface(
    fn=process_audio_with_agent,
    inputs=gr.Audio(type="filepath", label="Record or Upload Audio"),
    outputs=gr.Textbox(label="Q & A", lines=6),
    title="ServiceNow QA Assistant: Ask by Voice!",
    description="The answer will appear below your transcribed question."
)
demo.launch()
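
`process_audio_with_agent` passes whatever Gradio hands it straight to Whisper. When the user submits without recording or uploading anything, `gr.Audio` delivers `None`, and the transcription call fails. The sketch below shows one way to guard against that. The function name `answer_from_audio` and the two messages are hypothetical, and the transcriber and agent are passed in as plain callables so the logic can be tried without loading a model.

```python
def answer_from_audio(audio_file, transcribe, ask):
    """Transcribe the audio and query the agent, with guards for empty input.

    transcribe: callable taking a file path and returning text
    ask: callable taking the question text and returning the answer
    """
    if not audio_file:
        # Nothing was recorded or uploaded.
        return "No audio received. Please record or upload a file."
    text_query = transcribe(audio_file).strip()
    if not text_query:
        # The transcription came back empty (for example, silence).
        return "The recording could not be transcribed. Please try again."
    return f"Q: {text_query}\n\nA: {ask(text_query)}"
```

In the notebook this could be wired up as `fn=lambda f: answer_from_audio(f, transcribe_audio, lambda q: agent.run(input=q))`.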

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.




## Step 6: Deploy Interface for Audio and Text

In [8]:
import gradio as gr

def process_query(text_query, audio_file, history):
    if audio_file is not None:
        text_query = transcribe_audio(audio_file)
        question = text_query
    elif text_query and text_query.strip():
        question = text_query.strip()
    else:
        # None as the user message leaves the user side blank, so the prompt shows as a bot reply.
        return history + [(None, "Please enter a question or upload audio.")]
    answer = agent.run(input=question)
    history = history + [(f"User: {question}", f"Agent: {answer}")]
    return history

with gr.Blocks() as demo:
    gr.HTML("<h1 style='text-align: center;'>ServiceNow QA Agent</h1>")
    gr.Markdown("<center>Type or record your question below. The bot will provide an answer.</center>")
    chatbot = gr.Chatbot(label="Conversation")
    with gr.Row():
        text_input = gr.Textbox(label="Type Your Question", lines=2)
        audio_input = gr.Audio(type="filepath", label="Or Record/Upload Audio")
    submit = gr.Button("Submit")
    #clear = gr.Button("Clear Chat")

    state = gr.State([])  # to hold chat history

    submit.click(
        process_query,
        inputs=[text_input, audio_input, state],
        outputs=chatbot
    ).then(
        lambda history: history,  # update state with latest history
        inputs=chatbot,
        outputs=state
    )
    #clear.click(
        #lambda: [],
        #None,
        #[chatbot, state]
    #)

demo.launch(share=True)
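
The input-selection rule in `process_query` (audio takes priority over typed text, and empty input produces a prompt instead of an agent call) can be checked without launching Gradio. The sketch below separates that rule into a helper. The names `resolve_question` and `handle_query` are hypothetical, and the transcriber and agent are injected as callables, so this is an illustration of the logic rather than a drop-in replacement.

```python
def resolve_question(text_query, audio_file, transcribe):
    """Pick the question text: audio wins over typed text; return None if neither is usable."""
    if audio_file is not None:
        return transcribe(audio_file).strip() or None
    if text_query and text_query.strip():
        return text_query.strip()
    return None

def handle_query(text_query, audio_file, history, transcribe, ask):
    """Return a new history list with the latest (user, bot) pair appended."""
    question = resolve_question(text_query, audio_file, transcribe)
    if question is None:
        # Assumption: in Gradio's tuple-style chat history, None hides that side of the pair.
        return history + [(None, "Please enter a question or upload audio.")]
    return history + [(question, ask(question))]
```

Because `handle_query` builds a new list instead of mutating `history`, the value stored in `gr.State` only changes when the returned list is written back.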




* Running on local URL:  http://127.0.0.1:7863
* Running on public URL: https://9bce2684e90b149d41.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




DEBUG: file_path = C:\Users\Mercy\AppData\Local\Temp\gradio\78be28e7dc6119d1c446aefcfb9d43aba51d7126525f37211bfd2c3f5407d710\Question 1.m4a <class 'str'>

 What are the AI functionalities available in ServiceNow?

> Entering new AgentExecutor chain...

Invoking: `YouTubeQA` with `AI functionalities in ServiceNow`

DEBUG: file_path = C:\Users\Mercy\AppData\Local\Temp\gradio\320ad5ef4328994efffd8723bc1e9ee9b45a6ee431a9e831cdd6f0eab076bc97\Question 2.mp3 <class 'str'>

 What is CMDB in service now?

> Entering new AgentExecutor chain...
CMDB stands for Configuration Management Database in ServiceNow. It is a centralized repository that stores information about all the configuration items (CIs) in an organization's IT infrastructure. This includes hardware, software, applications, and other components that are essential for delivering IT services. The CMDB in ServiceNow helps organizations track and manage their IT assets, relationships between assets, and the impact of changes on the IT environment. It plays a crucial role in IT service management and helps organizations improve service delivery, reduce risks, and enhance decision-making.

> Finished chain.


## Note: Audio files for upload should be in mp3 format
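
One way to enforce the format note before a file reaches Whisper is a simple extension check. This is a sketch under stated assumptions: `ALLOWED_EXTENSIONS` and `is_supported_audio` are hypothetical names, and the allowed set contains only `.mp3` because that is what the note specifies. Whisper decodes audio through ffmpeg, so the set can be widened if other formats work in your environment.

```python
from pathlib import Path

# Assumption: only mp3 is accepted, per the note above. Extend the set as needed.
ALLOWED_EXTENSIONS = {".mp3"}

def is_supported_audio(file_path, allowed=ALLOWED_EXTENSIONS):
    """Return True if the file's extension (case-insensitive) is in the allowed set."""
    return Path(file_path).suffix.lower() in allowed
```

A handler can call `is_supported_audio(audio_file)` first and return a short message asking for an mp3 file when it is `False`.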