<a href="https://colab.research.google.com/github/ankush-003/alerts-simulation-and-remediation/blob/main/queryBot/asmr_query_bot_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **ASMR Query Bot** 🔔

## Setup

In [1]:
%pip install --quiet langchain-groq langchain langchain-community "pymongo[srv]" sentence_transformers streamlit nest_asyncio

In [2]:
import os
from google.colab import userdata

os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")
os.environ["MONGO_URI"] = userdata.get("MONGO_URI")
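
Outside Colab, `google.colab.userdata` is not available. A minimal sketch of a fallback that reads a secret from the process environment and fails early when it is missing; `require_env` and `DEMO_SECRET` are hypothetical names used only for illustration, not part of this notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Placeholder value so the snippet runs on its own; real secrets such as
# GROQ_API_KEY or MONGO_URI would be exported before starting the notebook.
os.environ.setdefault("DEMO_SECRET", "placeholder")
print(require_env("DEMO_SECRET"))
```

Checking the variables up front surfaces a missing key at setup time instead of as an authentication error deep inside a chain call.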

## Groq Test

In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

In [4]:
chat = ChatGroq(temperature=0, model_name="mixtral-8x7b-32768")

In [5]:
system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | chat
chain.invoke({"text": "Explain the importance of low latency LLMs."})

AIMessage(content='Sure, I\'d be happy to explain!\n\nLLM stands for "Low Latency Logging," which is a method of recording and transmitting data with minimal delay. Low latency LLMs are particularly important in time-sensitive applications where real-time data processing is critical. Here are some reasons why low latency LLMs are important:\n\n1. Improved user experience: In applications such as online gaming or video conferencing, low latency LLMs ensure that data is transmitted and processed quickly, resulting in a smoother and more responsive user experience.\n2. Increased efficiency: Low latency LLMs enable faster data processing, which can lead to increased efficiency in data-intensive applications such as financial trading or scientific simulations.\n3. Enhanced security: In security-critical applications, low latency LLMs can help detect and respond to threats more quickly, reducing the risk of data breaches or other security incidents.\n4. Better decision-making: In real-time d

In [6]:
chat = ChatGroq(temperature=0, model_name="llama3-70b-8192")
prompt = ChatPromptTemplate.from_messages([("human", "Write an essay about {topic} in 1000 words")])
chain = prompt | chat
for chunk in chain.stream({"topic": "The Moon"}):
    print(chunk.content, end="", flush=True)

The Moon: Earth's Celestial Companion

The Moon, Earth's sole natural satellite, has been a source of fascination and wonder for humans since the dawn of time. This celestial body has been an integral part of our planet's history, influencing the tides, stabilizing Earth's axis, and inspiring countless scientific, artistic, and cultural achievements. In this essay, we will delve into the Moon's formation, composition, and significance, as well as its impact on human society and our understanding of the universe.

Formation and Composition

The Moon is believed to have formed approximately 4.5 billion years ago, not long after the formation of the Earth. One of the most widely accepted theories is the giant impact hypothesis, which suggests that a massive object collided with Earth, causing debris to be ejected into orbit and eventually coalesce into the Moon. This collision is thought to have occurred when the solar system was still in its early stages of formation, and the resulting d

## Vector Search Test
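
The cell below assumes that an Atlas Vector Search index named `alert_index` already exists on the `alert_embed` collection. The notebook does not show that index, so the definition sketched here is an assumption: `BAAI/bge-large-en-v1.5` produces 1024-dimensional vectors, the vector field is taken to be `embedding` (the key that appears in the retrieved document metadata), and cosine similarity is a guess that suits normalized embeddings.

```python
import json

# Assumed index definition; not taken from the notebook.
EMBEDDING_DIMENSIONS = 1024  # output size of BAAI/bge-large-en-v1.5

index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",  # field that stores the vectors
            "numDimensions": EMBEDDING_DIMENSIONS,
            "similarity": "cosine",  # assumption, see the note above
        }
    ]
}

# Print the JSON form that can be pasted into the Atlas index editor.
print(json.dumps(index_definition, indent=2))
```

If the dimension in the index does not match the embedding model, queries are likely to fail or return nothing, so it is worth checking this value when swapping models.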

In [12]:
import os

import nest_asyncio
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.memory import ChatMessageHistory
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import MongoDBAtlasVectorSearch
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_groq import ChatGroq

# Allow nested event loops inside the notebook's already-running loop.
nest_asyncio.apply()

# config
database = "AlertSimAndRemediation"
collection = "alert_embed"
index_name = "alert_index"

# llm
chat = ChatGroq(temperature=0, model_name="mixtral-8x7b-32768")

# embedding model
embedding_args = {
    "model_name" : "BAAI/bge-large-en-v1.5",
    "model_kwargs" : {"device": "cpu"},
    "encode_kwargs" : {"normalize_embeddings": True}
}
embedding_model = HuggingFaceEmbeddings(**embedding_args)

# chat history
# chat_history = ChatMessageHistory()

# vector search
vector_search = MongoDBAtlasVectorSearch.from_connection_string(
    os.environ["MONGO_URI"],
    f"{database}.{collection}",
    embedding_model,
    index_name=index_name,
)

qa_retriever = vector_search.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2},
)

# contextualising prev chats
contextualize_q_system_prompt = """Given a chat history and the latest user question \
which might reference context in the chat history, formulate a standalone question \
which can be understood without the chat history. Do NOT answer the question, \
just reformulate it if needed and otherwise return it as is."""
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    chat, qa_retriever, contextualize_q_prompt
)

# prompt
system_prompt = """
You are a helpful query assistant for Alertmanager, an open-source system for monitoring and alerting on system metrics. Your goal is to accurately answer questions related to alerts triggered within the Alertmanager system based on the alert information provided to you. \
You will be given details about specific alerts, including the alert source, severity, category, and any other relevant metadata. Using this information, you should be able to respond to queries about the nature of the alert, what it signifies, potential causes, and recommended actions or troubleshooting steps. \
Your responses should be clear, concise, and tailored to the specific alert details provided, while also drawing from your broader knowledge about Alertmanager and monitoring best practices when relevant. If you cannot provide a satisfactory answer due to insufficient information, politely indicate that and ask for any additional context needed. \

<context>
{context}
</context>
"""

chat_history = []

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

# preprocessing context
def format_docs_with_metadata(docs):
    formatted_docs = []
    for i, doc in enumerate(docs, start=1):
        metadata_str = "\n".join([f"**{key}**: {value}" for key, value in doc.metadata.items() if key != "embedding"])
        formatted_doc = f"**{i}.**\n{doc.page_content}\n\n**Metadata:**\n{metadata_str}"
        formatted_docs.append(formatted_doc)
    return "\n\n".join(formatted_docs)

question_answer_chain = create_stuff_documents_chain(chat, qa_prompt) | StrOutputParser()

# output parser
response_schemas = [
    ResponseSchema(name="answer", description="answer to the user's question"),
    ResponseSchema(
        name="source",
        description="source used to answer the user's question, should be a website.",
    )
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)


rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

# managing message history
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

# schema
# print(conversational_rag_chain.input_schema.schema())
# print(conversational_rag_chain.output_schema.schema())


# Retrieves documents
# retriever_chain = create_history_aware_retriever(chat, qa_retriever, prompt)

# retriever_chain.invoke({
#     "chat_history": chat_history,
#     "input": "Tell me about the latest alert"
# })

res = conversational_rag_chain.invoke(
    {"input": "What is the remedy to the latest alert"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # constructs a key "abc123" in `store`.
)

print(res["answer"])
print(format_docs_with_metadata(res["context"]))

for chunk in conversational_rag_chain.stream(
    {"input": "What is the remedy to the latest alert"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # constructs a key "abc123" in `store`.
):
    # The retrieved documents arrive in one chunk; the answer arrives incrementally.
    if "context" in chunk:
        print(format_docs_with_metadata(chunk["context"]))
    if "answer" in chunk:
        print(chunk["answer"], flush=True, end="")


# for s in conversational_rag_chain.stream(
#     {"input": "What is the latest alert and which node is it from ?"},
#     config={
#         "configurable": {"session_id": "abc123"}
#     },
# ):
#     print(s, end="", flush=True)



**1.**
A Severe alert of category Network was raised from the Hardware source for node a4a18bf2-44fd-4a87-a7f7-89fbce94dcfb at 2024-05-08 03:04:04. The alert is Not Acknowledged. The recommended remedy is: Monitor network traffic closely.

**Metadata:**
**_id**: 663a9e4f87000b6df1a4cba2
**CreatedAt**: 2024-05-08 03:04:04
**Acknowledged**: 0
**Remedy**: Monitor network traffic closely
**Severity**: Severe
**Source**: Hardware
**node**: a4a18bf2-44fd-4a87-a7f7-89fbce94dcfb
**Category**: Network

**2.**
A Severe alert of category Network was raised from the Hardware source for node a4a18bf2-44fd-4a87-a7f7-89fbce94dcfb at 2024-05-08 03:01:48. The alert is Not Acknowledged. The recommended remedy is: Monitor network traffic closely.

**Metadata:**
**_id**: 663a9dc687000b6df1a4cb99
**Severity**: Severe
**Source**: Hardware
**node**: a4a18bf2-44fd-4a87-a7f7-89fbce94dcfb
**Category**: Network
**CreatedAt**: 2024-05-08 03:01:48
**Acknowledged**: 0
**Remedy**: Monitor network traffic closely
The
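
The per-session history lookup used in the cell above can be illustrated without LangChain. The sketch below is a simplified stand-in, not the library's implementation: `InMemoryHistory` is a hypothetical class that plays the role of `ChatMessageHistory`, and the messages are sample values.

```python
class InMemoryHistory:
    """Hypothetical stand-in for a chat history: an ordered list of (role, text)."""

    def __init__(self):
        self.messages = []

    def add_message(self, role, content):
        self.messages.append((role, content))


# One history object per session id, created lazily on first use.
store = {}


def get_session_history(session_id):
    if session_id not in store:
        store[session_id] = InMemoryHistory()
    return store[session_id]


# Two calls with the same id share one history; a new id gets its own.
get_session_history("abc123").add_message("human", "What is the remedy to the latest alert")
get_session_history("abc123").add_message("ai", "Monitor network traffic closely.")
get_session_history("other").add_message("human", "Hello")

print(len(store["abc123"].messages), len(store["other"].messages))
```

Because the store is a plain dictionary in process memory, histories are lost when the kernel restarts; a persistent backend would be needed to keep them.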

In [8]:
conversational_rag_chain.invoke(
    {"input": "can you explain the remedy that you provided previously in detail"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # reuses the "abc123" history created by the earlier call.
)



{'input': 'can you explain the remedy that you provided previously in detail',
 'chat_history': [HumanMessage(content='What is the remedy to the latest alert'),
  AIMessage(content='The remedy to the latest alert is to monitor network traffic closely.')],
 'context': [Document(page_content='A Severe alert of category Network was raised from the Hardware source for node a4a18bf2-44fd-4a87-a7f7-89fbce94dcfb at 2024-05-08 03:04:04. The alert is Not Acknowledged. The recommended remedy is: Monitor network traffic closely.', metadata={'_id': ObjectId('663a9e4f87000b6df1a4cba2'), 'embedding': [-0.0026083788834512234, 0.015334388241171837, -0.03778952360153198, 0.06690531969070435, 0.022192401811480522, -0.015791084617376328, 0.029491988942027092, -0.02906719036400318, 0.013931314460933208, 0.06609197705984116, 0.02190079726278782, -0.039952948689460754, 0.027515923604369164, -0.017939701676368713, -0.04218700900673866, -0.021478867158293724, -0.008489315398037434, -0.02031848207116127, 0.02
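
The raw result above carries the full embedding vector in each document's metadata, which floods the output. A minimal sketch of stripping that field before display, assuming the vector lives under the `embedding` key as shown; `Doc` is a hypothetical stand-in for LangChain's `Document`, and `without_embeddings` is not part of the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def without_embeddings(docs, key="embedding"):
    """Return copies of `docs` whose metadata omits the vector stored under `key`."""
    return [
        Doc(d.page_content, {k: v for k, v in d.metadata.items() if k != key})
        for d in docs
    ]


docs = [
    Doc(
        "A Severe alert of category Network was raised from the Hardware source.",
        {"Severity": "Severe", "Category": "Network", "embedding": [-0.0026, 0.0153]},
    )
]

for doc in without_embeddings(docs):
    print(doc.metadata)
```

The originals are left untouched, so the same documents can still be passed on to other steps that might need the vectors.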

In [9]:
!wget -q -O - ipv4.icanhazip.com

35.245.110.99


# **Query Bot**

In [39]:
%%writefile app.py
import os
import time

import nest_asyncio
import streamlit as st
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.chat_message_histories import StreamlitChatMessageHistory
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import MongoDBAtlasVectorSearch
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_groq import ChatGroq

nest_asyncio.apply()

st.title('ASMR Query Bot 🔔')

# config
database = "AlertSimAndRemediation"
collection = "alert_embed"
index_name = "alert_index"

# llm
# temp = 0.0
# model_name = "mixtral-8x7b-32768"
# chat = ChatGroq(temperature=temp, model_name=model_name)

# embedding model
embedding_args = {
    "model_name" : "BAAI/bge-large-en-v1.5",
    "model_kwargs" : {"device": "cpu"},
    "encode_kwargs" : {"normalize_embeddings": True}
}
embedding_model = HuggingFaceEmbeddings(**embedding_args)

# chat history
# chat_history = ChatMessageHistory()

# vector search
vector_search = MongoDBAtlasVectorSearch.from_connection_string(
    os.environ["MONGO_URI"],
    f"{database}.{collection}",
    embedding_model,
    index_name=index_name,
)

# retriever
# k = 2
# qa_retriever = vector_search.as_retriever(
#     search_type="similarity",
#     search_kwargs={"k": k},
# )

# contextualising prev chats
contextualize_q_system_prompt = """Given a chat history and the latest user question \
which might reference context in the chat history, formulate a standalone question \
which can be understood without the chat history. Do NOT answer the question, \
just reformulate it if needed and otherwise return it as is."""
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
# history_aware_retriever = create_history_aware_retriever(
#     chat, qa_retriever, contextualize_q_prompt
# )

# prompt
system_prompt = """
You are a helpful query assistant for Alertmanager, an open-source system for monitoring and alerting on system metrics. Your goal is to accurately answer questions related to alerts triggered within the Alertmanager system based on the alert information provided to you. \
You will be given details about specific alerts, including the alert source, severity, category, and any other relevant metadata. Using this information, you should be able to respond to queries about the nature of the alert, what it signifies, potential causes, and recommended actions or troubleshooting steps. \
Your responses should be clear, concise, and tailored to the specific alert details provided, while also drawing from your broader knowledge about Alertmanager and monitoring best practices when relevant. If you cannot provide a satisfactory answer due to insufficient information, politely indicate that and ask for any additional context needed. \

<context>
{context}
</context>
"""

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
# question_answer_chain = create_stuff_documents_chain(chat, qa_prompt)

# rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

# conversational_rag_chain = RunnableWithMessageHistory(
#     rag_chain,
#     lambda session_id: history,
#     input_messages_key="input",
#     history_messages_key="chat_history",
#     output_messages_key="answer",
# )

if "chat_messages" not in st.session_state:
    st.session_state.chat_messages = []

# streamlit history
history = StreamlitChatMessageHistory(key="chat_messages")

# Initialize chat history
if len(history.messages) == 0:
    history.add_ai_message("Hey, I am ASMR Query Bot. How can I help you?")

with st.sidebar:
    st.title('Settings ⚙️')

    st.subheader('Models and parameters')
    selected_model = st.sidebar.selectbox('Choose a model', ['Llama3-8B', 'Llama3-70B', 'Mixtral-8x7B'], key='selected_model')
    # Groq model IDs, written in lowercase as in the Groq test cell.
    if selected_model == 'Mixtral-8x7B':
        model_name = "mixtral-8x7b-32768"
    elif selected_model == 'Llama3-70B':
        model_name = "llama3-70b-8192"
    elif selected_model == 'Llama3-8B':
        model_name = "llama3-8b-8192"

    # min_value matches the default so the initial value lies inside the slider range.
    temp = st.sidebar.slider('temperature', min_value=0.0, max_value=1.0, value=0.0, step=0.01)
    k = st.sidebar.slider('number of docs retrieved', min_value=1, max_value=20, value=2, step=1)

def get_response(query, config):
    """Build the RAG chain from the current sidebar settings and answer `query`."""
    chat = ChatGroq(temperature=temp, model_name=model_name)
    qa_retriever = vector_search.as_retriever(
        search_type="similarity",
        search_kwargs={"k": k},
    )
    history_aware_retriever = create_history_aware_retriever(
        chat, qa_retriever, contextualize_q_prompt
    )
    question_answer_chain = create_stuff_documents_chain(chat, qa_prompt)
    rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
    conversational_rag_chain = RunnableWithMessageHistory(
        rag_chain,
        lambda session_id: history,
        input_messages_key="input",
        history_messages_key="chat_history",
        output_messages_key="answer",
    )
    # Use the function argument rather than the module-level `prompt` variable.
    return conversational_rag_chain.invoke({"input": query}, config=config)


def clear_chat_history():
    st.session_state.chat_messages = []
    history.add_ai_message("Hey, I am ASMR Query Bot. How can I help you?")

st.sidebar.button('Clear Chat History', on_click=clear_chat_history)

for msg in history.messages:
    st.chat_message(msg.type).write(msg.content)


# preprocessing context
def format_docs_with_metadata(docs):
    formatted_docs = []
    for i, doc in enumerate(docs, start=1):
        metadata_str = "\n".join([f"**{key}**: `{value}`\n" for key, value in doc.metadata.items() if key != "embedding"])
        formatted_doc = f"- {doc.page_content}\n\n**Metadata:**\n{metadata_str}"
        formatted_docs.append(formatted_doc)
    return "\n\n".join(formatted_docs)

def stream_data(response):
    # Yield the answer word by word to simulate a typing effect.
    for word in response.split(" "):
        yield word + " "
        time.sleep(0.05)

if prompt := st.chat_input():
    with st.chat_message("Human"):
        st.markdown(prompt)

    # As usual, new messages are added to StreamlitChatMessageHistory when the Chain is called.
    config = {"configurable": {"session_id": "any"}}
    res = get_response(prompt, config)

    with st.chat_message("AI"):
        st.write_stream(stream_data(res['answer']))
        with st.popover("View Source"):
            st.markdown("### Source Alerts 📢")
            st.markdown(format_docs_with_metadata(res['context']))

Overwriting app.py
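
The sidebar's chain of `if`/`elif` branches that maps a display label to a Groq model ID could also be written as a lookup table. This is a sketch of that alternative, not the app's code: `MODEL_IDS` and `resolve_model` are hypothetical names, and the IDs are written in lowercase to match the form used in the Groq test cell (the 8B ID is lowercased by analogy).

```python
# Display label shown in the select box -> model ID passed to ChatGroq.
MODEL_IDS = {
    "Llama3-8B": "llama3-8b-8192",
    "Llama3-70B": "llama3-70b-8192",
    "Mixtral-8x7B": "mixtral-8x7b-32768",
}


def resolve_model(label: str) -> str:
    """Return the model ID for a display label, or raise on an unknown label."""
    try:
        return MODEL_IDS[label]
    except KeyError:
        raise ValueError(f"Unknown model label: {label!r}") from None


print(resolve_model("Mixtral-8x7B"))
```

With a table, the select box options can be generated from `list(MODEL_IDS)`, so adding a model means touching one line instead of two.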


In [40]:
!streamlit run app.py & npx localtunnel --port 8501


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Network URL: http://172.28.0.12:8501
  External URL: http://35.245.110.99:8501

npx: installed 22 in 2.181s
your url is: https://chatty-items-wave.loca.lt
from langchain_community.output_parsers.rail_parser import GuardrailsOutputParser
Parent run 8a72943d-01de-484b-903b-20b97401bb62 not found for run 14a94868-a0d0-4c03-ad04-2d584c81268e. Treating as a root run.
Parent run b234dc8b-e80c-40a3-905e-90e841fcc523 not found for run 0ca79993-616c-4f15-b327-241f1de2c998. Treating as a root run.
Parent run cf664a1b-4fdc-4992-ae47-7ca240a248d8 not found for run 947aa2ad-547d-4c26-b0df-6268ebc90ab0. Treating as a root run.
Parent run a2c2c0dd-66cd-4408-a00c-8032e97e5ab1 not found for run c1add834-29b9-42b8-abec-f207ec56c563. Treating as a root run.
^C
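
The launch command starts Streamlit and the tunnel at the same time, so the tunnel URL may be live before the app is listening. A minimal sketch of polling a local port until it accepts connections; `wait_for_port` is a hypothetical helper, and the demo probes a throwaway listener rather than a real Streamlit process.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait and retry.
            time.sleep(interval)
    return False


# Demo: open a listener on a free port chosen by the OS and probe it.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]
print(wait_for_port("127.0.0.1", demo_port, timeout=2.0))
server.close()
```

In the notebook, the same check could be run against port 8501 between starting Streamlit and starting the tunnel.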
