### Build the Incident Navigator pipeline

In this notebook we build and evaluate our pipeline.

![Image](pipeline.png)

Note that some cell outputs are missing. Because of the rate limits on GroqCloud, we had to run this notebook multiple times over the course of the project, and not all cells were run every time.
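One common way to cope with rate limits like these is to retry a failed call after an increasing delay. The following is a minimal sketch of that idea, not the approach used in this notebook; `call_with_backoff` and its parameters are hypothetical names introduced for illustration.

```python
import time

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponentially growing delays when it raises."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Give up after the last attempt so the error is not silently lost.
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

Wrapping a chain call as `call_with_backoff(lambda: chain.invoke(inputs))` would retry it up to five times. The `sleep` argument exists only so the delay can be replaced in tests.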

#### Library Imports

In [None]:
import pandas as pd
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_core.documents import Document
import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore
from weaviate.classes.query import Filter
from pymongo import MongoClient
from langchain.retrievers.document_compressors.base import BaseDocumentCompressor
from langchain.retrievers import ContextualCompressionRetriever
from flashrank import Ranker, RerankRequest
from typing import Optional
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain.memory import ConversationBufferWindowMemory

import warnings
warnings.filterwarnings("ignore")

weaviate_client = weaviate.connect_to_local(port=8081)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/gtr-t5-large", cache_folder="./embedding_model")
MONGO_URI = "mongodb://root:root@localhost:27017/"
DATABASE_NAME = "incident_db"
COLLECTION_NAME = "incident_collection"

In [2]:
from pydantic import root_validator

class CustomReranker(BaseDocumentCompressor):
    """Document compressor using Flashrank interface."""

    client: Ranker
    """Flashrank client to use for compressing documents"""
    top_n: int = 3
    """Number of documents to return."""
    model: Optional[str] = None
    """Model to use for reranking."""

    class Config:
        extra = 'forbid'
        arbitrary_types_allowed = True

    @root_validator(pre=True)
    def validate_environment(cls, values):
        """Validate that the flashrank package is installed and build the Ranker client."""
        try:
            from flashrank import Ranker
        except ImportError:
            raise ImportError(
                "Could not import flashrank python package. "
                "Please install it with `pip install flashrank`."
            )

        values["model"] = values.get("model", "ms-marco-MiniLM-L-12-v2")
        values["client"] = Ranker(model_name=values["model"], cache_dir="reranker")
        return values

    def compress_documents(
        self,
        documents,
        query,
        callbacks = None):
        passages = [
            {"id": i, "text": doc.page_content, "metadata": doc.metadata} for i, doc in enumerate(documents)
        ]
        rerank_request = RerankRequest(query=query, passages=passages)
        rerank_response = self.client.rerank(rerank_request)[:self.top_n]
        final_results = []
        for r in rerank_response:
            doc = Document(
                page_content=r["text"],
                metadata={
                    **r['metadata'],
                    "id": r["id"],
                    "relevance_score": r["score"]
                },
            )
            final_results.append(doc)
        return final_results
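The core of `compress_documents` is "score every passage against the query, then keep the best `top_n`". The sketch below illustrates that step in isolation, assuming a toy word-overlap score in place of the FlashRank model; `rerank` and `word_overlap` are hypothetical helpers, not part of FlashRank or LangChain.

```python
def rerank(query, passages, score_fn, top_n=3):
    """Score every passage against the query and keep the top_n best."""
    scored = [{**p, "score": score_fn(query, p["text"])} for p in passages]
    scored.sort(key=lambda p: p["score"], reverse=True)
    return scored[:top_n]

def word_overlap(query, text):
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0

passages = [
    {"id": 0, "text": "Crane collapse on a construction site"},
    {"id": 1, "text": "Fire in a chemical storage warehouse"},
    {"id": 2, "text": "Chemical spill near a storage tank"},
]
top = rerank("fire in chemical storage", passages, word_overlap, top_n=2)
print([p["id"] for p in top])  # prints [1, 2]
```

A cross-encoder such as the one FlashRank loads would replace `word_overlap` with a learned relevance score; the sorting and truncation stay the same.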

#### Build Chatbot

In [3]:
compressor = CustomReranker()

def create_retriever(industries):
    filters = None
    if industries != 'all':
        filters = Filter.any_of([Filter.by_property("industry").equal(industry) for industry in industries])
    db = WeaviateVectorStore(client=weaviate_client, index_name="incident", text_key="text", embedding=embeddings)
    compression_retriever = ContextualCompressionRetriever(
        base_compressor = compressor,
        base_retriever = db.as_retriever(search_type="mmr", search_kwargs={"fetch_k": 20, 'filters': filters})
    )
    return compression_retriever

def get_documents_ids(retrieved_docs):
    if retrieved_docs:
        return [int(doc.metadata['incident_id']) for doc in retrieved_docs]
    else:
        return None

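def get_documents_by_ids(ids):
    # Nothing was retrieved, so there is nothing to look up.
    if not ids:
        return []
    client = None
    try:
        client = MongoClient(MONGO_URI)
        collection = client[DATABASE_NAME][COLLECTION_NAME]
        return list(collection.find({"accident_id": {"$in": ids}}))
    except Exception as e:
        # Report the failure instead of silently returning an empty context.
        print(f"Could not fetch documents from MongoDB: {e}")
        return []
    finally:
        # client stays None if creating the connection itself failed.
        if client is not None:
            client.close()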

In [4]:
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain.prompts import ChatPromptTemplate
import json
from datetime import datetime

llm = ChatOpenAI(
  openai_api_base="https://api.groq.com/openai/v1/",
  model = "llama-3.3-70b-versatile",
  temperature=0.7,
  api_key=""
)

CONTEXT_TEMPLATE = """
<|start_header_id|>system<|end_header_id|>
Given a discussion history and a follow-up question, rewrite the follow-up question to be fully self-contained and understandable without the context of the previous conversation. Keep it as close as possible to the original meaning but include any relevant details from the history if they add clarity or context. If no additional context is needed, leave the question unchanged.
Discussion history:{chat_history}
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Question: {question}
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
Standalone question:"""

CONTEXT_PROMPT = ChatPromptTemplate.from_template(CONTEXT_TEMPLATE)

SYSTEM_TEMPLATE = """
<|start_header_id|>system<|end_header_id|>
You are IncidentNavigator, an AI designed to assist in managing and understanding incidents using a dataset of incident records. Your role is to provide precise, concise, and clear responses based on the context of the documents you receive. If a question falls outside of the information available in the provided context, you should clearly state that you cannot provide an answer but will offer the best response based on what is available.
The documents you process include the following fields:
- accident_id: Unique identifier for each incident.
- event_type: Category of the incident (e.g., fire, collision).
- industry_type: The sector or industry where the incident occurred (e.g., construction, transportation).
- accident_title: A brief, descriptive title for the accident.
- start_date: The date and time the incident began.
- finish_date: The date and time the incident ended or was resolved.
- accident_description: A detailed account of how the accident occurred.
- causes_of_accident: Factors or conditions leading to the incident.
- consequences: Outcomes or impacts of the incident (e.g., injuries, damage).
- emergency_response: Immediate actions taken to manage the incident.
- lesson_learned: Insights or recommendations for future prevention.
- url: Reference link to the document webpage.
When answering questions, follow these guidelines:
- Context Provided: If the context includes information related to these fields, provide a direct and detailed response based on the relevant data.
- Context Missing or Insufficient: If no context or relevant information is provided:
  - State that you cannot provide a definitive answer because the requester does not have sufficient privileges or the information is unavailable.
  - Do not speculate but offer a general response or guidance based on the type of question, when possible.
Context: {context}
IMPORTANT: KEEP YOUR ANSWERS AS CONCISE AND RELEVANT AS POSSIBLE, DON'T GIVE OUT UNNECESSARILY LONG ANSWERS.
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Question: {question}
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
Answer:
"""

SYSTEM_PROMPT = ChatPromptTemplate.from_template(SYSTEM_TEMPLATE)

class CustomJSONEncoder(json.JSONEncoder):
  def default(self, obj):
      if isinstance(obj, datetime):
          return obj.isoformat()
      return super().default(obj)
  
def retrieve(data):
  query = data['question']
  retriever = create_retriever('all')
  docs = retriever.invoke(query)
  ids = get_documents_ids(docs)
  retrieved_docs = get_documents_by_ids(ids)
  for document in retrieved_docs:
      document.pop("_id", None)
  data['context'] = retrieved_docs
  return data

def get_industry(placeholder = None):
  return ['processing of metals', 'power generation']

#### Evaluation

**Create the test set**

In [5]:
test_set = pd.read_csv("../../data/test.csv")
test_set

| | Input | Reference |
|---|---|---|
| 0 | How should I respond to a fire in an industria... | Immediately activate fire alarm, evacuate non-... |
| 1 | What are signs of an imminent explosion risk i... | Abnormal pressure readings, unusual temperatur... |
| 2 | I'm seeing sludge overflow in my refinery's bi... | Likely filamentous bacteria growth caused by: ... |
| 3 | What immediate actions should be taken if toxi... | Activate emergency alarms, evacuate personnel ... |
| 4 | How do I safely handle a chemical spill in a p... | Identify spilled substance from safety data sh... |
| 5 | What safety measures are essential for oxygen ... | Regular inspection of trapping sieves, tempera... |
| 6 | How should I respond to a pressure vessel show... | Evacuate area immediately, activate emergency ... |
| 7 | What are warning signs of a runaway chemical r... | Unexpected temperature increase, unusual color... |
| 8 | What precautions are needed when handling sodi... | Keep away from moisture and heat above 40°C, a... |
| 9 | How do I respond to a leak in a gas storage fa... | Activate emergency shutdown systems, evacuate ... |


In [6]:
import time

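memory = ConversationBufferWindowMemory(k=3)
# Loads the last k exchanges; wrapped in RunnableLambda so it can be piped.
load_history = RunnableLambda(memory.load_memory_variables) | itemgetter("history")

# Rewrites a follow-up question into a standalone question.
context_chain = RunnablePassthrough.assign(chat_history=load_history) | CONTEXT_PROMPT | llm | StrOutputParser()
# Fetches the matching incident records and stores them under "context".
retrieval_chain = (
            RunnablePassthrough.assign(
                memory = load_history,
                industries = lambda _: "all"
            )
            | RunnableLambda(retrieve)
        )
# Answers the question from the retrieved context.
answer_chain = SYSTEM_PROMPT | llm | StrOutputParser()

def get_answer_and_context(question):
    standalone_question = context_chain.invoke({"question": question})
    data = retrieval_chain.invoke({"question": standalone_question})
    answer = answer_chain.invoke(data)
    return data['context'], answer

retrieved_contexts = []
responses = []

for index, row in test_set.iterrows():
    time.sleep(60)  # pause between questions because of the GroqCloud rate limit
    retrieved_context, response = get_answer_and_context(row["Input"])
    retrieved_contexts.append(retrieved_context)
    responses.append(response)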

len(retrieved_contexts), len(responses)

INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET http://localhost:8081/v1/schema/Incident "HTTP/1.1 200 OK"
INFO:htt

(24, 24)

In [7]:
test_set["Retrieved Context"] = retrieved_contexts
test_set["Response"] = responses
test_set.to_csv("../../data/test_results_ablation2.csv", index=False)

**Get metrics**

In [8]:
import pandas as pd
import json

class CustomJSONEncoder(json.JSONEncoder):
  def default(self, obj):
      if isinstance(obj, datetime):
          return obj.isoformat()
      return super().default(obj)

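def convert_context_to_string(context):
    # The CSV stores each context list as its Python repr, so it is parsed back with eval.
    # eval executes arbitrary code: only use it on files produced by this notebook.
    context = eval(context)
    return [json.dumps(document, cls=CustomJSONEncoder) for document in context]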
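To make the round trip concrete, here is a small sketch of what happens to one record, assuming the stored records contain `datetime` values (the actual MongoDB documents may store dates differently). `IsoDateEncoder` is a hypothetical stand-in for `CustomJSONEncoder`, and the record is made up for illustration.

```python
import datetime
import json

class IsoDateEncoder(json.JSONEncoder):
    """Stand-in for CustomJSONEncoder: datetimes become ISO 8601 strings."""
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        return super().default(obj)

# What a list of records looks like after being written to CSV: a repr string.
records = [{"accident_id": 1, "start_date": datetime.datetime(2020, 5, 17, 8, 30)}]
stored = str(records)

# eval rebuilds the objects, but only because the `datetime` module is passed in.
# It executes arbitrary code, so it is only acceptable for trusted files.
restored = eval(stored, {"datetime": datetime})
as_json = [json.dumps(doc, cls=IsoDateEncoder) for doc in restored]
print(as_json[0])  # prints {"accident_id": 1, "start_date": "2020-05-17T08:30:00"}
```

Note that the repr refers to `datetime.datetime(...)`, so the name `datetime` must resolve to the module at the point where `eval` runs.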

In [9]:
test_set = pd.read_csv("../../data/test_results_ablation2.csv")
eval_dataset = test_set[["Input", "Retrieved Context", "Response", "Reference"]]
eval_dataset = eval_dataset.rename(columns={"Input": "user_input", "Retrieved Context": "retrieved_contexts", "Response": "response", "Reference": "reference"})
eval_dataset["retrieved_contexts"] = eval_dataset["retrieved_contexts"].apply(convert_context_to_string)

In [None]:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from ragas.metrics import LLMContextPrecisionWithoutReference, LLMContextRecall, ResponseRelevancy, Faithfulness
from ragas import SingleTurnSample
from langchain_openai.chat_models import ChatOpenAI
import time

evaluator_llm = LangchainLLMWrapper(ChatOpenAI(
  openai_api_base="https://api.groq.com/openai/v1/",
  model = "llama3-70b-8192",
  temperature=1.0,
  api_key=""
))

evaluator_embeddings = LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(model_name="sentence-transformers/gtr-t5-large", cache_folder="./embedding_model"))

metrics = [
  LLMContextPrecisionWithoutReference(llm=evaluator_llm),
  LLMContextRecall(llm=evaluator_llm),
  ResponseRelevancy(llm=evaluator_llm, embeddings=evaluator_embeddings),
  Faithfulness(llm=evaluator_llm)
]


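# Resuming after a rate-limit interruption: the four score lists already exist in
# the kernel from the earlier run, so keep the first 23 scores and score the rest.
context_precision = context_precision[:23]
context_recall = context_recall[:23]
response_relevancy = response_relevancy[:23]
faithfulness = faithfulness[:23]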

for index, row in eval_dataset[23:].iterrows():
  sample = SingleTurnSample(
    user_input=row['user_input'],
    response = row["response"],
    retrieved_contexts=row["retrieved_contexts"],
    reference=row["reference"]
  )
  print(f"Processing sample {index + 1}")
  for metric in metrics:
    score = metric.single_turn_score(sample)
    if metric.name == "llm_context_precision_without_reference":
      context_precision.append(score)
    elif metric.name == "context_recall":
      context_recall.append(score)
    elif metric.name == "answer_relevancy":
      response_relevancy.append(score)
    else:
      faithfulness.append(score)
    print(f"{metric.name}: {score}")

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/gtr-t5-large
INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: mps


Processing sample 24


INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


llm_context_precision_without_reference: 0.9999999999666667


INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 31.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


context_recall: 0.7142857142857143


INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 4.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 6.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 6.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


answer_relevancy: 0.9588536590399638


INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 13.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 37.000000 seconds
INFO:httpx:HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"


faithfulness: 0.19047619047619047
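The cell above resumes scoring from sample 24 by slicing lists that are still in the kernel from an earlier run. A possible alternative, sketched here under the assumption that scores can be stored as plain JSON, is to checkpoint after every sample so that a rerun continues automatically. `score_with_checkpoint`, `score_fn`, and the checkpoint file are hypothetical and not part of ragas.

```python
import json
import os

def score_with_checkpoint(samples, score_fn, path):
    """Score samples in order, saving after each one so a rerun can resume."""
    scores = []
    if os.path.exists(path):
        with open(path) as f:
            scores = json.load(f)
    # Skip the samples that already have a saved score.
    for sample in samples[len(scores):]:
        scores.append(score_fn(sample))
        with open(path, "w") as f:
            json.dump(scores, f)
    return scores
```

If `score_fn` raises partway through, the scores computed so far remain on disk, and calling the function again with the same `path` picks up at the first unscored sample.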


In [29]:
eval_dataset["Context Precision"] = context_precision
eval_dataset["Context Recall"] = context_recall
eval_dataset["Response Relevancy"] = response_relevancy
eval_dataset["Faithfulness"] = faithfulness
eval_dataset.to_csv("../../data/eval_results_ablation2_part1.csv", index=False)

In [30]:
df1 = pd.read_csv("../../data/eval_results_ablation2_part1.csv")
df1.to_csv("../../data/eval_results.csv", index=False)
df1

| | user_input | retrieved_contexts | response | reference | Context Precision | Context Recall | Response Relevancy | Faithfulness |
|---|---|---|---|---|---|---|---|---|
| 0 | How should I respond to a fire in an industria... | ['{"accident_id": 109, "event_type": "Major Ac... | Based on the provided context, I can offer a r... | Immediately activate fire alarm, evacuate non-... | 1.0 | 0.428571 | 0.956607 | 0.916667 |
| 1 | What are signs of an imminent explosion risk i... | ['{"accident_id": 935, "event_type": "Near Mis... | Based on the provided context, I can provide a... | Abnormal pressure readings, unusual temperatur... | 1.0 | 0.0 | 0.968016 | 0.916667 |
| 2 | I'm seeing sludge overflow in my refinery's bi... | ['{"accident_id": 1016, "event_type": "Major A... | Based on the provided context, I can provide a... | Likely filamentous bacteria growth caused by: ... | 1.0 | 1.0 | 0.819802 | 0.6 |
| 3 | What immediate actions should be taken if toxi... | ['{"accident_id": 538, "event_type": "Major Ac... | Based on the provided context, it appears that... | Activate emergency alarms, evacuate personnel ... | 0.333333 | 0.0 | 0.984302 | 0.777778 |
| 4 | How do I safely handle a chemical spill in a p... | ['{"accident_id": 1146, "event_type": "Near Mi... | Based on the provided context, I can offer a g... | Identify spilled substance from safety data sh... | 0.0 | 0.0 | 0.933813 | 0.041667 |
| 5 | What safety measures are essential for oxygen ... | ['{"accident_id": 1129, "event_type": "Major A... | Based on the provided context, it can be infer... | Regular inspection of trapping sieves, tempera... | 1.0 | 0.0 | 0.995625 | 1.0 |
| 6 | How should I respond to a pressure vessel show... | ['{"accident_id": 991, "event_type": "Major Ac... | Based on the provided context, I can offer a g... | Evacuate area immediately, activate emergency ... | 0.0 | 0.0 | 0.0 | 0.0 |
| 7 | What are warning signs of a runaway chemical r... | ['{"accident_id": 26, "event_type": "Major Acc... | Based on the provided context, I found that th... | Unexpected temperature increase, unusual color... | 0.0 | 0.375 | 0.999543 | 0.5 |
| 8 | What precautions are needed when handling sodi... | ['{"accident_id": 742, "event_type": "Major Ac... | Based on the provided context, it is mentioned... | Keep away from moisture and heat above 40°C, a... | 1.0 | 0.0 | 1.0 | 0.0 |
| 9 | How do I respond to a leak in a gas storage fa... | ['{"accident_id": 398, "event_type": "Major Ac... | Based on the provided context, I found two inc... | Activate emergency shutdown systems, evacuate ... | 0.0 | 0.0 | 0.918294 | 0.909091 |


In [31]:
print("Mean Context Precision:", df1["Context Precision"].mean())
print("Mean Context Recall:", df1["Context Recall"].mean())
print("Mean Response Relevancy:", df1["Response Relevancy"].mean())
print("Mean Faithfulness:", df1["Faithfulness"].mean())

Mean Context Precision: 0.4999999999690972
Mean Context Recall: 0.13467261904761904
Mean Response Relevancy: 0.8276305236228989
Mean Faithfulness: 0.6546235491088432
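A mean alone can hide how many samples scored exactly zero, which matters for a metric like Context Recall. The sketch below shows one way to summarise the score columns with pandas. It uses only the first three rows of the results table above, so its numbers are not the full-dataset means.

```python
import pandas as pd

# The first three rows of the results table shown above.
scores = pd.DataFrame({
    "Context Precision": [1.0, 1.0, 1.0],
    "Context Recall": [0.428571, 0.0, 1.0],
    "Response Relevancy": [0.956607, 0.968016, 0.819802],
    "Faithfulness": [0.916667, 0.916667, 0.6],
})

# Mean, minimum, and maximum of every metric in one table.
summary = scores.agg(["mean", "min", "max"])
# Share of samples where a metric is exactly zero.
zero_share = (scores == 0).mean()
print(summary.round(3))
print(zero_share)
```

Applied to `df1`, the same two lines would report the spread of each metric and the share of zero scores over all evaluated samples.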
