## Loading Environment Variables

In [1]:
import os

from dotenv import load_dotenv
load_dotenv()

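# load_dotenv() has already copied the .env values into os.environ,
# so only the tracing flag needs to be set explicitly.
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Fail early with a clear message if a required variable is missing
# (assigning None to os.environ would raise a TypeError).
required = ["LANGCHAIN_API_KEY", "OPENAI_API_KEY", "PINECONE_API_KEY", "LANGCHAIN_PROJECT"]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise EnvironmentError(f"Missing environment variables: {', '.join(missing)}")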

## Importing Libraries

In [2]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import OpenAIEmbeddings # To create embeddings
from langchain_openai import ChatOpenAI
from langchain_pinecone import PineconeVectorStore # To connect with the Vectorstore

import helper_hallucination as hc # includes helper functions to compute hallucination score

## RAG Pipeline

In [3]:
# Defining constants
INDEX_NAME = 'earning-calls'
TOP_K = 2
QUARTER = "Q1"
FILENAME = "Adani Enterprises Ltd.pdf"
YEAR = "FY24"

# initializing embedding model and generation LLM
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.5)

# loading the vectorstore index and initializing the retriever
index = PineconeVectorStore(index_name=INDEX_NAME, embedding=embeddings) # loading the index
retriver = index.as_retriever(search_kwargs={"filter": {"quarter": QUARTER, "filename": FILENAME, "year": YEAR}, "k": TOP_K})

# Defining the ChatPromptTemplate
chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert Q&A system that is trusted around the world.\nAlways answer the query using the provided context information, and not prior knowledge.\nSome rules to follow:\n1. Never directly reference the given context in your answer.\n2. Avoid statements like 'Based on the context, ...' or 'The context information ...' or anything along those lines."),
        ("human", "Context information is below.\n---------------------\n{context}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {query}\nAnswer: "),
    ]
)

# helper function to format context in the prompt
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# defining the RAG chain
rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | chat_template
    | llm
    | StrOutputParser()
)

# Adding a chain that also returns the retrieved documents in the pipeline output
rag_chain_with_source = RunnableParallel(
    {"context": retriver, "query": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs)
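
To see what ends up in the `{context}` slot of the prompt, `format_docs` can be exercised without calling Pinecone. The `FakeDoc` class below is a hypothetical stand-in that exposes only the `page_content` attribute the helper reads, and the sample strings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document; only page_content is modelled."""
    page_content: str


def format_docs(docs):
    # Same helper as in the cell above: join chunk texts with a blank line.
    return "\n\n".join(doc.page_content for doc in docs)


docs = [FakeDoc("First retrieved chunk."), FakeDoc("Second retrieved chunk.")]
context = format_docs(docs)
print(context)
```

With `TOP_K = 2`, the real chain would pass two retrieved chunks through the same join.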

In [4]:
# Defining the query
# query = "can you list the breakdown of the capex?"
query = "How does Adani Enterprises Limited contribute to the energy sector, specifically through Adani New Industries Limited?"

# invoking chain to get the response
response = rag_chain_with_source.invoke(query)
answer = response['answer']
print(answer)

Adani Enterprises Limited contributes to the energy sector through its significant investment in the green hydrogen business under Adani New Industries Limited. This initiative is part of a broader ESG philosophy embedded in the company's fundamental plans, focusing on sustainable and innovative energy solutions. The commitment to green hydrogen is highlighted by recognition received, such as the Aegis Graham Bell award for Innovation in Manufacturing, which underscores the company's efforts in advancing energy conservation and sustainable practices within the sector.


## Hallucination

`Hallucination` refers to instances where the model generates information that sounds plausible but is actually incorrect or made up. This happens because the model fills gaps by following patterns from its training data, without verifying the facts. It's like confidently telling a story with details that aren't real.

To check if a response from a language model (LLM) is reliable, we can use a method that involves generating multiple sample responses and comparing them to the original response. Here's how it works:

1. **Generate Samples**: Use the same LLM that produced the original response to generate several additional sample responses to the same prompt.
   
2. **Consistency Check**: Compare the original response with each of these samples to see how consistent they are. This involves two main checks:
   - **LLM-based Check**: Ask the LLM itself to assess the consistency between the original response and each sample. This is done in halves and the scores are averaged.
   - **Semantic Similarity Check**: Measure how similar the meaning of each sentence in the original response is to the corresponding sentences in the samples. These scores are averaged as well.

3. **Calculate Hallucination Score**: Combine the results of the LLM-based check and the semantic similarity check to get a final score. This score helps determine how much you can trust the original response.

Credits: `Langkit`. The implementation used here is taken from it, and according to the library the approach is inspired by [SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models](https://arxiv.org/abs/2303.08896)
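
The semantic similarity check from step 2 can be sketched in a few lines. This is an illustration under assumptions, not Langkit's code: `embed` is a hypothetical bag-of-words stand-in for a real sentence-embedding model, sentences are split naively on periods, and scoring each sentence as one minus its best cosine match in a sample is an assumed way of turning similarity into an inconsistency score.

```python
import math
from collections import Counter


def embed(sentence):
    """Hypothetical stand-in for a sentence-embedding model: word counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(text):
    """Naive sentence splitter (assumption: periods end sentences)."""
    return [s.strip() for s in text.split(".") if s.strip()]


def semantic_inconsistency(response, samples):
    """Average, over samples and response sentences, of 1 - best cosine match."""
    scores = []
    for sample in samples:
        sample_vecs = [embed(s) for s in split_sentences(sample)]
        for sentence in split_sentences(response):
            vec = embed(sentence)
            best = max((cosine(vec, v) for v in sample_vecs), default=0.0)
            scores.append(1.0 - best)
    return sum(scores) / len(scores) if scores else 0.0


identical = semantic_inconsistency("The sky is blue.", ["The sky is blue."])
different = semantic_inconsistency("The sky is blue.", ["Cats chase mice."])
print(identical, different)
```

A response that every sample repeats almost word for word scores near zero, while a response whose sentences have no close counterpart in the samples scores near one.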

### Initializing the hallucination pipeline

In [5]:
hc.init(rag_pipeline=rag_chain_with_source, num_samples=3)

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\HP\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


### Computing the Hallucination Score

In [6]:
hallucination_result = hc.consistency_check(prompt=query, response=answer)

print(str(round(hallucination_result['final_score'] * 100, 2)) + '%')

5.75%


In [7]:
hallucination_result

{'llm_score': 0.0,
 'semantic_score': 0.11503152052561438,
 'final_score': 0.05751576026280719,
 'samples': ["Adani Enterprises Limited contributes to the energy sector through its significant investments in green hydrogen and related businesses under Adani New Industries Limited. This focus on sustainable energy solutions is part of the company's broader ESG philosophy, which emphasizes environmental responsibility and innovation in manufacturing, as evidenced by the recognition received, such as the Aegis Graham Bell award for Innovation in Manufacturing.",
  'Adani Enterprises Limited contributes to the energy sector through its significant investments in green hydrogen and similar businesses under Adani New Industries Limited. This focus on sustainable energy solutions is part of their broader ESG philosophy, which emphasizes environmentally responsible practices and innovation in manufacturing, as evidenced by their recognition with the Aegis Graham Bell award for their efforts in
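
The three numbers reported above are consistent with `final_score` being the plain average of `llm_score` and `semantic_score`. The check below uses only the values printed in this output. That the helper always combines the scores this way is an inference from this single run, not a documented guarantee.

```python
import math

# Values copied from the output above.
llm_score = 0.0
semantic_score = 0.11503152052561438
final_score = 0.05751576026280719

# Inferred combination rule: unweighted mean of the two component scores.
combined = (llm_score + semantic_score) / 2
print(math.isclose(combined, final_score))

# Same rounding as the percentage printed earlier in the notebook.
percentage = round(final_score * 100, 2)
print(f"{percentage}%")
```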

### Prompt Sample

In [16]:
print(hallucination_result['prompt_response_pair'][0]["prompt"])

Context: Adani Enterprises Limited contributes to the energy sector through its significant investments in green hydrogen and related businesses under Adani New Industries Limited. This focus on sustainable energy solutions is part of the company's broader ESG philosophy, which emphasizes environmental responsibility and innovation in manufacturing, as evidenced by the recognition received, such as the Aegis Graham Bell award for Innovation in Manufacturing.

Passage: Adani Enterprises Limited contributes to the energy sector through its significant investment in the green hydrogen business under Adani New Industries Limited.This initiative is part of a broader ESG philosophy embedded in the company's fundamental plans, focusing on sustainable and innovative energy solutions.

Is the passage supported by the context above?
Answer between: Accurate, Minor Inaccurate, Major Inaccurate

Don't include additional information/explanation. Please answer only with the options above.

Answer:



In [9]:
print(hallucination_result['prompt_response_pair'][0]["response"])


Accurate
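
A verdict like the one above has to be turned into a number before it can be averaged across samples. The sketch below assumes the mapping Accurate → 0.0, Minor Inaccurate → 0.5, Major Inaccurate → 1.0. Only the first value is corroborated by this notebook (an `Accurate` verdict alongside an `llm_score` of 0.0); the other two values, the function names, and the choice to skip unrecognised replies are assumptions made for illustration.

```python
# Assumed label-to-score mapping; only "accurate" -> 0.0 is corroborated
# by the output shown in this notebook.
LABEL_SCORES = {
    "major inaccurate": 1.0,
    "minor inaccurate": 0.5,
    "accurate": 0.0,
}


def verdict_to_score(verdict):
    """Map a free-text verdict to a score, or None if no label is recognised."""
    text = verdict.strip().lower()
    # Check the longer labels first: "accurate" is a substring of "inaccurate".
    for label in ("major inaccurate", "minor inaccurate", "accurate"):
        if label in text:
            return LABEL_SCORES[label]
    return None


def llm_inconsistency(verdicts):
    """Average the scores of all recognised verdicts (0.0 if none are recognised)."""
    scores = [s for s in map(verdict_to_score, verdicts) if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


print(llm_inconsistency(["\nAccurate\n", "Accurate", "Accurate"]))
```

Matching on lowercase substrings tolerates the leading newline seen in the raw response, and checking the longer labels first avoids misreading "Minor Inaccurate" as "Accurate".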


### Generated Samples

In [10]:
for sample in hallucination_result['samples']:
    print(sample)
    print("-------------")

Adani Enterprises Limited contributes to the energy sector through its significant investments in green hydrogen and related businesses under Adani New Industries Limited. This focus on sustainable energy solutions is part of the company's broader ESG philosophy, which emphasizes environmental responsibility and innovation in manufacturing, as evidenced by the recognition received, such as the Aegis Graham Bell award for Innovation in Manufacturing.
-------------
Adani Enterprises Limited contributes to the energy sector through its significant investments in green hydrogen and similar businesses under Adani New Industries Limited. This focus on sustainable energy solutions is part of their broader ESG philosophy, which emphasizes environmentally responsible practices and innovation in manufacturing, as evidenced by their recognition with the Aegis Graham Bell award for their efforts in the green hydrogen ecosystem.
-------------
Adani Enterprises Limited contributes to the energy sect