# Learning 8 types of RAG and 5 types of Agentic RAG from the [reference repository](https://github.com/athina-ai/rag-cookbooks), part 3/13

memo: switched from Google Colab to a Jupyter notebook on a local MacBook

# Hypothetical Document Embeddings (HyDE) RAG
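HyDE first passes the input query to an LLM to generate a hypothetical answer (Hypothetical Document). That hypothetical answer is then used to search the knowledge base (Vector DB) for related documents, which are given to the LLM as context to generate the final answer.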

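1.	Standard RAG
	- Embed the user's query as a vector
	- Compare it with the embeddings of existing documents (stored in a vector database built beforehand) and search for the most similar ones
	- Fetch the similar documents and generate an answer from them

2.	How HyDE works
	- First generate an "ideal document" (a hypothetical document) for the user's query
	- Embed that hypothetical document and search for existing documents similar to it
	- As a result, the search is more likely to reach highly relevant information

3.	Why is this approach effective?
	- In standard RAG, a query that is too short or simple can make it hard to find the right documents
	- HyDE first writes a hypothetical document of the form "the information probably looks like this" and searches with it. Because that text resembles the stored documents more closely than the raw query does, relevant documents are easier to reach
	- It is especially useful when the document collection is large or when direct keyword search is difficult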
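
The difference between the two retrieval flows above can be sketched without any external service. This is a toy illustration, not the notebook's implementation: `embed` is a bag-of-words stand-in for a real embedding model, `fake_llm` is a hypothetical function that returns a canned hypothetical document, and the two-document corpus is made up.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query_text, corpus, k=1):
    """Return the k corpus documents most similar to query_text."""
    q = embed(query_text)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def fake_llm(question):
    """Hypothetical stand-in for the LLM that drafts the hypothetical document."""
    return "MacConkey agar selects Gram-negative enteric bacteria and differentiates lactose fermenters."


# Made-up corpus: the first document is relevant but shares no words with the query.
corpus = [
    "Selective medium that isolates Gram-negative enteric organisms; lactose fermenters turn pink.",
    "Agar is a jelly-like substance obtained from red algae and used in desserts.",
]

question = "what bacteria grow on macconkey agar"

# Standard RAG: search with the embedding of the raw query.
plain_hit = retrieve(question, corpus)[0]

# HyDE: search with the embedding of the hypothetical document instead.
hyde_hit = retrieve(fake_llm(question), corpus)[0]

print("plain:", plain_hit)
print("hyde: ", hyde_hit)
```

In this contrived setup the raw query only matches the word "agar" and lands on the dessert document, while the hypothetical document shares domain vocabulary with the relevant one. Real embedding models are far less sensitive to exact word overlap, so the gap is usually smaller in practice.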

Paper: [HyDE](https://arxiv.org/pdf/2212.10496)

In [1]:
!python -m pip install -qU pip
!pip install -qU langchain-openai langchain-community chromadb

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()
openai_api_key = os.environ.get("OPENAI_API_KEY")

from langchain_community.document_loaders import CSVLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
loader = CSVLoader("data/context.csv")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
documents = text_splitter.split_documents(documents)

In [3]:
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
# create vectorstore
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents, embeddings)
# create retriever
retriever = vectorstore.as_retriever()

In [4]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI()

native_rag_template = """
You are a helpful assistant that answers questions based on the following context.
If you don't find the answer in the context, just say that you don't know.
Context: {context}

Question: {input}

Answer:
"""

prompt = ChatPromptTemplate.from_template(native_rag_template)

# "context": retriever -> the retriever (document search) receives the query, and its results fill the {context} slot of the prompt.
# "input": RunnablePassthrough() -> passes the user's query through unchanged to the next step.
# The {} builds a single dict with the two keys "context" and "input" as the first step of the chain.
rag_chain = (
    {"context": retriever, "input": RunnablePassthrough()} 
    | prompt 
    | llm
    | StrOutputParser()
)
# test
questions = [
    "what bacteria grow on macconkey agar",
    "who wrote a rose is a rose is a rose"
]

for q in questions:
    result = rag_chain.invoke(q)
    print(f"Q: {q}\nA: {result}\n")

Q: what bacteria grow on macconkey agar
A: Gram-negative and enteric bacteria grow on MacConkey agar.

Q: who wrote a rose is a rose is a rose
A: Gertrude Stein wrote "A rose is a rose is a rose".
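
The dict at the head of `rag_chain` is coerced by LCEL into a parallel step: each entry receives the same input and the results are collected under the same keys. The sketch below mimics that behavior in plain Python. The names `fake_retriever`, `passthrough`, and `run_parallel` are hypothetical helpers for illustration, not LangChain APIs.

```python
def fake_retriever(query):
    """Hypothetical stand-in for `retriever`: returns documents for a query."""
    return [f"doc about {query}"]


def passthrough(value):
    """Mimics RunnablePassthrough: returns its input unchanged."""
    return value


def run_parallel(steps, value):
    """Feed the same input to every entry and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}


# The same query goes to both entries; the result is what the prompt template receives.
prompt_vars = run_parallel(
    {"context": fake_retriever, "input": passthrough},
    "what is HyDE",
)
print(prompt_vars)
```

This is why `rag_chain.invoke(q)` takes a plain string: the query is fanned out to the retriever and to the passthrough, and the prompt then fills `{context}` and `{input}` from the resulting dict.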



In [21]:
questions = [
    "what bacteria grow on macconkey agar",
    "who wrote a rose is a rose is a rose"
]

native_rag_data = []  # one record per question

print("=== Native RAG ===")
for q in questions:
    # print(f"\nQ: {q}")
    # Run the chain and get the answer
    answer_text = rag_chain.invoke(q)
    
    # Documents retrieved for this question
    docs = retriever.invoke(q)
    
    # Collect everything into one dict record per question
    native_rag_data.append({
        "query": q,
        "response": answer_text,
        "context": docs
    })

native_rag_data  # used later in the evaluation step


=== Native RAG ===


[{'query': 'what bacteria grow on macconkey agar',
  'response': 'Gram-negative and enteric (normally found in the intestinal tract) bacteria grow on MacConkey agar.',
  'context': [Document(metadata={'row': 6, 'source': 'data/context.csv'}, page_content="context: ['MacConkey agar is a selective and differential culture medium for bacteria. It is designed to selectively isolate Gram-negative and enteric (normally found in the intestinal tract) bacteria and differentiate them based on lactose fermentation. Lactose fermenters turn red or pink on MacConkey agar, and nonfermenters do not change color. The media inhibits growth of Gram-positive organisms with crystal violet and bile salts, allowing for the selection and isolation of gram-negative"),
   Document(metadata={'row': 6, 'source': 'data/context.csv'}, page_content='0.001 g\\nAgar – 13.5 g\\nWater – add to make 1 litre; adjust pH to 7.1 +/− 0.2\\nSodium taurocholateThere are many variations of MacConkey agar depending on the need. 

In [None]:
# chain without the retriever
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

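# System prompt for the no-retrieval chain; its answer serves as the hypothetical document.
# Assumed wording: the original prompt text is not shown in this notebook.
no_context_template = """
You are a helpful assistant. Answer the user's question as well as you can from your own knowledge.
"""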
no_context_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", no_context_template),
        ("human", "{input}"),
    ]
)
qa_no_context = no_context_prompt | llm | StrOutputParser()

# test no_context response
question = 'what bacteria grow on macconkey agar'
answer = qa_no_context.invoke({"input": question})
answer



'MacConkey agar is a selective and differential agar used for the isolation and differentiation of Gram-negative bacteria, particularly Enterobacteriaceae. Some common bacteria that grow on MacConkey agar include Escherichia coli, Klebsiella pneumoniae, Proteus mirabilis, and Salmonella spp. It is also used to differentiate lactose fermenting bacteria (such as E. coli) from non-lactose fermenting bacteria.'

In [None]:
hyde_retrieval_chain = qa_no_context | retriever
hyde_retrieved_docs = hyde_retrieval_chain.invoke({"input": question})

hyde_template = """
You are a helpful assistant that answers questions based on the provided context.

Context: {context}

Question: {input}

Answer:
"""

hyde_prompt = ChatPromptTemplate.from_template(hyde_template)

final_hyde_chain = (
    hyde_prompt
    | llm
    | StrOutputParser()
)

# test final hyde response
final_hyde_chain.invoke({"context": hyde_retrieved_docs, "input": question})

'Bacteria that grow on MacConkey agar are primarily Gram-negative and enteric bacteria. This medium is designed to selectively isolate Gram-negative and enteric bacteria based on their ability to ferment lactose. Lactose fermenters will turn red or pink on MacConkey agar, while nonfermenters will not change color. The medium inhibits the growth of Gram-positive organisms, allowing for the selection and isolation of Gram-negative bacteria.'

In [27]:
questions = ["what bacteria grow on macconkey agar",
    "who wrote a rose is a rose is a rose"
]

hyde_rag_data = []

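for q in questions:
    # Generate a hypothetical answer for this question and retrieve documents with it
    docs = hyde_retrieval_chain.invoke({"input": q})
    # Answer the question from the documents retrieved via the hypothetical answer
    answer_text = final_hyde_chain.invoke({"context": docs, "input": q})

    # Collect everything into one dict record per question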
    hyde_rag_data.append({
        "query": q,
        "response": answer_text,
        "context": docs,
    })

hyde_rag_data

[{'query': 'what bacteria grow on macconkey agar',
  'response': 'Bacteria that grow on MacConkey agar are primarily Gram-negative and enteric bacteria that can ferment lactose. This includes lactose fermenters that turn red or pink on the agar and nonfermenters that do not change color. The agar inhibits the growth of Gram-positive organisms.',
  'context': [Document(metadata={'row': 6, 'source': 'data/context.csv'}, page_content="context: ['MacConkey agar is a selective and differential culture medium for bacteria. It is designed to selectively isolate Gram-negative and enteric (normally found in the intestinal tract) bacteria and differentiate them based on lactose fermentation. Lactose fermenters turn red or pink on MacConkey agar, and nonfermenters do not change color. The media inhibits growth of Gram-positive organisms with crystal violet and bile salts, allowing for the selection and isolation of gram-negative"),
   Document(metadata={'row': 6, 'source': 'data/context.csv'}, pa

In [None]:
import pandas as pd
from langchain.evaluation import load_evaluator
from langchain.evaluation import EvaluatorType

relevance_evaluator = load_evaluator(
    EvaluatorType.QA,
    criteria="relevance",
    model="gpt-4",
    model_kwargs={"temperature": 0}
)

accuracy_evaluator = load_evaluator(
    EvaluatorType.QA,
    criteria="accuracy",
    model="gpt-4",
    model_kwargs={"temperature": 0}
)

def evaluate_response(data: list) -> pd.DataFrame:
    """
    Evaluate Relevance and Accuracy for the given list of records and return the results as a DataFrame.
    """
    df = pd.DataFrame(data)

    def flatten_context(context_val):
        if isinstance(context_val, list):
            return "\n".join([item.page_content if hasattr(item, "page_content") else str(item) for item in context_val])
        return str(context_val)

    df["ContextText"] = df["context"].apply(flatten_context)

    def evaluate_row(row):
        relevance_score = relevance_evaluator.evaluate_strings(
            prediction=row["response"],
            input=row["query"],
            reference=row["ContextText"]
        )
        accuracy_score = accuracy_evaluator.evaluate_strings(
            prediction=row["response"],
            input=row["query"],
            reference=row["ContextText"]
        )
        return relevance_score, accuracy_score

    df[["Relevance", "Accuracy"]] = df.apply(lambda row: pd.Series(evaluate_row(row)), axis=1)
    
    df["RelevanceScore"] = df["Relevance"].apply(lambda x: 1 if x["value"].upper() == "CORRECT" else 0)
    df["AccuracyScore"] = df["Accuracy"].apply(lambda x: 1 if x["value"].upper() == "CORRECT" else 0)
    
    # Compute the average scores
    avg_relevance = df["RelevanceScore"].mean()
    avg_accuracy = df["AccuracyScore"].mean()
    
    print(f"Average Relevance: {avg_relevance:.2f}")
    print(f"Average Accuracy: {avg_accuracy:.2f}")

    return df[["query", "response", "ContextText", "Relevance", "Accuracy"]]




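Average scores

- Average Relevance: the mean relevance score of the responses. Each response scores 1 when the evaluator's `value` is CORRECT and 0 otherwise, so a value closer to 1.00 means more responses were judged relevant.
- Average Accuracy: the mean accuracy score of the responses, computed the same way. A value closer to 1.00 means more responses were judged accurate.

Columns of the evaluation DataFrame

- query: the question.
- response: the answer produced by the RAG chain.
- ContextText: the text of the context (documents) the RAG chain referred to when generating the answer. It is passed to the evaluator as the reference.
- Relevance: the detailed relevance result. A `value` of CORRECT means the response was judged relevant.
- Accuracy: the detailed accuracy result. A `value` of CORRECT means the response was judged accurate.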
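
As a small worked example of how the averages are derived, the sketch below applies the same CORRECT-to-1 mapping used in `evaluate_response` to a list of verdicts. The verdicts are made up for illustration and are not results from this notebook; `verdict_to_score` is a hypothetical helper name.

```python
def verdict_to_score(result):
    """Map an evaluator result to 1 (CORRECT) or 0 (anything else)."""
    return 1 if result["value"].upper() == "CORRECT" else 0


# Made-up verdicts for four responses (illustrative only)
verdicts = [
    {"value": "CORRECT"},
    {"value": "CORRECT"},
    {"value": "INCORRECT"},
    {"value": "CORRECT"},
]

scores = [verdict_to_score(v) for v in verdicts]
average = sum(scores) / len(scores)
print(f"Average: {average:.2f}")  # prints "Average: 0.75"
```

With only two questions, as in this notebook, the average can only be 0.00, 0.50, or 1.00, so a larger question set would be needed to separate the two approaches.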

In [44]:
# Evaluate Native RAG and HyDE RAG
print("\n=== Native RAG Evaluation ===")
native_eval_df = evaluate_response(native_rag_data)
native_eval_df


=== Native RAG Evaluation ===
Average Relevance: 1.00
Average Accuracy: 1.00


| | query | response | ContextText | Relevance | Accuracy |
|---|---|---|---|---|---|
| 0 | what bacteria grow on macconkey agar | Gram-negative and enteric (normally found in t... | context: ['MacConkey agar is a selective and d... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... |
| 1 | who wrote a rose is a rose is a rose | Gertrude Stein wrote "A rose is a rose is a ro... | context: ['The sentence "Rose is a rose is a r... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... |


In [45]:
print("\n=== HyDE RAG Evaluation ===")
hyde_eval_df = evaluate_response(hyde_rag_data)
hyde_eval_df


=== HyDE RAG Evaluation ===
Average Relevance: 1.00
Average Accuracy: 1.00


| | query | response | ContextText | Relevance | Accuracy |
|---|---|---|---|---|---|
| 0 | what bacteria grow on macconkey agar | Bacteria that grow on MacConkey agar are prima... | context: ['MacConkey agar is a selective and d... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... |
| 1 | who wrote a rose is a rose is a rose | The phrase "a rose is a rose is a rose" was wr... | context: ['The sentence "Rose is a rose is a r... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... | {'reasoning': 'CORRECT', 'value': 'CORRECT', '... |
