<a href="https://colab.research.google.com/github/enya-yx/LangChain-Courses/blob/main/doc_and_evaluate_py.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install "langchain-google-genai" "langchain" "langchain-core" "langgraph-prebuilt" "google-generativeai" "langchain_community" "docarray"

In [2]:
import google.generativeai as genai
import os
from google.colab import userdata

os.environ["GOOGLE_API_KEY"] = userdata.get('google_api_key')
# Configure the generative AI library with your API key
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])


In [3]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import PromptTemplate, ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define llm
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.1,
    verbose=True
)


In [None]:
# Load the article and index it in an in-memory vector store for later queries
from langchain_community.document_loaders import WebBaseLoader
from langchain.indexes import VectorstoreIndexCreator
# DocArrayInMemorySearch lives in langchain_community; the langchain.vectorstores path is deprecated
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_google_genai import GoogleGenerativeAIEmbeddings

loader = WebBaseLoader("https://medium.com/@natazwa/best-and-worst-of-keigo-higashino-a811bcd89b04")
docs = loader.load()
embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
db = DocArrayInMemorySearch.from_documents(docs, embeddings)
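
The cell above embeds each document and keeps the vectors in memory, so that a later query can be matched against them. As a rough illustration of what such a lookup does, here is a minimal sketch that ranks toy vectors by cosine similarity. The metric, the helper names `cosine_similarity` and `top_k`, and the three-dimensional toy vectors are all assumptions made for illustration; the store's actual metric and internals may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Hypothetical toy embeddings; real embedding vectors have hundreds of dimensions.
toy_doc_vecs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
toy_query_vec = [0.9, 0.1, 0.0]

ranked = top_k(toy_query_vec, toy_doc_vecs, k=2)
print(ranked)  # prints [0, 1]
```

The query vector points almost in the same direction as the first toy document, so that document ranks first; the third is orthogonal to the query and is dropped.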



In [6]:
# Retrieve the documents in 'db' most similar to the query, then pass them to the LLM as context
from langchain.chains import RetrievalQA
from IPython.display import display, Markdown

query = "Please list famous novels written by Keigo Higashino and provide short introduction."
qres = db.similarity_search(query)
#print(qres[0].page_content)
qdocs = ".".join(r.page_content for r in qres)
question = "Please list the best 3 novels written by Keigo Higashino in a table in markdown\
 with simple short description for each of them"
# call_as_llm() is deprecated; invoke() returns a message whose text is in .content
res_from_docs = llm.invoke(f'{qdocs} Question: {question}').content
display(Markdown(res_from_docs))


Here are the top 3 Keigo Higashino novels according to the provided ranking:

| Rank | Title                  | Description                                                                                             |
|------|------------------------|---------------------------------------------------------------------------------------------------------|
| 1    | Salvation of A Saint   | Features an ingenious murder method where police struggle to find evidence despite a vague idea of the culprit. |
| 2    | Devotion of Suspect X  | The murderer and motive are revealed early, but the mystery lies in the ingenious disposal method and a shocking plot twist. |
| 3    | Masquerade Hotel       | A police officer goes undercover as hotel staff to prevent a murder, knowing where it will happen but not who the suspect or target is. |

In [7]:
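# Build a retriever from 'db' and let a RetrievalQA chain answer the question
retriever = db.as_retriever()
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # place all retrieved documents into a single prompt
    retriever=retriever,
    verbose=True
)
# invoke() replaces the deprecated run(); the answer is stored under the "result" key
res_from_retriever = qa_chain.invoke({"query": question})["result"]
display(Markdown(res_from_retriever))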

> Entering new RetrievalQA chain...

> Finished chain.


Here are the top 3 novels by Keigo Higashino, according to the provided ranking:

| Rank | Title | Description |
|---|---|---|
| 1 | Salvation of A Saint | Features an incredibly clever and shocking murder method, making evidence hard to find. |
| 2 | Devotion of Suspect X | Reveals the murderer early, but the mystery is how the body was disposed of undetected, with a mind-blowing twist. |
| 3 | Masquerade Hotel | A police officer goes undercover in a hotel to investigate a murder where the location is known, but not the suspect or target. |
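
The `chain_type="stuff"` setting used above means the retrieved documents are concatenated and placed into one prompt together with the question. The following is a minimal sketch of that idea in plain Python. The function name `build_stuff_prompt` and the prompt wording are hypothetical and are not the template LangChain actually uses.

```python
def build_stuff_prompt(question, passages, separator="\n\n"):
    """Join the non-empty passages into one context block and append the question."""
    context = separator.join(p.strip() for p in passages if p.strip())
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages standing in for the retriever's results.
toy_passages = [
    "Salvation of A Saint is ranked first.",
    "   ",  # blank passages are skipped
    "Devotion of Suspect X is ranked second.",
]
prompt = build_stuff_prompt("Which novel is ranked first?", toy_passages)
print(prompt)
```

Because everything goes into a single call, this approach is simple but only works while the combined passages fit in the model's context window.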

In [None]:
'''
# Create index by Index Creator
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
).from_loaders([loader])
'''


In [36]:
# Create examples to test llm
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(llm)
auto_examples = example_gen_chain.apply_and_parse(
    [{"doc": doc} for doc in docs]
)

examples = [
    {
        "query": "What's Keigo Higashino's occupation?",
        "answer": "Author"
    },
    {
        "query": "Which type of his novel is famous for?",
        "answer": "Mystery"
    },
    {
        "query": "Which one of his novels is ranked at fifth?",
        "answer": "The Newcomer"
    }
]
# Each generated item wraps its query/answer dict under the 'qa_pairs' key
examples.append(auto_examples[0]['qa_pairs'])
qa_chain.run(examples[-1])
#print(examples[-1])


> Entering new RetrievalQA chain...

> Finished chain.
{'query': "According to natazwa's ranking, which Keigo Higashino novel received the lowest rating, and what specific criticisms did the reviewer have regarding its genre and execution?", 'answer': 'The lowest-ranked novel on natazwa\'s list is "Keajaiban Toko Kelontong Namiya," which received a rating of 2.5/5. The reviewer criticized it for being in the fantasy genre, a departure from Keigo Higashino\'s typical murder mysteries, and for its poor execution where the author "tells instead of shows," making the narrative feel rushed. Despite an intriguing premise and the author\'s ability to connect characters, natazwa found the novel boring and disappointing.'}
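
An LLM-based grader does not always return its verdict in one fixed format; the evaluation output later in this notebook contains both `CORRECT` and `GRADE: CORRECT`. Before counting results, it can help to normalize these strings. The sketch below is one possible way to do that; the helper names `normalize_grade` and `accuracy` and the `UNKNOWN` fallback are assumptions, not part of LangChain.

```python
def normalize_grade(raw):
    """Map a raw grader string to 'CORRECT', 'INCORRECT', or 'UNKNOWN'."""
    text = raw.strip().upper()
    if text.startswith("GRADE:"):
        text = text[len("GRADE:"):].strip()
    # Test INCORRECT first, because "CORRECT" is a substring of it.
    if text.startswith("INCORRECT"):
        return "INCORRECT"
    if text.startswith("CORRECT"):
        return "CORRECT"
    return "UNKNOWN"


def accuracy(raw_grades):
    """Fraction of grades that normalize to 'CORRECT' (0.0 for an empty list)."""
    grades = [normalize_grade(g) for g in raw_grades]
    if not grades:
        return 0.0
    return sum(g == "CORRECT" for g in grades) / len(grades)


# Hypothetical grader outputs in mixed formats.
sample_grades = ["CORRECT", "GRADE: CORRECT", "GRADE: INCORRECT"]
print([normalize_grade(g) for g in sample_grades])
print(accuracy(sample_grades))
```

With the evaluation results from this notebook, the same call would be `accuracy([g['results'] for g in graded_outputs])`.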


In [42]:
# Evaluate all predictions for examples
from langchain.evaluation.qa import QAEvalChain
predictions = qa_chain.apply(examples)

eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)

for i, (example, prediction, graded) in enumerate(zip(examples, predictions, graded_outputs)):
    print(f'Example {i}:')
    print('Question: ' + example['query'])
    print('Real Answer: ' + example['answer'])
    print('Predicted Answer: ' + prediction['result'])
    print('Predicted Grade: ' + graded['results'])
    print()




> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.
Example 0:
Question: What's Keigo Higashino's occupation?
Real Answer: Author
Predicted Answer: Keigo Higashino is an acclaimed Japanese author, renowned for his mystery novels.
Predicted Grade: CORRECT

Example 1:
Question: Which type of his novel is famous for?
Real Answer: Mystery
Predicted Answer: Keigo Higashino is famous for his **mystery novels**, particularly **murder mysteries**.
Predicted Grade: GRADE: CORRECT

Example 2:
Question: Which one of his novels is ranked at fifth?
Real Answer: The Newcomer
Predicted Answer: The novel ranked at fifth is **The Newcomer**.
Predicted Grade: GRADE: CORRECT

Example 3:
Question: According to natazwa's ranking, which Keigo Higashino novel received the 