# The Zekryns knowledge database

Imagine an evil genius whose goal is to explore the galaxy and save endangered alien species. Not so evil after all, except that he enjoys watching his failing employees suffer and beg in vain for mercy. You have just been hired at the zegma-IV station, which catalogs the Zekryn species. You now have to learn everything about this species. Otherwise, your boss will not be eager to hand over your daily oxygen.

All of the company's confidential knowledge is stored as Markdown files. We have built an AI to help you. A RAG (retrieval-augmented generation) pipeline is used to handle the ever-growing knowledge about the studied species and to keep that knowledge confidential.

Good luck!

In [105]:
DOCUMENTS_DIRECTORY="documents"
DB_DIRECTORY="db/chroma"
QUESTIONS_PATH="questions/questions.json"
EMBEDDING_MODEL='nomic-embed-text'
LLM_MODEL="llama3.2"

## Database init

In [106]:
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma

embeddings = OllamaEmbeddings(
    model=EMBEDDING_MODEL,
)

db = Chroma(
    persist_directory=DB_DIRECTORY, 
    embedding_function=embeddings)
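
The cell above only opens an existing Chroma store; how the Markdown files were chunked before indexing is not shown here. Since the evaluation prompt later asks the model to cite exact section names, one plausible approach is to split each file on its headings. The following is a minimal sketch under that assumption. The helper `split_markdown_sections` and the sample text are hypothetical and not part of this notebook's pipeline.

```python
import re

def split_markdown_sections(text):
    """Split markdown text into (heading, body) chunks, one per heading."""
    sections = []
    heading = ""  # text before the first heading gets an empty heading
    body = []
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading closes the chunk collected so far.
            if heading or any(l.strip() for l in body):
                sections.append((heading, "\n".join(body).strip()))
            heading = match.group(2).strip()
            body = []
        else:
            body.append(line)
    # Flush the last chunk.
    if heading or any(l.strip() for l in body):
        sections.append((heading, "\n".join(body).strip()))
    return sections

# Placeholder text, not taken from the real documents.
sample = "# Species\nIntro text.\n\n## Habitat\nSome habitat notes.\n"
for heading, body in split_markdown_sections(sample):
    print(heading, "->", body)
```

Keeping the heading alongside each chunk would let it be stored as metadata or prepended to the chunk text, so the model can see the section name it is asked to quote. This sketch does not handle `#` lines inside fenced code blocks.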

## Chain for study buddy

Select a question. For now the choice is random, but a smarter selection strategy could be used.

In [None]:
import random
import json

# Load the list of study questions from the JSON file.
with open(QUESTIONS_PATH, 'r', encoding='utf-8') as f:
    questions_all = json.load(f)

question = random.choice(questions_all)
print(question)
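
One way to make the selection less naive is to favor questions the user has missed before. The sketch below is only an illustration, assuming questions are plain strings. The `miss_counts` dictionary and the `pick_question` helper are hypothetical; this notebook does not track answer history.

```python
import random

def pick_question(questions, miss_counts, rng=random):
    """Pick a question, favoring those missed more often.

    `miss_counts` is a hypothetical dict mapping a question to the number
    of times the user answered it incorrectly.
    """
    # Every question keeps a base weight of 1 so it can still be drawn.
    weights = [1 + miss_counts.get(q, 0) for q in questions]
    return rng.choices(questions, weights=weights, k=1)[0]

questions_demo = ["question A", "question B", "question C"]
misses_demo = {"question B": 4}  # illustrative counts only
print(pick_question(questions_demo, misses_demo))
```

With these illustrative counts, "question B" has weight 5 against weight 1 for each of the others, so it is drawn about five times out of seven.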


Evaluate a user's answer

In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables import RunnableParallel
from langchain_core.prompts import PromptTemplate
from langchain_ollama import OllamaLLM
from pprint import pprint

llm = OllamaLLM(model=LLM_MODEL)

retriever = db.as_retriever(
    search_type="mmr",
    search_kwargs={'k': 5, 'lambda_mult': 0.6}
)

# retriever = db.as_retriever(
#     search_type="similarity_score_threshold",
#     search_kwargs={'k': 7, 'score_threshold': 0.1}
# )

prompt_answer_evaluation = PromptTemplate.from_template(
"""You are a knowledgeable and encouraging study buddy. You evaluate the user's answers to your questions based on the provided context.

Context:

{context}


Provide a concise and helpful evaluation of the user's answer, considering:

Accuracy: Is the answer factually correct?
Relevance: Does the answer address the core question?
Completeness: Does the answer cover all relevant aspects?
Clarity: Is the answer clear and easy to understand?


To provide specific feedback, carefully analyze the user's answer and the relevant sections of the documentation. Refer to these sections directly in your feedback.


If the answer is excellent, provide positive reinforcement like "Excellent work!" or "Spot on!" or "Correct!". In this case, limit your feedback to one very short sentence.

If the answer is partially correct or incomplete, provide constructive feedback. For example:
"You're on the right track, but consider [specific suggestion]."
"Perhaps you could review [specific section of the context] to gain a deeper understanding."

If the answer contains mistakes, provide gentle correction. For example:
"There are some mistakes: [specific suggestion]."

If the answer is incorrect, provide a clear explanation without giving away the correct answer. For instance:
"That's not quite right. Let's revisit [specific concept]."
"You might want to review [specific section of the context] for a clearer understanding."

If the answer says "I don't know", provide a hint or a suggestion to help the user improve their answer. For example:
"You will do better next time. Consider reviewing [specific section of the context]."

Otherwise, your feedback should be 2 or 3 sentences long.
Your suggestion should specify the most relevant sections and subsections within the context to review, if applicable. Use the section exact name and the sub-section exact name taken from the context. Do not make up section or sub-section names. Do not use section numbers or sub-section numbers.
Don't say "the context" but "the documentation".
Don't start your response with phrases like "Here's a helpful evaluation:"; go straight to the evaluation.
Remember to be encouraging and supportive. Your feedback should help the user learn and grow.


Question:
{question}

User's Answer:
{user_answer}

Your Feedback:"""
)

def format_docs(docs):
    # Join the retrieved chunks into one context string, separated by blank lines.
    return "\n\n".join(doc.page_content for doc in docs)

chain_answer_evaluation = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["docs"])))
    | prompt_answer_evaluation
    | llm
    | StrOutputParser()
)

# Retrieve the source documents, then run the evaluation chain on them
chain_answer_evaluation_with_source = RunnableParallel(
    {
        # Query the retriever with the original question plus the user's answer
        "docs": (lambda x: x["question"] + '\n' + x['user_answer']) | retriever,
        "question": (lambda x: x["question"]),
        "user_answer": (lambda x: x["user_answer"]),
    }
).assign(answer=chain_answer_evaluation)

#question = "At what age can Windrider chicks fend for themselves relatively quickly?"
user_answer = "I don't know."
evaluation_output = chain_answer_evaluation_with_source.invoke({"question": question, "user_answer": user_answer})

#pprint(evaluation_output)

print("Docs: ")
pprint(evaluation_output['docs'])
print("------------------------------")
print("Question: ", question)
print("User's Answer: ", user_answer)
print("------------------------------")
print("Feedback: ")
print(evaluation_output['answer'])
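
The retriever in the cell above uses `search_type="mmr"` with `lambda_mult` set to 0.6. To give an intuition for that parameter, here is a minimal pure-Python sketch of maximal marginal relevance (MMR). It follows the usual formulation, `lambda_mult * relevance - (1 - lambda_mult) * redundancy`. It is not the code LangChain or Chroma actually run, and the toy 2-D vectors and the helpers `cosine`, `mmr_select` and `unit` are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr_select(query_vec, doc_vecs, k=5, lambda_mult=0.6):
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

def unit(degrees):
    """Unit vector at the given angle, used to build toy embeddings."""
    r = math.radians(degrees)
    return [math.cos(r), math.sin(r)]

# Documents 0 and 1 are near-duplicates; document 2 is less relevant but different.
query = unit(0)
docs = [unit(10), unit(11), unit(-20)]
print(mmr_select(query, docs, k=2, lambda_mult=0.6))  # [0, 2]
print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=1.0` the selection is pure similarity ranking and returns both near-duplicates. At 0.6 the second pick shifts to the more distinct document, which is the trade-off the retriever setting controls.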

