## Advanced RAG Pipeline
* RAGAS: https://www.youtube.com/watch?v=Anr1br0lLz8
* Conversation Retrieval Chain: https://js.langchain.com/docs/modules/chains/popular/chat_vector_db
* Application: https://abvijaykumar.medium.com/prompt-engineering-retrieval-augmented-generation-rag-cd63cdc6b00
* https://medium.com/@jerome.o.diaz/langchain-conversational-retrieval-chain-how-does-it-work-bb2d71cbb665

### Steps
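* Query rephrasing: rewrite the user's question as a standalone query, using the conversation memory as context
* Relevant document retrieval: fetch the chunks most similar to the rephrased query
* Answer generation: answer the rephrased query using the retrieved documents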
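
The three steps above can be sketched in plain Python. This is a minimal illustration, not the LangChain implementation: `rephrase_query`, `retrieve`, and `answer` are hypothetical stand-ins, the retriever scores documents by simple word overlap, and the work an LLM would do is replaced by string manipulation.

```python
def rephrase_query(chat_history, question):
    """Step 1: fold the conversation memory into a standalone query (an LLM would do this)."""
    previous_questions = " ".join(q for q, _ in chat_history)
    return f"{previous_questions} {question}".strip()


def retrieve(query, corpus, k=2):
    """Step 2: rank documents by how many query words they share (toy scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer(query, docs):
    """Step 3: build the text an LLM would receive: retrieved context plus the query."""
    context = "\n".join(docs)
    return f"Context:\n{context}\nQuestion:\n{query}"


# Hypothetical mini corpus and conversation, for illustration only
corpus = [
    "loyalty points can be redeemed in partner stores",
    "the mobile app shows your points balance",
    "offices are open on weekdays",
]
chat_history = [("how do loyalty points work", "They are earned on purchases.")]

standalone = rephrase_query(chat_history, "where can I redeem them")
top_docs = retrieve(standalone, corpus)
final_prompt = answer(standalone, top_docs)
print(final_prompt)
```

In the real pipeline each of these functions corresponds to an LLM call or a vector-store lookup, but the order of operations is the same.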

In [None]:
import os
import openai
from dotenv import load_dotenv

load_dotenv('.env')
openai.api_base = os.getenv('OPENAI_ENDPOINT')
openai.api_key = os.getenv('OPENAI_API_KEY')
openai.api_version = "2023-09-15-preview"
llm_model = 'gpt-35-turbo-jdrios'
emb_model = 'text-embedding-ada-002-jdrios'

In [None]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(
    ["https://www.leal.co/usuarios", 
     "https://landing.leal.co/plataforma",
     "https://www.leal.co/nosotros"]
)
documents = loader.load()

In [None]:
# Metadata
documents[2].metadata

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 700,
    chunk_overlap = 50
)

documents = text_splitter.split_documents(documents)
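
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is only a sketch: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and newlines, which the hypothetical `sliding_window_split` helper below does not attempt.

```python
def sliding_window_split(text, chunk_size=700, chunk_overlap=50):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1500))  # synthetic 1500-character text
chunks = sliding_window_split(sample)
print([len(c) for c in chunks])
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence cut at a boundary is still seen whole in at least one chunk.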

In [None]:
len(documents)

In [None]:
from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    azure_deployment=emb_model,
    azure_endpoint=os.getenv('OPENAI_ENDPOINT'),
)

In [None]:
# Create Vector Store
from langchain_community.vectorstores import FAISS
vector_store = FAISS.from_documents(documents, embeddings)

In [None]:
# Creating Retriever
retriever = vector_store.as_retriever()
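
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-written 3-dimensional vectors and cosine similarity. The vectors, texts, and the `top_k` helper are made up for illustration, and FAISS uses its own index structures and distance metric rather than this brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, index, k=2):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "vector store": (chunk text, embedding) pairs with invented values
index = [
    ("chunk about founders", [0.9, 0.1, 0.0]),
    ("chunk about offices", [0.1, 0.9, 0.0]),
    ("chunk about rewards", [0.0, 0.2, 0.9]),
]
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of a question about founders

for text, score in top_k(query_vector, index):
    print(f"{score:.3f}  {text}")
```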

In [None]:
# How to retrieve content
retrieved_documents = retriever.invoke("Quienes son los fundadores de Leal?")

In [None]:
for doc in retrieved_documents:
  print(doc)

In [None]:
# Creating Prompt
from langchain.prompts import ChatPromptTemplate

template = """Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'Redirigir...':
Context:
{context}
Question:
{question}
"""

prompt = ChatPromptTemplate.from_template(template)
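
As a rough picture of what happens at run time, the placeholders `{context}` and `{question}` are replaced with the retrieved text and the user's question. The sketch uses plain `str.format` on a copy of the template (named `template_text` here to avoid touching the notebook's variable) instead of `ChatPromptTemplate`, and the context string is an invented placeholder.

```python
template_text = """Answer the question based only on the following context. If you cannot answer the question with the context, please respond with 'Redirigir...':
Context:
{context}
Question:
{question}
"""

# Invented example values standing in for retrieved chunks and a user question
filled = template_text.format(
    context="(text of the retrieved chunks goes here)",
    question="En que paises está Leal?",
)
print(filled)
```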

In [39]:
from operator import itemgetter

from langchain_openai import AzureChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

primary_qa_llm = AzureChatOpenAI(model_name=llm_model,
                                 temperature=0,
                                 api_version="2023-09-15-preview",
                                 azure_endpoint=os.getenv('OPENAI_ENDPOINT')) # chain_type is "stuff" by default

retrieval_augmented_qa_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and piping it into the retriever
    {"context": itemgetter("question") | retriever, "question": itemgetter("question")}
    # "context"  : re-assigned unchanged via RunnablePassthrough.assign, which forwards every existing key
    #              ("question" included) to the next step
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object, which is then
    #              piped into the LLM; the model's message is stored under the "response" key
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": prompt | primary_qa_llm, "context": itemgetter("context")}
)
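
The data flow through this chain can be mimicked with ordinary functions and dictionaries. The following sketch is an analogy, not LCEL itself: `fake_retriever` and `fake_llm` are hypothetical stubs, and each function mirrors one stage of the pipe above so that the shape of the intermediate dictionaries is visible.

```python
def fake_retriever(question):
    # Stub: a real retriever would return Document objects from the vector store
    return [f"chunk relevant to: {question}"]


def fake_llm(prompt_text):
    # Stub: a real chat model would return a message object with a .content field
    return f"(model answer based on {len(prompt_text)} prompt characters)"


def stage_1(inputs):
    # Mirrors {"context": itemgetter("question") | retriever, "question": itemgetter("question")}
    return {"context": fake_retriever(inputs["question"]), "question": inputs["question"]}


def stage_2(state):
    # Mirrors RunnablePassthrough.assign(context=itemgetter("context")): keys are unchanged
    return {**state, "context": state["context"]}


def stage_3(state):
    # Mirrors {"response": prompt | llm, "context": itemgetter("context")}
    prompt_text = f"Context:\n{state['context']}\nQuestion:\n{state['question']}\n"
    return {"response": fake_llm(prompt_text), "context": state["context"]}


result = stage_3(stage_2(stage_1({"question": "What is a vector store?"})))
print(sorted(result))  # ['context', 'response']
```

Returning the context alongside the response is what later makes it possible to evaluate retrieval quality and answer quality separately.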

In [40]:
question = "En que paises está Leal?"

result = retrieval_augmented_qa_chain.invoke({"question" : question})

print(result["response"].content)

Leal está presente en Colombia, El Salvador, Panamá, Costa Rica, Honduras, Guatemala, Nicaragua y México.


## Chain Implementation

In [49]:
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# Create Memory
# memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Create Chain
chatQA = ConversationalRetrievalChain.from_llm(
            llm=primary_qa_llm, 
            retriever=retriever, 
            # memory=memory,
            verbose=True)
chat_history = []

In [51]:
query = "Como me llamo?"
response = chatQA.invoke({"question": query, "chat_history": chat_history})
print("******************************** Done ********************************")
print(response["answer"])
# print(f'Memory buffer: {memory.buffer}')



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
PUNTOS LEAL

Acerca de Leal

aliadas‍8Países dondetenemos presenciaConoce nuestros fundadoresEstas son las mentes brillantes que están detrás deléxito de nuestro día a día.Camilo MartínezCo-Founder & CEOLinkedIn

Florence FrechCo-Founder & COOLinkedIn

Conoce nuestras oficinasBogotá, ColombiaEl Salvador, San SalvadorMedellín, ColombiaCiudad de México, México¿Quieres unirte al equipo?HAZLO AQUÍPáginas principalesUsuariosComerciosNosotrosBlog LealPuntosLeal CoinsSeguridadBlog LealSoporteTérminos y condicionesAyuda y contactoTratamiento de datosVencimiento de CoinsActualizar datosCopyright © Leal Colombia | Designed by Leal
Human: Como me llamo?

> Finished chai

In [52]:
print(chat_history)

[]


In [None]:
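# Conversation flow
chat_history = []
while True:
    qry = input('Question: ')
    # Typing 'done' or 'exit' ends the conversation without querying the chain
    if qry.strip().lower() in ('done', 'exit'):
        break
    response = chatQA.invoke({"question": qry, "chat_history": chat_history})
    print(response["answer"])
    # Record the turn so follow-up questions can be rephrased with context
    chat_history.append((qry, response["answer"]))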
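
`ConversationalRetrievalChain` uses the chat history to rewrite a follow-up question into a standalone one before retrieval. The sketch below shows one way a history of `(question, answer)` tuples could be flattened into a rephrasing prompt. The `format_history` and `build_condense_prompt` helpers and the prompt wording are illustrative assumptions, not the library's exact prompt.

```python
def format_history(chat_history):
    """Render (question, answer) tuples as alternating Human/Assistant lines."""
    lines = []
    for human, ai in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {ai}")
    return "\n".join(lines)


def build_condense_prompt(chat_history, follow_up):
    """Build the text an LLM would receive to produce a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up question "
        "as a standalone question.\n\n"
        f"Chat history:\n{format_history(chat_history)}\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


# Hypothetical conversation turns, for illustration only
history = [("My name is Ana.", "Nice to meet you, Ana.")]
print(build_condense_prompt(history, "What is my name?"))
```

If the history list is never updated between turns, the rephrasing step has nothing to work with and every question is treated as the first one.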

In [None]:
chat_history

## Evaluation with RAGAS

In [None]:
# Create Test Data
from langchain.document_loaders import DirectoryLoader
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

generator = TestsetGenerator.from_langchain(generator_llm=primary_qa_llm,critic_llm=primary_qa_llm,embeddings=embeddings)
testset = generator.generate_with_langchain_docs(documents, test_size=10, 
                                                 raise_exceptions=False, with_debugging_logs=False,
                                                 distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})   

In [None]:
testset.test_data[1]

In [None]:
test_df = testset.to_pandas()
test_df

In [None]:
test_questions = test_df["question"].values.tolist()
test_groundtruths = test_df["ground_truth"].values.tolist()

In [None]:
answers = []
contexts = []

for question in test_questions:
  response = retrieval_augmented_qa_chain.invoke({"question" : question})
  answers.append(response["response"].content)
  contexts.append([context.page_content for context in response["context"]])

In [None]:
from datasets import Dataset

response_dataset = Dataset.from_dict({
    "question" : test_questions,
    "answer" : answers,
    "contexts" : contexts,
    "ground_truth" : test_groundtruths
})
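
Before building the dataset it can help to check that the four columns line up, since each row pairs one question with one answer, one list of context strings, and one ground truth. The `check_columns` helper below is a hypothetical sanity check written for this notebook's column names. It is not part of RAGAS or `datasets`, and the sample rows are placeholders.

```python
def check_columns(columns):
    """Raise ValueError if the evaluation columns are misaligned or malformed."""
    required = ("question", "answer", "contexts", "ground_truth")
    missing = [name for name in required if name not in columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    lengths = {name: len(columns[name]) for name in required}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"column lengths differ: {lengths}")
    for row_contexts in columns["contexts"]:
        # Each row holds a list of strings, one per retrieved chunk
        if not isinstance(row_contexts, list) or not all(
            isinstance(c, str) for c in row_contexts
        ):
            raise ValueError("each 'contexts' entry must be a list of strings")
    return lengths["question"]


# Placeholder rows standing in for the real test questions and chain outputs
sample_columns = {
    "question": ["q1", "q2"],
    "answer": ["a1", "a2"],
    "contexts": [["chunk 1", "chunk 2"], ["chunk 3"]],
    "ground_truth": ["g1", "g2"],
}
print(check_columns(sample_columns))
```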

In [None]:
response_dataset[0]

## Evaluation process

In [None]:
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    answer_correctness,
    context_recall,
    context_precision,
)

metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    answer_correctness,
]

In [None]:
results = evaluate(response_dataset, metrics, llm=primary_qa_llm,embeddings=embeddings)

In [None]:
results_df = results.to_pandas()
results_df

## Better implementation

In [None]:
from langchain import hub
retrieval_qa_prompt = hub.pull("rlm/rag-prompt", api_url="https://api.hub.langchain.com")

In [None]:
print(retrieval_qa_prompt.messages[0].prompt.template)

In [None]:
# Multi Query Retriever
from langchain.retrievers import MultiQueryRetriever
advanced_retriever = MultiQueryRetriever.from_llm(retriever=retriever, llm=primary_qa_llm)

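# Document chain
# create_retrieval_chain passes the user query under the "input" key,
# so the prompt must use {input} instead of {question}
from langchain.chains.combine_documents import create_stuff_documents_chain
document_prompt = ChatPromptTemplate.from_template(template.replace("{question}", "{input}"))
document_chain = create_stuff_documents_chain(primary_qa_llm, document_prompt)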

# Retrieval Chain
from langchain.chains import create_retrieval_chain
retrieval_chain = create_retrieval_chain(advanced_retriever, document_chain)
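
`MultiQueryRetriever` asks the LLM for several rewordings of the question, runs the base retriever once per rewording, and returns the unique union of the results. A minimal sketch of that merge step follows. `generate_variants` and `search` are hypothetical stubs with hard-coded behaviour, standing in for the LLM call and the vector-store lookup.

```python
def generate_variants(question):
    # Stub for the LLM call that would produce alternative phrasings
    return [
        f"What is meant by: {question}",
        f"Explain: {question}",
        f"Details about: {question}",
    ]


def search(query):
    # Stub retriever: hard-coded results keyed by the first word of the query
    fake_results = {
        "What": ["doc A", "doc B"],
        "Explain:": ["doc B", "doc C"],
        "Details": ["doc A", "doc D"],
    }
    return fake_results.get(query.split()[0], [])


def multi_query_retrieve(question):
    """Retrieve for every variant and keep each document once, in first-seen order."""
    seen = set()
    merged = []
    for variant in generate_variants(question):
        for doc in search(variant):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


print(multi_query_retrieve("What are reward coins?"))
```

The benefit is recall: a document that only matches one phrasing of the question still reaches the answer step, at the cost of extra LLM and retrieval calls.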

In [None]:
response = retrieval_chain.invoke({"input": "Que son los Leal Coins"})

In [None]:
response

In [None]:
import random
def chatbot_response(user_query, chatbot_name="Le A.I", company_name="Leal"):
  """
  This function simulates a friendly customer service chatbot conversation.

  Args:
      user_query: The user's question or input.
      chatbot_name: The name of the chatbot
      company_name: The name of your company

  Returns:
      None. The selected responses are printed to standard output.
  """
  # Define conversation flow
  prompts = [
      {
          "START_SEQ": True,
          "USER_QUERY": None,
          "RESPONSE A": f"Hi there! I'm {chatbot_name}, your friendly customer service assistant for {company_name}. How can I help you today?",
          "RESPONSE B": f"Great to see you! Is there anything I can assist you with on this day?"
      },
      {
          "USER_QUERY": None,
          "RESPONSE A": f"I understand you're having trouble with {user_query}. Let's see what we can do to fix that.",
          "RESPONSE B": f"It sounds like you're looking for information about {user_query}. I'm happy to help you find what you need."
      },
      {
          "USER_QUERY": None,
          "RESPONSE A": "Here are a few things you can try: [List your solutions here]. Let me know if any of these work!",
          "RESPONSE B": f"I can definitely walk you through the steps for {user_query}. Would you like me to do that?",
          "RESPONSE C": f"No problem! I've found some helpful resources on {user_query} that you might find useful: [List your resources here]."
      },
      {
          "USER_QUERY": None,
          "RESPONSE A": "No worries at all! We'll get this figured out together.",
          "RESPONSE B": "That's a great question! Let me see if I can find an answer for you.",
          "RESPONSE C": f"I apologize for any inconvenience this may have caused. Is there anything else I can do to assist you today?"
      },
      {
          "USER_QUERY": None,
          "RESPONSE A": "If the issue seems complex, you can offer to connect the user to a human agent.",
          "RESPONSE B": f"It seems like your situation might require a bit more personalized attention. Would you like me to connect you with one of our customer service representatives?"
      },
      {
          "USER_QUERY": None,
          "RESPONSE A": "I hope this information was helpful! Is there anything else I can help you with today?",
          "RESPONSE B": "Glad I could be of assistance! Have a wonderful day!"
      },
      {
          "END_SEQ": True,
          "USER_QUERY": None,
          "RESPONSE A": None,
          "RESPONSE B": None
      }
  ]

  # Loop through conversation prompts
  for prompt in prompts:
    # The END_SEQ entry carries no responses; it only marks the end of the flow
    if prompt.get("END_SEQ"):
      break
    if prompt.get("USER_QUERY") is None or prompt.get("USER_QUERY") == user_query:
      # Keep only the response variants that are actually defined for this step
      candidates = [prompt[key] for key in ("RESPONSE A", "RESPONSE B", "RESPONSE C") if prompt.get(key)]
      if candidates:
        print(random.choice(candidates))

# Example usage
chatbot_response("I'm having trouble logging in.")