### Conversational Q&A RAG
1. A simple RAG app does not take previous questions into account (it has no chat memory).
2. A conversational RAG app keeps some form of "memory" of past questions and answers, plus logic for incorporating them into its current reasoning.

- This notebook focuses on the logic for incorporating historical messages, using two approaches: 1. chains, 2. agents.

In [80]:
# load environment variables
import os
from dotenv import load_dotenv
load_dotenv()

True

In [81]:
# use Groq as the LLM provider
from langchain_groq import ChatGroq

groq_api_key = os.getenv("GROQ_API_KEY")
llm = ChatGroq(model="llama3-8b-8192", groq_api_key=groq_api_key, max_retries=5)

In [82]:
# import modules
import bs4  # HTML parsing for the web loader
from langchain_community.document_loaders import WebBaseLoader  # load web page content
from langchain.text_splitter import RecursiveCharacterTextSplitter  # split text into chunks
from langchain_chroma import Chroma  # vector store for the embeddings
from langchain_core.prompts import ChatPromptTemplate  # define the prompt for the LLM
from langchain.chains import create_retrieval_chain  # build the retrieval chain
from langchain.chains.combine_documents import create_stuff_documents_chain  # pass a list of Documents to a model

# free embeddings using Hugging Face models
# os.environ values must be strings, so only set HF_TOKEN when it is actually defined
hf_token = os.getenv("HF_TOKEN")
if hf_token:
    os.environ["HF_TOKEN"] = hf_token
from langchain_huggingface import HuggingFaceEmbeddings  # embed text as vectors
HF_embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")




In [105]:
# 1. Load, chunk and index the contents of the blog to create a retriever.
loader = WebBaseLoader(
    web_paths=("https://medium.com/@prateekgaurav/nlp-zero-to-hero-part-1-introduction-bow-tf-idf-word2vec-c1b11ed77a2#:~:text=Word2Vec%20vs.,in%20a%20dense%20vector%20space.",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title","post-header")
        )
    ),
)
docs = loader.load()

In [106]:
# split, embed, and store in the vector store
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

splitted_docs = text_splitter.split_documents(docs)

# index the split chunks, not the unsplit docs, so retrieval returns focused passages
vectorstore = Chroma.from_documents(documents=splitted_docs, embedding=HF_embedding)

retriever = vectorstore.as_retriever()
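
To make the `chunk_size` and `chunk_overlap` settings above concrete, here is a minimal sketch of fixed-size character windows with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); `sliding_chunks` is a hypothetical helper used only for illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring windows share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.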


In [107]:
# prompts
# the system prompt tells the LLM how it should behave
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system",system_prompt),
        ("human","{input}")
    ])


In [108]:
# chains
question_answer_chain = create_stuff_documents_chain(llm, prompt)
# the prompt expects {context}: create_stuff_documents_chain combines all the docs into it and sends the prompt to the LLM

rag_chain = create_retrieval_chain(retriever, question_answer_chain)

# the retrieval chain retrieves docs, then passes them (as context) together with the input to question_answer_chain, which prompts the LLM
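
The "stuff" strategy described in the comments above can be sketched without any LangChain objects: join the text of every retrieved document and substitute it for `{context}` in the system prompt. This is an approximation for illustration only; `stuff_documents`, the shortened template, and the sample texts are hypothetical, and the double-newline separator is an assumption.

```python
SYSTEM_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)


def stuff_documents(doc_texts: list[str], question: str) -> list[tuple[str, str]]:
    """Build (role, content) messages with all documents stuffed into one prompt."""
    context = "\n\n".join(doc_texts)  # separator is an assumption
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


# Made-up stand-ins for the texts a retriever would return.
retrieved = [
    "TF-IDF weights a term by how rare it is across documents.",
    "Bag-of-Words counts how often each word occurs.",
]
messages = stuff_documents(retrieved, "what is tf-idf")
for role, content in messages:
    print(f"[{role}] {content}")
```

Because every retrieved document goes into a single prompt, this approach only works while the combined text fits in the model's context window.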

In [125]:
# output from the LLM
response = rag_chain.invoke({"input": "What is LLM?"})
response["answer"]
# the web page does not contain anything about LLMs

"I don't know."

In [124]:
# the web page only covers text representations (BoW, TF-IDF, Word2Vec)
response_1 = rag_chain.invoke({"input":"what is tf-idf"})
response_1

{'input': 'what is tf-idf',
 'context': [Document(metadata={'description': 'Natural Language Processing (NLP) has become an integral part of various industries, including healthcare, finance, and e-commerce, to name a few. NLP is a subfield of artificial intelligence (AI)…', 'language': 'en', 'source': 'https://medium.com/@prateekgaurav/nlp-zero-to-hero-part-1-introduction-bow-tf-idf-word2vec-c1b11ed77a2#:~:text=Word2Vec%20vs.,in%20a%20dense%20vector%20space.', 'title': 'NLP: Zero To Hero [Part 1: Introduction, BOW, TF-IDF & Word2Vec] | by Prateek Gaurav | Medium'}, page_content="NLP: Zero To Hero [Part 1: Introduction, BOW, TF-IDF & Word2Vec] | by Prateek Gaurav | MediumOpen in appSign upSign inWriteSign upSign inNLP: Zero To Hero [Part 1: Introduction, BOW, TF-IDF & Word2Vec]Prateek Gaurav·Follow10 min read·Mar 23, 2023--1ListenShareLink to Part 2 of this article:NLP: Zero To Hero [Part 2: Vanilla RNN, LSTM, GRU & Bi-Directional LSTM]Link to Part 3 of this article:NLP: Zero To Hero [

### ADDING CHAT HISTORY
- The app needs to remember previously asked questions.
- Example: Human: "What is TF-IDF?" -> AI: gives an answer -> Human: "Tell me more about it." Here "it" refers to TF-IDF, and the model needs to capture this.
- Before: [query -> retriever]. Now: [(query, conversation history) -> LLM -> rephrased query -> retriever].
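
The flow in the last bullet can be sketched in plain Python with stand-ins for the LLM and the retriever. All names here (`history_aware_retrieve`, `fake_rephrase`, `fake_retrieve`) are hypothetical; the empty-history shortcut mirrors the documented behaviour of `create_history_aware_retriever`, which passes the input straight to the retriever when there is no chat history.

```python
from typing import Callable

Message = tuple[str, str]  # (role, content), e.g. ("human", "what is tf-idf?")


def history_aware_retrieve(
    query: str,
    chat_history: list[Message],
    rephrase: Callable[[str, list[Message]], str],
    retrieve: Callable[[str], list[str]],
) -> list[str]:
    """Rewrite the query into a standalone question when history exists, then retrieve."""
    if not chat_history:
        # Nothing to resolve against: send the query straight to the retriever.
        return retrieve(query)
    standalone = rephrase(query, chat_history)
    return retrieve(standalone)


# Stand-ins for the LLM and the vector store (hypothetical, for illustration only).
def fake_rephrase(query: str, history: list[Message]) -> str:
    return query.replace(" it ", " TF-IDF ")


seen_queries: list[str] = []


def fake_retrieve(query: str) -> list[str]:
    seen_queries.append(query)
    return [f"document matching: {query}"]


history = [("human", "what is tf-idf?"), ("ai", "TF-IDF is a term-weighting scheme.")]
docs_found = history_aware_retrieve(
    "How does it differ from BoW?", history, fake_rephrase, fake_retrieve
)
print(seen_queries[-1])
```

The key point is that the retriever never sees the pronoun: it searches with the rewritten, self-contained question.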


In [126]:
# update the existing app in two places: 1. the prompt and 2. question contextualization

from langchain.chains import create_history_aware_retriever # Create a chain that takes conversation history and returns documents.
from langchain_core.prompts import MessagesPlaceholder # to pass a list of messages to prompt using chat history input

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),  # chat_history supplies the list of past messages
        ("human", "{input}"),
    ]
)
# upgrade the retriever to a history-aware one: it takes input and chat_history, uses the LLM and the prompt above to rephrase the question, then queries the original retriever
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

# reuse the previous system prompt, which takes its context from the web page
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

# question_answer_chain now expects the input keys context, chat_history, and input
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

# this chain applies history_aware_retriever and question_answer_chain in sequence
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
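
As a rough picture of what `qa_prompt` produces, the placeholder expands to the past turns, placed between the system message and the newest question. The sketch below uses plain `(role, content)` tuples instead of LangChain message objects; `build_messages` and the sample strings are hypothetical.

```python
def build_messages(
    system_text: str,
    chat_history: list[tuple[str, str]],
    user_input: str,
) -> list[tuple[str, str]]:
    """Place prior turns between the system message and the newest question."""
    return [("system", system_text), *chat_history, ("human", user_input)]


past_turns = [
    ("human", "what is tf-idf?"),
    ("ai", "TF-IDF is a term-weighting scheme."),
]
first_turn = build_messages("Answer from the context.", [], "what is tf-idf?")
later_turn = build_messages("Answer from the context.", past_turns, "How does it differ from BoW?")

for role, content in later_turn:
    print(f"[{role}] {content}")
```

On the first turn the history is empty, so the model sees only the system message and the question.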


In [127]:
# the prompts declare a chat_history variable, so we must collect the messages ourselves and pass them in
from langchain_core.messages import AIMessage, HumanMessage

# list that accumulates the chat history
chat_history = []

first_question = "what is tf-idf?"
ai_msg_1 = rag_chain.invoke({"input": first_question, "chat_history": chat_history})

# append this turn (question and answer) to chat_history
chat_history.extend(
    [
        HumanMessage(content=first_question),
        AIMessage(content=ai_msg_1["answer"]),
    ]
)

second_question = "What are differences between it and Bow?"
ai_msg_2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history})

print(ai_msg_2["answer"])
# the model correctly resolves "it" in second_question to TF-IDF

Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) are two popular techniques used in natural language processing (NLP) for feature extraction from text data. While both techniques are used to convert text data into numerical vectors, they differ in their approach and assumptions.

**Bag-of-Words (BoW)**

BoW is a simple and widely used technique that represents a document as a bag, or a set, of its word frequencies. BoW assumes that the order of words in a document is irrelevant and that the frequency of each word is sufficient to capture the document's meaning.

Here are the key characteristics of BoW:

* **Simple**: BoW is a straightforward and easy-to-implement technique.
* **Frequency-based**: BoW focuses solely on the frequency of each word in a document.
* **Lack of context**: BoW ignores the context in which words appear in a document.
* **No consideration of rarity**: BoW does not consider the rarity of words across the corpus.

**Term Frequency-Inverse 