# **Memory-Based RAG Using LangChain**

In this notebook we first build a RAG pipeline without a GUI. Once it works, we create a GUI application for it using `Streamlit`.

This RAG brings together all the concepts we have learned over the past few days.

----

### Loading the Documents

In [3]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('./Data/Automate the Boring Stuff with Python.pdf')

docs = loader.load()

In [4]:
len(docs)

505

### Splitting the Documents into Chunks

In [5]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)

chunks = splitter.split_documents(docs)
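
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal character-window sketch. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line boundaries, so its real chunk boundaries will differ. `sliding_chunks` is a hypothetical helper, not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int = 800, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new window starts this far after the last one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


# Illustrative input: 2000 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2000))
pieces = sliding_chunks(sample)
```

With these settings each window starts 600 characters after the previous one, so neighbouring chunks repeat 200 characters. That repetition is what keeps a sentence that straddles a boundary retrievable from either side.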

In [6]:
chunks[0]

Document(metadata={'source': './Data/Automate the Boring Stuff with Python.pdf', 'page': 0}, page_content='PRACTICAL PROGRAMMING  \nFOR TOTAL BEGINNERS\nAL SWEIGART\nAUTOMATE \nTHE BORING STUFF\nWITH PYTHON\nAUTOMATE \nTHE BORING STUFF\nWITH PYTHON\nSHELVE IN:\nPROGRAMMING LANGUAGES/PYTHON\n$29.95 ($34.95 CDN)\nwww.nostarch.com\nTH E  FINE S T I N G EEK  E NTE RTA I N M E NT™\nIf you’ve ever spent hours renaming files or updating\nhundreds of spreadsheet cells, you know how tedious \ntasks like these can be. But what if you could have  \nyour computer do them for you?\nminutes what would take you hours to do by hand—\nlearn how to use Python to write programs that do in \nIn Automate the Boring Stuff with Python, you’ll\nno prior programming experience required. Once\ncreate Python programs that effortlessly perform \nuseful and impressive feats of automation to:\n“ I LI E  FLAT.”')

In [7]:
len(chunks)

1646

### Storing the Data in a Vector Store (FAISS)

In [8]:
# cohere_api_key = os.environ['COHERE_API_KEY']  # not used below; keep keys out of the notebook

In [9]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

In [10]:
import warnings
warnings.filterwarnings('ignore')
embadding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

In [9]:
vectorstore = FAISS.from_documents(chunks, embadding)

In [10]:
vectorstore.save_local('FAISS')

In [11]:
vectorstore.similarity_search('What is Python?')

[Document(id='439da61b-0d46-4366-a0c8-b9cd4a357501', metadata={'source': './Data/Automate the Boring Stuff with Python.pdf', 'page': 27}, page_content='4   Introduction\nWhat Is Python?\nPython refers to the Python programming language (with syntax rules for \nwriting what is considered valid Python code) and the Python interpreter \nsoftware that reads source code (written in the Python language) and per-\nforms its instructions. The Python interpreter is free to download from \nhttp://python.org/, and there are versions for Linux, OS X, and Windows. \nThe name Python comes from the surreal British comedy group Monty \nPython, not from the snake. Python programmers are affectionately called \nPythonistas, and both Monty Python and serpentine references usually pep-\nper Python tutorials and documentation.\nProgrammers Don’t Need to Know Much Math\nThe most common anxiety I hear about learning to program is that people'),
 Document(id='96c7e59f-a011-4d51-9135-cde507e10d94', metadata={'
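
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors lie closest to it (four by default). The sketch below shows that idea with hand-written 3-dimensional vectors and Euclidean distance. The vectors, texts, and the `top_k` helper are made up for illustration; a real FAISS index works on the embedding model's vectors with optimized data structures.

```python
import math


def top_k(query_vec, store, k=4):
    """Return the k texts whose vectors are closest to query_vec (Euclidean distance)."""
    ranked = sorted(store, key=lambda item: math.dist(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy "vector store": (text, embedding) pairs with invented values.
store = [
    ("What Is Python?", [0.9, 0.1, 0.0]),
    ("Installing modules", [0.2, 0.8, 0.1]),
    ("Regular expressions", [0.0, 0.3, 0.9]),
]
nearest = top_k([1.0, 0.0, 0.0], store, k=2)
```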

In [11]:
# The saved index includes a pickle file, so only load indexes you created yourself.
vectorstore = FAISS.load_local('FAISS', embadding, allow_dangerous_deserialization=True)

In [12]:
retriver = vectorstore.as_retriever()

### ChatModels and Messages

In [13]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_groq import ChatGroq
from langchain_core.output_parsers import StrOutputParser
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.history_aware_retriever import create_history_aware_retriever
from langchain.chains.retrieval import create_retrieval_chain
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI

In [14]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [20]:
import os

# Read the key from the environment instead of hard-coding it in the notebook.
model = ChatGoogleGenerativeAI(api_key=os.environ['GOOGLE_API_KEY'], model='gemini-1.5-flash', temperature=0.5)
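
API keys are safer outside the notebook, for example in an environment variable. A small helper can make a missing key fail with a clear message before any model call is made. `require_env` is a hypothetical name, and `GOOGLE_API_KEY` is assumed to be the variable you exported.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this cell")
    return value


# Usage (assumes the variable is set in your shell):
# api_key = require_env("GOOGLE_API_KEY")
```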

In [21]:
contextualize_q_system_prompt = """Given a chat history and the latest user question \
which might reference context in the chat history, formulate a standalone question \
which can be understood without the chat history. Do NOT answer the question, \
just reformulate it if needed and otherwise return it as is."""
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    model, retriver, contextualize_q_prompt
)
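
The history-aware retriever can be pictured as a two-branch function. With no chat history, the question goes straight to the retriever. Otherwise the model first rewrites it into a standalone question, and that rewritten text is what gets searched. The sketch below mimics this flow in plain Python; `fake_rewrite` and `fake_retrieve` are stand-ins for the LLM call and the vector-store retriever, not LangChain functions.

```python
def history_aware_retrieve(question, chat_history, rewrite, retrieve):
    """Retrieve documents, rewriting the question first when history exists."""
    if not chat_history:
        # Nothing to resolve: the question is already standalone.
        return retrieve(question)
    standalone = rewrite(chat_history, question)
    return retrieve(standalone)


def fake_rewrite(chat_history, question):
    # Stand-in for the LLM: resolve the pronoun using the earlier topic.
    return question.replace("it", "Python")


def fake_retrieve(query):
    # Stand-in for the retriever: report which query was searched.
    return [f"docs for: {query}"]


first = history_aware_retrieve("What is Python?", [], fake_rewrite, fake_retrieve)
followup = history_aware_retrieve(
    "Who created it?", [("human", "What is Python?")], fake_rewrite, fake_retrieve
)
```

The follow-up question alone ("Who created it?") would retrieve poorly because "it" carries no meaning for the embedding model. Rewriting before retrieval is what lets the chain find relevant chunks for it.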


system_prompt2 = """
You are a highly professional Python developer, teacher, and consultant.
You have 20+ years of experience building many different types of projects, specifically in Python.
You also have 10+ years of experience teaching the Python programming language.
Your job is to help students resolve their doubts about any type of project related to the Python programming language.
Listen to their problems and give short, accurate solutions based on the context.
Guide them only on what they ask for. Don't add your personal opinions unless the user asks you to.
Your job is simply to help the user, not to impose your own decisions and observations.
{context}
"""

prompt2 = ChatPromptTemplate.from_messages(
    [
        ('system', system_prompt2),
        MessagesPlaceholder(variable_name='chat_history'),
        ('human', '{input}')
    ]
)


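# Stuff the retrieved chunks into {context}, then answer with the model.
question_answer_chain = create_stuff_documents_chain(model, prompt2)
# Retrieve with the history-aware retriever, then pass the results to the QA chain.
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)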
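
"Stuff" here means that every retrieved chunk is concatenated and placed into the prompt's `{context}` slot in one go. A rough plain-Python sketch of that step follows. `stuff_context` and the template text are illustrative names rather than LangChain API, and the blank-line separator is an assumption that matches the `format_docs` helper above.

```python
def stuff_context(system_template: str, page_contents: list[str], separator: str = "\n\n") -> str:
    """Join all retrieved chunk texts and substitute them for {context}."""
    context = separator.join(page_contents)
    return system_template.format(context=context)


template = "Answer using only this context:\n{context}"
filled = stuff_context(template, ["Python is a language.", "Lists hold items."])
```

Because everything is sent in a single prompt, the number of retrieved chunks multiplied by the chunk size has to fit within the model's context window.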

In [22]:
from langchain_core.messages import AIMessage, HumanMessage

chat_history = []

question = "What is Python?"
ai_msg_1 = rag_chain.invoke({"input": question, "chat_history": chat_history})
# Store the answer as an AIMessage so the model can tell its own replies from the user's.
chat_history.extend([HumanMessage(content=question), AIMessage(content=ai_msg_1["answer"])])

In [25]:
question2 = "Give me a small project of python which include function loops and data type"
ai_msg_2 = rag_chain.invoke({"input": question2, "chat_history": chat_history})
chat_history.extend([HumanMessage(content=question2), AIMessage(content=ai_msg_2["answer"])])
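
The invoke-then-extend pattern repeats on every turn, so it can be wrapped in one function. The sketch below is a hypothetical helper tested against a stub chain. It records turns as `(role, content)` tuples, on the assumption that the chat prompt accepts that shorthand; swap in message objects if you prefer to keep the history uniform.

```python
def ask(chain, question, chat_history):
    """Run one turn and record both sides of it in chat_history."""
    result = chain.invoke({"input": question, "chat_history": chat_history})
    chat_history.append(("human", question))
    chat_history.append(("ai", result["answer"]))
    return result["answer"]


class EchoChain:
    """Stub standing in for rag_chain: reports how much history it received."""

    def invoke(self, payload):
        seen = len(payload["chat_history"])
        return {"answer": f"{seen} earlier messages; you asked: {payload['input']}"}


history = []
a1 = ask(EchoChain(), "What is Python?", history)
a2 = ask(EchoChain(), "Show me a loop.", history)
```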

In [28]:
ai_msg_2["answer"]

'This project calculates the average of a list of numbers entered by the user.  It demonstrates functions, loops, and data types (specifically lists and floats).\n\n```python\ndef get_numbers():\n    """Gets a list of numbers from the user."""\n    numbers = []\n    while True:\n        try:\n            num_str = input("Enter a number (or \'done\' to finish): ")\n            if num_str.lower() == \'done\':\n                break\n            num = float(num_str)  #Data Type Conversion (String to Float)\n            numbers.append(num)\n        except ValueError:\n            print("Invalid input. Please enter a number or \'done\'.")\n    return numbers\n\ndef calculate_average(numbers):\n    """Calculates the average of a list of numbers."""\n    if not numbers:\n        return 0  # Handle empty list to avoid ZeroDivisionError\n    total = sum(numbers) #Loop Implicit in sum() function\n    average = total / len(numbers)\n    return average\n\nif __name__ == "__main__":\n    numbers = 