Implement a complete RAG pipeline with a Stuff Documents chain.
You must implement the chain manually.
Give a ConversationBufferMemory to the chain.
Use this document to perform RAG: https://gist.github.com/serranoarevalo/5acf755c2b8d83f1707ef266b82ea223
Ask the following questions to the chain:
Is Aaronson guilty?
What message did he write in the table?
Who is Julia?


In [1]:
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StreamingStdOutCallbackHandler

model = ChatOpenAI(
    temperature=0.1, streaming=True, callbacks=[StreamingStdOutCallbackHandler()]
)

In [2]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    return_messages=True,
    memory_key="chat_history",
)


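def load_memory(_):
    # The chain input is ignored; the full buffered history is always returned.
    return memory.load_memory_variables({})["chat_history"]


def invoke_chain_and_save_memory(chain, question):
    # Run the chain, then record the question/answer pair so later turns can see it.
    result = chain.invoke(question)
    memory.save_context(
        {"input": question},
        {"output": result.content},
    )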
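
The two helpers above only need a memory object that appends each turn and hands the whole history back. As a rough illustration of that contract, here is a minimal pure-Python stand-in. `TinyBufferMemory` is a hypothetical name, not a LangChain class, and it stores plain tuples rather than message objects.

```python
# Hypothetical stand-in for a buffer memory (not the LangChain implementation).
# It appends every turn and returns the full list, oldest first.
class TinyBufferMemory:
    def __init__(self):
        self.messages = []  # list of (role, text) tuples

    def save_context(self, inputs, outputs):
        # Store the human question followed by the model answer.
        self.messages.append(("human", inputs["input"]))
        self.messages.append(("ai", outputs["output"]))

    def load_memory_variables(self, _):
        # Return a copy so callers cannot mutate the stored history.
        return {"chat_history": list(self.messages)}


tiny_memory = TinyBufferMemory()
tiny_memory.save_context(
    {"input": "Who is Julia?"},
    {"output": "A character in the novel."},
)
history = tiny_memory.load_memory_variables({})["chat_history"]
print(history)
```

Because nothing is ever trimmed, the history grows with every turn, which is acceptable for a short session like this one.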

In [3]:
from langchain.document_loaders import UnstructuredFileLoader
from langchain.embeddings import CacheBackedEmbeddings, OpenAIEmbeddings
from langchain.storage import LocalFileStore
from langchain.text_splitter import CharacterTextSplitter

# Load the source document from disk.
loader = UnstructuredFileLoader("./asset/4th/document.txt")
load_docs = loader.load()

# Split on newlines into chunks of about 300 tokens with a 50-token overlap.
splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=300,
    chunk_overlap=50,
)
split_docs = splitter.split_documents(documents=load_docs)

# Cache embeddings on disk so repeated runs do not re-embed the same chunks.
embeddings = OpenAIEmbeddings()
cache_dir = LocalFileStore("./asset/4th/.cache/")
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir)

Created a chunk of size 326, which is longer than the specified 300
Created a chunk of size 319, which is longer than the specified 300
Created a chunk of size 317, which is longer than the specified 300
Created a chunk of size 303, which is longer than the specified 300
Created a chunk of size 304, which is longer than the specified 300
Created a chunk of size 312, which is longer than the specified 300
Created a chunk of size 326, which is longer than the specified 300
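
The warnings above are most likely harmless. The splitter only cuts at the separator, so a single line longer than `chunk_size` is emitted as one oversized chunk. The sketch below reproduces that behaviour in plain Python. It is a simplified illustration, not the library's code: `split_on_separator` is a hypothetical helper, it counts characters instead of tokens, and it ignores overlap.

```python
def split_on_separator(text, separator="\n", chunk_size=20):
    """Greedily pack separator-delimited pieces into chunks.

    A piece is never cut in the middle, so a piece longer than
    chunk_size ends up as a chunk that overshoots the limit.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either it still fits, or the chunk is empty and must take the piece.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = (
    "short line\n"
    "another one\n"
    "this single line is far longer than the limit\n"
    "end"
)
chunks = split_on_separator(sample, chunk_size=20)
oversized = [chunk for chunk in chunks if len(chunk) > 20]
print(oversized)
```

If oversized chunks become a problem, a splitter that falls back to finer separators is one option to consider.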


In [4]:
from langchain.vectorstores import FAISS

vectorstore = FAISS.from_documents(split_docs, cached_embeddings)
retriever = vectorstore.as_retriever()
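
Conceptually, the retriever embeds the question and returns the chunks whose vectors lie closest to it. The toy example below sketches that ranking step with hand-written three-dimensional vectors. The vectors, the `top_k` helper, and the use of cosine similarity are all assumptions made for illustration; the FAISS store may use a different distance metric internally.

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vector, indexed_chunks, k=2):
    # Rank (text, vector) pairs by similarity to the query, best first.
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical index: each chunk is paired with a made-up embedding.
toy_index = [
    ("chunk about Julia", [1.0, 0.0, 0.0]),
    ("chunk about Aaronson", [0.0, 1.0, 0.0]),
    ("chunk about the table", [0.0, 0.0, 1.0]),
]
best = top_k([0.9, 0.1, 0.0], toy_index, k=2)
print(best)
```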

In [21]:
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

prompt = ChatPromptTemplate(
    messages=[
        SystemMessagePromptTemplate.from_template(
            """
            Use the following reference document to answer the human's question. \n
            If you don't know the answer, just say that you don't know. Don't try to make up an answer.\n
            \n
            [reference document]\n
            {context}\n\n
            """
        ),
        SystemMessagePromptTemplate.from_template(
            """
            Below is the chat history\n
            {memory}\n\n
            """
        ),
        HumanMessagePromptTemplate.from_template("{question}"),
    ]
)

In [22]:
from langchain.schema.runnable import RunnableLambda, RunnablePassthrough


def retrieve_docs(question):
    # "Stuff" step: fetch the relevant chunks and join them into one context string.
    docs = retriever.invoke(question)
    return "\n".join(doc.page_content for doc in docs)


final_chain = (
    {
        "context": RunnableLambda(retrieve_docs),
        "question": RunnablePassthrough(),
        "memory": RunnableLambda(load_memory),
    }
    | prompt
    | model
)
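
The dictionary at the head of the chain passes the same question to each of its values and collects the results under the matching keys, which then fill the prompt variables. A plain-Python sketch of that fan-out, with stand-in functions instead of the real retriever and memory, might look like this. `run_parallel` and `fake_chunks` are hypothetical names used only here.

```python
def run_parallel(steps, value):
    """Call every step with the same input and collect the results by key."""
    return {key: step(value) for key, step in steps.items()}


# Stand-in for the chunks a retriever would return.
fake_chunks = ["Chunk one.", "Chunk two."]

steps = {
    # "Stuff" strategy: every retrieved chunk is joined into one context string.
    "context": lambda question: "\n".join(fake_chunks),
    # The question is passed through unchanged.
    "question": lambda question: question,
    # The memory loader ignores its input; the history is empty on the first turn.
    "memory": lambda _: [],
}

prompt_inputs = run_parallel(steps, "Who is Julia?")
print(prompt_inputs)
```

Stuffing every chunk into a single prompt is the simplest strategy and works as long as the combined context fits in the model's context window.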

In [23]:
invoke_chain_and_save_memory(chain=final_chain, question="Is Aaronson guilty?")

Yes, according to the reference document, Jones, Aaronson, and Rutherford were guilty of the crimes they were charged with.

In [24]:
invoke_chain_and_save_memory(
    chain=final_chain,
    question="What message did he write in the table? (State the name of the person I referred to as 'he'.)",
)

The message that Winston wrote on the table was "FREEDOM IS SLAVERY."

In [25]:
invoke_chain_and_save_memory(chain=final_chain, question="Who is Julia?")

Julia is a character in the novel referenced in the document. She is a significant figure in the protagonist's life and plays a crucial role in the story.

In [26]:
invoke_chain_and_save_memory(
    chain=final_chain, question="What was my previous question?"
)

Your previous question was "Who is Julia?"