# Assignment 5: RAG (GPT MINI)

In [1]:
from langchain.chat_models.ollama import ChatOllama
from langchain.chat_models.openai import ChatOpenAI
from langchain.document_loaders.unstructured import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.ollama import OllamaEmbeddings
from langchain.embeddings.cache import CacheBackedEmbeddings
from langchain.vectorstores.faiss import FAISS
from langchain.storage import LocalFileStore
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda
from langchain.memory import ConversationBufferMemory


# In testing, OpenAI performed better for both embeddings and LLM quality.
# Ollama was also slower than OpenAI.
# The Ollama embeddings look badly wrong and I don't know why yet (needs a proper check; my own setup may be the problem).

# LLM_model, models = ["openai", "gpt-4o-mini-2024-07-18"]
LLM_model, models = ["ollama", "gemma2:latest"]

file_name = "document.txt"

llm = (
    ChatOllama(temperature=0.1, model=models)
    if LLM_model == "ollama"
    else ChatOpenAI(temperature=0.1, model=models)
)

loader = UnstructuredFileLoader(f"../files/{file_name}")
cache_dir = LocalFileStore(f"../.cache/embeddings/{LLM_model}/{models}/{file_name}")

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    separators=["\n\n", ".", "?", "!"],
    chunk_size=1000,
    chunk_overlap=100,
)

docs = loader.load_and_split(text_splitter=splitter)
embeddings = (
    OllamaEmbeddings(model=models) if LLM_model == "ollama" else OpenAIEmbeddings()
)

cached_embeddings = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir)

vectorstore = FAISS.from_documents(docs, cached_embeddings)

retriever = vectorstore.as_retriever()

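# ConversationBufferMemory stores messages verbatim and does not use an LLM,
# so no llm argument is passed here.
memory = ConversationBufferMemory(
    return_messages=True,
    memory_key="history",
)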


def load_memory(_):
    # The chain input is ignored; the full message history is returned on every call.
    return memory.load_memory_variables({})["history"]


prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
            You are an AI that reads documents for you. Please answer based on the document given below. 
            If the information is not in the document, answer the question with “The required information is not in the document.” Never make up answers. \n\n{context}
            """,
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

chain = (
    {
        "context": retriever,
        "question": RunnablePassthrough(),
        "history": RunnableLambda(load_memory),
    }
    | prompt
    | llm
)


def invoke_chain(question):
    """Run the chain on one question, save the turn to memory, and print the answer."""
    result = chain.invoke(question).content
    memory.save_context(
        {"input": question},
        {"output": result},
    )
    print(result)
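
The comments at the top of the cell note that the Ollama embeddings looked wrong. One way to check this, sketched below, is to confirm that a query vector is closer to a related text than to an unrelated one. The vectors here are made-up toy values and `related_pair_ranks_higher` is a hypothetical helper, not part of LangChain.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def related_pair_ranks_higher(query_vec, related_vec, unrelated_vec):
    """True if the related text is closer to the query than the unrelated one."""
    return cosine_similarity(query_vec, related_vec) > cosine_similarity(
        query_vec, unrelated_vec
    )


# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 1.0]
related = [0.9, 0.1, 0.8]
unrelated = [0.0, 1.0, 0.0]
print(related_pair_ranks_higher(query, related, unrelated))
```

In the notebook, the three vectors would come from `embeddings.embed_query(text)` for a question, a passage that answers it, and an unrelated passage. If the check fails often for obvious pairs, the embedding model or its configuration is the likely culprit rather than the chain.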

In [3]:
invoke_chain("Is Aaronson guilty?")

You're absolutely right to be skeptical!  The text makes it very clear that in this world, truth is malleable and power dictates reality. 

**We cannot know if Aaronson is guilty based on the information provided.** The Party's methods rely on manipulation, torture, and the rewriting of history.  Any "evidence" against Aaronson is likely fabricated or coerced. 

It's important to remember that in *Nineteen Eighty-Four*, the Party's goal is to control its citizens through fear and manipulation.  They use accusations and "evidence" to create a world where people are constantly suspicious of each other and afraid to speak out. 



In [4]:
invoke_chain("What message did he write in the table?")

The text you provided doesn't mention what message Winston wrote in the table.  


Let me know if you have any other questions about *Nineteen Eighty-Four* or need help understanding other parts of the story! 
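
When an answer like this comes back empty, it helps to look at what actually reaches `{context}`. The chain passes the retriever's list of documents straight into the prompt, so the chunks are most likely rendered as the list's Python repr rather than as plain text. Separately, the retriever only sees the current question string, so the pronoun "he" is not resolved from the history before retrieval. A minimal sketch of a joining helper follows; `Doc` is a hypothetical stand-in for the retrieved document objects, and only its `page_content` attribute is assumed.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""

    page_content: str


def format_docs(docs):
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


chunks = [Doc("First retrieved chunk."), Doc("Second retrieved chunk.")]
context = format_docs(chunks)
print(context)
```

In the chain, a helper like this could be wired in as `"context": retriever | RunnableLambda(format_docs)`. Whether that changes the answer to this particular question was not tested here.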



In [5]:
invoke_chain("Who is Julia?")

Julia is a young woman who Winston falls in love with during his rebellion against the Party in *Nineteen Eighty-Four*. She is described as being rebellious, independent, and passionate. Unlike Winston, Julia is more focused on enjoying the present moment and finding pleasure in small acts of defiance against the Party. 

Here are some key things to know about Julia:

* **She is a member of the Junior Anti-Sex League:** This is ironic because she is actually very sexually liberated and enjoys physical intimacy with Winston.
* **She is resourceful and practical:** She helps Winston hide their relationship and find ways to express their rebellion.
* **She is more cynical than Winston:** While Winston believes in a larger revolution, Julia is more focused on personal freedom and happiness.
* **She is captured and tortured by the Party:** This ultimately leads to her breaking and betraying Winston.


Julia represents a different kind of resistance to the Party. While Winston seeks to overt

In [6]:
invoke_chain("What was the first question I asked?")

invoke_chain("What was the second question I asked?")

invoke_chain("What was the third question I asked?")

The first question you asked was:  "Is Aaronson guilty?" 


Let me know if you have any other questions about *Nineteen Eighty-Four* or anything else!
The second question you asked was: "What message did he write in the table?" 

The third question you asked was: "Who is Julia?" 
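
The model can recall earlier questions because `invoke_chain` saves every turn and the whole history is replayed into the prompt through `MessagesPlaceholder`. The sketch below imitates that pattern in plain Python. `BufferMemory` and `nth_question` are hypothetical stand-ins written for illustration, not the LangChain classes.

```python
class BufferMemory:
    """Minimal stand-in for a conversation buffer (not the LangChain class)."""

    def __init__(self):
        self.messages = []  # (role, content) tuples in the order they were saved

    def save_context(self, question, answer):
        """Append one human/AI turn to the buffer."""
        self.messages.append(("human", question))
        self.messages.append(("ai", answer))

    def load(self):
        """Return a copy of the full history."""
        return list(self.messages)


def nth_question(memory, n):
    """Return the n-th human message (1-indexed), or None if it does not exist."""
    questions = [content for role, content in memory.load() if role == "human"]
    return questions[n - 1] if 1 <= n <= len(questions) else None


memory_sketch = BufferMemory()
memory_sketch.save_context("Is Aaronson guilty?", "...")
memory_sketch.save_context("What message did he write in the table?", "...")
memory_sketch.save_context("Who is Julia?", "...")
print(nth_question(memory_sketch, 2))
```

Because nothing is summarized or trimmed, the prompt grows with every turn. That is fine for a short session like this one but would eventually exceed the model's context window.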



In [7]:
memory

ConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='Is Aaronson guilty?'), AIMessage(content='The provided text excerpts from George Orwell\'s *Nineteen Eighty-Four* heavily imply that the Party, led by O\'Brien, uses torture and manipulation to extract confessions and create a reality where truth is irrelevant. \n\n**Therefore, based on the text, we cannot determine if Aaronson is actually guilty.** The Party\'s methods suggest that any accusation, no matter how baseless, can be used to control and punish individuals. \n\nHere\'s why we can\'t trust the information presented:\n\n* **O\'Brien\'s words:** He states that the Party seeks power for its own sake and that the object of power is power. This suggests that justice and truth are secondary to maintaining control.\n* **Torture:** The text describes Winston being tortured by O\'Brien. This implies that any confession extracted under duress is unreliable.\n* **Control of reality:** O\'Brien claims