In [6]:
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.embeddings import CacheBackedEmbeddings, OpenAIEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.storage import LocalFileStore
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

llm = ChatOpenAI(
    temperature=0.1,
)
# Cache store for embeddings (only needed by the disabled cached_embeddings line below)
cache_dir = LocalFileStore("./.cache/")

# Text splitter; the earlier token-based splitter is kept commented out for reference
# splitter = CharacterTextSplitter.from_tiktoken_encoder(
#     separator="\n",
#     chunk_size=600,
#     chunk_overlap=100,
# )
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # use larger document chunks
    chunk_overlap=200,  # keep more shared context between neighboring chunks
)

# Load the document and split it into chunks
loader = UnstructuredFileLoader("./files/document.txt")
docs = loader.load_and_split(text_splitter=splitter)

# Embedding model
embeddings = OpenAIEmbeddings()
# cached_embeddings = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir)

# Build the vector store
# vectorstore = FAISS.from_documents(docs, cached_embeddings)
vectorstore = FAISS.from_documents(docs, embeddings)

# Retriever over the vector store
retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer questions using only the following context.\n"
            "You MUST summarize the relevant information and analyze it logically before answering.\n"
            "DO NOT copy the context verbatim.\n"
            "Instead, explain the key points from the context and provide a well-reasoned answer.\n"
            "If the context is unclear or contradictory, explain why and state possible interpretations.\n"
            "Your response should be thoughtful, critical, and structured in a clear and concise manner:\n\n{context}"
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

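# Conversation memory; input_key tells it which chain input to store as the human turn
memory = ConversationBufferMemory(
    memory_key="history",
    return_messages=True,
    input_key="question",
)


def load_memory(_):
    """Return the stored history. Unused here: LLMChain loads memory on its own."""
    return {"history": memory.load_memory_variables({})["history"]}


chain = LLMChain(
    llm=llm,
    prompt=prompt,
    memory=memory,  # attach the memory directly to the chain
)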

# Retrieve documents for the query and join them into one context string
def get_context(query):
    retrieved_docs = retriever.invoke(query)
    if not retrieved_docs:
        print("⚠️ No documents were retrieved.")
        return "I don't know."

    context_text = "\n\n".join(doc.page_content for doc in retrieved_docs)
    # print(f"Retrieved context:\n{context_text[:500]}...")  # preview the first 500 characters
    return context_text


# Ask a question; the chain's memory supplies `history` and records each turn
def ask_question(question):
    context = get_context(question)
    # history = memory.load_memory_variables({})["history"]  # manual alternative: load earlier turns

    response = chain.invoke({
        "context": context,
        # "history": history,  # manual alternative: pass the history explicitly
        "question": question,
    })
    # memory.save_context({"question": question}, {"response": response})  # manual alternative: save the turn

    return response

response1 = ask_question("Describe Victory Mansions")
print("📢 AI response:", response1["text"])

📢 AI response: Victory Mansions is a dilapidated apartment building where Winston resides. The building is described as run-down and shabby, with a sense of neglect evident in its appearance. The living conditions are poor, reflecting the oppressive and austere environment of the society depicted in the novel. The residents, like Winston, live in a state of constant surveillance and fear under the watchful eye of Big Brother. The atmosphere is bleak and oppressive, mirroring the oppressive regime under which the characters live.
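
The `chunk_size` and `chunk_overlap` arguments above control how much text each chunk holds and how much neighboring chunks share. As a rough illustration only, the sketch below uses a fixed sliding window. `split_with_overlap` is a hypothetical helper, not a LangChain function, and `RecursiveCharacterTextSplitter` itself splits on separators such as paragraph breaks rather than at fixed offsets.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where neighbors share chunk_overlap characters.

    Hypothetical helper for illustration; not how LangChain's splitters work internally.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
        start += step
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap means a sentence that straddles a chunk boundary is more likely to appear whole in at least one chunk, at the cost of embedding more duplicated text.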


In [7]:
response1 = ask_question("Aaronson 은 유죄인가요?")  # "Is Aaronson guilty?"
print("📢 AI response:", response1["text"])

📢 AI response: Aaronson, along with Jones and Rutherford, were declared guilty of the crimes they were charged with. However, the protagonist, Winston, later questions the validity of this information, suggesting that the photograph disproving their guilt may have been fabricated by the authorities. This ambiguity raises doubts about the true guilt or innocence of Aaronson and the others.


In [8]:
response1 = ask_question("그런데 실제로 유죄인가요?")  # "But is he actually guilty?"
print("📢 AI response:", response1["text"])

📢 AI response: The text does not provide a definitive answer regarding whether Aaronson and his associates were actually guilty of the crimes they were accused of. The protagonist, Winston, questions the authenticity of the evidence against them, suggesting that it may have been manipulated by the authorities. This ambiguity leaves their true guilt or innocence open to interpretation, as the narrative does not conclusively confirm their actual culpability.


In [9]:
response1 = ask_question("Winston 은 Aaronson의 유죄를 어떻게 생각했나요?")  # "What did Winston think about Aaronson's guilt?"
print("📢 AI response:", response1["text"])

📢 AI response: Winston initially believed in the guilt of Aaronson and his associates based on the evidence presented by the Party. However, he later begins to question the authenticity of this evidence, suspecting that it may have been fabricated by the authorities. This shift in Winston's perspective indicates his growing skepticism towards the Party's version of events and raises doubts about the true guilt or innocence of Aaronson and the others.


In [10]:
response1 = ask_question("Winston은 처음 부터 그들의 유죄를 의심했나요?")  # "Did Winston doubt their guilt from the start?"
print("📢 AI response:", response1["text"])

📢 AI response: Initially, Winston did not doubt the guilt of Aaronson and his associates. He believed in the evidence presented by the Party and accepted their guilt without question. It was only later that Winston began to question the authenticity of the evidence and consider the possibility that it may have been manipulated by the authorities. This indicates a shift in Winston's perspective over time, from unquestioning acceptance to growing skepticism towards the Party's narrative.
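
The follow-up questions in the cells above never name Aaronson again, so they only make sense because the memory replays earlier turns into the `history` placeholder of the prompt. The sketch below shows that idea without any library. `BufferMemory` and `build_messages` are hypothetical names for illustration, not LangChain APIs, and the stored answer is a shortened placeholder rather than real model output.

```python
class BufferMemory:
    """Toy stand-in for a conversation buffer: keeps every turn verbatim."""

    def __init__(self):
        self.turns = []  # (role, text) tuples, oldest first

    def save_context(self, question, answer):
        # Record one exchange as a human turn followed by an AI turn.
        self.turns.append(("human", question))
        self.turns.append(("ai", answer))

    def load(self):
        # Return a copy so callers cannot mutate the stored history.
        return list(self.turns)


def build_messages(memory, context, question):
    """Assemble messages in the prompt's order: system, history, new human turn."""
    system = ("system", f"Answer using only the following context:\n\n{context}")
    return [system, *memory.load(), ("human", question)]


memory_sketch = BufferMemory()
memory_sketch.save_context(
    "Is Aaronson guilty?",
    "He was declared guilty, but Winston doubts it.",  # placeholder answer
)
messages = build_messages(memory_sketch, "(retrieved chunks)", "But is he actually guilty?")
for role, text in messages:
    print(f"{role}: {text}")
```

Because the earlier question and answer sit between the system message and the new question, the model can resolve "he" to Aaronson. Note that in this notebook the retriever still receives only the raw follow-up question, so the retrieved context does not benefit from the history.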


In [11]:
import logging

# Configure logging to print the bare message
logging.basicConfig(level=logging.INFO, format="%(message)s")


def log_message(text):
    logging.info(text)  # embedded newlines are emitted as-is


log_message("First line\nSecond line\nThird line")

First line
Second line
Third line
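
One plausible use of this logging setup is to replace the commented-out `print` preview of the retrieved context. The sketch below assumes that intent, which the notebook does not state. `log_context_preview` and the logger name `rag_sketch` are hypothetical, and the output is captured in a `StringIO` buffer only so the result can be inspected.

```python
import io
import logging

# Dedicated logger writing bare messages into an in-memory buffer.
logger = logging.getLogger("rag_sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the record out of the root logger's handlers
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)


def log_context_preview(context_text, limit=500):
    """Log at most `limit` characters of the context as one multi-line record."""
    preview = context_text[:limit]
    suffix = "..." if len(context_text) > limit else ""
    logger.info("Retrieved context:\n%s%s", preview, suffix)
    return preview


preview = log_context_preview("chunk one\n\nchunk two", limit=500)
logged = buffer.getvalue()
print(logged)
```

Passing the text as a logging argument rather than pre-formatting it means the string is only built when the INFO level is enabled.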
