## 6 RAG

Retrieval-Augmented Generation (RAG)

A technique that combines the user's question with their own documents in a single prompt.

<img src="https://python.langchain.com/assets/images/data_connection-95ff2033a8faa5f3ba41376c0f6dd32a.jpg">

Load: fetch the data from a source

Transform: split the loaded data into chunks

Embed: convert the data into vectors so that a computer can work with it easily

Store: save the vectors (Vector Store)

### 6.1 Data Loaders and Splitters

The first stage of RAG: retrieving the data

<https://python.langchain.com/docs/modules/data_connection/document_loaders/>

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter, CharacterTextSplitter

# splitter = RecursiveCharacterTextSplitter(
#     chunk_size=200,
#     chunk_overlap=50
# )

splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
    length_function=len,
)

loader = UnstructuredFileLoader("./files/chapter_one.pdf")

loader.load_and_split(text_splitter=splitter)
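
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping fixed-size chunking in plain Python. The function name `split_with_overlap` is hypothetical, and this is not LangChain's implementation: `CharacterTextSplitter` first splits on `separator` and then merges the pieces, so its chunk boundaries will differ.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk repeats the last two characters of the previous one, which is the point of the overlap: a sentence cut at a chunk boundary still appears whole in at least one chunk.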

### 6.2 Tiktoken
<https://platform.openai.com/tokenizer>

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)


loader = UnstructuredFileLoader("./files/chapter_one.docx")


### 6.3 Vectors

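Embedding: the process of converting human-readable text into a form that a computer can work with easily

Each document is converted into a vector.

In this exercise, each word is represented as a 3-dimensional vector.

OpenAI represents text with 1536-dimensional vectors.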

<https://turbomaze.github.io/word2vecjson/>

<https://www.youtube.com/watch?v=2eWuYf-aZE4>

| Word | Masculinity | Femininity | Royalty |
| --- | --- | --- | --- |
| king | 0.9 | 0.1 | 1.0 |
| queen | 0.1 | 0.9 | 1.0 |
| man | 0.9 | 0.1 | 0.0 |
| woman | 0.1 | 0.9 | 0.0 |
| royal = king - man | 0.0 | 0.0 | 1.0 |
| queen = royal + woman | 0.1 | 0.9 | 1.0 |
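
The arithmetic in the table can be checked with a few lines of plain Python. This is only a sketch over the toy 3-dimensional values above; the helper names `add` and `sub` are made up for illustration, and real embeddings are learned, not hand-assigned.

```python
# Toy 3-dimensional embeddings: (masculinity, femininity, royalty).
vectors = {
    "king": (0.9, 0.1, 1.0),
    "queen": (0.1, 0.9, 1.0),
    "man": (0.9, 0.1, 0.0),
    "woman": (0.1, 0.9, 0.0),
}


def add(a, b):
    # Element-wise sum, rounded to hide floating-point noise.
    return tuple(round(x + y, 6) for x, y in zip(a, b))


def sub(a, b):
    # Element-wise difference, rounded to hide floating-point noise.
    return tuple(round(x - y, 6) for x, y in zip(a, b))


royal = sub(vectors["king"], vectors["man"])
result = add(royal, vectors["woman"])

print(royal)
print(result == vectors["queen"])
```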

### 6.4 Vector Store

In [None]:
from langchain.embeddings import OpenAIEmbeddings

embedder = OpenAIEmbeddings()

vector = embedder.embed_documents([
    "hi",
    "how",
    "are",
    "you",
    "longer sentences because"
])

len(vector[4])

In [None]:
# Embed the documents and store them in Chroma
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.vectorstores import Chroma
from langchain.storage import LocalFileStore

cache_dir = LocalFileStore("./.cache/")

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)

loader = UnstructuredFileLoader("./files/chapter_one.docx")
# loader = UnstructuredFileLoader("./files/lucky_day.txt")

docs = loader.load_and_split(text_splitter=splitter)

embeddings = OpenAIEmbeddings()

cached_embedding = CacheBackedEmbeddings.from_bytes_store(
    embeddings, cache_dir,
)

vectorstore = Chroma.from_documents(docs, cached_embedding)

In [None]:
results = vectorstore.similarity_search("주인공은 왜 기분이 좋은가?")

results
# len(results)
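
Conceptually, a similarity search embeds the query and returns the stored texts whose vectors lie closest to it. The sketch below shows that idea with hand-written toy vectors and cosine similarity. `ToyVectorStore` is a hypothetical class, not part of LangChain or Chroma, and real vector stores use their own indexes and distance metrics.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store that keeps (text, vector) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, text, vector):
        self._items.append((text, vector))

    def similarity_search(self, query_vector, k=2):
        # Rank every stored item by similarity to the query and keep the top k.
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("king", (0.9, 0.1, 1.0))
store.add("queen", (0.1, 0.9, 1.0))
store.add("man", (0.9, 0.1, 0.0))
store.add("woman", (0.1, 0.9, 0.0))

# A query vector that is mostly "feminine" and "royal".
top = store.similarity_search((0.1, 0.9, 0.9), k=2)
print(top)
```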

### 6.5 Langsmith

<https://www.langchain.com/langsmith>

### 6.6 RetrievalQA

Ask the LLM a question using the documents prepared above.

Off-the-shelf chain: use a chain that has already been built for you

#### Stuff document chain
<https://js.langchain.com/docs/modules/chains/document/stuff>

<img src="https://miro.medium.com/v2/resize:fit:1400/format:webp/1*GykpE5AdrPkT2jPM3XH-QQ.png">

Bundles the docs and the question into one prompt and asks the LLM once.
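
A minimal sketch of the "stuff" idea in plain Python, assuming a stand-in `fake_llm` function in place of a real model. Both `build_stuff_prompt` and `fake_llm` are hypothetical names, and the prompt wording is illustrative, not the one LangChain uses.

```python
def build_stuff_prompt(docs: list[str], question: str) -> str:
    """Put every document and the question into one prompt string."""
    context = "\n\n".join(docs)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; it only reports how much text it received.
    return f"(answer based on a prompt of {len(prompt)} characters)"


docs = ["Chunk one.", "Chunk two.", "Chunk three."]
prompt = build_stuff_prompt(docs, "What happens?")
answer = fake_llm(prompt)  # exactly one LLM call, however many docs there are
print(answer)
```

Because everything goes into one prompt, this approach only works while the combined documents fit in the model's context window.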

---

#### Refine document chain
<https://js.langchain.com/docs/modules/chains/document/refine>

<img src="https://js.langchain.com/assets/images/refine-a70f30dd7ada6fe5e3fcc40dd70de037.jpg">

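Queries the documents one at a time, refining the answer on each pass until a final answer emerges.

More expensive than the stuff document chain, because it makes one LLM call per document.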
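
The refine loop can be sketched in plain Python as follows. `refine` and `fake_llm` are hypothetical names, the stand-in model just numbers its drafts, and the prompt wording is illustrative only.

```python
calls = []  # records every prompt the stand-in model receives


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: returns a numbered draft.
    calls.append(prompt)
    return f"draft {len(calls)}"


def refine(docs, question, llm):
    """Answer from the first doc, then revise the answer once per remaining doc."""
    if not docs:
        raise ValueError("refine needs at least one document")
    answer = llm(f"Context:\n{docs[0]}\n\nQuestion: {question}")
    for doc in docs[1:]:
        answer = llm(
            f"Existing answer: {answer}\n"
            f"New context:\n{doc}\n\n"
            f"Refine the existing answer to the question: {question}"
        )
    return answer


final = refine(["chunk A", "chunk B", "chunk C"], "Who is the main character?", fake_llm)
print(final, len(calls))
```

Three documents lead to three sequential calls, and each call depends on the previous answer, so the calls cannot run in parallel.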

---

#### Map reduce document chain
<https://js.langchain.com/docs/modules/chains/document/map_reduce>

<img src="https://js.langchain.com/assets/images/map_reduce-c65525a871b62f5cacef431625c4d133.jpg">

Takes the documents and summarizes each one individually.

The individual summaries are then passed to the LLM together.

Requires a large number of LLM calls.
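
A minimal sketch of the map and reduce steps in plain Python. `map_reduce` and `fake_llm` are hypothetical names and the prompts are illustrative; the stand-in model only numbers its responses so the call count is visible.

```python
calls = []  # records every prompt the stand-in model receives


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: returns a numbered response.
    calls.append(prompt)
    return f"response {len(calls)}"


def map_reduce(docs, question, llm):
    # Map: one independent call per document.
    partials = [
        llm(f"Extract text relevant to '{question}' from:\n{doc}") for doc in docs
    ]
    # Reduce: one final call over the combined partial results.
    combined = "\n\n".join(partials)
    return llm(f"Using these extracts:\n{combined}\n\nAnswer: {question}")


answer = map_reduce(["chunk A", "chunk B", "chunk C"], "Describe the building", fake_llm)
print(answer, len(calls))
```

Three documents cost four calls: one per document plus the final reduce call. Unlike refine, the map calls do not depend on each other.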

---

#### Map re-rank document chain (currently not working)
<https://python.langchain.com.cn/docs/modules/chains/document/map_rerank>

<img src="https://python.langchain.com.cn/assets/images/map_rerank-0302b59b690c680ad6099b7bfe6d9fe5.jpg">

Processes each document individually and assigns a score to each result.

Returns the highest-scoring answer together with its score.
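
The selection step can be sketched in plain Python. Everything here is an assumption made for illustration: `map_rerank` and `fake_llm` are hypothetical names, and the score is a toy keyword count, whereas the description above has the score come out of the per-document step itself.

```python
def fake_llm(doc: str, question: str):
    """Stand-in for a model call that returns (answer, score) for one document."""
    # Toy scoring: count how many question words occur in the chunk.
    words = question.lower().split()
    score = sum(word in doc.lower() for word in words)
    return (f"Answer drawn from: {doc}", score)


def map_rerank(docs, question, llm):
    # Map: get an (answer, score) pair for every document.
    scored = [llm(doc, question) for doc in docs]
    # Re-rank: keep only the pair with the highest score.
    return max(scored, key=lambda pair: pair[1])


best_answer, best_score = map_rerank(
    [
        "The lift seldom worked.",
        "Winston Smith entered Victory Mansions.",
        "It was a cold day.",
    ],
    "victory mansions",
    fake_llm,
)
print(best_answer, best_score)
```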

In [None]:
# vectorstore: Chroma
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.vectorstores import Chroma
from langchain.storage import LocalFileStore
from langchain.chains import RetrievalQA

llm = ChatOpenAI()

cache_dir = LocalFileStore("./.cache/")

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)

loader = UnstructuredFileLoader("./files/chapter_one.docx")
# loader = UnstructuredFileLoader("./files/lucky_day.txt")

docs = loader.load_and_split(text_splitter=splitter)

embeddings = OpenAIEmbeddings()

cached_embedding = CacheBackedEmbeddings.from_bytes_store(
    embeddings, cache_dir,
)

vectorstore = Chroma.from_documents(docs, cached_embedding)

chain = RetrievalQA.from_chain_type(
    llm = llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)

In [None]:
chain.run("Who is the main character? Answer in Korean.")

In [None]:
# vectorstore : FAISS
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.vectorstores import FAISS
from langchain.storage import LocalFileStore
from langchain.chains import RetrievalQA

llm = ChatOpenAI()

cache_dir = LocalFileStore("./.cache/")

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)

# loader = UnstructuredFileLoader("./files/chapter_one.docx")
loader = UnstructuredFileLoader("./files/lucky_day.txt")

docs = loader.load_and_split(text_splitter=splitter)

embeddings = OpenAIEmbeddings()

cached_embedding = CacheBackedEmbeddings.from_bytes_store(
    embeddings, cache_dir,
)

vectorstore = FAISS.from_documents(docs, cached_embedding)

chain = RetrievalQA.from_chain_type(
    llm = llm,
    chain_type="refine",
    retriever=vectorstore.as_retriever(),
)

In [None]:
chain.run("Who is the main character? Answer in Korean.")

In [None]:
chain = RetrievalQA.from_chain_type(
    llm = llm,
    chain_type="map_reduce",
    retriever=vectorstore.as_retriever(),
)

chain.run("여기에 등장하는 인물 모두 말해줘.")

In [None]:
chain = RetrievalQA.from_chain_type(
    llm = llm,
    chain_type="map_reduce",
    retriever=vectorstore.as_retriever(),
)

chain.run("Describe Victory Mansions")

### 6.8 Stuff LCEL Chain

Implement the stuff chain using the LangChain Expression Language (LCEL).

In [None]:
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.vectorstores import FAISS
from langchain.storage import LocalFileStore

cache_dir = LocalFileStore("./.cache/")

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)

loader = UnstructuredFileLoader("./files/chapter_one.docx")
# loader = UnstructuredFileLoader("./files/lucky_day.txt")

docs = loader.load_and_split(text_splitter=splitter)

embeddings = OpenAIEmbeddings()

cached_embedding = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir,)

vectorstore = FAISS.from_documents(docs, cached_embedding)

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo-0125",
    temperature=0.1
)

retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        # Adjacent string literals are concatenated, so no stray indentation ends up in the prompt.
        "You are a helpful assistant. "
        "Answer questions using only the following context. "
        "If you don't know the answer just say you don't know, "
        "don't make it up:\n\n{context}",
    ),
    ("human", "{question}"),
])

# The retriever fills {context}; RunnablePassthrough forwards the question unchanged.
chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm

chain.invoke("Describe Victory Mansions.")

AIMessage(content="Victory Mansions is a building with glass doors that Winston Smith enters. The hallway smells of boiled cabbage and old rag mats. It has seven flights of stairs, and the lift is seldom working. There is a large colored poster of a man's face on the wall, and the building is not well maintained, with the electric current cut off during daylight hours as part of an economy drive.")
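
One way to read the LCEL expression above: `|` feeds the output of the left step into the right step, and the leading dict runs every value on the same input and collects the results under its keys. The sketch below imitates that behavior in plain Python. `Step` and `parallel` are hypothetical names, not LangChain classes, and the lambdas are stand-ins for the real retriever, prompt, and model. LangChain converts a dict on the left of `|` automatically, while this sketch needs the explicit `parallel` call.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b means: run a, then feed its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(steps: dict) -> Step:
    # Run every step on the same input and collect the results by key.
    return Step(lambda value: {key: step.invoke(value) for key, step in steps.items()})


retriever = Step(lambda question: ["doc about " + question])
passthrough = Step(lambda value: value)
prompt = Step(lambda d: f"Context: {' '.join(d['context'])}\nQuestion: {d['question']}")
llm = Step(lambda text: text.upper())  # toy "model" that just upper-cases the prompt

chain = parallel({"context": retriever, "question": passthrough}) | prompt | llm
output = chain.invoke("Victory Mansions")
print(output)
```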

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo-0125",
    temperature=0.1
)

retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        # Adjacent string literals are concatenated, so no stray indentation ends up in the prompt.
        "You are a helpful assistant. "
        "Answer questions using only the following context. "
        "If you don't know the answer just say you don't know, "
        "don't make it up:\n\n{context}",
    ),
    ("human", "{question}"),
])

chain = (
    {
        "context": retriever,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
)

chain.invoke("who is main character")

AIMessage(content='The main character is Winston Smith.')

### 6.9 Map Reduce LCEL Chain

Implement the map reduce chain using the LangChain Expression Language (LCEL).

In [None]:
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.vectorstores import FAISS
from langchain.storage import LocalFileStore

cache_dir = LocalFileStore("./.cache/")

splitter = CharacterTextSplitter.from_tiktoken_encoder(
    separator="\n",
    chunk_size=600,
    chunk_overlap=100,
)

loader = UnstructuredFileLoader("./files/chapter_one.docx")
# loader = UnstructuredFileLoader("./files/lucky_day.txt")

docs = loader.load_and_split(text_splitter=splitter)

embeddings = OpenAIEmbeddings()

cached_embedding = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir,)

vectorstore = FAISS.from_documents(docs, cached_embedding)

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo-0125",
    temperature=0.1
)

retriever = vectorstore.as_retriever()

# Plan:
# 1 : retrieve a list of docs
# 2 : for each doc in the list of docs | prompt | llm
# 3 : for each response in the list of LLM responses | put them all together
# 4 : final doc | prompt | llm

map_doc_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
            Use the following portion of a long document to see if any of the text is relevant to answer the question.
            Return any relevant text verbatim.
            If there is no relevant text, return : ''
            -------
            {context}
            """,
        ),
        ("human", "{question}"),
    ]
)

# 2. for each doc in the list of docs | prompt | llm
map_doc_chain = map_doc_prompt | llm

# 3. for each response in the list of LLM responses | put them all together
def map_docs(inputs):
    documents = inputs["documents"]
    question = inputs["question"]
    # Equivalent explicit loop:
    # results = []
    # for document in documents:
    #     result = map_doc_chain.invoke({
    #         "context": document.page_content,
    #         "question": question
    #     }).content
    #     results.append(result)
    # return "\n\n".join(results)
    return "\n\n".join(
        map_doc_chain.invoke(
            {"context": doc.page_content, "question": question}
        ).content
        for doc in documents
    )

# 1. list of docs: the retriever supplies "documents", the question passes through unchanged
map_chain = {"documents": retriever, "question": RunnablePassthrough()} | RunnableLambda(map_docs)
    
# 4. final doc | prompt | llm 
final_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """
            Given the following extracted parts of a long document and a question, create a final answer. 
            If you don't know the answer, just say that you don't know. Don't try to make up an answer.
            ------
            {context}
            """,
        ),
        ("human", "{question}")
    ]
)

chain = (
    {
        "context": map_chain,
        "question": RunnablePassthrough()
    }
    | final_prompt
    | llm
)

chain.invoke("Where does Winston go to work?")
# chain.invoke("Describe Victory Mansions")

AIMessage(content='Winston goes to work at the Ministry of Truth, which is described as an enormous pyramidal structure of glittering white concrete, soaring up 300 meters into the air.')