# Building a Retrieval-Augmented Generation (RAG) Application
Source: https://python.langchain.com/v0.2/docs/tutorials/rag/

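## Concepts
A RAG application has two main components:
- Indexing
    - Load
        - Load the documents with a document loader
    - Split
        - Split the documents into chunks with a text splitter
    - Store
        - Embed the chunks and store them in a vector store so they can be searched
- Retrieval and generation
    - Retrieve
        - Retrieve the chunks relevant to the user's question
    - Generate
        - Generate an answer to the question from the retrieved chunks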
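
The two components can be sketched end to end in plain Python before bringing in LangChain. This is only a toy illustration: the function names (`load`, `split`, `store`, `retrieve`, `generate`) and the in-memory document are hypothetical, chunks are ranked by simple word overlap rather than embeddings, and `generate` only assembles the prompt that a model would receive.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# --- Indexing: load -> split -> store ---
def load():
    # Stand-in for a document loader (hypothetical in-memory "document").
    return (
        "Task decomposition breaks a big task into smaller steps. "
        "Memory lets an agent recall earlier information. "
        "Tool use lets an agent call external APIs."
    )


def split(document):
    # Stand-in for a text splitter: one chunk per sentence.
    return [s.strip() for s in document.split(". ") if s.strip()]


def store(chunks):
    # Stand-in for a vector store: keep each chunk next to its token set.
    return [(chunk, tokenize(chunk)) for chunk in chunks]


# --- Retrieval and generation ---
def retrieve(index, question, k=1):
    # Rank chunks by the number of words they share with the question.
    q = tokenize(question)
    ranked = sorted(index, key=lambda item: len(q & item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def generate(question, context_chunks):
    # Stand-in for the LLM call: only build the prompt the model would see.
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


index = store(split(load()))
top = retrieve(index, "What is task decomposition?")
print(generate("What is task decomposition?", top))
```

The rest of this notebook replaces each stand-in with a real component: a web loader, a recursive text splitter, a Chroma vector store, and a chat model.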

## Install the required packages

In [None]:
!pip install langchain langchain_community langchain_chroma langchainhub beautifulsoup4 python-dotenv

## LangSmith

In [None]:
"""
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="..."
"""

import getpass
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

## Full code preview

In [None]:
!pip install -qU langchain-openai

In [None]:
from langchain_openai import ChatOpenAI
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()


llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

In [None]:
import bs4
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(
    documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is Task Decomposition?")

In [None]:
# cleanup
vectorstore.delete_collection()

## Code walkthrough

### 1. Indexing: load

Use WebBaseLoader to load HTML from a web URL and parse it into text.

In [2]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))  # keep only elements with these CSS classes
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),  # page to fetch
    bs_kwargs={"parse_only": bs4_strainer},  # parse only the elements selected above
)
docs = loader.load()  # fetch the page and parse it into Document objects

len(docs[0].page_content)  # length of the loaded text, in characters

43131

In [3]:
print(docs[0].page_content[:500])  # show the first 500 characters of the loaded text



      LLM Powered Autonomous Agents
    
Date: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng


Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
Agent System Overview#
In
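
To make the effect of filtering by CSS class more concrete, here is a rough sketch that uses only the standard library's `html.parser`. It is not how Beautiful Soup or `SoupStrainer` work internally; the `ClassFilter` class and the sample HTML are hypothetical and only mimic the outcome of keeping text inside elements with the listed classes while dropping navigation and footer text.

```python
from html.parser import HTMLParser


class ClassFilter(HTMLParser):
    """Collect text only from elements whose class is in `keep`, and from their children."""

    VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}

    def __init__(self, keep):
        super().__init__()
        self.keep = set(keep)
        self.depth = 0  # greater than 0 while inside a kept element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return  # void tags have no closing tag, so they must not change the depth
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0 or self.keep.intersection(classes):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.parts.append(data.strip())


html = """
<nav class="menu">Home | Archive</nav>
<h1 class="post-title">LLM Powered Autonomous Agents</h1>
<div class="post-content"><p>Agents use <b>planning</b> and memory.</p></div>
<footer class="site-footer">Powered by Hugo</footer>
"""

parser = ClassFilter(keep=("post-title", "post-header", "post-content"))
parser.feed(html)
text = " ".join(parser.parts)
print(text)  # the nav and footer text is not included
```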


### 2. Indexing: split

Split the document into chunks of at most 1000 characters, with 200 characters of overlap between consecutive chunks.

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)  # add_start_index=True stores each chunk's starting character offset in its metadata as "start_index"
all_splits = text_splitter.split_documents(docs)

len(all_splits)

66

In [5]:
len(all_splits[0].page_content)

969

In [6]:
all_splits[10].metadata

{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
 'start_index': 7056}
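
The meaning of `chunk_size`, `chunk_overlap`, and `start_index` can be illustrated with a fixed-size sliding window. This is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, which is why the first chunk above has 969 characters rather than exactly 1000. The helper `split_with_overlap` below is hypothetical and only shows how the three numbers relate.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Return (start_index, chunk) pairs produced by a fixed-size sliding window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
for start_index, chunk in split_with_overlap(text, chunk_size=10, chunk_overlap=3):
    print(start_index, chunk)
# 0 abcdefghij
# 7 hijklmnopq
# 14 opqrstuvwx
# 21 vwxyz
```

Each chunk repeats the last 3 characters of the previous one, so text that straddles a boundary still appears intact in at least one chunk.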

### 3. Indexing: store

In [7]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

In [8]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents=all_splits,
    embedding=OpenAIEmbeddings(
        api_key=os.getenv("OPENAI_API_KEY"),
        base_url=os.getenv("OPENAI_API_BASE"),
    ),
)
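
Conceptually, this step turns every chunk into a vector and keeps the vector next to the chunk's text and metadata. The sketch below shows that idea with a toy in-memory store. Everything in it is an assumption made for illustration: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, `ToyVectorStore` is not Chroma's API, and the `start_index` values are made up.

```python
import math
import re
import zlib

DIM = 64  # toy dimensionality; real embedding models use far more dimensions


def toy_embed(text):
    """Hash each word into one of DIM buckets and L2-normalize the counts."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is used instead of hash() so the result is stable across runs.
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyVectorStore:
    """Keeps (vector, text, metadata) rows in a plain list."""

    def __init__(self):
        self.rows = []

    @classmethod
    def from_texts(cls, texts, metadatas=None):
        store = cls()
        metadatas = metadatas or [{} for _ in texts]
        for text, metadata in zip(texts, metadatas):
            store.rows.append((toy_embed(text), text, metadata))
        return store


store = ToyVectorStore.from_texts(
    ["Task decomposition splits work into steps.", "Agents can call external tools."],
    metadatas=[{"start_index": 0}, {"start_index": 43}],  # hypothetical offsets
)
print(len(store.rows), len(store.rows[0][0]))  # number of rows, vector length
```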

### 4. Retrieval and generation: retrieve

In [9]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")

len(retrieved_docs)

6

In [10]:
print(retrieved_docs[0].page_content)

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
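
The role of `k` in a similarity search can be sketched with a few hand-written vectors. This sketch assumes cosine similarity as the ranking score; the metric a real vector store uses depends on its configuration. The vectors, the chunk labels, and the `top_k` helper are hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vector, rows, k):
    """Return the k (score, text) pairs with the highest similarity to the query."""
    scored = ((cosine_similarity(query_vector, vec), text) for vec, text in rows)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


rows = [
    ([1.0, 0.0, 0.0], "chunk about planning"),
    ([0.9, 0.1, 0.0], "chunk about task decomposition"),
    ([0.0, 1.0, 0.0], "chunk about memory"),
    ([0.0, 0.0, 1.0], "chunk about tools"),
]
query = [1.0, 0.2, 0.0]

for score, text in top_k(query, rows, k=2):
    print(f"{score:.3f}  {text}")
```

With `k=2` only the two closest chunks are returned; raising `k` passes more context to the model at the cost of a longer prompt.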


### 5. Retrieval and generation: generate

In [None]:
!pip install -qU langchain-openai

In [11]:
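from langchain_openai import ChatOpenAI
import getpass
import os

# OPENAI_API_KEY (and, optionally, OPENAI_API_BASE) were loaded from .env in step 3.
# Reassigning os.environ[...] = os.getenv(...) is a no-op and raises TypeError when
# the variable is unset, so only prompt for the key if it is missing.
if not os.getenv("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass()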


llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

In [14]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

example_messages

[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]

In [15]:
print(example_messages[0].content)

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question 
Context: filler context 
Answer:
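
The output above shows that the prompt is a single human message with two placeholders, `question` and `context`. As a rough illustration of what filling the template amounts to, the sketch below reproduces the same text with plain `str.format`. The `RAG_TEMPLATE` constant and `render_prompt` helper are hypothetical and are not part of LangChain.

```python
# Template text copied from the printed message above; the two placeholders are
# the only parts that change between calls.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context} \n"
    "Answer:"
)


def render_prompt(question, context):
    """Substitute the question and the retrieved context into the template."""
    return RAG_TEMPLATE.format(question=question, context=context)


rendered = render_prompt("filler question", "filler context")
print(rendered)
```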


In [16]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

for chunk in rag_chain.stream("What is Task Decomposition?"):
    print(chunk, end="", flush=True)

Task Decomposition is a technique that involves breaking down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought (CoT) or Tree of Thoughts to transform big tasks into manageable ones. Task decomposition can be carried out by LLM with simple prompts, task-specific instructions, or human inputs.
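
The chain above relies on two ideas: `|` feeds the output of one step into the next, and the leading dict runs each of its values on the same input and collects the results under the given keys. The sketch below imitates that data flow in plain Python. It is not LangChain's implementation; `Step`, `parallel`, and all the `fake_*` stand-ins are hypothetical.

```python
class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # self | other: run self first, then feed its output to other.
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches):
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Hypothetical stand-ins for the retriever, format_docs, prompt, model and parser.
fake_retriever = Step(lambda question: ["chunk A", "chunk B"])
join_chunks = Step(lambda chunks: "\n\n".join(chunks))
passthrough = Step(lambda value: value)
fake_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt_text: {"content": f"(answer based on {len(prompt_text)} chars)"})
to_text = Step(lambda message: message["content"])

chain = (
    parallel(context=fake_retriever | join_chunks, question=passthrough)
    | fake_prompt
    | fake_llm
    | to_text
)
print(chain.invoke("What is Task Decomposition?"))
```

The question is used twice: once to retrieve and join the chunks for `context`, and once unchanged for `question`, which is the role `RunnablePassthrough()` plays in the real chain.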

Using the built-in chains

In [18]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)


question_answer_chain = create_stuff_documents_chain(llm, prompt)  # combine the model and the prompt template
rag_chain = create_retrieval_chain(retriever, question_answer_chain)  # combine the retriever and the question-answering chain

response = rag_chain.invoke({"input": "What is Task Decomposition?"})
print(response["answer"])

Task decomposition involves breaking down complex tasks into smaller and more manageable subtasks or steps. This process helps agents or models deal with intricate tasks by dividing them into simpler components, making it easier to perform and achieve the overall goal. Task decomposition can be done through techniques like Chain of Thought and Tree of Thoughts, which help in organizing and executing tasks effectively.


Customizing the prompt

In [19]:
from langchain_core.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
custom_rag_prompt = PromptTemplate.from_template(template)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is Task Decomposition?")

'Task decomposition is the process of breaking down complex tasks into smaller, more manageable steps to facilitate planning and execution. It can be done using techniques like Chain of Thought or Tree of Thoughts, which involve breaking tasks into multiple steps and exploring different reasoning possibilities. Thanks for asking!'