# Building a Local RAG Application with LangChain


https://datawhalechina.github.io/handy-ollama/#/C7/3.%20%E4%BD%BF%E7%94%A8%20LangChain%20%E6%90%AD%E5%BB%BA%E6%9C%AC%E5%9C%B0%20RAG%20%E5%BA%94%E7%94%A8


## Environment setup

https://github.com/ollama/ollama


```bash
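# Start the Ollama server in Docker; the volume keeps pulled models on the host
docker run -d -v /Users/tiankonguse-m3/.ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull the chat model (llama3.2, used in the code below) and the embedding model
ollama pull llama3.2
ollama pull nomic-embed-text

# Install the Python dependencies into the llm-study conda environment
conda activate llm-study
pip install langchain langchain_community
pip install langchain_chroma
pip install langchain_ollama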
```
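Before moving on, it can help to confirm that the server is reachable and that the models are present. The following is a minimal sketch that assumes the server listens on `http://localhost:11434` (the port mapped in the `docker run` command) and uses Ollama's `/api/tags` endpoint, which lists local models. The helper names `parse_model_names` and `list_local_models` are hypothetical and exist only for this illustration.

```python
import json
import urllib.error
import urllib.request


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of locally available models, or [] if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return []
```

If everything is set up, `list_local_models()` should include the models pulled above; an empty list suggests the container is not running or nothing has been pulled yet.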





## Document loading

Now let's load and split a sample document.

We will use Lilian Weng's blog post on agents as the example.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
```

```text
USER_AGENT environment variable not set, consider setting it to identify your requests.
```
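To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch of fixed-size character splitting. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter tries to break on separators such as paragraphs and lines before falling back to smaller units); the function `split_text` is a hypothetical helper that only illustrates the two parameters.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining text is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

With `chunk_overlap=0`, as in the tutorial, the windows simply sit side by side; a non-zero overlap repeats the tail of one chunk at the head of the next, which can help when a relevant sentence straddles a boundary.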


```python
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

local_embeddings = OllamaEmbeddings(model="nomic-embed-text")

vectorstore = Chroma.from_documents(documents=all_splits, embedding=local_embeddings)
```

We now have a local vector database. Let's run a quick similarity search to test it:

```python
question = "What are the approaches to Task Decomposition?"
docs = vectorstore.similarity_search(question)
len(docs)
```

```text
4
```

```python
docs[0]
```

```text
Document(id='b93dddaf-bef3-4b0c-bde1-2d7f5f7c42a6', metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview\nIn a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:\n\nPlanning\n\nSubgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks.\nReflection and refinement: The agent can do self-criticism and self-reflection over past actions, learn from mistakes and refine them for future steps, thereby improving the quality of final results.\n\n\nMemory\n\nShort-term memory: I would consider all the in-context l
```
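Conceptually, a similarity search embeds the question, scores every stored chunk against it, and returns the best `k`. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. Both are assumptions made for illustration: real embeddings from `nomic-embed-text` have far more dimensions, and the metric Chroma actually applies depends on how the collection is configured. The names `cosine_similarity`, `top_k`, and `toy_store` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], store: dict[str, list[float]], k: int = 4) -> list[str]:
    """Return the keys of the k stored vectors most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query, store[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings"; the values are made up for the example.
toy_store = {
    "task decomposition": [0.9, 0.1, 0.0],
    "memory": [0.1, 0.9, 0.0],
    "tool use": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.0, 0.0], toy_store, k=2))
# ['task decomposition', 'memory']
```

The default `k=4` in the sketch mirrors the four documents returned by the search above.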

Next, instantiate the large language model llama3.2 and check that inference works:

```python
from langchain_ollama import ChatOllama

model = ChatOllama(
    model="llama3.2",
)
```

```python
response_message = model.invoke(
    "Simulate a rap battle between Stephen Colbert and John Oliver"
)

print(response_message.content)
```

**The scene is set in a dark, crowded nightclub. The crowd is hype, and the judges' table is filled with notable figures from politics and entertainment. In the blue corner, we have Stephen Colbert, aka "The Late Show" host, ready to take on his rival from HBO's "Last Week Tonight". In the red corner, John Oliver, aka "The Daily Show" host, is confident in his lyrical prowess. The battle begins!**

**Stephen Colbert:**
Yo, John, I heard you've been talking smack
About my show, and my wit, right back at that
But let me tell you, buddy, I'm the king of the game
I make the politicians shiver with shame

My words are sharp, like a sword in the night
Cutting down lies, and making everything right
You may have HBO, but I've got the throne
And when it comes to satire, I'm the one to call home

**John Oliver:**
Hold up, Stephen, let me set the record straight
You think you're funny? You're just a late-night mate
I tackle tough topics, and make the powerful quake
My show's not just about laughs

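## Building a chain expression

We can build a summarization chain by passing in the retrieved documents and a simple prompt.

The chain formats the prompt template with the supplied input keys and passes the formatted string to the specified model: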


```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)


# Join the page content of the retrieved documents into a single string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = {"docs": format_docs} | prompt | model | StrOutputParser()

question = "What are the approaches to Task Decomposition?"

docs = vectorstore.similarity_search(question)

chain.invoke(docs)
```

```text
'The main themes in these retrieved docs are:\n\n1. **Task Decomposition**: Breaking down complex tasks into smaller, manageable subgoals to enable efficient handling.\n2. **Planning and Execution**: A crucial component of an autonomous agent system, where expert models execute on specific tasks and log results.\n3. **Autonomy and Self-Awareness**: The ability of the agent to plan, reflect, and refine its actions to improve the quality of final results.\n\nAdditionally, the docs mention three methods for task decomposition:\n\n1. Using simple prompting from a Large Language Model (LLM) with instructions like "Steps for XYZ."\n2. Using task-specific instructions, such as "Write a story outline."\n3. Utilizing human inputs and feedback.'
```
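The `|` operator in the chain above composes steps so that the output of one becomes the input of the next. The following sketch imitates that behaviour in plain Python by overloading `__or__`. It is only an analogy under simplifying assumptions: the `Step` class and the stand-in functions are hypothetical, no model is called, and LangChain's own runnables do considerably more (for example, turning the leading dict into a step of its own).

```python
class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # The combined step feeds this step's output into the next one.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for format_docs, the prompt template, and the model.
join_docs = Step(lambda docs: "\n\n".join(docs))
fill_prompt = Step(
    lambda text: f"Summarize the main themes in these retrieved docs: {text}"
)
fake_model = Step(lambda prompt: f"[model saw {len(prompt)} characters]")

toy_chain = join_docs | fill_prompt | fake_model
print(toy_chain.invoke(["first chunk", "second chunk"]))
```

Because each `|` returns a new `Step`, a chain of any length is still a single object with one `invoke` method, which is the property the tutorial's `chain.invoke(docs)` relies on.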

## Simple QA





```python
from langchain_core.runnables import RunnablePassthrough

RAG_TEMPLATE = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

chain = (
    RunnablePassthrough.assign(context=lambda input: format_docs(input["context"]))
    | rag_prompt
    | model
    | StrOutputParser()
)

question = "What are the approaches to Task Decomposition?"

docs = vectorstore.similarity_search(question)

# Run
chain.invoke({"context": docs, "question": question})
```

```text
'There are three approaches to task decomposition: (1) using a Large Language Model (LLM) with simple prompting, (2) using task-specific instructions, and (3) incorporating human inputs. These approaches enable agents to break down complex tasks into smaller subgoals, enhancing efficient handling of complicated tasks. Task-specific instructions provide more precise guidance than simple prompts or human input.'
```
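To see what the model receives in the QA chain above, the sketch below reproduces the two preparatory steps with plain Python: replacing the `context` entry of the input dict with a joined string (the effect of `RunnablePassthrough.assign(context=...)`), then substituting both keys into the template. The shortened `TEMPLATE`, the helper names, and the sample chunks are assumptions for illustration, and `str.format` stands in for `ChatPromptTemplate`, which additionally wraps the result in a chat message.

```python
TEMPLATE = """Use the following context to answer the question.

<context>
{context}
</context>

Answer the following question:

{question}"""


def format_chunks(chunks: list[str]) -> str:
    """Join retrieved chunks with blank lines, like format_docs does for documents."""
    return "\n\n".join(chunks)


def assign_context(inputs: dict) -> dict:
    """Copy the input dict and replace "context" with its formatted string."""
    return {**inputs, "context": format_chunks(inputs["context"])}


raw_inputs = {
    "context": ["Chunk about planning.", "Chunk about memory."],
    "question": "What are the approaches to Task Decomposition?",
}
filled = TEMPLATE.format(**assign_context(raw_inputs))
print(filled)
```

Note that `assign_context` leaves the `question` key untouched, which is why the same dict can carry both the retrieved documents and the question through the chain.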

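## QA with retrieval

Finally, our QA application with semantic retrieval (the local RAG application) can automatically retrieve the semantically closest document chunks from the vector database based on the user's question: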





```python
retriever = vectorstore.as_retriever()

qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)
```

```python
question = "What are the approaches to Task Decomposition?"

qa_chain.invoke(question)
```

```text
'There are three approaches to task decomposition: (1) using simple prompting with a Large Language Model (LLM), (2) utilizing task-specific instructions, and (3) leveraging human inputs. These methods allow agents to break down complex tasks into manageable subgoals, enabling efficient handling of complicated tasks. Additionally, reflection and refinement can be employed to improve the quality of results.'
```
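The dict at the head of `qa_chain` is what lets a single question string feed both template variables: every value in the dict receives the same input, and the results are collected under the dict's keys. The sketch below imitates that fan-out in plain Python. The names `fan_out`, `fake_retrieve_and_format`, and `passthrough` are hypothetical stand-ins, no retrieval happens, and the branches run one after another here for simplicity.

```python
from typing import Any, Callable


def fan_out(branches: dict[str, Callable[[Any], Any]], value: Any) -> dict[str, Any]:
    """Call every branch with the same input and collect the results by key."""
    return {key: branch(value) for key, branch in branches.items()}


def fake_retrieve_and_format(question: str) -> str:
    # Hypothetical stand-in for `retriever | format_docs`.
    return f"(chunks relevant to: {question})"


def passthrough(value: Any) -> Any:
    # Stand-in for RunnablePassthrough(): return the input unchanged.
    return value


prompt_inputs = fan_out(
    {"context": fake_retrieve_and_format, "question": passthrough},
    "What are the approaches to Task Decomposition?",
)
print(prompt_inputs)
```

The resulting dict has exactly the `context` and `question` keys that the RAG prompt template expects, so the caller only needs to supply the question.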

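## Summary

Congratulations! You have now built a complete RAG application on top of the LangChain framework and local models. From here, you can swap in other local models to compare their behavior and capabilities, or extend the application with richer features and more useful or fun functionality.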