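# Building a Local RAG Application with DeepSeek R1 and Ollama

## Document Loading

Load the PDF document and split it into chunks of at most 500 characters with no overlap. This example uses DeepSeek_R1.pdf.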

In [1]:
from langchain_community.document_loaders import PDFPlumberLoader

file = "DeepSeek_R1.pdf"

# Load the PDF
loader = PDFPlumberLoader(file)
docs = loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(docs)
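
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window splitting in plain Python. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraphs and lines); the function `split_with_overlap` and the sample text are hypothetical and exist only for illustration.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper, not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters of stand-in text
no_overlap = split_with_overlap(sample, chunk_size=12, chunk_overlap=0)
with_overlap = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)

print([len(c) for c in no_overlap])    # chunk lengths without overlap
print([len(c) for c in with_overlap])  # chunk lengths with a 4-character overlap
```

With an overlap of 0, as in the cell above, the chunks partition the text exactly. A non-zero overlap repeats the tail of each chunk at the head of the next, which can help when a relevant sentence straddles a chunk boundary.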

Next, initialize the vector store. The text embedding model used here is `nomic-embed-text`.

In [2]:
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

local_embeddings = OllamaEmbeddings(model="nomic-embed-text")

vectorstore = Chroma.from_documents(documents=all_splits, embedding=local_embeddings)

We now have a vector store that can be used for similarity search.

In [3]:
question = "What is the purpose of the DeepSeek project?"
docs = vectorstore.similarity_search(question)
for doc in docs:
    print(doc.page_content)

engineeringtasks. Asaresult,DeepSeek-R1hasnotdemonstratedahugeimprovement
over DeepSeek-V3 on software engineering benchmarks. Future versions will address
thisbyimplementingrejectionsamplingonsoftwareengineeringdataorincorporating
asynchronousevaluationsduringtheRLprocesstoimproveefficiency.
16
DeepSeek-R1avoidsintroducinglengthbiasduringGPT-basedevaluations,furthersolidifying
itsrobustnessacrossmultipletasks.
On math tasks, DeepSeek-R1 demonstrates performance on par with OpenAI-o1-1217,
surpassingothermodelsbyalargemargin. Asimilartrendisobservedoncodingalgorithm
tasks,suchasLiveCodeBenchandCodeforces,wherereasoning-focusedmodelsdominatethese
benchmarks. Onengineering-orientedcodingtasks,OpenAI-o1-1217outperformsDeepSeek-R1
first open research to validate that reasoning capabilities of LLMs can be incentivized
purelythroughRL,withouttheneedforSFT.Thisbreakthroughpavesthewayforfuture
advancementsinthisarea.
• We introduce our pipeline to develop DeepSeek-R1. The pipeline incorporates
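
Conceptually, a similarity search embeds the question, compares that vector against the stored chunk vectors, and returns the closest chunks. The sketch below illustrates the idea with cosine similarity over toy three-dimensional vectors. The metric and index that Chroma actually uses may differ, and `cosine_similarity`, `top_k`, and the toy data are hypothetical names and values chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=4):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (chunk text, made-up embedding). Real embeddings have many more dimensions.
toy_index = [
    ("chunk about math benchmarks", [0.9, 0.1, 0.0]),
    ("chunk about coding benchmarks", [0.7, 0.7, 0.0]),
    ("chunk about the training pipeline", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.0, 0.0]  # made-up embedding of the question

results = top_k(toy_query, toy_index, k=2)
print(results)
```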

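## Building the Chain Expression

We use the `langchain` library to combine the documents, a prompt template, and an output parser into a Chain expression that sends the retrieved documents to DeepSeek R1, served locally through Ollama.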

In [4]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

model = ChatOllama(
    model="deepseek-r1:1.5b",
)

prompt = ChatPromptTemplate.from_template(
    "Summarize the main themes in these retrieved docs: {docs}"
)

# Join the page content of the given documents into a single string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


chain = {"docs": format_docs} | prompt | model | StrOutputParser()

question = "What is the purpose of the DeepSeek project?"

docs = vectorstore.similarity_search(question)

chain.invoke(docs)

"<think>\nOkay, I need to summarize the main themes from these retrieved documents about DeepSeek-R1 improving in math and coding tasks over previous models. Let me start by reading through each point carefully.\n\nFirst document: It talks about DeepSeek-R1 not showing significant improvement over V3 on software engineering benchmarks. Future versions will address this by implementing rejections sampling on data or incorporating asynchronous evaluations to improve efficiency.\n\nSecond document: DeepSeek-R1 avoids length bias in GPT-based evaluations and outperforms other models on math tasks, surpassing OpenAI's o1-1217 by a large margin. They also observed similar performance on coding algorithm tasks where reasoning-focused models dominate.\n\nThird document introduces their pipeline for DeepSeek-R1, which uses two RL stages: one for discovering improved patterns and aligning with human preferences, and another for SFT stages using cold-start data and a multi-stage pipeline. They fi

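The output above begins with a `<think>` tag: DeepSeek R1 emits its reasoning before the final answer. If only the answer is wanted, one option is to strip that block after parsing. The sketch below assumes the reasoning is wrapped in a matching `<think>...</think>` pair; `strip_think` is a hypothetical helper, not part of LangChain or Ollama.

```python
import re

# Non-greedy match so that each <think>...</think> pair is removed separately.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks; text without a closed block is left as-is."""
    return THINK_BLOCK.sub("", text).strip()


# Stand-in for a model response, not real model output.
raw = "<think>\nOkay, I need to summarize...\n</think>\n\nThe documents cover three themes."
print(strip_think(raw))
```

Because a plain function can be piped into a chain the same way `format_docs` is elsewhere in this notebook, such a helper could presumably be appended after `StrOutputParser()`. Treat that as an untested suggestion rather than a documented recipe.
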
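## QA with Retrieval

Finally, the retriever fetches the passages most similar to the user's question, and `format_docs` joins them into a single context. The question and this context are then filled into RAG_TEMPLATE and passed to the model for question answering.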

In [5]:
from langchain_core.runnables import RunnablePassthrough

RAG_TEMPLATE = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

retriever = vectorstore.as_retriever()

qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

question = "What is the purpose of the DeepSeek project?"

# Run
qa_chain.invoke(question)

'<think>\nOkay, I need to figure out the answer to the question: "What is the purpose of the DeepSeek project?" \n\nFirst, I\'ll look through the provided context. It seems like the context is about a research paper or something related to DeepSeek-R1 and its improvements over other models.\n\nI see mentions of DeepSeek-R1 having made significant advancements in various fields like engineering tasks, mathematics, coding algorithms, and even software engineering benchmarks. The mention of "deep learning" comes up multiple times, which makes me think that the project is focused on deep learning applications or specific areas within it.\n\nThere are references to openAI-o1-1217, OpenAI\'s model, and other models like DeepSeek-R1 Zero. This suggests that DeepSeek-R1 might be an improvement over existing models, possibly addressing certain limitations in their performance across different domains.\n\nI also see mentions ofRL (reinforcement learning) stages within the pipeline for training t
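
The first step of `qa_chain` is a dictionary that fans the question out into two values: one branch retrieves and formats the context, the other passes the question through unchanged. The plain-Python sketch below mirrors that data flow without LangChain or a running model. `fake_retriever`, `join_passages`, `build_prompt`, and the shortened template are hypothetical stand-ins, and the passages are placeholders rather than real retrieval results.

```python
# Shortened stand-in for RAG_TEMPLATE; only the placeholders matter here.
RAG_TEMPLATE_SKETCH = """<context>
{context}
</context>

Answer the following question:

{question}"""


def fake_retriever(question):
    """Hypothetical stand-in for vectorstore.as_retriever(); ignores the question."""
    return ["first retrieved passage", "second retrieved passage"]


def join_passages(passages):
    """Plays the role of format_docs for plain strings."""
    return "\n\n".join(passages)


def build_prompt(question):
    # Mirrors {"context": retriever | format_docs, "question": RunnablePassthrough()}
    inputs = {
        "context": join_passages(fake_retriever(question)),
        "question": question,  # passed through unchanged
    }
    return RAG_TEMPLATE_SKETCH.format(**inputs)


prompt_text = build_prompt("What is the purpose of the DeepSeek project?")
print(prompt_text)
```

The resulting string is what the model would receive in place of the template; in the real chain the model call and `StrOutputParser()` follow this step.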

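## Summary

This article showed how to build a local RAG application with DeepSeek R1 and Ollama. We first loaded and split a PDF document, then initialized a vector store, built a Chain expression, and finally ran QA with retrieval.

The example can be extended with different document loaders, text embedding models, vector stores, models, and output parsers to suit other use cases.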