In [None]:
!pip install langchain langchain_community langchain_chroma langchain_openai langchain_text_splitters beautifulsoup4

## Source document ##
This notebook uses the web page https://nakamotoinstitute.org/library/bitcoin/ as its source document.

In [10]:
# Import the required libraries
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

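### Step 1: Load the document

- **Description**: Use a `DocumentLoader` to load content from a given source (such as a web page) and convert it into `Document` objects.
- **Key code abstractions**:
  - Class: `WebBaseLoader`
  - Method: `load()`
  - Library: `bs4` (BeautifulSoup)
- **Code walkthrough**:
  - **Document loading**: Use `WebBaseLoader` to fetch the web page, parse the HTML with `BeautifulSoup`, and keep only the relevant parts.
  - **Check the load**: Print the length of the loaded document's content to confirm that the page loaded correctly.
  - **Verify the content**: Print the beginning of the first document to confirm that the loaded data matches expectations.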
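
The tag filtering idea can be tried without network access. The sketch below is a simplified stand-in for `parse_only` filtering, not how `BeautifulSoup` implements it: it uses the standard library's `html.parser`, and `TagFilterParser` and the sample HTML are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser

# Tags whose text we keep, mirroring the tag list passed to SoupStrainer in this notebook.
KEEP_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "article"}


class TagFilterParser(HTMLParser):
    """Collect text that appears inside any of the KEEP_TAGS (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # number of kept tags currently open
        self.parts = []  # collected text fragments

    def handle_starttag(self, tag, attrs):
        if tag in KEEP_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in KEEP_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only while at least one kept tag is open.
        if self.depth > 0 and data.strip():
            self.parts.append(data.strip())


# Illustrative HTML: the <nav> and <footer> text should be dropped.
html = "<nav>Menu</nav><h1>Bitcoin</h1><p>A peer-to-peer system.</p><footer>Contact</footer>"
parser = TagFilterParser()
parser.feed(html)
filtered_text = " ".join(parser.parts)
print(filtered_text)
```

Filtering before indexing keeps navigation and footer text out of the chunks that are later embedded.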

In [11]:
# Define a filter that keeps only headings, paragraphs, and article content
bs4_strainer = bs4.SoupStrainer(["h1", "h2", "h3", "h4", "h5", "h6", "p", "article"])

# Create the WebBaseLoader instance
loader = WebBaseLoader(
    web_paths=("https://nakamotoinstitute.org/library/bitcoin/",),
    bs_kwargs={"parse_only": bs4_strainer},
)

# Load the page and get the documents
docs = loader.load()

# Print the content of each loaded document
for doc in docs:
    print(doc.page_content)


Bitcoin: A Peer-to-Peer Electronic Cash SystemSatoshi NakamotoOctober 31, 2008PDFExternal linkAbstract
A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they’ll generate the longest chain and outpace attackers. The network 

In [12]:
# Check the length of the loaded document content
print(len(docs[0].page_content))  # Length of the first document's content

20967


In [13]:
# Inspect the first document (first 100 characters)
print(docs[0].page_content[:100])

Bitcoin: A Peer-to-Peer Electronic Cash SystemSatoshi NakamotoOctober 31, 2008PDFExternal linkAbstra


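### Step 2: Split the document

- **Description**: Use a text splitter to break the long loaded document into smaller chunks that are suitable for embedding and retrieval.
- **Key code abstractions**:
  - Class: `RecursiveCharacterTextSplitter`
  - Method: `split_documents()`
- **Code walkthrough**:
  - **Document splitting**: Use `RecursiveCharacterTextSplitter` to split the document by character count, setting the chunk size and the overlap so that each chunk fits within what the model can process.
  - **Check the chunk count**: Print the number of chunks to confirm that the split ran correctly.
  - **Verify the chunk size**: Print the character count of the first chunk to confirm that chunk sizes match expectations.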
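
To make the roles of chunk size and overlap concrete, here is a simplified sketch. It slides a fixed-size window over the text and is not the algorithm used by `RecursiveCharacterTextSplitter`, which first tries to split on separators such as paragraph and line breaks (which is why a chunk can be much shorter than the configured size). The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap (simplified sketch)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        # Record where each chunk starts, similar in spirit to add_start_index=True.
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk["start_index"], chunk["text"])
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when the overlap is large enough.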

In [16]:
# Split the document with RecursiveCharacterTextSplitter: up to 1000 characters per chunk, 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)

In [17]:
# Check the number of chunks after splitting
print(len(all_splits))  # Number of document chunks

35


In [18]:
print(len(all_splits[0].page_content))  # Character count of the first chunk

102


In [25]:
print(all_splits[0].page_content)  # Content of the first chunk

Bitcoin: A Peer-to-Peer Electronic Cash SystemSatoshi NakamotoOctober 31, 2008PDFExternal linkAbstract


In [26]:
print(all_splits[0].metadata)  # Metadata of the first chunk

{'source': 'https://nakamotoinstitute.org/library/bitcoin/', 'start_index': 0}


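### Step 3: Store the embeddings

- **Description**: Embed the document chunks into a vector space and store them in a vector database for later retrieval.
- **Key code abstractions**:
  - Class: `Chroma`
  - Method: `from_documents()`
  - Class: `OpenAIEmbeddings`
- **Code walkthrough**:
  - **Storing embeddings**: `Chroma.from_documents()` embeds every document chunk with the `OpenAIEmbeddings` model and stores the resulting vectors in the vector database.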
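
The idea of "embed each chunk, then keep the vector next to its text" can be illustrated with a toy example. This is only a sketch under loud assumptions: `toy_embed` is a hypothetical word-count embedding over a tiny hand-picked vocabulary, nothing like the learned embeddings returned by `OpenAIEmbeddings`, and the list named `store` stands in for a real vector database.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Map text to a unit-length word-count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vector = [float(counts[word]) for word in vocabulary]
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


# Illustrative vocabulary and chunks (made-up data).
vocabulary = ["bitcoin", "block", "hash", "network", "node"]
chunks = ["bitcoin network node", "block hash block"]

# A minimal in-memory "vector store": each record keeps the text and its vector.
store = [{"text": chunk, "vector": toy_embed(chunk, vocabulary)} for chunk in chunks]

for record in store:
    print(record["text"], [round(v, 2) for v in record["vector"]])
```

Chunks with similar wording end up with vectors pointing in similar directions, which is the property retrieval relies on in the next step.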

#### Chroma basics

**Common ways to initialize a Chroma database (instantiation only, no vector data stored yet):**

**Initialize with the constructor**: persist the Chroma database locally.

```python
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="example_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_langchain_db",  # Where to save data locally; remove if not necessary
)
```

**Initialize from a client**: gives more convenient access to the underlying database and collections.

```python
import chromadb

persistent_client = chromadb.PersistentClient()
collection = persistent_client.get_or_create_collection("collection_name")
collection.add(ids=["1", "2", "3"], documents=["a", "b", "c"])

vector_store_from_client = Chroma(
    client=persistent_client,
    collection_name="collection_name",
    embedding_function=embeddings,
)
```


**Here we call `Chroma.from_documents()` directly, which instantiates the store and saves the data in one step**:

This method returns a Chroma instance of type `langchain_chroma.vectorstores.Chroma`. Detailed API documentation: https://python.langchain.com/v0.2/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html

In [27]:
# Embed the document chunks with OpenAIEmbeddings and store them in a Chroma vector store
vectorstore = Chroma.from_documents(
    documents=all_splits,
    embedding=OpenAIEmbeddings()
)

In [28]:
# Check the type of vectorstore
type(vectorstore) 

langchain_chroma.vectorstores.Chroma

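### Step 4: Retrieve documents

- **Description**: Call `as_retriever()` on the vector store to obtain a `VectorStoreRetriever`, then call its `invoke()` method to retrieve the chunks most relevant to a query from the vector database.
- **Key code abstractions**:
  - Class: `VectorStoreRetriever`
  - Methods: `as_retriever()`, `invoke()`
- **Code walkthrough**:
  - **Document retrieval**: Convert the vector store into a retriever and run a similarity search for the query to get the relevant chunks.
  - **Check the result count**: Print the number of retrieved chunks to confirm that retrieval succeeded.
  - **Verify the retrieved content**: Print the first retrieved chunk to confirm that the result matches expectations.

In LangChain, every vector store supports the `as_retriever()` method, which instantiates the retriever for that store. The returned object is a `VectorStoreRetriever`. Detailed API documentation: https://python.langchain.com/v0.2/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStoreRetriever.html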
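
A similarity search with a top-`k` cutoff can be sketched in a few lines. Cosine similarity is used here purely for illustration; the metric a real vector store applies depends on its configuration. The helper names `cosine_similarity` and `top_k` and the two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vector, records, k):
    """Return the k records whose vectors are most similar to the query."""
    ranked = sorted(
        records,
        key=lambda record: cosine_similarity(query_vector, record["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Hand-written 2-D vectors standing in for real embeddings (illustrative data).
records = [
    {"text": "chunk about mining", "vector": [0.9, 0.1]},
    {"text": "chunk about privacy", "vector": [0.1, 0.9]},
    {"text": "chunk about proof-of-work", "vector": [0.8, 0.3]},
]
results = top_k([1.0, 0.0], records, k=2)
print([record["text"] for record in results])
```

The `k` argument plays the same role as `search_kwargs={"k": 6}`: it caps how many chunks are passed on to the prompt.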

In [29]:
# Create a VectorStoreRetriever that returns the 6 chunks most similar to the query
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})

In [30]:
type(retriever)

langchain_core.vectorstores.VectorStoreRetriever

In [31]:
retrieved_docs = retriever.invoke("What is bitcoin?")

In [32]:
# Check the retrieved documents
print(len(retrieved_docs))  # Number of retrieved documents

6


In [35]:
print(retrieved_docs[0].page_content)  # Content of the first retrieved document

Bitcoin: A Peer-to-Peer Electronic Cash SystemSatoshi NakamotoOctober 31, 2008PDFExternal linkAbstract


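### Step 5: Generate the answer

- **Description**: Combine the components built so far (retriever, prompt, LLM, and so on) into one chain that retrieves context for a user question and generates an answer. The full chain takes the user question, retrieves the relevant documents, builds the prompt, passes it to the model (via the `invoke()` method of `ChatOpenAI`), and parses the output into the final answer.
- **Key code abstractions**:
  - Class: `ChatOpenAI`
  - Method: `invoke()`
  - Class: `RunnablePassthrough`
  - Class: `StrOutputParser`
  - Module: `hub`
- **Code walkthrough**:
  - **Model initialization**: Use the `ChatOpenAI` class to initialize a `gpt-4o-mini` model for the generation task.
  - **Document formatting**: Define a `format_docs` function that joins the retrieved documents into a single string.
  - **Building the RAG chain**: Use the `|` operator of LCEL (LangChain Expression Language) to connect the components into one chain: document retrieval, prompt construction, model call, and output parsing.
  - **Generating the answer**: Use the `stream()` method to emit the answer incrementally and display it as it arrives.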

![retrieval](../images/retrieval.png)

#### LangChain Hub

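`LangChain Hub` (https://smith.langchain.com/hub) is an open community for prompt templates that offers developers a large number of ready-to-use prompts. It is part of the `LangSmith` product.

Below we try the prompt template for RAG applications: https://smith.langchain.com/hub/rlm/rag-prompt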


```
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
```
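
To see how the two placeholders are filled without contacting the hub, the template text above can be reproduced as a plain Python string. This is only an illustration with `str.format`; `RAG_TEMPLATE` is a hypothetical name, and the sample question and context are made up.

```python
# The template text shown above, copied into a plain string.
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n"
    "Question: {question} \n"
    "Context: {context} \n"
    "Answer:"
)

# Fill both placeholders with illustrative values.
filled = RAG_TEMPLATE.format(
    question="What is bitcoin?",
    context="A purely peer-to-peer version of electronic cash.",
)
print(filled)
```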

In [36]:
# Initialize the chat model that the RAG chain will use to generate answers
llm = ChatOpenAI(model="gpt-4o-mini")

In [56]:
# Pull the RAG prompt template with the hub module
prompt = hub.pull("rlm/rag-prompt")

Please use the `langsmith sdk` instead:
  pip install langsmith
Use the `pull_prompt` method.
  res_dict = client.pull_repo(owner_repo_commit)


In [57]:
# Print the template
print(prompt.messages)

[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))]


In [58]:
# Fill context and question with sample data and generate Messages usable by a ChatModel
example_messages = prompt.invoke(
    {"context": "filler context", "question": "filler question"}
).to_messages()

In [59]:
# Inspect the prompt
print(example_messages[0].content)

You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question 
Context: filler context 
Answer:


In [63]:
type(prompt.invoke(
    {"context": "filler context", "question": "filler question"}
))

langchain_core.prompt_values.ChatPromptValue

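#### ⭐️**LCEL in RAG**⭐️

##### **LCEL overview**

LCEL (LangChain Expression Language) is a key concept in LangChain. It provides a unified interface so that different components (such as `retriever`, `prompt`, and `llm`) can be connected through the common `Runnable` interface. Every `Runnable` implements the same methods, such as `.invoke()`, `.stream()`, and `.batch()`, which is what allows them to be chained with the `|` operator.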
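
The three calling styles can be pictured with a toy component. The class below is a hypothetical stand-in that only mimics the method names; it does not use LangChain and makes no claim about how `Runnable` is implemented.

```python
class Upper:
    """Hypothetical component exposing invoke, batch, and stream."""

    def invoke(self, text):
        # One input in, one output out.
        return text.upper()

    def batch(self, texts):
        # Many inputs in, a list of outputs out.
        return [self.invoke(text) for text in texts]

    def stream(self, text):
        # One input in, output yielded piece by piece.
        for word in text.split():
            yield word.upper()


component = Upper()
single = component.invoke("hello world")
many = component.batch(["a b", "c"])
pieces = list(component.stream("hello world"))
print(single, many, pieces)
```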

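##### **Components handled by LCEL**

- **Retriever**: Retrieves the documents relevant to the user question.
- **Prompt**: Builds the prompt from the retrieved documents for the model to answer.
- **LLM**: Receives the prompt and generates the final answer.
- **StrOutputParser**: Parses the LLM output and extracts only the string content for display.

##### **How LCEL works**

- **Building the chain**: The `|` operator connects multiple `Runnable` components into a `RunnableSequence`. LangChain automatically converts some objects into a `Runnable`: for example, `format_docs` becomes a `RunnableLambda`, and the dict with the `"context"` and `"question"` keys becomes a `RunnableParallel`.

- **Data flow**: The user question passes through each `Runnable` in the `RunnableSequence` in turn. In the `"context"` branch, the question goes to the `retriever`, which fetches the relevant documents, and `format_docs` then converts them into a string. In the `"question"` branch, `RunnablePassthrough` forwards the original question unchanged. Both results are then passed to `prompt`, which builds the complete prompt for the LLM.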
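
The chaining mechanism can be imitated in plain Python by overloading `__or__`. This is a minimal sketch of the idea, not LangChain's implementation: `Step` is a hypothetical class, and the three stub components stand in for a retriever, a formatter, and an LLM.

```python
class Step:
    """A tiny stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Chaining returns a new Step that runs self first, then other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stub components (hypothetical) standing in for a retriever, a formatter, and an LLM.
retrieve = Step(lambda question: ["doc about " + question, "another doc"])
format_docs = Step(lambda docs: "\n\n".join(docs))
fake_llm = Step(lambda prompt_text: "ANSWER BASED ON: " + prompt_text.splitlines()[0])

chain = retrieve | format_docs | fake_llm
answer = chain.invoke("bitcoin")
print(answer)
```

Because every step exposes the same `invoke()` method, the composed chain is itself a step and can be extended with further `|` operations.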

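##### **Key operations in LCEL**

- **Formatting documents**: `retriever | format_docs` passes the question to `retriever` to produce document objects, then `format_docs` formats them into a string.
- **Passing the question**: `RunnablePassthrough()` forwards the original question unchanged.
- **Building the prompt**: `{"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt` builds the complete prompt.
- **Running the model**: `prompt | llm | StrOutputParser()` runs the LLM to generate the answer and parses the output.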
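
The dict step is the least obvious of these operations: the same question is sent to every branch, and the results are collected under the branch keys. The sketch below imitates that fan-out in plain Python; `run_parallel`, `fake_retriever`, and `passthrough` are hypothetical helpers, not LangChain APIs.

```python
def run_parallel(branches, value):
    """Feed the same input to every branch and collect results under the branch keys."""
    return {key: branch(value) for key, branch in branches.items()}


def fake_retriever(question):
    # Hypothetical stub: a real retriever would query the vector store.
    return ["Chunk A mentioning " + question, "Chunk B"]


def format_docs(docs):
    # Join the retrieved chunks into one context string.
    return "\n\n".join(docs)


def passthrough(value):
    # Returns the input unchanged, mirroring the role of RunnablePassthrough.
    return value


branches = {
    "context": lambda question: format_docs(fake_retriever(question)),
    "question": passthrough,
}
prompt_inputs = run_parallel(branches, "What is a block?")
print(prompt_inputs)
```

The resulting dict has exactly the two keys the prompt template expects, which is why it can be piped straight into the prompt.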

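#### Building the RAG chain with LCEL

Below we combine the LCEL concepts with code to show how a series of `Runnable` components implements the complete RAG pipeline. LCEL gives LangChain a modular, extensible way to compose components, which keeps multi-step pipelines like this one short and readable.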


In [42]:
# Define a function that joins the retrieved documents into one string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [43]:
# Build the RAG chain with LCEL
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [44]:
# Stream answer 1
for chunk in rag_chain.stream("What is bitcoin?"):
    print(chunk, end="", flush=True)

Bitcoin is a decentralized digital currency that allows peer-to-peer transactions without the need for a trusted third party, like a financial institution. It operates on a system of digital signatures and a proof-of-work mechanism to prevent double-spending and maintain a secure transaction history. The network's integrity relies on the majority control of honest nodes to validate transactions and create new coins.

In [45]:
# Stream answer 2
for chunk in rag_chain.stream("What is proof of work?"):
    print(chunk, end="", flush=True)

Proof of work is a consensus mechanism used in blockchain networks where participants, or nodes, must perform computational work to validate transactions and create new blocks. This process involves finding a hash that meets specific criteria, which requires significant CPU effort, thereby making it impractical for an attacker to alter past transactions. The longest chain, representing the most proof of work, is considered the valid one, ensuring security and integrity in the network.

In [46]:
# Stream answer 3
for chunk in rag_chain.stream("What is a block?"):
    print(chunk, end="", flush=True)

A block is a collection of transactions that are grouped together and validated by nodes in a blockchain network. It contains a special transaction that creates new coins for the block's creator, serving as an incentive for nodes to support the network. Blocks are linked in a chain, with each block containing a hash of the previous one, ensuring the integrity and order of the transactions.

In [87]:
# Stream answer 4
for chunk in rag_chain.stream("tell me something about peer-to-peer transaction"):
    print(chunk, end="", flush=True)

Peer-to-peer transactions allow direct electronic payments between parties without the need for a trusted third party, eliminating reliance on financial institutions. The system uses a peer-to-peer network to timestamp transactions, preventing double-spending through a proof-of-work mechanism that requires computational effort to alter transaction history. As long as honest nodes control a majority of computational power, the network remains secure and robust.

In [107]:
for chunk in rag_chain.stream("how many times bitcoin occur in the article?"):
    print(chunk, end="", flush=True)

The retrieved context does not specify how many times "bitcoin" occurs in the article. Therefore, I don't know the answer.

In [109]:
for chunk in rag_chain.stream("Tell me the steps to get incentive"):
    print(chunk, end="", flush=True)

To get an incentive, a node must participate in the network by validating transactions and creating new blocks. The first transaction in each block generates new coins for the block creator, and transaction fees from transactions with input values greater than their output values contribute to the incentive as well. Over time, as more coins enter circulation, the incentive can shift entirely to transaction fees.

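# Homework
1. Rebuild the vector database from a different online document or offline file, ask three related questions, and test whether the RAG chain built with LCEL retrieves the relevant content.
2. Redesign the RAG prompt template, or find a usable one on LangChain Hub, then compare the retrieval recall and generation quality of the two.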

### Custom prompt example

In [89]:
# set the LANGCHAIN_API_KEY environment variable (create key in settings)
prompt_base = hub.pull("ohkgi/superb_system_instruction_prompt")

Please use the `langsmith sdk` instead:
  pip install langsmith
Use the `pull_prompt` method.
  res_dict = client.pull_repo(owner_repo_commit)


In [90]:
print(prompt_base.messages)

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template="# You are a text generating AI's instructive prompt creator, and you: Generate Clever and Effective Instructions for a Generative AI Model, where any and all instructions  you write will be carried out by a single prompt response from the ai text generator. Remember, no real world actual `actions` can be undertaken, so include only direct instructions to the model how to generate the text, no telling it to test, or to maintain, or package, or directing it to perform verbs. no verbs..\n\n1. Begin by carefully reading every word  and paying attention to the user's input.  \n What are they needing a set of instructions to be written for. How will a text generation AI be able to fulfill the instructions they seek? It is important to fully understand their goal or task at hand before generating the instructions.\n\n2. Analyze the user's input to identify the specific types of text generating tasks that can acco

In [91]:
# Fill the goal variable with sample data and generate Messages usable by a ChatModel
example_messages = prompt_base.invoke(
    {"goal": "how to have a good dream ."}
).to_messages()
# Check the type of the generated messages
type(example_messages)

list

In [92]:
example_messages[0].content

"# You are a text generating AI's instructive prompt creator, and you: Generate Clever and Effective Instructions for a Generative AI Model, where any and all instructions  you write will be carried out by a single prompt response from the ai text generator. Remember, no real world actual `actions` can be undertaken, so include only direct instructions to the model how to generate the text, no telling it to test, or to maintain, or package, or directing it to perform verbs. no verbs..\n\n1. Begin by carefully reading every word  and paying attention to the user's input.  \n What are they needing a set of instructions to be written for. How will a text generation AI be able to fulfill the instructions they seek? It is important to fully understand their goal or task at hand before generating the instructions.\n\n2. Analyze the user's input to identify the specific types of text generating tasks that can accomplish the goal they are referring to or the requirements they need to satisfy. 

In [93]:
# Build a custom prompt-optimization chain
prompt_opt_chain = (
    { "goal": RunnablePassthrough()}
    | prompt_base
    | llm
    | StrOutputParser()
)

In [96]:
prompt_new = prompt_opt_chain.invoke("Implementing a RAG service, use the following pieces of context to answer the question at the end.")

In [97]:
type(prompt_new)

str

In [98]:
prompt_new

'1. Begin by thoroughly reviewing the provided context related to implementing a RAG (Red, Amber, Green) service. Ensure complete understanding of the specific requirements and objectives outlined in the context.\n\n2. Identify the key components and tasks necessary for the implementation of a RAG service. Look for terms and concepts that indicate the framework for categorizing items, data, or statuses using the RAG color coding system.\n\n3. Extrapolate the essential steps required for the implementation process. Consider best practices for integrating a RAG system, including how to define criteria for each of the color categories and how to evaluate items against those criteria.\n\n4. Organize the steps systematically, ensuring a logical progression that accurately reflects the implementation process. Each step should build on the previous one, creating a coherent framework for execution.\n\n5. Clearly detail the actions needed for each step in the implementation of the RAG service. 

In [99]:
from langchain_core.prompts import PromptTemplate

# Custom prompt template
template = prompt_new +  """ \n\n Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""

custom_rag_prompt = PromptTemplate.from_template(template)

In [100]:
template

'1. Begin by thoroughly reviewing the provided context related to implementing a RAG (Red, Amber, Green) service. Ensure complete understanding of the specific requirements and objectives outlined in the context.\n\n2. Identify the key components and tasks necessary for the implementation of a RAG service. Look for terms and concepts that indicate the framework for categorizing items, data, or statuses using the RAG color coding system.\n\n3. Extrapolate the essential steps required for the implementation process. Consider best practices for integrating a RAG system, including how to define criteria for each of the color categories and how to evaluate items against those criteria.\n\n4. Organize the steps systematically, ensuring a logical progression that accurately reflects the implementation process. Each step should build on the previous one, creating a coherent framework for execution.\n\n5. Clearly detail the actions needed for each step in the implementation of the RAG service. 

In [101]:
type(custom_rag_prompt)

langchain_core.prompts.prompt.PromptTemplate

In [102]:
# Fill context and question with sample data to produce a prompt the LLM can use
print(custom_rag_prompt.invoke({"context": "filler context", "question": "filler question"}).text)

1. Begin by thoroughly reviewing the provided context related to implementing a RAG (Red, Amber, Green) service. Ensure complete understanding of the specific requirements and objectives outlined in the context.

2. Identify the key components and tasks necessary for the implementation of a RAG service. Look for terms and concepts that indicate the framework for categorizing items, data, or statuses using the RAG color coding system.

3. Extrapolate the essential steps required for the implementation process. Consider best practices for integrating a RAG system, including how to define criteria for each of the color categories and how to evaluate items against those criteria.

4. Organize the steps systematically, ensuring a logical progression that accurately reflects the implementation process. Each step should build on the previous one, creating a coherent framework for execution.

5. Clearly detail the actions needed for each step in the implementation of the RAG service. Include s

In [103]:
# Rebuild the RAG chain with the custom prompt
custom_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | custom_rag_prompt
    | llm
    | StrOutputParser()
)

In [104]:
# Generate answer no. 1 with the custom prompt
custom_rag_chain.invoke("What is bitcoin?")

'Bitcoin is a decentralized digital currency that enables peer-to-peer transactions without relying on a trusted financial institution. It uses cryptographic proof to secure transactions and prevent double-spending, allowing users to send payments directly to one another. The system operates on a peer-to-peer network that maintains a public history of transactions through a proof-of-work mechanism. Thanks for asking!'

In [105]:
# Generate answer no. 2 with the custom prompt
custom_rag_chain.invoke("tell me something about peer-to-peer transaction")

'Peer-to-peer transactions allow individuals to send payments directly to each other without the need for a central authority or financial institution, utilizing a decentralized network. This system relies on cryptographic proofs and a consensus mechanism to prevent double-spending, ensuring transaction integrity. The network remains secure as long as honest nodes maintain the majority of CPU power. Thanks for asking!'

In [108]:
# Generate answer no. 3 with the custom prompt
custom_rag_chain.invoke("how many times bitcoin occur in the article")

'The term "Bitcoin" occurs once in the provided article. Thanks for asking!'

In [110]:
# Generate answer no. 4 with the custom prompt
custom_rag_chain.invoke("Tell me the steps to get incentive")

'1. Create a new block by including the first transaction as a special transaction that generates new coins for the block creator. \n2. Ensure that the block contains transactions where input values exceed output values, generating transaction fees that contribute to the incentive. \n3. Transition to relying solely on transaction fees for incentives once a predetermined number of coins has been distributed into circulation.\n\nThanks for asking!'