# 12.1 Basic RAG Implementation

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/your-org/ai-first-app/blob/main/demos/12-rag-memory/basic_rag.ipynb)

**Estimated API cost: ~$0.02**

This notebook builds a complete RAG (retrieval-augmented generation) system from scratch.

In [None]:
!pip install -q langchain langchain-openai langchain-community chromadb

In [None]:
import os
from getpass import getpass

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

## Experiment 1: Implementing the Six-Step RAG Pipeline from Scratch

In [None]:
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

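# Prepare a sample document
sample_text = """
Python Exception Handling

Python handles exceptions with the try-except statement. Basic syntax:

try:
    # Code that may fail
    result = 10 / 0
except ZeroDivisionError:
    # Handle a specific exception
    print("Cannot divide by zero")
except Exception as e:
    # Handle any other exception
    print(f"An error occurred: {e}")
finally:
    # Always runs
    print("Cleaning up resources")

Python Data Types

Python has the following main data types:
- int: integer
- float: floating-point number
- str: string
- list: list
- dict: dictionary
- tuple: tuple
- set: set
"""

# Save it to a file
with open("python_tutorial.txt", "w", encoding="utf-8") as f:
    f.write(sample_text)

print("✅ Sample document created")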

In [None]:
# Step 1: Load (read the document from disk)
print("=== Step 1: Load ===")
loader = TextLoader("python_tutorial.txt", encoding="utf-8")
documents = loader.load()

print(f"Loaded {len(documents)} document(s)")
print(f"Document length: {len(documents[0].page_content)} characters")

In [None]:
# Step 2: Split (cut the document into chunks)
print("\n=== Step 2: Split ===")
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,       # maximum chunk length, measured by length_function
    chunk_overlap=50,     # characters shared between neighbouring chunks
    length_function=len,
)

chunks = text_splitter.split_documents(documents)

print(f"Split into {len(chunks)} chunks")
for i, chunk in enumerate(chunks):
    print(f"\nChunk {i+1}:")
    print(chunk.page_content[:100] + "...")
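
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping character windows. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line breaks, which this sketch ignores. The function name `split_with_overlap` and the demo string are illustrative assumptions, not part of LangChain.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return pieces


demo_text = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = split_with_overlap(demo_text, chunk_size=10, chunk_overlap=3)
for piece in demo_chunks:
    print(piece)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.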

In [None]:
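# Steps 3 and 4: Embed + Store (vectorize the chunks and save them)
print("\n=== Steps 3 & 4: Embed + Store ===")

# Note: re-running this cell adds the same chunks to ./chroma_db again,
# so delete the directory first to avoid duplicate retrieval results.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"
)

print("✅ Vector store created and persisted")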

In [None]:
# Step 5: Retrieve (find the chunks most similar to the question)
print("\n=== Step 5: Retrieve ===")

question = "How does Python handle exceptions?"
print(f"Question: {question}")

docs = vectorstore.similarity_search(question, k=2)

print(f"\nRetrieved {len(docs)} relevant chunks:")
for i, doc in enumerate(docs):
    print(f"\nChunk {i+1}:")
    print(doc.page_content)
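
Conceptually, `similarity_search` embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with cosine similarity over hand-made three-dimensional vectors. The vectors, the texts, and the helper names `cosine_similarity` and `top_k` are assumptions for illustration only; real embeddings have far more dimensions, and the distance metric and index Chroma actually uses depend on its collection configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k):
    """Return the k (score, text) pairs with the highest similarity."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical hand-made "embeddings"; a real model would produce these.
toy_index = [
    ("try/except handles exceptions", [0.9, 0.1, 0.0]),
    ("int, float and str are data types", [0.1, 0.9, 0.0]),
    ("finally always runs", [0.7, 0.0, 0.3]),
]
toy_query = [1.0, 0.0, 0.0]  # stands in for an embedded question about exceptions

toy_hits = top_k(toy_query, toy_index, k=2)
for score, text in toy_hits:
    print(f"{score:.3f}  {text}")
```

The two exception-related texts rank above the data-types text because their vectors point in nearly the same direction as the query vector.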

In [None]:
# Step 6: Generate (answer the question from the retrieved chunks)
print("\n=== Step 6: Generate ===")

from openai import OpenAI

client = OpenAI()

# Build the prompt: retrieved chunks first, then the question
context = "\n\n".join([doc.page_content for doc in docs])
prompt = f"""Answer the question based on the following information:

{context}

Question: {question}

Answer in English and cite specific code examples.
"""

# Generate the answer
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}]
)

print("\nFinal answer:")
print(response.choices[0].message.content)

## Experiment 2: Using LangChain RetrievalQA

In [None]:
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Create the QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
    return_source_documents=True
)

# Test questions
test_questions = [
    "How does Python handle exceptions?",
    "What data types does Python have?",
    "How do I catch a division-by-zero error?"
]

for question in test_questions:
    print(f"\n{'='*60}")
    print(f"Question: {question}")
    print(f"{'='*60}")

    result = qa_chain.invoke({"query": question})

    print("\nAnswer:")
    print(result["result"])

    print("\nSource chunks:")
    for i, doc in enumerate(result["source_documents"]):
        print(f"\nChunk {i+1}:")
        print(doc.page_content[:150] + "...")
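
The `chain_type="stuff"` option means all retrieved chunks are placed ("stuffed") into a single prompt, much like the manual prompt in step 6. A minimal sketch of that idea follows. The helper name `stuff_prompt` and the template wording are assumptions for illustration and are not LangChain's actual default prompt.

```python
def stuff_prompt(chunk_texts, question, separator="\n\n"):
    """Join every retrieved chunk into one context block and append the question."""
    context = separator.join(chunk_texts)
    return (
        "Answer the question based on the following information:\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
    )


demo_prompt = stuff_prompt(["chunk A", "chunk B"], "What is in the chunks?")
print(demo_prompt)
```

Because every chunk goes into one prompt, this approach is simple but only works while `k` chunks fit within the model's context window.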

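## Experiment 3: Comparing Answers With and Without RAG

Ask the same question twice, once to the bare model and once through the RAG chain, and compare the answers.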

In [None]:
from openai import OpenAI

client = OpenAI()

question = "What does the finally block do in Python?"

print("=== Without RAG (LLM only) ===")
response_no_rag = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question}]
)
print(response_no_rag.choices[0].message.content)

print("\n" + "="*60)
print("=== With RAG (grounded in the document) ===")
result_with_rag = qa_chain.invoke({"query": question})
print(result_with_rag["result"])

print("\nObservation: the RAG answer is grounded in the document, so it is more specific and can be checked against the source")

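## Hands-On Exercises

1. **Add more documents**: create documents on other topics and index them
2. **Adjust the chunk size**: test how different `chunk_size` values affect the results
3. **Change the number of retrieved chunks**: vary `k` and observe the effect
4. **Try different questions**: probe the limits of RAG, for example with questions the documents do not cover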
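
For exercises 3 and 4, it can help to measure retrieval rather than judge it by eye. The sketch below computes a simple hit rate: the share of questions for which at least one retrieved chunk contains an expected string. Everything here is hypothetical scaffolding: `hit_rate`, `keyword_retriever`, the toy chunks, and the test cases are assumptions, and the word-overlap retriever only stands in for a real vector search so the example runs without an API key.

```python
def keyword_retriever(chunk_texts):
    """Build a toy retriever that ranks chunks by words shared with the question."""
    def retrieve(question, k):
        q_words = set(question.lower().split())
        ranked = sorted(
            chunk_texts,
            key=lambda text: len(q_words & set(text.lower().split())),
            reverse=True,  # sorted() stays stable, so ties keep their original order
        )
        return ranked[:k]
    return retrieve


def hit_rate(retrieve, cases, k):
    """Fraction of (question, expected) cases where some retrieved chunk contains expected."""
    hits = 0
    for question, expected in cases:
        if any(expected in text for text in retrieve(question, k)):
            hits += 1
    return hits / len(cases)


toy_chunks = [
    "python handles exceptions with try and except",
    "python data types include int float and str",
    "the finally block always runs",
]
toy_cases = [
    ("how does python handle exceptions", "try"),
    ("which data types does python have", "int"),
    ("what does the finally block do", "always runs"),
    ("python cleanup code that always executes", "finally"),  # hard case for word overlap
]

toy_retrieve = keyword_retriever(toy_chunks)
for k in (1, 2, 3):
    print(f"k={k}: hit rate {hit_rate(toy_retrieve, toy_cases, k):.2f}")
# k=1: hit rate 0.75
# k=2: hit rate 0.75
# k=3: hit rate 1.00
```

To try this against the notebook's vector store, one option is to pass a small wrapper such as `lambda q, k: [d.page_content for d in vectorstore.similarity_search(q, k=k)]` as the `retrieve` argument.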

---

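## Key Takeaways

1. **Six-step RAG pipeline**: Load → Split → Embed → Store → Retrieve → Generate
2. **Vector search**: chunks are retrieved by semantic similarity between embeddings, not by exact keyword match
3. **LangChain shortcut**: `RetrievalQA` wraps the Retrieve and Generate steps in a single chain
4. **Controllable cost**: only the retrieved chunks are sent to the model, not the whole corpus
5. **Quality depends on retrieval**: the answer can only be as accurate as the chunks that were retrieved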
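
The cost point can be made concrete with a little arithmetic. The sketch below estimates what fraction of a corpus, measured in characters, reaches the model when only `k` chunks are retrieved. The corpus of 100 chunks is a made-up example, `context_fraction` is an illustrative name, and the estimate ignores chunk overlap, the prompt template, and tokenization.

```python
def context_fraction(chunk_lengths, k):
    """Worst-case share of corpus characters sent to the model when k chunks are retrieved."""
    total = sum(chunk_lengths)
    # Assume the k longest chunks are the ones retrieved (upper bound).
    largest_k = sum(sorted(chunk_lengths, reverse=True)[:k])
    return largest_k / total


# Hypothetical corpus: 100 chunks of 300 characters each.
lengths = [300] * 100
fraction = context_fraction(lengths, k=2)
print(f"Context is {fraction:.0%} of the corpus")
# Context is 2% of the corpus
```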

---

**Next**: continue with [12.2 Vector Search](./vector_search.ipynb) for a deeper look at embeddings and vector databases.