2002yy · Jun 5, 2026
diff --git a/.env.example b/.env.example
@@ -70,3 +70,13 @@ DEEPSEEK_MODEL_PRO_NAME=deepseek-v4-pro
 # RAG_VECTOR_BACKEND=local
 # RAG_CHROMA_PATH=logs/chroma
 # RAG_CHROMA_COLLECTION=study_agent
+
+# === RAG Embeddings (default: local_hash, no API key required) ===
+# local_hash suits local development and tests; openai suits Chroma-backed persistent vector retrieval.
+# RAG_EMBEDDING_PROVIDER=local_hash
+# RAG_EMBEDDING_PROVIDER=openai
+# RAG_EMBEDDING_MODEL=text-embedding-3-small
+# RAG_EMBEDDING_DIMENSIONS=1536
+# RAG_EMBEDDING_API_KEY=
+# RAG_EMBEDDING_BASE_URL=
+# RAG_EMBEDDING_TIMEOUT_SECONDS=30
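
Review note: the new template keys could be read into a typed settings object roughly as follows. This is a minimal sketch, not the project's actual loader: `EmbeddingSettings` and `load_embedding_settings` are hypothetical names, the fallback values simply mirror the commented-out template lines, and requiring an API key for `openai` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical container; the project's real config object may differ."""

    provider: str = "local_hash"
    model: str = "text-embedding-3-small"
    dimensions: int = 1536
    api_key: Optional[str] = None
    base_url: Optional[str] = None
    timeout_seconds: float = 30.0


def load_embedding_settings(env: Optional[Mapping[str, str]] = None) -> EmbeddingSettings:
    """Read RAG_EMBEDDING_* variables, treating blank values as unset."""
    source = os.environ if env is None else env

    def get(name: str) -> Optional[str]:
        value = source.get(name, "").strip()
        return value or None

    provider = (get("RAG_EMBEDDING_PROVIDER") or "local_hash").lower()
    if provider not in {"local_hash", "openai"}:
        raise ValueError(f"unsupported RAG_EMBEDDING_PROVIDER: {provider!r}")

    settings = EmbeddingSettings(
        provider=provider,
        model=get("RAG_EMBEDDING_MODEL") or "text-embedding-3-small",
        dimensions=int(get("RAG_EMBEDDING_DIMENSIONS") or 1536),
        api_key=get("RAG_EMBEDDING_API_KEY"),
        base_url=get("RAG_EMBEDDING_BASE_URL"),
        timeout_seconds=float(get("RAG_EMBEDDING_TIMEOUT_SECONDS") or 30),
    )
    # Assumption: the remote provider is unusable without a key, so fail early.
    if settings.provider == "openai" and settings.api_key is None:
        raise ValueError("RAG_EMBEDDING_PROVIDER=openai requires RAG_EMBEDDING_API_KEY")
    return settings
```

Passing a plain dict instead of `os.environ` keeps the sketch testable without touching the process environment.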
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 <p>
   <a href="https://github.com/2002yy/study-agent/actions/workflows/ci.yml"><img src="https://github.com/2002yy/study-agent/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
   <img src="https://img.shields.io/badge/python-3.12-blue" alt="Python 3.12">
-  <img src="https://img.shields.io/badge/tests-273%20passed-green" alt="273 tests passed">
+  <img src="https://img.shields.io/badge/tests-277%20passed-green" alt="277 tests passed">
 </p>
 
 A local AI learning assistant with long-term memory, role-based group chat,
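@@ -17,7 +17,7 @@ Study Agent is a local-first AI learning assistant; the focus is not simply calling
 - **Long-term memory**: Markdown memory + safe writer
 - **Context tiers**: fast / light / deep / archive
 - **Web search**: RSS / News fetch → article extraction → LLM digest → source tracing
-- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, keyword / local vector prototype / hybrid / backend-vector retrieval, citation context, source blocks, Streamlit retrieval/debug panel, chat injection, and FastAPI RAG endpoints
+- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, keyword / local vector prototype / hybrid / backend-vector retrieval, configurable embedding provider, optional Chroma persistence, citation context, source blocks, Streamlit retrieval/debug panel, chat injection, and FastAPI RAG endpoints
 - **Engineering safety**: SSRF protection, detect-secrets, config templates
 - **Engineering quality**: pytest test suite, Ruff, GitHub Actions CI, packaging checks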
 
@@ -27,11 +27,11 @@ Study Agent is a local-first AI learning assistant; the focus is not simply calling
 - **Model routing** with fast / light / deep / archive context tiers
 - **Long-term memory** based on Markdown files and safe-writer persistence
 - **Web search pipeline**: feed registry → URL safety checks → article extraction → LLM digest → auditable source trace
-- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, lexical / local vector prototype / hybrid / backend-vector retrieval, citation-first context formatting, source blocks, a Streamlit retrieval/debug panel, optional chat injection, and FastAPI RAG endpoints
+- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, lexical / local vector prototype / hybrid / backend-vector retrieval, configurable embedding providers, optional Chroma persistence, citation-first context formatting, source blocks, a Streamlit retrieval/debug panel, optional chat injection, and FastAPI RAG endpoints
 - **SSRF protection** for article fetching, **detect-secrets** in CI
 - **Batched session logging** and multi-layer caching for performance
 - **Performance budget**: mode-based `max_tokens` bounds on the main chat, WeChat, and news LLM paths
-- **273 pytest tests**, Ruff clean, GitHub Actions CI workflow
+- **277 pytest tests**, Ruff clean, mypy clean, GitHub Actions CI workflow
 
 For a detailed breakdown of the stack and engineering highlights, see [Technical Stack & Engineering Highlights](docs/TECH_STACK.md).
 
@@ -207,13 +207,19 @@ pip-compile requirements-dev.in    # re-lock dev dependencies
 
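 Parameter priority: explicit code arguments → task-level environment variables → task defaults → global environment variables → provider-level environment variables. For the full configuration, see [`.env.example`](.env.example) and the [User Guide](USER_GUIDE.md).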
 
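-The RAG vector backend defaults to `local` and needs no extra services; the optional `chroma` adapter requires you to install `chromadb` yourself:
+The RAG vector backend defaults to `local` and needs no extra services; the optional `chroma` adapter requires you to install `chromadb` yourself. The embedding provider defaults to `local_hash`; production retrieval can switch explicitly to OpenAI-compatible embeddings: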
 
 ```bash
 RAG_VECTOR_BACKEND=local
 # RAG_VECTOR_BACKEND=chroma
 # RAG_CHROMA_PATH=logs/chroma
 # RAG_CHROMA_COLLECTION=study_agent
+
+RAG_EMBEDDING_PROVIDER=local_hash
+# RAG_EMBEDDING_PROVIDER=openai
+# RAG_EMBEDDING_MODEL=text-embedding-3-small
+# RAG_EMBEDDING_DIMENSIONS=1536
+# RAG_EMBEDDING_API_KEY=...
 ```
 
 ---
@@ -243,7 +249,7 @@ RAG_VECTOR_BACKEND=local
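 │   ├── config.py           # global configuration
 │   ├── router.py           # routing configuration
 │   ├── news/               # news aggregation pipeline
-│   ├── rag/                # local RAG MVP: loading, chunking, indexing, keyword / vector prototype / optional backend retrieval
+│   ├── rag/                # local RAG MVP: loading, chunking, indexing, keyword / vector prototype / embedding / optional backend retrieval
 │   └── ui/                 # Streamlit UI components
 ├── tests/                  # pytest test suite
 ├── docs/                   # design docs and engineering notes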
@@ -264,13 +270,13 @@ RAG_VECTOR_BACKEND=local
 ## Testing
 
 ```bash
-pytest tests/ -v            # current local baseline: 273 passed
+pytest tests/ -v            # current local baseline: 277 passed
 pytest tests/ --cov=src     # coverage
 ruff check src/ tests/      # linting
-mypy --explicit-package-bases src/  # CI soft check; may report type debt
+mypy --explicit-package-bases src/  # type check
 ```
 
-CI runs on GitHub Actions on push / pull request, covering `pytest`, `ruff`, packaging checks, `detect-secrets` scanning, and a non-blocking `mypy` soft check. For the current verification status, see [docs/TESTING.md](docs/TESTING.md).
+CI runs on GitHub Actions on push / pull request, covering `pytest`, `ruff`, packaging checks, `detect-secrets` scanning, and a `mypy` soft check. For the current verification status, see [docs/TESTING.md](docs/TESTING.md).
 
 ---
 
@@ -307,9 +313,9 @@ CI runs on GitHub Actions on push / pull request, covering `pytest`, `
 Job-search-oriented technical roadmap:
 
 - [ ] FastAPI service layer (partial): `/health`, `/rag`, `/rag/index`, `/rag/query` implemented; `/chat` and `/memory` remain planned
-- [x] RAG MVP: Markdown / TXT / DOCX / PDF loading, chunking, local keyword retrieval, local vector prototype, hybrid retrieval, citation context, source blocks, Streamlit retrieval panel, optional single-chat and WeChat interactive injection
-- [ ] RAG document QA (partial): PDF parsing has file-size, page-count, extracted-text and encrypted-file guards; Chroma adapter scaffold exists; production embedding model retrieval remains planned
-- [ ] Vector store: FAISS local prototype, pgvector engineering version
+- [x] RAG MVP: Markdown / TXT / DOCX / PDF loading, chunking, local keyword retrieval, local vector prototype, hybrid retrieval, backend-vector retrieval, configurable embedding provider, optional Chroma adapter, citation context, source blocks, Streamlit retrieval panel, optional single-chat and WeChat interactive injection
+- [ ] RAG document QA (partial): PDF parsing has file-size, page-count, extracted-text and encrypted-file guards; production embedding requires explicit API/env configuration and Chroma remains optional
+- [ ] Vector store: Chroma optional adapter implemented; FAISS local prototype and pgvector engineering version remain planned
 - [ ] Web UI: TypeScript + Vue3 / React, streaming chat, source panel
 - [ ] Observability: trace_id, token usage, latency, provider fallback logs
 

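Review note: the README hunks above describe `local_hash` as a deterministic embedding provider that needs no API key. A minimal sketch of what such a provider can look like is shown here, assuming a feature-hashing approach; `EmbeddingProvider` and `HashEmbedding` as written are illustrative and are not the code in `src/rag/embeddings.py`, whose protocol and hashing scheme may differ.

```python
import hashlib
import math
import re
from typing import Protocol, Sequence


class EmbeddingProvider(Protocol):
    """Hypothetical contract: turn a batch of texts into fixed-size vectors."""

    dimensions: int

    def embed(self, texts: Sequence[str]) -> list[list[float]]: ...


class HashEmbedding:
    """Deterministic feature-hashing embedder (illustrative stand-in for local_hash)."""

    def __init__(self, dimensions: int = 64) -> None:
        if dimensions <= 0:
            raise ValueError("dimensions must be positive")
        self.dimensions = dimensions

    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        return [self._embed_one(text) for text in texts]

    def _embed_one(self, text: str) -> list[float]:
        vector = [0.0] * self.dimensions
        for token in re.findall(r"\w+", text.lower()):
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            # The first four bytes pick the bucket, the fifth byte picks the sign.
            bucket = int.from_bytes(digest[:4], "big") % self.dimensions
            sign = 1.0 if digest[4] % 2 == 0 else -1.0
            vector[bucket] += sign
        norm = math.sqrt(sum(value * value for value in vector))
        # L2-normalize so that a plain dot product equals cosine similarity.
        return [value / norm for value in vector] if norm else vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two vectors; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Because the vectors depend only on token hashes, the same text always maps to the same vector, which is what makes this kind of provider convenient for tests. It carries no semantic knowledge, which is presumably why the diff keeps real embeddings behind explicit configuration.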
diff --git a/docs/INTERVIEW_NOTES.md b/docs/INTERVIEW_NOTES.md
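@@ -10,7 +10,7 @@ Study Agent is a local-first AI learning assistant, with a focus on multi-Provider model
 2. **Safe long-term memory writes**: safe writer + preview/confirm flow, preventing irreversible memory pollution
 3. **Web search source tracing**: feed registry / RSS multi-source aggregation → URL safety matrix → three-layer article body extraction → LLM digest → pipeline trace, with sources traceable end to end
 4. **Streamlit re-render performance optimization**: multi-layer caching, mode-based batched disk writes, token budget control on the main path
-5. **CI / Ruff / detect-secrets engineering checks**: 273 pytest tests, Ruff clean, GitHub Actions workflow, detect-secrets hard-blocks on non-allowlisted findings
+5. **CI / Ruff / detect-secrets engineering checks**: 277 pytest tests, Ruff clean, mypy local clean, GitHub Actions workflow, detect-secrets hard-blocks on non-allowlisted findings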
 
 ## Talking Points
 
@@ -23,7 +23,7 @@ Study Agent is a local-first AI learning assistant, with a focus on multi-Provider model
 
 ## Presentation Boundaries
 
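-- `mypy` is wired into CI as a soft check, but local runs still report type errors, so type checking cannot be called clean.
+- `mypy` is wired into CI as a soft check, and `python -m mypy --explicit-package-bases src` is currently clean locally; the CI configuration still treats it as non-blocking.
 - `performance_budget.py` covers the main chat / WeChat / news LLM paths; auxiliary LLM calls still need to be brought under the budget.
 - `article_fetcher.py` performs DNS/IP SSRF validation before any real network read; `link_resolver.py` handles network-independent URL pre-checks and redirect recording.
 - `detect-secrets` is wired into CI and hard-blocks on non-allowlisted findings by parsing `results` in the scan JSON; Basic Auth-style URL samples in tests are explicitly allowlisted.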
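
Review note: the hard-block described in the last bullet (parse the scan JSON and fail on any remaining entry under `results`) might look roughly like this. It is a sketch only: `unexempted_findings` is a hypothetical helper, the sample report is made up, and skipping entries audited as `is_secret: false` is an assumption about how exemptions are recorded in this project.

```python
import json


def unexempted_findings(report_json: str) -> list[tuple[str, int, str]]:
    """Return (path, line_number, type) for every finding that should block CI."""
    report = json.loads(report_json)
    findings = []
    # "results" maps each scanned file path to a list of finding entries.
    for path, entries in report.get("results", {}).items():
        for entry in entries:
            # Assumption: entries audited as false positives are exempt.
            if entry.get("is_secret") is False:
                continue
            findings.append((path, entry.get("line_number", 0), entry.get("type", "unknown")))
    return sorted(findings)


# Made-up report for illustration; a real one would come from the scanner's JSON output.
sample = json.dumps({
    "results": {
        "src/app.py": [{"type": "Secret Keyword", "line_number": 12}],
        "tests/test_urls.py": [
            {"type": "Basic Auth Credentials", "line_number": 3, "is_secret": False}
        ],
    }
})
blocking = unexempted_findings(sample)
```

A CI step could then exit non-zero whenever the returned list is non-empty, which makes the check blocking rather than advisory.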
diff --git a/docs/RAG.md b/docs/RAG.md
@@ -2,7 +2,7 @@
 
 ## Status
 
-Current status: **MVP implemented with a local vector prototype, not a production vector-store RAG system yet**.
+Current status: **MVP implemented with a local-first retrieval path, configurable embeddings and an optional Chroma adapter**.
 
 Implemented:
 
@@ -21,12 +21,13 @@ Implemented:
 - UI source blocks for retrieved file paths, line ranges, scores and matched terms
 - FastAPI endpoints: `GET /health`, `POST /rag`, `POST /rag/index`, `POST /rag/query`
 - Streamlit knowledge/debug panel with index summary, document rows, chunk preview and score breakdowns
-- Optional vector backend interface with local fallback and Chroma adapter scaffold
+- Optional vector backend interface with local fallback and Chroma adapter
+- Configurable embedding providers: deterministic `local_hash` by default, OpenAI-compatible embeddings when explicitly configured
 
 Not implemented yet:
 
-- Production embedding model integration
 - FAISS, pgvector or managed vector stores
+- Production-grade embedding evaluation, relevance tuning and re-index migration tooling
 - Automatic injection into every generation path; current injection covers single chat and WeChat interactive replies, but not news discussion or after-session feedback
 
 ## Module Map
@@ -36,9 +37,9 @@ Not implemented yet:
 | `src/rag/loader.py` | Load supported local files into normalized `RagDocument` objects |
 | `src/rag/chunker.py` | Split documents into line-traceable `RagChunk` objects |
 | `src/rag/index.py` | Build, save, load and search a local JSON RAG index |
-| `src/rag/embeddings.py` | Embedding provider contract and local hash embedding provider |
+| `src/rag/embeddings.py` | Embedding provider contract, local hash provider and OpenAI-compatible provider |
 | `src/rag/backends.py` | Vector backend contract, local backend and environment-driven backend selection |
-| `src/rag/chroma_backend.py` | Optional Chroma persistent backend adapter scaffold |
+| `src/rag/chroma_backend.py` | Optional Chroma persistent backend adapter |
 | `src/rag/vector.py` | Deterministic local vector prototype and hybrid retrieval |
 | `src/rag/eval.py` | LLM-free retrieval quality evaluation over gold query fixtures |
 | `src/rag/service.py` | Application-facing helpers for indexing, querying and context formatting |
@@ -67,7 +68,7 @@ Supported retrieval modes:
 - `lexical`: TF-IDF-style term scoring
 - `vector`: deterministic local hash-vector cosine similarity
 - `hybrid`: normalized lexical score plus vector similarity
-- `backend_vector`: configured vector backend; defaults to local and can use the optional Chroma adapter
+- `backend_vector`: configured vector backend; defaults to local and can use the optional Chroma adapter with configured embeddings
 
 Each result keeps:
 
@@ -139,11 +140,12 @@ P4-C / P6 adds Streamlit inspection controls:
 
 P5 adds the first vector-backend abstraction:
 
-- `EmbeddingProvider` protocol plus `LocalHashEmbeddingProvider`
+- `EmbeddingProvider` protocol plus `LocalHashEmbeddingProvider` and `OpenAIEmbeddingProvider`
 - `VectorBackend` protocol plus `LocalVectorBackend`
 - `RAG_VECTOR_BACKEND=local|chroma`
+- `RAG_EMBEDDING_PROVIDER=local_hash|openai`, `RAG_EMBEDDING_MODEL`, `RAG_EMBEDDING_DIMENSIONS`, `RAG_EMBEDDING_API_KEY`, `RAG_EMBEDDING_BASE_URL`
 - Optional `ChromaVectorBackend` using lazy `chromadb` import, `PersistentClient`, collection `upsert` and vector query
-- `tests/test_rag_backends.py` verifies local backend behavior, environment config and Chroma fake-client upsert/query behavior
+- `tests/test_rag_backends.py` verifies local backend behavior, embedding environment config, OpenAI-compatible embedding batching and Chroma fake-client upsert/query behavior
 
 ## Next Steps
 
@@ -163,9 +165,9 @@ Goal: replace the local hash-vector prototype with optional real embeddings with
 
 - [x] Extract an embedding-provider and vector-backend contract.
 - [x] Keep JSON + lexical / hybrid retrieval as the zero-infrastructure fallback.
-- [x] Add an optional Chroma adapter scaffold with lazy import and fake-client tests.
+- [x] Add an optional Chroma adapter with lazy import and fake-client tests.
 - [x] Make vector backend selection explicit through config.
-- [ ] Add a production embedding provider; current Chroma adapter uses the local hash embedding provider by default.
+- [x] Add a production embedding provider path; current default remains `local_hash`, while OpenAI-compatible embeddings require explicit env/API configuration.
 
 ### P6: Knowledge UI