PaperBrain is a fully automated research intelligence pipeline that turns daily paper streams into actionable research notes.
In one run, it completes a full loop: Fetch → Screen → Deep Analyze → Link to Your Knowledge Base → Notify/Podcast.
- It Thinks: Uses strong reasoning models for engineering-level analysis (method, math, evidence, limitations).
- It Remembers: Uses Context-Aware RAG to compare new papers with your existing notes.
- It Speaks: Generates a ~5-minute Deep Dive Podcast by default (duration configurable via CLI).
- It Organizes: Writes structured Markdown notes, metadata, and graph-ready links into Obsidian.
- Intelligent Screening: Fast first-pass scoring on large daily paper sets.
- Deep Analysis: Detailed report for high-value papers (problem gap, method, formulas, evidence, critique).
- Context-Aware RAG: Differential analysis against your prior notes.
- Strict Link Policy: [[Note]] links are used only when a detailed note file exists; otherwise, an ArXiv web link is used.
- Visual Extraction: Automatic architecture figure extraction from PDFs.
- AI Podcast + Mobile Push: Audio briefing generation and Bark notifications.
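The Strict Link Policy above can be illustrated with a minimal sketch. It is not PaperBrain's actual code: the function name `note_link`, the assumption that a note is stored as `<title>.md` in a single notes directory, and the `https://arxiv.org/abs/<id>` fallback format are all hypothetical choices made for illustration.

```python
from pathlib import Path


def note_link(title: str, arxiv_id: str, notes_dir: Path) -> str:
    """Return an Obsidian wikilink if a note file exists, else an ArXiv web link.

    Assumption: a detailed note for a paper lives at notes_dir/<title>.md.
    """
    note_path = notes_dir / f"{title}.md"
    if note_path.is_file():
        # A detailed note exists, so an internal [[Note]] link is safe to follow.
        return f"[[{title}]]"
    # No note file: fall back to a plain Markdown link to the ArXiv abstract page.
    return f"[{title}](https://arxiv.org/abs/{arxiv_id})"
```

Checking the file system at link time, rather than trusting a list of "papers analyzed", is what keeps a link from pointing at a note that was never written.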
- OS: Windows / macOS / Linux
- Python: 3.10+ (Conda recommended)
- Obsidian: For the best knowledge base experience.
Clone the repository and set up the environment:
git clone https://github.com/YourUsername/PaperBrain.git
cd PaperBrain/script
# Create Conda environment
conda create -n wd python=3.10
conda activate wd
# Install dependencies
pip install -r requirements.txt
PaperBrain uses OpenRouter (recommended) or Doubao for LLM inference.
- Copy the example env file:
cp .env.example .env
- Edit .env and fill in your keys:
# Recommended: Use OpenRouter for access to Claude 3.7 / Gemini Pro
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Optional: For Bark Mobile Notifications (iOS)
BARK_URL=https://api.day.app/your-token/
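As a rough illustration of how a script might read these keys, here is a standard-library-only sketch. It is an assumption about how such loading could work, not PaperBrain's implementation; the helper names `load_env` and `require_key` are hypothetical.

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file, skipping comments and blanks."""
    values: dict[str, str] = {}
    env_file = Path(path)
    if not env_file.is_file():
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values: dict[str, str], name: str) -> str:
    """Return a key from the parsed file or the process environment, or fail loudly."""
    value = values.get(name) or os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env")
    return value
```

With this sketch, `require_key(load_env(), "OPENROUTER_API_KEY")` fails early with a clear message when the key is missing, while the optional `BARK_URL` can be read with a plain `.get()`.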
Edit config.yaml to define what you care about:
search:
keywords:
- "World Model"
- "Embodied AI"
- "Humanoid Robot"
arxiv_categories:
- "cs.RO"
- "cs.AI"
Option A: Run Immediately (Manual Mode)
# Default: Generates Podcast (~5 minutes)
python main.py --run-now --provider openrouter
# Skip Podcast (faster)
python main.py --run-now --provider openrouter --no-podcast
# Custom podcast duration (e.g., 10 minutes)
python main.py --run-now --provider openrouter --podcast-minutes 10
Option B: Generate Podcast for Existing Note
# Great for revisiting old papers!
python generate_podcast.py "Solaris.md" --provider openrouter
# Custom duration for single-note podcast
python generate_podcast.py "Solaris.md" --provider openrouter --minutes 12
Option C: Auto-Schedule (Windows)
Use Task Scheduler to run script/run_daily.bat every morning at 8:00 AM.
- Action: Start a program
- Program: D:\PaperBrain\script\run_daily.bat
- Start in: D:\PaperBrain\script\ (Crucial!)
This tool processes data locally and interacts with third-party AI services.
- API Keys: Stored in .env (git-ignored). Never shared.
- Data Transmission: Paper content is sent to ArXiv (search), Hugging Face (metadata), OpenAI/Anthropic (analysis), and Microsoft (TTS).
- See SECURITY.md for full details.
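PaperBrain is more than a "paper fetcher": it is a complete automated research workflow engine.
A single run automatically completes: Fetch → Screen → Deep Analyze → Link to Knowledge Base → Notify/Podcast.
Every day it fetches the latest papers from ArXiv and Hugging Face, screens and analyzes them in depth using a fast-and-slow thinking (System 1 & 2) architecture, and syncs structured knowledge (formulas, graphs, code audits) into your Obsidian knowledge base.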
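- Two-tier architecture (intelligent screening + deep analysis):
  - Tier 1: Quickly screens large batches of papers and assigns each a relevance score.
  - Tier 2: Generates deep reports only for high-value papers, focusing on methods, formulas, experiments, and limitations.
- Context-aware RAG differential analysis:
  - Before a new paper is analyzed, your existing notes are retrieved.
  - The report states explicitly where the paper improves on prior work and where it still falls short.
- Strict link policy (avoids broken jumps):
  - [[Inline links]] are used only for papers that actually have a generated deep note.
  - Papers without a deep note always use an ArXiv web link.
- AI podcast and mobile push:
  - Generates an English deep-dive podcast of about 5 minutes by default (duration adjustable via a command-line argument).
  - Sends the summary and a podcast-ready reminder through Bark.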
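The two-tier flow described above can be sketched as follows. This is a simplified illustration under stated assumptions, not PaperBrain's code: the `Paper` class, the `run_two_tier` function, the 0 to 10 score scale, and the default threshold of 7.0 are all hypothetical, and the scoring and analysis callables stand in for the LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Paper:
    title: str
    abstract: str


def run_two_tier(
    papers: list[Paper],
    score_fn: Callable[[Paper], float],   # Tier 1: cheap relevance score (assumed 0-10)
    analyze_fn: Callable[[Paper], str],   # Tier 2: expensive deep report
    threshold: float = 7.0,               # hypothetical cutoff for "high-value"
) -> dict[str, str]:
    """Score every paper, then run deep analysis only on those at or above the threshold."""
    reports: dict[str, str] = {}
    for paper in papers:
        if score_fn(paper) >= threshold:
            reports[paper.title] = analyze_fn(paper)
    return reports


def keyword_score(paper: Paper, keywords: tuple[str, ...] = ("world model",)) -> float:
    """Stand-in Tier 1 scorer: 10 if any keyword appears in the abstract, else 0."""
    text = paper.abstract.lower()
    return 10.0 if any(k in text for k in keywords) else 0.0
```

The point of the split is cost: the cheap scorer runs on every paper in the daily batch, while the expensive analysis step runs only on the few that pass the cutoff.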
- Installing Anaconda or Miniconda is recommended.
- Obsidian is recommended as the knowledge base front end.
# 1. Clone the code
git clone https://github.com/YourUsername/PaperBrain.git
cd PaperBrain/script
# 2. Create the environment
conda create -n wd python=3.10
conda activate wd
# 3. Install dependencies (including PyTorch, Edge-TTS, etc.)
pip install -r requirements.txt
To protect your privacy, keys are stored in a .env file.
- Copy the template:
cp .env.example .env
- Edit the .env file (with Notepad or VS Code):
# Recommended: OpenRouter (gives access to SOTA models such as Claude 3.7 and Gemini Pro)
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# Optional: Bark push URL (download Bark from the iOS App Store)
BARK_URL=https://api.day.app/your-token/
Open config.yaml and set your areas of interest:
search:
keywords:
- "World Model"
- "Embodied AI"
- "Humanoid Robot"
arxiv_categories:
- "cs.RO" # Robotics
- "cs.CV" # Computer vision
Option A: Run Immediately (Manual)
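# Default mode: includes podcast generation (about 5 minutes by default)
python main.py --run-now --provider openrouter
# Text-only mode: no podcast (faster)
python main.py --run-now --provider openrouter --no-podcast
# Custom podcast duration (example: 10 minutes)
python main.py --run-now --provider openrouter --podcast-minutes 10
Option B: Generate a Podcast for a Specific Note
Want to listen to a paper you read earlier?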
python generate_podcast.py "Solaris.md" --provider openrouter
# Custom duration for this single-note podcast (example: 12 minutes)
python generate_podcast.py "Solaris.md" --provider openrouter --minutes 12
Option C: Run Automatically Every Day (Windows)
Use the built-in Windows Task Scheduler:
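- Create a basic task -> every day at 8:00 AM.
- Action: Start a program -> select D:\PaperBrain\script\run_daily.bat.
- Key point: In the "Start in" field, you must enter the script's directory, D:\PaperBrain\script\, or the task will fail with an error.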
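This project takes your data privacy and key security seriously.
- API keys: Stored only in the local .env file (git-ignored by default) and never uploaded.
- Data transmission: Paper content is sent only to ArXiv (search), Hugging Face (metadata), OpenAI/Anthropic (analysis), and Microsoft (speech synthesis) in order to generate reports.
- See SECURITY.md for the full security policy.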
Powered by PaperBrain Team & LLMs
