content-intelligence

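Turn data collected from Xiaohongshu/Douyin into actionable insights: trend detection, viral-content attribution, pattern recognition, and topic suggestions

Pain points

When creating content or selecting e-commerce products, manually scrolling Douyin/Xiaohongshu is slow and decisions rest on gut feeling. Scraping tools can pull the data down, but data is not insight: you may have engagement data for 100 notes and still not know what is trending, why it is trending, or what to make next.

Solution

content-intelligence consumes the scraped output of xiaohongshu-downloader and douyin-downloader, auto-detects the platform format, uses Claude to analyze content patterns, and produces structured insights (JSON) plus human-readable reports (HTML/MD).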

ci run ./xhs-output ./douyin-output -o ./reports
open reports/*_report.html

Architecture

┌──────────────────┐     ┌──────────────────┐
│  xiaohongshu-    │     │  douyin-          │
│  downloader      │     │  downloader       │
│  output/         │     │  downloads/       │
└────────┬─────────┘     └────────┬──────────┘
         │                        │
         └───────────┬────────────┘
                     ▼
            ┌─────────────────┐
            │  Ingest         │  auto-detects the platform
            │  → ContentItem  │  unified data model
            └────────┬────────┘
                     ▼
            ┌─────────────────┐
            │  Analyze        │  Claude API
            │  trends /       │  pattern recognition
            │  attribution /  │
            │  patterns /     │
            │  topics         │
            └────────┬────────┘
                     ▼
            ┌─────────────────┐
            │  Report         │
            │  JSON + MD +    │
            │  HTML           │
            └─────────────────┘
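
The Ingest stage maps both platforms onto a single ContentItem model. A minimal sketch of what such a model could look like follows. The field names and the raw dictionary keys (`note_id`, `liked_count`, `aweme_id`, `digg_count`, and so on) are assumptions made for illustration; the real definitions live in src/schema.py and src/ingest.py and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical unified record; the project's real schema may differ."""
    platform: str   # "xhs" or "douyin"
    item_id: str
    title: str
    likes: int = 0
    comments: int = 0
    shares: int = 0

    @property
    def engagement(self) -> int:
        # Unweighted sum, chosen only for illustration.
        return self.likes + self.comments + self.shares


def from_xhs(note: dict) -> ContentItem:
    # Key names are assumed, not taken from xiaohongshu-downloader.
    return ContentItem(
        platform="xhs",
        item_id=str(note.get("note_id", "")),
        title=note.get("title", ""),
        likes=int(note.get("liked_count", 0)),
        comments=int(note.get("comment_count", 0)),
        shares=int(note.get("share_count", 0)),
    )


def from_douyin(video: dict) -> ContentItem:
    # Key names are assumed, not taken from douyin-downloader.
    return ContentItem(
        platform="douyin",
        item_id=str(video.get("aweme_id", "")),
        title=video.get("desc", ""),
        likes=int(video.get("digg_count", 0)),
        comments=int(video.get("comment_count", 0)),
        shares=int(video.get("share_count", 0)),
    )
```

Normalizing early means the analysis and report stages only need to handle one record shape, regardless of which downloader produced the data.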

Quick start

# 1. Clone the repository
git clone https://github.com/zinan92/content-intelligence.git
cd content-intelligence

# 2. Install
pip install -e .

# 3. Preview the collected data (no API key needed)
ci ingest ./path/to/xhs-output ./path/to/douyin-output

# 4. Full analysis (requires an Anthropic API key)
export ANTHROPIC_API_KEY=sk-ant-...
ci run ./path/to/xhs-output ./path/to/douyin-output -o ./reports

# 5. View the report
open reports/*_report.html

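Features

Feature	Description	Status
XHS Ingest	Parses xiaohongshu-downloader output (manifest.json, items/notes)	✅
Douyin Ingest	Parses douyin-downloader output (download_manifest.jsonl, metadata)	✅
Automatic platform detection	Determines Xiaohongshu vs. Douyin from the file format	✅
Trend detection	Topic heat ranking plus rising/falling direction	✅
Viral-content attribution	Analyzes why top content took off	✅
Pattern recognition	Title patterns, content patterns, engagement patterns	✅
Topic suggestions	Data-driven topics with an angle and a confidence level	✅
JSON output	Machine-readable, for downstream tools	✅
MD/HTML reports	Human-readable: trend table, rankings, suggestions	✅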
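
Automatic platform detection can be pictured as a check for which manifest file a directory contains. The sketch below relies only on the two filenames listed in the table (manifest.json and download_manifest.jsonl). The function name `detect_platform` and the return labels are hypothetical, and the tool's actual detection logic may inspect more than filenames.

```python
from pathlib import Path
from typing import Optional


def detect_platform(data_dir: str) -> Optional[str]:
    """Guess the source platform from the manifest file found in data_dir.

    Returns "douyin", "xhs", or None when neither manifest is present.
    """
    root = Path(data_dir)
    # douyin-downloader writes a JSON Lines manifest.
    if (root / "download_manifest.jsonl").is_file():
        return "douyin"
    # xiaohongshu-downloader writes a single JSON manifest.
    if (root / "manifest.json").is_file():
        return "xhs"
    return None
```

Returning None rather than raising lets a caller skip unrecognized directories and keep processing the rest.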

Output format

intelligence.json (for machines)

{
  "trends": [{"topic": "AI tools", "heat_score": 85, "direction": "rising"}],
  "top_content": [{"title": "...", "why_it_works": "numbered list + pain-point opening"}],
  "patterns": [{"pattern": "counterintuitive title", "frequency": "appears in 60% of the top 30%"}],
  "topic_suggestions": [{"topic": "...", "angle": "...", "confidence": "high"}]
}
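
A downstream tool might filter this JSON before acting on it. The following sketch assumes only the keys shown in the sample above. The helper names, the extra sample rows, and the `"falling"` direction value are illustrative assumptions, not confirmed output of the tool.

```python
import json

# Sample payload mirroring the structure shown above; values are illustrative.
SAMPLE = json.loads("""
{
  "trends": [
    {"topic": "AI tools", "heat_score": 85, "direction": "rising"},
    {"topic": "desk setups", "heat_score": 40, "direction": "falling"}
  ],
  "topic_suggestions": [
    {"topic": "A", "angle": "x", "confidence": "high"},
    {"topic": "B", "angle": "y", "confidence": "low"}
  ]
}
""")


def rising_topics(intelligence: dict, min_heat: int = 0) -> list[str]:
    """Return rising topics at or above min_heat, hottest first."""
    rising = [
        t for t in intelligence.get("trends", [])
        if t.get("direction") == "rising" and t.get("heat_score", 0) >= min_heat
    ]
    rising.sort(key=lambda t: t.get("heat_score", 0), reverse=True)
    return [t["topic"] for t in rising]


def suggestions_with(intelligence: dict, confidence: str = "high") -> list[dict]:
    """Return topic suggestions whose confidence matches the given level."""
    return [
        s for s in intelligence.get("topic_suggestions", [])
        if s.get("confidence") == confidence
    ]
```

Using `.get` with defaults keeps the helpers tolerant of a report in which a section is missing or empty.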

report.html (for humans)

Trend heat table + top content ranking + list of viral patterns + topic suggestion cards

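Tech stack

Layer	Technology	Purpose
Language	Python 3.11+	Core
CLI	Click	Command-line interface
AI	Anthropic Claude API	Content analysis, pattern recognition
Templates	Jinja2	HTML report rendering
Testing	pytest	Unit tests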

Project structure

content-intelligence/
├── src/
│   ├── cli.py          # CLI entry point (ci run / ci ingest)
│   ├── schema.py       # ContentItem + AnalysisResult data models
│   ├── ingest.py       # XHS + Douyin data parsing (auto-detection)
│   ├── analyze.py      # Claude analysis engine
│   ├── report.py       # HTML/MD/JSON report generation
│   └── templates/      # Jinja2 HTML templates
├── tests/
│   ├── test_schema.py
│   ├── test_ingest.py
│   ├── test_analyze.py
│   └── fixtures/       # Sample XHS/Douyin data for tests
├── pyproject.toml
└── README.md

Configuration

Variable	Description	Required	Default
`ANTHROPIC_API_KEY`	Claude API key (needed by `ci run`)	Required for analysis	—

The ci ingest preview mode does not need an API key.
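
Because only `ci run` needs the key, a wrapper script could fall back to the preview command when the key is absent. This is a sketch under that assumption. The helper `build_command` is hypothetical and only assembles the documented `ci run` / `ci ingest` invocations; it does not execute them.

```python
import os
from typing import Mapping, Optional, Sequence


def build_command(
    dirs: Sequence[str],
    output: str = "./reports",
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Pick `ci run` when an API key is set, otherwise `ci ingest`."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        # Full pipeline: ingest + Claude analysis + report generation.
        return ["ci", "run", *dirs, "-o", output]
    # Preview only: no API key required.
    return ["ci", "ingest", *dirs]
```

The returned list can be passed directly to `subprocess.run`.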

For AI Agents

Structured metadata

name: content-intelligence
description: Turn social media data into actionable content intelligence — trends, patterns, topic suggestions
version: 0.1.0
cli_command: ci
cli_args:
  - name: run
    description: Full pipeline — ingest + Claude analysis + report generation
    args:
      - name: dirs
        type: string[]
        required: true
        description: Data directories from xiaohongshu-downloader or douyin-downloader
    flags:
      - name: --output / -o
        type: string
        default: ./reports
        description: Output directory for reports
      - name: --skip-analysis
        type: boolean
        description: Only ingest, skip Claude analysis
  - name: ingest
    description: Preview ingested data without analysis
    args:
      - name: dirs
        type: string[]
        required: true
install_command: pip install -e .
dependencies:
  - click
  - anthropic
  - jinja2
capabilities:
  - ingest xiaohongshu-downloader output (manifest.json with notes/items)
  - ingest douyin-downloader output (download_manifest.jsonl)
  - auto-detect platform from file format
  - analyze content trends and patterns via Claude
  - identify why top content performs well
  - generate topic suggestions with confidence levels
  - output JSON (machine) + MD + HTML (human) reports
input_format: xiaohongshu-downloader output dir / douyin-downloader output dir
output_format: JSON + Markdown + HTML reports

Agent usage example

import json
import os
import subprocess

# Run content intelligence analysis
result = subprocess.run(
    ["ci", "run", "./xhs-data", "./douyin-data", "-o", "./reports"],
    capture_output=True, text=True,
    env={**os.environ, "ANTHROPIC_API_KEY": "sk-ant-..."}
)

# Read structured output (example filename; substitute the file your run produced)
with open("./reports/2026-03-20_intelligence.json", encoding="utf-8") as f:
    intelligence = json.load(f)
trends = intelligence["trends"]           # What's hot
suggestions = intelligence["topic_suggestions"]  # What to create next
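
The example above hard-codes a dated filename. An agent may prefer to locate the newest report instead. This sketch assumes reports are named `<date>_intelligence.json` with an ISO-style date prefix, which is inferred from the example filename and not confirmed elsewhere. The helper `latest_intelligence` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_intelligence(report_dir: str) -> Optional[Path]:
    """Return the most recent *_intelligence.json in report_dir, or None.

    Relies on ISO dates (YYYY-MM-DD) sorting correctly as plain strings.
    """
    candidates = sorted(Path(report_dir).glob("*_intelligence.json"))
    return candidates[-1] if candidates else None
```

Returning None when nothing matches lets the caller distinguish a run that produced no report from one that did.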

License

MIT


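Related projects

Project	Relationship	Link
xiaohongshu-downloader	Upstream data source (Xiaohongshu scraping)	zinan92/xiaohongshu-downloader
douyin-downloader	Upstream data source (Douyin scraping)	zinan92/douyin-downloader-1
videocut	Downstream consumer (content production)	zinan92/videocut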
