# xhs-note-node

Node.js + Vercel AI SDK implementation of the Xiaohongshu (小红书) viral note generation system.
## Quick Start

```bash
cd xhs-note-node
npm install
```

Copy `.env.example` to `.env` and set your Qwen API key:

```bash
cp .env.example .env
```

Edit `.env`:

```env
LLM_API_KEY=sk-your-qwen-api-key-here
LLM_REGION=cn  # or intl, finance
```

Start the dev server:

```bash
npm run dev
```

The server starts at http://localhost:8072.
```bash
# Check health
curl http://localhost:8072/health/live

# Generate a note (requires actual images);
# idea_text here means "I want to share my Mediterranean-diet fat-loss experience"
curl -X POST http://localhost:8072/api/v1/xhs/notes/report \
  -F "idea_text=我想分享地中海饮食减脂经验" \
  -F "images=@image1.jpg" \
  -F "images=@image2.jpg"
```

## Project Structure

```
src/
├── main.ts                      # Entry point
├── app.ts                       # Hono app factory
├── config/
│   ├── settings.ts              # Environment config (Zod)
│   └── agentConfig.ts           # Load YAML configs
├── schemas/
│   └── xhsNote.ts               # All Zod schemas
├── providers/
│   └── qwen.ts                  # Qwen LLM provider
├── agents/
│   ├── visualAnalystAgent.ts
│   ├── imageEditAgent.ts
│   ├── growthStrategistAgent.ts
│   ├── contentWriterAgent.ts
│   ├── seoExpertAgent.ts
│   └── pipeline/
│       └── xhsNotePipeline.ts   # 3-phase orchestrator
├── services/
│   └── xhsNoteService.ts        # File handling + flow
├── api/
│   └── routes/
│       ├── xhsNote.ts           # Main endpoint
│       └── health.ts            # Health checks
└── utils/
    ├── imageUtils.ts            # sharp compression
    └── promptBuilder.ts         # Prompt templates
```
## Pipeline

**Phase 1: Visual Analysis (parallel)**
- Visual Analyst Agent analyzes each image
- Multimodal (qwen3-vl-plus)

**Phase 2: Image Editing Plans (parallel)**
- Image Editor Agent creates edit strategies
- Multimodal (qwen3-vl-plus)

**Phase 3: Content Creation (sequential)**
- Growth Strategist → Content Writer → SEO Expert
- Text-only (qwen3-max)
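The three phases above can be sketched as follows. This is a minimal illustration of the orchestration pattern only: the agent functions are stubs standing in for the real qwen3-vl-plus / qwen3-max calls, and the names are assumptions, not the actual API of `xhsNotePipeline.ts`.

```typescript
// Stub result types for the sketch (hypothetical shapes).
type ImageAnalysis = { imageId: string; summary: string };
type EditPlan = { imageId: string; edits: string[] };

// Stub agents; the real ones call the LLM.
const analyzeImage = async (id: string): Promise<ImageAnalysis> =>
  ({ imageId: id, summary: `analysis of ${id}` });
const planEdits = async (a: ImageAnalysis): Promise<EditPlan> =>
  ({ imageId: a.imageId, edits: ["crop", "brighten"] });
const growthStrategy = async (as: ImageAnalysis[]) => `strategy for ${as.length} images`;
const writeContent = async (strategy: string) => `note based on: ${strategy}`;
const optimizeSeo = async (draft: string) => `${draft} #tags`;

export async function runPipeline(imageIds: string[]) {
  // Phase 1: analyze all images concurrently.
  const analyses = await Promise.all(imageIds.map(analyzeImage));
  // Phase 2: build one edit plan per image, again concurrently.
  const editPlans = await Promise.all(analyses.map(planEdits));
  // Phase 3: sequential text chain (strategist → writer → SEO).
  const strategy = await growthStrategy(analyses);
  const draft = await writeContent(strategy);
  const finalNote = await optimizeSeo(draft);
  return { analyses, editPlans, finalNote };
}
```

The key design point is that phases 1 and 2 fan out with `Promise.all`, while phase 3 is a strict chain because each agent consumes the previous agent's output.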
## Commands

```bash
# Development with hot reload
npm run dev

# Build TypeScript
npm run build

# Start production
npm start

# Type check
npm run lint

# Run tests (when available)
npm run test
```

## Features

- ✅ Three-phase multi-agent pipeline
- ✅ Concurrent image processing
- ✅ Multimodal image analysis
- ✅ Structured output with Zod schemas
- ✅ YAML configuration (agents + tasks)
- ✅ Image compression with sharp
- ✅ Type-safe TypeScript throughout
- ⏳ Observability (logging, metrics) - coming soon
- ⏳ Unit tests - coming soon
- ⏳ Database integration - coming soon
## Tech Stack

- Framework: Hono (lightweight, flexible deployment)
- LLM: Vercel AI SDK + OpenAI-compatible provider (Qwen DashScope)
- Image Processing: sharp (fast, libvips-based)
- Validation: Zod (runtime + TypeScript types)
- Configuration: Zod for env vars, YAML for agent prompts
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `NODE_ENV` | `development` | Environment type |
| `PORT` | `8072` | Server port |
| `LLM_API_KEY` | (required) | Qwen API key |
| `LLM_REGION` | `cn` | DashScope region: `cn`, `intl`, `finance` |
| `XHS_IMAGE_MAX_SIZE` | `1024` | Max image width/height (pixels) |
| `XHS_IMAGE_QUALITY` | `85` | JPEG quality (1-100) |
| `XHS_MAX_IMAGES` | `20` | Max images per request |
## Differences from the Python Version

- Eliminated: `AddImageToolLocal` tool (images are pre-encoded in the service layer)
- Simplified: no `IntermediateTool` (structured `generateObject` handles output directly)
- Improved: direct JSON Schema support (no prompt engineering for JSON format)
- Cleaner: no multimodal message normalization hack (the Vercel AI SDK handles it)
- Faster: sharp compression vs. Pillow
## Next Steps

- Copy sample images from the Python project's `tests/integration/`
- Update `.env` with a real Qwen API key
- Test the main pipeline: `npm run dev`, then `curl -X POST ...`
- Add observability (pino logging, prom-client metrics)
- Add unit tests (vitest)