
Commit b7c6c02

feat: add weixin article download adapter & abstract download helpers (#280)
- New: src/clis/weixin/download.ts (WeChat article to Markdown adapter)
- New: src/download/article-download.ts (shared article download helper: TurndownService, image localization, frontmatter, customizable labels)
- New: src/download/media-download.ts (shared media download helper: batch download, ProgressTracker, yt-dlp routing, auto cookie export)
- Refactor: migrate zhihu/download to use downloadArticle()
- Refactor: migrate xiaohongshu/download to use downloadMedia()
- Refactor: migrate twitter/download to use downloadMedia()
- Refactor: migrate bilibili/download to use downloadMedia()
- Docs: add weixin to README, README.zh-CN, download docs, adapter docs
1 parent 722c180 commit b7c6c02
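
The call sites in this commit suggest how the shared media helper names its output files. The following is a minimal sketch of that naming step only, assuming the item and option shapes visible in the bilibili and twitter adapters. `MediaItem`, `NamingOptions`, `resolveDestination`, the `'video'` type, and the `'media'` fallback prefix are hypothetical names chosen for illustration; the real definitions in src/download/media-download.ts may differ.

```typescript
// Hypothetical shapes inferred from the adapter call sites in this commit.
// 'video' (a direct video URL) is an assumed type name.
type MediaType = 'image' | 'video' | 'video-tweet' | 'video-ytdlp';

interface MediaItem {
  type: MediaType;
  url: string;
  filename?: string; // explicit name, as passed by the bilibili adapter
}

interface NamingOptions {
  output: string;          // base output directory
  subdir?: string;         // e.g. 'tweets' or a username
  filenamePrefix?: string; // e.g. a bvid or a username
}

// Build the destination path for the item at a given zero-based index.
// Mirrors the naming in the removed twitter loop: `<prefix>_<n>.<ext>`,
// with 'jpg' for images and 'mp4' for everything else.
function resolveDestination(item: MediaItem, index: number, opts: NamingOptions): string {
  const dir = opts.subdir ? `${opts.output}/${opts.subdir}` : opts.output;
  const ext = item.type === 'image' ? 'jpg' : 'mp4';
  const name = item.filename ?? `${opts.filenamePrefix ?? 'media'}_${index + 1}.${ext}`;
  return `${dir}/${name}`;
}
```

An explicit `filename` takes precedence in this sketch, which matches how the bilibili adapter passes `${bvid}_${title}.mp4` directly.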

12 files changed: +775 -427 lines changed

README.md

Lines changed: 5 additions & 0 deletions
@@ -115,6 +115,7 @@ Run `opencli list` for the live registry.
 | **apple-podcasts** | `search` `episodes` `top` | Public |
 | **xiaoyuzhou** | `podcast` `podcast-episodes` `episode` | Public |
 | **zhihu** | `hot` `search` `question` `download` | Browser |
+| **weixin** | `download` | Browser |
 | **youtube** | `search` `video` `transcript` | Browser |
 | **boss** | `search` `detail` `recommend` `joblist` `greet` `batchgreet` `send` `chatlist` `chatmsg` `invite` `mark` `exchange` `resume` `stats` | Browser |
 | **coupang** | `search` `add-to-cart` | Browser |
@@ -192,6 +193,7 @@ OpenCLI supports downloading images, videos, and articles from supported platfor
 | **bilibili** | Videos | Requires `yt-dlp` installed |
 | **twitter** | Images, Videos | Downloads from user media tab or single tweet |
 | **zhihu** | Articles (Markdown) | Exports articles with optional image download |
+| **weixin** | Articles (Markdown) | Exports WeChat Official Account articles |

 ### Prerequisites

@@ -225,6 +227,9 @@ opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --output ./zhihu

 # Export with local images
 opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --download-images
+
+# Export WeChat article to Markdown
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --output ./weixin
 ```


README.zh-CN.md

Lines changed: 5 additions & 0 deletions
@@ -117,6 +117,7 @@ npm install -g @jackwener/opencli@latest
 | **apple-podcasts** | `search` `episodes` `top` | Public |
 | **xiaoyuzhou** | `podcast` `podcast-episodes` `episode` | Public |
 | **zhihu** | `hot` `search` `question` `download` | Browser |
+| **weixin** | `download` | Browser |
 | **youtube** | `search` `video` `transcript` | Browser |
 | **boss** | `search` `detail` `recommend` `joblist` `greet` `batchgreet` `send` `chatlist` `chatmsg` `invite` `mark` `exchange` `resume` `stats` | Browser |
 | **coupang** | `search` `add-to-cart` | Browser |
@@ -194,6 +195,7 @@ OpenCLI supports downloading images, videos, and articles from various platforms.
 | **Bilibili** | Videos | Requires `yt-dlp` installed |
 | **Twitter/X** | Images, Videos | Downloads from a user's media tab or a single tweet |
 | **Zhihu** | Articles (Markdown) | Exports articles, with optional local image download |
+| **WeChat Official Accounts** | Articles (Markdown) | Exports WeChat Official Account articles as Markdown |

 ### Prerequisites

@@ -227,6 +229,9 @@ opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --output ./zhihu

 # Export and download images
 opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --download-images
+
+# Export a WeChat Official Account article as Markdown
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --output ./weixin
 ```


docs/adapters/browser/weixin.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# WeChat (微信公众号)
+
+**Mode**: 🔐 Browser · **Domain**: `mp.weixin.qq.com`
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `opencli weixin download` | Download a WeChat Official Account article as Markdown |
+
+## Usage Examples
+
+```bash
+# Export article to Markdown
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --output ./weixin
+
+# Export with locally downloaded images
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --download-images
+
+# Export without images
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --no-download-images
+```
+
+## Output
+
+Downloads to `<output>/<article-title>/`:
+- `<article-title>.md`: Markdown with frontmatter (title, author, publish time, source URL)
+- `images/`: Downloaded images (if `--download-images` is enabled, default: true)
+
+## Prerequisites
+
+- Chrome running and **logged into** mp.weixin.qq.com (for articles behind login wall)
+- [Browser Bridge extension](/guide/browser-bridge) installed
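
The output layout documented above (a per-article directory containing a Markdown file with frontmatter) could be produced along these lines. This is a sketch under stated assumptions, not the helper in src/download/article-download.ts: `ArticleMeta`, `safeName`, `renderArticle`, `articlePath`, the frontmatter key names, and the filename sanitization rule are all hypothetical.

```typescript
// Hypothetical metadata shape; the frontmatter fields follow the docs above
// (title, author, publish time, source URL), but the key names are assumed.
interface ArticleMeta {
  title: string;
  author: string;
  publishTime: string;
  sourceUrl: string;
}

// Replace characters that are unsafe in file names (assumed rule).
function safeName(title: string): string {
  const cleaned = title.replace(/[\\\/:*?"<>|]/g, '_').trim();
  return cleaned.length > 0 ? cleaned : 'untitled';
}

// Prepend a YAML frontmatter block to the converted Markdown body.
function renderArticle(meta: ArticleMeta, bodyMarkdown: string): string {
  // JSON.stringify yields double-quoted strings, which YAML also accepts
  // as quoted scalars for typical titles and URLs.
  const frontmatter = [
    '---',
    `title: ${JSON.stringify(meta.title)}`,
    `author: ${JSON.stringify(meta.author)}`,
    `publish_time: ${JSON.stringify(meta.publishTime)}`,
    `source: ${JSON.stringify(meta.sourceUrl)}`,
    '---',
  ].join('\n');
  return `${frontmatter}\n\n${bodyMarkdown.trim()}\n`;
}

// <output>/<article-title>/<article-title>.md, as described in the docs.
function articlePath(output: string, title: string): string {
  const name = safeName(title);
  return `${output}/${name}/${name}.md`;
}
```

Quoting every value avoids YAML surprises when a title contains a colon or a leading `#`.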

docs/advanced/download.md

Lines changed: 4 additions & 0 deletions
@@ -10,6 +10,7 @@ OpenCLI supports downloading images, videos, and articles from supported platfor
 | **bilibili** | Videos | Requires `yt-dlp` installed |
 | **twitter** | Images, Videos | Downloads from user media tab or single tweet |
 | **zhihu** | Articles (Markdown) | Exports articles with optional image download |
+| **weixin** | Articles (Markdown) | Exports WeChat Official Account articles |

 ## Prerequisites

@@ -43,6 +44,9 @@ opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --output ./zhihu

 # Export with local images
 opencli zhihu download "https://zhuanlan.zhihu.com/p/xxx" --download-images
+
+# Export WeChat article to Markdown
+opencli weixin download --url "https://mp.weixin.qq.com/s/xxx" --output ./weixin
 ```

 ## Pipeline Step (YAML Adapters)

src/clis/bilibili/download.ts

Lines changed: 25 additions & 83 deletions
@@ -8,18 +8,9 @@
  * - yt-dlp must be installed: pip install yt-dlp
  */

-import * as fs from 'node:fs';
-import * as path from 'node:path';
 import { cli, Strategy } from '../../registry.js';
-import {
-  ytdlpDownload,
-  checkYtdlp,
-  sanitizeFilename,
-  getTempDir,
-  exportCookiesToNetscape,
-  formatCookieHeader,
-} from '../../download/index.js';
-import { DownloadProgressTracker, formatBytes } from '../../download/progress.js';
+import { checkYtdlp, sanitizeFilename } from '../../download/index.js';
+import { downloadMedia } from '../../download/media-download.js';

 cli({
   site: 'bilibili',
@@ -63,21 +54,8 @@ cli({

     const title = sanitizeFilename(data?.title || 'video');

-    // Extract cookies for authenticated downloads
-    const cookies = await page.getCookies({ domain: 'bilibili.com' });
-    const cookieString = formatCookieHeader(cookies);
-
-    // Create output directory
-    fs.mkdirSync(output, { recursive: true });
-
-    // Export cookies to Netscape format for yt-dlp
-    let cookiesFile: string | undefined;
-    if (cookies.length > 0) {
-      const tempDir = getTempDir();
-      fs.mkdirSync(tempDir, { recursive: true });
-      cookiesFile = path.join(tempDir, `bilibili_cookies_${Date.now()}.txt`);
-      exportCookiesToNetscape(cookies, cookiesFile);
-    }
+    // Extract cookies for yt-dlp
+    const browserCookies = await page.getCookies({ domain: 'bilibili.com' });

     // Build yt-dlp format string based on quality
     let format = 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best';
@@ -89,62 +67,26 @@ cli({
       format = 'bestvideo[height<=480][ext=mp4]+bestaudio[ext=m4a]/best[height<=480]';
     }

-    const destPath = path.join(output, `${bvid}_${title}.mp4`);
-
-    const tracker = new DownloadProgressTracker(1, true);
-    const progressBar = tracker.onFileStart(`${bvid}.mp4`, 0);
-
-    try {
-      const result = await ytdlpDownload(
-        `https://www.bilibili.com/video/${bvid}`,
-        destPath,
-        {
-          cookiesFile,
-          format,
-          extraArgs: [
-            '--merge-output-format', 'mp4',
-            '--embed-thumbnail',
-          ],
-          onProgress: (percent) => {
-            if (progressBar) progressBar.update(percent, 100);
-          },
-        },
-      );
-
-      if (progressBar) {
-        progressBar.complete(result.success, result.success ? formatBytes(result.size) : undefined);
-      }
-
-      tracker.onFileComplete(result.success);
-      tracker.finish();
-
-      // Cleanup cookies file
-      if (cookiesFile && fs.existsSync(cookiesFile)) {
-        fs.unlinkSync(cookiesFile);
-      }
-
-      return [{
-        bvid,
-        title: data?.title || 'video',
-        status: result.success ? 'success' : 'failed',
-        size: result.success ? formatBytes(result.size) : (result.error || 'unknown error'),
-      }];
-    } catch (err: any) {
-      if (progressBar) progressBar.fail(err.message);
-      tracker.onFileComplete(false);
-      tracker.finish();
-
-      // Cleanup cookies file
-      if (cookiesFile && fs.existsSync(cookiesFile)) {
-        fs.unlinkSync(cookiesFile);
-      }
-
-      return [{
-        bvid,
-        title: data?.title || 'video',
-        status: 'failed',
-        size: err.message,
-      }];
-    }
+    const videoUrl = `https://www.bilibili.com/video/${bvid}`;
+    const filename = `${bvid}_${title}.mp4`;
+
+    const results = await downloadMedia(
+      [{ type: 'video-ytdlp', url: videoUrl, filename }],
+      {
+        output,
+        browserCookies,
+        filenamePrefix: bvid,
+        ytdlpExtraArgs: ['-f', format, '--merge-output-format', 'mp4', '--embed-thumbnail'],
+      },
+    );
+
+    // Map results to bilibili-specific columns
+    const r = results[0] || { status: 'failed', size: '-' };
+    return [{
+      bvid,
+      title: data?.title || 'video',
+      status: r.status,
+      size: r.size,
+    }];
   },
 });
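
The removed block exported browser cookies to a Netscape-format file for yt-dlp, a step the commit message says the shared helper now performs automatically. As an illustration of what that export involves, here is a minimal sketch of serializing cookies into the Netscape cookie file layout (seven tab-separated fields per line). `BrowserCookie` and `toNetscapeCookieFile` are hypothetical names, and the cookie shape is an assumption; the project's own `exportCookiesToNetscape` may behave differently.

```typescript
// Assumed cookie shape; the object returned by page.getCookies() may differ.
interface BrowserCookie {
  name: string;
  value: string;
  domain: string;
  path?: string;
  secure?: boolean;
  expires?: number; // Unix seconds; absent or non-positive for session cookies
}

// Serialize cookies as a Netscape cookie file. Field order per line:
// domain, include-subdomains flag, path, secure flag, expiry, name, value.
function toNetscapeCookieFile(cookies: BrowserCookie[]): string {
  const lines = cookies.map((c) => {
    const includeSubdomains = c.domain.startsWith('.') ? 'TRUE' : 'FALSE';
    const expiry = c.expires && c.expires > 0 ? Math.floor(c.expires) : 0;
    return [
      c.domain,
      includeSubdomains,
      c.path ?? '/',
      c.secure ? 'TRUE' : 'FALSE',
      String(expiry),
      c.name,
      c.value,
    ].join('\t');
  });
  return ['# Netscape HTTP Cookie File', ...lines].join('\n') + '\n';
}
```

The resulting text would be written to a temporary file and passed to yt-dlp via its `--cookies` option. The removed code duplicated the temp-file cleanup in both the success and error paths, so a shared helper can do it once, for example in a `finally` block.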

src/clis/twitter/download.ts

Lines changed: 13 additions & 111 deletions
@@ -6,19 +6,9 @@
  * opencli twitter download --tweet-url https://x.com/xxx/status/123 --output ./twitter
  */

-import * as fs from 'node:fs';
-import * as path from 'node:path';
 import { cli, Strategy } from '../../registry.js';
-import {
-  httpDownload,
-  ytdlpDownload,
-  checkYtdlp,
-  sanitizeFilename,
-  getTempDir,
-  exportCookiesToNetscape,
-  formatCookieHeader,
-} from '../../download/index.js';
-import { DownloadProgressTracker, formatBytes } from '../../download/progress.js';
+import { formatCookieHeader } from '../../download/index.js';
+import { downloadMedia } from '../../download/media-download.js';

 cli({
   site: 'twitter',
@@ -101,32 +91,11 @@ cli({
     `);

     if (!data || data.length === 0) {
-      return [{
-        index: 0,
-        type: '-',
-        status: 'failed',
-        size: 'No media found',
-      }];
+      return [{ index: 0, type: '-', status: 'failed', size: 'No media found' }];
     }

     // Extract cookies
-    const cookies = await page.getCookies({ domain: 'x.com' });
-    const cookieString = formatCookieHeader(cookies);
-
-    // Create output directory
-    const outputDir = tweetUrl
-      ? path.join(output, 'tweets')
-      : path.join(output, username || 'media');
-    fs.mkdirSync(outputDir, { recursive: true });
-
-    // Export cookies for yt-dlp
-    let cookiesFile: string | undefined;
-    if (cookies.length > 0) {
-      const tempDir = getTempDir();
-      fs.mkdirSync(tempDir, { recursive: true });
-      cookiesFile = path.join(tempDir, `twitter_cookies_${Date.now()}.txt`);
-      exportCookiesToNetscape(cookies, cookiesFile);
-    }
+    const browserCookies = await page.getCookies({ domain: 'x.com' });

     // Deduplicate media
     const seen = new Set<string>();
@@ -136,81 +105,14 @@ cli({
       return true;
     }).slice(0, limit);

-    const tracker = new DownloadProgressTracker(uniqueMedia.length, true);
-    const results: any[] = [];
-
-    for (let i = 0; i < uniqueMedia.length; i++) {
-      const media = uniqueMedia[i];
-      const ext = media.type === 'image' ? 'jpg' : 'mp4';
-      const filename = `${username || 'tweet'}_${i + 1}.${ext}`;
-      const destPath = path.join(outputDir, filename);
-
-      const progressBar = tracker.onFileStart(filename, i);
-
-      try {
-        let result: { success: boolean; size: number; error?: string };
-
-        if (media.type === 'video-tweet' && checkYtdlp()) {
-          // Use yt-dlp for video tweets
-          result = await ytdlpDownload(media.url, destPath, {
-            cookiesFile,
-            extraArgs: ['--merge-output-format', 'mp4'],
-            onProgress: (percent) => {
-              if (progressBar) progressBar.update(percent, 100);
-            },
-          });
-        } else if (media.type === 'image') {
-          // Direct HTTP download for images
-          result = await httpDownload(media.url, destPath, {
-            cookies: cookieString,
-            timeout: 30000,
-            onProgress: (received, total) => {
-              if (progressBar) progressBar.update(received, total);
-            },
-          });
-        } else {
-          // Direct HTTP download for direct video URLs
-          result = await httpDownload(media.url, destPath, {
-            cookies: cookieString,
-            timeout: 60000,
-            onProgress: (received, total) => {
-              if (progressBar) progressBar.update(received, total);
-            },
-          });
-        }
-
-        if (progressBar) {
-          progressBar.complete(result.success, result.success ? formatBytes(result.size) : undefined);
-        }
-
-        tracker.onFileComplete(result.success);
-
-        results.push({
-          index: i + 1,
-          type: media.type === 'video-tweet' ? 'video' : media.type,
-          status: result.success ? 'success' : 'failed',
-          size: result.success ? formatBytes(result.size) : (result.error || 'unknown error'),
-        });
-      } catch (err: any) {
-        if (progressBar) progressBar.fail(err.message);
-        tracker.onFileComplete(false);
-
-        results.push({
-          index: i + 1,
-          type: media.type,
-          status: 'failed',
-          size: err.message,
-        });
-      }
-    }
-
-    tracker.finish();
-
-    // Cleanup cookies file
-    if (cookiesFile && fs.existsSync(cookiesFile)) {
-      fs.unlinkSync(cookiesFile);
-    }
-
-    return results;
+    const subdir = tweetUrl ? 'tweets' : (username || 'media');
+    return downloadMedia(uniqueMedia, {
+      output,
+      subdir,
+      cookies: formatCookieHeader(browserCookies),
+      browserCookies,
+      filenamePrefix: username || 'tweet',
+      ytdlpExtraArgs: ['--merge-output-format', 'mp4'],
+    });
   },
 });
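
The per-item branching in the removed loop is the "yt-dlp routing" the commit message attributes to the shared helper. The decision itself can be isolated as a pure function, sketched below from the removed twitter code: video tweets go to yt-dlp when it is installed, images use HTTP with a 30 s timeout, and everything else uses HTTP with a 60 s timeout. `Route`, `routeMedia`, and `cookieHeader` are hypothetical names, and how the real `downloadMedia` handles other types such as `'video-ytdlp'` is not shown in this diff.

```typescript
type Downloader = 'yt-dlp' | 'http';

interface Route {
  downloader: Downloader;
  timeoutMs?: number; // only meaningful for plain HTTP downloads
}

// Pick a download strategy for one media item, following the branching in
// the removed twitter loop. `ytdlpAvailable` stands in for checkYtdlp().
function routeMedia(type: string, ytdlpAvailable: boolean): Route {
  if (type === 'video-tweet' && ytdlpAvailable) {
    return { downloader: 'yt-dlp' };
  }
  if (type === 'image') {
    return { downloader: 'http', timeoutMs: 30000 };
  }
  // Direct video URLs, and video tweets when yt-dlp is missing.
  return { downloader: 'http', timeoutMs: 60000 };
}

// Assumed behaviour of a Cookie request header formatter: name=value pairs
// joined with '; '. The project's formatCookieHeader may differ in detail.
function cookieHeader(cookies: { name: string; value: string }[]): string {
  return cookies.map((c) => `${c.name}=${c.value}`).join('; ');
}
```

Keeping the routing decision separate from the download calls makes it straightforward to unit-test without network access.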
