Add cover image and bold font support with CJK markdown fixes #333
Code Review
This pull request introduces bold font support for EPUB generation, implements a preprocessor to fix CJK bold parsing issues, and improves PDF rendering with pseudo-bolding, orphan protection for code blocks, and better image page-breaking. The review feedback suggests improving the markdown fence parsing and bold text regex in the preprocessor to support nested blocks and single asterisks (crucial for Zig pointers), parallelizing font downloads to optimize build performance, and removing a redundant layout check in the PDF renderer.
| export function fixCjkStrong(md: string): string { | ||
| let inFence = false; | ||
| let fenceMark = ""; | ||
| return md | ||
| .split(/\r?\n/) | ||
| .map((ln) => { | ||
| const fence = ln.match(/^\s*(```+|~~~+)/); | ||
| if (fence) { | ||
| if (!inFence) { | ||
| inFence = true; | ||
| fenceMark = fence[1][0]; | ||
| } else if (fence[1][0] === fenceMark) { | ||
| inFence = false; | ||
| fenceMark = ""; | ||
| } | ||
| return ln; | ||
| } |
The current fence parsing logic in fixCjkStrong is naive and can be broken by nested code blocks or different-length fences (e.g., a 3-backtick fence inside a 4-backtick fence). In CommonMark, a code fence can only be closed by a fence of the same character and of equal or greater length. We should track both the fence character and its length to correctly handle nested fences and prevent premature closing.
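A standalone sketch of the rule described above, useful for checking the behaviour in isolation. `markFencedLines` is a hypothetical helper, not part of the PR. It keeps the PR's permissive `^\s*` indentation match and does not model other CommonMark details such as the three-space indent limit or the ban on info strings after a closing fence.

```typescript
// Hypothetical helper (not from the PR): for each line, report whether it
// belongs to a fenced code block, counting the fence lines themselves.
function markFencedLines(lines: string[]): boolean[] {
  let fenceChar = ""; // "" means "not inside a fence"
  let fenceLength = 0;
  return lines.map((ln) => {
    const m = ln.match(/^\s*(`{3,}|~{3,})/);
    if (m) {
      const char = m[1][0];
      const len = m[1].length;
      if (fenceChar === "") {
        // Opening fence: remember which character and how many of them.
        fenceChar = char;
        fenceLength = len;
      } else if (char === fenceChar && len >= fenceLength) {
        // Closing fence: same character, at least as long as the opener.
        fenceChar = "";
        fenceLength = 0;
      }
      // A shorter or different-character fence inside a block is just content.
      return true;
    }
    return fenceChar !== "";
  });
}

const tick = "`";
const nested = [
  tick.repeat(4) + "md", // opens a 4-backtick fence
  tick.repeat(3) + "zig", // inner 3-backtick fence: must not close the outer one
  "**bold**",
  tick.repeat(3),
  tick.repeat(4), // closes the outer fence
  "**after**",
];
console.log(markFencedLines(nested)); // [true, true, true, true, true, false]
```

With character-only tracking, the second line would close the block and `**bold**` would be rewritten as if it were prose.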
export function fixCjkStrong(md: string): string {
  let inFence = false;
  let fenceChar = "";
  let fenceLength = 0;
  return md
    .split(/\r?\n/)
    .map((ln) => {
      const fence = ln.match(/^\s*(```+|~~~+)/);
      if (fence) {
        const char = fence[1][0];
        const len = fence[1].length;
        if (!inFence) {
          inFence = true;
          fenceChar = char;
          fenceLength = len;
        } else if (char === fenceChar && len >= fenceLength) {
          inFence = false;
          fenceChar = "";
          fenceLength = 0;
        }
        return ln;
      }
| codeSpans.push(m); | ||
| return `\u0000${codeSpans.length - 1}\u0000`; | ||
| }); | ||
| s = s.replace(/\*\*(?!\s)([^*\n]+?)(?<!\s)\*\*/g, "<strong>$1</strong>"); |
The regex [^*\n]+? prevents matching any bold text that contains a single asterisk * (such as pointer types or dereferences like **指针 *p** or **类型 *T**). Since this is a Zig programming course, pointer notation is extremely common. We can improve the regex to allow single asterisks inside bold text by using (?:[^*\n]|\*(?!\*))+? instead.
| s = s.replace(/\*\*(?!\s)([^*\n]+?)(?<!\s)\*\*/g, "<strong>$1</strong>"); | |
| s = s.replace(/\*\*(?!\s)((?:[^*\n]|\*(?!\*))+?)(?<!\s)\*\*/g, "<strong>$1</strong>"); |
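A quick way to compare the two patterns side by side. `toStrong` and the sample strings are illustrative only; the two regular expressions are copied from the diff above.

```typescript
// Pattern before the change: no asterisk at all is allowed inside the bold span.
const OLD_STRONG = /\*\*(?!\s)([^*\n]+?)(?<!\s)\*\*/g;
// Suggested pattern: a single "*" is allowed as long as it is not followed by another "*".
const NEW_STRONG = /\*\*(?!\s)((?:[^*\n]|\*(?!\*))+?)(?<!\s)\*\*/g;

// Illustrative wrapper (hypothetical name) around the replace call in the diff.
function toStrong(line: string, pattern: RegExp): string {
  return line.replace(pattern, "<strong>$1</strong>");
}

const pointer = "**指针 *p**的用法";
console.log(toStrong(pointer, OLD_STRONG)); // unchanged: **指针 *p**的用法
console.log(toStrong(pointer, NEW_STRONG)); // <strong>指针 *p</strong>的用法

// Bold text without an inner asterisk behaves the same under both patterns.
console.log(toStrong("**粗体**", OLD_STRONG)); // <strong>粗体</strong>
console.log(toStrong("**粗体**", NEW_STRONG)); // <strong>粗体</strong>
```

Because the quantifier stays lazy, two bold spans on one line are still matched separately rather than merged into one.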
| export function fixCjkStrong(lines: string[]): string[] { | ||
| let inFence = false; | ||
| let fenceMark = ""; | ||
| return lines.map((ln) => { | ||
| const fence = ln.match(/^\s*(```+|~~~+)/); | ||
| if (fence) { | ||
| if (!inFence) { | ||
| inFence = true; | ||
| fenceMark = fence[1][0]; | ||
| } else if (fence[1][0] === fenceMark) { | ||
| inFence = false; | ||
| fenceMark = ""; | ||
| } | ||
| return ln; | ||
| } |
The current fence parsing logic in fixCjkStrong is naive and can be broken by nested code blocks or different-length fences (e.g., a 3-backtick fence inside a 4-backtick fence). In CommonMark, a code fence can only be closed by a fence of the same character and of equal or greater length. We should track both the fence character and its length to correctly handle nested fences and prevent premature closing.
export function fixCjkStrong(lines: string[]): string[] {
  let inFence = false;
  let fenceChar = "";
  let fenceLength = 0;
  return lines.map((ln) => {
    const fence = ln.match(/^\s*(```+|~~~+)/);
    if (fence) {
      const char = fence[1][0];
      const len = fence[1].length;
      if (!inFence) {
        inFence = true;
        fenceChar = char;
        fenceLength = len;
      } else if (char === fenceChar && len >= fenceLength) {
        inFence = false;
        fenceChar = "";
        fenceLength = 0;
      }
      return ln;
    }
| protectedLine = protectedLine.replace( | ||
| /\*\*(?!\s)([^*\n]+?)(?<!\s)\*\*/g, | ||
| "<strong>$1</strong>", | ||
| ); |
The regex [^*\n]+? prevents matching any bold text that contains a single asterisk * (such as pointer types or dereferences like **指针 *p** or **类型 *T**). Since this is a Zig programming course, pointer notation is extremely common. We can improve the regex to allow single asterisks inside bold text by using (?:[^*\n]|\*(?!\*))+? instead.
| protectedLine = protectedLine.replace( | |
| /\*\*(?!\s)([^*\n]+?)(?<!\s)\*\*/g, | |
| "<strong>$1</strong>", | |
| ); | |
| protectedLine = protectedLine.replace( | |
| /\*\*(?!\s)((?:[^*\n]|\*(?!\*))+?)(?<!\s)\*\*/g, | |
| "<strong>$1</strong>", | |
| ); |
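| // Warm up the downloads serially first (populating the disk cache) to avoid a race on concurrent cache writes when normal/bold share the same fileName. | ||
| // Because download() caches the download of each font file, only the three original variable fonts are actually downloaded here. | ||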
| const cjk = await buildOne(config.fonts.cjk, usedText, config.cacheDir); | ||
| const cjkBold = await buildOne(config.fonts.cjkBold, usedText, config.cacheDir); | ||
| const sans = await buildOne(config.fonts.sans, usedText, config.cacheDir); | ||
| const sansBold = await buildOne( | ||
| config.fonts.sansBold, | ||
| usedText, | ||
| config.cacheDir, | ||
| ); | ||
| const mono = await buildOne( | ||
| config.fonts.mono, | ||
| asciiText + usedText, | ||
| config.cacheDir, | ||
| ); |
Currently, the font files are prepared sequentially to avoid concurrent write race conditions on the same file name. However, this serial execution slows down the build process significantly because downloading and subsetting are done one by one. We can parallelize the preparation by first downloading all unique font files in parallel (which populates the cache safely), and then running the CPU-intensive subsetting (buildOne) in parallel using Promise.all.
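| // Warm up the downloads serially first (populating the disk cache) to avoid a race on concurrent cache writes when normal/bold share the same fileName. | |
| // Because download() caches the download of each font file, only the three original variable fonts are actually downloaded here. | |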
| const cjk = await buildOne(config.fonts.cjk, usedText, config.cacheDir); | |
| const cjkBold = await buildOne(config.fonts.cjkBold, usedText, config.cacheDir); | |
| const sans = await buildOne(config.fonts.sans, usedText, config.cacheDir); | |
| const sansBold = await buildOne( | |
| config.fonts.sansBold, | |
| usedText, | |
| config.cacheDir, | |
| ); | |
| const mono = await buildOne( | |
| config.fonts.mono, | |
| asciiText + usedText, | |
| config.cacheDir, | |
| ); | |
| // First download all unique original font files in parallel to populate the cache, avoiding concurrent-write races | |
| await Promise.all([ | |
| download(config.fonts.cjk.url, path.join(config.cacheDir, config.fonts.cjk.fileName)), | |
| download(config.fonts.sans.url, path.join(config.cacheDir, config.fonts.sans.fileName)), | |
| download(config.fonts.mono.url, path.join(config.cacheDir, config.fonts.mono.fileName)), | |
| ]); | |
| // Then run the subsetting in parallel to take full advantage of concurrency | |
| const [cjk, cjkBold, sans, sansBold, mono] = await Promise.all([ | |
| buildOne(config.fonts.cjk, usedText, config.cacheDir), | |
| buildOne(config.fonts.cjkBold, usedText, config.cacheDir), | |
| buildOne(config.fonts.sans, usedText, config.cacheDir), | |
| buildOne(config.fonts.sansBold, usedText, config.cacheDir), | |
| buildOne(config.fonts.mono, asciiText + usedText, config.cacheDir), | |
| ]); |
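A sketch of the same two-phase flow that dedupes by `fileName` instead of listing the three downloads by hand. `FontSpec`, `uniqueByFileName`, and `prepareFonts` are hypothetical names, and `download` and `subset` are injected stand-ins for the project's `download()` and `buildOne()`, whose real signatures may differ.

```typescript
// Assumed shape of a font entry; the real config may carry more fields.
interface FontSpec {
  url: string;
  fileName: string;
}

// Keep the first spec seen for each fileName so every cache file has exactly one writer.
function uniqueByFileName(fonts: FontSpec[]): FontSpec[] {
  const seen = new Map<string, FontSpec>();
  for (const font of fonts) {
    if (!seen.has(font.fileName)) seen.set(font.fileName, font);
  }
  return Array.from(seen.values());
}

// `download` and `subset` are stand-ins (assumptions) for download() and buildOne().
async function prepareFonts<T>(
  fonts: FontSpec[],
  cacheDir: string,
  download: (url: string, dest: string) => Promise<void>,
  subset: (font: FontSpec) => Promise<T>,
): Promise<T[]> {
  // Phase 1: one download per unique cache file, all in parallel.
  await Promise.all(
    uniqueByFileName(fonts).map((f) => download(f.url, `${cacheDir}/${f.fileName}`)),
  );
  // Phase 2: every variant reads the warmed cache, so subsetting can run concurrently.
  return Promise.all(fonts.map((f) => subset(f)));
}
```

Deriving the download list from the config keeps the race protection intact if a bold variant is later pointed at its own file, which a hardcoded list of three downloads would silently miss.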
| if (this.y + drawH + 4 > A4.h - MARGIN.bottom) { | ||
| this.newPage(); | ||
| } else { | ||
| this.ensureSpace(drawH + 4); | ||
| } |
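The review summary calls this check redundant. A minimal sketch of why, assuming `ensureSpace(h)` starts a new page exactly when `y + h` would pass the bottom margin; the renderer's actual `ensureSpace` is not shown in this thread, so `PageCursor` and both helper functions are hypothetical.

```typescript
// Hypothetical stand-in for the renderer's page state.
class PageCursor {
  y: number;
  pages = 1;
  readonly pageBottom: number; // plays the role of A4.h - MARGIN.bottom

  constructor(startY: number, pageBottom: number) {
    this.y = startY;
    this.pageBottom = pageBottom;
  }

  newPage(): void {
    this.pages += 1;
    this.y = 0;
  }

  // Assumed behaviour: break the page only when `needed` does not fit.
  ensureSpace(needed: number): void {
    if (this.y + needed > this.pageBottom) this.newPage();
  }
}

// Shape of the code under review: an explicit check, then ensureSpace in the else branch.
function placeWithBranch(c: PageCursor, needed: number): void {
  if (c.y + needed > c.pageBottom) {
    c.newPage();
  } else {
    c.ensureSpace(needed); // cannot break here: the same condition was just false
  }
}

// Simplified form: a single call covers both branches.
function placeSimplified(c: PageCursor, needed: number): void {
  c.ensureSpace(needed);
}
```

If the real `ensureSpace` does more than this (for example, resetting fonts on a new page), the two forms may differ and the explicit branch would need a closer look.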
| if (!this._dry) this.doc.text(piece, x, curY); | ||
| if (!this._dry) { | ||
| if (bold) { | ||
| // 伪粗体:用填充+描边模式加粗笔画(无需额外 bold 字体) |
🚫 [AutoCorrect Lint] <AutoCorrect> reported by reviewdog 🐶
| // 伪粗体:用填充+描边模式加粗笔画(无需额外 bold 字体) | |
| // 伪粗体:用填充 + 描边模式加粗笔画(无需额外 bold 字体) |
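Fix `bun check` (prettier) and AutoCorrect CI. Both were already red on the main branch: upgrading prettier from an older version to 3.8.4 caused formatting drift (7 files), and autocorrect had 8 leftover CJK spacing/punctuation issues; this PR introduces 1 more (the "填充+描边" ("fill+stroke") comment in renderer.ts). All changes are line-break/indentation and CJK spacing adjustments, with no logic changes.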
511e765 to fe4be9d
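- fixCjkStrong (epub/pdf): track the fence character and length per the CommonMark rules, so variable-length and nested fences (e.g. a 3-backtick fence inside a 4-backtick fence) are handled correctly and do not close prematurely.
- fixCjkStrong: the bold regex now allows a single * inside the span (e.g. Zig pointers *T / *p); previously [^*\n]+? failed to match bold text containing pointer notation.
- renderer image page break: the if-else was equivalent to ensureSpace, so it is simplified to a single ensureSpace call.
- fonts: warm up the downloads in parallel, deduplicated by fileName, then subset in parallel, replacing the serial flow (removes the write race and speeds up the build).

All 14 fixCjkStrong edge cases pass; EPUB/PDF rebuilds show no regressions (images are not clipped, bold renders correctly).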
Summary
- Add `.vitepress/epub/cover.png` and update config to reference it
- Add `fixCjkStrong()` to convert `**text**` → `<strong>text</strong>` where CommonMark flanking rules fail with CJK punctuation/text
Why
Cover image improves presentation. Bold fonts enable proper typography (especially critical for CJK titles and emphasis) without reader pseudo-bold. CJK markdown fix resolves parsing failure for adjacent strong emphasis. Layout improvements prevent content loss in PDF rendering.