feat: Local TTML lyrics can resolve the NetEase Cloud Music ID from metadata and cache it; Traditional Chinese translation segments are suppressed automatically #943
kid141252010 wants to merge 6 commits into imsyy:dev
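Summary of Changes
Hello @kid141252010, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request aims to improve how the application handles local lyric files, TTML in particular. It introduces a cache that associates local TTML lyric files with NetEase Cloud Music IDs, which speeds up lyric loading and matching. The improvement should be most noticeable when a large number of local lyric files are present.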
Code Review
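This PR introduces NetEase Cloud Music ID resolution and a caching mechanism for local TTML lyrics, which is a great feature. The code implements a TtmlIdMappingCache class to manage the cache and locates lyric files through a multi-level lookup strategy (cache -> filename -> metadata scan); the logic is fairly clear.
However, the current implementation has several areas that could be improved:
- Cache persistence: cache writes and deletes are persisted inconsistently, which may lead to lost cache data.
- Performance: in some cases, a lyric lookup triggers a very expensive full scan, which may block the UI.
- Logic defect: when the TTML is loaded from the cache, the lookup for the corresponding LRC file is skipped.
- Code practices: synchronous I/O is used in some places, and there are redundant filesystem calls.
See the comments below for specific suggestions. Once these issues are fixed, the feature will be more robust and efficient.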
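The multi-level lookup order mentioned above (cache -> filename -> metadata scan) can be sketched as a generic fallback chain. This is only an illustration: `firstHit`, `LookupTier`, and the tier names are hypothetical and do not appear in the PR, and the real code interleaves cache updates with each step.

```typescript
// Hypothetical helper: run lookup tiers in order and return the first non-empty hit.
interface LookupTier {
  name: string;
  lookup: () => Promise<string | null>;
}

interface LookupHit {
  source: string;
  content: string;
}

async function firstHit(tiers: LookupTier[]): Promise<LookupHit | null> {
  for (const tier of tiers) {
    try {
      const content = await tier.lookup();
      if (content) {
        return { source: tier.name, content };
      }
    } catch {
      // A failing tier (for example an unreadable file) must not abort the chain;
      // real code would log the error here instead of swallowing it.
    }
  }
  return null;
}
```

Ordering the tiers from cheapest to most expensive keeps the costly metadata scan as a last resort.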
async function readLocalLyricImpl(
  lyricDirs: string[],
  id: number,
): Promise<{ lrc: string; ttml: string }> {
  const result = { lrc: "", ttml: "" };
  const cache = await getTtmlIdCache();

  // First, try to find the TTML in the cache
  const cached = cache.getById(id);
  if (cached) {
    try {
      const fileStat = await stat(cached.filePath);
      if (fileStat.mtimeMs === cached.mtime) {
        const ttmlContent = await readFile(cached.filePath, "utf-8");
        result.ttml = ttmlContent;
        console.log(`[readLocalLyric] Found TTML in cache: ${cached.filePath}`);
      } else {
        cache.delete(cached.filePath);
      }
    } catch {
      cache.delete(cached.filePath);
    }
  }

  // If the cache missed or was stale, fall back to filename matching (original logic)
  if (!result.ttml) {
    const patterns = {
      ttml: `**/{,*.}${id}.ttml`,
      lrc: `**/{,*.}${id}.lrc`,
    };

    for (const dir of lyricDirs) {
      try {
        if (!result.ttml) {
          const ttmlFiles = await FastGlob(patterns.ttml, globOpt(dir));
          if (ttmlFiles.length > 0) {
            const filePath = join(dir, ttmlFiles[0]);
            await access(filePath);
            result.ttml = await readFile(filePath, "utf-8");
            const fileStat = await stat(filePath);
            cache.set(id, filePath, fileStat.mtimeMs);
          }
        }

        if (!result.lrc) {
          const lrcFiles = await FastGlob(patterns.lrc, globOpt(dir));
          if (lrcFiles.length > 0) {
            const filePath = join(dir, lrcFiles[0]);
            await access(filePath);
            result.lrc = await readFile(filePath, "utf-8");
          }
        }

        if (result.ttml && result.lrc) break;
      } catch {
        // This path failed, skip it
      }
    }
  }

  // If no TTML was found by filename, walk all TTML files and match the ID via metadata
  if (!result.ttml) {
    for (const dir of lyricDirs) {
      try {
        const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
        for (const fileName of allTtmlFiles) {
          const filePath = join(dir, fileName);
          try {
            const ttmlContent = await readFile(filePath, "utf-8");
            const extractedId = extractNcmIdFromTTML(ttmlContent);
            if (extractedId === id) {
              result.ttml = ttmlContent;
              const fileStat = await stat(filePath);
              cache.set(id, filePath, fileStat.mtimeMs);
              await cache.save();
              break;
            }
          } catch {
            continue;
          }
        }
        if (result.ttml) break;
      } catch {
        continue;
      }
    }
  }

  return result;
}
    return null;
  } catch {
    return null;
  }
};

/**
 * Read lyrics from local directories (lookup by ID)
 * @param lyricDirs list of lyric directories
 * @param id song ID
 * @returns lyric content
 */
async function readLocalLyricImpl(
  lyricDirs: string[],
  id: number,
): Promise<{ lrc: string; ttml: string }> {
  const result = { lrc: "", ttml: "" };
  const cache = await getTtmlIdCache();

  // Step 1: look up the TTML in the cache
  const cached = cache.getById(id);
  if (cached) {
    try {
      const fileStat = await stat(cached.filePath);
      if (fileStat.mtimeMs === cached.mtime) {
        const ttmlContent = await readFile(cached.filePath, "utf-8");
        result.ttml = ttmlContent;
        console.log(`[readLocalLyric] Found TTML in cache: ${cached.filePath}`);
      } else {
        await cache.delete(cached.filePath);
      }
    } catch {
      await cache.delete(cached.filePath);
    }
  }

  // Step 2: always look for the LRC file, whether or not the TTML was found
  if (!result.lrc) {
    const lrcPattern = `**/{,*.}${id}.lrc`;
    for (const dir of lyricDirs) {
      try {
        const lrcFiles = await FastGlob(lrcPattern, globOpt(dir));
        if (lrcFiles.length > 0) {
          const filePath = join(dir, lrcFiles[0]);
          await access(filePath);
          result.lrc = await readFile(filePath, "utf-8");
          break;
        }
      } catch {
        // This path failed, skip it
      }
    }
  }

  // Step 3: if the TTML is still missing, try filename matching
  if (!result.ttml) {
    const ttmlPattern = `**/{,*.}${id}.ttml`;
    for (const dir of lyricDirs) {
      try {
        const ttmlFiles = await FastGlob(ttmlPattern, globOpt(dir));
        if (ttmlFiles.length > 0) {
          const filePath = join(dir, ttmlFiles[0]);
          await access(filePath);
          result.ttml = await readFile(filePath, "utf-8");
          const fileStat = await stat(filePath);
          await cache.set(id, filePath, fileStat.mtimeMs);
          break;
        }
      } catch {
        // This path failed, skip it
      }
    }
  }

  // Step 4: if the TTML is still missing, walk all TTML files and match the ID via metadata
  if (!result.ttml) {
    for (const dir of lyricDirs) {
      try {
        const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
        for (const fileName of allTtmlFiles) {
          const filePath = join(dir, fileName);
          try {
            const ttmlContent = await readFile(filePath, "utf-8");
            const extractedId = extractNcmIdFromTTML(ttmlContent);
            if (extractedId === id) {
              result.ttml = ttmlContent;
              const fileStat = await stat(filePath);
              await cache.set(id, filePath, fileStat.mtimeMs);
              break;
            }
          } catch {
            continue;
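The cache persistence and error handling logic in readLocalLyricImpl could be improved:
- Data-loss risk and performance:
  - In step 1 (cache invalidation) and step 3 (filename matching), cache.delete() and cache.set() modify the in-memory cache, but cache.save() is never called to persist the changes to the file. These changes are lost when the application restarts.
  - In step 4, await cache.save() is called inside the for loop. This causes unnecessarily frequent disk writes and hurts performance.
- Insufficient error handling:
  - Several try...catch blocks are empty and swallow errors entirely, which makes problems with file or directory access hard to debug.
  - The use of console.log is inconsistent with the project's ipcLog logging.
Consider refactoring so that the cache is saved once at the end of the function and ipcLog records detailed error information.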
async function readLocalLyricImpl(
  lyricDirs: string[],
  id: number,
): Promise<{ lrc: string; ttml: string }> {
  const result = { lrc: "", ttml: "" };
  const cache = await getTtmlIdCache();
  let isCacheDirty = false;
  // Step 1: look up the TTML in the cache
  const cached = cache.getById(id);
  if (cached) {
    try {
      const fileStat = await stat(cached.filePath);
      if (fileStat.mtimeMs === cached.mtime) {
        const ttmlContent = await readFile(cached.filePath, "utf-8");
        result.ttml = ttmlContent;
        ipcLog.info(`[readLocalLyric] Found TTML in cache: ${cached.filePath}`);
      } else {
        cache.delete(cached.filePath);
        isCacheDirty = true;
      }
    } catch (e) {
      ipcLog.warn(`[readLocalLyric] Failed to access cached TTML file, removing cache entry: ${cached.filePath}`, e);
      cache.delete(cached.filePath);
      isCacheDirty = true;
    }
  }
  // Step 2: always look for the LRC file, whether or not the TTML was found
  if (!result.lrc) {
    const lrcPattern = `**/{,*.}${id}.lrc`;
    for (const dir of lyricDirs) {
      try {
        const lrcFiles = await FastGlob(lrcPattern, globOpt(dir));
        if (lrcFiles.length > 0) {
          const filePath = join(dir, lrcFiles[0]);
          await access(filePath);
          result.lrc = await readFile(filePath, "utf-8");
          break;
        }
      } catch (e) {
        ipcLog.warn(`[readLocalLyric] Path error while looking for LRC file, skipped: ${dir}`, e);
      }
    }
  }
  // Step 3: if the TTML is still missing, try filename matching
  if (!result.ttml) {
    const ttmlPattern = `**/{,*.}${id}.ttml`;
    for (const dir of lyricDirs) {
      try {
        const ttmlFiles = await FastGlob(ttmlPattern, globOpt(dir));
        if (ttmlFiles.length > 0) {
          const filePath = join(dir, ttmlFiles[0]);
          await access(filePath);
          result.ttml = await readFile(filePath, "utf-8");
          const fileStat = await stat(filePath);
          cache.set(id, filePath, fileStat.mtimeMs);
          isCacheDirty = true;
          break;
        }
      } catch (e) {
        ipcLog.warn(`[readLocalLyric] Path error while looking for TTML file, skipped: ${dir}`, e);
      }
    }
  }
  // Step 4: if the TTML is still missing, walk all TTML files and match the ID via metadata
  if (!result.ttml) {
    for (const dir of lyricDirs) {
      try {
        const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
        for (const fileName of allTtmlFiles) {
          const filePath = join(dir, fileName);
          try {
            const ttmlContent = await readFile(filePath, "utf-8");
            const extractedId = extractNcmIdFromTTML(ttmlContent);
            if (extractedId === id) {
              result.ttml = ttmlContent;
              const fileStat = await stat(filePath);
              cache.set(id, filePath, fileStat.mtimeMs);
              isCacheDirty = true;
              break;
            }
          } catch (e) {
            ipcLog.warn(`[readLocalLyric] Failed to read or parse TTML file, skipped: ${filePath}`, e);
            continue;
          }
        }
        if (result.ttml) break;
      } catch (e) {
        ipcLog.warn(`[readLocalLyric] Error while walking TTML directory, skipped: ${dir}`, e);
        continue;
      }
    }
  }
  if (isCacheDirty) {
    await cache.save();
  }
  return result;
}

  /** Load the cache from file */
  async load(): Promise<void> {
    try {
      if (existsSync(this.cacheFilePath)) {
        const content = await readFile(this.cacheFilePath, "utf-8");
        const data = JSON.parse(content) as Record<string, TtmlIdCache>;
        this.cache = new Map(Object.entries(data));
        console.log(`[TtmlIdMappingCache] Loaded ${this.cache.size} cache entries`);
      }
    } catch (e) {
      console.error("[TtmlIdMappingCache] Failed to load cache:", e);
    }
  }

  /** Save the cache to file */
  private async saveInternal(): Promise<void> {
    if (this.pendingSave) return;
    this.pendingSave = true;
    try {
      const data = Object.fromEntries(this.cache);
      await writeFile(this.cacheFilePath, JSON.stringify(data, null, 2), "utf-8");
    } catch (e) {
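The TtmlIdMappingCache class uses console.log and console.error for logging. However, ipcLog is already imported and used elsewhere in the project, including in this file. To keep logging consistent and easy to manage and filter centrally, consider replacing the console calls with ipcLog.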
  /** Load the cache from file */
  async load(): Promise<void> {
    try {
      if (existsSync(this.cacheFilePath)) {
        const content = await readFile(this.cacheFilePath, "utf-8");
        const data = JSON.parse(content) as Record<string, TtmlIdCache>;
        this.cache = new Map(Object.entries(data));
        ipcLog.info(`[TtmlIdMappingCache] Loaded ${this.cache.size} cache entries`);
      }
    } catch (e) {
      ipcLog.error("[TtmlIdMappingCache] Failed to load cache:", e);
    }
  }
  /** Save the cache to file */
  async save(): Promise<void> {
    try {
      const data = Object.fromEntries(this.cache);
      await writeFile(this.cacheFilePath, JSON.stringify(data, null, 2), "utf-8");
    } catch (e) {
      ipcLog.error("[TtmlIdMappingCache] Failed to save cache:", e);
    }
  }

  async set(ncmId: number, filePath: string, mtime: number): Promise<void> {
    this.cache.set(`id:${ncmId}`, { ncmId, filePath, mtime });
    this.cache.set(`path:${filePath}`, { ncmId, filePath, mtime });
    await this.saveInternal();
  }
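The set method currently writes to disk on every call, which is very inefficient when it is called in a loop such as the one in scanTtmlIdMapping. Consider adding an autoSave option to control this behavior so that saves can be batched.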
  async set(
    ncmId: number,
    filePath: string,
    mtime: number,
    options: { autoSave: boolean } = { autoSave: true },
  ): Promise<void> {
    this.cache.set(`id:${ncmId}`, { ncmId, filePath, mtime });
    this.cache.set(`path:${filePath}`, { ncmId, filePath, mtime });
    if (options.autoSave) {
      await this.saveInternal();
    }
  }
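To make the batching idea concrete, here is a small self-contained model of a cache with an autoSave option and a dirty flag. It is a sketch under stated assumptions, not the PR's TtmlIdMappingCache: `BatchedIdCache`, `CacheEntry`, and the injected `persist` callback are hypothetical names, and the callback stands in for the real file write.

```typescript
// Illustrative model only; persistence is injected so the example needs no file system.
interface CacheEntry {
  ncmId: number;
  filePath: string;
  mtime: number;
}

type Persist = (snapshot: Record<string, CacheEntry>) => Promise<void>;

class BatchedIdCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly persist: Persist;
  private dirty = false;

  constructor(persist: Persist) {
    this.persist = persist;
  }

  async set(
    ncmId: number,
    filePath: string,
    mtime: number,
    options: { autoSave: boolean } = { autoSave: true },
  ): Promise<void> {
    const entry: CacheEntry = { ncmId, filePath, mtime };
    // Index the same entry by ID and by path, mirroring the key scheme in the PR.
    this.entries.set(`id:${ncmId}`, entry);
    this.entries.set(`path:${filePath}`, entry);
    this.dirty = true;
    if (options.autoSave) {
      await this.save();
    }
  }

  getById(ncmId: number): CacheEntry | undefined {
    return this.entries.get(`id:${ncmId}`);
  }

  /** Write the snapshot only if something changed since the last successful save. */
  async save(): Promise<void> {
    if (!this.dirty) return;
    this.dirty = false;
    try {
      await this.persist(Object.fromEntries(this.entries));
    } catch (e) {
      this.dirty = true; // keep the change pending so a later save can retry
      throw e;
    }
  }
}
```

A caller that updates many entries in a loop would pass `{ autoSave: false }` to each `set` and call `save()` once afterwards, which results in a single write.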

  return result;
}

/**
 * Scan all lyric directories in the background and build the TTML-ID mapping cache
 * @param lyricDirs list of lyric directories
 * @param onProgress progress callback (current, total)
 */
export async function scanTtmlIdMapping(
  lyricDirs: string[],
  onProgress?: (current: number, total: number) => void,
): Promise<number> {
  const cache = await getTtmlIdCache();
  let scannedCount = 0;

  for (const dir of lyricDirs) {
    try {
      const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
      const total = allTtmlFiles.length;

      for (let i = 0; i < allTtmlFiles.length; i++) {
        const fileName = allTtmlFiles[i];
        const filePath = join(dir, fileName);

        try {
          // Check whether a valid cache entry already exists
          const existingCache = cache.getByPath(filePath);
          if (existingCache) {
            const fileStat = await stat(filePath);
            if (fileStat.mtimeMs === existingCache.mtime) {
              continue; // Cache entry is valid, skip
            }
          }

          const ttmlContent = await readFile(filePath, "utf-8");
          const extractedId = extractNcmIdFromTTML(ttmlContent);
          if (extractedId) {
            const fileStat = await stat(filePath);
            await cache.set(extractedId, filePath, fileStat.mtimeMs);
            scannedCount++;
          }
        } catch {
          continue;
        }

        if (onProgress) {
          onProgress(i + 1, total);
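To improve performance when scanning a large number of files, it is better to update the cache in memory and save it to disk once after the scan finishes. Combined with the change to the set method, autoSave can be disabled inside the loop and save() called at the end of the function. This refactor also avoids calling stat multiple times on the same file.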
export async function scanTtmlIdMapping(
  lyricDirs: string[],
  onProgress?: (current: number, total: number) => void,
): Promise<number> {
  const cache = await getTtmlIdCache();
  let scannedCount = 0;
  let hasChanges = false;
  for (const dir of lyricDirs) {
    try {
      const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
      const total = allTtmlFiles.length;
      for (let i = 0; i < allTtmlFiles.length; i++) {
        const fileName = allTtmlFiles[i];
        const filePath = join(dir, fileName);
        try {
          const fileStat = await stat(filePath);
          // Check whether a valid cache entry already exists
          const existingCache = cache.getByPath(filePath);
          if (existingCache && fileStat.mtimeMs === existingCache.mtime) {
            continue; // Cache entry is valid, skip
          }
          const ttmlContent = await readFile(filePath, "utf-8");
          const extractedId = extractNcmIdFromTTML(ttmlContent);
          if (extractedId) {
            await cache.set(extractedId, filePath, fileStat.mtimeMs, { autoSave: false });
            hasChanges = true;
            scannedCount++;
          }
        } catch {
          continue;
        }
        if (onProgress) {
          onProgress(i + 1, total);
        }
      }
    } catch {
      continue;
    }
  }
  if (hasChanges) {
    await cache.save();
  }
  return scannedCount;
}
Code Review
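This PR introduces cache support for local TTML lyrics: it parses the NetEase Cloud Music ID from the metadata and builds a mapping, which significantly improves lyric loading performance. The main changes are a new TtmlIdMappingCache class that manages cache persistence, a background scan function scanTtmlIdMapping that builds the cache, and a refactor of readLocalLyric to use the new cache. Some related code cleanups are also included.
The overall structure is clear and the logic is sound. My review comments focus on the following areas to further improve code quality:
- Cache write efficiency: in readLocalLyricImpl, consolidate the scattered cache writes into a single save at the end of the function to avoid unnecessary repeated saves.
- Error handling: in scanTtmlIdMapping, log directory scan failures to make troubleshooting easier.
- Code conciseness: in cleanTTMLTranslations, the code can be simplified with a better regular expression.
These adjustments will help improve the robustness and performance of the code. Well done!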
      }
    } catch (e) {
      ipcLog.warn(`[readLocalLyric] Failed to access cached TTML file, removing cache entry: ${cached.filePath}`, e);
      await cache.delete(cached.filePath, { autoSave: false });

In readLocalLyricImpl, when access to the cached TTML file fails, calling cache.delete(cached.filePath) immediately triggers a cache save (because autoSave defaults to true). However, at the end of the function another save is triggered if isCacheDirty is true, so the cache is saved twice unnecessarily. For consistency and efficiency, use autoSave: false here and rely on the single save at the end of the function. In addition, the delete method is asynchronous and should be awaited.
      await cache.delete(cached.filePath, { autoSave: false });
    } catch {
      continue;
    }

  const filteredContent = ttmlContent
    .replace(
      /<translation[^>]*\btype="replacement"[^>]*\bxml:lang="zh-Hant"[^>]*>[\s\S]*?<\/translation>/g,
      "",
    )
    .replace(
      /<translation[^>]*\bxml:lang="zh-Hant"[^>]*\btype="replacement"[^>]*>[\s\S]*?<\/translation>/g,
      "",
    );

In cleanTTMLTranslations, two consecutive .replace() calls filter out translation tags that have both type="replacement" and xml:lang="zh-Hant"; the two calls exist to handle the two possible attribute orders. This works but is somewhat redundant. The two steps can be merged into one with a more efficient regular expression that uses positive lookaheads (?=...), which makes the code more concise.
  const filteredContent = ttmlContent.replace(
    /<translation(?=[^>]*\btype="replacement")(?=[^>]*\bxml:lang="zh-Hant")[^>]*>[\s\S]*?<\/translation>/g,
    "",
  );
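The lookahead approach can be checked on a small sample. The sketch below assumes the whitespace-tolerant, case-insensitive form of the pattern that appears in later revisions of this PR; `stripZhHantReplacement` and the sample markup are hypothetical and only illustrate that attribute order no longer matters. Like the original code, this is regex-based filtering, so it would not cope with a `>` inside an attribute value.

```typescript
// Both lookaheads start right after "<translation" and scan within the opening tag,
// so the two attributes may appear in either order.
const ZH_HANT_REPLACEMENT =
  /<translation(?=[^>]*\btype\s*=\s*["']replacement["'])(?=[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'])[^>]*>[\s\S]*?<\/translation>/gi;

function stripZhHantReplacement(ttml: string): string {
  return ttml.replace(ZH_HANT_REPLACEMENT, "");
}

const sampleTtml = [
  '<translation type="replacement" xml:lang="zh-Hant">A</translation>',
  "<translation xml:lang='zh-Hant-TW' type='replacement'>B</translation>",
  '<translation type="subtitle" xml:lang="zh-Hans">C</translation>',
].join("");

// Only the zh-Hans subtitle block remains.
console.log(stripZhHantReplacement(sampleTtml));
```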
Code Review
This pull request introduces a new caching mechanism for TTML lyrics to improve lookup performance. It adds a TtmlIdMappingCache class in MusicMetadataService.ts to store mappings between NCM IDs and TTML file paths, including file modification times for cache validation. A new scanTtmlIdMapping function is implemented to scan lyric directories, extract NCM IDs from TTML files, and populate this cache. The existing readLocalLyric method is refactored to utilize this cache and, if a TTML lyric is not found, it triggers a full scanTtmlIdMapping before re-attempting a lookup. A new IPC handler, scan-ttml-id-mapping, is added to ipc-file.ts to expose this scanning functionality. Additionally, the LyricManager.ts file is updated to enhance TTML cleaning by filtering out 'replacement' type 'zh-Hant' translations and modifying language counting/filtering logic to exclude these. Minor changes include converting a let to const in AutomixManager.ts and adding debug console.log statements in LyricManager.ts.
    try {
      const allTtmlFiles = await FastGlob("**/*.ttml", globOpt(dir));
      const total = allTtmlFiles.length;

      for (let i = 0; i < allTtmlFiles.length; i++) {
        const fileName = allTtmlFiles[i];
        const filePath = join(dir, fileName);

        try {
          const fileStat = await stat(filePath);
          // Check whether a valid cache entry already exists
          const existingCache = cache.getByPath(filePath);
          if (existingCache && fileStat.mtimeMs === existingCache.mtime) {
            continue; // Cache entry is valid, skip
          }

          const ttmlContent = await readFile(filePath, "utf-8");
          const extractedIds = extractNcmIdFromTTML(ttmlContent);
          if (extractedIds.length > 0) {
            await cache.set(extractedIds, filePath, fileStat.mtimeMs, { autoSave: false });
            hasChanges = true;
            scannedCount++;
          }
        } catch (e) {
          ipcLog.warn(`[scanTtmlIdMapping] Failed to process file, skipped: ${filePath}`, e);
          continue;
        }

        if (onProgress) {
          onProgress(i + 1, total);
        }
      }
    } catch {
      continue;
    }
  const filteredContent = ttmlContent
    .replace(
      /<translation[^>]*\btype\s*=\s*["']replacement["'][^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    )
    .replace(
      /<translation[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*\btype\s*=\s*["']replacement["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    );

Two very similar regular expressions are used here to handle the different orders of the type and xml:lang attributes, which makes the code somewhat redundant and harder to maintain.
Consider merging the two replacements into a single regular expression with positive lookaheads, which handles any attribute order more elegantly.
  const filteredContent = ttmlContent.replace(
    /<translation(?=[^>]*\btype\s*=\s*["']replacement["'])(?=[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'])[^>]*>[\s\S]*?<\/translation>/gi,
    "",
  );
Code Review
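Hello, and thank you for your contribution to local TTML lyric support. This change introduces a caching mechanism based on metadata IDs, which is a great enhancement. The overall structure is clear and the logic is fairly complete. I have suggestions in the following areas that may help you further optimize the code:
- Performance: in extractNcmIdFromTTML, use a Set instead of an array to deduplicate more efficiently.
- Robustness: in scanTtmlIdMapping, add logging to the empty catch block so that errors are not silently ignored.
- Maintainability: in cleanTTMLTranslations, merge the two regular expressions that handle different attribute orders into one to make the code more concise.
- Tidiness: remove the debugging console.log statement in fetchLocalOverrideLyric.
I hope these suggestions are helpful!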
    } catch {
      continue;
    }

  const filteredContent = ttmlContent
    .replace(
      /<translation[^>]*\btype\s*=\s*["']replacement["'][^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    )
    .replace(
      /<translation[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*\btype\s*=\s*["']replacement["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    );

These two .replace() calls handle the different orders of the type and xml:lang attributes in the <translation> tag. This can be simplified with a more robust regular expression that uses positive lookaheads to check that both attributes are present regardless of their order, which makes the code more concise and easier to maintain.
  const filteredContent = ttmlContent.replace(
    /<translation(?=[^>]*\btype\s*=\s*["']replacement["'])(?=[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'])[^>]*>[\s\S]*?<\/translation>/gi,
    "",
  );

  // Traverse the local directories
  try {
    const lyricDirs = Array.isArray(localLyricPath) ? localLyricPath.map((p) => String(p)) : [];
    console.log("[Local lyrics] Search directories:", lyricDirs, "song ID:", id);
Note: Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.
It looks like there might be a typo in your command. Did you mean to type
/gemini review
Code Review
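This PR introduces a caching mechanism for TTML lyrics: it parses the NetEase Cloud Music ID from the TTML metadata and caches the file path, which avoids repeated file searches and improves performance. The implementation includes cache loading, saving, and invalidation based on file modification time, and adds a background scan task to build and maintain the cache.
The code is clearly structured and functionally complete. I found a few areas that could be improved:
- Error handling in the directory scan logic could be more explicit; logging errors instead of silently ignoring them would make debugging easier.
- Files are scanned serially, which may be slow when there are many lyric files. Processing them concurrently could improve performance significantly.
Overall, this is a great feature that improves the experience of using local lyrics.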
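One way to process files concurrently without opening every file at once is a small worker pool with a fixed limit. The helper below is a generic sketch, not code from the PR: `mapWithConcurrency` is a hypothetical name, and the worker passed to it is assumed to catch its own per-file errors, as the scan loop already does.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none are left.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

In the scan, the worker would do the stat, read, and ID extraction for one file and return the extracted IDs; the cache would then be updated and saved once after all workers finish. A limit in the single or low double digits is a reasonable starting point, though the best value depends on the disk and is an assumption to measure rather than a recommendation from the review.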
      for (let i = 0; i < allTtmlFiles.length; i++) {
        const fileName = allTtmlFiles[i];
        const filePath = join(dir, fileName);

        try {
          const fileStat = await stat(filePath);
          const existingCache = cache.getByPath(filePath);
          if (existingCache && fileStat.mtimeMs === existingCache.mtime) {
            continue;
          }

          const ttmlFull = await readFile(filePath, "utf-8");
          const ttmlHeader = ttmlFull.substring(0, 5000);
          const extractedIds = extractNcmIdFromTTML(ttmlHeader);
          if (extractedIds.length > 0) {
            await cache.set(extractedIds, filePath, fileStat.mtimeMs, { autoSave: false });
            hasChanges = true;
            scannedCount++;
          }
        } catch (e) {
          ipcLog.warn(`[scanTtmlIdMapping] Failed to process file, skipped: ${filePath}`, e);
          continue;
        }

        if (onProgress) {
          onProgress(i + 1, total);
        }
      }

    } catch {
      continue;
    }

/gemini review
  const ids: number[] = [];
  for (const match of matches) {
    if (match[1]) {
      const ncmId = parseInt(match[1], 10);
      if (!isNaN(ncmId) && ncmId > 0 && !ids.includes(ncmId)) {
        ids.push(ncmId);
      }
    }
  }
  return ids;

extractNcmIdFromTTML uses Array.prototype.includes() to check for duplicate IDs, which costs O(n) per check. If a TTML file contains a large number of NCM IDs, this can degrade performance. Consider storing the IDs in a Set, which offers O(1) insertion and lookup on average and makes deduplication more efficient.
  const ids = new Set<number>();
  for (const match of matches) {
    if (match[1]) {
      const ncmId = parseInt(match[1], 10);
      if (!isNaN(ncmId) && ncmId > 0) {
        ids.add(ncmId);
      }
    }
  }
  return Array.from(ids);
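The Set-based deduplication can be tried in isolation. In this sketch `collectUniqueIds` is a hypothetical stand-in that takes the captured ID strings directly, because the regular expression that produces `matches` is not shown in this conversation. A Set keeps insertion order, so the result has the same order as the array-based version.

```typescript
// Parse captured ID strings, keep positive integers, and drop duplicates.
function collectUniqueIds(rawIds: readonly string[]): number[] {
  const ids = new Set<number>();
  for (const raw of rawIds) {
    const ncmId = Number.parseInt(raw, 10);
    if (!Number.isNaN(ncmId) && ncmId > 0) {
      ids.add(ncmId); // adding an existing value is a no-op
    }
  }
  return Array.from(ids);
}

console.log(collectUniqueIds(["123", "456", "123", "abc", "0"])); // [ 123, 456 ]
```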
    } catch {
      continue;
    }

  const filteredContent = ttmlContent
    .replace(
      /<translation[^>]*\btype\s*=\s*["']replacement["'][^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    )
    .replace(
      /<translation[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'][^>]*\btype\s*=\s*["']replacement["'][^>]*>[\s\S]*?<\/translation>/gi,
      "",
    );

In cleanTTMLTranslations, the code uses two separate .replace() calls to handle the different orders of the type and xml:lang attributes. These can be merged into a single, more efficient operation with regex positive lookaheads (?=...). This avoids scanning the string twice and makes the code more concise.
  const filteredContent = ttmlContent.replace(
    /<translation(?=[^>]*\btype\s*=\s*["']replacement["'])(?=[^>]*\bxml:lang\s*=\s*["']zh-Hant[^"']*["'])[^>]*>[\s\S]*?<\/translation>/gi,
    "",
  );
100% AI-generated. Everything works fine in local testing.