feat: add local VL model directory support for PaddleOCR native backend#50
Conversation
- Add PaddleOCRVLMProvider implementing VLMProviderProtocol - Support two modes: - Fast mode: paddleocr ocr command (~1s) - Precise mode: paddleocr doc_parser VL-1.5 (~12s) - Add PaddleOCRMode enum with settings UI - Add cloud API configuration options (baseURL, API key) - Fix JSON parsing for numpy arrays and float formats - Update localization (en/zh-Hans) - Fix screen recording permission check using SCShareableContent Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Remove redundant raw values from PaddleOCRMode enum - Add PaddleOCR settings reset in resetToDefaults() - Remove redundant MainActor.run wrapper in SettingsViewModel - Add static defaultBaseURL constant to avoid force unwrap - Add CJK-aware separator in merged(with:) for proper spacing - Replace window counting heuristic with async SCShareableContent check in OnboardingViewModel Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- testPaddleOCRConnection: check cloud mode first with URL/API validation - checkPermissions: use permissionCheckTask to avoid race conditions - paddleOCRCloudAPIKey: store in Keychain instead of UserDefaults - analyze(image:): check PaddleOCREngine availability only for local mode - isAvailable: mode-aware check (cloud checks URL, local checks installation) - weightedConfidence: handle divide-by-zero edge case Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Fix concurrency violation: capture paddleOCRCloudAPIKey before Task.detached - Extract Keychain constants: add serviceIdentifier and paddleOCRAccount to KeychainService - Fix CJK separator: check last char of first string and first char of second string - Update AppSettings to use shared KeychainService constants Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Add paddleOCRUseMLXVLM, paddleOCRMLXVLMServerURL, paddleOCRMLXVLMModelName settings - Update PaddleOCREngine.Configuration with MLX-VLM fields - Add --vl_rec_backend, --vl_rec_server_url, --vl_rec_api_model_name args - Add UI settings (visible when precise mode and not using cloud) - Add English and Chinese localizations Usage: 1. Install: pip install "mlx-vlm>=0.3.11" 2. Start server: mlx_vlm.server --port 8111 3. Enable in Settings > PaddleOCR > Use MLX-VLM Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Add isMLXVLMServerRunning and isCheckingMLXVLMServer state - Add checkMLXVLMServerStatus() method to verify server connectivity - Add UI with status indicator and refresh button - Add English and Chinese localizations for status messages Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Add MLX-VLM check in refreshPaddleOCRStatus() - Add onAppear handler in PaddleOCRStatusSection Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Keys should not include format specifier %@ Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Add localVLModelDir configuration to PaddleOCREngine.Configuration - Support native backend with --vl_rec_model_dir when MLX-VLM is disabled - Add UI input field for local model directory (shows when MLX-VLM unchecked in precise mode) - Fix error message to show correct model info based on provider type - Fix localization key from error.ocr.recognition.failed to error.ocr.failed - Add localization strings for local model directory in EN/CN Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
📝 WalkthroughWalkthrough本 PR 引入 PaddleOCR 作为新的 VLM 提供商(本地/云/MLX‑VLM),扩展设置、模型与 Keychain 支持,并将屏幕录制权限检查从纯同步改为以 SCShareableContent 为准的异步验证,同时在多处使用该异步检查。 Changes
Sequence Diagram(s)sequenceDiagram
participant UI as rgba(60,130,160,0.5) UI/Onboarding
participant CaptureMgr as rgba(80,160,100,0.5) CaptureManager
participant ScreenDet as rgba(200,120,60,0.5) ScreenDetector
participant SCShare as rgba(180,80,200,0.5) SCShareableContent
UI->>CaptureMgr: 请求捕获/检查权限 (await)
CaptureMgr->>ScreenDet: hasPermission() (async)
ScreenDet->>SCShare: CGPreflight 快速检查
alt CGPreflight = false
ScreenDet-->>CaptureMgr: 返回 false
else CGPreflight = true
ScreenDet->>SCShare: 请求 SCShareableContent.current (async)
SCShare-->>ScreenDet: 返回内容或抛错
ScreenDet-->>CaptureMgr: 返回最终权限结果
end
CaptureMgr-->>UI: 权限结果 / 执行捕获
sequenceDiagram
participant UI as rgba(60,130,160,0.5) UI/TranslationRequest
participant EngineFactory as rgba(80,160,100,0.5) ScreenCoderEngine
participant Provider as rgba(200,120,60,0.5) PaddleOCRVLMProvider
participant PaddleEngine as rgba(180,80,200,0.5) PaddleOCREngine
participant MLXVLM as rgba(120,100,220,0.5) MLX-VLM Server
UI->>EngineFactory: 请求分析 (provider = paddleocr)
EngineFactory->>Provider: 返回 PaddleOCRVLMProvider
Provider->>PaddleEngine: buildConfiguration() (读取 AppSettings)
alt useCloud
Provider->>PaddleEngine: 调用云端接口
else useMLXVLM
Provider->>MLXVLM: 调用 MLX-VLM Server for VL
MLXVLM-->>PaddleEngine: 返回 VL 输出
PaddleEngine-->>Provider: 返回精确解析结果
else local
Provider->>PaddleEngine: 调用本地命令/二进制
end
Provider-->>UI: 返回 ScreenAnalysisResult
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
兔子诗
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (8)
ScreenTranslate/Services/PaddleOCREngine.swift (2)
615-617: 删除无效代码。
content.replacingOccurrences(of: "[]", with: "[]")是一个空操作,没有任何效果。♻️ 建议删除无效代码
content = content.replacingOccurrences(of: "[,", with: "[") content = content.replacingOccurrences(of: ",]", with: "]") - // Handle edge case of empty nested arrays - content = content.replacingOccurrences(of: "[]", with: "[]") return content🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Services/PaddleOCREngine.swift` around lines 615 - 617, The line content = content.replacingOccurrences(of: "[]", with: "[]") in PaddleOCREngine (near the block handling "edge case of empty nested arrays") is a no-op and should be removed; update the surrounding code in the same function so the early-return still occurs for truly empty content (keep the comment and the return content statement if needed) and delete only the redundant replacingOccurrences call to avoid misleading code.
264-270: 考虑对localVLModelDir进行路径展开。用户可能会输入类似
~/.paddlex/official_models/PaddleOCR-VL-1.5的路径(如 PR 测试计划所述),但~不会被 shell 自动展开。由于命令通过 zsh 执行,这可能可以正常工作,但建议在 Swift 端显式展开以确保可靠性。♻️ 建议的路径展开修复
} else if !config.localVLModelDir.isEmpty { // Use native backend with local model + let expandedPath = (config.localVLModelDir as NSString).expandingTildeInPath args += [ "--vl_rec_backend", "native", - "--vl_rec_model_dir", config.localVLModelDir + "--vl_rec_model_dir", expandedPath ] }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Services/PaddleOCREngine.swift` around lines 264 - 270, Expand any leading tilde in config.localVLModelDir before adding it to args; replace uses of config.localVLModelDir in the PaddleOCREngine argument-building branch (the block that sets "--vl_rec_model_dir") with an expanded path (e.g., use NSString(...).expandingTildeInPath or URL-based expansion) so the final argument passes a full filesystem path rather than a "~" form.ScreenTranslate/Features/Capture/CaptureManager.swift (1)
91-93: 建议与ScreenDetector保持一致。
CaptureManager.hasPermission现在使用SCShareableContent.current进行更可靠的权限验证,但ScreenDetector.hasPermission(lines 128-134) 仍然只使用CGPreflightScreenCaptureAccess()。考虑统一两处的权限检查逻辑以确保一致性。Also applies to: 160-162
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Features/Capture/CaptureManager.swift` around lines 91 - 93, Unify the permission-check logic by changing ScreenDetector.hasPermission (and the other similar check around the second occurrence) to use the same SCShareableContent.current-based approach as CaptureManager.hasPermission instead of CGPreflightScreenCaptureAccess(); update the method to perform the async query to SCShareableContent.current, mirror the same success/failure handling and return/error semantics used by CaptureManager.hasPermission, and ensure any callers awaiting permission are updated to the async pattern.ScreenTranslate/Services/Security/KeychainService.swift (1)
322-370: 存在代码重复,考虑抽取通用方法。
savePaddleOCRCredentials的实现与saveCredentials(apiKey:forCompatibleId:)几乎完全相同,仅账户名称不同。虽然为了保持一致性当前实现是可接受的,但可以考虑将通用逻辑抽取为私有辅助方法以减少重复。♻️ 可选的重构方案
// 私有辅助方法 private func saveCredentialsInternal(apiKey: String, account: String, label: String) throws { // 共享的 Keychain 保存逻辑 } // 公共方法使用辅助方法 func savePaddleOCRCredentials(apiKey: String) throws { try saveCredentialsInternal(apiKey: apiKey, account: Self.paddleOCRAccount, label: "PaddleOCR cloud") } func saveCredentials(apiKey: String, forCompatibleId compatibleId: String) throws { try saveCredentialsInternal(apiKey: apiKey, account: compatibleId, label: "compatible engine \(compatibleId)") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Services/Security/KeychainService.swift` around lines 322 - 370, The savePaddleOCRCredentials implementation duplicates the keychain save logic in saveCredentials(apiKey:forCompatibleId:); extract the shared logic into a private helper (e.g., saveCredentialsInternal(apiKey:account:label:)) that accepts the apiKey, account string (use Self.paddleOCRAccount or compatibleId) and a descriptive label, perform the JSON encoding, SecItemCopyMatching/SecItemUpdate/SecItemAdd handling and KeychainError handling there, and then have savePaddleOCRCredentials and saveCredentials(apiKey:forCompatibleId:) simply call that helper; keep references to StoredCredentials, KeychainError, and the SecItem APIs unchanged.ScreenTranslate/Services/PaddleOCRVLMProvider.swift (2)
53-71: analyze 方法实现正确,但错误信息可本地化。方法逻辑清晰:先构建配置,检查本地可用性(仅限本地模式),然后执行 OCR 并转换结果。建议将硬编码的安装提示字符串提取为本地化字符串,以支持多语言。
♻️ 建议使用本地化字符串
考虑将安装提示信息提取为本地化字符串:
guard await PaddleOCREngine.shared.isAvailable else { throw VLMProviderError.invalidConfiguration( - "PaddleOCR is not installed. Install it using: pip3 install paddleocr paddlepaddle" + NSLocalizedString("error.paddleocr.not.installed", comment: "PaddleOCR not installed error") ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Services/PaddleOCRVLMProvider.swift` around lines 53 - 71, The hardcoded error message in analyze is not localized; replace the literal install prompt passed to VLMProviderError.invalidConfiguration with a localized string (e.g. use NSLocalizedString or a Localization helper) and add a localization key like "PaddleOCRNotInstalled" that contains the install instruction ("Install it using: pip3 install paddleocr paddlepaddle") so PaddleOCREngine.shared.isAvailable, analyze, and the thrown VLMProviderError.invalidConfiguration all use the localized message.
232-244: CJK 字符检测覆盖主要范围,但可能遗漏部分字符。当前实现覆盖了 CJK 统一汉字、扩展 A、平假名、片假名和韩文音节。可考虑添加更多范围以提高准确性,但当前实现对于主要使用场景已足够。
♻️ 可选:扩展 CJK 字符范围
可考虑添加以下范围:
- CJK 扩展 B-F: U+20000-U+2FA1F
- 全角标点: U+3000-U+303F
- 日文标点符号
return (0x4E00...0x9FFF).contains(scalar) || (0x3400...0x4DBF).contains(scalar) || (0x3040...0x309F).contains(scalar) || (0x30A0...0x30FF).contains(scalar) || - (0xAC00...0xD7AF).contains(scalar) + (0xAC00...0xD7AF).contains(scalar) || + (0x3000...0x303F).contains(scalar) // CJK 符号和标点🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Services/PaddleOCRVLMProvider.swift` around lines 232 - 244, The isCJKChar(_:) function currently checks common CJK blocks but omits extended ranges and fullwidth punctuation; update isCJKChar to include additional Unicode scalar ranges such as CJK Symbols and Punctuation (U+3000–U+303F), Fullwidth Forms (U+FF00–U+FFEF), and the CJK Extension planes (U+20000–U+2FA1F) so characters from extensions B–F and fullwidth/japanese punctuation are detected; adjust logic in isCJKChar to test these ranges against char.unicodeScalars (handling scalars > 0xFFFF correctly) and return true when any of these ranges match.ScreenTranslate/Models/AppSettings.swift (1)
506-519: 重置逻辑完整,但Task.detached中错误被静默忽略。重置 PaddleOCR 设置的逻辑是完整的,包括清除 Keychain 凭证。但
try?会静默忽略删除 Keychain 凭证时的错误,建议至少记录错误日志以便调试。♻️ 建议添加错误日志
// Delete PaddleOCR cloud API key from Keychain Task.detached { - try? await KeychainService.shared.deletePaddleOCRCredentials() + do { + try await KeychainService.shared.deletePaddleOCRCredentials() + } catch { + Logger.settings.warning("Failed to delete PaddleOCR credentials from Keychain: \(error)") + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Models/AppSettings.swift` around lines 506 - 519, The Task.detached block that calls KeychainService.shared.deletePaddleOCRCredentials() currently swallows errors with try?, change it to await the call inside a do-catch (or use Task { } with try and catch) and log any caught error instead of ignoring it; update the block that references KeychainService.shared.deletePaddleOCRCredentials(), paddleOCR settings reset, and Task.detached to perform do { try await KeychainService.shared.deletePaddleOCRCredentials() } catch { <log the error via the project's logger or os_log/print with context> } so deletion failures are recorded.ScreenTranslate/Features/Settings/SettingsViewModel.swift (1)
869-896: MLX-VLM 服务状态检查的 HTTP 状态码判断可能过于宽松。当前逻辑将所有
statusCode < 500都视为服务运行中,这包括了 4xx 客户端错误(如 400 Bad Request、404 Not Found)。对于健康检查,通常只有 2xx 状态码才表示服务正常运行。♻️ 建议收紧状态码判断
if let httpResponse = response as? HTTPURLResponse { - isRunning = httpResponse.statusCode < 500 + // 2xx 或 3xx 表示服务可用 + isRunning = (200..<400).contains(httpResponse.statusCode) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Features/Settings/SettingsViewModel.swift` around lines 869 - 896, The health-check treats any HTTP statusCode < 500 as "running", which incorrectly includes 4xx errors; update the Task.detached block that uses paddleOCRMLXVLMServerURL to consider the server running only for 2xx responses (e.g., check if let httpResponse = response as? HTTPURLResponse { isRunning = (200...299).contains(httpResponse.statusCode) }), leaving the MainActor updates to isMLXVLMServerRunning and isCheckingMLXVLMServer unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ScreenTranslate/Features/Capture/CaptureManager.swift`:
- Around line 91-93: Unify the permission-check logic by changing
ScreenDetector.hasPermission (and the other similar check around the second
occurrence) to use the same SCShareableContent.current-based approach as
CaptureManager.hasPermission instead of CGPreflightScreenCaptureAccess(); update
the method to perform the async query to SCShareableContent.current, mirror the
same success/failure handling and return/error semantics used by
CaptureManager.hasPermission, and ensure any callers awaiting permission are
updated to the async pattern.
In `@ScreenTranslate/Features/Settings/SettingsViewModel.swift`:
- Around line 869-896: The health-check treats any HTTP statusCode < 500 as
"running", which incorrectly includes 4xx errors; update the Task.detached block
that uses paddleOCRMLXVLMServerURL to consider the server running only for 2xx
responses (e.g., check if let httpResponse = response as? HTTPURLResponse {
isRunning = (200...299).contains(httpResponse.statusCode) }), leaving the
MainActor updates to isMLXVLMServerRunning and isCheckingMLXVLMServer unchanged.
In `@ScreenTranslate/Models/AppSettings.swift`:
- Around line 506-519: The Task.detached block that calls
KeychainService.shared.deletePaddleOCRCredentials() currently swallows errors
with try?, change it to await the call inside a do-catch (or use Task { } with
try and catch) and log any caught error instead of ignoring it; update the block
that references KeychainService.shared.deletePaddleOCRCredentials(), paddleOCR
settings reset, and Task.detached to perform do { try await
KeychainService.shared.deletePaddleOCRCredentials() } catch { <log the error via
the project's logger or os_log/print with context> } so deletion failures are
recorded.
In `@ScreenTranslate/Services/PaddleOCREngine.swift`:
- Around line 615-617: The line content = content.replacingOccurrences(of: "[]",
with: "[]") in PaddleOCREngine (near the block handling "edge case of empty
nested arrays") is a no-op and should be removed; update the surrounding code in
the same function so the early-return still occurs for truly empty content (keep
the comment and the return content statement if needed) and delete only the
redundant replacingOccurrences call to avoid misleading code.
- Around line 264-270: Expand any leading tilde in config.localVLModelDir before
adding it to args; replace uses of config.localVLModelDir in the PaddleOCREngine
argument-building branch (the block that sets "--vl_rec_model_dir") with an
expanded path (e.g., use NSString(...).expandingTildeInPath or URL-based
expansion) so the final argument passes a full filesystem path rather than a "~"
form.
In `@ScreenTranslate/Services/PaddleOCRVLMProvider.swift`:
- Around line 53-71: The hardcoded error message in analyze is not localized;
replace the literal install prompt passed to
VLMProviderError.invalidConfiguration with a localized string (e.g. use
NSLocalizedString or a Localization helper) and add a localization key like
"PaddleOCRNotInstalled" that contains the install instruction ("Install it
using: pip3 install paddleocr paddlepaddle") so
PaddleOCREngine.shared.isAvailable, analyze, and the thrown
VLMProviderError.invalidConfiguration all use the localized message.
- Around line 232-244: The isCJKChar(_:) function currently checks common CJK
blocks but omits extended ranges and fullwidth punctuation; update isCJKChar to
include additional Unicode scalar ranges such as CJK Symbols and Punctuation
(U+3000–U+303F), Fullwidth Forms (U+FF00–U+FFEF), and the CJK Extension planes
(U+20000–U+2FA1F) so characters from extensions B–F and fullwidth/japanese
punctuation are detected; adjust logic in isCJKChar to test these ranges against
char.unicodeScalars (handling scalars > 0xFFFF correctly) and return true when
any of these ranges match.
In `@ScreenTranslate/Services/Security/KeychainService.swift`:
- Around line 322-370: The savePaddleOCRCredentials implementation duplicates
the keychain save logic in saveCredentials(apiKey:forCompatibleId:); extract the
shared logic into a private helper (e.g.,
saveCredentialsInternal(apiKey:account:label:)) that accepts the apiKey, account
string (use Self.paddleOCRAccount or compatibleId) and a descriptive label,
perform the JSON encoding, SecItemCopyMatching/SecItemUpdate/SecItemAdd handling
and KeychainError handling there, and then have savePaddleOCRCredentials and
saveCredentials(apiKey:forCompatibleId:) simply call that helper; keep
references to StoredCredentials, KeychainError, and the SecItem APIs unchanged.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
ScreenTranslate/Features/Capture/CaptureManager.swiftScreenTranslate/Features/Onboarding/OnboardingViewModel.swiftScreenTranslate/Features/Settings/EngineSettingsTab.swiftScreenTranslate/Features/Settings/SettingsViewModel.swiftScreenTranslate/Features/TranslationFlow/TranslationFlowController.swiftScreenTranslate/Models/AppSettings.swiftScreenTranslate/Models/VLMProviderType.swiftScreenTranslate/Resources/en.lproj/Localizable.stringsScreenTranslate/Resources/zh-Hans.lproj/Localizable.stringsScreenTranslate/Services/PaddleOCREngine.swiftScreenTranslate/Services/PaddleOCRVLMProvider.swiftScreenTranslate/Services/ScreenCoderEngine.swiftScreenTranslate/Services/Security/KeychainService.swift
- Fix ScreenDetector.hasPermission to use SCShareableContent.current - Fix health-check to only accept 2xx responses (not <500) - Add error logging for keychain deletion failures - Remove no-op replacingOccurrences in PaddleOCREngine - Add tilde expansion for localVLModelDir path - Localize PaddleOCR not installed error message - Extend isCJKChar with CJK punctuation, fullwidth forms, extensions B-F - Extract saveCredentialsInternal helper to eliminate duplicate keychain logic Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (1)
ScreenTranslate/Features/Capture/ScreenDetector.swift (1)
128-145:cachedPermissionStatus被写入但从未被读取。方法设置了
cachedPermissionStatus的值,但该缓存从未在后续权限检查中被使用。如果缓存是有意设计的,应在执行检查前先读取缓存值;如果不需要缓存,建议移除该属性以避免产生死代码。♻️ 建议:使用缓存或移除该属性
方案 1:移除未使用的缓存属性(如果不需要缓存)
- /// Cached permission status - only check once to avoid repeated dialogs - private var cachedPermissionStatus: Bool?并移除方法中所有对
cachedPermissionStatus的赋值。方案 2:实际使用缓存(如果需要避免重复检查)
func hasPermission(silent: Bool = true) async -> Bool { + // Return cached result if available + if let cached = cachedPermissionStatus { + if !silent { print("[ScreenDetector] Permission check: using cached result \(cached)") } + return cached + } + // Quick check first using CGPreflightScreenCaptureAccess guard CGPreflightScreenCaptureAccess() else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ScreenTranslate/Features/Capture/ScreenDetector.swift` around lines 128 - 145, cachedPermissionStatus 在 hasPermission(_:silent:) 中被写入但未读取,导致死代码; either remove the cachedPermissionStatus property and its assignments from ScreenDetector and leave hasPermission to always perform the checks, or actually use the cache by reading cachedPermissionStatus at the start of hasPermission (e.g. if let cached = cachedPermissionStatus { return cached }) and only perform CGPreflightScreenCaptureAccess() / try await SCShareableContent.current when cache is nil, then assign cachedPermissionStatus = true/false before returning; update references to the symbols cachedPermissionStatus, hasPermission, CGPreflightScreenCaptureAccess, and SCShareableContent.current accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.omd/project-memory.json:
- Line 4: The projectMemory file currently contains a hardcoded local absolute
path in the projectRoot key which exposes a developer's local filesystem; update
the .omd/project-memory.json to remove or neutralize that sensitive value by
replacing the absolute path stored in the "projectRoot" key with a non-sensitive
relative path or a placeholder (e.g. "<PROJECT_ROOT>") and ensure this file is
not committed for local-specific values (add .omd/project-memory.json to
.gitignore or switch to templated/configured generation at runtime); also scrub
the repository history or create a new commit that removes the leaked path so
the sensitive data is not retained in the repo.
- Around line 153-175: The .omd project-memory hotPaths contains incorrect
"type" values (entries where "path" points to specific Swift source files like
AppSettings.swift, PaddleOCRVLMProvider.swift, SettingsViewModel.swift are
marked "directory" instead of "file"); update those JSON entries under the
"hotPaths" array to set "type":"file" (or regenerate the project-memory with the
.omd tool so types are correct), then remove the generated project-memory from
version control and add the .omd/project-memory.json pattern to .gitignore so
this unreliable auto-generated data is not committed again.
- Around line 1-244: The .omd/project-memory.json file is generated metadata and
must not be tracked; remove .omd/ from the PR and stop tracking these files by
adding the .omd/ directory to .gitignore and removing the tracked file from the
index, then commit the change; reference the tracked file
".omd/project-memory.json" and ensure .gitignore contains an entry for ".omd/"
(consistent with other ignored dirs like .claude/ .specify/ .ralph-tui/) so
future autogenerated .omd files are not committed.
In `@ScreenTranslate/Models/AppSettings.swift`:
- Around line 318-331: The didSet for paddleOCRCloudAPIKey currently always
spawns a save Task.detached which races with separate delete Tasks; modify the
didSet on paddleOCRCloudAPIKey to treat an empty capturedKey as a delete
operation (call the KeychainService.deletePaddleOCRCredentials or equivalent)
and non-empty as save (call KeychainService.shared.savePaddleOCRCredentials), so
all persistence/removal logic is centralized in the paddleOCRCloudAPIKey didSet
(using Task.detached for the single operation), and remove any separate delete
Task calls (e.g., in resetToDefaults) so assignments like paddleOCRCloudAPIKey =
"" go through this unified path.
In `@ScreenTranslate/Resources/en.lproj/Localizable.strings`:
- Line 606: Add the missing compatibility localization keys so the app doesn't
fall back to raw keys: add entries for "error.paddleocr.not.installed" and
"error.paddleocr.recovery" with the same value as "error.paddleocr.notInstalled"
(the keys to update/insert are error.paddleocr.notInstalled,
error.paddleocr.not.installed, and error.paddleocr.recovery) so any code reading
either key will get the localized message.
In `@ScreenTranslate/Resources/zh-Hans.lproj/Localizable.strings`:
- Line 606: The zh-Hans localization uses the key error.paddleocr.notInstalled
but the codebase still references error.paddleocr.not.installed (and the
.recovery variant), causing missing translations; update the Localizable.strings
entry to preserve the original keys for compatibility by adding/keeping entries
for error.paddleocr.not.installed and error.paddleocr.not.installed.recovery
(with the same translated value as error.paddleocr.notInstalled) so both key
forms resolve to the correct Chinese message.
In `@ScreenTranslate/Services/PaddleOCREngine.swift`:
- Around line 48-55: The new cloud configuration (useCloud, cloudBaseURL,
cloudAPIKey) is exposed but recognize still unconditionally checks local
installation and returns notInstalled; update recognize to branch on useCloud:
when useCloud == true skip the local-install check and call the cloud
recognition path using cloudBaseURL/cloudAPIKey, otherwise keep the existing
local flow (including the local availability check and local OCR calls). Ensure
error handling and return types remain the same so cloud errors propagate as
before.
- Around line 264-271: The code appends a user-controlled expandedPath (from
config.localVLModelDir) into args and later joins args into a shell string,
which breaks paths with spaces and exposes command-injection risk; modify the
caller to execute the subprocess with an argument array (no shell interpolation)
and ensure PaddleOCREngine's use of args (and the code constructing it,
including the expandedPath variable and the "--vl_rec_model_dir" entry) is
passed directly to the process execution API that accepts [String] rather than
being joined into a single shell command string so arguments are not re-parsed
by a shell.
---
Nitpick comments:
In `@ScreenTranslate/Features/Capture/ScreenDetector.swift`:
- Around line 128-145: cachedPermissionStatus 在 hasPermission(_:silent:)
中被写入但未读取,导致死代码; either remove the cachedPermissionStatus property and its
assignments from ScreenDetector and leave hasPermission to always perform the
checks, or actually use the cache by reading cachedPermissionStatus at the start
of hasPermission (e.g. if let cached = cachedPermissionStatus { return cached })
and only perform CGPreflightScreenCaptureAccess() / try await
SCShareableContent.current when cache is nil, then assign cachedPermissionStatus
= true/false before returning; update references to the symbols
cachedPermissionStatus, hasPermission, CGPreflightScreenCaptureAccess, and
SCShareableContent.current accordingly.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
.omd/project-memory.json.omd/sessions/0c203f54-c10d-4417-8115-005c18e9036b.jsonScreenTranslate/Features/Capture/ScreenDetector.swiftScreenTranslate/Features/Settings/SettingsViewModel.swiftScreenTranslate/Models/AppSettings.swiftScreenTranslate/Resources/en.lproj/Localizable.stringsScreenTranslate/Resources/zh-Hans.lproj/Localizable.stringsScreenTranslate/Services/PaddleOCREngine.swiftScreenTranslate/Services/PaddleOCRVLMProvider.swiftScreenTranslate/Services/Security/KeychainService.swift
✅ Files skipped from review due to trivial changes (1)
- .omd/sessions/0c203f54-c10d-4417-8115-005c18e9036b.json
| { | ||
| "version": "1.0.0", | ||
| "lastScanned": 1772269310504, | ||
| "projectRoot": "/Users/hubo/.superset/worktrees/screentranslate/featadd-translate-engine", | ||
| "techStack": { | ||
| "languages": [], | ||
| "frameworks": [], | ||
| "packageManager": null, | ||
| "runtime": null | ||
| }, | ||
| "build": { | ||
| "buildCommand": null, | ||
| "testCommand": null, | ||
| "lintCommand": null, | ||
| "devCommand": null, | ||
| "scripts": {} | ||
| }, | ||
| "conventions": { | ||
| "namingStyle": null, | ||
| "importStyle": null, | ||
| "testPattern": null, | ||
| "fileOrganization": null | ||
| }, | ||
| "structure": { | ||
| "isMonorepo": false, | ||
| "workspaces": [], | ||
| "mainDirectories": [ | ||
| "docs" | ||
| ], | ||
| "gitBranches": { | ||
| "defaultBranch": "main", | ||
| "branchingStrategy": null | ||
| } | ||
| }, | ||
| "customNotes": [], | ||
| "directoryMap": { | ||
| "Build": { | ||
| "path": "Build", | ||
| "purpose": "Build output", | ||
| "fileCount": 1, | ||
| "lastAccessed": 1772269310491, | ||
| "keyFiles": [] | ||
| }, | ||
| "ScreenTranslate": { | ||
| "path": "ScreenTranslate", | ||
| "purpose": null, | ||
| "fileCount": 0, | ||
| "lastAccessed": 1772269310492, | ||
| "keyFiles": [] | ||
| }, | ||
| "ScreenTranslate.xcodeproj": { | ||
| "path": "ScreenTranslate.xcodeproj", | ||
| "purpose": null, | ||
| "fileCount": 1, | ||
| "lastAccessed": 1772269310492, | ||
| "keyFiles": [ | ||
| "project.pbxproj" | ||
| ] | ||
| }, | ||
| "ScreenTranslateTests": { | ||
| "path": "ScreenTranslateTests", | ||
| "purpose": null, | ||
| "fileCount": 5, | ||
| "lastAccessed": 1772269310493, | ||
| "keyFiles": [ | ||
| "KeyboardShortcutTests.swift", | ||
| "README.md", | ||
| "ScreenTranslateErrorTests.swift", | ||
| "ShortcutRecordingTypeTests.swift", | ||
| "TextTranslationErrorTests.swift" | ||
| ] | ||
| }, | ||
| "docs": { | ||
| "path": "docs", | ||
| "purpose": "Documentation", | ||
| "fileCount": 6, | ||
| "lastAccessed": 1772269310493, | ||
| "keyFiles": [ | ||
| "README.md", | ||
| "api-reference.md", | ||
| "architecture.md", | ||
| "components.md", | ||
| "developer-guide.md" | ||
| ] | ||
| }, | ||
| "skills": { | ||
| "path": "skills", | ||
| "purpose": null, | ||
| "fileCount": 0, | ||
| "lastAccessed": 1772269310494, | ||
| "keyFiles": [] | ||
| }, | ||
| "tasks": { | ||
| "path": "tasks", | ||
| "purpose": null, | ||
| "fileCount": 6, | ||
| "lastAccessed": 1772269310494, | ||
| "keyFiles": [ | ||
| "prd-.md", | ||
| "prd-macos-screentranslate.md", | ||
| "prd-screencoder-kiss-translator.md", | ||
| "prd-screencoder.md", | ||
| "prd-text-translation.json" | ||
| ] | ||
| }, | ||
| "ScreenTranslate/App": { | ||
| "path": "ScreenTranslate/App", | ||
| "purpose": "Application code", | ||
| "fileCount": 2, | ||
| "lastAccessed": 1772269310495, | ||
| "keyFiles": [ | ||
| "AppDelegate.swift", | ||
| "ScreenTranslateApp.swift" | ||
| ] | ||
| }, | ||
| "ScreenTranslate/Models": { | ||
| "path": "ScreenTranslate/Models", | ||
| "purpose": "Data models", | ||
| "fileCount": 23, | ||
| "lastAccessed": 1772269310495, | ||
| "keyFiles": [ | ||
| "Annotation.swift", | ||
| "AppLanguage.swift", | ||
| "AppSettings.swift" | ||
| ] | ||
| }, | ||
| "ScreenTranslate/Services": { | ||
| "path": "ScreenTranslate/Services", | ||
| "purpose": "Business logic services", | ||
| "fileCount": 26, | ||
| "lastAccessed": 1772269310495, | ||
| "keyFiles": [ | ||
| "AccessibilityPermissionChecker.swift", | ||
| "AppleTranslationProvider.swift", | ||
| "ClaudeVLMProvider.swift" | ||
| ] | ||
| } | ||
| }, | ||
| "hotPaths": [ | ||
| { | ||
| "path": "ScreenTranslate/Services/PaddleOCREngine.swift", | ||
| "accessCount": 17, | ||
| "lastAccessed": 1772277198204, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Services/Security/KeychainService.swift", | ||
| "accessCount": 14, | ||
| "lastAccessed": 1772277575721, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Models/AppSettings.swift", | ||
| "accessCount": 13, | ||
| "lastAccessed": 1772277135092, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Resources/en.lproj/Localizable.strings", | ||
| "accessCount": 6, | ||
| "lastAccessed": 1772277251555, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Services/PaddleOCRVLMProvider.swift", | ||
| "accessCount": 6, | ||
| "lastAccessed": 1772277354512, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/Settings/SettingsViewModel.swift", | ||
| "accessCount": 5, | ||
| "lastAccessed": 1772277092578, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Resources", | ||
| "accessCount": 4, | ||
| "lastAccessed": 1772271181502, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/Settings/EngineSettingsTab.swift", | ||
| "accessCount": 4, | ||
| "lastAccessed": 1772274724758, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Resources/zh-Hans.lproj/Localizable.strings", | ||
| "accessCount": 4, | ||
| "lastAccessed": 1772277283269, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/TranslationFlow/TranslationFlowController.swift", | ||
| "accessCount": 3, | ||
| "lastAccessed": 1772270086752, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate", | ||
| "accessCount": 3, | ||
| "lastAccessed": 1772271133277, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/Capture/ScreenDetector.swift", | ||
| "accessCount": 3, | ||
| "lastAccessed": 1772277067534, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/Capture/CaptureManager.swift", | ||
| "accessCount": 2, | ||
| "lastAccessed": 1772277030873, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "", | ||
| "accessCount": 1, | ||
| "lastAccessed": 1772269949125, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Errors/ScreenTranslateError.swift", | ||
| "accessCount": 1, | ||
| "lastAccessed": 1772269960584, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Models/VLMProviderType.swift", | ||
| "accessCount": 1, | ||
| "lastAccessed": 1772269960585, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Services/ScreenCoderEngine.swift", | ||
| "accessCount": 1, | ||
| "lastAccessed": 1772270035460, | ||
| "type": "file" | ||
| } | ||
| ], | ||
| "userDirectives": [] | ||
| } No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if .gitignore exists and contains .omd rules
echo "=== Checking .gitignore ==="
if [ -f .gitignore ]; then
echo ".gitignore exists. Checking for .omd rules:"
grep -n "\.omd\|omd" .gitignore || echo "No .omd rules found"
else
echo ".gitignore does not exist"
fi
echo -e "\n=== Checking if .omd directory exists ==="
if [ -d .omd ]; then
echo ".omd directory exists. Contents:"
find .omd -type f | head -20
else
echo ".omd directory does not exist"
fi
echo -e "\n=== Checking file details ==="
if [ -f .omd/project-memory.json ]; then
echo ".omd/project-memory.json exists"
wc -l .omd/project-memory.json
echo -e "\nFirst 10 lines:"
head -10 .omd/project-memory.json
fiRepository: hubo1989/ScreenTranslate
Length of output: 687
🏁 Script executed:
#!/bin/bash
# List all top-level files to see what's actually in the repo
echo "=== Top-level files/directories ==="
ls -la | head -20
echo -e "\n=== Check git status if in a repo ==="
if [ -d .git ]; then
echo "Repository found. Checking git status:"
git status --short | head -20
fiRepository: hubo1989/ScreenTranslate
Length of output: 1355
🏁 Script executed:
#!/bin/bash
# Check if .omd files are staged or tracked in git
echo "=== Check if .omd files are tracked/staged in git ==="
git ls-files .omd 2>/dev/null || echo "No .omd files tracked in git"
echo -e "\n=== Check git status for .omd directory ==="
git status .omd 2>/dev/null || echo "Not in git tracking"
echo -e "\n=== Check .gitignore content ==="
cat .gitignoreRepository: hubo1989/ScreenTranslate
Length of output: 1409
不应将 .omd/ 目录提交到版本控制。
.omd/project-memory.json 和相关会话文件是开发工具自动生成的元数据,包含硬编码的开发者本地路径 /Users/hubo/.superset/worktrees/screentranslate/featadd-translate-engine、机器特定的访问时间戳和访问计数等开发环境信息。这类文件当前已被追踪在 git 中,会导致:
- 泄露开发者本地环境信息
- 每位开发者的访问模式不同,频繁产生合并冲突
- 对生产代码无任何价值,增加仓库噪音
建议将 .omd/ 添加到 .gitignore(与现有的 .claude/、.specify/、.ralph-tui/ 等配置一致)并从此 PR 中移除该文件:
# Development tools (Claude Code specific)
.claude/
.specify/
+ .omd/
.ralph-tui/🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.omd/project-memory.json around lines 1 - 244, The .omd/project-memory.json
file is generated metadata and must not be tracked; remove .omd/ from the PR and
stop tracking these files by adding the .omd/ directory to .gitignore and
removing the tracked file from the index, then commit the change; reference the
tracked file ".omd/project-memory.json" and ensure .gitignore contains an entry
for ".omd/" (consistent with other ignored dirs like .claude/ .specify/
.ralph-tui/) so future autogenerated .omd files are not committed.
| { | ||
| "version": "1.0.0", | ||
| "lastScanned": 1772269310504, | ||
| "projectRoot": "/Users/hubo/.superset/worktrees/screentranslate/featadd-translate-engine", |
There was a problem hiding this comment.
硬编码的本地路径泄露了开发者环境信息。
projectRoot 包含完整的本地用户目录路径 /Users/hubo/.superset/worktrees/...,这属于敏感的开发环境信息,不应出现在公开仓库中。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.omd/project-memory.json at line 4, The projectMemory file currently
contains a hardcoded local absolute path in the projectRoot key which exposes a
developer's local filesystem; update the .omd/project-memory.json to remove or
neutralize that sensitive value by replacing the absolute path stored in the
"projectRoot" key with a non-sensitive relative path or a placeholder (e.g.
"<PROJECT_ROOT>") and ensure this file is not committed for local-specific
values (add .omd/project-memory.json to .gitignore or switch to
templated/configured generation at runtime); also scrub the repository history
or create a new commit that removes the leaked path so the sensitive data is not
retained in the repo.
| "path": "ScreenTranslate/Models/AppSettings.swift", | ||
| "accessCount": 13, | ||
| "lastAccessed": 1772277135092, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Resources/en.lproj/Localizable.strings", | ||
| "accessCount": 6, | ||
| "lastAccessed": 1772277251555, | ||
| "type": "file" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Services/PaddleOCRVLMProvider.swift", | ||
| "accessCount": 6, | ||
| "lastAccessed": 1772277354512, | ||
| "type": "directory" | ||
| }, | ||
| { | ||
| "path": "ScreenTranslate/Features/Settings/SettingsViewModel.swift", | ||
| "accessCount": 5, | ||
| "lastAccessed": 1772277092578, | ||
| "type": "directory" | ||
| }, |
There was a problem hiding this comment.
hotPaths 中 type 字段存在数据不一致。
部分文件路径被标记为 "type": "directory",例如:
- 第 157 行:
AppSettings.swift→"type": "directory" - 第 169 行:
PaddleOCRVLMProvider.swift→"type": "directory" - 第 175 行:
SettingsViewModel.swift→"type": "directory"
这是 .omd 工具的数据错误,进一步说明此文件不应被版本控制——它包含不可靠的自动生成数据。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.omd/project-memory.json around lines 153 - 175, The .omd project-memory
hotPaths contains incorrect "type" values (entries where "path" points to
specific Swift source files like AppSettings.swift, PaddleOCRVLMProvider.swift,
SettingsViewModel.swift are marked "directory" instead of "file"); update those
JSON entries under the "hotPaths" array to set "type":"file" (or regenerate the
project-memory with the .omd tool so types are correct), then remove the
generated project-memory from version control and add the
.omd/project-memory.json pattern to .gitignore so this unreliable auto-generated
data is not committed again.
| var paddleOCRCloudAPIKey: String { | ||
| didSet { | ||
| // Capture the value on the actor before spawning detached task | ||
| let capturedKey = paddleOCRCloudAPIKey | ||
| // Save to Keychain asynchronously | ||
| Task.detached { | ||
| do { | ||
| try await KeychainService.shared.savePaddleOCRCredentials(apiKey: capturedKey) | ||
| } catch { | ||
| Logger.settings.error("Failed to save PaddleOCR cloud API key to Keychain: \(error)") | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '315,335p'Repository: hubo1989/ScreenTranslate
Length of output: 1042
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '500,525p'Repository: hubo1989/ScreenTranslate
Length of output: 1376
🏁 Script executed:
rg -n 'class KeychainService|func savePaddleOCRCredentials|func deletePaddleOCRCredentials' --type swiftRepository: hubo1989/ScreenTranslate
Length of output: 293
🏁 Script executed:
cat -n ScreenTranslate/Services/Security/KeychainService.swift | sed -n '296,310p'Repository: hubo1989/ScreenTranslate
Length of output: 701
🏁 Script executed:
cat -n ScreenTranslate/Services/Security/KeychainService.swift | sed -n '339,360p'Repository: hubo1989/ScreenTranslate
Length of output: 902
🏁 Script executed:
rg -n 'actor AppSettings|class AppSettings' ScreenTranslate/Models/AppSettings.swift | head -5Repository: hubo1989/ScreenTranslate
Length of output: 96
🏁 Script executed:
rg -n 'nonisolated' ScreenTranslate/Models/AppSettings.swift | head -10Repository: hubo1989/ScreenTranslate
Length of output: 50
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '1,30p'Repository: hubo1989/ScreenTranslate
Length of output: 1174
🏁 Script executed:
rg -n '@MainActor|nonisolated.*var paddleOCRCloudAPIKey' ScreenTranslate/Models/AppSettings.swift | head -15Repository: hubo1989/ScreenTranslate
Length of output: 81
🏁 Script executed:
grep -n 'class KeychainService\|actor KeychainService' ScreenTranslate/Services/Security/KeychainService.swift | head -3Repository: hubo1989/ScreenTranslate
Length of output: 94
🏁 Script executed:
cat -n ScreenTranslate/Services/Security/KeychainService.swift | sed -n '1,50p'Repository: hubo1989/ScreenTranslate
Length of output: 1897
🏁 Script executed:
rg -n 'paddleOCRCloudAPIKey\s*=' ScreenTranslate/Models/AppSettings.swiftRepository: hubo1989/ScreenTranslate
Length of output: 256
🏁 Script executed:
cat -n ScreenTranslate/Services/Security/KeychainService.swift | sed -n '296,360p'Repository: hubo1989/ScreenTranslate
Length of output: 2686
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '32,40p'Repository: hubo1989/ScreenTranslate
Length of output: 310
🏁 Script executed:
rg -n 'loadPaddleOCRAPIKeyFromKeychain' ScreenTranslate/Models/AppSettings.swift -A 5Repository: hubo1989/ScreenTranslate
Length of output: 931
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '580,600p'Repository: hubo1989/ScreenTranslate
Length of output: 1051
🏁 Script executed:
rg -n 'Task\.detached.*paddleOCRCloudAPIKey|paddleOCRCloudAPIKey.*=' ScreenTranslate/Models/AppSettings.swift | head -20Repository: hubo1989/ScreenTranslate
Length of output: 256
🏁 Script executed:
rg -n 'saveCredentialsInternal' ScreenTranslate/Services/Security/KeychainService.swift | head -3Repository: hubo1989/ScreenTranslate
Length of output: 265
🏁 Script executed:
cat -n ScreenTranslate/Services/Security/KeychainService.swift | sed -n '60,120p'Repository: hubo1989/ScreenTranslate
Length of output: 3446
🏁 Script executed:
cat -n ScreenTranslate/Models/AppSettings.swift | sed -n '495,520p'Repository: hubo1989/ScreenTranslate
Length of output: 1236
Keychain 保存与删除存在竞态条件,可能导致空值残留
Line 510 设置 paddleOCRCloudAPIKey = "" 触发 didSet(Line 323-329),使用 Task.detached 异步保存空字符串。同时 Line 512-518 的 Task.detached 调用删除操作。两个后台任务并发执行,KeychainService actor 无法保证执行顺序:若保存在删除后完成,Keychain 会残留空值而非真正删除。
建议在 didSet 中统一处理空值作为删除操作:
修复建议(统一删除和保存路径)
var paddleOCRCloudAPIKey: String {
didSet {
- // Capture the value on the actor before spawning detached task
let capturedKey = paddleOCRCloudAPIKey
- // Save to Keychain asynchronously
Task.detached {
do {
- try await KeychainService.shared.savePaddleOCRCredentials(apiKey: capturedKey)
+ if capturedKey.isEmpty {
+ try await KeychainService.shared.deletePaddleOCRCredentials()
+ } else {
+ try await KeychainService.shared.savePaddleOCRCredentials(apiKey: capturedKey)
+ }
} catch {
Logger.settings.error("Failed to save PaddleOCR cloud API key to Keychain: \(error)")
}
}
}
}
@@
- // Delete PaddleOCR cloud API key from Keychain
- Task.detached {
- do {
- try await KeychainService.shared.deletePaddleOCRCredentials()
- } catch {
- Logger.settings.error("Failed to delete PaddleOCR credentials from keychain: \(error.localizedDescription)")
- }
- }这样在 resetToDefaults 中只需 paddleOCRCloudAPIKey = "" 一次赋值,由单一代码路径处理删除逻辑,消除竞态。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ScreenTranslate/Models/AppSettings.swift` around lines 318 - 331, The didSet
for paddleOCRCloudAPIKey currently always spawns a save Task.detached which
races with separate delete Tasks; modify the didSet on paddleOCRCloudAPIKey to
treat an empty capturedKey as a delete operation (call the
KeychainService.deletePaddleOCRCredentials or equivalent) and non-empty as save
(call KeychainService.shared.savePaddleOCRCredentials), so all
persistence/removal logic is centralized in the paddleOCRCloudAPIKey didSet
(using Task.detached for the single operation), and remove any separate delete
Task calls (e.g., in resetToDefaults) so assignments like paddleOCRCloudAPIKey =
"" go through this unified path.
| "settings.paddleocr.mlxVLMNotRunning" = "Not Running"; | ||
| "settings.paddleocr.localVLModelDir" = "Local Model Directory"; | ||
| "settings.paddleocr.localVLModelDir.hint" = "Path to local PaddleOCR-VL model (e.g. ~/.paddlex/official_models/PaddleOCR-VL-1.5)"; | ||
| "error.paddleocr.notInstalled" = "PaddleOCR is not installed. Install it using: pip3 install paddleocr paddlepaddle"; |
There was a problem hiding this comment.
补齐兼容 key,避免 PaddleOCR 未安装提示在部分路径丢失
Line 606 新增的是 error.paddleocr.notInstalled,但代码中仍有 error.paddleocr.not.installed / .recovery 的读取路径。建议补齐兼容 key,避免回退显示原始 key。
🔧 建议修复
"error.paddleocr.notInstalled" = "PaddleOCR is not installed. Install it using: pip3 install paddleocr paddlepaddle";
+"error.paddleocr.not.installed" = "PaddleOCR is not installed. Install it using: pip3 install paddleocr paddlepaddle";
+"error.paddleocr.not.installed.recovery" = "Install PaddleOCR first, then try again.";🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ScreenTranslate/Resources/en.lproj/Localizable.strings` at line 606, Add the
missing compatibility localization keys so the app doesn't fall back to raw
keys: add entries for "error.paddleocr.not.installed" and
"error.paddleocr.recovery" with the same value as "error.paddleocr.notInstalled"
(the keys to update/insert are error.paddleocr.notInstalled,
error.paddleocr.not.installed, and error.paddleocr.recovery) so any code reading
either key will get the localized message.
| "settings.paddleocr.mlxVLMNotRunning" = "未运行"; | ||
| "settings.paddleocr.localVLModelDir" = "本地模型目录"; | ||
| "settings.paddleocr.localVLModelDir.hint" = "本地 PaddleOCR-VL 模型路径(如 ~/.paddlex/official_models/PaddleOCR-VL-1.5)"; | ||
| "error.paddleocr.notInstalled" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle"; |
There was a problem hiding this comment.
统一 PaddleOCR 未安装错误 key,避免本地化失效
Line 606 使用了 error.paddleocr.notInstalled,但当前代码里还有 error.paddleocr.not.installed(含 .recovery)路径;这会导致部分场景显示 key 字面量而不是翻译文本。建议同时保留兼容 key。
🔧 建议修复
"error.paddleocr.notInstalled" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle";
+"error.paddleocr.not.installed" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle";
+"error.paddleocr.not.installed.recovery" = "请先安装 PaddleOCR 后重试。";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "error.paddleocr.notInstalled" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle"; | |
| "error.paddleocr.notInstalled" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle"; | |
| "error.paddleocr.not.installed" = "PaddleOCR 未安装。请使用以下命令安装:pip3 install paddleocr paddlepaddle"; | |
| "error.paddleocr.not.installed.recovery" = "请先安装 PaddleOCR 后重试。"; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ScreenTranslate/Resources/zh-Hans.lproj/Localizable.strings` at line 606, The
zh-Hans localization uses the key error.paddleocr.notInstalled but the codebase
still references error.paddleocr.not.installed (and the .recovery variant),
causing missing translations; update the Localizable.strings entry to preserve
the original keys for compatibility by adding/keeping entries for
error.paddleocr.not.installed and error.paddleocr.not.installed.recovery (with
the same translated value as error.paddleocr.notInstalled) so both key forms
resolve to the correct Chinese message.
| /// Whether to use cloud API | ||
| var useCloud: Bool | ||
|
|
||
| /// Cloud API base URL | ||
| var cloudBaseURL: String | ||
|
|
||
| /// Cloud API key | ||
| var cloudAPIKey: String |
There was a problem hiding this comment.
云端模式配置已暴露,但识别流程仍被本地安装强制拦截
你在 Line 48-55 增加了云端配置,但 recognize 里仍无条件检查本地可用性(Line 144)。这会让“仅云端”场景直接报 notInstalled,功能上不可用。
🔧 建议修复
- guard isAvailable else {
- throw PaddleOCREngineError.notInstalled
- }
+ if !config.useCloud {
+ guard isAvailable else {
+ throw PaddleOCREngineError.notInstalled
+ }
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ScreenTranslate/Services/PaddleOCREngine.swift` around lines 48 - 55, The new
cloud configuration (useCloud, cloudBaseURL, cloudAPIKey) is exposed but
recognize still unconditionally checks local installation and returns
notInstalled; update recognize to branch on useCloud: when useCloud == true skip
the local-install check and call the cloud recognition path using
cloudBaseURL/cloudAPIKey, otherwise keep the existing local flow (including the
local availability check and local OCR calls). Ensure error handling and return
types remain the same so cloud errors propagate as before.
| } else if !config.localVLModelDir.isEmpty { | ||
| // Use native backend with local model | ||
| // Expand tilde in path (e.g., ~/.paddlex -> /Users/xxx/.paddlex) | ||
| let expandedPath = NSString(string: config.localVLModelDir).expandingTildeInPath | ||
| args += [ | ||
| "--vl_rec_backend", "native", | ||
| "--vl_rec_model_dir", expandedPath | ||
| ] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find the file and check its size
find . -name "PaddleOCREngine.swift" -type fRepository: hubo1989/ScreenTranslate
Length of output: 116
🏁 Script executed:
# Read the relevant sections around lines 264-271 and 280-286
if [ -f "ScreenTranslate/Services/PaddleOCREngine.swift" ]; then
echo "=== Lines 260-275 ===" && sed -n '260,275p' "ScreenTranslate/Services/PaddleOCREngine.swift"
echo ""
echo "=== Lines 280-295 ===" && sed -n '280,295p' "ScreenTranslate/Services/PaddleOCREngine.swift"
fiRepository: hubo1989/ScreenTranslate
Length of output: 1480
🏁 Script executed:
# Search for the command execution pattern with shell concatenation
rg -n -A5 'Process\(\)' "ScreenTranslate/Services/PaddleOCREngine.swift" | head -50Repository: hubo1989/ScreenTranslate
Length of output: 383
🏁 Script executed:
# Look for how arguments are passed and if shell is involved
rg -n -B2 -A8 'joined\(separator: " "\)|executableURL|task\.arguments' "ScreenTranslate/Services/PaddleOCREngine.swift" | head -80Repository: hubo1989/ScreenTranslate
Length of output: 866
Shell 命令拼接导致路径解析失败与命令注入风险
第 280 行将参数数组用空格拼接进 shell 命令,第 267-271 行的用户配置路径(包含 expandedPath)会直接受影响。路径中包含空格(如 /Users/xxx/.paddlex)时会被 shell 错误分割,特殊字符可导致命令注入。
应改为直接传递参数数组,避免 shell 解析:
修复建议
- let fullCommand = "\(executablePath) \(arguments.joined(separator: " "))"
- Logger.ocr.info("Executing: \(fullCommand)")
+ Logger.ocr.info("Executing PaddleOCR with \(arguments.count) arguments")
let task = Process()
- task.executableURL = URL(fileURLWithPath: "/bin/zsh")
- task.arguments = ["-c", fullCommand]
+ task.executableURL = URL(fileURLWithPath: executablePath)
+ task.arguments = arguments🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ScreenTranslate/Services/PaddleOCREngine.swift` around lines 264 - 271, The
code appends a user-controlled expandedPath (from config.localVLModelDir) into
args and later joins args into a shell string, which breaks paths with spaces
and exposes command-injection risk; modify the caller to execute the subprocess
with an argument array (no shell interpolation) and ensure PaddleOCREngine's use
of args (and the code constructing it, including the expandedPath variable and
the "--vl_rec_model_dir" entry) is passed directly to the process execution API
that accepts [String] rather than being joined into a single shell command
string so arguments are not re-parsed by a shell.
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Summary
localVLModelDirconfiguration to support native backend with local PaddleOCR-VL modelserror.ocr.recognition.failed→error.ocr.failed)Changes
localVLModelDirproperty toPaddleOCREngine.Configuration--vl_rec_backend native --vl_rec_model_dirwhen MLX-VLM is disabled but local model path is setTranslationFlowController.showErrorAlert()to show provider-appropriate model infoTest Plan
~/.paddlex/official_models/PaddleOCR-VL-1.5)🤖 Generated with Droid
Summary by CodeRabbit
发布说明
新功能
改进
本地化