feat(telegram): CDP-driven Telegram Web K scanner (#630) #638
senamakel merged 42 commits into tinyhumansai:main
- Added a new Accounts page for managing user accounts, including the ability to add and remove accounts. - Introduced AddAccountModal for selecting account providers and initiating account setup. - Implemented WebviewHost to display third-party web applications (e.g., WhatsApp Web) within the app. - Enhanced routing to include a protected route for the Accounts page. - Updated the BottomTabBar to include an Accounts tab for easy navigation. - Integrated Redux for state management of accounts, messages, and logs, ensuring a seamless user experience. - Updated dependencies in Cargo.lock to version 0.52.9 for compatibility. This commit significantly enhances the application's functionality by allowing users to manage accounts directly within the app, improving overall user engagement and experience.
…adability - Removed unnecessary comments and simplified the structure of the Accounts component for better clarity. - Adjusted the rendering logic to enhance the layout of the active account section, improving user experience. - Reformatted text in the no accounts message for better readability. - Streamlined the import statements by consolidating related imports, enhancing code organization.
…ting updates - Introduced support for additional account providers: Telegram, LinkedIn, Gmail, and Slack, expanding user options for account management. - Updated routing to replace the old /conversations path with /chat, streamlining navigation and improving user experience. - Refactored the App component to include an AppShell for better layout management, ensuring the bottom tab bar visibility aligns with the selected account. - Enhanced the BottomTabBar component to reflect the new routing and account options, improving accessibility and usability. - Implemented fullscreen logic for accounts, allowing for a more immersive experience when interacting with selected accounts. - Added utility functions for managing fullscreen states and account provider icons, enhancing code organization and maintainability. This commit significantly improves the application's account management capabilities, providing users with a more flexible and engaging experience.
…webview functionality - Added support for the Chromium Embedded Framework (CEF) as an alternative runtime, allowing for improved webview capabilities. - Updated Cargo.toml to include new dependencies and features for CEF integration, ensuring compatibility with existing Tauri plugins. - Enhanced the WhatsApp recipe to include ghost-text autocomplete functionality, improving user experience during message composition. - Implemented WebSocket observation in the WhatsApp recipe to capture and forward relevant message frames, enhancing real-time interaction. - Introduced user agent spoofing for specific providers to bypass fingerprinting checks, ensuring better compatibility with services like Slack and LinkedIn. - Refactored various components to accommodate the new runtime and improve overall code organization and maintainability. This commit significantly enhances the application's webview capabilities and user interaction with messaging services, providing a more robust and flexible experience.
- Introduced a new development command `dev:cef` in both package.json files to streamline the development process for the Chromium Embedded Framework (CEF). - Updated Cargo.toml to include the `tauri/devtools` feature alongside `tauri/cef`, enhancing debugging capabilities. - Modified tauri.conf.json to adjust visibility settings for the application window and refined the Content Security Policy (CSP) for improved security. - Enhanced resource paths in tauri.conf.json to support recursive file inclusion for better resource management. - Updated the Rust code to bypass macOS Keychain prompts when using CEF, improving user experience during development. This commit enhances the development workflow for CEF integration, providing better tools and configurations for developers.
- Modified the `dev:cef` command in package.json to include the `APPLE_SIGNING_IDENTITY` environment variable, enhancing the development process for CEF on macOS. - This change improves the build process by ensuring proper code signing during development, streamlining the workflow for developers working with CEF integration.
- Updated the `dev:cef` command in package.json to include a call to a new script, `setup-chromium-safe-storage.sh`, which pre-seeds the "Chromium Safe Storage" keychain entry with a permissive ACL. - Added the `setup-chromium-safe-storage.sh` script to ensure that CEF/Chromium can read the keychain entry without prompting, improving the development experience on macOS. - This change streamlines the setup process for developers working with CEF integration, ensuring a smoother workflow.
- Introduced a new `IngestMessage` interface to standardize message structure for WhatsApp. - Updated `IngestPayload` to include additional fields for better message handling, including `provider`, `chatId`, and `day`. - Implemented a new function `persistWhatsappChatDay` to handle the ingestion of chat messages by day, improving data organization and retrieval. - Enhanced the WhatsApp recipe to utilize IndexedDB for direct data access, eliminating the need for DOM scraping and improving performance. - Updated the Tauri configuration to enable development tools for easier debugging of webview accounts. This commit significantly improves the application's ability to manage and ingest WhatsApp messages, providing a more robust and efficient user experience.
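The per-day ingestion described in this commit can be sketched as a grouping step. This is a minimal illustration, not the repository's code: the `IngestMessage` fields below, the `groupByChatDay` helper, and the choice of UTC day boundaries are all assumptions, since the source only names the `provider`, `chatId`, and `day` fields.

```typescript
// Hypothetical message shape; the real IngestMessage fields are not shown in the source.
interface IngestMessage {
  id: string;
  chatId: string;
  timestampMs: number; // Unix epoch milliseconds (assumed)
  body: string;
}

interface ChatDayBucket {
  provider: string;
  chatId: string;
  day: string; // UTC calendar day, YYYY-MM-DD (assumed convention)
  messages: IngestMessage[];
}

// Format an epoch timestamp as a UTC YYYY-MM-DD day key.
function dayKey(timestampMs: number): string {
  return new Date(timestampMs).toISOString().slice(0, 10);
}

// Group messages into one bucket per (chatId, day), sorted by time within each bucket.
function groupByChatDay(provider: string, messages: IngestMessage[]): ChatDayBucket[] {
  const buckets: Record<string, ChatDayBucket> = {};
  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i];
    const day = dayKey(msg.timestampMs);
    const key = msg.chatId + "|" + day;
    if (!Object.prototype.hasOwnProperty.call(buckets, key)) {
      buckets[key] = { provider: provider, chatId: msg.chatId, day: day, messages: [] };
    }
    buckets[key].messages.push(msg);
  }
  return Object.keys(buckets).map(function (key: string): ChatDayBucket {
    const bucket = buckets[key];
    bucket.messages.sort(function (a: IngestMessage, b: IngestMessage): number {
      return a.timestampMs - b.timestampMs;
    });
    return bucket;
  });
}
```

Each returned bucket could then be handed to something like `persistWhatsappChatDay`, whose actual signature is not shown here.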
…ls Protocol - Added a new module for scanning IndexedDB using the Chrome DevTools Protocol (CDP), enabling direct access to WhatsApp data without DOM scraping. - Implemented a scanner that communicates with the embedded CEF instance to read and decrypt messages stored in IndexedDB. - Updated the Tauri application to manage the new scanner, ensuring it operates seamlessly with existing webview accounts. - Enhanced the Cargo.toml and Cargo.lock files to include necessary dependencies such as `tokio-tungstenite` and `futures-util` for asynchronous operations. - Refactored the WhatsApp recipe to utilize the new scanning capabilities, improving performance and data handling. This commit significantly enhances the application's ability to interact with WhatsApp's IndexedDB, providing a more efficient and robust user experience.
- Updated the ScanSnapshot struct to include new fields for message diagnostics: `messageKeyUnion`, `messageTypeBreakdown`, and `sampleByType`, providing a comprehensive overview of message structures and types. - Modified the scanner logic to capture and log detailed information about message types and their shapes, improving debugging capabilities. - Refactored the JavaScript scanner to aggregate message key signatures and counts, enhancing the analysis of message records. This commit significantly improves the application's ability to analyze and log message data from WhatsApp's IndexedDB, facilitating better debugging and data handling.
…WhatsApp - Introduced a new fast-tick DOM scraping mechanism to extract rendered WhatsApp message bodies, enabling near real-time message updates without relying on IndexedDB. - Added scripts for capturing and logging CryptoKey operations within WhatsApp's workers, allowing for better analysis of key derivations and decryptions. - Enhanced the CDP scanner to interleave fast DOM scans with full IndexedDB scans, optimizing data retrieval and reducing UI spamming during idle periods. - Updated the ScanSnapshot struct to include new fields for DOM-scraped messages and crypto operation statistics, improving the overall diagnostic capabilities of the application. This commit significantly enhances the application's ability to interact with WhatsApp's messaging system, providing a more efficient and responsive user experience.
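One way to picture the interleaving of fast DOM ticks with full IndexedDB scans is a tick counter plus a change gate that suppresses identical snapshots. The sketch below is illustrative only: `scanKindForTick`, `ChangeGate`, and the fixed ratio of fast to full scans are assumptions, not names or values taken from the scanner.

```typescript
type ScanKind = "fast-dom" | "full-idb";

// Every `fullEvery`-th tick runs the expensive full scan; all other ticks
// run the cheap DOM scrape. The ratio is an assumed tuning knob.
function scanKindForTick(tick: number, fullEvery: number): ScanKind {
  return tick % fullEvery === 0 ? "full-idb" : "fast-dom";
}

// Emits only when the serialized snapshot differs from the previous one,
// so an idle chat does not flood the UI with identical events.
class ChangeGate {
  private lastKey: string | null = null;

  shouldEmit(snapshot: unknown): boolean {
    const key = JSON.stringify(snapshot);
    if (key === this.lastKey) {
      return false;
    }
    this.lastKey = key;
    return true;
  }
}
```

Comparing serialized snapshots is the simplest possible gate; a real scanner might hash the snapshot or compare message ids instead.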
…ced message handling - Replaced the `cdp_indexeddb` module with `whatsapp_scanner` to streamline the scanning process for WhatsApp messages. - Updated the application to manage the new `ScannerRegistry` for WhatsApp, improving the integration with the Chrome DevTools Protocol. - Introduced new scripts for fast DOM scraping and full IndexedDB scanning, optimizing data retrieval and enhancing real-time message updates. - Added a new `dom_scan.js` for efficient extraction of rendered message bodies directly from the DOM, reducing reliance on IndexedDB. - Enhanced the `ScanSnapshot` struct to accommodate new fields for DOM-scraped messages, improving diagnostic capabilities. This commit significantly improves the application's ability to interact with WhatsApp, providing a more efficient and responsive user experience.
…OM snapshot - Removed the `dom_scan.js` script and replaced it with a new Rust module `dom_snapshot.rs` that captures DOM snapshots directly via the Chrome DevTools Protocol, enhancing performance and reliability. - Introduced a new `idb.rs` module for scanning WhatsApp's IndexedDB, streamlining data retrieval and improving integration with the Rust backend. - Updated the `ScanSnapshot` struct to accommodate changes in data handling, ensuring compatibility with the new scanning methods. - Enhanced overall message handling capabilities, providing a more efficient and responsive user experience. This commit significantly improves the application's ability to interact with WhatsApp, leveraging Rust for better performance and reducing reliance on JavaScript for DOM operations.
- Introduced a comprehensive playbook detailing the process for integrating third-party webviews (e.g., Instagram, Messenger) into the application. - Documented architecture, workflow, and best practices for building and debugging new integrations, leveraging Rust and Chrome DevTools Protocol. - Included step-by-step instructions for setting up scanners, monitoring logs, and optimizing message handling, ensuring a streamlined development experience for future integrations. This addition enhances the documentation, providing developers with a clear guide to implement and maintain webview integrations effectively.
- Enhanced the formatting of tables in the webview integration playbook to improve readability and consistency. - Adjusted column headers and alignment for better presentation of job intervals and costs, ensuring clearer communication of scanning processes and common pitfalls. This update aims to provide a more user-friendly documentation experience for developers integrating third-party webviews.
…ement - Updated the OpenHuman package version to 0.52.15 in both Cargo.lock files. - Introduced a new Slack scanner module to extract messages, users, and channels from Slack's IndexedDB using the Chrome DevTools Protocol. - Added functionality to manage Slack accounts within the application, allowing for automatic opening of Slack webviews based on environment variables. - Enhanced the existing webview account management to support Slack integration, ensuring seamless interaction with the Slack API. This commit significantly improves the application's ability to interact with Slack, providing a robust framework for message handling and account management.
CEF 146's IndexedDB.requestData rejects `indexName: ""` with "Could not get index"; the CDP spec says empty string means the primary-key index but this backend only accepts the field unset. Omit it entirely so the Slack Redux-persist dump actually comes back.

Also switch memory grouping from (channel, day) → channel. Each Slack channel is now one long-running memory doc keyed by channel name (e.g. `general`, `team-product`, `elvin516`), falling back to channel id for non-slug names. Every transcript line carries its own `YYYY-MM-DD HH:MM` stamp and the header records the full date range.

`infer_team_id` updated to Slack's real DB naming pattern `objectStore-<TEAM>-<USER>` (not `ReduxPersistIDB:` as initially assumed).
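The `indexName` workaround in this commit amounts to leaving the field out of the request rather than sending an empty string. A minimal sketch of building such a command follows; the parameter names mirror the CDP `IndexedDB.requestData` method, while `buildRequestData`, the team/user ids, and the object store name are hypothetical placeholders.

```typescript
// Parameter names follow the CDP IndexedDB.requestData method; the helper
// itself (buildRequestData) is a hypothetical name used for illustration.
interface RequestDataParams {
  securityOrigin: string;
  databaseName: string;
  objectStoreName: string;
  indexName?: string; // left unset for primary-key reads
  skipCount: number;
  pageSize: number;
}

interface CdpCommand {
  id: number;
  method: string;
  params: RequestDataParams;
}

function buildRequestData(
  id: number,
  securityOrigin: string,
  databaseName: string,
  objectStoreName: string,
  skipCount: number,
  pageSize: number,
  indexName?: string
): CdpCommand {
  const params: RequestDataParams = {
    securityOrigin: securityOrigin,
    databaseName: databaseName,
    objectStoreName: objectStoreName,
    skipCount: skipCount,
    pageSize: pageSize,
  };
  // An empty or missing index name is dropped entirely, so the serialized
  // JSON never carries `"indexName": ""`.
  if (indexName) {
    params.indexName = indexName;
  }
  return { id: id, method: "IndexedDB.requestData", params: params };
}

// Placeholder database and store names, not real Slack identifiers.
const primaryKeyRead = buildRequestData(
  1, "https://app.slack.com", "objectStore-T000-U000", "exampleStore", 0, 200, ""
);
console.log(JSON.stringify(primaryKeyRead));
```

Because `JSON.stringify` skips properties that were never set, the backend sees the field as absent, which is the shape the commit reports CEF 146 accepting.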
…gOverlay tests - Changed the navigation path from '/conversations' to '/chat' in the OnboardingOverlay tests to reflect the updated routing logic. - Updated test descriptions for clarity, ensuring they accurately describe the functionality being tested. These changes enhance the accuracy and readability of the onboarding tests, aligning them with the current application flow.
- Introduced a return statement in the Conversations component to ensure proper rendering of the sidebar or page variant. - This change enhances the component's functionality by ensuring it returns the expected JSX structure. These modifications improve the overall structure and behavior of the Conversations component.
- Introduced Discord as a new account provider, including its icon and service details. - Updated the AddAccountModal to filter out already connected providers, improving user experience. - Enhanced the UI to display a message when all providers are connected, ensuring clarity for users. - Implemented context menu functionality for account management, allowing users to log out directly from the accounts list. These changes expand the application's capabilities by integrating Discord and refining account management features.
…oring - Added a new `discord_scanner` module to capture Discord API calls and WebSocket frames using the Chrome DevTools Protocol (CDP). - Updated the `lib.rs` to manage the new Discord scanner alongside existing WhatsApp and Slack scanners. - Enhanced the `webview_accounts` module to support Discord account management, including scanner registration and cleanup. These changes expand the application's capabilities by enabling real-time monitoring of Discord interactions, enhancing user experience and functionality.
- Added Google Meet as a supported account provider, including its icon and service details. - Updated the account management logic to handle Google Meet interactions, including recipe integration for call monitoring and notifications. - Enhanced the UI to accommodate the new provider, ensuring a seamless user experience when managing accounts. These changes expand the application's capabilities by integrating Google Meet, allowing users to join calls and receive notifications directly within the app.
…lete - Eliminated the notification interception logic and associated functions, streamlining the runtime code. - Removed composer autocomplete features, transferring responsibility for ghost-text overlays to the UI host. - Updated comments to reflect the changes and clarify the remaining functionality. These modifications simplify the runtime script, focusing on core features while delegating UI responsibilities.
…nt handling - Implemented lifecycle event handling for Google Meet, including events for call start, captions, and call end. - Introduced in-memory storage for caption snapshots during meetings, allowing for the generation of markdown transcripts upon call completion. - Added interfaces for payload structures related to Google Meet events, improving type safety and clarity in the codebase. - Updated the webview account service to manage active meetings and flush transcripts to memory, ensuring a seamless user experience. These changes significantly enhance the Google Meet integration, enabling real-time caption handling and transcript generation, thereby improving the overall functionality of the application.
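Turning in-memory caption snapshots into a markdown transcript could look like the sketch below. It assumes that live captions arrive as growing revisions of the same utterance (each new snapshot extending the previous text from the same speaker); that assumption, the `CaptionSnapshot` shape, and both function names are illustrative rather than taken from the codebase.

```typescript
// Hypothetical snapshot shape for one observed caption row.
interface CaptionSnapshot {
  speaker: string;
  text: string;
}

// Collapse consecutive snapshots from the same speaker when the newer text
// extends the older one, keeping only the latest revision of each utterance.
function collapseSnapshots(snapshots: CaptionSnapshot[]): CaptionSnapshot[] {
  const out: CaptionSnapshot[] = [];
  for (let i = 0; i < snapshots.length; i++) {
    const snap = snapshots[i];
    const last = out.length > 0 ? out[out.length - 1] : null;
    if (last !== null && last.speaker === snap.speaker && snap.text.indexOf(last.text) === 0) {
      last.text = snap.text;
    } else {
      out.push({ speaker: snap.speaker, text: snap.text });
    }
  }
  return out;
}

// Render the collapsed utterances as a small markdown document.
function toMarkdownTranscript(title: string, snapshots: CaptionSnapshot[]): string {
  const lines: string[] = ["# " + title, ""];
  const utterances = collapseSnapshots(snapshots);
  for (let i = 0; i < utterances.length; i++) {
    lines.push("**" + utterances[i].speaker + ":** " + utterances[i].text);
  }
  return lines.join("\n");
}
```

A prefix check is a deliberately crude merge rule; captions that get rewritten mid-sentence would need a fuzzier comparison.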
…dencies - Added support for native OS notifications through the `tauri-runtime-cef` crate, enabling interception of browser notifications in embedded webviews. - Introduced a new submodule for `tauri-cef` to manage CEF dependencies and facilitate notification handling. - Updated the `.gitignore` to exclude CEF-related build artifacts and lock files. - Removed the deprecated `notification_scanner` module, streamlining the codebase and focusing on the new CEF integration. - Enhanced the `webview_accounts` module to register and manage CEF browser notifications, improving user experience with real-time alerts. These changes significantly enhance the application's notification capabilities, leveraging CEF for a more integrated and responsive user experience.
- Added `cef` and `tauri-runtime-cef` as dependencies to enhance CEF support. - Updated `Cargo.toml` to reference `tauri-runtime-cef` from a local path, ensuring proper integration with the vendored CEF submodule. - Removed direct Git references for Tauri packages, streamlining dependency management by using local paths. These changes improve the application's CEF capabilities and simplify the dependency structure, facilitating better integration and maintenance.
…ociated resources - Introduced BrowserScan as a development-only account provider, including its icon and service details. - Updated the account provider types and management logic to accommodate BrowserScan, enhancing the application's capabilities. - Added a new recipe and manifest for BrowserScan, ensuring it integrates seamlessly into the existing webview account lifecycle. - Enhanced the UI to display BrowserScan, providing users with a bot-detection sandbox for testing purposes. These changes expand the application's functionality by integrating BrowserScan, allowing for improved testing and development workflows.
…t handling and session recovery - Introduced a new service to handle Google Meet transcripts, enabling structured note extraction and proactive follow-up actions. - Implemented session recovery logic to manage in-progress meetings when navigating away from the call. - Updated the webview account service to log call events and captions, improving monitoring and debugging capabilities. - Enhanced the Google Meet recipe to persist meeting state across navigations, ensuring seamless user experience. These changes significantly improve the Google Meet integration, allowing for better management of meeting transcripts and user interactions.
- Removed unused imports from App.tsx to streamline the code. - Adjusted the import order for better readability and consistency. - Enhanced the BottomTabBar component by simplifying the button rendering logic. - Cleaned up the AddAccountModal component by consolidating prop destructuring. - Improved formatting in various components for better code clarity. These changes enhance code maintainability and readability across the application.
…n logic - Enhanced the logic for extracting speaker names from Google Meet rows, adding checks to filter out icon ligatures and irrelevant text. - Updated the caption processing to better identify and score caption regions, ensuring more accurate transcript generation. - Introduced new utility functions to differentiate between real captions and icon names, improving the overall reliability of the captioning feature. These changes significantly enhance the accuracy and usability of the Google Meet integration, providing users with clearer and more relevant caption data.
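A rough idea of how icon ligature text might be told apart from real captions is sketched below. The heuristic (snake_case tokens plus a short list of single-word icon names) is an assumption for illustration; the actual utility functions and their rules are not shown in the source.

```typescript
// Small, illustrative sample of single-word icon names. This list is an
// assumption and is far from complete.
const SINGLE_WORD_ICONS: Record<string, boolean> = {
  close: true,
  mic: true,
  videocam: true,
  settings: true,
};

// Returns true when the text looks like an icon-font ligature name rather
// than something a person said.
function looksLikeIconLigature(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return false;
  }
  // Multi-part snake_case tokens such as "more_vert" or "mic_off".
  if (/^[a-z0-9]+(?:_[a-z0-9]+)+$/.test(trimmed)) {
    return true;
  }
  return Object.prototype.hasOwnProperty.call(SINGLE_WORD_ICONS, trimmed);
}

// Keep only rows that look like spoken text.
function filterCaptionTexts(rows: string[]): string[] {
  return rows.filter(function (row: string): boolean {
    return row.trim().length > 0 && !looksLikeIconLigature(row);
  });
}
```

The single-word list is the weak point of this approach: a lowercase one-word caption such as "close" would be dropped, so a real implementation probably needs extra context such as the element's class or font.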
…ree (tinyhumansai#630)

Picks up 1b58f715 which fixes the bundler to copy the entire cef-helper/src/ tree instead of only main.rs, which is required for our CEF helper's Web Notifications interception to link properly in downstream consumers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ai#630)

The vendor tauri-cef workspace pins cef-dll-sys to the fix/146-location-windows branch via its own [patch.crates-io], but cargo patches do not propagate through path dependencies. Without pinning cef-dll-sys here too, helper processes crash with `CefApp_0_CToCpp called with invalid version -1` because the app-side bindings target a different CEF ABI than what the vendor's cef-helper was built against.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…nyhumansai#630)

New telegram_scanner module mirrors the Slack/WhatsApp scanner shape from PR tinyhumansai#629 but targets Telegram Web K's IndexedDB surface via CDP:
- mod.rs: per-account poller + ScannerRegistry; connects to CDP on 127.0.0.1:9222, picks the Telegram target, and runs an IDB tick every 30s. Emits webview:event and POSTs openhuman.memory_doc_ingest so memory fills even when the main window is hidden.
- idb.rs: IndexedDB walker (requestDatabaseNames / requestDatabase / requestData), with record caps per store.
- extract.rs: peer-grouped message/user/chat extraction from the `tweb` snapshot.

cef-only (wry has no remote-debugging port).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…sai#630)

Wires telegram_scanner into the Tauri builder:
- Registers ScannerRegistry as managed state (cef-only).
- Adds OPENHUMAN_DEV_AUTO_TELEGRAM=<uuid> helper mirroring the Slack / Google Meet dev-auto flow. It opens the Telegram Web K account webview 2s after startup so the CDP scanner has a target without manual UI clicks. Useful for iterating on the scanner end-to-end.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tinyhumansai#630)

Mirrors the slack/discord branches in webview_accounts:
- open(provider="telegram"): look up the telegram ScannerRegistry and ensure a CDP scanner is running for this account.
- close / purge: forget the account's scanner entry alongside the other providers so we don't leak poll loops.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📝 Walkthrough
Adds a feature-gated Telegram IndexedDB scanner using Chrome DevTools Protocol that walks Telegram Web stores, extracts messages/users/chats, polls continuously per account, emits Tauri events, and posts ingested transcripts to OpenHUMAN. Integrates scanner lifecycle with webview account open/close/purge and adds a dev helper to auto-open the Telegram webview.
Sequence Diagram(s)
sequenceDiagram
participant App as Tauri App
participant Reg as ScannerRegistry
participant Poll as Scanner Poll Loop
participant CDP as CDP (Browser)
participant Tele as Telegram Web IndexedDB
participant Core as OpenHUMAN Core
App->>Reg: ensure_scanner(account_id, url_prefix, marker)
Reg->>Poll: spawn per-account poll loop
loop every IDB_SCAN_INTERVAL
Poll->>CDP: GET /json/version -> websocketUrl
Poll->>CDP: Attach to page matching url_prefix/marker
CDP-->>Poll: sessionId
Poll->>CDP: idb::walk(sessionId) (enable IndexedDB, list DBs/stores)
CDP->>Tele: requestData pages per store
Tele-->>CDP: objectStoreDataEntries (values / objectIds)
Poll->>CDP: Runtime.callFunctionOn batches -> serialized JSON
CDP-->>Poll: IdbDump (per-db/store records)
Poll->>Poll: extract::harvest(IdbDump) -> messages/users/chats
Poll->>Poll: dedupe & group by peer
Poll->>App: emit webview:event per peer
Poll->>Core: POST memory_doc_ingest per peer
Core-->>Poll: response
end
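The "dedupe & group by peer" step in the diagram can be illustrated with a small sketch. The actual scanner is a Rust module; this TypeScript version only shows the idea, and the `HarvestedMessage` shape, the `peerId:mid` dedupe key, and the `dedupeAndGroup` name are assumptions.

```typescript
// Hypothetical shape of one message harvested from the IndexedDB dump.
interface HarvestedMessage {
  peerId: string;
  mid: number;  // message id within the peer (assumed)
  date: number; // Unix seconds (assumed)
  text: string;
}

// Drops messages already seen in earlier ticks and groups the rest by peer.
// `seen` is owned by the caller and persists across poll ticks.
function dedupeAndGroup(
  messages: HarvestedMessage[],
  seen: Record<string, boolean>
): Record<string, HarvestedMessage[]> {
  const byPeer: Record<string, HarvestedMessage[]> = {};
  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i];
    const key = msg.peerId + ":" + msg.mid;
    if (seen[key] === true) {
      continue;
    }
    seen[key] = true;
    if (!Object.prototype.hasOwnProperty.call(byPeer, msg.peerId)) {
      byPeer[msg.peerId] = [];
    }
    byPeer[msg.peerId].push(msg);
  }
  const peers = Object.keys(byPeer);
  for (let i = 0; i < peers.length; i++) {
    byPeer[peers[i]].sort(function (a: HarvestedMessage, b: HarvestedMessage): number {
      return a.date - b.date || a.mid - b.mid;
    });
  }
  return byPeer;
}
```

Each key of the returned object would correspond to one emitted event and one ingest request per peer; peers with no new messages simply do not appear.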
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
@oxoxDev there are quite a few merge conflicts. pls resolve them?
Actionable comments posted: 2
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (21)
scripts/setup-chromium-safe-storage.sh-17-22 (1)
17-22: ⚠️ Potential issue | 🟠 Major
Remove `-A` flag and `unsigned:` partition; fail loudly on ACL failures.
The `-A` flag allows any application (including malicious ones) to access the Chromium Safe Storage secret without prompts, a known insecure practice. The `unsigned:` partition entry compounds this by granting access to any unsigned process on the system. Additionally, the `|| true` error suppression masks ACL failures, silently leaving the keychain ACL unset while the script reports success.
Replace `-A` with explicit `-T` paths for only trusted binaries, remove `unsigned:` from the partition list, and remove the error suppression to catch ACL setup failures.
Applies to lines 17–22 and 26–31.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/setup-chromium-safe-storage.sh` around lines 17 - 22, Update the security set-generic-password-partition-list invocation: remove the insecure -A flag and the "unsigned:" partition entry, add explicit -T entries for only trusted binaries (e.g., the Chromium/Chrome helper and any other approved tools) in place of -A, and remove the trailing "|| true" so ACL setup failures cause the script to exit non-zero; modify the calls to security set-generic-password-partition-list (the lines using -S, -s, -a, -k and "$KEYCHAIN") accordingly so they explicitly grant access only to the listed -T paths and fail loudly on error.
app/src/AppRoutes.tsx-59-68 (1)
59-68: ⚠️ Potential issue | 🟠 Major
Reintroduce `/conversations` compatibility routes in `AppRoutes`.
The new `/chat` route is fine, but removing `/conversations` and `/conversations/:threadId` breaks the declared route contract and can regress existing deep links/navigation paths. Keep `/chat`, and add compatibility entries.
As per coding guidelines, "Route definitions in `app/src/AppRoutes.tsx` include `/`, `/onboarding`, `/mnemonic`, `/home`, `/intelligence`, `/skills`, `/conversations`, `/invites`, `/agents`, `/settings/*`, and `DefaultRedirect`; no dedicated `/login` route".
🔧 Suggested route-compatibility patch

     <Route
       path="/chat"
       element={
         <ProtectedRoute requireAuth={true}>
           <Accounts />
         </ProtectedRoute>
       }
     />
    +
    +<Route
    +  path="/conversations"
    +  element={
    +    <ProtectedRoute requireAuth={true}>
    +      <Accounts />
    +    </ProtectedRoute>
    +  }
    +/>
    +
    +<Route
    +  path="/conversations/:threadId"
    +  element={
    +    <ProtectedRoute requireAuth={true}>
    +      <Accounts />
    +    </ProtectedRoute>
    +  }
    +/>

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/AppRoutes.tsx` around lines 59 - 68, Add back compatibility routes for "/conversations" and "/conversations/:threadId" in AppRoutes by registering two Route entries that mirror the existing "/chat" route: each should use the same ProtectedRoute wrapper (ProtectedRoute requireAuth={true}) and render the Accounts component so existing deep-links keep working; update AppRoutes.tsx to include these paths alongside the current "/chat" entry.
app/src/components/BottomTabBar.tsx-151-163 (1)
151-163: ⚠️ Potential issue | 🟠 Major
Make the collapsed tab bar reachable without a mouse.
When `collapsed` is true, the only reveal affordance is `onMouseEnter`, while the nav is merely translated off-screen. Keyboard users can get stuck here or tab into hidden controls. Please add a focusable reveal control and disable or unmount the hidden nav while collapsed.
app/src-tauri/tauri.conf.json-40-40 (1)
40-40: ⚠️ Potential issue | 🟠 Major
CSP is too permissive for production and weakens XSS containment.
Line 40 allows `'unsafe-inline'`, `'unsafe-eval'`, and broad protocol sources (`http:`, `ws:`, etc.) under `default-src`/`connect-src`. This materially reduces security guarantees for the main app webview.
🔒 Suggested direction

    - "csp": "default-src 'self' 'unsafe-inline' 'unsafe-eval' data: blob: http: https: ws: wss: ipc: http://ipc.localhost; img-src 'self' data: blob: https: http:; connect-src 'self' ipc: http://ipc.localhost http: https: ws: wss: data: blob:; frame-src 'self' https: http: data: blob:"
    + "csp": "default-src 'self' ipc: http://ipc.localhost; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob: https:; connect-src 'self' ipc: http://ipc.localhost https: wss:; frame-src 'self' https:"

(If dev-only relaxations are needed, keep them out of production config.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/tauri.conf.json` at line 40, The current "csp" value is overly permissive; update the tauri config's "csp" entry to remove 'unsafe-inline' and 'unsafe-eval', restrict default-src to 'self' and trusted origins (prefer https: only), and tighten connect-src/frame-src/img-src to only the exact hosts/endpoints the app needs (or 'self' and specific https origins); if you need dev-only relaxations, implement a separate dev build config or environment conditional so production uses the locked-down "csp" entry, and consider using nonces or hashes for any required inline scripts/styles.
app/src-tauri/recipes/linkedin/recipe.js-34-37 (1)
34-37: ⚠️ Potential issue | 🟠 Major
Conversation IDs are collision-prone and can overwrite entries.
Line 34 derives `id` from display name, which is not unique across threads.
🔧 Suggested fix

    +    const threadHref =
    +      row.querySelector('a[href*="/messaging/thread/"]')?.getAttribute('href') || null;
         if (name || preview) {
           messages.push({
    -        id: name ? 'li:' + name : 'li:row:' + idx,
    +        id: threadHref ? 'li:' + threadHref : 'li:row:' + idx,
             from: name || null,
             body: preview || null,
             unread: unread,
           });
         }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/linkedin/recipe.js` around lines 34 - 37, The current id derivation (the id property built as name ? 'li:'+name : 'li:row:'+idx in recipe.js) can collide because display names are not unique; update the id generation to use a unique, stable identifier from the LinkedIn thread record (e.g., threadId, urn, or the API's native id field) and fall back to a deterministic composite (for example combining that native id with idx or a timestamp) only if no native id exists; locate the id assignment in the mapping that sets id/from/body/unread and replace the name-based logic so ids are guaranteed unique and stable across runs.
app/src-tauri/capabilities/webview-accounts.json-9-19 (1)
9-19: ⚠️ Potential issue | 🟠 Major
Remote origin allowlist is broader than necessary for a privileged command.
Allowing extra origins (notably Line 18 `browserscan.net`) increases the invocation surface for `webview_recipe_event`. Keep this list strictly to required provider origins, and move diagnostics/testing origins to a dev-only capability.
🔒 Suggested hardening

     "urls": [
       "https://web.whatsapp.com/*",
       "https://web.telegram.org/*",
       "https://www.linkedin.com/*",
       "https://mail.google.com/*",
       "https://app.slack.com/*",
       "https://discord.com/*",
       "https://meet.google.com/*",
    -  "https://accounts.google.com/*",
    -  "https://www.browserscan.net/*"
    +  "https://accounts.google.com/*"
     ]

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/capabilities/webview-accounts.json` around lines 9 - 19, The webview-accounts.json allowlist is too broad for the privileged command webview_recipe_event. Remove non-provider origins (notably "https://www.browserscan.net/*") so the "urls" array contains only the actual provider domains required for webview_recipe_event (e.g., web.whatsapp, web.telegram, linkedin, mail.google, app.slack, discord, meet.google, accounts.google), and move any diagnostic or testing origins like browserscan.net into a dev-only capability file or a separate dev-only allowlist used only in non-production builds; update any capability documentation and gating logic so webview_recipe_event uses the tightened webview-accounts.json in production and the dev-only list only in development.
app/src-tauri/recipes/telegram/recipe.js-30-35 (1)
30-35: ⚠️ Potential issue | 🟠 Major
Use a stable chat identifier instead of `name` / row index.
`tg:${name}` will collide for duplicate chat names, and `tg:row:${idx}` changes whenever the list reorders. That will merge unrelated chats or create fake "new" threads after a rename/reorder. Use a peer-specific DOM attribute or href-based identifier if one is available on the row.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/telegram/recipe.js` around lines 30 - 35, The current messages.push call sets id using name or idx (id: name ? 'tg:'+name : 'tg:row:'+idx), which causes collisions when chat names duplicate and breaks stability on reorder; change the id generation in the messages.push block to use a stable, peer-specific identifier available on the row (for example a data attribute like data-peer-id, a peer id property, or the href/URL of the chat link) instead of name or idx, falling back to a preserved unique Telegram peer id only if present; update references in the function that builds messages (messages.push, the id field) to extract that stable attribute from the row element or parsed link and use it as the 'tg:' id prefix so ids remain stable across renames and reorders.
app/src/components/accounts/WebviewHost.tsx-71-77 (1)
71-77: ⚠️ Potential issue | 🟠 Major
Don't mark the host as opened before the native open succeeds.
If `openWebviewAccount(...)` rejects once, `openedRef.current` stays `true` and this effect never retries the open path for that account. Every later tick goes through `setWebviewAccountBounds(...)` against a webview that may not exist.
Suggested fix
     if (!openedRef.current) {
       openedRef.current = true;
       log('opening account=%s at %o', accountId, bounds);
    -  void openWebviewAccount({ accountId, provider, bounds });
    +  void openWebviewAccount({ accountId, provider, bounds }).catch(error => {
    +    openedRef.current = false;
    +    log('open failed account=%s error=%O', accountId, error);
    +  });
     } else {
       void setWebviewAccountBounds(accountId, bounds);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/accounts/WebviewHost.tsx` around lines 71 - 77, The effect sets openedRef.current = true before calling openWebviewAccount, so if openWebviewAccount rejects we never retry and later calls go to setWebviewAccountBounds for a non-existent webview; change the flow to only mark openedRef.current = true after openWebviewAccount successfully resolves (e.g., await openWebviewAccount or use .then() to set openedRef.current on success and .catch() to leave it false and log/handle the error), and ensure the failure path does not call setWebviewAccountBounds for that account; update the block around openedRef.current, openWebviewAccount and setWebviewAccountBounds accordingly.
app/src-tauri/recipes/gmail/recipe.js-15-18 (1)
15-18: ⚠️ Potential issue | 🟠 Major
Emit an empty snapshot when Gmail stops exposing inbox rows.
Returning early here leaves the previous non-empty snapshot live forever. If the user signs out, switches to a view with no matching rows, or Gmail changes the selector, the account can keep showing stale unread/message data until a later successful scrape.
Suggested fix
     api.loop(function () {
       const rows = document.querySelectorAll('tr.zA');
    -  if (!rows || rows.length === 0) return;
    +  if (!rows || rows.length === 0) {
    +    const key = JSON.stringify({ n: 0, u: 0, first: [] });
    +    if (key !== last) {
    +      last = key;
    +      api.ingest({ messages: [], unread: 0, snapshotKey: key });
    +    }
    +    return;
    +  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/gmail/recipe.js` around lines 15 - 18, The current api.loop in api.loop(function () { ... }) returns early when document.querySelectorAll('tr.zA') yields no nodes, which leaves the last non-empty snapshot active; change that branch to emit an explicit empty snapshot (e.g., call api.emitSnapshot([]) or the project’s snapshot-empty helper) when rows is null or rows.length === 0, then return; ensure this happens before any later processing so sign-outs, view changes, or selector failures clear the displayed state instead of preserving stale data.
app/src-tauri/recipes/discord/recipe.js-14-18 (1)
14-18: ⚠️ Potential issue | 🟠 Major
Reset Discord state when the sidebar selector goes empty.
Bailing out here means the previous non-empty scrape keeps driving unread state even after the user leaves the guild/DM list or the selector stops matching. The UI needs an explicit empty snapshot to clear stale badges.
Suggested fix
     api.loop(function () {
       const rows = document.querySelectorAll(
         '[role="treeitem"][data-list-item-id], [data-list-item-id^="channels"], [data-list-item-id^="private-channels"]'
       );
    -  if (!rows || rows.length === 0) return;
    +  if (!rows || rows.length === 0) {
    +    const key = JSON.stringify({ n: 0, u: 0, first: [] });
    +    if (key !== last) {
    +      last = key;
    +      api.ingest({ messages: [], unread: 0, snapshotKey: key });
    +    }
    +    return;
    +  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/discord/recipe.js` around lines 14 - 18, The loop currently bails when the querySelectorAll result (rows) is empty, leaving previous scrape state active; change the early-return so that when rows is empty you explicitly emit/dispatch an empty snapshot to clear state (use the same API used elsewhere in this file within api.loop, e.g., the function that sends scrape results/snapshot) and then return, ensuring api.loop (and any downstream handlers) receive an empty list instead of nothing; update the block around api.loop and the rows handling to call that snapshot/dispatch method with an empty array when rows.length === 0.
app/src-tauri/recipes/slack/recipe.js-15-19 (1)
15-19: ⚠️ Potential issue | 🟠 Major
Clear the snapshot when the sidebar query returns nothing.
This early return keeps the last successful Slack scrape alive even after the workspace/sidebar disappears. That can leave stale unread badges in the app after sign-out, workspace switches, or DOM changes.
Suggested fix
     api.loop(function () {
       const rows = document.querySelectorAll(
         '[data-qa="virtual-list-item"], .p-channel_sidebar__channel'
       );
    -  if (!rows || rows.length === 0) return;
    +  if (!rows || rows.length === 0) {
    +    const key = JSON.stringify({ n: 0, u: 0, first: [] });
    +    if (key !== last) {
    +      last = key;
    +      api.ingest({ messages: [], unread: 0, snapshotKey: key });
    +    }
    +    return;
    +  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/slack/recipe.js` around lines 15 - 19, When api.loop's DOM query for rows returns no elements, the code currently returns early leaving the previous snapshot intact; update the early-return branch inside api.loop (the block that checks rows / rows.length === 0) to clear the stored snapshot state (e.g., set snapshot to null/empty or call the existing snapshot-reset helper) before returning so stale Slack scrape data and unread badges are removed when the sidebar/workspace disappears; modify the branch that references rows and the snapshot variable inside api.loop to perform the clear then return.
app/src-tauri/recipes/google-meet/recipe.js-287-380 (1)
287-380: ⚠️ Potential issue | 🟠 Major
Redact the Meet diagnostics before they hit logs.
This diagnostic path logs raw caption text, participant names, and nearby UI text (`dump` and `sample`). That can leak meeting content into application logs at `info` level. Please remove the text payloads or gate the whole dump behind an explicit local-dev flag that is off by default.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/google-meet/recipe.js` around lines 287 - 380, The diagnostic function maybeLogDiag currently logs raw caption text and participant names (variables dump, rows[0].speaker, rows[0].text and region child text) via api.log; update maybeLogDiag to avoid emitting any user-visible text by either (A) stripping/redacting text payloads before logging (replace actual text with placeholders like "[REDACTED]" and only log counts/lengths), or (B) gate the entire dump/sample output behind an explicit dev flag (e.g., ENABLE_MEET_DIAG) that defaults to false so api.log never includes dump or rows' text unless the flag is turned on; make changes where dump is built and where api.log composes the sample string and extra region child text.
app/src-tauri/src/webview_accounts/ua_spoof.js-16-22 (1)
16-22: ⚠️ Potential issue | 🟠 Major
Sync spoofed Chrome version with bundled CEF (currently 22 versions behind).
The hardcoded Chrome 124 in ua_spoof.js and mod.rs will drift dangerously from the bundled CEF 146.4.1 engine. This creates an immediate and detectable fingerprint mismatch: providers checking `navigator.userAgent` against actual Chromium capabilities will see Chrome 124 claiming features (or lacking them) that don't match Chromium 146's real behavior. As CEF increments to 147, 148, etc., this gap only widens and detection becomes trivial.
The values are scattered across files without centralization:
- `app/src-tauri/src/webview_accounts/ua_spoof.js` (lines 16–17, 40, 46–47, 51–52, 69)
- `app/src-tauri/src/webview_accounts/mod.rs` (hardcoded constant)
Extract the bundled CEF version from `Cargo.toml` (currently `cef = "=146.4.1"`) and derive the Chrome major and full version numbers from it, then use those constants everywhere. If deriving the full version string from just the CEF version is not feasible, at least centralize the constants in a single location (e.g., a shared config file or a Rust module) so they update in lockstep with CEF pin changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/ua_spoof.js` around lines 16 - 22, The UA constants (CHROME_MAJOR, CHROME_FULL, UA) are hardcoded and drift from the bundled CEF; update by deriving the Chrome version from the pinned CEF version in Cargo.toml (cef = "=146.4.1") or centralize the values into a single source of truth (e.g., a Rust module or build-time script) that exposes the Chrome major/full constants to both mod.rs and ua_spoof.js; replace the duplicated CHROME_MAJOR/CHROME_FULL definitions with references to that shared constant (or an env var generated at build time) so both mod.rs and the UA string (UA) use the same derived values and remain in sync with CEF pins.
app/src-tauri/src/discord_scanner/mod.rs-454-537 (1)
454-537: ⚠️ Potential issue | 🟠 Major
Track gateway `requestId`s before emitting `webSocketFrame*` events.
`Network.webSocketFrameSent`/`Received` fires for every websocket on the page. This branch emits them all as Discord ingest, so once voice/RTC or other auxiliary sockets are open you'll forward unrelated traffic too. Capture the `requestId`s from `Network.webSocketCreated` when the URL matches `gateway.discord...`, then drop frame events whose `requestId` is not in that set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/discord_scanner/mod.rs` around lines 454 - 537, Record gateway request IDs when handling the "Network.webSocketCreated" branch (where you check is_discord_gateway(url)) by inserting the requestId into a per-account set (e.g., discord_gateway_request_ids: HashSet<String>) and then, in the m @ ("Network.webSocketFrameSent" | "Network.webSocketFrameReceived") branch, check the extracted request_id against that set and return/drop the event if it's not present; also remove the id from the set when you observe the corresponding "Network.webSocketClosed" (or similar) event so the set doesn't grow indefinitely.
app/src/services/webviewAccountService.ts-107-123 (1)
107-123: ⚠️ Potential issue | 🟠 Major
Normalize `ingest` payloads per provider before dispatching or persisting them.
This branch assumes every `ingest` event carries `messages[]` shaped like `IngestMessage`, but the new scanners do not. Telegram/Slack emit rows with `sender` + `date`/`ts_secs`, and Discord emits HTTP/WS envelopes with no `messages` array at all. Today that means Slack/Telegram lose sender + real timestamps in Redux, Discord ingest is silently dropped, and scanner-backed providers also get a second generic memory doc on top of the Rust-side persistence.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/services/webviewAccountService.ts` around lines 107 - 123, The ingest branch assumes messages are already IngestMessage-shaped; update the handler for evt.kind === 'ingest' to normalize provider-specific payloads before mapping to IngestedMessage: detect provider (evt.provider) and for Slack/Telegram pull sender=>from and date/ts_secs=>ts, for Discord extract messages from the HTTP/WS envelope or synthesize a single message with proper sender/ts if messages[] is missing, and preserve original unread flags; then call store.dispatch(appendMessages({ accountId, messages, unread: ingest.unread })) with the normalized messages and only call persistIngestToMemory(accountId, evt.provider, ingest, messages) when the provider is not already persisting to Rust-side memory (skip for scanner-backed providers) to avoid duplicate memory docs.
app/src-tauri/src/webview_accounts/mod.rs-644-682 (1)
644-682: ⚠️ Potential issue | 🟠 Major
`forget()` here does not actually stop the scanner tasks.
The spawned scanner loops keep running forever; `forget()` only removes the account id from each registry’s `HashSet`. After close/purge, those background tasks can keep polling and later reattach to the next matching provider page, and reopening the same account will start a duplicate scanner. Store a cancellation token / `JoinHandle` per account and shut it down here before dropping registry state.
Also applies to: 708-746
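As a rough illustration of the per-account cancellation idea, here is a std-only Rust sketch. Everything in it is hypothetical: the `ScannerTask` struct, the `ensure_scanner` / `forget_and_cancel` / `is_running` names, and the thread-based poll loop are stand-ins, not code from this PR. The real scanners are presumably async tasks, where an async cancellation token or task abort would play the role of the `AtomicBool` flag used here.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical per-account handle: a stop flag plus the worker's JoinHandle.
struct ScannerTask {
    stop: Arc<AtomicBool>,
    handle: JoinHandle<u32>,
}

/// Hypothetical registry keyed by account id (assumed replacement for a plain HashSet).
#[derive(Default)]
struct ScannerRegistry {
    tasks: Mutex<HashMap<String, ScannerTask>>,
}

impl ScannerRegistry {
    /// Starts a scanner for `account_id`; returns false if one is already running.
    fn ensure_scanner(&self, account_id: &str) -> bool {
        let mut tasks = self.tasks.lock().unwrap();
        if tasks.contains_key(account_id) {
            return false;
        }
        let stop = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            let mut ticks = 0u32;
            // Stand-in for the real poll loop; the flag is checked on every tick.
            while !flag.load(Ordering::Relaxed) {
                ticks += 1;
                thread::sleep(Duration::from_millis(5));
            }
            ticks
        });
        tasks.insert(account_id.to_string(), ScannerTask { stop, handle });
        true
    }

    /// Signals the scanner to stop, waits for it to exit, and drops its entry.
    fn forget_and_cancel(&self, account_id: &str) -> bool {
        // Remove under the lock, then join outside it so other accounts are not blocked.
        let task = self.tasks.lock().unwrap().remove(account_id);
        match task {
            Some(task) => {
                task.stop.store(true, Ordering::Relaxed);
                let _ = task.handle.join();
                true
            }
            None => false,
        }
    }

    /// Reports whether a scanner entry exists for `account_id`.
    fn is_running(&self, account_id: &str) -> bool {
        self.tasks.lock().unwrap().contains_key(account_id)
    }
}
```

With this shape, a second `ensure_scanner` call for the same account is a no-op while the first task is alive, and reopening after `forget_and_cancel` starts exactly one fresh scanner.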
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/mod.rs` around lines 644 - 682, The current calls to registry.forget(&acct) only remove the account id from the ScannerRegistry but do not stop the spawned scanner tasks; modify ScannerRegistry (e.g., crate::whatsapp_scanner::ScannerRegistry, crate::slack_scanner::ScannerRegistry, crate::discord_scanner::ScannerRegistry, crate::telegram_scanner::ScannerRegistry) to track a cancellation token or JoinHandle per account (instead of just a HashSet) and expose a shutdown method (e.g., shutdown_account(account_id: &str) or forget_and_cancel) that cancels the background task and awaits or joins it; then replace the simple registry.forget calls in this block with calls to that shutdown method so the scanner is cancelled before removing browser_ids/state and calling tauri_runtime_cef::notification::unregister; apply the same change to the duplicate code region noted (also at lines 708-746).
app/src-tauri/src/webview_accounts/mod.rs-631-643 (1)
631-643: ⚠️ Potential issue | 🟠 Major
Don’t drop the account→label mapping until `close()` succeeds.
Both `close` and `purge` remove the state entry before calling `wv.close()`. If `close()` fails, the child webview is still alive but now unreachable through `WebviewAccountsState`, so follow-up hide/show/purge/retry commands can’t find it anymore.
Also applies to: 699-706
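The ordering described above (look up, close, then remove) can be sketched in std-only Rust as follows. This is an illustration under assumptions, not the PR's code: the `WebviewAccountsState` struct here is a simplified stand-in with a `String` label map, and the `close_webview` closure is a hypothetical placeholder for resolving the webview by label and calling its close method.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Simplified, hypothetical stand-in for the account -> webview label map.
#[derive(Default)]
struct WebviewAccountsState {
    inner: Mutex<HashMap<String, String>>,
}

/// Closes the webview for `account_id` and drops the mapping only on success.
/// `close_webview` stands in for "find the webview by label and close it".
fn close_account<F>(
    state: &WebviewAccountsState,
    account_id: &str,
    close_webview: F,
) -> Result<(), String>
where
    F: FnOnce(&str) -> Result<(), String>,
{
    // Clone the label so the lock is not held across the close call.
    let label = state.inner.lock().unwrap().get(account_id).cloned();
    let label = match label {
        Some(label) => label,
        None => return Ok(()), // nothing is open for this account
    };
    // On error the mapping is left in place, so a retry can still find the webview.
    close_webview(&label)?;
    state.inner.lock().unwrap().remove(account_id);
    Ok(())
}
```

The design choice is simply that removal becomes the last step: a failed close returns early through `?` before the entry is touched.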
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/mod.rs` around lines 631 - 643, The code removes the account→label entry from state (via state.inner.lock().unwrap().remove(&args.account_id)) before calling wv.close(), making the live webview unreachable if close() fails; change the flow in the close and purge handlers (locations using label_opt / state.inner.lock().unwrap().remove and the wv.close() call, and the purge code at the other noted range) to first look up the label without removing (e.g., peek/get the mapping or clone the label), call app.get_webview(&label) and attempt wv.close(), and only call state.inner.lock().unwrap().remove(&args.account_id) after wv.close() returns Ok (or after any cleanup that guarantees the webview is gone); on Err from wv.close() leave the mapping in place and return the error so follow-up commands can find and retry the webview.
app/src-tauri/src/webview_accounts/mod.rs-162-166 (1)
162-166: ⚠️ Potential issue | 🟠 Major
Redact query strings before logging external navigation URLs.
OAuth and invite flows commonly put auth codes, state, email hints, and similar sensitive values in the query string. These branches log the full URL, which will persist those values in desktop logs. Log origin/path only, or strip query + fragment before logging. As per coding guidelines "Never log secrets, raw JWTs, API keys, or full PII in debug logs; redact or omit sensitive fields".
Also applies to: 427-446
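A minimal sketch of the "origin/path only" logging idea, in std-only Rust. The helper name `redact_url_for_log` and the `<redacted-url>` placeholder are hypothetical, and plain string slicing is used only to keep the example self-contained; a real URL parser would be the more robust choice. The sketch also drops any `user:pass@` userinfo, which goes slightly beyond what the comment asks for.

```rust
/// Returns `url` reduced to scheme, host, and path, for use in log lines.
/// Hypothetical helper: std-only string handling, not a full URL parser.
fn redact_url_for_log(url: &str) -> String {
    // Anything without a scheme separator is replaced by a placeholder.
    let scheme_end = match url.find("://") {
        Some(i) => i,
        None => return "<redacted-url>".to_string(),
    };
    // Cut at the first '?' or '#', whichever comes first.
    let cut = url
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(url.len());
    if cut < scheme_end + 3 {
        return "<redacted-url>".to_string();
    }
    let scheme = &url[..scheme_end];
    let rest = &url[scheme_end + 3..cut];
    // Split authority from path at the first '/'.
    let (authority, path) = match rest.find('/') {
        Some(i) => rest.split_at(i),
        None => (rest, ""),
    };
    // Drop userinfo (everything up to the last '@') from the authority.
    let host = authority.rsplit('@').next().unwrap_or(authority);
    format!("{scheme}://{host}{path}")
}
```

A call site would then log `redact_url_for_log(&url)` instead of the raw URL, so OAuth codes and state values in the query string never reach the log file.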
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/mod.rs` around lines 162 - 166, The code logs full external URLs (including query and fragment) which can expose sensitive tokens; in open_in_system_browser (and any other places in this module that log external navigation URLs) parse the URL (e.g., with url::Url), construct a sanitized form containing only origin + path (omit query and fragment), and use that sanitized string in log messages instead of the full url; if parsing fails, fall back to a safe redacted placeholder rather than logging the raw URL.
app/src/services/webviewAccountService.ts-58-68 (1)
58-68: ⚠️ Potential issue | 🟠 Major
Reset `started` when listener registration fails.
If `listen()` throws here, `started` stays `true`, so every later `startWebviewAccountService()` call returns early and the service never retries after a transient startup failure.
Suggested fix
     started = true;
     void (async () => {
       try {
         unlisten = await listen<RecipeEventPayload>('webview:event', evt => {
           handleRecipeEvent(evt.payload);
         });
         log('event listener attached');
       } catch (err) {
    +    started = false;
         errLog('failed to attach listener', err);
       }
     })();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/services/webviewAccountService.ts` around lines 58 - 68, If listen('webview:event', ...) throws, the module-level flag started remains true and prevents retries; update startWebviewAccountService (the block that sets started = true, calls listen, and assigns unlisten) so that on error you reset started to false (and ensure unlisten remains undefined) inside the catch for listen failures; specifically, in the async IIFE that calls listen<RecipeEventPayload> ensure the catch handler sets started = false (and optionally clears any partially set unlisten) before logging the error so subsequent calls to startWebviewAccountService() will retry listener registration.
app/src-tauri/src/telegram_scanner/mod.rs-345-352 (1)
345-352: ⚠️ Potential issue | 🟠 Major
Use `peer_id` as the memory key, not the mutable display name.
Keying on `peer_name` breaks the promised “one doc per peer” behavior: a rename creates a new doc, and two peers with the same clean title can overwrite each other. The stable upsert key here is `peer_id`; keep the human-readable name in `title`/metadata only.
Suggested fix
    -    let key = if peer_key_looks_clean(peer_name) {
    -        peer_name.to_string()
    -    } else {
    -        peer_id.to_string()
    -    };
    +    let key = peer_id.to_string();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/telegram_scanner/mod.rs` around lines 345 - 352, The current memory key uses the mutable display name (peer_name) which allows renames/collisions; change the upsert key to always use peer_id (keep namespace = format!("telegram-web:{account_id}")), so set key = peer_id.to_string() (or equivalent) instead of conditionally using peer_name; continue to preserve peer_name (and peer_key_looks_clean logic) only for the document title/metadata fields, not for the storage key.
app/src-tauri/src/slack_scanner/mod.rs-431-441 (1)
431-441: ⚠️ Potential issue | 🟠 Major
Use a stable channel identifier for the memory key.
`channel_name` is mutable and only workspace-local. With a namespace of `slack-web:{account_id}`, a rename or two workspaces that both have `#general` will fragment or collide transcript docs. Use the stable channel identifier here instead, optionally namespaced with `team_id` if you need workspace disambiguation.
Suggested fix
    -    let key = if channels_key_looks_clean(channel_name) {
    -        channel_name.to_string()
    -    } else {
    -        channel_id.to_string()
    -    };
    +    let key = if team_id.is_empty() {
    +        channel_id.to_string()
    +    } else {
    +        format!("{team_id}_{channel_id}")
    +    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/slack_scanner/mod.rs` around lines 431 - 441, The code currently builds the memory key from the mutable workspace-local channel_name (namespace = format!("slack-web:{account_id}"); key = if channels_key_looks_clean(channel_name) ...), which will fragment/collide on renames or identical names across workspaces; change it to use the stable channel identifier (channel_id) as the key and, if you need workspace disambiguation, include team_id in the namespace (e.g. namespace includes team_id or use format!("slack-web:{account_id}:{team_id}")), remove reliance on channels_key_looks_clean/channel_name for the stored key, and ensure all places using namespace/key are updated to the new stable format.
🟡 Minor comments (6)
app/src/components/accounts/AddAccountModal.tsx-30-33 (1)
30-33: ⚠️ Potential issue | 🟡 Minor
Set explicit `type="button"` on modal buttons.
At Line 30 and Line 52, default button type is `submit`; this can trigger unintended form submissions if the modal is rendered within a form context.
Proposed fix
     <button
    +  type="button"
       onClick={onClose}
       className="rounded p-1 text-stone-500 hover:bg-stone-100"
       aria-label="close">
     ...
     <button
       key={p.id}
    +  type="button"
       onClick={() => onPick(p)}
       className="flex w-full items-center gap-3 rounded-lg px-3 py-2 text-left transition-colors hover:bg-stone-100">
Also applies to: 52-55
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/accounts/AddAccountModal.tsx` around lines 30 - 33, The modal's buttons (the close button using onClose and the other modal action button(s) in AddAccountModal component) lack an explicit type and default to "submit", which can unintentionally submit enclosing forms; update the button elements in AddAccountModal.tsx to include type="button" for the close button (the one with onClick={onClose}) and for the other action button(s) around line where the save/confirm action is defined so they don't trigger form submission.
app/src/store/accountsSlice.ts-82-85 (1)
82-85: ⚠️ Potential issue | 🟡 Minor
Guard `appendLog` against unknown/removed accounts.
At Line 84, logs are created even if `accountId` is no longer in `state.accounts`, which can leave orphaned state after races (remove/purge vs. late events).
Proposed fix
     appendLog(state, action: PayloadAction<{ accountId: string; entry: AccountLogEntry }>) {
       const { accountId, entry } = action.payload;
    +  if (!state.accounts[accountId]) return;
       const list = (state.logs[accountId] ??= []);
       list.push(entry);
       if (list.length > MAX_LOG_LINES_PER_ACCOUNT) {
         list.splice(0, list.length - MAX_LOG_LINES_PER_ACCOUNT);
       }
     },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/store/accountsSlice.ts` around lines 82 - 85, appendLog currently creates or appends log entries for any accountId, which can create orphaned logs when an account has been removed; modify appendLog (in accountsSlice) to first check that state.accounts has an entry for action.payload.accountId and return early (no-op) if it doesn't exist, so you only create/modify state.logs for known accounts and avoid races that leave orphaned state.
app/src/components/accounts/AddAccountModal.tsx-22-30 (1)
22-30: ⚠️ Potential issue | 🟡 Minor
Provide an explicit accessible name for the dialog.
At Line 22, `role="dialog"` should be paired with `aria-labelledby` (or `aria-label`) so screen readers announce a meaningful dialog title.
Proposed fix
     <div
       className="fixed inset-0 z-50 flex items-center justify-center bg-black/40 backdrop-blur-sm"
       role="dialog"
       aria-modal="true"
    +  aria-labelledby="add-account-title"
       onClick={onClose}>
     ...
    -  <h2 className="text-lg font-semibold text-stone-900">Add account</h2>
    +  <h2 id="add-account-title" className="text-lg font-semibold text-stone-900">
    +    Add account
    +  </h2>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/accounts/AddAccountModal.tsx` around lines 22 - 30, The dialog element with role="dialog" in AddAccountModal lacks an accessible name; add aria-labelledby or aria-label to the outer dialog container and point it to the modal title. Give the h2 ("Add account") a stable id (e.g., addAccountTitle or add-account-title) and set aria-labelledby="{that-id}" on the element that has role="dialog" (or alternatively add aria-label="Add account" to the dialog) so screen readers announce the dialog title; keep existing onClick/onClick stopPropagation behavior intact.
app/src/App.tsx-66-69 (1)
66-69: ⚠️ Potential issue | 🟡 Minor
Route comment is stale and conflicts with current fullscreen logic.
Line 66 says “On `/accounts`” but the fullscreen helper is keyed on `/chat`. Please update the comment to match behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/App.tsx` around lines 66 - 69, Update the stale route comment above the fullscreen assignment to describe the actual behavior used by isAccountsFullscreen: replace the reference to “/accounts” with “/chat” and briefly state that the helper determines whether the current path (location.pathname) and activeAccountId should render the app fullscreen (edge-to-edge webview) versus keeping the agent tab bar and reserved bottom padding; keep the comment adjacent to the fullscreen = isAccountsFullscreen(location.pathname, activeAccountId) line so it matches the current logic.
app/src-tauri/recipes/linkedin/recipe.js-42-45 (1)
42-45: ⚠️ Potential issue | 🟡 Minor
Snapshot dedupe can miss updates outside the first five rows.
Line 44 only fingerprints the first five messages, so deeper-list changes may never ingest.
🔧 Suggested fix
    -  const key = JSON.stringify({
    -    n: messages.length,
    -    first: messages.slice(0, 5).map(function (m) { return m.from + '|' + m.body; }),
    -  });
    +  const key = JSON.stringify(
    +    messages.map(function (m) {
    +      return m.id + '|' + m.from + '|' + m.body + '|' + m.unread;
    +    })
    +  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/linkedin/recipe.js` around lines 42 - 45, The snapshot key currently only fingerprints the first five messages (const key = JSON.stringify({... first: messages.slice(0, 5)...})) which can miss updates deeper in the thread; update the key generation in recipe.js to include all relevant messages (e.g., use messages.map(...) to include every message's from and body or a deterministic window/hashing of the full messages array) and keep messages.length as part of the key so any change anywhere in the array (not just the first five) changes the fingerprint.
app/src/utils/accountsFullscreen.ts-14-14 (1)
14-14: ⚠️ Potential issue | 🟡 Minor
Tighten `/chat` route matching to avoid false fullscreen triggers.
Line 14 also matches paths like `/chat-settings`. Prefer exact segment matching (`/chat` or `/chat/...`).
🔧 Suggested fix
    -  if (!pathname.startsWith('/chat')) return false;
    +  const isChatRoute = pathname === '/chat' || pathname.startsWith('/chat/');
    +  if (!isChatRoute) return false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/utils/accountsFullscreen.ts` at line 14, The current check in accountsFullscreen.ts uses pathname.startsWith('/chat') which incorrectly matches routes like '/chat-settings'; update the condition in the function that computes fullscreen (look for the pathname variable and the early return) to only return true for exact '/chat' or any path under it by replacing startsWith('/chat') with a stricter check such as pathname === '/chat' || pathname.startsWith('/chat/'), ensuring '/chat-settings' and similar do not trigger fullscreen.
🧹 Nitpick comments (14)
.gitmodules (1)
6-6: Prefer HTTPS submodule URL for broader clone compatibility.
Using an SSH URL here can break onboarding/CI in environments without preconfigured SSH keys. Consider switching to `https://github.com/tinyhumansai/tauri-cef.git` unless SSH-only access is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.gitmodules at line 6, The submodule URL in .gitmodules uses an SSH address (the line "url = git@github.com:tinyhumansai/tauri-cef.git") which can fail for users or CI without SSH keys; update that url entry to use the HTTPS form "https://github.com/tinyhumansai/tauri-cef.git" so clones work broadly, then run git submodule sync && git submodule update --init to propagate the change.
app/src/components/accounts/providerIcons.tsx (1)
23-33: Use named prop interfaces instead of inline object types.
Extract these prop shapes into `interface` declarations (e.g., `AgentIconProps`, `ProviderIconProps`) for consistency with the rest of the codebase, which consistently uses this pattern across all other components.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/accounts/providerIcons.tsx` around lines 23 - 33, Extract the inline prop object types into named interfaces: create AgentIconProps for the AgentIcon ({ className?: string }) and ProviderIconProps for ProviderIcon ({ provider: AccountProvider; className?: string }), then update the component signatures to use these interfaces (export const AgentIcon = ({...}: AgentIconProps) and export const ProviderIcon = ({...}: ProviderIconProps)) so the file matches the project's existing prop-interface pattern.
app/src/types/accounts.ts (1)
54-110: Consider exporting provider catalogs as readonly.
`BASE_PROVIDERS`/`DEV_PROVIDERS`/`PROVIDERS` are currently mutable arrays; making them readonly reduces accidental runtime mutation from consumers.
Optional refactor
    -const BASE_PROVIDERS: ProviderDescriptor[] = [
    +const BASE_PROVIDERS: readonly ProviderDescriptor[] = [
     ...
    -const DEV_PROVIDERS: ProviderDescriptor[] = [
    +const DEV_PROVIDERS: readonly ProviderDescriptor[] = [
     ...
    -export const PROVIDERS: ProviderDescriptor[] = IS_DEV
    +export const PROVIDERS: readonly ProviderDescriptor[] = IS_DEV
       ? [...BASE_PROVIDERS, ...DEV_PROVIDERS]
       : BASE_PROVIDERS;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/types/accounts.ts` around lines 54 - 110, BASE_PROVIDERS, DEV_PROVIDERS and PROVIDERS are mutable arrays; change their declarations to readonly to prevent accidental mutation by consumers: update BASE_PROVIDERS and DEV_PROVIDERS to type ReadonlyArray<ProviderDescriptor> (or use "as const" if all fields are literal) and ensure PROVIDERS is also declared as ReadonlyArray<ProviderDescriptor> (e.g., export const PROVIDERS: ReadonlyArray<ProviderDescriptor> = IS_DEV ? [...BASE_PROVIDERS, ...DEV_PROVIDERS] : BASE_PROVIDERS) so callers cannot push/pop, or alternatively Object.freeze the arrays after creation.
app/src-tauri/recipes/whatsapp/manifest.json (1)
5-6: Avoid pinning a stale UA string here.
A fixed `Chrome/124.0.0.0` macOS UA will drift from the bundled CEF/Chromium version and can start triggering unsupported-browser heuristics over time. Prefer deriving it from the embedded runtime, or at least centralizing it behind one versioned constant so it stays in sync.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/recipes/whatsapp/manifest.json` around lines 5 - 6, The manifest currently hardcodes the "userAgent" string (Chrome/124.0.0.0) which will drift from the embedded CEF/Chromium; change the manifest to stop pinning a stale UA by deriving the user agent from the embedded runtime at startup (e.g., call a helper like getEmbeddedUserAgent() or use a centralized constant such as EMBEDDED_USER_AGENT) and set the "userAgent" value from that derived/centralized source instead of the fixed string; update any code that reads manifest.json to fallback to the runtime-derived UA when present so the app stays in sync with the bundled Chromium.
app/src/pages/Accounts.tsx (1)
1-2: Import the React types explicitly here.
Using `React.MouseEvent`/`React.ReactNode` without a type import is out of line with the repo's `import type` rule, and it can break in stricter React/TypeScript setups.
As per coding guidelines `Use import type for TypeScript-only imports where appropriate` and `Run tsc --noEmit for TypeScript type checking in the app workspace before merging`.
Suggested fix
    -import { useEffect, useMemo, useState } from 'react';
    +import { useEffect, useMemo, useState } from 'react';
    +import type { MouseEvent, ReactNode } from 'react';
    @@
     interface RailButtonProps {
       active: boolean;
       onClick: () => void;
    -  onContextMenu?: (e: React.MouseEvent) => void;
    +  onContextMenu?: (e: MouseEvent) => void;
       tooltip: string;
       badge?: number;
    -  children: React.ReactNode;
    +  children: ReactNode;
     }
    @@
    -  const openContextMenu = (accountId: string, e: React.MouseEvent) => {
    +  const openContextMenu = (accountId: string, e: MouseEvent) => {
Also applies to: 24-31, 127-130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/pages/Accounts.tsx` around lines 1 - 2, The file uses React types (e.g., React.MouseEvent, React.ReactNode) without a type-only import; add an explicit type import such as "import type React from 'react';" at the top of Accounts.tsx (alongside the existing import of useEffect/useMemo/useState) or alternatively import specific types with "import type { MouseEvent, ReactNode } from 'react';", then ensure all usages (handlers and props declared with React.MouseEvent / React.ReactNode in components like the Accounts component and any onClick handlers) remain unchanged but now resolve via the type-only import to satisfy the repo rule and strict TS checks.app/src-tauri/src/whatsapp_scanner/dom_snapshot.rs (1)
289-294: Body length comparison uses byte length, not character count.
At line 291:
if trimmed.len() > best.len() {
This compares byte lengths, which can differ from character counts for non-ASCII text. A message of 30 emoji (about 4 bytes each in UTF-8, so roughly 120 bytes) would appear "longer" than a message of 100 ASCII characters. Since `truncate_chars` later operates on character count, there's a slight inconsistency.
In practice, selecting the longest byte-length span is a reasonable heuristic for finding the primary message text, so this is minor.
💡 Optional: use char count for consistency
- if trimmed.len() > best.len() {
+ if trimmed.chars().count() > best.chars().count() {
Note: This adds O(n) overhead per comparison.
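To make the byte-versus-character difference concrete, here is a minimal sketch. The helpers `longest_by_bytes` and `longest_by_chars` are hypothetical names for illustration, not functions from this PR; they only mirror the shape of the comparison discussed above.

```rust
/// Pick the candidate with the greatest UTF-8 byte length
/// (this is what `str::len` measures).
fn longest_by_bytes<'a>(candidates: &[&'a str]) -> &'a str {
    let mut best = "";
    for &c in candidates {
        let trimmed = c.trim();
        if trimmed.len() > best.len() {
            best = trimmed;
        }
    }
    best
}

/// Pick the candidate with the most Unicode scalar values.
/// The running count is cached so `best` is not re-counted on every step.
fn longest_by_chars<'a>(candidates: &[&'a str]) -> &'a str {
    let mut best = "";
    let mut best_chars = 0usize;
    for &c in candidates {
        let trimmed = c.trim();
        let n = trimmed.chars().count();
        if n > best_chars {
            best = trimmed;
            best_chars = n;
        }
    }
    best
}
```

With the candidates `"😀😀😀"` (3 characters, 12 bytes) and `"hello wor"` (9 characters, 9 bytes), the byte-based helper picks the emoji string and the character-based helper picks the ASCII one. Caching the count in a separate variable is one way to limit the O(n) overhead mentioned in the note.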
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/whatsapp_scanner/dom_snapshot.rs` around lines 289 - 294, The comparison uses byte length (trimmed.len() and best.len()) which is inconsistent with later character-based truncation; update the comparison in the block that builds `best` (the loop calling collect_text) to compare character counts instead, e.g., use trimmed.chars().count() > best.chars().count() (or track a separate char_count for `best`) so selection matches truncate_chars' character semantics in functions like collect_text/where `best` is used.app/src-tauri/src/whatsapp_scanner/mod.rs (3)
744-751: Fire-and-forget memory writes lose error attribution.
The spawned task at lines 747-751 logs errors but doesn't propagate them. If multiple chat-day groups fail, the scanner loop continues without knowing how many succeeded.
This is acceptable for a background scanner (resilience over strict consistency), but consider tracking success/failure counts for observability:
💡 Optional: track write outcomes
// At function level, track outcomes
let mut write_ok = 0usize;
let mut write_err = 0usize;
// ... in loop ...
// Use join handles or channels to collect results
// After loop:
log::info!("[wa][{}] memory writes: {} ok, {} failed", account_id, write_ok, write_err);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/whatsapp_scanner/mod.rs` around lines 744 - 751, The fire-and-forget tokio::spawn that calls post_memory_doc_ingest(&acct, &payload).await swallows error attribution; change the pattern to collect outcomes by replacing unobserved spawns with either collecting JoinHandles or sending results over a mpsc channel from inside the spawned tasks, incrementing function-level counters (e.g., write_ok/write_err) based on the Result returned by post_memory_doc_ingest, and after the loop await all handles or drain the channel and log a summary like memory writes: X ok, Y failed; keep using acct and post_memory_doc_ingest names to locate the code and preserve the non-blocking behavior by awaiting the handles only after the scanning iteration completes.
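The "collect handles, then tally" pattern suggested above can be sketched as follows. To keep the example dependency-free it swaps in `std::thread` for tokio tasks, and `ingest` is a hypothetical stand-in for `post_memory_doc_ingest`; in the real scanner the analogous step would presumably be collecting the `JoinHandle`s returned by `tokio::spawn` and awaiting them after the loop.

```rust
use std::thread;

/// Hypothetical stand-in for the real ingest call: fails on empty payloads.
fn ingest(payload: &str) -> Result<(), String> {
    if payload.is_empty() {
        Err("empty payload".to_string())
    } else {
        Ok(())
    }
}

/// Spawn one worker per payload, then join them all and tally the outcomes.
/// Returns `(ok, failed)`.
fn ingest_all(payloads: Vec<String>) -> (usize, usize) {
    // Spawning stays non-blocking: every worker starts before any is joined.
    let handles: Vec<thread::JoinHandle<Result<(), String>>> = payloads
        .into_iter()
        .map(|p| thread::spawn(move || ingest(&p)))
        .collect();

    let mut ok = 0usize;
    let mut failed = 0usize;
    for h in handles {
        match h.join() {
            Ok(Ok(())) => ok += 1,
            // Either the write returned an error or the worker panicked.
            Ok(Err(_)) | Err(_) => failed += 1,
        }
    }
    (ok, failed)
}
```

The caller can then log a single summary line per scan cycle instead of relying on scattered per-task warnings.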
188-208: `std::sync::Mutex` usage in async context.
The `contact_cache` function uses `std::sync::Mutex`, which can cause issues in async contexts if held across await points. In this case, the lock is acquired and released synchronously within `contact_cache_put` and `contact_cache_get` (no awaits while locked), so it's safe.
However, `unwrap()` on the lock at lines 201 and 206 will panic if the mutex is poisoned (a thread panicked while holding it). Consider using `.lock().ok()` or explicit poison handling for more robust error recovery.
💡 Suggested improvement
 fn contact_cache_put(account_id: &str, names: &serde_json::Map<String, Value>) {
     if names.is_empty() {
         return;
     }
-    let mut g = contact_cache().lock().unwrap();
+    let Ok(mut g) = contact_cache().lock() else {
+        log::warn!("[wa] contact cache mutex poisoned");
+        return;
+    };
     g.insert(account_id.to_string(), names.clone());
 }
 fn contact_cache_get(account_id: &str) -> serde_json::Map<String, Value> {
-    let g = contact_cache().lock().unwrap();
+    let Ok(g) = contact_cache().lock() else {
+        return serde_json::Map::new();
+    };
     g.get(account_id).cloned().unwrap_or_default()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/whatsapp_scanner/mod.rs` around lines 188 - 208, The mutex lock calls in contact_cache_put and contact_cache_get use .lock().unwrap(), which will panic if the mutex is poisoned; change these to handle poisoning gracefully by replacing .lock().unwrap() with a non-panicking pattern (e.g., .lock().ok() or .lock().map_err(|e| e.into()).or_else(|_| /* recover */)) so that contact_cache_put still returns without crashing and contact_cache_get returns an empty/default map on error; update contact_cache_put to log or ignore the poison and exit early, and update contact_cache_get to log the error and return serde_json::Map::new() when locking fails, referencing the contact_cache, contact_cache_put, and contact_cache_get functions.
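As an alternative to returning early on poison, the guard can be recovered with `PoisonError::into_inner`, which keeps the cache usable after a panic elsewhere. The sketch below is illustrative only: `cache`, `cache_put`, and `cache_get` are simplified, hypothetical stand-ins that store plain strings rather than the `serde_json::Map` used in the PR, and recovering the guard assumes a panic cannot leave the map in a half-updated state.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Process-wide cache, lazily initialised on first use.
fn cache() -> &'static Mutex<HashMap<String, String>> {
    static CACHE: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

fn cache_put(account_id: &str, name: &str) {
    // A poisoned lock still hands back its guard via `into_inner`,
    // so a panic in another thread does not cascade into this one.
    let mut g = cache().lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    g.insert(account_id.to_string(), name.to_string());
}

fn cache_get(account_id: &str) -> Option<String> {
    let g = cache().lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    g.get(account_id).cloned()
}
```

Whether to recover or bail out is a design choice: the review's `let Ok(..) else` version drops writes while the mutex is poisoned, whereas this version keeps serving data that was valid before the panic.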
1-22: Module is substantial; consider splitting per coding guidelines.
Per the learning "Keep domain `mod.rs` files light and export-focused", this ~900-line module could be split into:
- `mod.rs`: exports, `ScannerRegistry`, `spawn_scanner`
- `scan.rs`: `scan_once`, `scan_dom_once`, CDP connection
- `emit.rs`: `emit_snapshot`, `emit_dom_only`, `emit_grouped_whatsapp`
- `ingest.rs`: `post_memory_doc_ingest`, transcript formatting
- `util.rs`: timestamp parsing, contact cache
This is optional for the current PR but would improve maintainability as the scanner evolves.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/whatsapp_scanner/mod.rs` around lines 1 - 22, The mod.rs file is too large and should be split into focused modules: move scanning logic (scan_once, scan_dom_once and CDP connection setup) into scan.rs, emission helpers (emit_snapshot, emit_dom_only, emit_grouped_whatsapp) into emit.rs, ingestion/formatting (post_memory_doc_ingest and transcript formatting) into ingest.rs, and small helpers (timestamp parsing, contact cache) into util.rs, then reduce mod.rs to exports and public types only (export ScannerRegistry, spawn_scanner and re-export items from the new modules); update mod declarations and use statements so callers still reference ScannerRegistry and spawn_scanner from the top-level mod while implementation details live in the new files.docs/webview-integration-playbook.md (2)
117-120: Add language specifier to log output code block.
The fenced code block on line 117 is missing a language specifier. Use `text` or `shell` for log output examples.
📝 Suggested fix
-``` +```text tail -F /tmp/oh-cef.log | grep -E --line-buffered \🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/webview-integration-playbook.md` around lines 117 - 120, The fenced code block containing the log-following command that starts with "tail -F /tmp/oh-cef.log | grep -E --line-buffered \ " is missing a language specifier; update the opening backticks to include a language such as text or shell (for example change "```" to "```text" or "```shell") so the log output is properly highlighted/treated as plain text in the docs.
14-29: Add language specifier to ASCII diagram code block.
The fenced code block on line 14 is missing a language specifier. For ASCII diagrams, use `text` or `plaintext` to satisfy linters and improve rendering consistency.
📝 Suggested fix
-``` +```text CEF webview (third-party site)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/webview-integration-playbook.md` around lines 14 - 29, The ASCII diagram fenced code block is missing a language specifier; update the opening triple-backtick for the diagram in webview-integration-playbook.md to include a language like text (e.g., change "```" to "```text") so linters and renderers recognize it; locate the diagram's opening fence (the triple backticks surrounding the block shown in the diff) and add the specifier there.app/src-tauri/src/webview_accounts/runtime.js (1)
88-94: `sizeOf` for strings returns code-unit count, not byte count.
For strings, `s.length` returns the number of UTF-16 code units, not the byte size. This inconsistency with the `WS_MAX_FORWARD_BYTES` constant (which implies bytes) could cause confusion or allow larger-than-expected payloads when strings contain non-BMP characters (surrogate pairs count as 2 code units but represent 1 character).
Since this is used for size reporting rather than enforcement (truncation uses `WS_MAX_FORWARD_BYTES` on `serializeForForward`), the practical impact is minor: the logged `size` field will be approximate for non-ASCII text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/runtime.js` around lines 88 - 94, The sizeOf function currently returns string.length (UTF-16 code units) which misreports byte size versus the WS_MAX_FORWARD_BYTES byte limit; update sizeOf to compute the actual byte length for strings using UTF-8 (e.g. use new TextEncoder().encode(data).length), and include a safe fallback for environments without TextEncoder (e.g. use Blob or encodeURIComponent-based byte counting); keep the existing branches for ArrayBuffer/Blob/byteLength unchanged and ensure the function still returns 0 for null/undefined.app/src-tauri/src/slack_scanner/idb.rs (1)
1-326: Significant code duplication with `telegram_scanner/idb.rs`.
The `slack_scanner/idb.rs` and `telegram_scanner/idb.rs` files share nearly identical implementations for:
- `walk_database`
- `read_store`
- `serialize_values`
- `call_function_batch`
- Struct definitions (`IdbDump`, `IdbDb`, `IdbStore`)
The only differences are the `ORIGIN` constant, `SKIP_DB_PREFIXES`, and log prefixes.
Consider extracting a shared `cdp_idb` module that accepts origin/prefixes as parameters. This would reduce maintenance burden as new provider scanners are added.
Example structure:
// app/src-tauri/src/cdp_idb/mod.rs
pub struct IdbWalkConfig {
    pub origin: &'static str,
    pub skip_prefixes: &'static [&'static str],
    pub log_prefix: &'static str,
}
pub async fn walk(cdp: &mut CdpConn, session: &str, config: &IdbWalkConfig) -> Result<IdbDump, String>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/slack_scanner/idb.rs` around lines 1 - 326, Extract the duplicated IndexedDB logic into a new cdp_idb module and make slack_scanner/idb.rs call into it with a small provider config: move the structs IdbDump, IdbDb, IdbStore and the functions walk_database, read_store, serialize_values, call_function_batch (and any helper constants like PAGE_SIZE, MAX_RECORDS_PER_STORE, SERIALIZE_BATCH) into cdp_idb; expose a single async fn walk(cdp: &mut CdpConn, session: &str, config: &IdbWalkConfig) -> Result<IdbDump, String> where IdbWalkConfig has origin: &'static str, skip_prefixes: &'static [&'static str], and log_prefix: &'static str; update slack_scanner::idb to only define the Slack-specific ORIGIN/SKIP_DB_PREFIXES/LOG_PREFIX values and call cdp_idb::walk with an IdbWalkConfig, preserving existing log messages by prefixing with the provided log_prefix.app/src-tauri/src/telegram_scanner/extract.rs (1)
240-261: Float-to-string conversion may lose precision.
At line 246-247:
n.as_f64().map(|f| format!("{f}"))
For large integers that exceed the i64 range and are parsed as f64, this can silently lose precision: f64 represents integers exactly only up to 2^53, so for example 9007199254740993 is formatted as 9007199254740992. (Rust's `Display` for f64 does not switch to scientific notation, so the output stays a plain digit string.) Telegram IDs are typically within i64 range, but if they're stored as floats in IndexedDB, this could produce inconsistent keys.
Consider using a fixed format or handling large numbers explicitly:
💡 Suggested improvement
 } else {
-    n.as_f64().map(|f| format!("{f}"))
+    n.as_f64().map(|f| format!("{:.0}", f))
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/telegram_scanner/extract.rs` around lines 240 - 261, The float conversion path in num_to_str can lose precision or produce scientific notation; update num_to_str to prefer integer/string-safe representations: inside Value::Number(n) first try n.as_i64().map(|i| i.to_string()), then try n.as_u64().map(|u| u.to_string()), and finally fall back to Some(n.to_string()) instead of using n.as_f64().map(|f| format!("{f}")); keep the existing Value::String branch unchanged.
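One way to handle the float path explicitly is to accept only values that are exact integers within f64's safe range and reject everything else. This is a sketch under that assumption; `id_from_f64` is a hypothetical helper, not part of the PR's `num_to_str`.

```rust
/// Convert a float-encoded id to a decimal string, but only when the value
/// is an integer that f64 can represent exactly. Returns `None` otherwise.
fn id_from_f64(f: f64) -> Option<String> {
    // 2^53: above this magnitude, neighbouring integers collapse into one f64.
    const MAX_SAFE: f64 = 9_007_199_254_740_992.0;
    if !f.is_finite() || f.fract() != 0.0 || f.abs() > MAX_SAFE {
        return None;
    }
    // `{:.0}` prints the integer part as plain digits, with no fraction.
    Some(format!("{f:.0}"))
}
```

For example, `id_from_f64(1e15)` yields `"1000000000000000"`, while `id_from_f64(1.5)` and `id_from_f64(1e17)` yield `None`. Returning `None` lets the caller skip or log a record rather than ingest it under a key that may collide with a neighbouring id.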
# Conflicts: # .gitmodules # Cargo.lock # app/package.json # app/src-tauri/Cargo.lock # app/src-tauri/Cargo.toml # app/src-tauri/permissions/allow-core-process.toml # app/src-tauri/recipes/gmail/recipe.js # app/src-tauri/src/lib.rs # app/src-tauri/src/webview_accounts/mod.rs # app/src-tauri/tauri.conf.json # app/src-tauri/vendor/tauri-cef # app/src/components/BottomTabBar.tsx # app/src/components/accounts/AddAccountModal.tsx # app/src/pages/Conversations.tsx
♻️ Duplicate comments (1)
app/src-tauri/src/webview_accounts/mod.rs (1)
595-606: ⚠️ Potential issue | 🔴 Critical
Pass a per-webview target identity to the Telegram scanner, not only a URL prefix.
At Line 605, `ensure_scanner` is still called with only `url_prefix`. With multiple Telegram accounts open, prefix-based target matching can attach to the wrong tab and mix account ingests.
Suggested direction
- registry.ensure_scanner(app_clone, acct, prefix).await;
+ registry.ensure_scanner(app_clone, acct, target_identity, prefix).await;
And update scanner-side target resolution to match by that unique identity (e.g., webview label / CEF browser id / resolved CDP target id), not `starts_with(url_prefix)` alone.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/webview_accounts/mod.rs` around lines 595 - 606, The Telegram scanner is being started with only the URL prefix which causes cross-tab mixups; modify the call to registry.ensure_scanner so it receives a per-webview unique target identity in addition to the url_prefix (use an existing identifier from args such as args.account_id or add a webview_id on the args struct if needed), then update the ScannerRegistry::ensure_scanner signature and the scanner-side target resolution to match by that unique identity (e.g., webview label / CEF id / CDP target id) instead of relying solely on starts_with(url_prefix); ensure all callers and tests are updated to pass the new identity parameter.
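A rough sketch of identity-aware target selection follows. Everything here is an assumption for illustration: the `Target` struct is a simplified, hypothetical view of a CDP target entry, and the marker is assumed to be a per-account token that appears somewhere in the target URL; the PR may resolve identity differently (webview label, CEF browser id, or CDP target id).

```rust
/// Hypothetical, simplified view of one CDP target entry.
struct Target {
    id: String,
    url: String,
}

/// Pick the first target whose URL starts with `url_prefix` and, when a
/// marker is supplied, also contains that marker.
fn pick_target<'a>(
    targets: &'a [Target],
    url_prefix: &str,
    marker: Option<&str>,
) -> Option<&'a Target> {
    targets.iter().find(|t| {
        t.url.starts_with(url_prefix)
            && match marker {
                Some(m) => t.url.contains(m),
                // Without a marker this degrades to prefix-only matching,
                // which is exactly the ambiguous case the review warns about.
                None => true,
            }
    })
}
```

With two open tabs sharing the same prefix, passing a marker selects the intended tab, while `None` simply returns whichever target is listed first.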
🧹 Nitpick comments (1)
app/src-tauri/src/lib.rs (1)
748-789: Consider extracting a shared dev-auto webview bootstrap helper.This block now duplicates the same startup/open pattern already used for WhatsApp/Slack (and similarly for Meet). A small helper would reduce drift risk across providers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src-tauri/src/lib.rs` around lines 748 - 789, Extract the duplicated startup/open logic into a reusable helper (eg. spawn_dev_auto_webview) and call it for Telegram instead of repeating the block: factor out the env-var read + trimming, construction of webview_accounts::OpenArgs, retrieval of app.handle().state::<webview_accounts::WebviewAccountsState>(), and the tauri::async_runtime::spawn wrapper that calls webview_accounts::webview_account_open; reuse the same helper for other providers (WhatsApp/Slack/Meet) so the Telegram-specific code only supplies the env var name "OPENHUMAN_DEV_AUTO_TELEGRAM", provider string "telegram", and any provider-specific Bounds while delegating spawning/logging to the new function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e6883c13-4dd1-4cab-bee4-5b350fb4665d
⛔ Files ignored due to path filters (1)
app/src-tauri/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
app/src-tauri/Cargo.tomlapp/src-tauri/src/lib.rsapp/src-tauri/src/webview_accounts/mod.rsapp/src/components/accounts/AddAccountModal.tsx
💤 Files with no reviewable changes (2)
- app/src-tauri/Cargo.toml
- app/src/components/accounts/AddAccountModal.tsx
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src-tauri/src/telegram_scanner/mod.rs`:
- Around line 194-211: Replace the current dedupe key (tuple of
date/sender_id/body) with the stable message/store identifier provided by
extract::Harvest: modify the seen HashSet and the rows.retain closure to dedupe
on that stable id (e.g., r.get("store_id") or the field name you added in
extract::Harvest, converting it to a String and using seen.insert(id) to decide
retain). Update the seen declaration to HashSet<String> (or appropriate type)
and keep the same retain logic in the closure (using seen.insert(key) to keep
first-seen rows); if the stable id is missing fall back to the previous
composite key as a safe fallback.
- Around line 365-372: The current logic in telegram_scanner/mod.rs uses
peer_name when peer_key_looks_clean(peer_name) returns true, which causes
renames to create new memory docs; change the memory document key logic to
always use peer_id (set key = peer_id.to_string()) so the memory layer uses the
stable, immutable id for upserts; keep peer_name as a separate field/metadata if
you need human-readable names but do not use it as the namespace/key (refer to
namespace, key, peer_key_looks_clean, peer_name, and peer_id to locate and
update this logic).
- Around line 53-94: The scanner task spawned by spawn_scanner() never stops
because ScannerRegistry::forget() only removes bookkeeping and never cancels the
background loop; change the lifecycle so the registry can stop the task: modify
spawn_scanner to accept or return a cancellation handle (e.g., a
tokio::task::JoinHandle or a CancellationToken/watch channel) and use that token
inside the loop to break and exit cleanly (check token before each sleep and
before/after scan_once), then update ScannerRegistry to store that handle/token
per account and have forget() abort/trigger cancellation and await/cleanup the
task instead of only removing it from started; ensure emit_and_persist and other
resources are safe to stop mid-cycle.
In `@app/src-tauri/src/webview_accounts/mod.rs`:
- Around line 644-658: The detached tokio::spawn calls cause race conditions
updating the ScannerRegistry (functions: ScannerRegistry::ensure_scanner and
ScannerRegistry::forget) — change the code to call and .await these async
registry methods inline instead of spawning background tasks; locate the
branches that call provider_url, try_state (to get ScannerRegistry),
telegram_target_marker and then currently tokio::spawn the ensure_scanner/forget
calls and replace those spawns with direct await calls so registry updates
complete deterministically before returning (apply the same change to the other
similar blocks around the referenced locations).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eb53c83b-b6af-483f-ba85-de522c056ab3
📒 Files selected for processing (2)
app/src-tauri/src/telegram_scanner/mod.rsapp/src-tauri/src/webview_accounts/mod.rs
pub fn spawn_scanner<R: Runtime>(
    app: AppHandle<R>,
    account_id: String,
    url_prefix: String,
    target_marker: Option<String>,
) {
    tokio::spawn(async move {
        log::info!(
            "[tg] scanner up account={} url_prefix={} target_marker={:?} interval={:?}",
            account_id,
            url_prefix,
            target_marker,
            IDB_SCAN_INTERVAL,
        );
        // Let tweb hydrate IDB before the first scan; otherwise we'd
        // race empty stores on cold start.
        sleep(Duration::from_secs(10)).await;

        loop {
            match scan_once(&account_id, &url_prefix, target_marker.as_deref()).await {
                Ok(dump) => {
                    let harvest = extract::harvest(&dump);
                    log::info!(
                        "[tg][{}] idb extract: {} msgs, {} users, {} chats, self={}",
                        account_id,
                        harvest.messages.len(),
                        harvest.users.len(),
                        harvest.chats.len(),
                        harvest.self_id.as_deref().unwrap_or("?"),
                    );
                    if !harvest.messages.is_empty() {
                        emit_and_persist(&app, &account_id, &harvest);
                    }
                }
                Err(e) => {
                    log::warn!("[tg][{}] idb scan failed: {}", account_id, e);
                }
            }
            sleep(IDB_SCAN_INTERVAL).await;
        }
    });
}
Make scanner shutdown real, not just bookkeeping.
ScannerRegistry::forget() only removes the account from started; it never stops the loop spawned by spawn_scanner(). After a close/purge, that task keeps polling forever, and reopening the same account can leave multiple pollers attaching to the same target and double-writing ingests every tick.
♻️ One straightforward way to make the lifecycle consistent
-use tokio::sync::Mutex;
+use tokio::sync::Mutex;
+use tokio::task::JoinHandle;
-pub fn spawn_scanner<R: Runtime>(
+pub fn spawn_scanner<R: Runtime>(
app: AppHandle<R>,
account_id: String,
url_prefix: String,
target_marker: Option<String>,
-) {
- tokio::spawn(async move {
+) -> JoinHandle<()> {
+ tokio::spawn(async move {
// ...
- });
+ })
}
#[derive(Default)]
pub struct ScannerRegistry {
- started: Mutex<std::collections::HashSet<String>>,
+ started: Mutex<HashMap<String, JoinHandle<()>>>,
}
impl ScannerRegistry {
pub async fn ensure_scanner<R: Runtime>(
self: &Arc<Self>,
app: AppHandle<R>,
account_id: String,
url_prefix: String,
target_marker: Option<String>,
) {
let mut g = self.started.lock().await;
- if !g.insert(account_id.clone()) {
+ if g.contains_key(&account_id) {
log::debug!("[tg] scanner already running for {}", account_id);
return;
}
- spawn_scanner(app, account_id, url_prefix, target_marker);
+ let handle = spawn_scanner(app, account_id.clone(), url_prefix, target_marker);
+ g.insert(account_id, handle);
}
pub async fn forget(&self, account_id: &str) {
let mut g = self.started.lock().await;
- g.remove(account_id);
+ if let Some(handle) = g.remove(account_id) {
+ handle.abort();
+ }
}
}
Also applies to: 577-606
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src-tauri/src/telegram_scanner/mod.rs` around lines 53 - 94, The scanner
task spawned by spawn_scanner() never stops because ScannerRegistry::forget()
only removes bookkeeping and never cancels the background loop; change the
lifecycle so the registry can stop the task: modify spawn_scanner to accept or
return a cancellation handle (e.g., a tokio::task::JoinHandle or a
CancellationToken/watch channel) and use that token inside the loop to break and
exit cleanly (check token before each sleep and before/after scan_once), then
update ScannerRegistry to store that handle/token per account and have forget()
abort/trigger cancellation and await/cleanup the task instead of only removing
it from started; ensure emit_and_persist and other resources are safe to stop
mid-cycle.
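The lifecycle described above (registry owns a handle, `forget` actually stops the loop) can be sketched without tokio. This version swaps in `std::thread` plus an `AtomicBool` stop flag, which is a cooperative stop rather than the `JoinHandle::abort` shown in the suggested diff; `Registry`, `Worker`, and the tick counter are hypothetical stand-ins for `ScannerRegistry` and `scan_once`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// One running poller: its stop flag plus the handle needed to join it.
struct Worker {
    stop: Arc<AtomicBool>,
    handle: JoinHandle<()>,
}

#[derive(Default)]
struct Registry {
    started: Mutex<HashMap<String, Worker>>,
}

impl Registry {
    /// Start a poller for `account_id` unless one already runs.
    /// Returns `true` when a new poller was started.
    fn ensure(&self, account_id: &str, ticks: Arc<AtomicUsize>) -> bool {
        let mut g = self.started.lock().unwrap();
        if g.contains_key(account_id) {
            return false;
        }
        let stop = Arc::new(AtomicBool::new(false));
        let stop_in_loop = Arc::clone(&stop);
        let handle = thread::spawn(move || loop {
            ticks.fetch_add(1, Ordering::Relaxed); // stand-in for one scan
            // Check the flag after each scan and before sleeping.
            if stop_in_loop.load(Ordering::Relaxed) {
                break;
            }
            thread::sleep(Duration::from_millis(5));
        });
        g.insert(account_id.to_string(), Worker { stop, handle });
        true
    }

    /// Signal the poller to stop and wait for it to exit.
    /// Returns `false` when no poller was registered for the account.
    fn forget(&self, account_id: &str) -> bool {
        // The lock guard is dropped at the end of this statement,
        // so the join below does not block other registry calls.
        let worker = self.started.lock().unwrap().remove(account_id);
        match worker {
            Some(w) => {
                w.stop.store(true, Ordering::Relaxed);
                w.handle.join().is_ok()
            }
            None => false,
        }
    }
}
```

A cooperative flag lets the loop finish its current cycle before exiting, which sidesteps the question of whether ingest is safe to interrupt mid-write; the trade-off is that shutdown can take up to one sleep interval.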
// De-duplicate by (date, sender_id, body); the walker can see the
// same record in multiple store snapshots, so dedupe is not optional.
let mut seen: std::collections::HashSet<(i64, String, String)> =
    std::collections::HashSet::new();
rows.retain(|r| {
    let k = (
        r.get("date").and_then(|v| v.as_i64()).unwrap_or(0),
        r.get("sender_id")
            .and_then(|v| v.as_str())
            .unwrap_or("")
            .to_string(),
        r.get("body")
            .and_then(|v| v.as_str())
            .unwrap_or("")
            .to_string(),
    );
    seen.insert(k)
});
Use a stable message identifier for dedupe.
Line 194 currently dedupes on (date, sender_id, body). Telegram timestamps are only second-granularity, so two identical messages from the same sender in the same second will collapse into one row. Carry a stable message/store id through extract::Harvest and dedupe on that instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src-tauri/src/telegram_scanner/mod.rs` around lines 194 - 211, Replace
the current dedupe key (tuple of date/sender_id/body) with the stable
message/store identifier provided by extract::Harvest: modify the seen HashSet
and the rows.retain closure to dedupe on that stable id (e.g., r.get("store_id")
or the field name you added in extract::Harvest, converting it to a String and
using seen.insert(id) to decide retain). Update the seen declaration to
HashSet<String> (or appropriate type) and keep the same retain logic in the
closure (using seen.insert(key) to keep first-seen rows); if the stable id is
missing fall back to the previous composite key as a safe fallback.
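The "stable id with composite fallback" dedupe described above could look roughly like this. The `Row` struct and its `msg_id` field are hypothetical simplifications: the real rows are JSON values, and the name of the stable id field depends on what `extract::Harvest` ends up carrying.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified row; `msg_id` is the assumed stable identifier.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    msg_id: Option<String>,
    date: i64,
    sender_id: String,
    body: String,
}

/// Dedupe key: the stable id when present, else the old composite tuple.
/// Separate variants keep an id from ever colliding with a composite key.
#[derive(Hash, PartialEq, Eq)]
enum Key {
    Id(String),
    Composite(i64, String, String),
}

/// Keep the first-seen row for each key, preserving the original order.
fn dedupe(rows: &mut Vec<Row>) {
    let mut seen: HashSet<Key> = HashSet::new();
    rows.retain(|r| {
        let key = match &r.msg_id {
            Some(id) => Key::Id(id.clone()),
            None => Key::Composite(r.date, r.sender_id.clone(), r.body.clone()),
        };
        // `insert` returns false for a key that was already present.
        seen.insert(key)
    });
}
```

Two identical messages sent in the same second survive as long as they carry different ids, while rows without an id still collapse on the composite key as before.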
// Key = peer name when clean, falling back to the raw peer id.
// `:` is reserved by the memory layer (it rewrites to `_`).
let namespace = format!("telegram-web:{account_id}");
let key = if peer_key_looks_clean(peer_name) {
    peer_name.to_string()
} else {
    peer_id.to_string()
};
Use peer_id as the memory-doc key.
Lines 368-372 switch to peer_name when it looks clean, but names are mutable. A rename will create a second memory doc instead of upserting the existing peer transcript, which breaks the “one doc per peer” behavior.
🧩 Minimal fix
- let key = if peer_key_looks_clean(peer_name) {
- peer_name.to_string()
- } else {
- peer_id.to_string()
- };
+ let key = peer_id.to_string();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- // Key = peer name when clean, falling back to the raw peer id.
- // `:` is reserved by the memory layer (it rewrites to `_`).
- let namespace = format!("telegram-web:{account_id}");
- let key = if peer_key_looks_clean(peer_name) {
-     peer_name.to_string()
- } else {
-     peer_id.to_string()
- };
+ // Key = peer name when clean, falling back to the raw peer id.
+ // `:` is reserved by the memory layer (it rewrites to `_`).
+ let namespace = format!("telegram-web:{account_id}");
+ let key = peer_id.to_string();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src-tauri/src/telegram_scanner/mod.rs` around lines 365 - 372, The
current logic in telegram_scanner/mod.rs uses peer_name when
peer_key_looks_clean(peer_name) returns true, which causes renames to create new
memory docs; change the memory document key logic to always use peer_id (set key
= peer_id.to_string()) so the memory layer uses the stable, immutable id for
upserts; keep peer_name as a separate field/metadata if you need human-readable
names but do not use it as the namespace/key (refer to namespace, key,
peer_key_looks_clean, peer_name, and peer_id to locate and update this logic).
} else if args.provider == "telegram" {
    if let Some(prefix) = provider_url(&args.provider) {
        let registry = app
            .try_state::<std::sync::Arc<crate::telegram_scanner::ScannerRegistry>>()
            .map(|s| s.inner().clone());
        if let Some(registry) = registry {
            let app_clone = app.clone();
            let acct = args.account_id.clone();
            let prefix = prefix.to_string();
            let target_marker = telegram_target_marker(&args.account_id, &args.provider);
            tokio::spawn(async move {
                registry
                    .ensure_scanner(app_clone, acct, prefix, target_marker)
                    .await;
            });
Await the registry updates instead of detaching them.
These ensure_scanner() / forget() calls are just async mutex work plus task bookkeeping, so running them in separate tokio::spawns makes lifecycle ordering nondeterministic for no real benefit. A quick open/close/reopen can race these detached tasks and leave the scanner registry out of sync with the actual webview state.
🛠️ Simpler, ordered calls
- let app_clone = app.clone();
- let acct = args.account_id.clone();
- let prefix = prefix.to_string();
- let target_marker = telegram_target_marker(&args.account_id, &args.provider);
- tokio::spawn(async move {
- registry
- .ensure_scanner(app_clone, acct, prefix, target_marker)
- .await;
- });
+ registry
+ .ensure_scanner(
+ app.clone(),
+ args.account_id.clone(),
+ prefix.to_string(),
+ telegram_target_marker(&args.account_id, &args.provider),
+ )
+ .await;
- let registry = registry.inner().clone();
- let acct = args.account_id.clone();
- tokio::spawn(async move { registry.forget(&acct).await });
+ registry.inner().forget(&args.account_id).await;
Also applies to: 790-796, 854-860
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src-tauri/src/webview_accounts/mod.rs` around lines 644 - 658, The
detached tokio::spawn calls cause race conditions updating the ScannerRegistry
(functions: ScannerRegistry::ensure_scanner and ScannerRegistry::forget) —
change the code to call and .await these async registry methods inline instead
of spawning background tasks; locate the branches that call provider_url,
try_state (to get ScannerRegistry), telegram_target_marker and then currently
tokio::spawn the ensure_scanner/forget calls and replace those spawns with
direct await calls so registry updates complete deterministically before
returning (apply the same change to the other similar blocks around the
referenced locations).
Summary
- `telegram_scanner` module for the Tauri shell that drives Telegram Web K over Chrome DevTools Protocol (CDP) to extract peer-grouped message / user / chat records from its IndexedDB, then ingests them into memory via `openhuman.memory_doc_ingest`.
- `webview_accounts` dispatch (open / close / purge) and adds an `OPENHUMAN_DEV_AUTO_TELEGRAM=<uuid>` dev-auto env var to streamline iteration.
- `main` for visibility but cannot merge until feat(webui-messaging): multi-provider webview accounts, scanners, and chat runtime #629 lands; once it does, a rebase will shrink the diff to just the 5 commits on this branch.
Problem
Issue #630 asks for parity with the Slack / WhatsApp / Discord scanners added in PR #629: Telegram Web K is the remaining unpopulated webview account. Without a scanner, opening a Telegram account in the embedded CEF webview produces no memory entries, so downstream agentic flows cannot reference Telegram conversations.
Solution
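As an illustration of the peer-grouped extraction that this section attributes to `extract.rs`, here is a minimal sketch. The `MessageRecord` type, its field names, and `group_by_peer` are assumptions made for the example; the actual record shape in the `tweb` snapshot and the PR's extraction code are not shown on this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened message record; the field names are illustrative
/// and not taken from the PR or from Telegram Web K.
#[derive(Debug, Clone, PartialEq)]
struct MessageRecord {
    peer_id: i64,
    message_id: i64,
    text: String,
}

/// Groups records by peer and orders each group by message id, so every
/// peer's conversation can be handled as one unit.
fn group_by_peer(records: Vec<MessageRecord>) -> BTreeMap<i64, Vec<MessageRecord>> {
    let mut groups: BTreeMap<i64, Vec<MessageRecord>> = BTreeMap::new();
    for record in records {
        groups.entry(record.peer_id).or_default().push(record);
    }
    for messages in groups.values_mut() {
        messages.sort_by_key(|m| m.message_id);
    }
    groups
}

fn main() {
    let records = vec![
        MessageRecord { peer_id: 1, message_id: 11, text: "second".into() },
        MessageRecord { peer_id: 2, message_id: 5, text: "other chat".into() },
        MessageRecord { peer_id: 1, message_id: 10, text: "first".into() },
    ];
    for (peer, messages) in group_by_peer(records) {
        println!("peer {peer}: {} message(s)", messages.len());
    }
}
```

A `BTreeMap` is used here only so that iteration order over peers is stable between runs; a `HashMap` would group just as well.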
- `telegram_scanner/` module (new)
  - `mod.rs`: per-account poll loop; connects to CDP on `127.0.0.1:9222`, picks the Telegram target, runs an IDB tick every 30s, and exposes `ScannerRegistry` for lifecycle management. Emits `webview:event` and POSTs `openhuman.memory_doc_ingest` so memory fills even when the main window is hidden.
  - `idb.rs`: IndexedDB walker using `IndexedDB.requestDatabaseNames` / `requestDatabase` / `requestData`, with per-store record caps.
  - `extract.rs`: peer-grouped message / user / chat extraction from the `tweb` snapshot.
- `lib.rs`: registers `ScannerRegistry` under the `cef` feature; `OPENHUMAN_DEV_AUTO_TELEGRAM=<uuid>` opens the Telegram account webview 2s after startup so the scanner has a target without manual UI clicks.
- `webview_accounts`: open/close/purge branches mirror the slack/discord handling: ensure the scanner on open, forget on close/purge.
- `cef-dll-sys` to `fix/146-location-windows` in `app/src-tauri/Cargo.toml` `[patch.crates-io]`. The vendor tauri-cef workspace pins it there, but cargo patches do not propagate through path dependencies, so without this the helper processes crash with `CefApp_0_CToCpp called with invalid version -1`.
- `vendor/tauri-cef` to `1b58f715`, which fixes the bundler to copy the entire `cef-helper/src/` tree instead of only `main.rs`.
Submission Checklist
- `yarn dev:cef` run that exercises `OPENHUMAN_DEV_AUTO_TELEGRAM` and verifies `[tg][idb]` events + a memory upsert did not complete in this session (dev-server port conflict during the scheduled run). Draft until verified end-to-end.
- `//!` module header on `telegram_scanner/mod.rs` describes the CDP approach, IDB cadence, and emission contract. Additional per-function rustdoc pending.
- `//` comments on the Telegram-specific behavior.
Impact
- `app/src-tauri`) only. No core crate changes.
- `idb.rs` bound work per tick.
- `function(){return [this].concat(arguments);}`; no DOM scraping; CDP is localhost-bound on the embedded CEF instance.
- `--features cef` is enabled, consistent with the other scanners. Default wry builds are unchanged.
Related
- `extract.rs`).
Summary by CodeRabbit