Skip to content

Conversation

@Avni2000
Copy link
Owner

@Avni2000 Avni2000 commented Feb 3, 2026

TODO:

There are bound to be inefficiencies when making such quick jumps in terms of architecture, but either the websocket or the lazy loading of the conflicted file in the web CRAWLS. It's painfully slow, even though everything functions. Typing in the custom text box is literally a few miliseconds behind.

Summary by CodeRabbit

  • New Features

    • Browser-based interactive conflict resolver with sessioned web UI, three‑way semantic merge editor, drag‑and‑drop rows/cells, and in‑browser editable resolutions.
    • New UI setting to hide non-conflict outputs.
  • Bug Fixes

    • Conflict handling unified to consistently surface Git UU (semantic) conflicts; textual-marker flows removed.
  • Documentation

    • Added and reorganized tutorial notebooks (positional encoding, Cascadia, Casanovo).
  • Tests

    • Legacy textual-conflict integration tests removed; test notebooks reorganized.
  • Chores

    • Build/tooling reworked for a separate web client bundle and React-based frontend; dependencies updated.

Closes #12

Avni2000 and others added 19 commits February 1, 2026 19:12
- Make utils web safe.
- Still missing some features such as text editor, drag and drop
- Loading/connecting to MergeNB errors.
- Webview panel kept for migratory reference.
- Implement Markdown-it-MathJax library
- Drag and drop any cell
  - Proper drop zones
  - Very slow to actually show user
- Delete cells
- Custom text
  - Not implemented correctly, we should start custom from some branch
- Excessive whitespace
-  Drop zone within cells is == row height
- No textual markers.
- Added in more testing files + improved script
- Flow:
- Choose base/local/remote -> come up with text area for a user to input whatever they'd like -> auto mark this as resolved -> allow user to go back and change the branch the custom text box is based on -> warn user that this will overwrite their current custom text if they changed anything from the original base they chose.
- Ensures no haywire deleting by WSYIWYG principles
…erge_uu.sh

Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
…on.ts

Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Co-authored-by: Avni2000 <77120766+Avni2000@users.noreply.github.com>
Remove textual conflict marker support, consolidate to semantic resolution only
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 3, 2026

📝 Walkthrough

Walkthrough

The PR removes textual conflict-marker handling and refactors the codebase toward semantic-only (Git UU) conflict detection/resolution, adds browser-safe utilities, a React-based web conflict resolver served via an HTTP/WebSocket web server, updates the build and dependency footprint for a web client bundle, and deletes headless integration tests and textual fixtures.

Changes

Cohort / File(s) Summary
Semantic conflict core
src/conflictDetector.ts, src/resolver.ts, src/types.ts, src/gitIntegration.ts, src/extension.ts
Removed textual-marker detection/resolution and related APIs; narrowed public surface to semantic (Git UU) conflicts; updated types (added base/delete, removed hasTextualConflicts); simplified isSemanticConflict; extension now manages web server lifecycle.
Browser-safe utilities & refactors
src/positionUtils.ts, src/notebookUtils.ts, src/diffUtils.ts, src/cellMatcher.ts, src/notebookParser.ts
Added browser-safe helpers for positions and notebook handling; moved compare/sort and cell-normalization into new modules; diff tokenization updated to token-level inline diffs; cellMatcher and notebookParser delegate to new utilities.
Web server & session infra
src/web/webServer.ts, src/web/index.ts, src/web/webTypes.ts, src/web/WebConflictPanel.ts
Added ConflictResolverWebServer (HTTP + WebSocket), session lifecycle and asset serving, web message contracts, and WebConflictPanel to open browser sessions and relay resolutions back to the extension.
React client & entry
src/web/client/index.tsx, src/web/client/App.tsx, src/web/client/useWebSocket.ts, src/web/client/types.ts, src/web/client/styles.ts, src/web/client/markdown.ts, src/web/client/diff.ts
New React app entry, WebSocket hook and client types, Markdown renderer with DOMPurify+MathJax, styles injector, and a diff wrapper for the web UI.
React components (resolver UI)
src/web/client/ConflictResolver.tsx, src/web/client/MergeRow.tsx, src/web/client/CellContent.tsx
New UI: three-way merge rows, per-cell editors, inline diffs, drag-and-drop reordering, branch-choice controls, resolved-content editing, and delete semantics.
Build, deps, TS config
esbuild.js, package.json, tsconfig.json
Added parallel esbuild context for web client, adjusted externals, added React/DOMPurify/markdown/ws deps and typings, enabled DOM/JSX in tsconfig; build/watch orchestrates extension + web client.
Tests & fixtures changes
src/test/simulate_merge_uu.sh, src/test/03_*.ipynb, src/test/04_*.ipynb
Simulation script generalized for selectable test suites; added several tutorial/test notebooks and updated fixtures to align with semantic-only scope.
Removed headless/integration tests & fixtures
src/test/integration/headlessResolver.ts, src/test/integration/mergeConflict.test.ts, src/test/simple-textual-conflict.ipynb
Deleted headless integration tooling, comprehensive integration tests, and a simple textual-conflict fixture—removing textual conflict validation scaffolding and related exports.
Minor exports & docs
src/webview/ConflictResolverPanel.ts, src/settings.ts, AGENTS.md
Exported escapeHtml from legacy webview module; tiny settings formatting change; AGENTS.md updated to emphasize semantic-only conflicts and the new web-based resolver.

Sequence Diagram(s)

sequenceDiagram
    participant Ext as Extension
    participant Git as Git
    participant WS as WebServer
    participant Browser as Browser/React UI
    participant User as User

    Ext->>Git: query unmerged files (UU)
    Git-->>Ext: list conflicted notebooks

    Ext->>WS: start server & open session
    WS-->>Ext: sessionId + url

    Ext->>Browser: open session URL
    Browser->>WS: WebSocket connect / "ready"
    WS-->>Ext: session connected

    Ext->>Ext: build UnifiedConflict (semantic)
    Ext->>WS: sendConflictData(sessionId, conflict)
    WS->>Browser: "conflict-data"

    Browser-->>Browser: render ConflictResolver UI
    User->>Browser: edit/drag/drop & choose resolutions
    Browser->>WS: "resolve" with resolutions
    WS-->>Ext: route resolution payload
    Ext->>Ext: apply resolutions, update notebook
    Ext->>Git: stage/commit resolved notebook
    Ext->>WS: closeSession(sessionId)
    WS->>Browser: close WebSocket
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰
I hopped through markers, tossed them away,
Opened a browser where conflicts play.
Three columns hum, cells leap in line,
WebSockets whisper — resolutions shine.
A rabbit's cheer — semantic wins today! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[MIGRATE] [WEB] Port to Web-based merge resolver' clearly and specifically describes the main change: migrating to a web-based merge resolver.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch web

Comment @coderabbitai help to get the list of available commands and usage tips.

@Avni2000 Avni2000 changed the title [MIGRATE] Port to Web based merge resolver [MIGRATE] [WEB] Port to Web-based merge resolver Feb 3, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
esbuild.js (1)

16-21: ⚠️ Potential issue | 🟡 Minor

Potential null reference when accessing location properties.

The onEnd callback accesses location.file, location.line, and location.column without checking if location exists. Some esbuild errors may not have location information, which would cause a runtime error.

🛡️ Proposed fix to guard against undefined location
 		build.onEnd((result) => {
 			result.errors.forEach(({ text, location }) => {
 				console.error(`✘ [ERROR] ${text}`);
-				console.error(`    ${location.file}:${location.line}:${location.column}:`);
+				if (location) {
+					console.error(`    ${location.file}:${location.line}:${location.column}:`);
+				}
 			});
 			console.log('[watch] build finished');
 		});
src/notebookParser.ts (1)

58-61: ⚠️ Potential issue | 🟠 Major

Bug: Output execution_count uses stale value.

Line 60 sets execution_count: cell.execution_count which is the original (potentially null or old) value, not the newly assigned sequential count. This should use the count being assigned to the cell.

🐛 Proposed fix
                     return {
                         ...cell,
                         execution_count: count++,
                         outputs: cell.outputs.map(output => {
                             if (output.output_type === 'execute_result') {
-                                return { ...output, execution_count: cell.execution_count };
+                                return { ...output, execution_count: count - 1 };
                             }
                             return output;
                         })
                     };

Note: Using count - 1 because count++ increments after assignment.

src/conflictDetector.ts (1)

61-83: ⚠️ Potential issue | 🟠 Major

Raw conflict markers now bypass detection and break parsing

If the notebook still contains <<<<<<< markers (or HTML-styled markers inside cells[].source), parseNotebook throws and we just warn + return null, so conflicts are silently skipped. Add a pre-parse raw-marker detection path (or reintroduce analyzeRawConflicts) and surface these as conflicts instead of dropping them.

As per coding guidelines: “Use analyzeRawConflicts() to detect and handle raw merge conflict markers (<<<<<<</>>>>>>>) that break JSON parsing in .ipynb files” and “Detect and handle HTML-styled conflict markers (e.g., <span><<<<<<< current</span>) in cell source code.”

🤖 Fix all issues with AI agents
In `@package.json`:
- Around line 90-93: The package.json default for the setting
"mergeNB.ui.useWebBrowser" is true but src/settings.ts's fallback in
getSettings() uses false, causing inconsistent behavior; update the fallback in
the getSettings() function (the code path that reads mergeNB.ui.useWebBrowser)
to use true so it matches the package.json declared default (ensure any local
constant or defaultValue used for useWebBrowser in settings.ts is changed from
false to true).

In `@src/resolver.ts`:
- Around line 175-184: resolutionCallback is async but
WebConflictPanel.createOrShow expects a void callback and
ConflictResolverPanel._handleResolution currently invokes it without awaiting or
catching errors; update the callback signature to (resolution:
UnifiedResolution) => Promise<void> (so resolutionCallback in resolver.ts
returns Promise<void>), then modify ConflictResolverPanel._handleResolution to
await the callback invocation and wrap it in try/catch to surface failures
(e.g., via a status update or vscode.window.showErrorMessage) when calling
applySemanticResolutions; this ensures errors from applySemanticResolutions are
observed and reported instead of being swallowed when the panel is disposed.

In `@src/settings.ts`:
- Around line 55-56: The setting ui.hideNonConflictOutputs used in settings.ts
(hideNonConflictOutputs) is not declared in package.json, so add a corresponding
property under contributes.configuration.properties named
"ui.hideNonConflictOutputs" with "type": "boolean", a clear "description", and
"default": true (matching the value in settings.ts); optionally include it near
the existing "ui.useWebBrowser" entry to keep related UI settings together so VS
Code will surface it in the Settings UI.

In `@src/test/03_incoming.ipynb`:
- Around line 515-533: The positional_encoding function currently ignores its
lambda_min and lambda_max parameters when building wavelengths; update the call
to make_wavelength_list inside positional_encoding to pass through lambda_min
and lambda_max so the custom wavelength range is used (refer to
positional_encoding and make_wavelength_list to locate the call and signature).
- Around line 639-677: Two print calls in the fine-dimension block use f-strings
without any placeholders (Ruff F541): change the header print(f"Cosine
similarity in FINE dimensions (0-56):") and the final message print(f"\n→ Fine
wavelengths detect the 0.1 Da difference that coarse wavelengths missed!") to
regular strings (remove the leading f) while leaving the other prints that
format sim_nearby_fine and sim_distant_fine unchanged; locate these around the
cosine similarity computation that uses sim_nearby_fine and sim_distant_fine and
the encode_1000/encode_1000_1/encode_126 variables.
- Around line 427-462: Two notebook cells duplicate imports for numpy and
matplotlib: remove the duplicate imports in one of the cells (either the code
cell with id "7ba18756" or the later code cell with id "43675ed8") and keep a
single import statement (import numpy as np and import matplotlib.pyplot as plt)
in a single, appropriate cell (retain the %pip install scikit-learn line if
needed in its cell); update the notebook so only one cell contains those import
lines to avoid redundancy and improve readability.

In `@src/test/04_incoming.ipynb`:
- Around line 32-49: Update the markdown narrative to correct two typos: change
"becuase" to "because" in the sentence describing DDA precursor selection and
change "previouly" to "previously" in the Cascadia architecture paragraph where
it lists "the previouly predicted amino acids"; verify the replacements in the
markdown cell containing the DDA/DIA description and the Cascadia architecture
paragraph (look for the phrases "Because of the precursor selection" and "the
previouly predicted amino acids" to locate the spots).

In `@src/test/simulate_merge_uu.sh`:
- Line 52: The echo statement in simulate_merge_uu.sh mixes quoting and should
be fixed to avoid ShellCheck SC2140: replace the current mixed-quote line (the
echo that prints "'gum' CLI not found. Identify test "02", "03", or "04" as a
parameter.") with a consistently quoted string, e.g. keep outer double quotes
and escape the inner double quotes around the test numbers (or use outer single
quotes and keep inner double quotes) so the message becomes something like: echo
"'gum' CLI not found. Identify test \"02\", \"03\", or \"04\" as a parameter."
ensuring the echo in the script uses consistent quoting.
- Around line 45-50: Under set -e the `gum choose` non‑zero exit can abort the
script before the fallback runs; change the block that invokes `gum choose` so
its failure is handled (for example run `gum choose` in an if or append `||
true`), then set CHOICE to the selected value or default to TEST_SUITE when `gum
choose` fails or returns empty. Update the code referencing `CHOICE` and the
`gum choose` invocation so they explicitly handle errors (e.g., `if gum choose
...; then CHOICE=...; else CHOICE="$TEST_SUITE"; fi`) ensuring the fallback
logic executes even with set -e.

In `@src/web/client/CellContent.tsx`:
- Around line 95-100: The component injects raw HTML into the DOM (CellContent
using containerRef and dangerouslySetInnerHTML with the html variable, and the
output renderer that uses data['text/html']), which enables XSS; fix by adding
DOMPurify to dependencies, import DOMPurify in this file, and call
DOMPurify.sanitize(...) on the html value before passing it to
dangerouslySetInnerHTML and likewise sanitize any data['text/html'] payloads in
the output renderer (the variable/prop that holds output HTML) so only the
sanitized string is injected; keep the rest of the rendering logic the same.

In `@src/web/client/diff.ts`:
- Around line 37-50: The modified-case logic in getDiffLineClass currently
returns 'modified-old' only for side === 'current' and 'modified-new' for
everything else, which mislabels the 'base' side; update the 'modified' branch
in getDiffLineClass so that 'base' returns 'modified-old' and the other sides
(e.g., 'current' and 'incoming' as appropriate) return 'modified-new' — locate
the getDiffLineClass function and change the conditional in the 'modified' case
to check for side === 'base' (return 'diff-line modified-old') otherwise return
'diff-line modified-new'.

In `@src/web/client/markdown.ts`:
- Around line 15-19: The MarkdownIt instance is configured with html: true which
allows raw HTML and is being inserted into the DOM unsanitized; update the
rendering path (the MarkdownIt instance `md` or the `renderMarkdown()` function
that uses it) to sanitize the HTML output with a trusted sanitizer (e.g.,
DOMPurify) before returning/assigning it to dangerouslySetInnerHTML or innerHTML
(used by CellContent.tsx and ConflictResolverPanel.ts) — alternatively, disable
`html: true` on the `md` config if you intend to forbid inline HTML; ensure the
chosen change is applied where `md.render`/`renderMarkdown()` is called so no
unsanitized HTML ever reaches the DOM.

In `@src/web/client/types.ts`:
- Around line 44-64: The AutoResolveResult interface in this file is mismatched
with the extension payload; update the type used by UnifiedConflictData to reuse
the shared definition from src/web/webTypes.ts (or rename fields to match
conflictDetector.AutoResolveResult) so UI fields are populated correctly—replace
the local AutoResolveResult declaration with an import of the shared type and
update UnifiedConflictData.autoResolveResult to use that imported type
(referencing the AutoResolveResult symbol and UnifiedConflictData interface).

In `@src/web/WebConflictPanel.ts`:
- Around line 110-119: The payload built in WebConflictPanel (inside the
_sendConflictData/_conflict handling) omits the required fileName property
declared by WebConflictData; update the data object to include fileName (e.g.,
derive basename from this._conflict.filePath) so the UI gets a display name
instead of undefined—locate the const data in WebConflictPanel.ts and add
fileName: pathBasename(this._conflict.filePath) (or use an existing utility to
compute basename) before sending.
- Around line 151-171: In _handleResolution, validate each incoming
resolution.choice against an explicit whitelist of allowed values from
ConflictChoice (e.g., 'base', 'current', 'incoming', 'both', 'delete') before
casting; update the semanticResolutionMap population logic in
WebConflictPanel._handleResolution to either map supported values to your
internal union (expanding the union to include 'both'/'delete' if they are
meaningful) or skip/throw/log and reject unsupported values, and ensure the
object passed to _onResolutionComplete uses the validated/normalized choice and
not an unchecked cast from message.resolutions.

In `@src/web/webServer.ts`:
- Around line 391-449: The MathJax config (window.MathJax object and
rerenderMath utility) must be defined before loading the external MathJax script
and the script tag should use defer instead of async; move the entire
window.MathJax = { ... } block and the window.rerenderMath definition so they
appear above the <script
src="https://cdn.jsdelivr.net/npm/mathjax@3/..."></script> tag, and change that
tag's attribute from async to defer so the external script executes after the
config is parsed and the DOM is ready (ensure references to
MathJax.typesetPromise and MathJax.startup.ready remain unchanged).

In `@src/web/webTypes.ts`:
- Around line 135-146: The BrowserToExtensionMessage 'resolve' variant currently
declares choice: string and customContent, which mismatches the actual browser
payload and the extension handler; update the type in webTypes.ts so the
resolution object uses resolvedContent?: string (not customContent) and make
choice a discriminated union matching the real choices used by
src/web/client/types.ts and WebConflictPanel._handleResolution (e.g., the exact
string literals the client sends such as
'accept-local'|'accept-model'|'accept-both' or whatever the client defines);
also verify semanticChoice and markAsResolved optional flags remain and adjust
any code consuming BrowserToExtensionMessage to the new field names.
🧹 Nitpick comments (21)
src/gitIntegration.ts (2)

148-170: Inconsistent function naming convention.

getcurrentVersion and getincomingVersion should use consistent camelCase: getCurrentVersion and getIncomingVersion.

♻️ Proposed fix
-export async function getcurrentVersion(filePath: string): Promise<string | null> {
+export async function getCurrentVersion(filePath: string): Promise<string | null> {
-export async function getincomingVersion(filePath: string): Promise<string | null> {
+export async function getIncomingVersion(filePath: string): Promise<string | null> {

Also update the call site at lines 203-204:

     const [base, current, incoming] = await Promise.all([
         getBaseVersion(filePath),
-        getcurrentVersion(filePath),
-        getincomingVersion(filePath)
+        getCurrentVersion(filePath),
+        getIncomingVersion(filePath)
     ]);

213-214: Remove unused content parameter.

The content parameter is not used in the function implementation—only filePath is passed to isUnmergedFile(). No internal callers were found for this function.

♻️ Proposed fix
-export async function isSemanticConflict(filePath: string, content: string): Promise<boolean> {
+export async function isSemanticConflict(filePath: string): Promise<boolean> {
     return await isUnmergedFile(filePath);
 }

Note: If this is part of a public API, verify no external consumers depend on the current signature before removing the parameter.

src/positionUtils.ts (1)

1-7: Minor formatting: leading whitespace.

Line 1 has extra leading whitespace before the docstring. This appears unintentional.

♻️ Proposed fix
-    /**
+/**
src/web/client/useWebSocket.ts (2)

60-62: Consider exposing error state to callers.

The error handler logs but doesn't expose error state, making it difficult for consuming components to show connection error UI to users.

♻️ Proposed enhancement
+const [error, setError] = useState<Event | null>(null);

 ws.onerror = (err) => {
     console.error('[MergeNB] WebSocket error:', err);
+    setError(err);
 };

-return { connected, sessionId, conflictData, sendMessage };
+return { connected, sessionId, conflictData, sendMessage, error };

Update the interface:

 interface UseWebSocketResult {
     connected: boolean;
     sessionId: string | null;
     conflictData: UnifiedConflictData | null;
     sendMessage: (message: object) => void;
+    error: Event | null;
 }

54-67: No automatic reconnection on disconnect.

If the WebSocket connection drops (network issues, server restart), users must manually refresh. Given the PR description mentions performance concerns, consider adding basic reconnection logic with backoff.

This could be deferred to a follow-up, but documenting the limitation would help:

 ws.onclose = () => {
     console.log('[MergeNB] WebSocket closed');
     setConnected(false);
+    // TODO: Consider implementing reconnection with exponential backoff
 };
src/webview/ConflictResolverPanel.ts (1)

2597-2604: Consider re-exporting escapeHtml from shared notebook utilities to avoid drift.

If the shared notebook utils already expose escapeHtml, re-exporting it here keeps a single source of truth and avoids divergence.

Example refactor (if notebookUtils already exports escapeHtml)
-export function escapeHtml(text: string): string {
-    return text
-        .replace(/&/g, '&amp;')
-        .replace(/</g, '&lt;')
-        .replace(/>/g, '&gt;')
-        .replace(/"/g, '&quot;')
-        .replace(/'/g, '&#039;');
-}
+export { escapeHtml } from '../notebookUtils';
src/web/client/styles.ts (2)

197-205: Duplicate .cell-content selector blocks can be merged.

There are two consecutive .cell-content blocks (lines 197-201 and 203-205). Merging them improves maintainability.

♻️ Proposed merge of duplicate selectors
 .cell-content {
     font-family: 'SF Mono', Monaco, 'Cascadia Code', monospace;
     font-size: 13px;
     line-height: 1.5;
-}
-
-.cell-content {
     color: var(--text-primary);
 }

726-738: Duplicate .cell-placeholder selector overrides earlier definition.

This selector (lines 726-738) duplicates and partially overrides the one defined at lines 238-254. Consider consolidating these into a single definition to avoid confusion about which styles apply.

src/diffUtils.ts (2)

304-305: Move import to the top of the file.

The escapeHtml import is placed at the end of the file, which is unconventional and makes dependencies harder to discover. TypeScript allows this but standard practice is to place imports at the top.

♻️ Move import to top of file
 * - Used by the webview panel to render diff highlighting
 */

+import { escapeHtml } from './notebookUtils';
+
 export interface DiffLine {

And remove lines 304-305.


174-180: Remove the unused tokenize function.

The deprecated tokenize function is not exported and has no usage anywhere in the codebase. Removing it eliminates dead code without affecting the public API.

src/cellMatcher.ts (1)

115-124: Greedy hash matching may produce suboptimal pairings.

Using indexOf finds the first hash match, which may not be optimal when multiple cells have identical content. The second pass (similarity-based) helps, but consider whether cells with duplicate content in different positions could be mismatched. This is likely acceptable for typical notebooks but worth noting.

src/extension.ts (1)

156-159: Empty catch block silently swallows errors.

The catch block on line 157 ignores all errors without any logging. While ignoring errors for decoration providers may be intentional to avoid spam, consider logging at debug level for troubleshooting.

♻️ Optional: Add debug logging
 			} catch {
-				// Ignore errors
+				// Decoration errors are non-critical, but log for debugging
+				// console.debug('[MergeNB] Decoration check failed:', uri.fsPath);
 			}
src/resolver.ts (2)

396-399: Unused method readFile.

The readFile method is defined but never used within this file. Consider removing it to reduce dead code, or verify if it's intended for future use.

🧹 Remove unused method
-    private async readFile(uri: vscode.Uri): Promise<string> {
-        const data = await vscode.workspace.fs.readFile(uri);
-        return new TextDecoder().decode(data);
-    }
-

353-362: Non-null assertions on template notebook metadata.

Lines 358-360 use non-null assertions (templateNotebook!, metadataSource!) assuming at least one notebook exists. While the earlier check at line 235-238 guards against all being undefined, a more defensive approach would ensure these assertions are safe.

🛡️ Add explicit fallback handling
         // Build the final notebook
         // Use auto-resolved notebook metadata if available (preserves kernel auto-resolution)
         const metadataSource = autoResolvedNotebook || currentNotebook || incomingNotebook || baseNotebook;
         const templateNotebook = currentNotebook || incomingNotebook || baseNotebook;
+        
+        if (!templateNotebook || !metadataSource) {
+            vscode.window.showErrorMessage('Cannot build resolved notebook: no valid template available.');
+            return;
+        }
+        
         let resolvedNotebook: Notebook = {
-            nbformat: templateNotebook!.nbformat,
-            nbformat_minor: templateNotebook!.nbformat_minor,
-            metadata: JSON.parse(JSON.stringify(metadataSource!.metadata)),
+            nbformat: templateNotebook.nbformat,
+            nbformat_minor: templateNotebook.nbformat_minor,
+            metadata: JSON.parse(JSON.stringify(metadataSource.metadata)),
             cells: resolvedCells
         };
src/notebookParser.ts (1)

43-44: Import placed after export statement.

The import from ./notebookUtils is placed after the re-export statements. While valid, conventional TypeScript style places all imports at the top of the file for better readability.

📦 Move import to top of file
 import { Notebook, NotebookCell } from './types';
+import { normalizeCellSource } from './notebookUtils';

 // Re-export browser-safe utilities for convenience
 export { normalizeCellSource, sourceToCellFormat, getCellPreview } from './notebookUtils';

-// Import from notebookUtils for internal use
-import { normalizeCellSource } from './notebookUtils';
src/types.ts (1)

178-183: Inconsistent choice options between ResolutionChoice and SemanticConflictResolution.choice.

ResolutionChoice (line 110) includes 'both' as an option, but SemanticConflictResolution.choice (line 180) does not include 'both'. This inconsistency may cause confusion or type errors when mapping between these types.

If 'both' is a valid semantic resolution option (combining current + incoming content), consider aligning the types:

📝 Option 1: Add 'both' to SemanticConflictResolution
 export interface SemanticConflictResolution {
     conflict: SemanticConflict;
-    choice: 'base' | 'current' | 'incoming' | 'delete';
+    choice: 'base' | 'current' | 'incoming' | 'both' | 'delete';
     /** The resolved content from the editable text area (source of truth) */
     resolvedContent?: string;
 }

Alternatively, if 'both' should only apply to textual conflicts, document the distinction clearly.

src/web/client/ConflictResolver.tsx (2)

8-8: Unused import: normalizeCellSource.

The normalizeCellSource function is imported but not used in this file. It's used in MergeRow.tsx instead.

🧹 Remove unused import
 import { sortByPosition } from '../../positionUtils';
-import { normalizeCellSource } from '../../notebookUtils';
 import type {

20-20: Local ResolutionChoice type duplicates shared type.

This local type definition duplicates ResolutionChoice from src/types.ts. Consider importing from the shared types module to maintain consistency and avoid drift.

📝 Import from shared types
+import type { ResolutionChoice } from '../../types';
 import type {
     UnifiedConflictData,
     MergeRow as MergeRowType,
     ConflictChoice,
     NotebookSemanticConflict,
     CellMapping,
     SemanticConflict,
     NotebookCell,
 } from './types';
 import { MergeRow } from './MergeRow';
-
-type ResolutionChoice = 'base' | 'current' | 'incoming' | 'both' | 'delete';
src/notebookUtils.ts (1)

25-28: Edge case: empty string input to sourceToCellFormat.

When source is an empty string, source.split('\n') returns [''], so this function returns [''] instead of []. This may or may not be the desired behavior for an empty cell.

This is a minor edge case and may be intentional to preserve cell presence, but worth documenting.

📝 Option: Handle empty source explicitly
 export function sourceToCellFormat(source: string): string[] {
+    if (source === '') {
+        return [];
+    }
     const lines = source.split('\n');
     return lines.map((line, i) => i < lines.length - 1 ? line + '\n' : line);
 }
src/web/client/MergeRow.tsx (1)

17-39: Duplicate type definitions across components.

The types ResolutionChoice, DraggedCellData, DropTarget, and ResolutionState are duplicated in both MergeRow.tsx and ConflictResolver.tsx. Consider extracting these to a shared types file (e.g., src/web/client/types.ts) to avoid drift and reduce maintenance burden.

📦 Extract shared types

Add to src/web/client/types.ts:

export type ResolutionChoice = 'base' | 'current' | 'incoming' | 'both' | 'delete';

export interface ResolutionState {
    choice: ResolutionChoice;
    originalContent: string;
    resolvedContent: string;
}

export interface DraggedCellData {
    rowIndex: number;
    side: 'base' | 'current' | 'incoming';
    cell: NotebookCell;
}

export interface DropTarget {
    rowIndex: number;
    side: 'base' | 'current' | 'incoming';
}

Then import in both components.

src/conflictDetector.ts (1)

52-109: Remove the LEGO-specific debug logging from the hot path

The hard-coded "2.3 The Lego Analogy" block spams logs and can leak notebook content. Please remove or guard it behind a debug flag to avoid performance/PII noise.

Suggested fix
-        // Debug: Check if we're getting different versions
-        console.log('[MergeNB] detectSemanticConflicts for:', filePath);
-        console.log('[MergeNB] base length:', base?.length ?? 0);
-        console.log('[MergeNB] current length:', current?.length ?? 0);
-        console.log('[MergeNB] incoming length:', incoming?.length ?? 0);
-        console.log('[MergeNB] base === current:', base === current);
-        console.log('[MergeNB] base === incoming:', base === incoming);
-        console.log('[MergeNB] current === incoming:', current === incoming);
@@
-        // Debug: Check parsed notebooks
-        if (baseNotebook && currentNotebook && incomingNotebook) {
-            const baseLegoCell = baseNotebook.cells.find(c => {
-                const src = Array.isArray(c.source) ? c.source.join('') : c.source;
-                return src.includes('2.3 The Lego Analogy');
-            });
-            const currentLegoCell = currentNotebook.cells.find(c => {
-                const src = Array.isArray(c.source) ? c.source.join('') : c.source;
-                return src.includes('2.3 The Lego Analogy');
-            });
-            const incomingLegoCell = incomingNotebook.cells.find(c => {
-                const src = Array.isArray(c.source) ? c.source.join('') : c.source;
-                return src.includes('2.3 The Lego Analogy');
-            });
-            
-            if (baseLegoCell) {
-                const baseSrc = Array.isArray(baseLegoCell.source) ? baseLegoCell.source.join('') : baseLegoCell.source;
-                const currentSrc = currentLegoCell ? (Array.isArray(currentLegoCell.source) ? currentLegoCell.source.join('') : currentLegoCell.source) : '';
-                const incomingSrc = incomingLegoCell ? (Array.isArray(incomingLegoCell.source) ? incomingLegoCell.source.join('') : incomingLegoCell.source) : '';
-                
-                console.log('[MergeNB] LEGO CELL PARSED:');
-                console.log('[MergeNB] - base has "Key insight":', baseSrc.includes('Key insight'));
-                console.log('[MergeNB] - current has "Key insight":', currentSrc.includes('Key insight'));
-                console.log('[MergeNB] - incoming has "Key insight":', incomingSrc.includes('Key insight'));
-            }
-        }

Comment on lines +427 to +462
"cell_type": "code",
"execution_count": null,
"id": "7ba18756",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"id": "cb9185d0",
"metadata": {},
"source": [
"\n",
"---\n",
"\n",
"# Part II: Implementing Positional Encoding\n",
"\n",
"## 3.10 Code Implementation\n",
"\n",
"The intuition + math was the hard part. Let's implement the positional encoding (in ~10 lines of code) and visualize how it works."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43675ed8",
"metadata": {},
"outputs": [],
"source": [
"%pip install scikit-learn\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt"
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, check if the notebook file exists and read it
wc -l src/test/03_incoming.ipynb

Repository: Avni2000/MergeNB

Length of output: 90


🏁 Script executed:

# Read the notebook to verify the imports at the specified lines
cat -n src/test/03_incoming.ipynb | sed -n '420,470p'

Repository: Avni2000/MergeNB

Length of output: 2148


🏁 Script executed:

# Check for Ruff configuration in the repository
find . -name "ruff.toml" -o -name "pyproject.toml" -o -name ".ruff.toml" | head -5

Repository: Avni2000/MergeNB

Length of output: 71


🏁 Script executed:

# Check if there's a pyproject.toml with Ruff configuration
if [ -f pyproject.toml ]; then cat pyproject.toml; fi

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Read the Ruff configuration
cat ruff.toml

Repository: Avni2000/MergeNB

Length of output: 966


🏁 Script executed:

# Check if notebooks are mentioned in any ignore patterns or include patterns
cat ruff.toml | grep -i "notebook\|ipynb\|exclude\|include"

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Check if there are any exclude patterns or if notebooks are typically checked by Ruff
# First, let's check the repository root for any other Ruff or configuration files
ls -la | grep -E "ruff|pyproject|setup.cfg"

Repository: Avni2000/MergeNB

Length of output: 120


🏁 Script executed:

# Check if Ruff is configured to exclude .ipynb files anywhere
grep -r "ipynb\|notebook" . --include="*.toml" --include="*.cfg" --include="*.ini" 2>/dev/null | head -20

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Let's also check if there are any CI/CD configs that might run Ruff on notebooks
find . -name "*.yml" -o -name "*.yaml" | xargs grep -l "ruff\|lint" 2>/dev/null | head -5

Repository: Avni2000/MergeNB

Length of output: 84


🏁 Script executed:

# Check if there's a GitHub Actions workflow that runs Ruff
find .github/workflows -name "*.yml" -o -name "*.yaml" 2>/dev/null | head -10

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Check the dependabot config and any workflow files
cat .github/dependabot.yml

Repository: Avni2000/MergeNB

Length of output: 1194


🏁 Script executed:

# Look for any GitHub Actions workflows
ls -la .github/workflows/ 2>/dev/null || echo "No workflows directory"

Repository: Avni2000/MergeNB

Length of output: 82


🏁 Script executed:

# Check if there's a linting workflow or pre-commit config
cat .pre-commit-config.yaml 2>/dev/null || echo "No pre-commit config"

Repository: Avni2000/MergeNB

Length of output: 80


🏁 Script executed:

# Verify that Ruff by default does NOT check .ipynb files
# Check if there's any nbqa configuration or notebook-specific Ruff setup
grep -r "nbqa\|notebook" . --include="*.toml" --include="*.py" --include="*.txt" 2>/dev/null | head -20

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Check the full ruff.toml configuration to see if there are any default includes or excludes
cat ruff.toml | grep -A 5 -B 5 "extend-include\|extend-exclude\|include\|exclude" || echo "No include/exclude patterns found"

Repository: Avni2000/MergeNB

Length of output: 93


🏁 Script executed:

# Verify the actual structure of the notebook cells to ensure imports are truly duplicated
python3 << 'EOF'
import json

with open('src/test/03_incoming.ipynb', 'r') as f:
    nb = json.load(f)

# Find all code cells and their imports
for i, cell in enumerate(nb['cells']):
    if cell['cell_type'] == 'code':
        source = ''.join(cell['source'])
        if 'import' in source:
            print(f"Cell {i} (id: {cell.get('id', 'N/A')}):")
            print(source)
            print("---")
EOF

Repository: Avni2000/MergeNB

Length of output: 1544


Consolidate duplicate numpy/matplotlib imports across cells.

You import numpy and matplotlib.pyplot in two separate code cells (Cell 6 and Cell 8). While Ruff does not currently lint notebooks in this repository, consolidating these imports into a single cell is a best practice to avoid unnecessary duplication and improve notebook readability.

Suggested consolidation
- %pip install scikit-learn
- import numpy as np
- import matplotlib.pyplot as plt
+ %pip install scikit-learn
🤖 Prompt for AI Agents
In `@src/test/03_incoming.ipynb` around lines 427 - 462, Two notebook cells
duplicate imports for numpy and matplotlib: remove the duplicate imports in one
of the cells (either the code cell with id "7ba18756" or the later code cell
with id "43675ed8") and keep a single import statement (import numpy as np and
import matplotlib.pyplot as plt) in a single, appropriate cell (retain the %pip
install scikit-learn line if needed in its cell); update the notebook so only
one cell contains those import lines to avoid redundancy and improve
readability.

Comment on lines +515 to +533
"def positional_encoding(m_z, dimensions=512, lambda_min=0.001, lambda_max=10000):\n",
" \"\"\"\n",
" Encode a single m/z value into a d_model-dimensional vector.\n",
" \n",
" First d_model/2 dimensions: sine encoding\n",
" Last d_model/2 dimensions: cosine encoding\n",
" \n",
" Args:\n",
" m_z: The m/z value to encode\n",
" d_model: Total dimensionality of the encoding\n",
" lambda_min, lambda_max: Wavelength range\n",
" \n",
" Returns:\n",
" numpy array of shape (d_model,)\n",
" \"\"\"\n",
" encoding = np.zeros(dimensions)\n",
" d_sin = dimensions // 2\n",
" wavelengths = make_wavelength_list(dimensions)\n",
" for d in range(d_sin):\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the notebook file around lines 515-533
cat -n src/test/03_incoming.ipynb | sed -n '500,550p'

Repository: Avni2000/MergeNB

Length of output: 2225


🏁 Script executed:

# Also search for the make_wavelength_list function definition
rg "def make_wavelength_list" -A 5

Repository: Avni2000/MergeNB

Length of output: 1442


🏁 Script executed:

# Let's get more context from the notebook to understand the full structure
wc -l src/test/03_incoming.ipynb

Repository: Avni2000/MergeNB

Length of output: 90


Pass lambda_min and lambda_max arguments to make_wavelength_list.

The positional_encoding function accepts lambda_min and lambda_max parameters but ignores them when calling make_wavelength_list, causing custom wavelength ranges to have no effect.

Proposed fix
-    wavelengths = make_wavelength_list(dimensions)
+    wavelengths = make_wavelength_list(dimensions, lambda_min, lambda_max)
🤖 Prompt for AI Agents
In `@src/test/03_incoming.ipynb` around lines 515 - 533, The positional_encoding
function currently ignores its lambda_min and lambda_max parameters when
building wavelengths; update the call to make_wavelength_list inside
positional_encoding to pass through lambda_min and lambda_max so the custom
wavelength range is used (refer to positional_encoding and make_wavelength_list
to locate the call and signature).

Comment on lines +110 to +119
// Build the data payload for the React app
const data = {
filePath: this._conflict.filePath,
type: this._conflict.type,
semanticConflict: this._conflict.semanticConflict,
autoResolveResult: this._conflict.autoResolveResult,
hideNonConflictOutputs: this._conflict.hideNonConflictOutputs,
currentBranch: this._conflict.semanticConflict?.currentBranch,
incomingBranch: this._conflict.semanticConflict?.incomingBranch,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Include fileName in the conflict payload

WebConflictData declares fileName as required, but _sendConflictData omits it. If the UI expects a display name, it will render undefined. Add a basename here (or make the type optional).

Suggested fix
+import * as path from 'path';
@@
         const data = {
             filePath: this._conflict.filePath,
+            fileName: path.basename(this._conflict.filePath),
             type: this._conflict.type,
🤖 Prompt for AI Agents
In `@src/web/WebConflictPanel.ts` around lines 110 - 119, The payload built in
WebConflictPanel (inside the _sendConflictData/_conflict handling) omits the
required fileName property declared by WebConflictData; update the data object
to include fileName (e.g., derive basename from this._conflict.filePath) so the
UI gets a display name instead of undefined—locate the const data in
WebConflictPanel.ts and add fileName: pathBasename(this._conflict.filePath) (or
use an existing utility to compute basename) before sending.

Comment on lines 151 to 171
private _handleResolution(message: {
type?: string;
resolutions?: Array<{ index: number; choice: string; resolvedContent: string }>;
semanticChoice?: string;
markAsResolved?: boolean
}): void {
if (this._conflict?.type === 'semantic') {
const semanticResolutionMap = new Map<number, { choice: 'base' | 'current' | 'incoming'; resolvedContent: string }>();
for (const r of (message.resolutions || [])) {
semanticResolutionMap.set(r.index, {
choice: r.choice as 'base' | 'current' | 'incoming',
resolvedContent: r.resolvedContent
});
}
if (this._onResolutionComplete) {
this._onResolutionComplete({
type: 'semantic',
semanticChoice: message.semanticChoice as 'current' | 'incoming' | undefined,
semanticResolutions: semanticResolutionMap,
markAsResolved: message.markAsResolved ?? false
});
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate resolution choices (and handle both/delete) before building the map

The browser sends ConflictChoice.choice values that include both/delete (see src/web/client/types.ts Line 72-77). Here you cast to 'base' | 'current' | 'incoming' without validation, so unsupported choices can slip through and be misinterpreted downstream. Add a whitelist + update the union to reflect the real choices (or reject unsupported values explicitly).

Suggested fix
-        if (this._conflict?.type === 'semantic') {
-            const semanticResolutionMap = new Map<number, { choice: 'base' | 'current' | 'incoming'; resolvedContent: string }>();
-            for (const r of (message.resolutions || [])) {
-                semanticResolutionMap.set(r.index, {
-                    choice: r.choice as 'base' | 'current' | 'incoming',
-                    resolvedContent: r.resolvedContent
-                });
-            }
+        if (this._conflict?.type === 'semantic') {
+            const semanticResolutionMap = new Map<number, { choice: 'base' | 'current' | 'incoming' | 'both' | 'delete'; resolvedContent: string }>();
+            const allowedChoices = new Set(['base', 'current', 'incoming', 'both', 'delete']);
+            for (const r of (message.resolutions || [])) {
+                if (!allowedChoices.has(r.choice) || typeof r.resolvedContent !== 'string') {
+                    logger.warn('[WebConflictPanel] Ignoring invalid resolution payload:', r);
+                    continue;
+                }
+                semanticResolutionMap.set(r.index, {
+                    choice: r.choice as 'base' | 'current' | 'incoming' | 'both' | 'delete',
+                    resolvedContent: r.resolvedContent
+                });
+            }
🤖 Prompt for AI Agents
In `@src/web/WebConflictPanel.ts` around lines 151 - 171, In _handleResolution,
validate each incoming resolution.choice against an explicit whitelist of
allowed values from ConflictChoice (e.g., 'base', 'current', 'incoming', 'both',
'delete') before casting; update the semanticResolutionMap population logic in
WebConflictPanel._handleResolution to either map supported values to your
internal union (expanding the union to include 'both'/'delete' if they are
meaningful) or skip/throw/log and reject unsupported values, and ensure the
object passed to _onResolutionComplete uses the validated/normalized choice and
not an unchecked cast from message.resolutions.

Copy link
Owner Author

@Avni2000 Avni2000 Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai
Make this a proper issue. Good catch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create GitHub issue for resolution choice validation problem

gh issue create \
  --title "Validate resolution choices in WebConflictPanel before casting" \
  --body "## Problem

In \`src/web/WebConflictPanel.ts\` (lines 151-171), the \`_handleResolution\` method receives resolution choices from the browser that can include \`'both'\` and \`'delete'\` values (as defined in \`src/web/client/types.ts\` lines 72-77). However, the code casts these values to a narrower type \`'base' | 'current' | 'incoming'\` without validation:

\`\`\`typescript
const semanticResolutionMap = new Map<number, { choice: 'base' | 'current' | 'incoming'; resolvedContent: string }>();
for (const r of (message.resolutions || [])) {
    semanticResolutionMap.set(r.index, {
        choice: r.choice as 'base' | 'current' | 'incoming',
        resolvedContent: r.resolvedContent
    });
}
\`\`\`

## Impact

Unsupported choice values (\`'both'\` or \`'delete'\`) can slip through the cast and be misinterpreted downstream, potentially causing:
- Type safety violations at runtime
- Incorrect resolution behavior
- Silent failures or unexpected merge results

## Suggested Fix

1. Add validation with an explicit whitelist of allowed values
2. Either:
   - Expand the union type to include \`'both' | 'delete'\` if they are meaningful for semantic resolutions, OR
   - Reject/skip/log unsupported values explicitly
3. Ensure \`resolvedContent\` is also validated as a string

Example:
\`\`\`typescript
const allowedChoices = new Set(['base', 'current', 'incoming', 'both', 'delete']);
for (const r of (message.resolutions || [])) {
    if (!allowedChoices.has(r.choice) || typeof r.resolvedContent !== 'string') {
        logger.warn('[WebConflictPanel] Ignoring invalid resolution payload:', r);
        continue;
    }
    // ... proceed with validated data
}
\`\`\`

## References

- PR: https://github.com/Avni2000/MergeNB/pull/16
- Review comment: https://github.com/Avni2000/MergeNB/pull/16#discussion_r2756704966
- Related types: \`src/web/client/types.ts\` (ConflictChoice interface)
- Requested by: `@Avni2000`" \
  --assignee Avni2000

Length of output: 45


@Avni2000, I've created issue #19 to track this validation problem: #19

The issue documents:

  • The location of the unsafe cast in WebConflictPanel._handleResolution (lines 151-171)
  • The type mismatch between browser choices ('both' | 'delete' included) and the narrower cast
  • Potential impact (runtime type violations, incorrect resolutions, silent failures)
  • Suggested validation approach with a whitelist
  • Backlinks to this PR and review comment

The issue has been assigned to you. Thanks for catching this early! 🐰

🐰💭
/_/
( o.o ) thinking about type safety...

^


🧠 Learnings used
Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/webview/**/*.{ts,tsx} : Provide a unified conflict resolution UI in `ConflictResolverPanel` webview supporting both textual and semantic conflict resolution with 3-way diff view

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/**/resolver.ts : Use the `merge-nb.findConflicts` command as the single unified entry point to find notebooks with merge conflicts (both textual and semantic) and bring up the conflict resolution panel

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/**/conflictDetector.ts : Implement semantic conflict detection for Git `UU` status notebooks that lack textual markers but have different execution states, outputs, or cell modifications between branches

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/webview/**/*.{ts,tsx} : Use cell-level diffing to show side-by-side or inline diffs for conflicting cells rather than raw JSON line-by-line diffs

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to **/*.{ts,tsx} : Handle Git-nullified execution counts after merge conflict resolution by optionally restoring or renumbering them

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to **/*.{ts,tsx} : Ensure resolved output from merge conflict resolution is valid `.ipynb` JSON with proper cell structure and notebook format compliance

@Avni2000
Copy link
Owner Author

Avni2000 commented Feb 3, 2026

@coderabbitai make an issue re. why the conflicted file might be so slow to load and edit based on what you've learned

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 3, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@Avni2000, I'll create an issue documenting the performance problems with the web-based conflict resolver.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for performance problems in the web-based conflict resolver

gh issue create \
  --title "Performance: Slow loading and editing in web-based conflict resolver" \
  --body "## Problem

The web-based conflict resolver (introduced in PR `#16`) experiences noticeable performance issues:

- **Heavy crawling/slow loading**: Conflicted files take significant time to load in the browser
- **Input lag**: Typing in the editor shows a few-millisecond lag
- The PR description acknowledges architectural inefficiencies and performance problems

## Potential Contributing Factors

Based on the architecture introduced in PR `#16`, several components may contribute to the slowness:

### 1. WebSocket Communication
- Session-based browser connections with 30-second timeout handling
- Real-time bidirectional messaging between extension and browser
- Potential overhead in message serialization/deserialization

### 2. Lazy-Loading of Conflict Data
- Conflict data sent after WebSocket connection is established
- May cause delays in initial rendering

### 3. UI Rendering Pipeline
- React-based UI with multiple nested components (ConflictResolver, MergeRow, CellContent)
- Markdown rendering with MathJax re-rendering on mount/update
- Real-time diff computation using \`computeLineDiff\` for each cell
- Per-cell and per-inline change styling
- Drag-and-drop state management

### 4. Data Processing
- Cell-level diffing for conflicting cells
- Position-based sorting of merge rows
- Building merge rows from semantic conflicts on initialization

### 5. Large Notebooks
- No apparent virtualization or windowing for large notebooks
- All cells rendered simultaneously
- MathJax processing for every markdown cell

## Suggested Areas to Investigate

1. **Profile WebSocket message sizes**: Check if conflict data payloads are too large
2. **Implement virtualization**: Only render visible cells for large notebooks
3. **Optimize diff computation**: Cache diff results; compute diffs lazily/on-demand
4. **Debounce/throttle**: Add debouncing to text input handlers
5. **Lazy MathJax rendering**: Only render MathJax for visible cells
6. **Bundle size**: Check if client.js bundle is too large (includes React, MarkdownIt, MathJax plugin)
7. **Memory usage**: Monitor memory consumption with large notebooks

## Related

- PR: `#16`
- Mentioned in PR description as known issue

## Environment

- Architecture: WebSocket + HTTP server + React browser UI
- Key files: \`src/web/webServer.ts\`, \`src/web/client/ConflictResolver.tsx\`, \`src/web/client/CellContent.tsx\`" \
  --assignee Avni2000 \
  --label "performance"

Length of output: 44


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for performance problems (without label since it doesn't exist)

gh issue create \
  --title "Performance: Slow loading and editing in web-based conflict resolver" \
  --body "## Problem

The web-based conflict resolver (introduced in PR `#16`) experiences noticeable performance issues:

- **Heavy crawling/slow loading**: Conflicted files take significant time to load in the browser
- **Input lag**: Typing in the editor shows a few-millisecond lag
- The PR description acknowledges architectural inefficiencies and performance problems

## Potential Contributing Factors

Based on the architecture introduced in PR `#16`, several components may contribute to the slowness:

### 1. WebSocket Communication
- Session-based browser connections with 30-second timeout handling
- Real-time bidirectional messaging between extension and browser
- Potential overhead in message serialization/deserialization

### 2. Lazy-Loading of Conflict Data
- Conflict data sent after WebSocket connection is established
- May cause delays in initial rendering

### 3. UI Rendering Pipeline
- React-based UI with multiple nested components (ConflictResolver, MergeRow, CellContent)
- Markdown rendering with MathJax re-rendering on mount/update
- Real-time diff computation using \`computeLineDiff\` for each cell
- Per-cell and per-inline change styling
- Drag-and-drop state management

### 4. Data Processing
- Cell-level diffing for conflicting cells
- Position-based sorting of merge rows
- Building merge rows from semantic conflicts on initialization

### 5. Large Notebooks
- No apparent virtualization or windowing for large notebooks
- All cells rendered simultaneously
- MathJax processing for every markdown cell

## Suggested Areas to Investigate

1. **Profile WebSocket message sizes**: Check if conflict data payloads are too large
2. **Implement virtualization**: Only render visible cells for large notebooks
3. **Optimize diff computation**: Cache diff results; compute diffs lazily/on-demand
4. **Debounce/throttle**: Add debouncing to text input handlers
5. **Lazy MathJax rendering**: Only render MathJax for visible cells
6. **Bundle size**: Check if client.js bundle is too large (includes React, MarkdownIt, MathJax plugin)
7. **Memory usage**: Monitor memory consumption with large notebooks

## Related

- PR: `#16`
- Mentioned in PR description as known issue

## Environment

- Architecture: WebSocket + HTTP server + React browser UI
- Key files: \`src/web/webServer.ts\`, \`src/web/client/ConflictResolver.tsx\`, \`src/web/client/CellContent.tsx\`" \
  --assignee Avni2000

Length of output: 45


@Avni2000, I've created issue #17 documenting the performance problems in the web-based conflict resolver: #17

The issue includes:

  • Description of the observed slowness (heavy crawling, input lag)
  • Analysis of potential contributing factors based on the new architecture:
    • WebSocket communication overhead
    • Lazy-loading delays
    • UI rendering pipeline (React, MathJax, real-time diffs)
    • Lack of virtualization for large notebooks
  • Suggested investigation areas like profiling WebSocket messages, implementing virtualization, optimizing diff computation, and adding debouncing

The issue has been assigned to you and references PR #16 as mentioned in your request.


🧠 Learnings used
Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/webview/**/*.{ts,tsx} : Provide a unified conflict resolution UI in `ConflictResolverPanel` webview supporting both textual and semantic conflict resolution with 3-way diff view

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/**/resolver.ts : Use the `merge-nb.findConflicts` command as the single unified entry point to find notebooks with merge conflicts (both textual and semantic) and bring up the conflict resolution panel

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/**/conflictDetector.ts : Implement semantic conflict detection for Git `UU` status notebooks that lack textual markers but have different execution states, outputs, or cell modifications between branches

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to **/*.{ts,tsx} : Ensure resolved output from merge conflict resolution is valid `.ipynb` JSON with proper cell structure and notebook format compliance

Learnt from: CR
Repo: Avni2000/MergeNB PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-28T22:06:05.265Z
Learning: Applies to src/webview/**/*.{ts,tsx} : Use cell-level diffing to show side-by-side or inline diffs for conflicting cells rather than raw JSON line-by-line diffs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@package.json`:
- Around line 127-133: package.json currently lists "react" and "react-dom"
under "dependencies" but they are bundled into dist/web/client.js by the esbuild
build (see esbuild.js external config), so remove "react" and "react-dom" from
the "dependencies" block and add them to "devDependencies" instead so they are
available for build time but not packaged into the VSIX; update package.json
accordingly and run your package lock update (npm/yarn) to reflect the move.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/web/WebConflictPanel.ts`:
- Around line 152-173: The handler _handleResolution currently forwards
resolvedContent verbatim via semanticResolutionMap into the
_onResolutionComplete call; add runtime validation by parsing each
resolvedContent (or the final merged notebook) with parseNotebook() before
calling _onResolutionComplete and if parseNotebook() throws, surface the error
(reject/throw) so callers can handle it; alternatively wrap the downstream
saveResolvedNotebook() call in try/catch and rethrow or return a descriptive
validation error so invalid .ipynb JSON/cell structure is not silently written.
Ensure references: _handleResolution, semanticResolutionMap,
_onResolutionComplete, parseNotebook(), and saveResolvedNotebook() are used to
locate and implement the validation and error handling.
🧹 Nitpick comments (1)
src/resolver.ts (1)

404-418: Consider using Git extension API instead of shell exec.

The current implementation shells out to git add with a quoted path. While the path comes from VS Code's workspace API (reducing injection risk), consider using VS Code's built-in Git extension API for better cross-platform compatibility and to avoid shell escaping edge cases.

♻️ Alternative using VS Code Git API
private async markFileAsResolved(uri: vscode.Uri): Promise<void> {
    try {
        const gitExtension = vscode.extensions.getExtension('vscode.git')?.exports;
        const git = gitExtension?.getAPI(1);
        const repo = git?.getRepository(uri);
        
        if (repo) {
            await repo.add([uri.fsPath]);
            vscode.window.showInformationMessage(`Marked ${vscode.workspace.asRelativePath(uri, false)} as resolved`);
        } else {
            // Fallback to shell command if Git API unavailable
            const workspaceFolder = vscode.workspace.getWorkspaceFolder(uri);
            if (!workspaceFolder) {
                vscode.window.showWarningMessage('Cannot mark file as resolved: not in a workspace');
                return;
            }
            const relativePath = vscode.workspace.asRelativePath(uri, false);
            await exec(`git add "${relativePath}"`, { cwd: workspaceFolder.uri.fsPath });
            vscode.window.showInformationMessage(`Marked ${relativePath} as resolved`);
        }
    } catch (error) {
        vscode.window.showErrorMessage(`Failed to mark file as resolved: ${error}`);
    }
}

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
package.json (1)

6-6: ⚠️ Potential issue | 🟡 Minor

Incomplete repository URL.

The repository field contains only the GitHub user URL without the actual repository name. This should be updated to the full repository URL.

-  "repository": "https://github.com/Avni2000",
+  "repository": "https://github.com/Avni2000/MergeNB",

- Uses DOMPurify to clean inputted html-as-markdown
- This should be tested more thoroughly... we should look into how DOMPurify functions
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/web/client/CellContent.tsx`:
- Around line 174-180: The image handling in CellContent.tsx assumes
data['image/png'] and data['image/jpeg'] are strings but Jupyter can emit
string[]; normalize by detecting if data['image/png'] or data['image/jpeg'] is
an array and joining its elements (e.g., Array.isArray(...) ? ...join('') : ...)
before constructing the data URL used in the <img> src; update the branches that
return the <img> (the checks using data['image/png'] and data['image/jpeg']) to
perform this normalization so the base64 payload is correct.

Comment on lines +174 to +180
// Try image first
if (data['image/png']) {
return <img src={`data:image/png;base64,${data['image/png']}`} alt="output" />;
}
if (data['image/jpeg']) {
return <img src={`data:image/jpeg;base64,${data['image/jpeg']}`} alt="output" />;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Normalize array-formatted image data before building data URLs.

Jupyter outputs can emit image/png or image/jpeg as string[]; interpolating an array produces comma-joined data and breaks the base64.

🧩 Suggested fix
-        if (data['image/png']) {
-            return <img src={`data:image/png;base64,${data['image/png']}`} alt="output" />;
-        }
-        if (data['image/jpeg']) {
-            return <img src={`data:image/jpeg;base64,${data['image/jpeg']}`} alt="output" />;
-        }
+        if (data['image/png']) {
+            const png = Array.isArray(data['image/png']) ? data['image/png'].join('') : String(data['image/png']);
+            return <img src={`data:image/png;base64,${png}`} alt="output" />;
+        }
+        if (data['image/jpeg']) {
+            const jpeg = Array.isArray(data['image/jpeg']) ? data['image/jpeg'].join('') : String(data['image/jpeg']);
+            return <img src={`data:image/jpeg;base64,${jpeg}`} alt="output" />;
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Try image first
if (data['image/png']) {
return <img src={`data:image/png;base64,${data['image/png']}`} alt="output" />;
}
if (data['image/jpeg']) {
return <img src={`data:image/jpeg;base64,${data['image/jpeg']}`} alt="output" />;
}
// Try image first
if (data['image/png']) {
const png = Array.isArray(data['image/png']) ? data['image/png'].join('') : String(data['image/png']);
return <img src={`data:image/png;base64,${png}`} alt="output" />;
}
if (data['image/jpeg']) {
const jpeg = Array.isArray(data['image/jpeg']) ? data['image/jpeg'].join('') : String(data['image/jpeg']);
return <img src={`data:image/jpeg;base64,${jpeg}`} alt="output" />;
}
🤖 Prompt for AI Agents
In `@src/web/client/CellContent.tsx` around lines 174 - 180, The image handling in
CellContent.tsx assumes data['image/png'] and data['image/jpeg'] are strings but
Jupyter can emit string[]; normalize by detecting if data['image/png'] or
data['image/jpeg'] is an array and joining its elements (e.g.,
Array.isArray(...) ? ...join('') : ...) before constructing the data URL used in
the <img> src; update the branches that return the <img> (the checks using
data['image/png'] and data['image/jpeg']) to perform this normalization so the
base64 payload is correct.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/web/client/ConflictResolver.tsx`:
- Around line 145-189: The handleCellDrop logic currently overwrites a target
side cell and clears the source without moving its associated index/metadata,
causing silent loss and stale conflict state; modify handleCellDrop to
block/drop when targetRow already has a cell on the same side (return early or
reject the drop) or implement a safe-swap instead of overwrite, ensure you also
move the corresponding *CellIndex fields (e.g., baseCellIndex, currentCellIndex,
incomingCellIndex) from sourceRow to targetRow along with cell content, and then
recompute and set row metadata (type, isUnmatched, anchorPosition and any other
derived fields) for both sourceRow and targetRow before writing them back into
setRows so no stale metadata remains and cells are not accidentally deleted by
the final filter.

@Avni2000
Copy link
Owner Author

Avni2000 commented Feb 3, 2026

Merging, we're already way out of scope.

@Avni2000 Avni2000 merged commit df7b7e7 into main Feb 3, 2026
1 check was pending
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/web/webServer.ts`:
- Around line 281-285: When closeSession removes a pending entry from
pendingConnections it must also reject the pending promise created in
openSession so callers don't hang; after clearTimeout(pending.timeout) and
before this.pendingConnections.delete(sessionId) call pending.reject(new
Error('Session closed')) (or the existing pending.reject function stored on the
pending object) so the openSession promise is rejected, then delete the entry.
Ensure you reference the pendingConnections map, the pending object (with its
timeout and reject), closeSession, and openSession when making this change.
- Around line 195-200: Replace the weak session ID and add origin/session
checks: change generateSessionId() to use a cryptographically secure generator
(e.g., crypto.randomBytes encoded to hex/base64 combined with timestamp) instead
of Math.random(), validate the WebSocket connection origin in your
upgrade/connection handler before accepting connections and verify the session
ID maps to an active session in your session store (reject/close connections if
origin mismatch or session not found), and tighten CORS response headers to only
allow configured origins rather than "*"; also update closeSession() to not only
clear the pending connection timeout but to immediately reject the associated
promise/notify the awaiting caller so it does not hang waiting for the 30s
timeout.
🧹 Nitpick comments (1)
src/web/webServer.ts (1)

211-241: Fail fast if openSession is called before the server starts.

Right now a missing start() leads to a browser open on port 0 and a 30s timeout. Consider throwing early (or auto-starting) for clearer failures.

🧭 Suggested guard
     public async openSession(
         sessionId: string,
         htmlContent: string,
         onMessage: (message: unknown) => void
     ): Promise<WebSocket> {
+        if (!this.isRunning()) {
+            throw new Error('Web server not started. Call start() before openSession().');
+        }

Comment on lines +195 to +200
/**
* Generate a unique session ID.
*/
public generateSessionId(): string {
return `session-${Date.now()}-${Math.random().toString(36).substring(2, 11)}`;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the webServer.ts file
fd "webServer.ts" -t f

Repository: Avni2000/MergeNB

Length of output: 80


🏁 Script executed:

# Check package.json for dependencies and Node.js version
cat package.json | head -50

Repository: Avni2000/MergeNB

Length of output: 1340


🏁 Script executed:

# Read the webServer.ts file to understand the actual implementation
wc -l src/web/webServer.ts

Repository: Avni2000/MergeNB

Length of output: 84


🏁 Script executed:

# Read the webServer.ts file, focusing on the key sections mentioned
cat -n src/web/webServer.ts

Repository: Avni2000/MergeNB

Length of output: 21112


🏁 Script executed:

# Check if there's a package-lock.json or yarn.lock to see actual versions
ls -la | grep -E "package-lock|yarn.lock"

Repository: Avni2000/MergeNB

Length of output: 128


🏁 Script executed:

# Look for any ws version specification in package.json
grep -A 5 -B 5 "ws" package.json

Repository: Avni2000/MergeNB

Length of output: 537


🏁 Script executed:

# Check if crypto is already imported or used elsewhere
grep -n "crypto" src/web/webServer.ts
grep -rn "randomUUID\|crypto.random" src/

Repository: Avni2000/MergeNB

Length of output: 42


🏁 Script executed:

# Check for any other session validation patterns in the codebase
grep -n "sessions.get\|sessions.has" src/web/webServer.ts

Repository: Avni2000/MergeNB

Length of output: 203


🌐 Web query:

crypto.randomUUID Node.js version support minimum

💡 Result:

crypto.randomUUID() in Node’s built-in node:crypto module is supported starting in:

  • Node.js v14.17.0 (minimum in the 14.x line) [1]
  • Also introduced in Node.js v15.6.0 (15.x line) [1]

(If you mean the Web Crypto API version globalThis.crypto.randomUUID(), that was added in Node.js v16.7.0.) [2]

Sources:
[1] Node.js Crypto docs (crypto.randomUUID) https://nodejs.org/api/crypto.html
[2] Node.js Web Crypto docs (crypto.randomUUID) https://nodejs.org/download/release/v16.9.0/docs/api/webcrypto.html


🌐 Web query:

ws 8.x WebSocket origin validation verifyClient perMessage-deflate

💡 Result:

ws 8.x: origin validation (verifyClient) + perMessageDeflate

1) Validate WebSocket Origin in verifyClient

ws lets you accept/reject the handshake with verifyClient (runs before the connection is established). The info object includes info.origin and info.req (so you can always fall back to reading headers directly). An example origin-allowlist pattern is shown here. [4]

Important for “protocol version 8” (HyBi drafts 07–12): some clients use Sec-WebSocket-Origin instead of Origin, so check both. [5]

import { WebSocketServer } from 'ws';

const allowed = new Set([
  'https://example.com',
  'https://app.example.com',
  'http://localhost:3000', // dev only
]);

const wss = new WebSocketServer({
  port: 8080,

  verifyClient: (info, cb) => {
    // Prefer ws-provided origin, but fall back to headers.
    const origin =
      info.origin ??
      info.req.headers.origin ??
      info.req.headers['sec-websocket-origin']; // protocolVersion: 8 legacy

    if (origin && allowed.has(origin)) return cb(true);

    cb(false, 403, 'Forbidden');
  },

  // (optional) compression config below
});

Why you should do this: missing origin validation is a known WebSocket weakness class (CSWSH-style risk). [6]

2) perMessageDeflate in ws 8.x

ws supports the permessage-deflate extension. It’s disabled by default on the server and enabled by default on the client; it can add significant CPU/memory overhead, so enable it only if you need it. [1]

Server-side enable/tune:

const wss = new WebSocketServer({
  port: 8080,
  perMessageDeflate: {
    clientNoContextTakeover: true,
    serverNoContextTakeover: true,
    serverMaxWindowBits: 10,
    concurrencyLimit: 10,
    threshold: 1024,
  },
});

Client-side disable (if you want to ensure it never negotiates):

import WebSocket from 'ws';
const ws = new WebSocket('ws://host/path', { perMessageDeflate: false });

[1]

RFC reference for the extension: permessage-deflate is standardized in RFC 7692. [2]

Sources: [1] [2] [4] [5] [6]


🌐 Web query:

VS Code 1.108.0 Node.js version requirement bundled

💡 Result:

VS Code 1.108.0 (December 2025) bundles Node.js 22.21.1 (via Electron 39.2.7). [1]


🏁 Script executed:

# Check if there's any call to pending.reject in closeSession
sed -n '273,288p' src/web/webServer.ts

Repository: Avni2000/MergeNB

Length of output: 589


🏁 Script executed:

# Also check the entire openSession flow to understand the promise lifecycle
sed -n '211,244p' src/web/webServer.ts

Repository: Avni2000/MergeNB

Length of output: 1299


🏁 Script executed:

# Verify if verifyClient is used anywhere in the WebSocket setup
grep -n "verifyClient\|WebSocketServer" src/web/webServer.ts | head -20

Repository: Avni2000/MergeNB

Length of output: 111


Harden session authentication and origin validation for WebSocket connections.

Line 199 generates session IDs from timestamp + Math.random(), and lines 458–505 accept any WebSocket connection without validating the origin or confirming the session exists. Combined with line 299's permissive CORS headers and the ability to set a non-loopback host, this creates a session hijacking vulnerability if a session ID leaks or is guessed. Use cryptographically strong IDs, validate the origin before accepting connections, and verify session existence.

Additionally, line 281–284 clears the pending connection timeout in closeSession() but does not reject the promise, causing it to hang indefinitely until the 30-second timeout expires if the browser never connects.

🔒 Suggested hardening (IDs + origin/session validation + promise rejection)
+import * as crypto from 'crypto';
-    public generateSessionId(): string {
-        return `session-${Date.now()}-${Math.random().toString(36).substring(2, 11)}`;
-    }
+    public generateSessionId(): string {
+        return `session-${crypto.randomUUID()}`;
+    }
-            this.wss = new WebSocket.Server({ server: this.httpServer });
+            this.wss = new WebSocket.Server({ 
+                server: this.httpServer,
+                verifyClient: (info, cb) => {
+                    const origin = info.origin || info.req.headers.origin;
+                    // For localhost dev server, allow connections from the same origin
+                    if (origin && origin === this.getServerUrl()) {
+                        return cb(true);
+                    }
+                    cb(false, 403, 'Forbidden');
+                }
+            });
-    private handleWebSocketConnection(ws: WebSocket, req: http.IncomingMessage): void {
-        const url = new URL(req.url || '/', `http://${req.headers.host}`);
-        const sessionId = url.searchParams.get('session') || 'default';
+    private handleWebSocketConnection(ws: WebSocket, req: http.IncomingMessage): void {
+        const url = new URL(req.url || '/', `http://${req.headers.host}`);
+        const sessionId = url.searchParams.get('session') || 'default';
+        
+        // Validate session exists before accepting connection
+        if (!this.sessions.has(sessionId)) {
+            ws.close(1008, 'Unknown session');
+            return;
+        }
         const pending = this.pendingConnections.get(sessionId);
         if (pending) {
             clearTimeout(pending.timeout);
+            pending.reject(new Error('Session closed'));
             this.pendingConnections.delete(sessionId);
         }
🤖 Prompt for AI Agents
In `@src/web/webServer.ts` around lines 195 - 200, Replace the weak session ID and
add origin/session checks: change generateSessionId() to use a cryptographically
secure generator (e.g., crypto.randomBytes encoded to hex/base64 combined with
timestamp) instead of Math.random(), validate the WebSocket connection origin in
your upgrade/connection handler before accepting connections and verify the
session ID maps to an active session in your session store (reject/close
connections if origin mismatch or session not found), and tighten CORS response
headers to only allow configured origins rather than "*"; also update
closeSession() to not only clear the pending connection timeout but to
immediately reject the associated promise/notify the awaiting caller so it does
not hang waiting for the 30s timeout.

Comment on lines +281 to +285
const pending = this.pendingConnections.get(sessionId);
if (pending) {
clearTimeout(pending.timeout);
this.pendingConnections.delete(sessionId);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject pending openSession promises when a session is closed.

If closeSession is called before the browser connects, the pending promise created in openSession never resolves or rejects, leading to a hang. Please reject it explicitly.

🧹 Proposed fix
         const pending = this.pendingConnections.get(sessionId);
         if (pending) {
             clearTimeout(pending.timeout);
+            pending.reject(new Error('Session closed'));
             this.pendingConnections.delete(sessionId);
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const pending = this.pendingConnections.get(sessionId);
if (pending) {
clearTimeout(pending.timeout);
this.pendingConnections.delete(sessionId);
}
const pending = this.pendingConnections.get(sessionId);
if (pending) {
clearTimeout(pending.timeout);
pending.reject(new Error('Session closed'));
this.pendingConnections.delete(sessionId);
}
🤖 Prompt for AI Agents
In `@src/web/webServer.ts` around lines 281 - 285, When closeSession removes a
pending entry from pendingConnections it must also reject the pending promise
created in openSession so callers don't hang; after
clearTimeout(pending.timeout) and before
this.pendingConnections.delete(sessionId) call pending.reject(new Error('Session
closed')) (or the existing pending.reject function stored on the pending object)
so the openSession promise is rejected, then delete the entry. Ensure you
reference the pendingConnections map, the pending object (with its timeout and
reject), closeSession, and openSession when making this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FIX] Filename failing to change upon switching conflicted files

2 participants