Skip to content

feat: enhance auto pagination detection logic#976

Merged
amhsirak merged 4 commits intogetmaxun:developfrom
RohitR311:auto-pag
Feb 25, 2026
Merged

feat: enhance auto pagination detection logic#976
amhsirak merged 4 commits intogetmaxun:developfrom
RohitR311:auto-pag

Conversation

@RohitR311
Copy link
Contributor

@RohitR311 RohitR311 commented Feb 15, 2026

What this PR does?

Improves the auto pagination detection logic during capture list action.

Summary by CodeRabbit

  • New Features

    • Added user-facing success and warning notifications for automatic pagination detection (success for detected Scroll Down/Up; warning when none found).
  • Bug Fixes

    • Improved detection of next/load-more controls with tighter text/icon matching and multilingual coverage
    • Better distinction between infinite scroll and click-based pagination, reducing false positives
    • More robust handling of complex and dynamic layouts via proximity, visibility and container-aware heuristics
    • Returns explicit actions (click next, click load more, scroll) with confidence levels and safer low-confidence fallbacks

@coderabbitai
Copy link

coderabbitai bot commented Feb 15, 2026

Walkthrough

Refactors browser/client pagination detection into a multi-path system (pagination wrapper, nearby elements, full-document, infinite-scroll), adds multilingual text/arrow/load-more patterns, visibility/proximity helpers, and returns explicit actions (clickNext, clickLoadMore, scrollDown) with confidence and debug context.

Changes

Cohort / File(s) Summary
Browser-side analyzer
server/src/sdk/browserSide/pageAnalyzer.js
Rewrote pagination detection into staged phases, added visibility/proximity/shadow-DOM utilities, multilingual patterns, and action-oriented outputs with confidence and debug info. Review for new heuristics, shadow DOM traversal, and debug payload shape.
Client pagination detector
src/helpers/clientPaginationDetector.ts
Large refactor: introduced MAX_BUTTON_TEXT_LENGTH, regex/pattern sets (next/load-more/arrows/container), multi-path detection (detectFromPaginationWrapper, detectFromNearbyElements, detectFromFullDocument), infinite-scroll scoring, selector generation, and many helper utilities. Check proximity thresholds, scoring, and selector stability.
Recorder UI updates
src/components/recorder/RightSidePanel.tsx
Added user notifications for successful/failed auto-detection (success and warning levels) and adjusted messaging around highlighting. Verify UX flow and state resets when detection fails.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Type: Feature, Type: Enhancement

Suggested reviewers

  • amhsirak

Poem

🐰 I hopped through nodes and patterns bright,
I chased the "Next" by day and night,
Arrows, texts, and scrolls in tune,
Confidence rose beneath the moon,
Hooray — pagination found just right!

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: enhance auto pagination detection logic' directly and accurately summarizes the main changes across all three modified files, which collectively refactor and improve pagination detection with new helper functions, patterns, and multi-phase detection flows.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into develop

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@server/src/sdk/browserSide/pageAnalyzer.js`:
- Around line 2015-2032: The scrollToTopPatterns array in pageAnalyzer.js is
missing the accessibility/text selectors present in the TS version; update the
scrollToTopPatterns constant (the array used by the for...of loop that queries
document.querySelector and checks isVisible) to include the aria-label and title
selectors—specifically add selectors like '[aria-label*="scroll to top" i]' and
'[title*="back to top" i]' so the loop will detect those elements as well.

In `@src/helpers/clientPaginationDetector.ts`:
- Around line 339-357: The nav exclusion in shouldSkipElement is too narrow;
update the nav-detection logic (inside shouldSkipElement, where nav is computed)
to treat a nav as pagination if its className or aria-label matches a broader
allowlist (e.g., /\b(paginat(ion|e)|page|page[- ]navigation|prev|next)\b/i) OR
switch to a denylist approach by excluding common non-pagination navs (e.g.,
class/aria values containing 'site', 'main-nav', 'primary', 'header', 'footer',
'breadcrumb', 'sidebar') and only skip when a nav matches a denylist; modify the
conditional that currently uses /\bpaginat(ion|e)\b/i so it uses the new regex
or denylist check on nav.className and nav.getAttribute('aria-label')
accordingly.
- Around line 434-453: The JS implementation of scroll detection is missing two
attribute selectors that the TypeScript version includes; update the
scrollToTopPatterns array (the one used with doc.querySelector in the loop that
checks isVisible) to include '[aria-label*="scroll to top" i]' and
'[title*="back to top" i]' so both JS and TS versions stay in sync and the
for-loop that queries selectors behaves identically.
- Around line 406-432: The selector '[class*="infinite-scroll"]' is duplicated
in sentinelPatterns and infiniteScrollLibraries causing double-counting; update
clientPaginationDetector by removing the duplicate from one of the arrays (or
merge the two arrays into a single set) so that
doc.querySelector('[class*="infinite-scroll"]') only increments score
once—adjust the loops that iterate over sentinelPatterns and
infiniteScrollLibraries (or replace them with a single combinedDedupedPatterns
list) to ensure each unique selector contributes at most one score increment
used by hasStrongInfiniteScrollSignals/score logic.
- Around line 95-104: Anchored regexes are being matched against combinedText
(the concatenation of text, ariaLabel, and title), so start/end-anchored
patterns like those in nextButtonTextPatterns, loadMorePatterns, and
nextButtonArrowPatterns fail when ariaLabel/title are non-empty; update the
detection logic in the block that builds combinedText so anchored patterns are
tested only against the raw text variable (text) while
non-anchored/word-boundary patterns are tested against combinedText—use
matchesAnyPattern (or a new helper) to run anchoredPatterns on text and
unanchoredPatterns on combinedText, then adjust scoring/reasons the same way
(score += 10/8 and reasons.push) so Load More/Next detection works when
ariaLabel/title exist.
🧹 Nitpick comments (3)
src/helpers/clientPaginationDetector.ts (2)

65-74: Dead code: prevButtonTextPatterns and prevButtonArrowPatterns are defined but never used.

These two pattern arrays were introduced but no code in this file references them. If prev-button detection was intentionally removed, these declarations should be cleaned up.

🧹 Suggested removal
-      const prevButtonTextPatterns = [
-        /^\s*prev(ious)?\s*$/i,
-        /\bprevious\s+page\b/i,
-        /\bpage\s+précédente\b/i,
-      ];
-
-      const prevButtonArrowPatterns = [
-        /^[<\s‹←«]+$/,
-        /^<<$/,
-      ];

39-63: Significant code duplication with pageAnalyzer.js.

Nearly the entire pagination detection logic — patterns, shouldSkipElement, isNearList, detectInfiniteScrollIndicators, candidate scoring, threshold constants, and decision tree — is duplicated across this TypeScript class and the JS IIFE in pageAnalyzer.js. Any bug fix or tuning change must be applied twice, and the two copies have already diverged slightly (e.g., scrollToTopPatterns mismatch noted separately).

Consider extracting the shared constants (patterns, thresholds) and core scoring logic into a single source of truth that both consumers import or inline from.

Also applies to: 78-173, 175-235, 395-463

server/src/sdk/browserSide/pageAnalyzer.js (1)

1043-1067: Substantial duplication with clientPaginationDetector.ts.

All patterns, constants, shouldSkipElement, isNearList, infinite scroll detection, and the scoring/decision tree are duplicated. This has already caused a drift (scrollToTopPatterns). A shared module or build-time inlining would prevent future divergence.

Also applies to: 1076-1091, 1847-1875, 1976-2042, 2047-2103

Comment on lines 95 to 104
const combinedText = `${text} ${ariaLabel} ${title}`;
if (this.matchesAnyPattern(combinedText, nextButtonTextPatterns)) {
score += 10;
reasons.push('text match (+10)');
} else if (text.length <= 3 && this.matchesAnyPattern(text, nextButtonArrowPatterns)) {
score += 8;
reasons.push('arrow match (+8)');
}

if (this.isNearList(element, listContainer)) {
if (score === 0) continue;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check loadMorePatterns usage
rg -n 'loadMorePatterns' src/helpers/clientPaginationDetector.ts
rg -n 'loadMorePatterns' server/src/sdk/browserSide/pageAnalyzer.js

Repository: getmaxun/maxun

Length of output: 275


🏁 Script executed:

# Also check for pattern definitions to understand if they're anchored
rg -n 'nextButtonTextPatterns\|nextButtonArrowPatterns\|loadMorePatterns' -A 5 src/helpers/clientPaginationDetector.ts | head -100

Repository: getmaxun/maxun

Length of output: 40


🏁 Script executed:

# Get the full context of the file structure
wc -l src/helpers/clientPaginationDetector.ts

Repository: getmaxun/maxun

Length of output: 101


🏁 Script executed:

# Get pattern definitions
sed -n '50,80p' src/helpers/clientPaginationDetector.ts

Repository: getmaxun/maxun

Length of output: 859


🏁 Script executed:

# Get the context around lines 95-104
sed -n '85,115p' src/helpers/clientPaginationDetector.ts

Repository: getmaxun/maxun

Length of output: 1064


🏁 Script executed:

# Find and display the matchesAnyPattern method
rg -n 'matchesAnyPattern' src/helpers/clientPaginationDetector.ts -A 10 | head -50

Repository: getmaxun/maxun

Length of output: 1355


🏁 Script executed:

# Get nextButtonTextPatterns and nextButtonArrowPatterns definitions
rg -n 'nextButtonTextPatterns\|nextButtonArrowPatterns' -B 2 -A 8 src/helpers/clientPaginationDetector.ts | head -80

Repository: getmaxun/maxun

Length of output: 40


🏁 Script executed:

# Get nextButtonTextPatterns definition specifically
rg -n 'const nextButtonTextPatterns' -A 15 src/helpers/clientPaginationDetector.ts

Repository: getmaxun/maxun

Length of output: 475


🏁 Script executed:

# Let's also check if there's a nextButtonArrowPatterns definition
rg -n 'const nextButtonArrowPatterns' -A 10 src/helpers/clientPaginationDetector.ts

Repository: getmaxun/maxun

Length of output: 375


Anchored patterns fail when ariaLabel or title are non-empty in combinedText.

combinedText concatenates text, ariaLabel, and title with spaces. Patterns with start/end anchors (like /^\s*next\s*$/i, /^\s*load\s+more\s*$/i) cannot match when ariaLabel or title are non-empty, because the combined string will have trailing content after the expected text.

Specifically:

  • In nextButtonTextPatterns: the first pattern /^\s*next\s*$/i (line 42) is anchored and will fail; only the word-boundary patterns (lines 43–47) will match in combinedText.
  • All loadMorePatterns (lines 55–62) are anchored, so a "Load More" button with a non-empty aria-label or title will score 0 despite matching the text itself.
  • The same issue exists with nextButtonArrowPatterns (lines 50–51).

Consider either:

  1. Testing anchored patterns against text only, and word-boundary patterns against combinedText, or
  2. Removing the anchors from patterns that need to match inside combinedText.
🤖 Prompt for AI Agents
In `@src/helpers/clientPaginationDetector.ts` around lines 95 - 104, Anchored
regexes are being matched against combinedText (the concatenation of text,
ariaLabel, and title), so start/end-anchored patterns like those in
nextButtonTextPatterns, loadMorePatterns, and nextButtonArrowPatterns fail when
ariaLabel/title are non-empty; update the detection logic in the block that
builds combinedText so anchored patterns are tested only against the raw text
variable (text) while non-anchored/word-boundary patterns are tested against
combinedText—use matchesAnyPattern (or a new helper) to run anchoredPatterns on
text and unanchoredPatterns on combinedText, then adjust scoring/reasons the
same way (score += 10/8 and reasons.push) so Load More/Next detection works when
ariaLabel/title exist.

Comment on lines 406 to 432
const sentinelPatterns = [
'[class*="sentinel"]',
'[class*="trigger"]',
'[data-infinite]',
'[data-scroll-trigger]',
'#infinite-scroll-trigger',
'[class*="infinite"]',
'[id*="infinite"]'
'[class*="infinite-scroll"]',
'[id*="infinite-scroll"]',
];

for (const selector of sentinelPatterns) {
if (doc.querySelector(selector)) {
score += 4;
indicators.push(`Sentinel element: ${selector} (+4)`);
score += 6;
break;
}
}

const scrollToTopPatterns = [
'[class*="scroll"][class*="top"]',
'[aria-label*="scroll to top"]',
'[title*="back to top"]',
'.back-to-top',
'#back-to-top',
'[class*="scrolltop"]',
'[class*="backtotop"]',
'button[class*="top"]',
'a[href="#top"]',
'a[href="#"]'
];

for (const selector of scrollToTopPatterns) {
const element = doc.querySelector(selector);
if (element && this.isVisible(element as HTMLElement)) {
score += 2;
indicators.push(`Scroll-to-top button (+2)`);
break;
}
}

if (initialHeight > viewportHeight * 3) {
score += 3;
indicators.push(`Very tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+3)`);
} else if (initialHeight > viewportHeight * 2) {
score += 2;
indicators.push(`Tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+2)`);
}

if (initialItemCount >= 20) {
score += 2;
indicators.push(`Many list items (${initialItemCount}) (+2)`);
} else if (initialItemCount >= 10) {
score += 1;
indicators.push(`Good number of list items (${initialItemCount}) (+1)`);
}

const infiniteScrollLibraries = [
'.infinite-scroll',
'[data-infinite-scroll]',
'[data-flickity]',
'[data-slick]',
'.masonry',
'[data-masonry]',
'[class*="infinite-scroll"]',
'[class*="lazy-load"]',
'[data-lazy]'
];

for (const selector of infiniteScrollLibraries) {
if (doc.querySelector(selector)) {
score += 4;
indicators.push(`Infinite scroll library: ${selector} (+4)`);
score += 6;
break;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Double-counting: sentinelPatterns and infiniteScrollLibraries both match [class*="infinite-scroll"].

If a page has an element with class="infinite-scroll-component", both loops match and the score jumps by 12 (6+6), which meets the hasStrongInfiniteScrollSignals threshold on its own. This single CSS class match would cause the detector to return scrollDown with medium confidence — potentially a false positive for pages that simply use an infinite-scroll library wrapper around a paginated list.

Either deduplicate the selector or treat the two groups as a single check.

♻️ Suggested deduplication
       const sentinelPatterns = [
         '[data-infinite]',
         '[data-scroll-trigger]',
         '#infinite-scroll-trigger',
-        '[class*="infinite-scroll"]',
-        '[id*="infinite-scroll"]',
-      ];
-
-      for (const selector of sentinelPatterns) {
-        if (doc.querySelector(selector)) {
-          score += 6;
-          break;
-        }
-      }
-
-      const infiniteScrollLibraries = [
-        '.infinite-scroll',
-        '[data-infinite-scroll]',
         '[class*="infinite-scroll"]',
+        '[id*="infinite-scroll"]',
+        '.infinite-scroll',
+        '[data-infinite-scroll]',
       ];
 
-      for (const selector of infiniteScrollLibraries) {
-        if (doc.querySelector(selector)) {
+      for (const selector of sentinelPatterns) {
+        if (doc.querySelector(selector)) {
           score += 6;
           break;
         }
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const sentinelPatterns = [
'[class*="sentinel"]',
'[class*="trigger"]',
'[data-infinite]',
'[data-scroll-trigger]',
'#infinite-scroll-trigger',
'[class*="infinite"]',
'[id*="infinite"]'
'[class*="infinite-scroll"]',
'[id*="infinite-scroll"]',
];
for (const selector of sentinelPatterns) {
if (doc.querySelector(selector)) {
score += 4;
indicators.push(`Sentinel element: ${selector} (+4)`);
score += 6;
break;
}
}
const scrollToTopPatterns = [
'[class*="scroll"][class*="top"]',
'[aria-label*="scroll to top"]',
'[title*="back to top"]',
'.back-to-top',
'#back-to-top',
'[class*="scrolltop"]',
'[class*="backtotop"]',
'button[class*="top"]',
'a[href="#top"]',
'a[href="#"]'
];
for (const selector of scrollToTopPatterns) {
const element = doc.querySelector(selector);
if (element && this.isVisible(element as HTMLElement)) {
score += 2;
indicators.push(`Scroll-to-top button (+2)`);
break;
}
}
if (initialHeight > viewportHeight * 3) {
score += 3;
indicators.push(`Very tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+3)`);
} else if (initialHeight > viewportHeight * 2) {
score += 2;
indicators.push(`Tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+2)`);
}
if (initialItemCount >= 20) {
score += 2;
indicators.push(`Many list items (${initialItemCount}) (+2)`);
} else if (initialItemCount >= 10) {
score += 1;
indicators.push(`Good number of list items (${initialItemCount}) (+1)`);
}
const infiniteScrollLibraries = [
'.infinite-scroll',
'[data-infinite-scroll]',
'[data-flickity]',
'[data-slick]',
'.masonry',
'[data-masonry]',
'[class*="infinite-scroll"]',
'[class*="lazy-load"]',
'[data-lazy]'
];
for (const selector of infiniteScrollLibraries) {
if (doc.querySelector(selector)) {
score += 4;
indicators.push(`Infinite scroll library: ${selector} (+4)`);
score += 6;
break;
}
}
const sentinelPatterns = [
'[data-infinite]',
'[data-scroll-trigger]',
'#infinite-scroll-trigger',
'[class*="infinite-scroll"]',
'[id*="infinite-scroll"]',
'.infinite-scroll',
'[data-infinite-scroll]',
];
for (const selector of sentinelPatterns) {
if (doc.querySelector(selector)) {
score += 6;
break;
}
}
🤖 Prompt for AI Agents
In `@src/helpers/clientPaginationDetector.ts` around lines 406 - 432, The selector
'[class*="infinite-scroll"]' is duplicated in sentinelPatterns and
infiniteScrollLibraries causing double-counting; update clientPaginationDetector
by removing the duplicate from one of the arrays (or merge the two arrays into a
single set) so that doc.querySelector('[class*="infinite-scroll"]') only
increments score once—adjust the loops that iterate over sentinelPatterns and
infiniteScrollLibraries (or replace them with a single combinedDedupedPatterns
list) to ensure each unique selector contributes at most one score increment
used by hasStrongInfiniteScrollSignals/score logic.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@server/src/sdk/browserSide/pageAnalyzer.js`:
- Around line 2029-2037: In detectFromNearbyElements, the element.closest(...)
call that awards the +5 pagination-ancestor bonus only checks class selectors;
update that closest call to also include the [aria-label*="paginat" i] attribute
selector (i.e., add , [aria-label*="paginat" i] alongside [class*="paginat"],
[class*="pager"]) so elements inside containers identified by aria-label receive
the same +5 bonus; keep the existing try/catch and scoring logic unchanged.
- Around line 1871-1872: The getListContainer function currently returns
listElements[0] when listElements.length === 0 (producing undefined); change the
conditional so empty input returns a clear value (e.g., null) and otherwise
return the first element — i.e., in getListContainer replace the faulty branch
so when listElements.length === 0 it returns null (or another explicit sentinel)
and otherwise returns listElements[0], ensuring a stable, non-latent NPE-safe
return for callers.

In `@src/helpers/clientPaginationDetector.ts`:
- Line 278: The ternary assigning confidence is dead because both branches
return 'high'; update the expression in clientPaginationDetector.ts where
confidence is set (variable name: confidence, condition: hasNumberedPages) so
that the false branch returns 'medium' instead of 'high' (i.e., make the ternary
yield 'high' when hasNumberedPages is true and 'medium' when false) to reflect
differing confidence levels.
- Around line 116-118: getListContainer currently returns listElements[0] even
when listElements is empty which yields undefined typed as HTMLElement; change
getListContainer to return HTMLElement | null and update its implementation to
return null when listElements.length === 0 (otherwise return listElements[0]),
then update any callers of getListContainer to handle a possible null return
(e.g., guard before dereferencing the result or bail out) so you no longer risk
an NPE; reference: getListContainer(listElements: HTMLElement[]), and ensure
callers that previously assumed an HTMLElement now check for null.
🧹 Nitpick comments (2)
src/helpers/clientPaginationDetector.ts (1)

231-267: Significant code duplication across the three detection phases.

detectFromPaginationWrapper, detectFromNearbyElements, and detectFromFullDocument share nearly identical loops for candidate scoring (text extraction, pattern matching, score accumulation). Consider extracting a shared scoreCandidates(clickables, listContainer, options) helper to reduce duplication and risk of the phases drifting out of sync.

Also applies to: 332-382, 416-460

server/src/sdk/browserSide/pageAnalyzer.js (1)

1940-2151: Same code duplication concern as the TS file — three detection phase functions share near-identical candidate scoring loops.

Consider extracting the shared text-extraction + pattern-matching + scoring logic into a reusable helper to keep the JS and TS implementations maintainable and consistent.

Comment on lines +1871 to +1872
function getListContainer(listElements) {
if (listElements.length === 0) return listElements[0];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Same empty-array bug as the TS file: getListContainer returns undefined for empty input.

Line 1872: listElements[0] is undefined when the array is empty. The caller guards against this, but this is a latent NPE if the guard is ever removed.

🐛 Proposed fix
 function getListContainer(listElements) {
-  if (listElements.length === 0) return listElements[0];
+  if (listElements.length === 0) return null;
   var firstParent = listElements[0].parentElement;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function getListContainer(listElements) {
if (listElements.length === 0) return listElements[0];
function getListContainer(listElements) {
if (listElements.length === 0) return null;
var firstParent = listElements[0].parentElement;
🤖 Prompt for AI Agents
In `@server/src/sdk/browserSide/pageAnalyzer.js` around lines 1871 - 1872, The
getListContainer function currently returns listElements[0] when
listElements.length === 0 (producing undefined); change the conditional so empty
input returns a clear value (e.g., null) and otherwise return the first element
— i.e., in getListContainer replace the faulty branch so when
listElements.length === 0 it returns null (or another explicit sentinel) and
otherwise returns listElements[0], ensuring a stable, non-latent NPE-safe return
for callers.

Comment on lines +116 to 118
private getListContainer(listElements: HTMLElement[]): HTMLElement {
if (listElements.length === 0) return listElements[0];

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

getListContainer returns undefined when listElements is empty.

Line 117: listElements[0] is undefined when the array is empty. While the caller guards against this at line 61, this method has a latent bug — accessing an empty array and returning undefined typed as HTMLElement will cause NPEs downstream if this guard is ever bypassed.

🐛 Proposed fix
 private getListContainer(listElements: HTMLElement[]): HTMLElement {
-  if (listElements.length === 0) return listElements[0];
+  if (listElements.length === 0) {
+    throw new Error('getListContainer called with empty listElements');
+  }
🤖 Prompt for AI Agents
In `@src/helpers/clientPaginationDetector.ts` around lines 116 - 118,
getListContainer currently returns listElements[0] even when listElements is
empty which yields undefined typed as HTMLElement; change getListContainer to
return HTMLElement | null and update its implementation to return null when
listElements.length === 0 (otherwise return listElements[0]), then update any
callers of getListContainer to handle a possible null return (e.g., guard before
dereferencing the result or bail out) so you no longer risk an NPE; reference:
getListContainer(listElements: HTMLElement[]), and ensure callers that
previously assumed an HTMLElement now check for null.

}
if (nextButton) {
const selector = this.generateSelectorsForElement(nextButton, doc, selectorGenerator);
const confidence = hasNumberedPages ? 'high' : 'high';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Dead ternary — both branches return 'high'.

hasNumberedPages ? 'high' : 'high' always evaluates to 'high'. If the intent was to differentiate confidence based on numbered pages, one branch should be 'medium'.

🐛 Proposed fix
-    const confidence = hasNumberedPages ? 'high' : 'high';
+    const confidence = hasNumberedPages ? 'high' : 'medium';
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const confidence = hasNumberedPages ? 'high' : 'high';
const confidence = hasNumberedPages ? 'high' : 'medium';
🤖 Prompt for AI Agents
In `@src/helpers/clientPaginationDetector.ts` at line 278, The ternary assigning
confidence is dead because both branches return 'high'; update the expression in
clientPaginationDetector.ts where confidence is set (variable name: confidence,
condition: hasNumberedPages) so that the false branch returns 'medium' instead
of 'high' (i.e., make the ternary yield 'high' when hasNumberedPages is true and
'medium' when false) to reflect differing confidence levels.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@server/src/sdk/browserSide/pageAnalyzer.js`:
- Around line 2061-2100: detectInfiniteScrollScore currently double-counts the
same selector because sentinelPatterns and infiniteScrollLibraries both include
'[class*="infinite-scroll"]', inflating score; update detectInfiniteScrollScore
to deduplicate selectors (e.g., merge sentinelPatterns and
infiniteScrollLibraries into a single array or filter out duplicates before
querying) so each unique selector (from
sentinelPatterns/infiniteScrollLibraries) only adds to score once, keeping the
rest of the logic (score increments, scrollToTopPatterns checks,
initialHeight/viewportHeight checks) unchanged.
- Around line 1085-1096: In getClickableElements, avoid using
clickables.push.apply(clickables, Array.from(elements)) because spreading
thousands of nodes can throw a RangeError; instead concatenate or iterate to
append elements safely (e.g., use clickables =
clickables.concat(Array.from(elements)) or loop over elements and push
individually), keep the existing de-duplication via Array.from(new
Set(clickables)), and ensure the branch that adds root (when root !== document
and root.tagName/role indicates a button/link) still pushes the single root
element.
🧹 Nitpick comments (2)
server/src/sdk/browserSide/pageAnalyzer.js (2)

1953-1977: First-match-wins: scoring in detectFromPaginationWrapper doesn't differentiate between candidates.

Both nextScore and lmScore are unconditionally set to 20, and the guard if (20 > lmScore) / if (20 > nextScore) prevents any subsequent match from replacing the first one. Unlike the other phase functions (detectFromNearbyElements, detectFromFullDocument) where scores vary based on element properties (tag, class, ancestor), here every candidate gets an identical fixed score.

If the pagination wrapper contains multiple clickable elements matching nextButtonTextPatterns (e.g., a disabled-looking "Next" inside an inner div and the real "Next" button), the first one encountered in DOM order wins regardless of quality. Consider adding differentiating signals (e.g., tag type, proximity to active page) as done in the other phases, or at minimum use >= so later (potentially better-positioned) elements can replace earlier ones.


2000-2001: getClickableElements(document) is called twice — once per phase — re-querying the entire DOM.

Both detectFromNearbyElements (line 2001) and detectFromFullDocument (line 2103) independently call getClickableElements(document), which runs six querySelectorAll calls each time across the full document. On large pages this is wasteful. Consider computing the list once in the orchestration block and passing it as a parameter.

Also applies to: 2102-2103

Comment on lines +1085 to +1096
function getClickableElements(root) {
var clickables = [];
var selectors = ['button', 'a', '[role="button"]', '[onclick]', '.btn', '.button'];
for (var i = 0; i < selectors.length; i++) {
var elements = root.querySelectorAll(selectors[i]);
clickables.push.apply(clickables, Array.from(elements));
}
if (root !== document && (root.tagName === 'BUTTON' || root.tagName === 'A' || root.getAttribute('role') === 'button')) {
clickables.push(root);
}
return Array.from(new Set(clickables));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

push.apply can throw a RangeError on pages with many clickable elements.

clickables.push.apply(clickables, Array.from(elements)) spreads all elements as individual arguments. If a single querySelectorAll returns thousands of nodes (common for a tags on link-heavy pages), this can exceed the engine's maximum call-stack size. Use concat or a loop instead.

♻️ Suggested fix
 function getClickableElements(root) {
   var clickables = [];
   var selectors = ['button', 'a', '[role="button"]', '[onclick]', '.btn', '.button'];
   for (var i = 0; i < selectors.length; i++) {
     var elements = root.querySelectorAll(selectors[i]);
-    clickables.push.apply(clickables, Array.from(elements));
+    for (var j = 0; j < elements.length; j++) {
+      clickables.push(elements[j]);
+    }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function getClickableElements(root) {
var clickables = [];
var selectors = ['button', 'a', '[role="button"]', '[onclick]', '.btn', '.button'];
for (var i = 0; i < selectors.length; i++) {
var elements = root.querySelectorAll(selectors[i]);
clickables.push.apply(clickables, Array.from(elements));
}
if (root !== document && (root.tagName === 'BUTTON' || root.tagName === 'A' || root.getAttribute('role') === 'button')) {
clickables.push(root);
}
return Array.from(new Set(clickables));
}
function getClickableElements(root) {
var clickables = [];
var selectors = ['button', 'a', '[role="button"]', '[onclick]', '.btn', '.button'];
for (var i = 0; i < selectors.length; i++) {
var elements = root.querySelectorAll(selectors[i]);
for (var j = 0; j < elements.length; j++) {
clickables.push(elements[j]);
}
}
if (root !== document && (root.tagName === 'BUTTON' || root.tagName === 'A' || root.getAttribute('role') === 'button')) {
clickables.push(root);
}
return Array.from(new Set(clickables));
}
🤖 Prompt for AI Agents
In `@server/src/sdk/browserSide/pageAnalyzer.js` around lines 1085 - 1096, In
getClickableElements, avoid using clickables.push.apply(clickables,
Array.from(elements)) because spreading thousands of nodes can throw a
RangeError; instead concatenate or iterate to append elements safely (e.g., use
clickables = clickables.concat(Array.from(elements)) or loop over elements and
push individually), keep the existing de-duplication via Array.from(new
Set(clickables)), and ensure the branch that adds root (when root !== document
and root.tagName/role indicates a button/link) still pushes the single root
element.

Comment on lines 2061 to +2100
function detectInfiniteScrollScore() {
try {
const debugInfo = {
indicators: [],
score: 0,
threshold: 5
};

const initialItemCount = listElements.length;
const initialHeight = document.documentElement.scrollHeight;
const viewportHeight = window.innerHeight;
const currentScrollY = window.scrollY;

if (initialHeight <= viewportHeight) {
return 0;
}

const loadingIndicators = [
'[class*="loading"]',
'[class*="spinner"]',
'[class*="skeleton"]',
'[aria-busy="true"]',
'[data-loading="true"]',
'.loader',
'.load-more-spinner',
'[class*="load"]',
'[id*="loading"]',
'[id*="spinner"]'
];
var score = 0;
var initialHeight = document.documentElement.scrollHeight;
var viewportHeight = window.innerHeight;

for (const selector of loadingIndicators) {
if (document.querySelector(selector)) {
debugInfo.score += 3;
debugInfo.indicators.push(`Loading indicator: ${selector} (+3)`);
break;
}
}
if (initialHeight <= viewportHeight) return 0;

const sentinelPatterns = [
'[class*="sentinel"]',
'[class*="trigger"]',
'[data-infinite]',
'[data-scroll-trigger]',
'#infinite-scroll-trigger',
'[class*="infinite"]',
'[id*="infinite"]'
var sentinelPatterns = [
'[data-infinite]', '[data-scroll-trigger]',
'#infinite-scroll-trigger', '[class*="infinite-scroll"]', '[id*="infinite-scroll"]',
];

for (const selector of sentinelPatterns) {
if (document.querySelector(selector)) {
debugInfo.score += 4;
debugInfo.indicators.push(`Sentinel element: ${selector} (+4)`);
break;
}
for (var i = 0; i < sentinelPatterns.length; i++) {
if (document.querySelector(sentinelPatterns[i])) { score += 6; break; }
}

const scrollToTopPatterns = [
'[class*="scroll"][class*="top"]',
'[aria-label*="scroll to top"]',
'[title*="back to top"]',
'.back-to-top',
'#back-to-top',
'[class*="scrolltop"]',
'[class*="backtotop"]',
'button[class*="top"]',
'a[href="#top"]',
'a[href="#"]'
var infiniteScrollLibraries = [
'.infinite-scroll', '[data-infinite-scroll]', '[class*="infinite-scroll"]',
];

for (const selector of scrollToTopPatterns) {
const element = document.querySelector(selector);
if (element && isVisible(element)) {
debugInfo.score += 2;
debugInfo.indicators.push('Scroll-to-top button (+2)');
break;
}
for (var j = 0; j < infiniteScrollLibraries.length; j++) {
if (document.querySelector(infiniteScrollLibraries[j])) { score += 6; break; }
}

if (initialHeight > viewportHeight * 3) {
debugInfo.score += 3;
debugInfo.indicators.push(`Very tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+3)`);
} else if (initialHeight > viewportHeight * 2) {
debugInfo.score += 2;
debugInfo.indicators.push(`Tall page (${(initialHeight / viewportHeight).toFixed(1)}x viewport) (+2)`);
}

if (initialItemCount >= 20) {
debugInfo.score += 2;
debugInfo.indicators.push(`Many list items (${initialItemCount}) (+2)`);
} else if (initialItemCount >= 10) {
debugInfo.score += 1;
debugInfo.indicators.push(`Good number of list items (${initialItemCount}) (+1)`);
}

const infiniteScrollLibraries = [
'.infinite-scroll',
'[data-infinite-scroll]',
'[data-flickity]',
'[data-slick]',
'.masonry',
'[data-masonry]',
'[class*="infinite-scroll"]',
'[class*="lazy-load"]',
'[data-lazy]'
var scrollToTopPatterns = [
'[aria-label*="scroll to top" i]', '[title*="back to top" i]',
'.back-to-top', '#back-to-top', '[class*="scrolltop"]', '[class*="backtotop"]',
];

for (const selector of infiniteScrollLibraries) {
if (document.querySelector(selector)) {
debugInfo.score += 4;
debugInfo.indicators.push(`Infinite scroll library: ${selector} (+4)`);
break;
}
for (var k = 0; k < scrollToTopPatterns.length; k++) {
try {
var el = document.querySelector(scrollToTopPatterns[k]);
if (el && isVisible(el)) { score += 2; break; }
} catch (e) { continue; }
}

const lastListItem = listElements[listElements.length - 1];
if (lastListItem) {
const lastItemRect = lastListItem.getBoundingClientRect();
const lastItemY = lastItemRect.bottom + currentScrollY;
const viewportBottom = currentScrollY + viewportHeight;
if (initialHeight > viewportHeight * 5) score += 2;
return score;
} catch (e) {
return 0;
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

detectInfiniteScrollScore: sentinel and library patterns overlap, inflating score.

sentinelPatterns includes [class*="infinite-scroll"] and [id*="infinite-scroll"] (line 2071), while infiniteScrollLibraries also includes [class*="infinite-scroll"] (line 2078). A single element matching [class*="infinite-scroll"] would contribute +6 from each loop (total +12), which could push the score past the threshold of 8 on this overlap alone — potentially misidentifying pages that simply have an "infinite-scroll" class for styling purposes.

Consider deduplicating the selectors or merging the two loops into one so a single element doesn't double-count.

🤖 Prompt for AI Agents
In `@server/src/sdk/browserSide/pageAnalyzer.js` around lines 2061 - 2100,
detectInfiniteScrollScore currently double-counts the same selector because
sentinelPatterns and infiniteScrollLibraries both include
'[class*="infinite-scroll"]', inflating score; update detectInfiniteScrollScore
to deduplicate selectors (e.g., merge sentinelPatterns and
infiniteScrollLibraries into a single array or filter out duplicates before
querying) so each unique selector (from
sentinelPatterns/infiniteScrollLibraries) only adds to score once, keeping the
rest of the logic (score increments, scrollToTopPatterns checks,
initialHeight/viewportHeight checks) unchanged.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/components/recorder/RightSidePanel.tsx (2)

636-640: ⚠️ Potential issue | 🟡 Minor

Socket listener cleanup relies on a 5-second timeout — fragile on slow connections.

If paginationScrollTestResult arrives after 5 seconds, the handler has already been removed and the result is silently lost. The user sees no feedback and autoDetectedPagination stays null. Consider removing the arbitrary timeout and instead tying cleanup to component unmount or an explicit cancel action.


460-712: ⚠️ Potential issue | 🟠 Major

handleConfirmListCapture is very large and has a stale-closure risk with autoDetectedPagination.

The useCallback at Line 460 does not list autoDetectedPagination in its dependency array (Line 712). The check at Line 649 (autoDetectedPagination && ...) will therefore use a stale value. If auto-detection completes after the first render of this callback, the guard will incorrectly evaluate to false, and startPaginationMode() will be called even though auto-detection succeeded.

Add autoDetectedPagination to the dependency array:

- }, [captureStage, paginationType, limitType, customLimit, startPaginationMode, setShowPaginationOptions, setCaptureStage, notify, stopPaginationMode, startLimitMode, setShowLimitOptions, stopLimitMode, stopCaptureAndEmitGetListSettings, t, browserSteps, currentListActionId, updateListStepLimit]);
+ }, [captureStage, paginationType, limitType, customLimit, startPaginationMode, setShowPaginationOptions, setCaptureStage, notify, stopPaginationMode, startLimitMode, setShowLimitOptions, stopLimitMode, stopCaptureAndEmitGetListSettings, t, browserSteps, currentListActionId, updateListStepLimit, autoDetectedPagination]);

Also consider extracting the 'initial' case body into a separate function for readability.

🤖 Fix all issues with AI agents
In `@src/components/recorder/RightSidePanel.tsx`:
- Line 492: Several new notify(...) calls in RightSidePanel.tsx use hardcoded
English strings and bypass i18n; replace each raw string with the translated
string using the existing t() function (e.g., change notify("success", "Scroll
Down pagination has been auto-detected.") to notify("success", t("..."))).
Update every new notify invocation introduced in this PR (references: notify
calls around the identifiers in RightSidePanel.tsx where "Scroll Down pagination
has been auto-detected." and the other occurrences flagged) to wrap the message
in t(...) with an appropriate translation key or literal used elsewhere in this
file so the UI text is translatable. Ensure you import or reference the same t
function used in this component and keep the notify severity and signature
unchanged.
- Around line 522-527: The branch handling detectionResult.type currently
notifies for 'scrollDown'/'scrollUp' but ignores 'clickNext' and
'clickLoadMore', leaving users without feedback when selector is falsy; update
the branch that checks detectionResult.type to also call notify("success", ...)
for 'clickNext' and 'clickLoadMore' (using the same or an appropriate message)
when those types are detected, and then prevent duplicate notifications by
guarding the later notification that fires when detectionResult.selector is
present (e.g., remove or add a conditional so the later notify only runs when
the earlier one didn't already run or when selector exists). Ensure changes
reference detectionResult.type, detectionResult.selector and the notify(...)
call in the RightSidePanel component.
🧹 Nitpick comments (1)
src/components/recorder/RightSidePanel.tsx (1)

549-596: evaluateSelector is duplicated five times in this file.

The same XPath/CSS selector evaluation logic is copy-pasted at Lines 95, 389, 549, 767, and 972. Extract it into a shared utility (e.g., in a helper module alongside clientSelectorGenerator) and import it.

Sketch
// e.g., src/helpers/evaluateSelector.ts
export function evaluateSelector(selector: string, doc: Document): Element[] {
  if (selector.startsWith('//') || selector.startsWith('(//')) {
    try {
      const result = doc.evaluate(selector, doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
      const elements: Element[] = [];
      for (let i = 0; i < result.snapshotLength; i++) {
        const node = result.snapshotItem(i);
        if (node && node.nodeType === Node.ELEMENT_NODE) {
          elements.push(node as Element);
        }
      }
      return elements;
    } catch { return []; }
  }
  try {
    return Array.from(doc.querySelectorAll(selector));
  } catch { return []; }
}

Also applies to: 767-789, 972-994, 389-411, 95-117


const handleScrollTestResult = (result: any) => {
if (result.success && result.contentLoaded) {
notify("success", "Scroll Down pagination has been auto-detected.");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Hardcoded English strings bypass i18n.

All new notification messages use raw English strings instead of the t() translation function used elsewhere in this file. This will result in untranslated UI text for non-English users.

Example fix for one notification
-  notify("success", "Scroll Down pagination has been auto-detected.");
+  notify("success", t('right_panel.pagination.auto_detected_scroll_down'));

Apply the same pattern to every new notify(...) call introduced in this PR.

Also applies to: 524-524, 526-526, 615-615, 624-624

🤖 Prompt for AI Agents
In `@src/components/recorder/RightSidePanel.tsx` at line 492, Several new
notify(...) calls in RightSidePanel.tsx use hardcoded English strings and bypass
i18n; replace each raw string with the translated string using the existing t()
function (e.g., change notify("success", "Scroll Down pagination has been
auto-detected.") to notify("success", t("..."))). Update every new notify
invocation introduced in this PR (references: notify calls around the
identifiers in RightSidePanel.tsx where "Scroll Down pagination has been
auto-detected." and the other occurrences flagged) to wrap the message in t(...)
with an appropriate translation key or literal used elsewhere in this file so
the UI text is translatable. Ensure you import or reference the same t function
used in this component and keep the notify severity and signature unchanged.

Comment on lines 522 to +527
if (detectionResult.type) {
if (detectionResult.type === 'scrollDown') {
notify("success", "Scroll Down pagination has been auto-detected.");
} else if (detectionResult.type === 'scrollUp') {
notify("success", "Scroll Up pagination has been auto-detected.");
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing notification for clickNext/clickLoadMore detection types in this branch.

When detectionResult.type is clickNext or clickLoadMore, this block silently falls through. If the element also has a selector, notification eventually fires at Line 615—but if detectionResult.selector is falsy, the user receives no feedback at all about the auto-detection result.

Proposed fix
                     if (detectionResult.type) {
                       if (detectionResult.type === 'scrollDown') {
                         notify("success", "Scroll Down pagination has been auto-detected.");
                       } else if (detectionResult.type === 'scrollUp') {
                         notify("success", "Scroll Up pagination has been auto-detected.");
+                      } else if (detectionResult.type === 'clickNext') {
+                        notify("success", "Next Button pagination has been auto-detected.");
+                      } else if (detectionResult.type === 'clickLoadMore') {
+                        notify("success", "Load More pagination has been auto-detected.");
                       }

Note: with this change, clickNext/clickLoadMore with a selector would get notified twice (here and at Line 615). You may want to guard the Line 615 notification or remove it in favor of this single notification point.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (detectionResult.type) {
if (detectionResult.type === 'scrollDown') {
notify("success", "Scroll Down pagination has been auto-detected.");
} else if (detectionResult.type === 'scrollUp') {
notify("success", "Scroll Up pagination has been auto-detected.");
}
if (detectionResult.type) {
if (detectionResult.type === 'scrollDown') {
notify("success", "Scroll Down pagination has been auto-detected.");
} else if (detectionResult.type === 'scrollUp') {
notify("success", "Scroll Up pagination has been auto-detected.");
} else if (detectionResult.type === 'clickNext') {
notify("success", "Next Button pagination has been auto-detected.");
} else if (detectionResult.type === 'clickLoadMore') {
notify("success", "Load More pagination has been auto-detected.");
}
🤖 Prompt for AI Agents
In `@src/components/recorder/RightSidePanel.tsx` around lines 522 - 527, The
branch handling detectionResult.type currently notifies for
'scrollDown'/'scrollUp' but ignores 'clickNext' and 'clickLoadMore', leaving
users without feedback when selector is falsy; update the branch that checks
detectionResult.type to also call notify("success", ...) for 'clickNext' and
'clickLoadMore' (using the same or an appropriate message) when those types are
detected, and then prevent duplicate notifications by guarding the later
notification that fires when detectionResult.selector is present (e.g., remove
or add a conditional so the later notify only runs when the earlier one didn't
already run or when selector exists). Ensure changes reference
detectionResult.type, detectionResult.selector and the notify(...) call in the
RightSidePanel component.

@amhsirak amhsirak merged commit cd715a8 into getmaxun:develop Feb 25, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants