@RohitR311
Collaborator

@RohitR311 RohitR311 commented Nov 18, 2025

What does this PR do?

Assigns unique action names for capture list actions and prevents mixing of action data across different action types.

Summary by CodeRabbit

  • New Features
    • Per-action/per-step naming for scraping results so each step’s outputs are tracked by name; non-paginated lists auto-number when unnamed.
  • Bug Fixes
    • More consistent handling and serialization of list/schema results (success and error paths) and improved pagination to avoid premature writes.
  • Improvements
    • Emitted payloads now include type, name and normalized data; step naming simplified and action state preserved across serializations for more reliable workflows.


@coderabbitai

coderabbitai bot commented Nov 18, 2025

Walkthrough

Introduces per-action-name tracking for scrapeList and scrapeSchema, adds a private scrapeListCounter for auto-generated list names, threads step/action names into actions, initializes per-name entries in serializable structures, changes serializableCallback to emit per-action-name payloads, and preserves current action state across invocations.

Changes

Cohort / File(s) | Summary
Core interpret updates (maxun-core/src/interpret.ts) | Added private scrapeListCounter. scrapeSchema and scrapeList accept optional actionName, derive a per-action name, store results under namedResults and serializableDataByType keyed by that name, distinguish paginated vs non-paginated flows (paginationUsed), auto-increment list names for non-paginated runs, pass step/action names through, log completion, and ensure per-name initialization before serialization.
Server-side Interpreter adjustments (server/src/workflow-management/classes/Interpreter.ts) | serializableCallback now resolves typeKey and actionName dynamically from incoming payloads (handles object-with-key or array payloads), normalizes/flattens data, writes to serializableDataByType[typeKey][actionName], persists { [actionName]: flattened }, emits { type, name, data }, and no longer resets currentActionType/currentActionName after handling.
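
As a rough illustration of the server-side behavior summarized above, the sketch below resolves a name from either an object-with-key payload or an array payload, flattens the data, and returns a { type, name, data } record. The names normalizePayload, flatten, Row, and Emitted are hypothetical, and the flattening rule is an assumption; the actual logic in Interpreter.ts may differ.

```typescript
// Hypothetical types for illustration only; not taken from the PR.
type Row = Record<string, unknown>;
type Emitted = { type: string; name: string; data: Row[] };

// Assumed flattening rule: arrays pass through unchanged, object maps are
// flattened into one array of rows, anything else becomes an empty array.
function flatten(data: unknown): Row[] {
  if (Array.isArray(data)) return data as Row[];
  if (data !== null && typeof data === "object") {
    const map = data as Record<string, unknown>;
    const out: Row[] = [];
    for (const key of Object.keys(map)) {
      const value = map[key];
      if (Array.isArray(value)) out.push(...(value as Row[]));
      else if (value !== null && typeof value === "object") out.push(value as Row);
    }
    return out;
  }
  return [];
}

// Resolve the action name from the payload when it is a name -> rows map
// (using the last key, as the review describes); otherwise use the fallback.
function normalizePayload(typeKey: string, payload: unknown, fallbackName: string): Emitted {
  let name = fallbackName;
  let data: unknown = payload;
  if (payload !== null && typeof payload === "object" && !Array.isArray(payload)) {
    const map = payload as Record<string, unknown>;
    const keys = Object.keys(map);
    if (keys.length > 0) {
      name = keys[keys.length - 1];
      data = map[name];
    }
  }
  return { type: typeKey, name, data: flatten(data) };
}
```

Keying the emitted record by both type and name is what keeps results from different actions apart in this sketch.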

Sequence Diagram(s)

sequenceDiagram
    participant Runner as carryOutSteps
    participant Core as maxun-core/interpret.ts
    participant ServerInterp as server/Interpreter
    participant Serializer as serializableCallback

    Runner->>Core: call scrapeList/scrapeSchema(actionName = step.name?)
    activate Core
    Core->>Core: derive name = actionName || auto "List N"
    Core->>Core: ensure serializableDataByType[type][name] exists
    alt Paginated
        Core->>Core: handlePagination -> append pages to serializableDataByType[type][name]
        Core->>Serializer: emit updates per page ({type,name,data})
    else Non-paginated
        Core->>Core: append results to serializableDataByType[type][name]
        Core->>Serializer: emit final ({type,name,data})
    end
    Core-->>Runner: log completion
    deactivate Core

    Serializer->>ServerInterp: persistDataToDatabase(type, { [name]: flattened })
    ServerInterp-->>Serializer: ack

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify per-name initialization and consistent use of serializableDataByType[type][name] and namedResults.
  • Check pagination vs non-paginated flows for duplicate writes and correct lifecycle of paginationUsed.
  • Inspect serializableCallback normalization and persistence paths to ensure correct flattening and naming.
  • Confirm preserved currentActionType/currentActionName doesn't break callers expecting resets.

Possibly related PRs

Suggested labels

Type: Bug

Suggested reviewers

  • amhsirak

Poem

🐇 I named each list from one to N,

Maps filled neatly, page by page again.
Names flow through steps, no state lost in flight,
A rabbit cheers for orderly bytes tonight.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main objective of the PR: fixing pagination data persistence for multiple actions by introducing per-action naming to prevent data mixing.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@RohitR311 RohitR311 changed the title fix: pagination data persistence for multiple actions fix(maxun-core): pagination data persistence for multiple actions Nov 18, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (5)
server/src/workflow-management/classes/Interpreter.ts (1)

570-612: scrapeList/scrapeSchema routing & naming look correct; consider a couple of safety tweaks

The new typeKey routing and scrapeList-specific actionName derivation from data.scrapeList/data.scrapeSchema are aligned with the new per-action maps exposed by maxun-core and should prevent cross‑type mixing. The flattened computation also correctly preserves arrays and handles object maps.

Two small robustness points to consider:

  • For scrapeList, deriving actionName from Object.keys(data) and picking the last key assumes the core interpreter keeps insertion order such that the most recently added list is the last key. That’s true today (each new scrapeList call inserts a new key and never mutates older keys), but if the core ever starts updating older keys, you could end up attributing results to the wrong action. A future‑proof variant would prefer this.currentActionName when keys.length > 1, and only fall back to the “last key” heuristic if currentActionName doesn’t exist in the map.

  • For non‑scrapeList types (e.g. scrapeSchema or custom types), when this.currentActionName is empty you’ll still create an entry with actionName === "". If you want to avoid anonymous buckets, you could also run such cases through getUniqueActionName (not only for scrapeList) so every persisted series has a readable key.

These are non‑blocking but would make the serialization more resilient to future changes in maxun-core and avoid odd anonymous keys.
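
A minimal sketch of the first suggestion, assuming the scrapeList payload is a plain name-to-rows map. The function resolveListActionName is hypothetical and not part of the PR; it prefers the tracked name when several keys exist and only then falls back to the "last key" heuristic.

```typescript
// Hypothetical helper: choose which key of a scrapeList map an update belongs to.
function resolveListActionName(
  data: Record<string, unknown>,
  currentActionName: string
): string {
  const keys = Object.keys(data);
  if (keys.length === 0) return currentActionName;
  if (keys.length === 1) return keys[0];

  // Several lists are present: trust the tracked name if it is a real key.
  if (
    currentActionName !== "" &&
    Object.prototype.hasOwnProperty.call(data, currentActionName)
  ) {
    return currentActionName;
  }

  // Otherwise fall back to the most recently inserted key.
  return keys[keys.length - 1];
}
```

With this ordering, an update to an older list would still be attributed correctly as long as currentActionName is set before the callback fires.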

maxun-core/src/interpret.ts (4)

85-85: scrapeListCounter appropriately centralizes unnamed list naming

Introducing private scrapeListCounter: number = 0; is a clean way to generate deterministic “List N” names for unnamed scrapeList actions across both non‑pagination and pagination flows. Given the interpreter instances are typically per‑run, this is safe; if you ever reuse an Interpreter instance across multiple workflows, you may want a small reset in run/teardown to avoid counters carrying over between runs, but that’s purely optional.


580-628: Non‑pagination scrapeList flow now correctly tracks paginationUsed and per‑action maps

The revised scrapeList implementation does a few important things right:

  • Distinguishes pagination vs non‑pagination via paginationUsed, so only one of handlePagination or the local serializableCallback path runs.
  • Names unnamed lists as List N using scrapeListCounter, ensuring consistent keys that match what the server expects when it derives names from the scrapeList map.
  • Initializes serializableDataByType[actionType][actionName] defensively and pushes scrapeResults into the correct per‑action bucket before emitting the full map to serializableCallback.

This matches the PR’s objective of unique, action‑scoped list outputs and avoids data mixing between different list actions. The only nit is the direct console.log for “ScrapeList completed…”; if you want fully consistent logging, you could route that through this.log instead, but it’s not functionally critical.


632-649: Error‑path actionName handling for scrapeList is consistent with success path

In the catch block of scrapeList, deriving actionName from config.__name and falling back to List N via scrapeListCounter ensures that even failed list actions get a stable entry in both namedResults and serializableDataByType. That keeps the output shape predictable for the consumer and prevents collisions between different list actions, which is exactly what you want for debugging and persistence.

If you want to harden the logging slightly, you could also guard error.message with a fallback (error instanceof Error ? error.message : String(error)), but that’s a minor robustness tweak.
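
The guard mentioned above could be wrapped in a small helper so every catch block formats errors the same way. The name toErrorMessage is hypothetical; this is a sketch, not code from the PR.

```typescript
// Hypothetical helper: a caught value is typed `unknown`, so narrow it
// before reading `.message`; non-Error values are stringified instead.
function toErrorMessage(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}
```

This avoids logging "undefined" as the message when something other than an Error instance is thrown, such as a plain string.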


839-852: Paginated scrapeList now initializes per‑action maps and updates them incrementally

The new initialization in handlePagination:

  • Computes actionName from config.__name or List N using scrapeListCounter.
  • Ensures this.serializableDataByType[actionType][actionName] exists as an array before scraping.

and the update in scrapeCurrentPage:

  • Replaces this.serializableDataByType[actionType][actionName] with allResults and then emits the full { scrapeList, scrapeSchema } maps on each page.

Together, this keeps paginated list results isolated per action, with stable names that align with the non‑pagination path and what the server expects. RETRY_DELAY being hoisted to a single constant also keeps the retry timing consistent across pagination helpers.

There’s some duplication between the actionType/actionName/map‑init logic here and in scrapeList/the error path; if this evolves further, consider extracting a small helper (e.g. getOrInitListBucket(config)) to centralize that behavior, but the current implementation is clear and correct.

Also applies to: 858-858, 892-897
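
One possible shape for the getOrInitListBucket helper suggested above is sketched below. The class ListBuckets, the Row type, and the return shape are assumptions made for illustration; only scrapeListCounter, serializableDataByType, config.__name, and the "List N" naming come from the review.

```typescript
// Hypothetical row type for illustration.
type Row = Record<string, unknown>;

class ListBuckets {
  private scrapeListCounter = 0;
  readonly serializableDataByType: Record<string, Record<string, Row[]>> = {};

  // Derive the action name (explicit name, else "List N") and make sure
  // serializableDataByType.scrapeList[name] exists before anything is pushed.
  getOrInitListBucket(config: { __name?: string }): { name: string; bucket: Row[] } {
    const actionType = "scrapeList";
    const provided = (config.__name || "").trim();

    let name = provided;
    if (name === "") {
      this.scrapeListCounter++;
      name = `List ${this.scrapeListCounter}`;
    }

    if (!this.serializableDataByType[actionType]) {
      this.serializableDataByType[actionType] = {};
    }
    if (!Array.isArray(this.serializableDataByType[actionType][name])) {
      this.serializableDataByType[actionType][name] = [];
    }

    return { name, bucket: this.serializableDataByType[actionType][name] };
  }
}
```

Note that each unnamed call consumes a counter value, so in this sketch the helper should run once per action, with the resulting name passed along to the pagination and error paths.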

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 334fbbc and 801ae5a.

📒 Files selected for processing (2)
  • maxun-core/src/interpret.ts (5 hunks)
  • server/src/workflow-management/classes/Interpreter.ts (1 hunks)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
maxun-core/src/interpret.ts (1)

739-755: Remove unnecessary as any type casts—name is properly defined in the What type.

The What type already defines name?: string, so casting to as any is unnecessary and defeats TypeScript's type safety. Simplify:

  • Line 739: stepName = step.name || String(step.action);
  • Line 754: actionName = step.name || "";

The fallback value difference (action name vs. empty string) may be intentional, but consider adding a comment if so.

🧹 Nitpick comments (2)
maxun-core/src/interpret.ts (2)

563-651: Consider extracting name derivation logic to reduce duplication.

The name derivation logic (lines 610-615 and lines 633-638) is duplicated between the success and error paths. While the current implementation is correct, extracting this into a helper method would improve maintainability and reduce the risk of inconsistent behavior if the logic needs to change.

Example refactor:

private deriveActionName(providedName: string, actionType: 'scrapeList' | 'scrapeSchema'): string {
  let name = providedName || "";
  if (!name || name.trim() === "") {
    if (actionType === 'scrapeList') {
      this.scrapeListCounter++;
      name = `List ${this.scrapeListCounter}`;
    } else {
      name = "Texts";
    }
  }
  return name;
}

Then use: const name = this.deriveActionName(actionName, "scrapeList");


817-836: Consider extracting name derivation logic (same duplication as scrapeList).

The name derivation logic (lines 824-828) is duplicated from scrapeList. This is the third occurrence of the same pattern. Extracting this to a helper method (as suggested in the scrapeList comment) would eliminate all three duplications and ensure consistency.

The logic itself is correct: when scrapeList calls handlePagination with an empty actionName, handlePagination appropriately increments the counter and generates a unique name.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 801ae5a and 1b8e504.

📒 Files selected for processing (1)
  • maxun-core/src/interpret.ts (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
maxun-core/src/interpret.ts (1)
maxun-core/src/types/workflow.ts (1)
  • CustomFunctions (31-31)
🔇 Additional comments (2)
maxun-core/src/interpret.ts (2)

85-85: LGTM! Counter field properly initialized.

The scrapeListCounter field is correctly initialized and will be used to generate unique default names for scrapeList actions.


489-561: LGTM! Properly implements per-action naming for scrapeSchema.

The function correctly:

  • Accepts an optional actionName parameter with proper default fallback to "Texts"
  • Initializes nested data structures defensively before storing results
  • Maintains consistent data organization by action type and name
  • Spreads cumulativeResults to avoid unintended mutations


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/workflow-management/classes/Interpreter.ts (1)

551-588: Fix variable redeclaration and simplify the data extraction logic.

There are several issues with this code segment:

  1. Critical: Variable redeclaration - Line 571 redeclares actionName which was already declared at line 556. This is flagged by static analysis and causes confusion about which value is used later.

  2. Contradictory logic - Line 553 has an early return if currentActionName is missing, but then lines 572-588 attempt to derive actionName from the data structure. This suggests the early return check might be too strict or the logic is redundant.

  3. Complex control flow - The data extraction happens in two stages (lines 565-569, then 572-581), making it difficult to trace what data contains at any point.

Apply this diff to fix the redeclaration:

-          let actionName = "";
-          if (typeKey === "scrapeList" && data && typeof data === "object" && !Array.isArray(data)) {
+          // Resolve actionName from data structure for scrapeList
+          if (typeKey === "scrapeList" && data && typeof data === "object" && !Array.isArray(data)) {
             const keys = Object.keys(data);
             if (keys.length === 1) {
               actionName = keys[0];
               data = data[actionName];
             } else if (keys.length > 1) {
               actionName = keys[keys.length - 1];
               data = data[actionName];
             }
           }
+          // Similar logic for scrapeSchema if needed
+          else if (typeKey === "scrapeSchema" && data && typeof data === "object" && !Array.isArray(data)) {
+            const keys = Object.keys(data);
+            if (keys.length > 0) {
+              actionName = keys[keys.length - 1];
+              data = data[actionName];
+            }
+          }
 
           if (!actionName) {
             actionName = this.currentActionName || "";

Additionally, consider removing the early return check on line 553 for currentActionName since the code now derives it dynamically.

maxun-core/src/interpret.ts (1)

489-561: Critical data loss bug: Multiple scrapeSchema actions without names cause overwrites.

The implementation lacks collision protection when actionName is not provided. While scrapeList auto-generates names like "List 1", "List 2" (maxun-core/src/interpret.ts line 827), scrapeSchema defaults all unnamed actions to the single key "Texts" (line 545). This means consecutive scrapeSchema calls without assigned names will overwrite prior results instead of accumulating them.

Required fix: Add counter-based name generation for scrapeSchema matching the pattern used for scrapeList, or ensure all workflow steps assign names via the name field.
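
To illustrate the counter-based option, the sketch below keeps one counter per action type so that unnamed scrapeSchema actions also receive distinct keys. The class ActionNamer and the "Texts N" naming scheme are hypothetical; the PR itself only auto-numbers scrapeList.

```typescript
type ActionType = "scrapeList" | "scrapeSchema";

// Hypothetical namer: explicit names are kept, unnamed actions are numbered
// per type so that two unnamed schema actions no longer share the key "Texts".
class ActionNamer {
  private counters: Record<ActionType, number> = { scrapeList: 0, scrapeSchema: 0 };

  next(actionType: ActionType, providedName?: string): string {
    const trimmed = (providedName || "").trim();
    if (trimmed !== "") return trimmed;

    const prefix = actionType === "scrapeList" ? "List" : "Texts";
    this.counters[actionType] += 1;
    return `${prefix} ${this.counters[actionType]}`;
  }
}
```

A consumer that currently looks up the literal key "Texts" would need to be updated if such a scheme were adopted, which is one reason to treat this as a sketch rather than a drop-in change.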

🧹 Nitpick comments (2)
maxun-core/src/interpret.ts (2)

739-755: Consider adding explicit type definition for the name property.

The code correctly derives and passes action names, but uses (step as any) to access the name property. This bypasses TypeScript's type checking and could mask issues.

Consider adding the name property to the step/action type definition:

// In types/workflow.ts or similar
interface What {
  action: string | CustomFunctions;
  args?: any;
  name?: string;  // Add this optional property
}

This would enable proper type checking and remove the need for type assertions while maintaining backward compatibility.


823-828: Consider extracting duplicated action naming logic into a helper method.

The auto-naming logic appears in multiple places:

  • Lines 612-615 (scrapeList non-paginated)
  • Lines 636-638 (scrapeList error path)
  • Lines 825-828 (handlePagination)

This duplication increases maintenance burden and risk of inconsistency.

Consider extracting this into a private helper method:

private getOrGenerateActionName(actionType: 'scrapeList' | 'scrapeSchema', providedName: string): string {
  let name = providedName || "";
  
  if (!name || name.trim() === "") {
    if (actionType === 'scrapeList') {
      this.scrapeListCounter++;
      name = `List ${this.scrapeListCounter}`;
    } else {
      name = "Texts"; // or similar default for scrapeSchema
    }
  }
  
  return name;
}

Then use it consistently across all locations:

const name = this.getOrGenerateActionName('scrapeList', actionName);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b8e504 and 8171e51.

📒 Files selected for processing (2)
  • maxun-core/src/interpret.ts (9 hunks)
  • server/src/workflow-management/classes/Interpreter.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
maxun-core/src/interpret.ts (1)
maxun-core/src/types/workflow.ts (1)
  • CustomFunctions (31-31)
🪛 Biome (2.1.2)
server/src/workflow-management/classes/Interpreter.ts

[error] 571-571: Shouldn't redeclare 'actionName'. Consider to delete it or rename it.

'actionName' is defined here:

(lint/suspicious/noRedeclare)

🔇 Additional comments (3)
maxun-core/src/interpret.ts (2)

85-85: LGTM! Counter field for auto-generating list names.

The addition of scrapeListCounter enables automatic generation of unique names like "List 1", "List 2", etc., when explicit action names aren't provided.


563-651: LGTM! Per-action-name tracking correctly implemented with auto-naming.

The implementation properly:

  • Accepts actionName parameter for explicit naming
  • Auto-generates unique names ("List 1", "List 2", etc.) when not provided via scrapeListCounter
  • Differentiates paginated vs non-paginated flows with paginationUsed flag
  • Maintains per-name storage consistency in both success and error paths
  • Only emits serializable callbacks for non-paginated flows (pagination handles its own callbacks)
server/src/workflow-management/classes/Interpreter.ts (1)

590-613: No issues found—state management and normalization logic are correct.

Verification confirms the implementation is intentional and working correctly:

  1. .List property - This is a recognized, intentional data structure pattern used consistently in the codebase (appears at lines 593 and 781). The optional chaining safely handles cases where it doesn't exist.

  2. State preservation - The state variables currentActionType and currentActionName are intentionally preserved during a single workflow execution. They are set fresh before each action via setActionType and setActionName callbacks, and data is keyed by both typeKey and actionName to prevent misattribution. Importantly, the clearState() method properly resets all state (including these variables) when interpretation stops, ensuring clean separation between workflow invocations.

The code operates as designed with no data attribution risks.
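
The lifecycle described above can be pictured with the simplified sketch below. The class ActionState and its current() accessor are hypothetical; only the names setActionType, setActionName, clearState, currentActionType, and currentActionName come from the review, and the real Interpreter class holds much more state.

```typescript
// Hypothetical, simplified model of the state handling described in the review.
class ActionState {
  private currentActionType: string | null = null;
  private currentActionName: string | null = null;

  // Set fresh before each action runs.
  setActionType(type: string): void {
    this.currentActionType = type;
  }

  setActionName(name: string): void {
    this.currentActionName = name;
  }

  // Reading the state does not reset it, so several serializable
  // callbacks for the same action all see the same type and name.
  current(): { type: string | null; name: string | null } {
    return { type: this.currentActionType, name: this.currentActionName };
  }

  // Called when interpretation stops, so nothing leaks into the next run.
  clearState(): void {
    this.currentActionType = null;
    this.currentActionName = null;
  }
}
```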

@amhsirak amhsirak merged commit 1bb1cc8 into getmaxun:develop Nov 20, 2025
1 check passed