-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix(maxun-core): pagination data persistence for multiple actions #890
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughIntroduces per-action-name tracking for scrapeList and scrapeSchema, adds a private scrapeListCounter for auto-generated list names, threads step/action names into actions, initializes per-name entries in serializable structures, changes serializableCallback to emit per-action-name payloads, and preserves current action state across invocations. Changes
Sequence Diagram(s)sequenceDiagram
participant Runner as carryOutSteps
participant Core as maxun-core/interpret.ts
participant ServerInterp as server/Interpreter
participant Serializer as serializableCallback
Runner->>Core: call scrapeList/scrapeSchema(actionName = step.name?)
activate Core
Core->>Core: derive name = actionName || auto "List N"
Core->>Core: ensure serializableDataByType[type][name] exists
alt Paginated
Core->>Core: handlePagination -> append pages to serializableDataByType[type][name]
Core->>Serializer: emit updates per page ({type,name,data})
else Non-paginated
Core->>Core: append results to serializableDataByType[type][name]
Core->>Serializer: emit final ({type,name,data})
end
Core-->>Runner: log completion
deactivate Core
Serializer->>ServerInterp: persistDataToDatabase(type, { [name]: flattened })
ServerInterp-->>Serializer: ack
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
server/src/workflow-management/classes/Interpreter.ts (1)
570-612: scrapeList/scrapeSchema routing & naming look correct; consider a couple of safety tweaksThe new
typeKeyrouting andscrapeList-specific actionName derivation fromdata.scrapeList/data.scrapeSchemaare aligned with the new per-action maps exposed bymaxun-coreand should prevent cross‑type mixing. Theflattenedcomputation also correctly preserves arrays and handles object maps.Two small robustness points to consider:
For
scrapeList, derivingactionNamefromObject.keys(data)and picking the last key assumes the core interpreter keeps insertion order such that the most recently added list is the last key. That’s true today (each newscrapeListcall inserts a new key and never mutates older keys), but if the core ever starts updating older keys, you could end up attributing results to the wrong action. A future‑proof variant would preferthis.currentActionNamewhenkeys.length > 1, and only fall back to the “last key” heuristic ifcurrentActionNamedoesn’t exist in the map.For non‑
scrapeListtypes (e.g.scrapeSchemaor custom types), whenthis.currentActionNameis empty you’ll still create an entry withactionName === "". If you want to avoid anonymous buckets, you could also run such cases throughgetUniqueActionName(not only forscrapeList) so every persisted series has a readable key.These are non‑blocking but would make the serialization more resilient to future changes in
maxun-coreand avoid odd anonymous keys.maxun-core/src/interpret.ts (4)
85-85: scrapeListCounter appropriately centralizes unnamed list namingIntroducing
private scrapeListCounter: number = 0;is a clean way to generate deterministic “List N” names for unnamedscrapeListactions across both non‑pagination and pagination flows. Given the interpreter instances are typically per‑run, this is safe; if you ever reuse anInterpreterinstance across multiple workflows, you may want a small reset inrun/teardown to avoid counters carrying over between runs, but that’s purely optional.
580-628: Non‑pagination scrapeList flow now correctly tracks paginationUsed and per‑action mapsThe revised
scrapeListimplementation does a few important things right:
- Distinguishes pagination vs non‑pagination via
paginationUsed, so only one ofhandlePaginationor the localserializableCallbackpath runs.- Names unnamed lists as
List NusingscrapeListCounter, ensuring consistent keys that match what the server expects when it derives names from thescrapeListmap.- Initializes
serializableDataByType[actionType][actionName]defensively and pushesscrapeResultsinto the correct per‑action bucket before emitting the full map toserializableCallback.This matches the PR’s objective of unique, action‑scoped list outputs and avoids data mixing between different list actions. The only nit is the direct
console.logfor “ScrapeList completed…”; if you want fully consistent logging, you could route that throughthis.loginstead, but it’s not functionally critical.
632-649: Error‑path actionName handling for scrapeList is consistent with success pathIn the
catchblock ofscrapeList, derivingactionNamefromconfig.__nameand falling back toList NviascrapeListCounterensures that even failed list actions get a stable entry in bothnamedResultsandserializableDataByType. That keeps the output shape predictable for the consumer and prevents collisions between different list actions, which is exactly what you want for debugging and persistence.If you want to harden the logging slightly, you could also guard
error.messagewith a fallback (error instanceof Error ? error.message : String(error)), but that’s a minor robustness tweak.
839-852: Paginated scrapeList now initializes per‑action maps and updates them incrementallyThe new initialization in
handlePagination:
- Computes
actionNamefromconfig.__nameorList NusingscrapeListCounter.- Ensures
this.serializableDataByType[actionType][actionName]exists as an array before scraping.and the update in
scrapeCurrentPage:
- Replaces
this.serializableDataByType[actionType][actionName]withallResultsand then emits the full{ scrapeList, scrapeSchema }maps on each page.Together, this keeps paginated list results isolated per action, with stable names that align with the non‑pagination path and what the server expects.
RETRY_DELAYbeing hoisted to a single constant also keeps the retry timing consistent across pagination helpers.There’s some duplication between the
actionType/actionName/map‑init logic here and inscrapeList/the error path; if this evolves further, consider extracting a small helper (e.g.getOrInitListBucket(config)) to centralize that behavior, but the current implementation is clear and correct.Also applies to: 858-858, 892-897
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
maxun-core/src/interpret.ts(5 hunks)server/src/workflow-management/classes/Interpreter.ts(1 hunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
maxun-core/src/interpret.ts (1)
739-755: Remove unnecessaryas anytype casts—nameis properly defined in theWhattype.The
Whattype already definesname?: string, so casting toas anyis unnecessary and defeats TypeScript's type safety. Simplify:
- Line 739:
stepName = step.name || String(step.action);- Line 754:
actionName = step.name || "";The fallback value difference (action name vs. empty string) may be intentional, but consider adding a comment if so.
🧹 Nitpick comments (2)
maxun-core/src/interpret.ts (2)
563-651: Consider extracting name derivation logic to reduce duplication.The name derivation logic (lines 610-615 and lines 633-638) is duplicated between the success and error paths. While the current implementation is correct, extracting this into a helper method would improve maintainability and reduce the risk of inconsistent behavior if the logic needs to change.
Example refactor:
private deriveActionName(providedName: string, actionType: 'scrapeList' | 'scrapeSchema'): string { let name = providedName || ""; if (!name || name.trim() === "") { if (actionType === 'scrapeList') { this.scrapeListCounter++; name = `List ${this.scrapeListCounter}`; } else { name = "Texts"; } } return name; }Then use:
const name = this.deriveActionName(actionName, "scrapeList");
817-836: Consider extracting name derivation logic (same duplication as scrapeList).The name derivation logic (lines 824-828) is duplicated from
scrapeList. This is the third occurrence of the same pattern. Extracting this to a helper method (as suggested in thescrapeListcomment) would eliminate all three duplications and ensure consistency.The logic itself is correct: when
scrapeListcallshandlePaginationwith an emptyactionName,handlePaginationappropriately increments the counter and generates a unique name.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
maxun-core/src/interpret.ts(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
maxun-core/src/interpret.ts (1)
maxun-core/src/types/workflow.ts (1)
CustomFunctions(31-31)
🔇 Additional comments (2)
maxun-core/src/interpret.ts (2)
85-85: LGTM! Counter field properly initialized.The
scrapeListCounterfield is correctly initialized and will be used to generate unique default names for scrapeList actions.
489-561: LGTM! Properly implements per-action naming for scrapeSchema.The function correctly:
- Accepts an optional
actionNameparameter with proper default fallback to "Texts"- Initializes nested data structures defensively before storing results
- Maintains consistent data organization by action type and name
- Spreads
cumulativeResultsto avoid unintended mutations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/workflow-management/classes/Interpreter.ts (1)
551-588: Fix variable redeclaration and simplify the data extraction logic.There are several issues with this code segment:
Critical: Variable redeclaration - Line 571 redeclares
actionNamewhich was already declared at line 556. This is flagged by static analysis and causes confusion about which value is used later.Contradictory logic - Line 553 has an early return if
currentActionNameis missing, but then lines 572-588 attempt to deriveactionNamefrom the data structure. This suggests the early return check might be too strict or the logic is redundant.Complex control flow - The data extraction happens in two stages (lines 565-569, then 572-581), making it difficult to trace what
datacontains at any point.Apply this diff to fix the redeclaration:
- let actionName = ""; - if (typeKey === "scrapeList" && data && typeof data === "object" && !Array.isArray(data)) { + // Resolve actionName from data structure for scrapeList + if (typeKey === "scrapeList" && data && typeof data === "object" && !Array.isArray(data)) { const keys = Object.keys(data); if (keys.length === 1) { actionName = keys[0]; data = data[actionName]; } else if (keys.length > 1) { actionName = keys[keys.length - 1]; data = data[actionName]; } } + // Similar logic for scrapeSchema if needed + else if (typeKey === "scrapeSchema" && data && typeof data === "object" && !Array.isArray(data)) { + const keys = Object.keys(data); + if (keys.length > 0) { + actionName = keys[keys.length - 1]; + data = data[actionName]; + } + } if (!actionName) { actionName = this.currentActionName || "";Additionally, consider removing the early return check on line 553 for
currentActionNamesince the code now derives it dynamically.maxun-core/src/interpret.ts (1)
489-561: Critical data loss bug: Multiple scrapeSchema actions without names cause overwrites.The implementation lacks collision protection when
actionNameis not provided. WhilescrapeListauto-generates names like"List 1","List 2"(maxun-core/src/interpret.ts line 827),scrapeSchemadefaults all unnamed actions to the single key"Texts"(line 545). This means consecutivescrapeSchemacalls without assigned names will overwrite prior results instead of accumulating them.Required fix: Add counter-based name generation for
scrapeSchemamatching the pattern used forscrapeList, or ensure all workflow steps assign names via thenamefield.
🧹 Nitpick comments (2)
maxun-core/src/interpret.ts (2)
739-755: Consider adding explicit type definition for thenameproperty.The code correctly derives and passes action names, but uses
(step as any)to access thenameproperty. This bypasses TypeScript's type checking and could mask issues.Consider adding the
nameproperty to the step/action type definition:// In types/workflow.ts or similar interface What { action: string | CustomFunctions; args?: any; name?: string; // Add this optional property }This would enable proper type checking and remove the need for type assertions while maintaining backward compatibility.
823-828: Consider extracting duplicated action naming logic into a helper method.The auto-naming logic appears in multiple places:
- Lines 612-615 (scrapeList non-paginated)
- Lines 636-638 (scrapeList error path)
- Lines 825-828 (handlePagination)
This duplication increases maintenance burden and risk of inconsistency.
Consider extracting this into a private helper method:
private getOrGenerateActionName(actionType: 'scrapeList' | 'scrapeSchema', providedName: string): string { let name = providedName || ""; if (!name || name.trim() === "") { if (actionType === 'scrapeList') { this.scrapeListCounter++; name = `List ${this.scrapeListCounter}`; } else { name = "Texts"; // or similar default for scrapeSchema } } return name; }Then use it consistently across all locations:
const name = this.getOrGenerateActionName('scrapeList', actionName);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
maxun-core/src/interpret.ts(9 hunks)server/src/workflow-management/classes/Interpreter.ts(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
maxun-core/src/interpret.ts (1)
maxun-core/src/types/workflow.ts (1)
CustomFunctions(31-31)
🪛 Biome (2.1.2)
server/src/workflow-management/classes/Interpreter.ts
[error] 571-571: Shouldn't redeclare 'actionName'. Consider to delete it or rename it.
'actionName' is defined here:
(lint/suspicious/noRedeclare)
🔇 Additional comments (3)
maxun-core/src/interpret.ts (2)
85-85: LGTM! Counter field for auto-generating list names.The addition of
scrapeListCounterenables automatic generation of unique names like "List 1", "List 2", etc., when explicit action names aren't provided.
563-651: LGTM! Per-action-name tracking correctly implemented with auto-naming.The implementation properly:
- Accepts
actionNameparameter for explicit naming- Auto-generates unique names ("List 1", "List 2", etc.) when not provided via
scrapeListCounter- Differentiates paginated vs non-paginated flows with
paginationUsedflag- Maintains per-name storage consistency in both success and error paths
- Only emits serializable callbacks for non-paginated flows (pagination handles its own callbacks)
server/src/workflow-management/classes/Interpreter.ts (1)
590-613: No issues found—state management and normalization logic are correct.Verification confirms the implementation is intentional and working correctly:
.Listproperty - This is a recognized, intentional data structure pattern used consistently in the codebase (appears at lines 593 and 781). The optional chaining safely handles cases where it doesn't exist.State preservation - The state variables
currentActionTypeandcurrentActionNameare intentionally preserved during a single workflow execution. They are set fresh before each action viasetActionTypeandsetActionNamecallbacks, and data is keyed by bothtypeKeyandactionNameto prevent misattribution. Importantly, theclearState()method properly resets all state (including these variables) when interpretation stops, ensuring clean separation between workflow invocations.The code operates as designed with no data attribution risks.
What this PR does?
Assigns unique action names for capture list actions and prevents mixing of action data across different action types.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.