-
Notifications
You must be signed in to change notification settings - Fork 1.1k
chore: release v0.0.4 #225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
docker-compose.yml (1)
46-46: Consider documenting release changesSince this is a release PR (v0.0.4), it would be helpful to document the changes or link to release notes.
Consider adding a comment above the image line documenting the key changes in this release or providing a link to the changelog/release notes.
+ # Release v0.0.4 - Add link to release notes or brief changelog here image: getmaxun/maxun-backend:v0.0.4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
docker-compose.yml(1 hunks)
🔇 Additional comments (1)
docker-compose.yml (1)
46-46: Verify the backend image availability
The version bump from v0.0.3 to v0.0.4 looks good. Let's verify the image exists in the registry.
✅ Verification successful
Image version v0.0.4 is available and actively used
The Docker image getmaxun/maxun-backend:v0.0.4 exists in Docker Hub and is in active state. It was pushed on December 4th, 2024, and was last pulled on December 5th, 2024, indicating it's being actively used.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence of the new backend image version in the registry
# Note: Using curl to check Docker Hub API for the image tag
curl -s -f "https://hub.docker.com/v2/repositories/getmaxun/maxun-backend/tags/v0.0.4" || echo "Image not found"
Length of output: 1201
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (5)
maxun-core/src/interpret.ts (3)
557-566: Performance concern in pagination handling.Using
JSON.stringifyfor uniqueness checking can be expensive for large objects. Consider implementing a more efficient unique key generation strategy.-const uniqueKey = JSON.stringify(item); +const uniqueKey = Object.entries(item) + .sort(([a], [b]) => a.localeCompare(b)) + .map(([k, v]) => `${k}:${v}`) + .join('|');🧰 Tools
🪛 Biome (1.9.4)
[error] 560-565: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.The declaration is defined in this switch clause:
Unsafe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
630-644: Improve logging in getMatchingActionId.The debug logging is verbose and could impact performance. Consider adding a debug flag check.
private getMatchingActionId(workflow: Workflow, pageState: PageState, usedActions: string[]) { for (let actionId = workflow.length - 1; actionId >= 0; actionId--) { const step = workflow[actionId]; const isApplicable = this.applicable(step.where, pageState, usedActions); + if (this.options.debug) { console.log("-------------------------------------------------------------"); console.log(`Where:`, step.where); console.log(`Page state:`, pageState); console.log(`Match result: ${isApplicable}`); console.log("-------------------------------------------------------------"); + } if (isApplicable) { return actionId; } } }
Line range hint
647-741: Critical: Potential memory leak in workflow execution.The workflow copy and selectors arrays are continuously modified but never cleaned up, which could lead to memory issues during long-running workflows.
- Clear the arrays when the workflow completes
- Consider implementing a maximum workflow size check
private async runLoop(p: Page, workflow: Workflow) { const workflowCopy: Workflow = JSON.parse(JSON.stringify(workflow)); + const MAX_WORKFLOW_SIZE = 1000; // Adjust based on requirements + + if (workflowCopy.length > MAX_WORKFLOW_SIZE) { + throw new Error(`Workflow size exceeds maximum limit of ${MAX_WORKFLOW_SIZE}`); + } // ... existing code ... while (true) { // ... existing code ... if (!action) { + // Clean up + selectors = []; + workflowCopy.length = 0; return; } } }server/src/browser-management/classes/RemoteBrowser.ts (2)
93-107: Add unit tests fornormalizeUrlmethodThe
normalizeUrlmethod is crucial for preventing navigation loops and ensuring consistent URL formatting. Adding unit tests will help ensure it handles all edge cases correctly.
112-119: Add unit tests forshouldEmitUrlChangemethodThe
shouldEmitUrlChangemethod determines if a URL change is significant. Unit tests can validate its logic against various scenarios, improving reliability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
maxun-core/src/interpret.ts(12 hunks)server/src/browser-management/classes/RemoteBrowser.ts(6 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
maxun-core/src/interpret.ts
[error] 233-233: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
[error] 560-565: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Unsafe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
🔇 Additional comments (5)
maxun-core/src/interpret.ts (3)
151-168: Improved selector collection implementation.
The new implementation is more efficient as it:
- Uses Set for automatic deduplication
- Processes selectors in reverse order
- Returns early when selectors are found
560-565:
Fix variable scope in switch statement.
The static analysis tool correctly flags a potential issue with variable scope in the switch statement.
case 'clickNext':
+ {
const pageResults = await page.evaluate((cfg) => window.scrapeList(cfg), config);
// Filter out already scraped items
const newResults = pageResults.filter(item => {
const uniqueKey = JSON.stringify(item);
if (scrapedItems.has(uniqueKey)) return false;
scrapedItems.add(uniqueKey);
return true;
});
allResults = allResults.concat(newResults);
if (config.limit && allResults.length >= config.limit) {
return allResults.slice(0, config.limit);
}
+ }Likely invalid or redundant comment.
🧰 Tools
🪛 Biome (1.9.4)
[error] 560-565: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.
The declaration is defined in this switch clause:
Unsafe fix: Wrap the declaration in a block.
(lint/correctness/noSwitchDeclarations)
180-236: Verify the impact of URL override logic.
The URL override logic at lines 224-225 might cause issues if action.where.url is unexpectedly undefined or if the URL comparison is case-sensitive.
✅ Verification successful
Based on the search results, I can now provide a final response:
URL comparison logic is properly implemented with necessary safeguards
The URL override logic is safe because:
- The code has a null check with
action && action.where.url - The URL comparison includes an explicit check for "about:blank"
- The
whereproperty and its URL field are properly typed and validated:- Joi schema validation enforces URL format:
url: [Joi.string().uri(), regex] - WhereWhatPair interface properly defines the URL structure
- Joi schema validation enforces URL format:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for potential URL comparison edge cases
ast-grep --pattern 'if ($condition && $_ !== $_) {
$_ = $_;
}'
Length of output: 66
Script:
#!/bin/bash
# Search for URL comparison patterns in the codebase
rg "action.*where\.url.*!==" -A 2
# Check the type definition of action.where
ast-grep --pattern 'interface $_ {
where: {
$$$
}
}'
# Search for URL-related error handling
rg "action.*where\.url" -A 3 -B 1
Length of output: 641
Script:
#!/bin/bash
# Check the type definitions and interfaces related to the action object
ast-grep --pattern 'type $_ = {
where: $_;
}'
# Look for Action interface definition
ast-grep --pattern 'interface Action {
$$$
}'
# Search for URL-related type checks or validations
rg "where\?: \{" -A 3
Length of output: 123
Script:
#!/bin/bash
# Search for where property definitions with different pattern
rg "where:\s*{" -A 3
# Look for type imports related to Action
rg "import.*Action" -A 1
# Check for URL property definitions
rg "url:.*" -A 1 -B 2
Length of output: 11700
🧰 Tools
🪛 Biome (1.9.4)
[error] 233-233: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
server/src/browser-management/classes/RemoteBrowser.ts (2)
18-18: Dependency version locking for 'idcac-playwright'
As previously noted, ensure that the idcac-playwright package version is locked in package.json to maintain release stability.
Run the following script to check the dependency version:
#!/bin/bash
# Description: Check the 'idcac-playwright' dependency version in package.json
grep '"idcac-playwright":' package.json121-149: Refactored event listener setup improves code quality
Extracting event listener setup into setupPageEventListeners enhances code readability and adheres to the DRY principle by reducing duplication.
| const executeAction = async (invokee: any, methodName: string, args: any) => { | ||
| console.log("Executing action:", methodName, args); | ||
| if (!args || Array.isArray(args)) { | ||
| await (<any>invokee[methodName])(...(args ?? [])); | ||
| } else { | ||
| await (<any>invokee[methodName])(args); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider error handling in executeAction.
The executeAction function should include try-catch blocks to handle potential invocation errors gracefully.
const executeAction = async (invokee: any, methodName: string, args: any) => {
console.log("Executing action:", methodName, args);
+ try {
if (!args || Array.isArray(args)) {
await (<any>invokee[methodName])(...(args ?? []));
} else {
await (<any>invokee[methodName])(args);
}
+ } catch (error) {
+ console.error(`Failed to execute ${methodName}:`, error);
+ throw error;
+ }
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const executeAction = async (invokee: any, methodName: string, args: any) => { | |
| console.log("Executing action:", methodName, args); | |
| if (!args || Array.isArray(args)) { | |
| await (<any>invokee[methodName])(...(args ?? [])); | |
| } else { | |
| await (<any>invokee[methodName])(args); | |
| } | |
| }; | |
| const executeAction = async (invokee: any, methodName: string, args: any) => { | |
| console.log("Executing action:", methodName, args); | |
| try { | |
| if (!args || Array.isArray(args)) { | |
| await (<any>invokee[methodName])(...(args ?? [])); | |
| } else { | |
| await (<any>invokee[methodName])(args); | |
| } | |
| } catch (error) { | |
| console.error(`Failed to execute ${methodName}:`, error); | |
| throw error; | |
| } | |
| }; |
| // Handle page load events with retry mechanism | ||
| page.on('load', async () => { | ||
| const injectScript = async (): Promise<boolean> => { | ||
| try { | ||
| await page.waitForLoadState('networkidle', { timeout: 5000 }); | ||
|
|
||
| await page.evaluate(getInjectableScript()); | ||
| return true; | ||
| } catch (error: any) { | ||
| logger.log('warn', `Script injection attempt failed: ${error.message}`); | ||
| return false; | ||
| } | ||
| }; | ||
|
|
||
| const success = await injectScript(); | ||
| console.log("Script injection result:", success); | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Implement a retry mechanism for script injection
The code mentions a retry mechanism, but currently, the script injection is attempted only once. Consider implementing retries to handle transient failures during script injection.
You could modify the injectScript function as follows:
const injectScript = async (): Promise<boolean> => {
+ const maxRetries = 3;
+ let attempts = 0;
+ while (attempts < maxRetries) {
+ try {
+ await page.waitForLoadState('networkidle', { timeout: 5000 });
+ await page.evaluate(getInjectableScript());
+ return true;
+ } catch (error: any) {
+ attempts++;
+ logger.log('warn', `Script injection attempt ${attempts} failed: ${error.message}`);
+ if (attempts >= maxRetries) {
+ return false;
+ }
+ }
+ }
+ return false;
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Handle page load events with retry mechanism | |
| page.on('load', async () => { | |
| const injectScript = async (): Promise<boolean> => { | |
| try { | |
| await page.waitForLoadState('networkidle', { timeout: 5000 }); | |
| await page.evaluate(getInjectableScript()); | |
| return true; | |
| } catch (error: any) { | |
| logger.log('warn', `Script injection attempt failed: ${error.message}`); | |
| return false; | |
| } | |
| }; | |
| const success = await injectScript(); | |
| console.log("Script injection result:", success); | |
| }); | |
| // Handle page load events with retry mechanism | |
| page.on('load', async () => { | |
| const injectScript = async (): Promise<boolean> => { | |
| const maxRetries = 3; | |
| let attempts = 0; | |
| while (attempts < maxRetries) { | |
| try { | |
| await page.waitForLoadState('networkidle', { timeout: 5000 }); | |
| await page.evaluate(getInjectableScript()); | |
| return true; | |
| } catch (error: any) { | |
| attempts++; | |
| logger.log('warn', `Script injection attempt ${attempts} failed: ${error.message}`); | |
| if (attempts >= maxRetries) { | |
| return false; | |
| } | |
| } | |
| } | |
| return false; | |
| }; | |
| const success = await injectScript(); | |
| console.log("Script injection result:", success); | |
| }); |
feat: emit `setGetList` false socket event in pagination mode
feat: conditional selection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
src/context/browserActions.tsx (2)
Line range hint
3-3: Consider removing the empty string from CaptureStage typeThe empty string in
CaptureStagetype could lead to ambiguous states since 'initial' already represents the starting state. Consider making it more strict by removing the empty string option.-export type CaptureStage = 'initial' | 'pagination' | 'limit' | 'complete' | ''; +export type CaptureStage = 'initial' | 'pagination' | 'limit' | 'complete';Also applies to: 8-8
55-57: Consider adding state transition validationThe current implementation allows any state transition without validation. This could lead to unexpected states if multiple actions are triggered simultaneously. Consider implementing a state machine to enforce valid transitions.
Example implementation:
const isValidTransition = (current: CaptureStage, next: CaptureStage): boolean => { const transitions: Record<CaptureStage, CaptureStage[]> = { 'initial': ['pagination', 'complete'], 'pagination': ['limit', 'complete'], 'limit': ['complete'], 'complete': ['initial'] }; return transitions[current]?.includes(next) ?? false; }; // Usage in setCaptureStage const setCaptureStage = (nextStage: CaptureStage) => { if (!isValidTransition(captureStage, nextStage)) { console.warn(`Invalid state transition from ${captureStage} to ${nextStage}`); return; } setStage(nextStage); };Also applies to: 63-66, 71-75, 77-83
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
server/src/workflow-management/classes/Generator.ts(2 hunks)server/src/workflow-management/selector.ts(2 hunks)src/context/browserActions.tsx(1 hunks)
🔇 Additional comments (7)
src/context/browserActions.tsx (2)
Line range hint 39-39: LGTM: State management is well-structured
The state initialization and management for captureStage follows React best practices with proper typing.
Line range hint 1-140: Verify integration with workflow management system
The new capture stage management needs to be properly integrated with the workflow management system mentioned in the AI summary.
✅ Verification successful
Integration of capture stage management is properly implemented
The verification shows that the new capture stage management is well integrated:
ActionDescriptionBoxandRightSidePanelcomponents properly consume and handle the capture stages- The capture flow is correctly implemented with stage transitions (initial → pagination → limit → complete)
- UI elements are appropriately controlled based on the capture stage
- No direct coupling with the workflow system is needed as it operates at a higher abstraction level
The changes are self-contained within the browser actions context and its consuming components, requiring no additional integration work.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for components consuming the capture stage
rg -l "useActionContext.*captureStage"
# Check for any direct references to capture stages
rg -l "'initial'|'pagination'|'limit'|'complete'" --type tsx --type ts
# Look for potential workflow integration points
ast-grep --pattern 'workflow.$$$captureStage'
Length of output: 213
Script:
#!/bin/bash
# Check for components consuming the capture stage with proper file type
rg -l "useActionContext.*captureStage"
# Check for any direct references to capture stages
rg -l "'initial'|'pagination'|'limit'|'complete'" --type ts
# Look for workflow related files
fd -e ts -e tsx workflow
# Check for components using ActionContext
rg -l "useActionContext"
# Check for files importing browserActions context
rg -l "from ['\"]\./context/browserActions['\"]"
Length of output: 641
Script:
#!/bin/bash
# Check how captureStage is used in the components
rg "captureStage" src/components/molecules/ActionDescriptionBox.tsx -A 2 -B 2
rg "captureStage" src/components/organisms/RightSidePanel.tsx -A 2 -B 2
# Check workflow type definitions
cat maxun-core/src/types/workflow.ts
# Look for any state management in workflow files
rg "state" src/api/workflow.ts
Length of output: 4563
server/src/workflow-management/classes/Generator.ts (2)
544-544: LGTM: Proper integration of listSelector parameter
The addition of this.listSelector parameter to getElementInformation call is consistent with the updated function signature and improves selector generation accuracy.
572-574: LGTM: Consistent parameter passing
The changes maintain consistency by passing this.listSelector to both getRect and getElementInformation calls.
server/src/workflow-management/selector.ts (3)
99-101:
Fix potential division by zero in overlap calculation
The calculation of significantOverlap could result in a division by zero if the parent element has zero area.
Apply this fix:
- const significantOverlap =
- (childRect.width * childRect.height) /
- (parentRect.width * parentRect.height) > 0.5;
+ const parentArea = parentRect.width * parentRect.height;
+ const childArea = childRect.width * childRect.height;
+ let significantOverlap = false;
+ if (parentArea > 0) {
+ significantOverlap = (childArea / parentArea) > 0.5;
+ }77-84: 🛠️ Refactor suggestion
Extract duplicated container tags array
The container tags array is duplicated in three locations. This violates the DRY principle and makes maintenance more difficult.
Extract the array to a constant at the file level:
+ const CONTAINER_TAGS = ['DIV', 'SECTION', 'ARTICLE', 'MAIN', 'HEADER', 'FOOTER', 'NAV', 'ASIDE',
+ 'ADDRESS', 'BLOCKQUOTE', 'DETAILS', 'DIALOG', 'FIGURE', 'FIGCAPTION', 'MAIN', 'MARK', 'SUMMARY', 'TIME',
+ 'TABLE', 'THEAD', 'TBODY', 'TFOOT', 'TR', 'TH', 'TD', 'CAPTION', 'COLGROUP', 'COL', 'FORM', 'FIELDSET',
+ 'LEGEND', 'LABEL', 'INPUT', 'BUTTON', 'SELECT', 'DATALIST', 'OPTGROUP', 'OPTION', 'TEXTAREA', 'OUTPUT',
+ 'PROGRESS', 'METER', 'DETAILS', 'SUMMARY', 'MENU', 'MENUITEM', 'MENUITEM', 'APPLET', 'EMBED', 'OBJECT',
+ 'PARAM', 'VIDEO', 'AUDIO', 'SOURCE', 'TRACK', 'CANVAS', 'MAP', 'AREA', 'SVG', 'IFRAME', 'FRAME', 'FRAMESET',
+ 'LI', 'UL', 'OL', 'DL', 'DT', 'DD', 'HR', 'P', 'PRE', 'LISTING', 'PLAINTEXT', 'A'
+ ] as const;Then use CONTAINER_TAGS in all three locations instead of duplicating the array.
Also applies to: 205-212, 917-924
940-942:
Fix potential division by zero in getNonUniqueSelectors
Similar to the earlier issue, this calculation could result in a division by zero.
Apply the same fix here:
- const significantOverlap =
- (childRect.width * childRect.height) /
- (parentRect.width * parentRect.height) > 0.5;
+ const parentArea = parentRect.width * parentRect.height;
+ const childArea = childRect.width * childRect.height;
+ let significantOverlap = false;
+ if (parentArea > 0) {
+ significantOverlap = (childArea / parentArea) > 0.5;
+ }Likely invalid or redundant comment.
| const startPaginationMode = () => { | ||
| setPaginationMode(true); | ||
| setCaptureStage('pagination'); | ||
| socket?.emit('setGetList', { getList: false }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for socket communication
The socket emit lacks error handling and could silently fail. Consider adding error handling and user feedback.
-socket?.emit('setGetList', { getList: false });
+try {
+ if (!socket) {
+ throw new Error('Socket connection not available');
+ }
+ socket.emit('setGetList', { getList: false });
+} catch (error) {
+ console.error('Failed to communicate with server:', error);
+ // Consider adding user feedback mechanism
+}Committable suggestion skipped: line range outside the PR's diff.
| export const getElementInformation = async ( | ||
| page: Page, | ||
| coordinates: Coordinates, | ||
| listSelector: string, | ||
| ) => { | ||
| try { | ||
| const rect = await page.evaluate( | ||
| async ({ x, y }) => { | ||
| const el = document.elementFromPoint(x, y) as HTMLElement; | ||
| if (el) { | ||
| const { parentElement } = el; | ||
| // Match the logic in recorder.ts for link clicks | ||
| const element = parentElement?.tagName === 'A' ? parentElement : el; | ||
| const rectangle = element?.getBoundingClientRect(); | ||
| // @ts-ignore | ||
| if (rectangle) { | ||
| return { | ||
| x: rectangle.x, | ||
| y: rectangle.y, | ||
| width: rectangle.width, | ||
| height: rectangle.height, | ||
| top: rectangle.top, | ||
| right: rectangle.right, | ||
| bottom: rectangle.bottom, | ||
| left: rectangle.left, | ||
| if (listSelector !== '') { | ||
| const elementInfo = await page.evaluate( | ||
| async ({ x, y }) => { | ||
| const el = document.elementFromPoint(x, y) as HTMLElement; | ||
| if (el) { | ||
| const { parentElement } = el; | ||
| const element = parentElement?.tagName === 'A' ? parentElement : el; | ||
| let info: { | ||
| tagName: string; | ||
| hasOnlyText?: boolean; | ||
| innerText?: string; | ||
| url?: string; | ||
| imageUrl?: string; | ||
| attributes?: Record<string, string>; | ||
| innerHTML?: string; | ||
| outerHTML?: string; | ||
| } = { | ||
| tagName: element?.tagName ?? '', | ||
| }; | ||
| if (element) { | ||
| info.attributes = Array.from(element.attributes).reduce( | ||
| (acc, attr) => { | ||
| acc[attr.name] = attr.value; | ||
| return acc; | ||
| }, | ||
| {} as Record<string, string> | ||
| ); | ||
| } | ||
| // Gather specific information based on the tag | ||
| if (element?.tagName === 'A') { | ||
| info.url = (element as HTMLAnchorElement).href; | ||
| info.innerText = element.innerText ?? ''; | ||
| } else if (element?.tagName === 'IMG') { | ||
| info.imageUrl = (element as HTMLImageElement).src; | ||
| } else { | ||
| info.hasOnlyText = element?.children?.length === 0 && | ||
| element?.innerText?.length > 0; | ||
| info.innerText = element?.innerText ?? ''; | ||
| } | ||
| info.innerHTML = element.innerHTML; | ||
| info.outerHTML = element.outerHTML; | ||
| return info; | ||
| } | ||
| } | ||
| }, | ||
| { x: coordinates.x, y: coordinates.y }, | ||
| ); | ||
| return rect; | ||
| return null; | ||
| }, | ||
| { x: coordinates.x, y: coordinates.y }, | ||
| ); | ||
| return elementInfo; | ||
| } else { | ||
| const elementInfo = await page.evaluate( | ||
| async ({ x, y }) => { | ||
| const originalEl = document.elementFromPoint(x, y) as HTMLElement; | ||
| if (originalEl) { | ||
| let element = originalEl; | ||
|
|
||
| const containerTags = ['DIV', 'SECTION', 'ARTICLE', 'MAIN', 'HEADER', 'FOOTER', 'NAV', 'ASIDE', | ||
| 'ADDRESS', 'BLOCKQUOTE', 'DETAILS', 'DIALOG', 'FIGURE', 'FIGCAPTION', 'MAIN', 'MARK', 'SUMMARY', 'TIME', | ||
| 'TABLE', 'THEAD', 'TBODY', 'TFOOT', 'TR', 'TH', 'TD', 'CAPTION', 'COLGROUP', 'COL', 'FORM', 'FIELDSET', | ||
| 'LEGEND', 'LABEL', 'INPUT', 'BUTTON', 'SELECT', 'DATALIST', 'OPTGROUP', 'OPTION', 'TEXTAREA', 'OUTPUT', | ||
| 'PROGRESS', 'METER', 'DETAILS', 'SUMMARY', 'MENU', 'MENUITEM', 'MENUITEM', 'APPLET', 'EMBED', 'OBJECT', | ||
| 'PARAM', 'VIDEO', 'AUDIO', 'SOURCE', 'TRACK', 'CANVAS', 'MAP', 'AREA', 'SVG', 'IFRAME', 'FRAME', 'FRAMESET', | ||
| 'LI', 'UL', 'OL', 'DL', 'DT', 'DD', 'HR', 'P', 'PRE', 'LISTING', 'PLAINTEXT', 'A' | ||
| ]; | ||
| while (element.parentElement) { | ||
| const parentRect = element.parentElement.getBoundingClientRect(); | ||
| const childRect = element.getBoundingClientRect(); | ||
|
|
||
| if (!containerTags.includes(element.parentElement.tagName)) { | ||
| break; | ||
| } | ||
|
|
||
| const fullyContained = | ||
| parentRect.left <= childRect.left && | ||
| parentRect.right >= childRect.right && | ||
| parentRect.top <= childRect.top && | ||
| parentRect.bottom >= childRect.bottom; | ||
|
|
||
| const significantOverlap = | ||
| (childRect.width * childRect.height) / | ||
| (parentRect.width * parentRect.height) > 0.5; | ||
|
|
||
| if (fullyContained && significantOverlap) { | ||
| element = element.parentElement; | ||
| } else { | ||
| break; | ||
| } | ||
| } | ||
|
|
||
| let info: { | ||
| tagName: string; | ||
| hasOnlyText?: boolean; | ||
| innerText?: string; | ||
| url?: string; | ||
| imageUrl?: string; | ||
| attributes?: Record<string, string>; | ||
| innerHTML?: string; | ||
| outerHTML?: string; | ||
| } = { | ||
| tagName: element?.tagName ?? '', | ||
| }; | ||
|
|
||
| if (element) { | ||
| info.attributes = Array.from(element.attributes).reduce( | ||
| (acc, attr) => { | ||
| acc[attr.name] = attr.value; | ||
| return acc; | ||
| }, | ||
| {} as Record<string, string> | ||
| ); | ||
| } | ||
|
|
||
| if (element?.tagName === 'A') { | ||
| info.url = (element as HTMLAnchorElement).href; | ||
| info.innerText = element.innerText ?? ''; | ||
| } else if (element?.tagName === 'IMG') { | ||
| info.imageUrl = (element as HTMLImageElement).src; | ||
| } else { | ||
| info.hasOnlyText = element?.children?.length === 0 && | ||
| element?.innerText?.length > 0; | ||
| info.innerText = element?.innerText ?? ''; | ||
| } | ||
|
|
||
| info.innerHTML = element.innerHTML; | ||
| info.outerHTML = element.outerHTML; | ||
| return info; | ||
| } | ||
| return null; | ||
| }, | ||
| { x: coordinates.x, y: coordinates.y }, | ||
| ); | ||
| return elementInfo; | ||
| } | ||
| } catch (error) { | ||
| const { message, stack } = error as Error; | ||
| logger.log('error', `Error while retrieving selector: ${message}`); | ||
| logger.log('error', `Stack: ${stack}`); | ||
| console.error('Error while retrieving selector:', message); | ||
| console.error('Stack:', stack); | ||
| } | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance error handling in getElementInformation
The function has good logic for handling different element types, but the error handling could be improved:
- The error is logged but no error type is returned
- The function implicitly returns undefined on error
Consider applying this change:
} catch (error) {
const { message, stack } = error as Error;
console.error('Error while retrieving selector:', message);
console.error('Stack:', stack);
+ return {
+ error: message,
+ tagName: '',
+ };
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export const getElementInformation = async ( | |
| page: Page, | |
| coordinates: Coordinates, | |
| listSelector: string, | |
| ) => { | |
| try { | |
| const rect = await page.evaluate( | |
| async ({ x, y }) => { | |
| const el = document.elementFromPoint(x, y) as HTMLElement; | |
| if (el) { | |
| const { parentElement } = el; | |
| // Match the logic in recorder.ts for link clicks | |
| const element = parentElement?.tagName === 'A' ? parentElement : el; | |
| const rectangle = element?.getBoundingClientRect(); | |
| // @ts-ignore | |
| if (rectangle) { | |
| return { | |
| x: rectangle.x, | |
| y: rectangle.y, | |
| width: rectangle.width, | |
| height: rectangle.height, | |
| top: rectangle.top, | |
| right: rectangle.right, | |
| bottom: rectangle.bottom, | |
| left: rectangle.left, | |
| if (listSelector !== '') { | |
| const elementInfo = await page.evaluate( | |
| async ({ x, y }) => { | |
| const el = document.elementFromPoint(x, y) as HTMLElement; | |
| if (el) { | |
| const { parentElement } = el; | |
| const element = parentElement?.tagName === 'A' ? parentElement : el; | |
| let info: { | |
| tagName: string; | |
| hasOnlyText?: boolean; | |
| innerText?: string; | |
| url?: string; | |
| imageUrl?: string; | |
| attributes?: Record<string, string>; | |
| innerHTML?: string; | |
| outerHTML?: string; | |
| } = { | |
| tagName: element?.tagName ?? '', | |
| }; | |
| if (element) { | |
| info.attributes = Array.from(element.attributes).reduce( | |
| (acc, attr) => { | |
| acc[attr.name] = attr.value; | |
| return acc; | |
| }, | |
| {} as Record<string, string> | |
| ); | |
| } | |
| // Gather specific information based on the tag | |
| if (element?.tagName === 'A') { | |
| info.url = (element as HTMLAnchorElement).href; | |
| info.innerText = element.innerText ?? ''; | |
| } else if (element?.tagName === 'IMG') { | |
| info.imageUrl = (element as HTMLImageElement).src; | |
| } else { | |
| info.hasOnlyText = element?.children?.length === 0 && | |
| element?.innerText?.length > 0; | |
| info.innerText = element?.innerText ?? ''; | |
| } | |
| info.innerHTML = element.innerHTML; | |
| info.outerHTML = element.outerHTML; | |
| return info; | |
| } | |
| } | |
| }, | |
| { x: coordinates.x, y: coordinates.y }, | |
| ); | |
| return rect; | |
| return null; | |
| }, | |
| { x: coordinates.x, y: coordinates.y }, | |
| ); | |
| return elementInfo; | |
| } else { | |
| const elementInfo = await page.evaluate( | |
| async ({ x, y }) => { | |
| const originalEl = document.elementFromPoint(x, y) as HTMLElement; | |
| if (originalEl) { | |
| let element = originalEl; | |
| const containerTags = ['DIV', 'SECTION', 'ARTICLE', 'MAIN', 'HEADER', 'FOOTER', 'NAV', 'ASIDE', | |
| 'ADDRESS', 'BLOCKQUOTE', 'DETAILS', 'DIALOG', 'FIGURE', 'FIGCAPTION', 'MAIN', 'MARK', 'SUMMARY', 'TIME', | |
| 'TABLE', 'THEAD', 'TBODY', 'TFOOT', 'TR', 'TH', 'TD', 'CAPTION', 'COLGROUP', 'COL', 'FORM', 'FIELDSET', | |
| 'LEGEND', 'LABEL', 'INPUT', 'BUTTON', 'SELECT', 'DATALIST', 'OPTGROUP', 'OPTION', 'TEXTAREA', 'OUTPUT', | |
| 'PROGRESS', 'METER', 'DETAILS', 'SUMMARY', 'MENU', 'MENUITEM', 'MENUITEM', 'APPLET', 'EMBED', 'OBJECT', | |
| 'PARAM', 'VIDEO', 'AUDIO', 'SOURCE', 'TRACK', 'CANVAS', 'MAP', 'AREA', 'SVG', 'IFRAME', 'FRAME', 'FRAMESET', | |
| 'LI', 'UL', 'OL', 'DL', 'DT', 'DD', 'HR', 'P', 'PRE', 'LISTING', 'PLAINTEXT', 'A' | |
| ]; | |
| while (element.parentElement) { | |
| const parentRect = element.parentElement.getBoundingClientRect(); | |
| const childRect = element.getBoundingClientRect(); | |
| if (!containerTags.includes(element.parentElement.tagName)) { | |
| break; | |
| } | |
| const fullyContained = | |
| parentRect.left <= childRect.left && | |
| parentRect.right >= childRect.right && | |
| parentRect.top <= childRect.top && | |
| parentRect.bottom >= childRect.bottom; | |
| const significantOverlap = | |
| (childRect.width * childRect.height) / | |
| (parentRect.width * parentRect.height) > 0.5; | |
| if (fullyContained && significantOverlap) { | |
| element = element.parentElement; | |
| } else { | |
| break; | |
| } | |
| } | |
| let info: { | |
| tagName: string; | |
| hasOnlyText?: boolean; | |
| innerText?: string; | |
| url?: string; | |
| imageUrl?: string; | |
| attributes?: Record<string, string>; | |
| innerHTML?: string; | |
| outerHTML?: string; | |
| } = { | |
| tagName: element?.tagName ?? '', | |
| }; | |
| if (element) { | |
| info.attributes = Array.from(element.attributes).reduce( | |
| (acc, attr) => { | |
| acc[attr.name] = attr.value; | |
| return acc; | |
| }, | |
| {} as Record<string, string> | |
| ); | |
| } | |
| if (element?.tagName === 'A') { | |
| info.url = (element as HTMLAnchorElement).href; | |
| info.innerText = element.innerText ?? ''; | |
| } else if (element?.tagName === 'IMG') { | |
| info.imageUrl = (element as HTMLImageElement).src; | |
| } else { | |
| info.hasOnlyText = element?.children?.length === 0 && | |
| element?.innerText?.length > 0; | |
| info.innerText = element?.innerText ?? ''; | |
| } | |
| info.innerHTML = element.innerHTML; | |
| info.outerHTML = element.outerHTML; | |
| return info; | |
| } | |
| return null; | |
| }, | |
| { x: coordinates.x, y: coordinates.y }, | |
| ); | |
| return elementInfo; | |
| } | |
| } catch (error) { | |
| const { message, stack } = error as Error; | |
| logger.log('error', `Error while retrieving selector: ${message}`); | |
| logger.log('error', `Stack: ${stack}`); | |
| console.error('Error while retrieving selector:', message); | |
| console.error('Stack:', stack); | |
| } | |
| } | |
| }; | |
| export const getElementInformation = async ( | |
| page: Page, | |
| coordinates: Coordinates, | |
| listSelector: string, | |
| ) => { | |
| try { | |
| if (listSelector !== '') { | |
| const elementInfo = await page.evaluate( | |
| async ({ x, y }) => { | |
| const el = document.elementFromPoint(x, y) as HTMLElement; | |
| if (el) { | |
| const { parentElement } = el; | |
| const element = parentElement?.tagName === 'A' ? parentElement : el; | |
| let info: { | |
| tagName: string; | |
| hasOnlyText?: boolean; | |
| innerText?: string; | |
| url?: string; | |
| imageUrl?: string; | |
| attributes?: Record<string, string>; | |
| innerHTML?: string; | |
| outerHTML?: string; | |
| } = { | |
| tagName: element?.tagName ?? '', | |
| }; | |
| if (element) { | |
| info.attributes = Array.from(element.attributes).reduce( | |
| (acc, attr) => { | |
| acc[attr.name] = attr.value; | |
| return acc; | |
| }, | |
| {} as Record<string, string> | |
| ); | |
| } | |
| // Gather specific information based on the tag | |
| if (element?.tagName === 'A') { | |
| info.url = (element as HTMLAnchorElement).href; | |
| info.innerText = element.innerText ?? ''; | |
| } else if (element?.tagName === 'IMG') { | |
| info.imageUrl = (element as HTMLImageElement).src; | |
| } else { | |
| info.hasOnlyText = element?.children?.length === 0 && | |
| element?.innerText?.length > 0; | |
| info.innerText = element?.innerText ?? ''; | |
| } | |
| info.innerHTML = element.innerHTML; | |
| info.outerHTML = element.outerHTML; | |
| return info; | |
| } | |
| return null; | |
| }, | |
| { x: coordinates.x, y: coordinates.y }, | |
| ); | |
| return elementInfo; | |
| } else { | |
| const elementInfo = await page.evaluate( | |
| async ({ x, y }) => { | |
| const originalEl = document.elementFromPoint(x, y) as HTMLElement; | |
| if (originalEl) { | |
| let element = originalEl; | |
| const containerTags = ['DIV', 'SECTION', 'ARTICLE', 'MAIN', 'HEADER', 'FOOTER', 'NAV', 'ASIDE', | |
| 'ADDRESS', 'BLOCKQUOTE', 'DETAILS', 'DIALOG', 'FIGURE', 'FIGCAPTION', 'MAIN', 'MARK', 'SUMMARY', 'TIME', | |
| 'TABLE', 'THEAD', 'TBODY', 'TFOOT', 'TR', 'TH', 'TD', 'CAPTION', 'COLGROUP', 'COL', 'FORM', 'FIELDSET', | |
| 'LEGEND', 'LABEL', 'INPUT', 'BUTTON', 'SELECT', 'DATALIST', 'OPTGROUP', 'OPTION', 'TEXTAREA', 'OUTPUT', | |
| 'PROGRESS', 'METER', 'DETAILS', 'SUMMARY', 'MENU', 'MENUITEM', 'MENUITEM', 'APPLET', 'EMBED', 'OBJECT', | |
| 'PARAM', 'VIDEO', 'AUDIO', 'SOURCE', 'TRACK', 'CANVAS', 'MAP', 'AREA', 'SVG', 'IFRAME', 'FRAME', 'FRAMESET', | |
| 'LI', 'UL', 'OL', 'DL', 'DT', 'DD', 'HR', 'P', 'PRE', 'LISTING', 'PLAINTEXT', 'A' | |
| ]; | |
| while (element.parentElement) { | |
| const parentRect = element.parentElement.getBoundingClientRect(); | |
| const childRect = element.getBoundingClientRect(); | |
| if (!containerTags.includes(element.parentElement.tagName)) { | |
| break; | |
| } | |
| const fullyContained = | |
| parentRect.left <= childRect.left && | |
| parentRect.right >= childRect.right && | |
| parentRect.top <= childRect.top && | |
| parentRect.bottom >= childRect.bottom; | |
| const significantOverlap = | |
| (childRect.width * childRect.height) / | |
| (parentRect.width * parentRect.height) > 0.5; | |
| if (fullyContained && significantOverlap) { | |
| element = element.parentElement; | |
| } else { | |
| break; | |
| } | |
| } | |
| let info: { | |
| tagName: string; | |
| hasOnlyText?: boolean; | |
| innerText?: string; | |
| url?: string; | |
| imageUrl?: string; | |
| attributes?: Record<string, string>; | |
| innerHTML?: string; | |
| outerHTML?: string; | |
| } = { | |
| tagName: element?.tagName ?? '', | |
| }; | |
| if (element) { | |
| info.attributes = Array.from(element.attributes).reduce( | |
| (acc, attr) => { | |
| acc[attr.name] = attr.value; | |
| return acc; | |
| }, | |
| {} as Record<string, string> | |
| ); | |
| } | |
| if (element?.tagName === 'A') { | |
| info.url = (element as HTMLAnchorElement).href; | |
| info.innerText = element.innerText ?? ''; | |
| } else if (element?.tagName === 'IMG') { | |
| info.imageUrl = (element as HTMLImageElement).src; | |
| } else { | |
| info.hasOnlyText = element?.children?.length === 0 && | |
| element?.innerText?.length > 0; | |
| info.innerText = element?.innerText ?? ''; | |
| } | |
| info.innerHTML = element.innerHTML; | |
| info.outerHTML = element.outerHTML; | |
| return info; | |
| } | |
| return null; | |
| }, | |
| { x: coordinates.x, y: coordinates.y }, | |
| ); | |
| return elementInfo; | |
| } | |
| } catch (error) { | |
| const { message, stack } = error as Error; | |
| console.error('Error while retrieving selector:', message); | |
| console.error('Stack:', stack); | |
| return { | |
| error: message, | |
| tagName: '', | |
| }; | |
| } | |
| }; |
feat: support arm64 for docker backend
fix: explicitly set easylist url
feat: website to api improvements (swagger)
@RohitR311 pls confirm maxun-core version before release.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores