
Hybrid : Page Completes Processing Without Waiting for All Requests to Finish #1218

@alban-stourbe-wmx


katana version:

Latest version (1.1.2)

Current Behavior:

Page processing currently relies on two steps: waitLoad, which waits for the page to finish loading, followed by waitIdle, which depends on the browser's window.requestIdleCallback. That callback fires when the main thread is idle (no urgent tasks are running), but this approach has a significant limitation:

Because waitIdle does not account for in-flight network requests, some XHR, fetch, or iframe requests may still be pending when the callback fires. Page processing can then start, and the page can be closed, before those requests complete, leading to data loss or incomplete content retrieval.

Expected Behavior:

The WaitRequestIdle function offers a more robust solution by explicitly monitoring network activity. It ensures there has been a defined period of network inactivity before proceeding, reducing the risk of pending requests being overlooked. Additionally, it provides filtering options (includes, excludes, and excludeTypes) to precisely target the types of requests that should be tracked.
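To illustrate the idea of waiting for a period of network inactivity, here is a minimal standard-library sketch. It is not katana's code or the browser library's implementation: `idleTracker`, `start`, `finish`, `waitIdle`, and `simulateLateXHR` are hypothetical names, and a real implementation would feed the tracker from browser network events and apply the includes/excludes filters before counting a request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// idleTracker is a hypothetical helper that counts in-flight requests and
// remembers the last moment any request started or finished.
type idleTracker struct {
	mu       sync.Mutex
	inFlight int
	lastBusy time.Time
}

func newIdleTracker() *idleTracker {
	return &idleTracker{lastBusy: time.Now()}
}

// start records that a request has been sent.
func (t *idleTracker) start() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.inFlight++
	t.lastBusy = time.Now()
}

// finish records that a request has completed or failed.
func (t *idleTracker) finish() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.inFlight--
	t.lastBusy = time.Now()
}

// waitIdle blocks until no request has been in flight for at least quiet,
// or until timeout elapses. It returns true if the idle period was reached.
func (t *idleTracker) waitIdle(quiet, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		t.mu.Lock()
		idle := t.inFlight == 0 && time.Since(t.lastBusy) >= quiet
		t.mu.Unlock()
		if idle {
			return true
		}
		time.Sleep(5 * time.Millisecond)
	}
	return false
}

// simulateLateXHR starts a fake request that completes after 60ms and reports
// whether idle was reached and whether the request had finished by then.
func simulateLateXHR() (reachedIdle, requestDone bool) {
	t := newIdleTracker()
	var done int32
	t.start()
	go func() {
		time.Sleep(60 * time.Millisecond) // stand-in for a slow XHR/fetch
		atomic.StoreInt32(&done, 1)
		t.finish()
	}()
	reachedIdle = t.waitIdle(30*time.Millisecond, 2*time.Second)
	return reachedIdle, atomic.LoadInt32(&done) == 1
}

func main() {
	reached, done := simulateLateXHR()
	fmt.Printf("idle reached: %v, late request finished: %v\n", reached, done)
}
```

In this sketch the wait cannot return successfully while the fake request is still pending, which is the property that a main-thread idle callback alone does not provide.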

Recommendation: Adopting WaitStable for Comprehensive Stability Control

To achieve full coverage and ensure the page is stable before processing, the WaitStable function combines:

  • WaitLoad for initial page loading
  • WaitRequestIdle to confirm network inactivity
  • WaitDomStable to verify the DOM has remained stable for a specified period

Using WaitStable reduces the risk of missing in-flight network requests, so that all relevant data is captured before page processing begins.
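The combination described above can be sketched without a browser. The following is an illustrative model only, not the actual WaitStable implementation: `waitUnchanged`, `waitAll`, `fakePage`, and `demoStable` are hypothetical names, the load event is simulated with a channel, and running the checks concurrently is an assumption of this sketch. A network-idle check would be passed to `waitAll` as one more closure.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitUnchanged polls snapshot every interval and returns true once the value
// has stayed identical for at least quiet, or false if timeout elapses first.
func waitUnchanged(snapshot func() string, quiet, interval, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	last, since := snapshot(), time.Now()
	for time.Now().Before(deadline) {
		time.Sleep(interval)
		cur := snapshot()
		if cur != last {
			// The content changed, so restart the stability window.
			last, since = cur, time.Now()
			continue
		}
		if time.Since(since) >= quiet {
			return true
		}
	}
	return false
}

// waitAll runs every check concurrently and reports whether all succeeded.
func waitAll(checks ...func() bool) bool {
	results := make([]bool, len(checks))
	var wg sync.WaitGroup
	for i, check := range checks {
		wg.Add(1)
		go func(i int, check func() bool) {
			defer wg.Done()
			results[i] = check()
		}(i, check)
	}
	wg.Wait()
	for _, ok := range results {
		if !ok {
			return false
		}
	}
	return true
}

// fakePage is a stand-in for a browser page whose DOM changes after load.
type fakePage struct {
	mu  sync.Mutex
	dom string
}

func (p *fakePage) set(s string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.dom = s
}

func (p *fakePage) html() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.dom
}

// demoStable simulates a page that fires its load event and is then updated
// by a late response. It returns the wait result and the DOM seen afterwards.
func demoStable() (bool, string) {
	p := &fakePage{dom: "<div>loading</div>"}
	loaded := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(loaded) // stand-in for the load event
		time.Sleep(20 * time.Millisecond)
		p.set("<div>data</div>") // a late response populates the DOM
	}()
	ok := waitAll(
		func() bool { <-loaded; return true },
		func() bool {
			return waitUnchanged(p.html, 200*time.Millisecond, 10*time.Millisecond, 2*time.Second)
		},
	)
	return ok, p.html()
}

func main() {
	ok, dom := demoStable()
	fmt.Println(ok, dom)
}
```

The stability window restarts whenever the snapshot changes, so processing begins only after the late update has landed in the DOM.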

Steps To Reproduce:

You would just need to add an argument in etreer to set the time to wait until the page is stable before continuing processing.
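As a rough sketch of what such an argument could look like, the snippet below parses a duration with Go's standard `flag` package. The flag name `page-stable-wait`, the flag set name, and the function `parseStableWait` are hypothetical and chosen only for illustration. They are not taken from katana or from the linked PR, and katana's own flag handling may differ.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"time"
)

// parseStableWait reads a hypothetical -page-stable-wait flag from args.
// A value of 0 is treated here as "feature disabled" (an assumption).
func parseStableWait(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("crawler", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch
	wait := fs.Duration("page-stable-wait", 0,
		"how long the page must stay stable before processing (0 disables)")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *wait < 0 {
		return 0, fmt.Errorf("page-stable-wait must not be negative, got %s", *wait)
	}
	return *wait, nil
}

func main() {
	d, err := parseStableWait([]string{"-page-stable-wait", "1500ms"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stable wait:", d)
}
```

The parsed duration would then be passed to whichever stability wait the crawler performs before it starts processing the page.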

Anything else:

I have already created a PR to fix this issue: #1217

Labels

Status: Completed
Type: Bug