Closed
...
def request_finished_handler(request):
    response = request.response()
    ...
    # Skip requests that were themselves redirected; only inspect documents and scripts.
    if request.redirected_to is None and request.resource_type in ['document', 'script']:
        body = response.body() # error => Response body is unavailable for redirect responses
        body = response.text() # error => Response body is unavailable for redirect responses
        body = page.content() # error => Execution context was destroyed, most likely because of a navigation.
        body = page.evaluate('document.body') # error => Execution context was destroyed, most likely because of a navigation.
    ...
page.on("requestfinished", request_finished_handler)
page.goto(url)
I also tried with:
def response_handler(response):
    request = response.request
    ...
page.on("response", response_handler)
page.goto(url)
The problem is that I don't need the body of the final page. I need the full bodies of the documents and scripts loaded from the starting URL up to the last link before the final URL, so that I can study the fingerprinting and later avoid or spoof it.
I can use requests.get() to fetch those bodies, and that is what I am doing for now, but it has a major problem: because it runs outside Playwright, it can be detected and blocked as a scraper (no session, no referrer, etc.), so I want to avoid this hack.
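For reference, here is a rough sketch of how that workaround might be moved inside Playwright. It assumes that BrowserContext.request (which shares cookies with the browser context) is acceptable, and that passing max_redirects=0 to its get() returns each redirect response instead of following it. The helper names collect_chain_bodies and playwright_fetch are hypothetical, made up for this illustration. It only follows HTTP redirects (3xx with a Location header); redirects triggered by JavaScript or meta refresh are not covered, and I have not verified that it avoids detection.

```python
from urllib.parse import urljoin


def collect_chain_bodies(fetch, start_url, max_hops=10):
    """Follow an HTTP redirect chain one hop at a time, keeping every body.

    `fetch(url, referer)` is a hypothetical callable that returns a tuple
    (status, headers_dict, body_text) for a single, non-followed request.
    """
    bodies = []
    url, referer = start_url, None
    for _ in range(max_hops):
        status, headers, body = fetch(url, referer)
        bodies.append((url, status, body))
        # Normalise header names so "Location" and "location" both match.
        lowered = {name.lower(): value for name, value in headers.items()}
        location = lowered.get("location")
        if not (300 <= status < 400 and location):
            break  # not an HTTP redirect, so the chain ends here
        # The current URL becomes the referer of the next hop.
        referer, url = url, urljoin(url, location)
    return bodies


def playwright_fetch(context):
    """Build a `fetch` callable on top of a Playwright BrowserContext.

    Assumption: `context.request.get(..., max_redirects=0)` does not follow
    redirects, so each hop's status, headers and body can be read directly.
    """
    def fetch(url, referer):
        headers = {"referer": referer} if referer else {}
        resp = context.request.get(url, headers=headers, max_redirects=0)
        return resp.status, resp.headers, resp.text()
    return fetch
```

With an open browser context, the call would be something like collect_chain_bodies(playwright_fetch(page.context), url), which returns a list of (url, status, body) tuples, one per hop.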
Is this a bug, or is there a way to do this that I don't know about?
Thanks!