Protocol error (Network.getResponseBody): No resource with given identifier found #89

Closed
simonw opened this issue Sep 13, 2022 · 7 comments
Labels: bug (Something isn't working)

Comments


simonw commented Sep 13, 2022

Got this error when running:

shot-scraper https://lite.datasette.io/ --wait-for 'document.querySelector("h2")' --log-requests - | tee /tmp/datasette-lite.txt
Exception in callback SyncBase._sync.<locals>.callback(<Task finishe...ifier found')>) at /Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_sync_base.py:104
handle: <Handle SyncBase._sync.<locals>.callback(<Task finishe...ifier found')>) at /Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_sync_base.py:104>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_sync_base.py", line 105, in callback
    g_self.switch()
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_browser_context.py", line 122, in <lambda>
    lambda params: self._on_response(
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_browser_context.py", line 397, in _on_response
    page.emit(Page.Events.Response, response)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_base.py", line 113, in emit
    handled = self._call_handlers(event, args, kwargs)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_base.py", line 96, in _call_handlers
    self._emit_run(f, args, kwargs)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_asyncio.py", line 42, in _emit_run
    self.emit('error', exc)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_base.py", line 116, in emit
    self._emit_handle_potential_error(event, args[0] if args else None)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_base.py", line 86, in _emit_handle_potential_error
    raise error
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/pyee/_asyncio.py", line 40, in _emit_run
    coro = f(*args, **kwargs)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_impl_to_api_mapping.py", line 88, in wrapper_func
    return handler(
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/shot_scraper/cli.py", line 734, in on_response
    "size": len(response.body()),
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/sync_api/_generated.py", line 574, in body
    self._sync("response.body", self._impl_obj.body())
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_sync_base.py", line 111, in _sync
    return task.result()
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_network.py", line 375, in body
    binary = await self._channel.send("body")
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 39, in send
    return await self.inner_send(method, params, False)
  File "/Users/simon/.local/pipx/venvs/shot-scraper/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 63, in inner_send
    result = next(iter(done)).result()
playwright._impl._api_types.Error: Protocol error (Network.getResponseBody): No resource with given identifier found

This error was logged many times, even though the command itself ran to completion.

I think this is likely caused by the new log requests feature from:

simonw added the bug label Sep 13, 2022

simonw commented Sep 13, 2022

Probably this code. I think the response.body() call is what's failing:

def on_response(response):
    log_requests.write(
        json.dumps(
            {
                "method": response.request.method,
                "url": response.url,
                "size": len(response.body()),


simonw commented Sep 13, 2022

Added this debugging code:

diff --git a/shot_scraper/cli.py b/shot_scraper/cli.py
index d104725..f5636a8 100644
--- a/shot_scraper/cli.py
+++ b/shot_scraper/cli.py
@@ -726,6 +726,12 @@ def take_shot(
         if log_requests:
 
             def on_response(response):
+                try:
+                    body = response.body()
+                except Exception as ex:
+                    print(ex)
+                    print(response.url)
+                    return
                 log_requests.write(
                     json.dumps(
                         {

And got this:

Protocol error (Network.getResponseBody): No resource with given identifier found
https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.js
Protocol error (Network.getResponseBody): No resource with given identifier found
https://cdn.jsdelivr.net/pyodide/v0.20.0/full/pyodide.asm.js
...


simonw commented Sep 13, 2022

puppeteer/puppeteer#2258 (comment) says "resources get dumped after page commits navigation" - so presumably what's happening here is that a page navigation has occurred which clears those resources from memory before my Python code gets a chance to call .body() on them.


simonw commented Sep 13, 2022

My hunch is that it's a lot harder to reliably access the size of the resource than I had expected.


simonw commented Sep 13, 2022

I'm going to make a best effort, and return "size": null when the resource body size can't be calculated.

I'll mention this in the documentation.
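
For anyone reading the log afterwards, a null size only needs to be skipped when totalling. A minimal sketch, assuming the log holds one JSON object per line (this thread does not show the exact file layout) and using a hypothetical helper name, `total_logged_bytes`:

```python
import json


def total_logged_bytes(lines):
    """Sum the "size" field across log lines, counting null sizes separately.

    Assumes one JSON object per line; that layout is an assumption,
    not something confirmed in this thread.
    """
    total = 0
    unknown = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        entry = json.loads(line)
        size = entry.get("size")
        if size is None:
            unknown += 1  # body size could not be determined
        else:
            total += size
    return total, unknown


# Illustrative entries only; the URLs and sizes are made up.
sample = [
    json.dumps({"method": "GET", "url": "https://example.com/a.js", "size": 1200}),
    json.dumps({"method": "GET", "url": "https://example.com/b.js", "size": None}),
]
print(total_logged_bytes(sample))
```

Python's None serializes to JSON null, so the second sample line carries "size": null just as the proposed log output would.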


simonw commented Sep 13, 2022

This seems to do the right thing:

diff --git a/shot_scraper/cli.py b/shot_scraper/cli.py
index d104725..a19e878 100644
--- a/shot_scraper/cli.py
+++ b/shot_scraper/cli.py
@@ -726,12 +726,20 @@ def take_shot(
         if log_requests:
 
             def on_response(response):
+                try:
+                    body = response.body()
+                    size = len(body)
+                except Error as ex:
+                    if "Network.getResponseBody" in ex.message:
+                        size = None
+                    else:
+                        raise
                 log_requests.write(
                     json.dumps(
                         {
                             "method": response.request.method,
                             "url": response.url,
-                            "size": len(response.body()),
+                            "size": size,
                             "timing": response.request.timing,
                         }
                     )
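
The pattern in this diff can be exercised without a browser. A minimal sketch, assuming stand-in classes (`FakeError`, `FakeResponse`) in place of Playwright's `Error` and response objects; only the `.message` attribute and `.body()` method that the diff relies on are modelled, and `body_size_or_none` is a hypothetical helper name:

```python
class FakeError(Exception):
    """Stand-in for Playwright's Error; exposes .message as the diff expects."""

    def __init__(self, message):
        super().__init__(message)
        self.message = message


class FakeResponse:
    """Stand-in response: body() returns bytes or raises the given error."""

    def __init__(self, body=None, error=None):
        self._body = body
        self._error = error

    def body(self):
        if self._error is not None:
            raise self._error
        return self._body


def body_size_or_none(response):
    """Return len(body), or None when the browser can no longer supply the body."""
    try:
        return len(response.body())
    except FakeError as ex:
        if "Network.getResponseBody" in ex.message:
            return None
        raise  # unrelated errors still propagate
```

Matching on the message text keeps unrelated failures visible, at the cost of depending on the exact wording of the protocol error.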

simonw closed this as completed in 31bc975 Sep 13, 2022

simonw commented Sep 13, 2022

simonw added a commit that referenced this issue Sep 13, 2022