Nodriver: CDP get_response_body command not working #1832
Comments
Hi there @jwwq, I'm also trying to solve this situation. I'm under the impression that calling
import asyncio
import time

import nodriver as uc
from nodriver import cdp

xhr_requests = []
last_xhr_request = None


def listenXHR(page):
    async def handler(evt):
        global last_xhr_request
        # collect AJAX responses (XHR and fetch)
        if evt.type_ is cdp.network.ResourceType.XHR or evt.type_ is cdp.network.ResourceType.FETCH:
            xhr_requests.append([evt.response.url, evt.request_id])
            last_xhr_request = time.time()

    page.add_handler(cdp.network.ResponseReceived, handler)


async def receiveXHR(page, requests):
    responses = []
    retries = 0
    max_retries = 5

    # wait until at least 2 seconds have passed since the last XHR response,
    # so that late requests can still be collected
    while True:
        if last_xhr_request is None or retries > max_retries:
            break
        if time.time() - last_xhr_request <= 2:
            retries = retries + 1
            # asyncio.sleep instead of time.sleep, so the event loop is not blocked
            await asyncio.sleep(2)
            continue
        else:
            break

    await page  # this is very important

    # loop through the gathered requests and get each response body
    for request in requests:
        try:
            res = await page.send(cdp.network.get_response_body(request[1]))
            if res is None:
                continue
            responses.append({
                'url': request[0],
                'body': res[0],
                'is_base64': res[1]
            })
        except Exception as e:
            print("error get body", e)

    return responses


async def crawl():
    browser = await uc.start(headless=False)
    # use main tab
    tab = browser.main_tab
    listenXHR(tab)
    # change url to something that makes ajax requests
    tab = await browser.get("https://example.com")
    await asyncio.sleep(2)
    xhr_responses = await receiveXHR(tab, xhr_requests)
    print(xhr_responses)


if __name__ == '__main__':
    uc.loop().run_until_complete(crawl())

Excuse my Python, I have been using the language for less than 10 hours, lol.
I hope this helps somehow. I'm looking forward to a better solution, or to examples or an explanation of how to actually do this.
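The "wait until the requests have gone quiet" step above can be written without blocking the event loop. What follows is only a minimal sketch, not nodriver API: `wait_for_quiet` and `decode_body` are hypothetical helper names, and the timestamp getter is assumed to return a `time.monotonic()` value (or None before the first event). Unlike the loop above, this version keeps waiting until the timeout if no event has been seen at all. The second helper turns the `(body, is_base64)` pair stored in the response dict into raw bytes.

```python
import asyncio
import base64
import time


async def wait_for_quiet(get_last_event_time, quiet_seconds=2.0, timeout=10.0, poll=0.1):
    """Return True once no event has been seen for quiet_seconds, False on timeout.

    get_last_event_time is a callable returning a time.monotonic() timestamp,
    or None if no event has been recorded yet (hypothetical interface).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        last = get_last_event_time()
        if last is not None and time.monotonic() - last >= quiet_seconds:
            return True
        # asyncio.sleep yields to the event loop, so event handlers keep running
        await asyncio.sleep(poll)
    return False


def decode_body(body, is_base64):
    """Convert a (body, is_base64) pair into raw bytes."""
    if is_base64:
        return base64.b64decode(body)
    return body.encode("utf-8")
```

In the code above, the getter could be something like `lambda: last_xhr_request`, provided the handler records `time.monotonic()` instead of `time.time()`.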
Hi @falmar, your code helped me a lot with my use case. I have a few suggestions for the code you provided:
Here's a slightly modified version of the same code:
Apologies if I have made any mistakes, and for my English too.
You're aware this code does not run as posted, correct?
Apologies, I might have made mistakes while modifying it. Could you tell me what issue you're encountering? I ran it on my system and it worked fine for me.
Good afternoon, and thank you for your great work! Based on your "network_monitor.py" example, I am trying to retrieve the contents of the response. I use the LoadingFinished handler to make sure that the file has been retrieved completely. Unfortunately, the process hangs forever when I try to send a command to CDP (see the full code below).
cdp_cmd = cdp.network.get_response_body(event.request_id)
res = await global_browser.main_tab.send(cdp_cmd)
Please help!
(Apart from that, one more question: is there any way to get the tab in the handler without global variables? It's a minor issue, though.)
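The thread does not confirm why the call hangs. One plausible cause, stated here purely as an assumption, is that awaiting a CDP command inside the event handler stalls the same loop that would deliver the reply. The sketch below shows a pattern that sidesteps both points raised above: the handler only queues the request ID and returns, a separate task sends the command, and the tab is captured by a closure rather than a global. `FakeTab`, `watch_bodies` and `make_command` are hypothetical names for illustration and are not part of nodriver; with nodriver one would presumably pass the real tab, `cdp.network.LoadingFinished`, and `cdp.network.get_response_body`.

```python
import asyncio
from types import SimpleNamespace


class FakeTab:
    """Hypothetical stand-in for a browser tab; NOT part of nodriver."""

    def __init__(self):
        self._handlers = []

    def add_handler(self, event_type, handler):
        self._handlers.append(handler)

    async def send(self, command):
        await asyncio.sleep(0)  # pretend to round-trip to the browser
        return (f"body-of-{command}", False)

    async def emit(self, event):
        # Simulates the browser delivering an event to every handler
        for handler in self._handlers:
            await handler(event)


def watch_bodies(tab, event_type, make_command):
    """Register a handler on tab; must be called inside a running event loop.

    Returns (queue, results, worker_task).
    """
    queue = asyncio.Queue()
    results = {}

    async def handler(event):
        # Only record the ID and return immediately; nothing is awaited here.
        queue.put_nowait(event.request_id)

    async def worker():
        # `tab` is captured by this closure, so no global variable is needed.
        # Commands are sent from this separate task, outside the handler.
        while True:
            request_id = await queue.get()
            try:
                results[request_id] = await tab.send(make_command(request_id))
            finally:
                queue.task_done()

    tab.add_handler(event_type, handler)
    return queue, results, asyncio.create_task(worker())


async def demo():
    tab = FakeTab()
    queue, results, task = watch_bodies(tab, "LoadingFinished", lambda rid: rid)
    await tab.emit(SimpleNamespace(request_id="r1"))
    await tab.emit(SimpleNamespace(request_id="r2"))
    await queue.join()  # wait until the worker has handled every queued ID
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return results
```

Running `asyncio.run(demo())` exercises the pattern against the fake tab only; whether it resolves the hang with a real browser is untested here.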