Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/cheerio (CheerioCrawler)
Issue description
I'm attempting to capture the headers that are being sent with my page requests inside of the requestHandler (I also tried inside of preNavigationHooks) and every time it just gives me an empty object. I'm not passing my own predefined request headers to Crawlee, but as I understand it the underlying http library that the CheerioCrawler uses (got-scraping) generates user agent headers by default. Everything I've read in the docs says I should be able to see those headers by inspecting request.headers in the requestHandler. Please advise.
Code sample
No response
Package version
3.3.2
Node.js version
18.16.1
Operating system
Linux
Apify platform
Tick me if you encountered this issue on the Apify platform
I have tested this on the next release
No response
Other context
No response
As a quick workaround for now, you can access response.request.options.headers in the request handler. The request property on the response is what was actually sent via the HTTP module, so its headers are fully populated. You'll just have to ignore the TypeScript error, since this property is for some reason not exposed in the typings.
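The snippet that originally accompanied this comment is not preserved here. What follows is only a minimal sketch of the workaround as described, not the commenter's code. The helper `getSentHeaders` and the types `SentHeaders` and `ResponseWithGotRequest` are hypothetical names introduced for illustration. The sketch assumes only the property path named above, `response.request.options.headers`, and uses a structural cast rather than a TypeScript suppression directive.

```typescript
// Hypothetical types: they describe only the fields this sketch reads,
// following the path response.request.options.headers from the comment above.
type SentHeaders = Record<string, string | string[] | undefined>;

interface ResponseWithGotRequest {
    request?: { options?: { headers?: SentHeaders } };
}

// Hypothetical helper, not part of Crawlee. It returns the headers that were
// actually sent, or an empty object if the expected path is missing.
function getSentHeaders(response: unknown): SentHeaders {
    const candidate = response as ResponseWithGotRequest | null | undefined;
    return candidate?.request?.options?.headers ?? {};
}

// Possible use inside a CheerioCrawler request handler (assumed shape):
//
//   async requestHandler({ request, response, log }) {
//       log.info('Headers supplied by the user', request.headers);
//       log.info('Headers actually sent', getSentHeaders(response));
//   }
```

Reading the headers through a helper keeps the cast in one place, and the empty-object fallback means the handler does not throw if the property is absent.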
I've added typings for this, so you should now be able to access it without a TypeScript directive. I've also mentioned it in the docs. I decided not to add the headers to the request object, so that it stays clear what was provided by the user and what was autofilled.