Support playwright_stealth #109
Thank you very much for the contribution, but I don't want to include any third-party dependency unless it's really necessary.

from scrapy import Spider, Request
from playwright.async_api import Page

async def new_page_handler(page: Page) -> None:
    # add_init_script takes the script source as its first argument,
    # so a file path has to be passed via the `path` keyword
    await page.add_init_script(path="/path/to/script")
    # more stuff

class AwesomeSpider(Spider):
    name = "awesome"

    def start_requests(self):
        yield Request(
            url="https://httpbin.org/get",
            meta={"playwright": True, "playwright_configure_page": new_page_handler},
        )
For the record, this should be possible after #128.
It should be possible to include this with an optional pip dependency e.g.
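As a general illustration of the optional-dependency idea (a sketch only, not the example this comment originally pointed to): the extra package is imported lazily, and its absence is treated as "feature off". The `load_optional` helper below is hypothetical and uses only the standard library.

```python
import importlib
from typing import Any, Optional


# Hypothetical helper for illustration; not part of any library mentioned here.
def load_optional(module_name: str, attr: str) -> Optional[Any]:
    """Return `attr` from `module_name`, or None if it cannot be provided."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional package is not installed: report "unavailable".
        return None
    # A missing attribute is also reported as "unavailable".
    return getattr(module, attr, None)
```

Under this assumption, `load_optional("playwright_stealth", "stealth_async")` would yield the function only when the package is installed, and `None` otherwise, so the caller can skip the stealth step.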
That's true, but it would still require changes to the main handler in order to support the integration - that's what I want to avoid.

import scrapy
from playwright_stealth import stealth_async

async def init_page(page, request):
    # apply the stealth patches before the page is used
    await stealth_async(page)

class StealthSpider(scrapy.Spider):
    name = "stealth"

    def start_requests(self):
        yield scrapy.Request(
            url="https://example.org",
            meta={
                "playwright": True,
                "playwright_page_init_callback": init_page,
            },
        )
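The callback above always applies stealth. One way to keep it switchable from the project side, without touching the handler, might be to build the callback from a flag. This is a minimal sketch under stated assumptions: `make_init_callback` is a hypothetical helper, and `fake_stealth` is a stand-in for `stealth_async` so the example runs without a browser. Neither is part of scrapy-playwright or playwright_stealth.

```python
import asyncio
from typing import Any, Awaitable, Callable


# Hypothetical helper; not part of scrapy-playwright or playwright_stealth.
def make_init_callback(
    enabled: bool,
    stealth: Callable[[Any], Awaitable[None]],
) -> Callable[[Any, Any], Awaitable[None]]:
    """Build a (page, request) coroutine that applies `stealth` only if enabled."""

    async def init_page(page: Any, request: Any) -> None:
        if enabled:
            await stealth(page)

    return init_page


# Stand-in for stealth_async: marks the fake page instead of patching a browser.
async def fake_stealth(page: Any) -> None:
    page["patched"] = True


def demo(enabled: bool) -> dict:
    page: dict = {"patched": False}  # stand-in for a Playwright page object
    asyncio.run(make_init_callback(enabled, fake_stealth)(page, None))
    return page
```

In a spider, the result of `make_init_callback(flag, stealth_async)` would be passed as the `playwright_page_init_callback` meta value, with `flag` read from whatever setting the project chooses.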
@hqtang33 Were you able to find a solution? I tried to include the changes you proposed here and also your fork of the stealth plugin, but unfortunately even the "simple" removal of "Headless" from the user agent doesn't work.
Integrated playwright_stealth and added PLAYWRIGHT_STEALTH_ENABLED as an optional setting.
Attached bot test results.
PLAYWRIGHT_STEALTH_ENABLED = True
PLAYWRIGHT_STEALTH_ENABLED = False