login with scrapy-playwright #42

Closed
Alfin72 opened this issue Dec 23, 2021 · 4 comments
Labels
support Support questions

Comments

@Alfin72

Alfin72 commented Dec 23, 2021

I am new to scrapy-playwright, so there is a chance I have missed the option, but I tried my best to make `page.fill` work using `PageCoroutine` and was not successful. Basically, I want to pass `page.fill('#to_date', '2021-12-17')` to the browser before I collect the response. If this is possible with an existing solution, please help me with an example that logs in to Quotes to Scrape using `page.fill` to enter the username and password. It will help other newbies too.

@elacuesta
Member

Please, share what you have tried.

@elacuesta
Member

Without more context, all I'll venture to say is that PageCoroutine("fill", "#to_date", "2021-12-17") should produce the expected results.
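For readers unsure how the positional arguments line up: the first argument names a method on the Playwright page, and the remaining arguments are forwarded to that method. The sketch below is a simplified model of that dispatch, not the library's actual code. `FakePageCoroutine`, `FakePage`, `apply_coroutines` and `run_demo` are hypothetical stand-ins invented for illustration so the snippet runs without a browser.

```python
import asyncio


class FakePageCoroutine:
    """Hypothetical stand-in for PageCoroutine: stores a method name plus its arguments."""

    def __init__(self, method, *args, **kwargs):
        self.method = method
        self.args = args
        self.kwargs = kwargs


class FakePage:
    """Hypothetical stand-in for a Playwright page that records calls instead of driving a browser."""

    def __init__(self):
        self.calls = []

    async def fill(self, selector, value):
        self.calls.append(("fill", selector, value))

    async def click(self, selector):
        self.calls.append(("click", selector))


async def apply_coroutines(page, coroutines):
    # Assumed dispatch: look up the named method on the page, then await it
    # with the stored positional and keyword arguments, in list order.
    for pc in coroutines:
        method = getattr(page, pc.method)
        await method(*pc.args, **pc.kwargs)


def run_demo():
    page = FakePage()
    asyncio.run(
        apply_coroutines(
            page,
            [
                FakePageCoroutine("fill", "#to_date", "2021-12-17"),
                FakePageCoroutine("click", selector="[type='submit']"),
            ],
        )
    )
    return page.calls
```

Under this model, `PageCoroutine("fill", "#to_date", "2021-12-17")` corresponds to `page.fill("#to_date", "2021-12-17")`, and the coroutines run in the order they appear in the list.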

@elacuesta elacuesta added the support Support questions label Dec 24, 2021
@Alfin72
Author

Alfin72 commented Dec 25, 2021

Sorry for disturbing you during the holiday season.

I am basically trying to log in using Playwright and then pass the response to Scrapy.

Here is how my spider looks.

```python
import scrapy
from scrapy_playwright.page import PageCoroutine


class ScrollSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        yield scrapy.Request(
            url="https://quotes.toscrape.com/login",
            meta=dict(
                playwright=True,
                playwright_include_page=True,
                playwright_context="new",
                playwright_page_coroutines=[
                    PageCoroutine("fill", "#username", "2021-12-17"),
                    PageCoroutine("fill", "#password", "2021-12-18"),
                    PageCoroutine("click", selector="[type='submit']"),
                    PageCoroutine("wait_for_timeout", 5000),
                ],
            ),
        )

    def parse(self, response):
        # 'response' contains the page as seen by the browser
        yield {"url": response.url}
```

Instead of logging in and returning "https://quotes.toscrape.com/", the spider yields {'url': 'https://quotes.toscrape.com/login'}.

Here is the playwright code which I am trying to replicate using scrapy_playwright.

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    for browser_type in [p.chromium]:
        browser = browser_type.launch(headless=False)
        page = browser.new_page()
        page.goto("https://quotes.toscrape.com/login")
        page.fill('#username', '2021-12-16')
        page.fill('#password', '2021-12-17')
        page.wait_for_timeout(5000)
        page.query_selector('[type="submit"]').click()
        page.wait_for_timeout(15000)
        browser.close()
```

@Alfin72 Alfin72 changed the title page.fill option not working. login with scrapy-playwright Dec 30, 2021
@Alfin72
Author

Alfin72 commented Dec 30, 2021

I tried the same code on a different website and it works. I believe the bug might be in https://quotes.toscrape.com/login, so I will close this issue.

@Alfin72 Alfin72 closed this as completed Dec 30, 2021