How to Combine Playwright and HTTP Requests in Crawlee? #2891

Answered by janbuchar
kwdiwt asked this question in Q&A

Hello @kwdiwt, and thank you for your interest in Crawlee! If you don't need a BrowserPool, I suggest performing the login with plain Playwright, without any Crawlee wrappers, then retrieving the cookies and using them to construct your HttpCrawler (or any of its subclasses, such as CheerioCrawler):

import { HttpCrawler } from 'crawlee';

const crawler = new HttpCrawler({
  // ...
  sessionPoolOptions: {
    sessionOptions: {
      // cookies retrieved from the Playwright login
      cookieJar: {
        "yourCookie": "value"
      } // this can be a toughcookie.CookieJar instance as well
    }
  }
});

await crawler.run();
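
For illustration, here is a rough sketch of how the whole flow might fit together: log in with plain Playwright, copy the cookies into a tough-cookie CookieJar, and hand that jar to HttpCrawler. The URLs, the `#username` / `#password` selectors, and the helper names (`toSetCookieString`, `cookieUrl`, `loginAndGetCookies`) are made up for the example, and it assumes `playwright`, `crawlee` and `tough-cookie` are installed in your project. Treat it as a starting point, not a definitive implementation.

```javascript
// Hypothetical helper: turn one Playwright cookie object into a
// Set-Cookie style string that tough-cookie can parse.
function toSetCookieString(cookie) {
  const parts = [`${cookie.name}=${cookie.value}`];
  if (cookie.domain) parts.push(`Domain=${cookie.domain}`);
  if (cookie.path) parts.push(`Path=${cookie.path}`);
  // Playwright reports session cookies with expires = -1, so skip those.
  if (cookie.expires > 0) {
    parts.push(`Expires=${new Date(cookie.expires * 1000).toUTCString()}`);
  }
  if (cookie.secure) parts.push('Secure');
  if (cookie.httpOnly) parts.push('HttpOnly');
  return parts.join('; ');
}

// Hypothetical helper: build a URL that matches the cookie's domain and path,
// because CookieJar.setCookie needs the URL the cookie belongs to.
function cookieUrl(cookie) {
  return `https://${cookie.domain.replace(/^\./, '')}${cookie.path || '/'}`;
}

// Log in with plain Playwright (no Crawlee wrappers) and return the cookies.
async function loginAndGetCookies(loginUrl, username, password) {
  const { chromium } = await import('playwright');
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(loginUrl);
    await page.fill('#username', username); // assumed selector
    await page.fill('#password', password); // assumed selector
    await page.click('button[type="submit"]'); // assumed selector
    await page.waitForLoadState('networkidle');
    return await context.cookies();
  } finally {
    await browser.close();
  }
}

async function main() {
  const { HttpCrawler } = await import('crawlee');
  const { CookieJar } = await import('tough-cookie');

  // Placeholder URL and credentials.
  const cookies = await loginAndGetCookies('https://example.com/login', 'user', 'secret');

  const cookieJar = new CookieJar();
  for (const cookie of cookies) {
    await cookieJar.setCookie(toSetCookieString(cookie), cookieUrl(cookie));
  }

  const crawler = new HttpCrawler({
    sessionPoolOptions: {
      // The same jar instance is passed to every session created by the pool.
      sessionOptions: { cookieJar },
    },
    async requestHandler({ request, body }) {
      console.log(request.url, body.length);
    },
  });

  await crawler.run(['https://example.com/account']);
}

// Call main() to run the sketch.
```

Keeping the cookie conversion in small pure functions makes it easy to check that part without launching a browser.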

If you need to perform the login for each new session (perhaps to avoid getting blocked), you can use the createSessionFunction option (https://crawlee.dev/api/core/inter…
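
A per-session login could look roughly like the sketch below. It assumes that `createSessionFunction` receives the session pool plus an optional `{ sessionOptions }` object and returns a `Session`, and that `Session.setCookies(cookies, url)` accepts browser-style cookie objects such as the ones Playwright returns; please double-check both against the Crawlee API docs. The helper names (`makeCreateSessionFunction`, `loginWithPlaywright`), the URLs and the selectors are hypothetical.

```javascript
// Hypothetical factory: builds a function usable as
// sessionPoolOptions.createSessionFunction. The Session class is injected
// so the wiring can be exercised without Crawlee installed.
function makeCreateSessionFunction(SessionClass, fetchLoginCookies, siteUrl) {
  return async (sessionPool, options = {}) => {
    // Keep any session options Crawlee passes in, and attach the pool.
    const session = new SessionClass({ ...options.sessionOptions, sessionPool });
    // Perform a fresh login for every new session.
    const cookies = await fetchLoginCookies();
    session.setCookies(cookies, siteUrl);
    return session;
  };
}

// Hypothetical login routine using plain Playwright.
async function loginWithPlaywright(loginUrl) {
  const { chromium } = await import('playwright');
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(loginUrl);
    await page.fill('#username', 'user'); // assumed selector and credentials
    await page.fill('#password', 'secret'); // assumed selector and credentials
    await page.click('button[type="submit"]'); // assumed selector
    await page.waitForLoadState('networkidle');
    return await context.cookies();
  } finally {
    await browser.close();
  }
}

async function main() {
  const { HttpCrawler, Session } = await import('crawlee');
  const siteUrl = 'https://example.com'; // placeholder site

  const crawler = new HttpCrawler({
    sessionPoolOptions: {
      createSessionFunction: makeCreateSessionFunction(
        Session,
        () => loginWithPlaywright(`${siteUrl}/login`),
        siteUrl,
      ),
    },
    async requestHandler({ request, session }) {
      console.log(request.url, session?.id);
    },
  });

  await crawler.run([`${siteUrl}/account`]);
}

// Call main() to run the sketch.
```

Each new session triggers a browser launch here, so this trades speed for isolation between sessions.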

Answer selected by kwdiwt