-
Hi all, if I have written two crawling projects that are started simultaneously, do they share a browser pool, or does each crawling project start its own browser pool?
-
Hello, every crawling project has its own browser pool; more precisely, every crawler instance creates its own `BrowserPool`.
-
Thank you for your answer. If that is the case, what is the point of a separate browser pool for each project (i.e. per website)? I'm not after speed, but I will crawl many different websites with different page structures. How can I share the browser pool across them? Otherwise each project has to start and close its own browser, and I'm worried about the system resources that occupies.
The current purpose of `BrowserPool` is to handle browser management during the crawler run: it provides a unified interface for opening and closing pages in the managed browsers, and it also handles fingerprint injection and proxy setup.

There indeed might be a performance hit from not reusing the managed browsers across multiple concurrent crawls. Unfortunately, right now there is no way to instantiate the `BrowserPool` separately and pass it to the crawler instance. While there might be actual technical reasons for this (e.g. the way that proxies currently bind to running browsers), this is IMO rather a design oversight.

Currently, your best bets are:
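To picture the kind of sharing being asked about, here is a hypothetical, heavily simplified sketch in plain Node, with no Crawlee APIs involved: `SimpleBrowserPool`, `crawlProject`, the launch cap, and the example URLs are all invented for illustration. Two "projects" borrow browsers from one pool instead of each starting and closing their own.

```javascript
// Hypothetical sketch, NOT Crawlee's actual API: a single pool whose
// "browsers" are plain objects standing in for real browser processes.
class SimpleBrowserPool {
  constructor(maxBrowsers) {
    this.maxBrowsers = maxBrowsers; // hard cap on launched browsers
    this.launched = 0;              // how many browsers were ever started
    this.idle = [];                 // browsers parked for reuse
  }

  acquire() {
    // Reuse an idle browser before launching a new one.
    if (this.idle.length > 0) return this.idle.pop();
    if (this.launched >= this.maxBrowsers) {
      throw new Error('pool exhausted; release a browser first');
    }
    this.launched += 1;
    return { id: this.launched }; // stand-in for launching a real browser
  }

  release(browser) {
    // Return the browser to the pool instead of closing it.
    this.idle.push(browser);
  }
}

// Two crawling "projects" targeting different sites share one pool,
// so browsers get reused rather than started and closed per project.
const pool = new SimpleBrowserPool(2);

function crawlProject(urls) {
  for (const url of urls) {
    const browser = pool.acquire();
    // ...open a page in `browser`, scrape `url`...
    pool.release(browser);
  }
}

crawlProject(['https://site-a.example/1', 'https://site-a.example/2']);
crawlProject(['https://site-b.example/1']);

console.log(pool.launched); // 1 — a single browser served both projects
```

In real code the pool would manage actual browser processes asynchronously (launching, health checks, retiring browsers after a page count), which is roughly what `BrowserPool` already does inside a single crawler run; the missing piece described above is the ability to construct such a pool once and hand it to several crawler instances.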