about CONCURRENCY_PAGE #5
Comments
I guess you are calling cluster.queue with invalid URLs? If you want to use
or like this:
My code: db_result = await fetch60(); // fetch 60 URLs from the DB
Is the actual URL given to cluster.queue with a protocol (
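The protocol matters here because a string without one cannot be parsed as an absolute URL by Node's built-in URL class, so no hostname can be derived from it. A minimal sketch for checking values before they are queued, assuming the queued data are plain strings; hostnameOf is a hypothetical helper and not part of puppeteer-cluster:

```javascript
// Hypothetical helper (not a puppeteer-cluster API): returns the hostname of
// an absolute URL, or null when the string cannot be parsed as one.
function hostnameOf(rawUrl) {
  try {
    return new URL(rawUrl).hostname;
  } catch (err) {
    // new URL() throws a TypeError for strings such as 'example.com/page'
    // that lack a protocol.
    return null;
  }
}

// Example values are made up for illustration.
const candidates = ['https://example.com/list?page=1', 'example.com/list?page=2'];
for (const candidate of candidates) {
  const host = hostnameOf(candidate);
  console.log(candidate, '->', host === null ? 'not an absolute URL' : host);
}
```

Any value reported as not an absolute URL would need a protocol such as http:// or https:// added before it is queued.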
With the current settings, what actually happens is that all 10 URLs are opened at the same moment. These 10 URLs are under the same domain name, so I want each of them to be delayed. Having 10 tabs open at once is fine, but the URLs must not be loaded simultaneously; there has to be a delay between them, otherwise the target site judges me to be a robot. I tried adding delay code at the top of the task function, but the pages still open at the same time.
I understand your scenario and the library supports it. Please either answer my questions or provide your source code.
const { Cluster } = require('puppeteer-cluster');
// root directory where the rules are stored
/////////////////////// database setup ////////////////////////////////////
const mongoose = require('mongoose');
db.on('connected', function () {
    console.log('Mongodb connected successfully ' + DB_URL);
});
//------------------- function definitions ---------------------
};
async function main() {
}
async function getHtmlSorce({ page, data: url }) {
@kanxue660 If you solved it, perhaps share the solution so others could learn? Haha
@Rainbowhat The author has fixed this problem.
Here is my scenario: I have a number of URLs that I want to open in bulk, in parallel. If several URLs are under the same domain name, each page should be opened a few seconds after the previous one, to avoid being blocked by the target site's webmaster. How can I do this? The following settings do not seem to work:
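concurrency: Cluster.CONCURRENCY_PAGE,
maxConcurrency: 10,
retryLimit: 5, // retry up to 5 times on failure
retryDelay: 2000, // wait 2 seconds between retries
sameDomainDelay: 30 * 1000, // delay between pages on the same domain (the original comment says 10 s, the value is 30 s); seems to have no effect
skipDuplicateUrls: true, // skip duplicate URLs
workerCreationDelay: 500, // delay between opening tabs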
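To make the intended behaviour concrete, here is a small sketch of what a same-domain delay means for scheduling: pages on different hostnames may start immediately, while pages on the same hostname start at least the delay apart. This is only an illustration of the idea, not how puppeteer-cluster implements sameDomainDelay; planStartTimes and the example URLs are hypothetical.

```javascript
// Illustrative only (not puppeteer-cluster code): compute the earliest start
// time, in milliseconds, for each URL so that two URLs on the same hostname
// start at least `sameDomainDelayMs` apart.
function planStartTimes(urls, sameDomainDelayMs, now = 0) {
  const nextFreeAt = new Map(); // hostname -> earliest allowed start time
  return urls.map((url) => {
    const host = new URL(url).hostname; // throws if the URL has no protocol
    const startAt = Math.max(now, nextFreeAt.get(host) ?? now);
    nextFreeAt.set(host, startAt + sameDomainDelayMs);
    return { url, startAt };
  });
}

// Two pages on example.com and one on example.org, with a 30 s delay.
console.log(planStartTimes(
  ['https://example.com/1', 'https://example.com/2', 'https://example.org/1'],
  30 * 1000
));
```

With these inputs the second example.com page is planned 30 seconds after the first, while the example.org page can start right away.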