
TypeError: Cannot read property 'priority' of undefined #127

Closed
yujiosaka opened this issue Feb 24, 2018 · 0 comments

What is the current behavior?

With low probability (in my case, less than 3% of the time), the crawler fails to enqueue the next request and never stops.
The error looks like this:

TypeError: Cannot read property 'priority' of undefined
    at lowerBound (/Users/yujiosaka/work/headless-chrome-crawler/cache/session.js:71:63)
    at lowerBound (/Users/yujiosaka/work/headless-chrome-crawler/lib/helper.js:117:11)
    at SessionCache.enqueue (/Users/yujiosaka/work/headless-chrome-crawler/cache/session.js:71:15)
    at PriorityQueue.push (/Users/yujiosaka/work/headless-chrome-crawler/lib/priority-queue.js:34:17)
    at PriorityQueue.<anonymous> (/Users/yujiosaka/work/headless-chrome-crawler/lib/helper.js:178:23)
    at HCCrawler._push (/Users/yujiosaka/work/headless-chrome-crawler/lib/hccrawler.js:271:17)
    at _skipRequest.then.skip (/Users/yujiosaka/work/headless-chrome-crawler/lib/hccrawler.js:545:16)
    at process._tickCallback (internal/process/next_tick.js:103:7)
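
Judging from the trace, the comparator passed to the lowerBound helper reads .priority on a queue entry that is undefined, so the cached queue apparently contains a hole. Below is a minimal sketch of the failing pattern; the lowerBound implementation is an assumption for illustration and may differ from the actual one in lib/helper.js:

// Hypothetical reconstruction of the failing pattern; the real helper
// in lib/helper.js may differ in detail.
function lowerBound(array, value, comparator) {
  let first = 0;
  let count = array.length;
  while (count > 0) {
    const step = Math.floor(count / 2);
    const it = first + step;
    if (comparator(array[it], value) <= 0) {
      first = it + 1;
      count -= step + 1;
    } else {
      count = step;
    }
  }
  return first;
}

// If the cached queue contains an undefined entry (e.g. one left behind
// by a concurrent cache update), the comparator throws the TypeError above:
const queue = [{ priority: 2 }, undefined, { priority: 0 }];
lowerBound(queue, { priority: 1 }, (a, b) => b.priority - a.priority);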

If the current behavior is a bug, please provide the steps to reproduce

The following script sometimes causes the error.

const HCCrawler = require('headless-chrome-crawler');

HCCrawler.launch({
  maxDepth: 3,
  maxConcurrency: 10,
  allowedDomains: ['www.emin.co.jp'],
  evaluatePage: (() => window.document.title),
  onSuccess: (result => { // function called on each successful crawl
    console.log(`${result.options.url}\t${result.result}`);
  }),
})
  .then(crawler => {
    crawler.queue('https://www.emin.co.jp/');
    crawler.onIdle()
      .then(() => crawler.close());
  });

What is the expected behavior?

  • Do not cause an error
  • Execute all queued requests
  • Successfully close the crawler

What is the motivation / use case for changing the behavior?

Please tell us about your environment:

  • Version: 1.3.4
  • Platform / OS version: macOS
  • Node.js version: v6.4.0