scrapy.pqueues.ScrapyPriorityQueue #6266
Comments
Do you have a disk queue corrupted by e.g. a prior spider crash? Otherwise you need to make a minimal reproducible example of this problem.
No, it should exist there because of how the class code is written.
@wRAR I clear the request cache directory before starting a new run, so starting a fresh crawl with a corrupt queue is not possible. Regarding the reproducible example, the error does not occur every time or for every crawl, so reproducing it is very difficult.
@Gallaecio I clear both.
The spiders run fine, but I get a KeyError for some spiders. Also, this error does not occur on every run.
@wRAR / @Gallaecio can you please advise how I can find and fix the issue?
No direct ideas, but I would start with logging additions and removals for
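The logging suggestion above can be sketched as a dict subclass that records every addition and removal, so a later KeyError can be traced back to the operation that removed the key. This is a hypothetical debugging aid, not part of Scrapy; the idea of wrapping the queue's internal dict is an assumption about how one might instrument it.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("pqueue-debug")


class LoggingDict(dict):
    """A dict that logs key additions and removals for debugging."""

    def __setitem__(self, key, value):
        logger.debug("added key %r", key)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        logger.debug("removed key %r", key)
        super().__delitem__(key)

    def pop(self, key, *default):
        logger.debug("popped key %r", key)
        return super().pop(key, *default)
```

One could then (hypothetically) swap the priority queue's internal dict for a `LoggingDict` at startup and compare the log against the traceback when the KeyError next appears.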
@wRAR @Gallaecio I'm still looking for a solution and trying to debug the KeyError; however, I have started seeing another issue.
The spider never completes, and I have to kill the process and rerun.
Description
The ScrapyPriorityQueue raises a builtins.KeyError.
Steps to Reproduce
The error occurs randomly, and I am not sure how to reproduce it.
Versions
2.6.2
Resolution:
Since Python throws a KeyError when the key does not exist in the queue (which is a dict here), shouldn't we check whether the key exists first?
`if self.curprio is None` can be replaced with `if self.curprio is None or self.curprio not in self.queues`.
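To illustrate the proposed guard, here is a minimal sketch (not Scrapy's actual class) of a priority queue that keys per-priority queues in a dict: if `curprio` ever points at a priority that is no longer a key, indexing raises KeyError, while the combined check fails safe. The class and method names here are simplified stand-ins.

```python
class TinyPriorityQueue:
    """Minimal sketch of a dict-keyed priority queue (lower number = higher priority)."""

    def __init__(self):
        self.queues = {}     # priority -> list of pending requests
        self.curprio = None  # the priority currently being drained

    def push(self, request, priority):
        self.queues.setdefault(priority, []).append(request)
        if self.curprio is None or priority < self.curprio:
            self.curprio = priority

    def pop(self):
        # The guarded check proposed above: besides curprio being unset,
        # also verify it still exists as a key before indexing, so a stale
        # curprio (e.g. after a crash-corrupted queue) cannot raise KeyError.
        if self.curprio is None or self.curprio not in self.queues:
            return None
        q = self.queues[self.curprio]
        request = q.pop(0)
        if not q:
            # Drop the emptied bucket and advance to the next priority.
            del self.queues[self.curprio]
            self.curprio = min(self.queues) if self.queues else None
        return request
```

With the plain `if self.curprio is None` check, a stale `curprio` would hit `self.queues[self.curprio]` and raise the KeyError described in this issue; the extended condition turns that into a harmless empty pop.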