downloader-aware scheduler cleanups #2
Request.meta is always a dict
also, do some bike-shedding: _pathable -> _path_safe
* PriorityAsTupleQueue.is_empty does the same as len(self) == 0
* custom PriorityAsTupleQueue.close is not needed after a switch to namedtuples
* is_new and is_empty return values are unused
* "url" local variable is unused

* remove mutable default arguments
* more verbose variable names
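For context on the "remove mutable default arguments" item, here is a minimal sketch of the pitfall it refers to. The function names `push_shared` and `push_fresh` are hypothetical and are not taken from scrapy/pqueues.py; they only illustrate the general Python behaviour.

```python
def push_shared(item, queue=[]):
    # The default list is created once, at function definition time,
    # so every call that omits `queue` appends to the same list.
    queue.append(item)
    return queue


def push_fresh(item, queue=None):
    # Using None as a sentinel gives each call its own new list.
    if queue is None:
        queue = []
    queue.append(item)
    return queue


shared_first = push_shared('a')
shared_second = push_shared('b')   # same list object as shared_first

fresh_first = push_fresh('a')
fresh_second = push_fresh('b')     # independent list
```

With the shared default, state leaks between unrelated calls; the `None` sentinel avoids that.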
scrapy/pqueues.py
Outdated
slot = urlparse_cached(request).hostname or ''

# FIXME: meta is not modified when request is a dict and no meta is stored?
To address this concern, we should replace the lookup at line 40 in d8e2b25
if isinstance(request, dict):
    meta = request.get('meta', dict())
with meta = request.setdefault('meta', dict())
in this code, so that a newly created meta dict is stored back on the request.
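The difference between the two calls can be sketched as follows. This is a standalone illustration, not the code from scrapy/pqueues.py: the helper names and the 'slot' key written into meta are assumptions made for the example.

```python
def tag_with_get(request):
    # dict.get returns the fallback dict without storing it on `request`,
    # so the write below is lost when 'meta' was missing.
    meta = request.get('meta', dict())
    meta['slot'] = 'example.com'
    return meta


def tag_with_setdefault(request):
    # dict.setdefault stores the fallback dict under 'meta' before
    # returning it, so the write below is visible on `request`.
    meta = request.setdefault('meta', dict())
    meta['slot'] = 'example.com'
    return meta


request_a = {'url': 'http://example.com'}
tag_with_get(request_a)

request_b = {'url': 'http://example.com'}
tag_with_setdefault(request_b)

print('meta' in request_a)   # False
print(request_b['meta'])     # {'slot': 'example.com'}
```

When the request dict already contains a 'meta' key, both variants behave the same; they differ only in the missing-key case that the FIXME describes.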
I left comments on small possible improvements here and there.
* SlotPriorityQueues doesn't care about objects inside, it is now just a container for multiple priority queues
* assorted variable renames
* don't inherit DownloaderAwarePriorityQueue from SlotBasedPriorityQueue
* apply @whalebot-helmsman's suggestions for __slots__ and meta issues

Thanks @kmike
c9c349e
into
whalebot-helmsman:round-robin-scheduler-tested
Changes are rather mechanical so far, I haven't got to more complex parts yet.