[MRG+1] Per-key priorities for dict-like settings by promoting dicts to Settings instances #1149
Implementing per-key priorities like this would render #1110 obsolete.
I've resolved issue 1 by providing a |
@@ -13,7 +13,7 @@ class DownloadHandlers(object):
     def __init__(self, crawler):
         self._handlers = {}
         self._notconfigured = {}
-        handlers = crawler.settings.get('DOWNLOAD_HANDLERS_BASE')
+        handlers = crawler.settings.get('DOWNLOAD_HANDLERS_BASE', {})
We could use crawler.settings.getdict() (which has {} as default already) to get both 'DOWNLOAD_HANDLERS_BASE' and 'DOWNLOAD_HANDLERS'.
Not just that: it also allows you to pass DOWNLOAD_HANDLERS from the command line.
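A minimal sketch of why a getdict()-style accessor helps with command-line values. This is not Scrapy's code: the free function `getdict` and the sample settings dicts below are hypothetical stand-ins, and the sketch assumes command-line values arrive as JSON strings.

```python
import json


def getdict(settings, name, default=None):
    """Return settings[name] as a dict (hypothetical stand-in for Settings.getdict)."""
    value = settings.get(name)
    if value is None:
        # Missing setting: fall back to the default, or to an empty dict.
        value = default if default is not None else {}
    if isinstance(value, str):
        # Assumption: values passed on the command line arrive as JSON strings.
        value = json.loads(value)
    return dict(value)


from_project = {"DOWNLOAD_HANDLERS": {"s3": None}}
from_cmdline = {"DOWNLOAD_HANDLERS": '{"s3": null}'}

print(getdict(from_project, "DOWNLOAD_HANDLERS"))
print(getdict(from_cmdline, "DOWNLOAD_HANDLERS"))
print(getdict({}, "DOWNLOAD_HANDLERS_BASE"))
```

With a plain get(), the command-line case would hand the caller a string instead of a dict.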
FEED_STORAGES and FEED_EXPORTERS are also dictionaries of paths that don't use ordering, and those should handle *BASE settings too (DEFAULT_REQUEST_HEADERS is a dict as well, but it's used differently). I like the idea of outsourcing the common code from FEED* and DOWNLOAD_HANDLERS loading and
Don't worry, just rebase.
Thank you guys for the feedback! I now have:
To do:
Alright, this PR is now at a stage where I'm fairly happy with it and would remove the WIP tag as soon as I've written/updated the documentation and rebased. Still very open for feedback of course :) I've updated the first post as an overview if you haven't followed this PR. @curita I made some changes to the |
        self.settings_module = settings_module
        Settings.__init__(self, **kw)
Is there a reason for swapping these two lines?
Yes, took me quite a while to figure out though ;D It became necessary after I moved the dict promotion into __init__(). Settings.__init__() uses __getitem__() during the promotion of default dictionaries to BaseSettings instances, which in turn accesses self.settings_module, so it needs to be defined before calling __init__().
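The ordering constraint described above can be reproduced with toy classes. This is only a sketch: `ToyBase`, `ToyOverridden`, `ToyBroken` and the `SOME_DICT` setting are hypothetical names, not the classes touched by this PR.

```python
from types import SimpleNamespace


class ToyBase:
    def __init__(self):
        # "Promotion" step: reading a value goes through __getitem__.
        self.promoted = self["SOME_DICT"]

    def __getitem__(self, name):
        return {}


class ToyOverridden(ToyBase):
    def __init__(self, settings_module):
        # Must be assigned first: the base __init__ calls __getitem__,
        # which reads self.settings_module.
        self.settings_module = settings_module
        ToyBase.__init__(self)

    def __getitem__(self, name):
        return getattr(self.settings_module, name, None)


class ToyBroken(ToyOverridden):
    def __init__(self, settings_module):
        # Original order: __getitem__ runs before the attribute exists.
        ToyBase.__init__(self)
        self.settings_module = settings_module


module = SimpleNamespace(SOME_DICT={"a": 1})
print(ToyOverridden(module).promoted)

try:
    ToyBroken(module)
except AttributeError as exc:
    print("fails:", exc)
```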
I love the unification of the code, and I really like the design decisions you took there. I pointed out a couple of remaining details, but I think the overall functionality is well defined, so you should be able to start with the documentation.
2f288fa to 4192d8b
Alright, I think this PR is ready for final review. I've incorporated your recent feedback (nitpicks, |
+1 to merge. It needs a note about the backwards incompatibilities introduced by this PR and, if possible, how users should update their code.
I'm happy to add a note, would that go into |
Removing item pipelines list support is fine and it is not a backward
The change that worries me a bit is the new behaviour for dictionary
On Aug 20, 2015 7:05, "Jakob de Maeyer" notifications@github.com
            compsett = BaseSettings(self[name + "_BASE"], priority='default')
            compsett.update(self[name])
            return compsett
        else:
else is not necessary, let's drop it
Now that I've seen it with a little more distance, I think this whole function doesn't really achieve what I had intended.
When users override XY_BASE, they explicitly don't want any of Scrapy's defaults for that component setting. But line 206 pulls Scrapy's defaults (which now live in XY) back in and, even worse, overwrites the user's XY_BASE settings where they have the same keys (say, when the user simply changed some orders).
I guess either line 206 could be changed so that only those keys from XY that have a priority higher than default are considered, or support for _BASE settings could be dropped altogether with this PR. After all, they have always been marked as "never edit this", the behaviour of the dict-like settings changes with this PR anyway, and we could get rid of this non-public helper function.
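A rough sketch of the problem and of the first option (only considering keys above default priority). Everything here is illustrative: the `composite` function, the numeric priorities and the `mw.*` keys are hypothetical, and entries are modelled as plain `(value, priority)` tuples rather than real settings objects.

```python
DEFAULT, PROJECT = 0, 20  # illustrative numeric priorities


def composite(user_base, current, skip_defaults):
    """Merge a user-supplied XY_BASE dict with the XY setting.

    `current` maps key -> (value, priority). With skip_defaults=False every
    entry of XY is written on top of XY_BASE, including Scrapy's defaults.
    """
    comp = {key: (value, DEFAULT) for key, value in user_base.items()}
    for key, (value, prio) in current.items():
        if skip_defaults and prio <= DEFAULT:
            continue  # ignore entries that are merely Scrapy defaults
        if key not in comp or prio >= comp[key][1]:
            comp[key] = (value, prio)
    return {key: value for key, (value, _) in comp.items()}


xy = {
    "mw.A": (100, DEFAULT),   # Scrapy default
    "mw.B": (200, DEFAULT),   # Scrapy default
    "mw.C": (300, PROJECT),   # set by the user in XY
}
user_xy_base = {"mw.A": 500}  # user only wants mw.A, with a new order

print(composite(user_xy_base, xy, skip_defaults=False))
print(composite(user_xy_base, xy, skip_defaults=True))
```

In the unfiltered call the defaults for mw.A and mw.B come back and the user's order for mw.A is lost; in the filtered call only the user's own entries survive.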
Here's a proposed release note:
03349ff to d9577da
d9577da to 03f1720
Rebased onto current master and updated the function that handles backwards-compatibility for users who explicitly set They will now find the expected behaviour that they get none of Scrapy's defaults for each setting where they manually have set a
will now result in no downloader middlewares being enabled (and a warning that
Not sure about the codecov test failing. It says only 95% of this diff is hit, because there are a couple of places that weren't covered before but where I switched the syntax to use the new helpers, e.g. like this:
- valid_output_formats = (
-     list(self.settings.getdict('FEED_EXPORTERS').keys()) +
-     list(self.settings.getdict('FEED_EXPORTERS_BASE').keys())
- )
+ feed_exporters = without_none_values(self.settings._getcomposite('FEED_EXPORTERS'))
+ valid_output_formats = feed_exporters.keys()
These should definitely be covered, but I don't think that belongs in this PR.
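For readers unfamiliar with the helper used in that diff, here is a dict-only sketch of what a without_none_values-style function does. The name `drop_none_values` and the exporter paths are hypothetical; the real helper may differ in signature and in the types it accepts.

```python
def drop_none_values(mapping):
    """Return a copy of `mapping` without the entries whose value is None."""
    return {key: value for key, value in mapping.items() if value is not None}


# Hypothetical merged exporter setting: "xml" was disabled by setting it to None.
feed_exporters = {
    "json": "myproject.exporters.JsonExporter",
    "xml": None,
}

valid_output_formats = list(drop_none_values(feed_exporters).keys())
print(valid_output_formats)
```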
Removed "backward-incompatible" tag after #1586 merged.
Expand settings priorities by assigning per-key priorities for the dict-like settings (e.g. DOWNLOADER_MIDDLEWARES), instead of just a single priority for the whole dictionary. This allows updating these settings from multiple locations without having to care (too much) about order. It is a prerequisite for the add-on system (#1272, #591).
There are two main updates:
- … BaseSettings class (formerly Settings). They behave just like dictionaries, but honour per-key priorities when being written to.
- X_BASE settings are deprecated, with default entries now living in the X setting.
And several smaller updates:
- Settings is a subclass of BaseSettings. It loads the default settings and promotes dictionaries within them to BaseSettings instances.
- BaseSettings has a complete dictionary-like interface.
- … None. A new helper, scrapy.util.without_none_values(), was introduced for this. This was previously not supported by FEED_STORAGES, FEED_EXPORTERS, and DEFAULT_REQUEST_HEADERS.
- scrapy.util.build_component_list() helper has been updated according to the deprecation of _BASE settings, as the (base, custom) call signature does not make much sense anymore. It's still backwards-compatible.
- ITEM_PIPELINES can no longer be provided as a list.
Comes with many new/updated tests and documentation.
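To illustrate the core idea of per-key priorities, here is a small self-contained sketch. It is not the BaseSettings implementation: the `PerKeyDict` class, the priority table and the `mw.*` keys are hypothetical, and the rule "a write wins if its priority is at least as high as the stored one" is an assumption made for the example.

```python
PRIORITIES = {"default": 0, "project": 20, "cmdline": 40}  # illustrative values


class PerKeyDict:
    """Toy dict-like container that stores a priority next to every value."""

    def __init__(self):
        self._entries = {}  # key -> (value, numeric priority)

    def set(self, key, value, priority):
        prio = PRIORITIES[priority]
        current = self._entries.get(key)
        # A write only takes effect if it is at least as important
        # as whatever is already stored for this particular key.
        if current is None or prio >= current[1]:
            self._entries[key] = (value, prio)

    def update(self, values, priority):
        for key, value in values.items():
            self.set(key, value, priority)

    def as_dict(self):
        return {key: value for key, (value, _) in self._entries.items()}


defaults = {"mw.Retry": 500, "mw.Cookies": 700}
project = {"mw.Retry": None, "mw.Custom": 543}

# Apply the same two updates in both possible orders.
a = PerKeyDict()
a.update(defaults, "default")
a.update(project, "project")

b = PerKeyDict()
b.update(project, "project")
b.update(defaults, "default")

print(a.as_dict() == b.as_dict())
```

Because the priority is tracked per key rather than per dictionary, the late default write in `b` fills in mw.Cookies without clobbering the project-level entries.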