CaseInsensitiveDict (deprecate CaselessDict) #5146
Codecov Report
@@ Coverage Diff @@
## master #5146 +/- ##
===========================================
+ Coverage 42.36% 88.88% +46.52%
===========================================
Files 162 162
Lines 11206 11243 +37
Branches 1825 1826 +1
===========================================
+ Hits 4747 9993 +5246
+ Misses 6103 965 -5138
+ Partials 356 285 -71
scrapy/http/headers.py
        return CaseInsensitiveDict(
            (to_unicode(key, encoding=self.encoding), to_unicode(b','.join(value), encoding=self.encoding))
            for key, value in self.items()
        )
This change should be safe, I think: CaseInsensitiveDict's interface is the same as CaselessDict's. The method is only used here.
It’s still a potential issue for isinstance usages. I’m not saying I’m against this change, but I am unsure.
This should address the isinstance check:
diff --git scrapy/utils/datatypes.py scrapy/utils/datatypes.py
index 6eeabe1e..63c07b48 100644
--- scrapy/utils/datatypes.py
+++ scrapy/utils/datatypes.py
@@ -21,7 +21,7 @@ class CaselessDict(dict):
def __new__(cls, *args, **kwargs):
from scrapy.http.headers import Headers
- if issubclass(cls, CaselessDict) and not issubclass(cls, Headers):
+ if issubclass(cls, CaselessDict) and not issubclass(cls, (CaseInsensitiveDict, Headers)):
warnings.warn(
"scrapy.utils.datatypes.CaselessDict is deprecated,"
" please use scrapy.utils.datatypes.CaseInsensitiveDict instead",
@@ -79,7 +79,7 @@ class CaselessDict(dict):
return dict.pop(self, self.normkey(key), *args)
-class CaseInsensitiveDict(collections.UserDict):
+class CaseInsensitiveDict(collections.UserDict, CaselessDict):
"""A dict-like structure that accepts strings or bytes as keys and allows case-insensitive lookups.
It also allows overriding key and value normalization by defining custom `normkey` and `normvalue` methods.
Although, as we discussed, this might not be a problem, mostly because CaselessDict is not documented.
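The effect of the issubclass guard in the diff above can be sketched in isolation. This is a simplified illustration, not Scrapy's code: LegacyDict and NewDict are hypothetical stand-ins for CaselessDict and CaseInsensitiveDict, and NewDict here inherits only from LegacyDict rather than also from collections.UserDict.

```python
import warnings


class LegacyDict(dict):
    """Hypothetical stand-in for the deprecated class."""

    def __new__(cls, *args, **kwargs):
        # NewDict is resolved at call time, so it can be defined further down.
        if not issubclass(cls, NewDict):
            warnings.warn(
                "LegacyDict is deprecated, please use NewDict instead",
                DeprecationWarning,
                stacklevel=2,
            )
        return super().__new__(cls)


class NewDict(LegacyDict):
    """Hypothetical replacement; inheriting keeps isinstance checks working."""


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = LegacyDict(a=1)  # triggers the deprecation warning
    new = NewDict(a=1)        # exempted by the issubclass guard
```

In this sketch only the LegacyDict instantiation is recorded in caught, while isinstance(new, LegacyDict) still holds, which is the property the isinstance concern is about.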
    def __delitem__(self, key: AnyStr) -> None:
        super().__delitem__(self.normkey(key))

    def __contains__(self, key: AnyStr) -> bool:  # type: ignore[override]
Even if this goes against the Liskov substitution principle, it makes sense because being case-insensitive only applies to str or bytes, not to all hashable types.
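A minimal sketch of why the narrower key type is reasonable, assuming a UserDict-based class whose normkey simply lowercases the key. SketchCaseInsensitiveDict is a hypothetical name for illustration and is not the class from this PR.

```python
from collections import UserDict
from typing import AnyStr


class SketchCaseInsensitiveDict(UserDict):
    """Illustrative only: lowercases str or bytes keys on every access."""

    def normkey(self, key: AnyStr) -> AnyStr:
        # Both str and bytes provide lower(); other hashable types may not.
        return key.lower()

    def __getitem__(self, key: AnyStr):
        return super().__getitem__(self.normkey(key))

    def __setitem__(self, key: AnyStr, value) -> None:
        super().__setitem__(self.normkey(key), value)

    def __delitem__(self, key: AnyStr) -> None:
        super().__delitem__(self.normkey(key))

    def __contains__(self, key: AnyStr) -> bool:  # type: ignore[override]
        return super().__contains__(self.normkey(key))


headers = SketchCaseInsensitiveDict({"Content-Type": "text/html", b"X-Token": b"abc"})
found_str = "content-type" in headers
found_bytes = b"x-token" in headers
is_plain_dict = isinstance(headers, dict)
```

With this sketch, a key such as an int has no lower() method, so a membership test with it raises AttributeError instead of returning False. Note also that a UserDict subclass is not an instance of dict, which relates to the isinstance concern raised earlier in the thread.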
I wonder if it's possible to inherit from some type-enabled dict
Thanks!
I think the data model for scrapy.http.headers.Headers (and scrapy.utils.datatypes.CaselessDict) is a little bit inconsistent. For instance:

This has appeared before in #1459 (not reproducible anymore on py3, but the idea remains).

dict is highly optimized at the C level and it seems like it doesn't always use overridden data access methods. For instance, MutableMapping implementations (like itemadapter.ItemAdapter) use a combination of keys (which I believe relies on __iter__) and __getitem__ when wrapped with dict, while dict subclasses seem to bypass this:

This article shows a few reasons why inheriting from dict is not a good idea.

I tried to use CaseInsensitiveDict as base class for Headers, but some tests fail (because they wrap the headers objects with dict before checking the contents). Personally, I believe the Headers data model needs a little refactoring. I'd love to hear your thoughts about it (should I continue this PR to include changes to the Headers class, for instance?).
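The difference between dict subclasses and MutableMapping implementations when wrapped with dict can be sketched as follows. This is an illustration under CPython, not the example from the original description; UpperValuesDict and UpperValuesMapping are hypothetical classes that both upper-case values on read.

```python
from collections.abc import MutableMapping


class UpperValuesDict(dict):
    """dict subclass that upper-cases values in __getitem__."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


class UpperValuesMapping(MutableMapping):
    """MutableMapping implementation with the same __getitem__ behaviour."""

    def __init__(self, data=None):
        self._data = dict(data or {})

    def __getitem__(self, key):
        return self._data[key].upper()

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


sub = UpperValuesDict(a="x")
mapping = UpperValuesMapping({"a": "x"})

direct_access = sub["a"]        # goes through the overridden __getitem__
wrapped_sub = dict(sub)         # copies the underlying storage directly
wrapped_mapping = dict(mapping) # built from keys() and __getitem__
```

On CPython, wrapped_sub holds the raw lower-case value while wrapped_mapping holds the upper-cased one, so the same override is honoured in one case and bypassed in the other. This is the kind of mismatch that makes tests wrapping Headers objects with dict sensitive to the base class.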