Commit

Resurrected support for byte string keys that was removed in 1.1.0
Details:

* Version 1.1.0 introduced the use of str.casefold() for getting
  the case-insensitive key from the original key. That introduced
  an incompatibility for users who had previously used byte strings
  as keys. While this was documented as an incompatibility and
  the NocaseDict class provided a way to override the __casefold__()
  method for changing this behavior, it still caused issues for
  users.

  This change falls back to using lower() on the key if it does not
  have casefold().

* Added testcases for byte string keys.

Signed-off-by: Andreas Maier <andreas.r.maier@gmx.de>
andy-maier committed Feb 25, 2023
1 parent f12630f commit ec87807
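
The fallback described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the package's code: `casefold_key` is a hypothetical stand-alone helper that mirrors the try/except logic of the changed `__casefold__()` method.

```python
from typing import AnyStr


def casefold_key(key: AnyStr) -> AnyStr:
    """Hypothetical helper: casefold() if available, else lower()."""
    try:
        # Unicode strings have casefold(), which is more aggressive than lower()
        return key.casefold()  # type: ignore
    except AttributeError:
        # Byte strings have no casefold(), so fall back to lower()
        return key.lower()


print(casefold_key("Straße"))  # strasse
print(casefold_key(b"Dog"))    # b'dog'
```

Catching `AttributeError` instead of checking `isinstance(key, bytes)` keeps the fallback open to any key type that provides `lower()`.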
Showing 5 changed files with 135 additions and 30 deletions.
6 changes: 3 additions & 3 deletions README.rst
@@ -54,12 +54,12 @@ the following exceptions (and the case-insensitivity of course):

The case-insensitivity is achieved by matching any key values as their
casefolded values. By default, the casefolding is performed with
`str.casefold()`_ on Python 3 and with `str.lower()`_ on Python 2.
`str.casefold()`_ for unicode string keys and with `bytes.lower()`_ for byte
string keys.
The default casefolding can be overridden with a user-defined casefold method.


.. _str.casefold(): https://docs.python.org/3/library/stdtypes.html#str.casefold
.. _str.lower(): https://docs.python.org/2/library/stdtypes.html#str.lower
.. _bytes.lower(): https://docs.python.org/3/library/stdtypes.html#bytes.lower

Functionality can be added using mixin classes:

3 changes: 3 additions & 0 deletions docs/changes.rst
@@ -27,6 +27,9 @@ Released: not yet

* Added type hints and type checking with MyPy (issue #123).

* Resurrected support for byte string keys that was removed in version 1.1.0.
(issue #139)

**Cleanup:**

**Known issues:**
22 changes: 10 additions & 12 deletions docs/intro.rst
@@ -43,7 +43,12 @@ the following exceptions (and the case-insensitivity of course):

The case-insensitivity is achieved by matching any key values as their
casefolded values. By default, the casefolding is performed with
:meth:`py:str.casefold` on Python 3 and with :meth:`py2:str.lower` on Python 2.
:meth:`py:str.casefold` for unicode string keys and with :meth:`py:bytes.lower`
for byte string keys.

The :meth:`py:str.casefold` method implements the casefolding
algorithm described in :term:`Default Case Folding in The Unicode Standard`.

The default casefolding can be overridden with a user-defined casefold method.

Functionality can be added using mixin classes:
@@ -64,21 +69,14 @@ not well maintained, or did not support the Python versions we needed.
Overriding the default casefold method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The case-insensitive behavior of the :class:`~nocasedict.NocaseDict` class
is implemented in its :meth:`~nocasedict.NocaseDict.__casefold__` method. That
method returns the casefolded key that is used for the case-insensitive lookup
of dictionary items.

The default implementation of the :meth:`~nocasedict.NocaseDict.__casefold__`
method calls :meth:`py:str.casefold` on Python 3 and :meth:`py2:str.lower` on
Python 2. The :meth:`py:str.casefold` method implements the casefolding
algorithm described in :term:`Default Case Folding in The Unicode Standard`.

If it is necessary to change the case-insensitive behavior of the
:class:`~nocasedict.NocaseDict` class, that can be done by overriding its
:meth:`~nocasedict.NocaseDict.__casefold__` method.

The following Python 3 example shows how your own casefold method would
That method returns the casefolded key that is used for the case-insensitive
lookup of dictionary items.

The following example shows how your own casefold method would
be used, that normalizes the key in addition to casefolding it:
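
The documentation's own example is not shown in this diff. As a rough sketch of the idea only, the following overrides a casefold method so that keys are Unicode-normalized before casefolding. `TinyNocaseDict` is a hypothetical minimal stand-in used here so the example runs without the package installed; it is not the real `NocaseDict`, and the choice of the NFKD normalization form is an assumption.

```python
from unicodedata import normalize


class TinyNocaseDict:
    """Hypothetical minimal stand-in, NOT the real NocaseDict class."""

    def __init__(self):
        self._data = {}  # casefolded key -> (original key, value)

    @staticmethod
    def __casefold__(key):
        # Default behavior: casefold(), falling back to lower() for bytes
        try:
            return key.casefold()
        except AttributeError:
            return key.lower()

    def __setitem__(self, key, value):
        self._data[self.__casefold__(key)] = (key, value)

    def __getitem__(self, key):
        return self._data[self.__casefold__(key)][1]


class NormalizingDict(TinyNocaseDict):
    """Normalizes unicode string keys in addition to casefolding them."""

    @staticmethod
    def __casefold__(key):
        return normalize("NFKD", key).casefold()


d = NormalizingDict()
d["\u00c5ngstr\u00f6m"] = 1          # precomposed characters
print(d["A\u030angstro\u0308m"])     # decomposed characters, same item
```

With the default casefold method, the precomposed and decomposed spellings would be treated as different keys. Note that this override accepts unicode string keys only, because `unicodedata.normalize()` does not accept byte strings.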


20 changes: 13 additions & 7 deletions nocasedict/_nocasedict.py
@@ -37,7 +37,7 @@
 import warnings
 from collections import OrderedDict
 from collections.abc import MutableMapping, KeysView, ValuesView, ItemsView
-from typing import Any, NoReturn, Optional, Iterator, Tuple, Dict
+from typing import Any, AnyStr, NoReturn, Optional, Iterator, Tuple, Dict

 from ._utils import _stacklevel_above_nocasedict

@@ -52,7 +52,7 @@
 else:
     _ODICT_TYPE = OrderedDict

-Key = Optional[str]  # May be None
+Key = Optional[AnyStr]

 # This env var is set when building the docs. It causes the methods
 # that are supposed to exist only in a particular Python version, not to be
@@ -258,7 +258,6 @@ def __init__(self, *args, **kwargs) -> None:
           dict6 = NocaseDict(dict1, BETA=3)
         Raises:
-          AttributeError: The key does not have the casefold method.
           TypeError: Expected at most 1 positional argument, got {n}.
           ValueError: Cannot unpack positional argument item #{i}.
         """
@@ -279,29 +278,36 @@ def _casefolded_key(self, key: Key) -> Key:
         return self.__casefold__(key)

     @staticmethod
-    def __casefold__(key: str) -> str:
+    def __casefold__(key: AnyStr) -> AnyStr:
         """
         This method implements the case-insensitive behavior of the class.
         It returns a case-insensitive form of the input key by calling a
         "casefold method" on the key. The input key will not be `None`.
         The casefold method called by this method is :meth:`py:str.casefold`.
+        If that method does not exist on the key value (e.g. because it is a
+        byte string), :meth:`py:bytes.lower` is called, for compatibility with
+        earlier versions of the package.
         This method can be overridden by users in order to change the
         case-insensitive behavior of the class.
         See :ref:`Overriding the default casefold method` for details.
         Parameters:
-          key (str): Input key, as a unicode string. Will not be `None`.
+          key (AnyStr): Input key. Will not be `None`.
         Returns:
-          str: Case-insensitive form of the input key, as a unicode string.
+          AnyStr: Case-insensitive form of the input key.
         Raises:
           AttributeError: The key does not have the casefold method.
         """
-        return key.casefold()
+        try:
+            return key.casefold()  # type: ignore
+        except AttributeError:
+            # Probably a byte string, fall back to lower()
+            return key.lower()

# Basic accessor and setter methods

114 changes: 106 additions & 8 deletions tests/unittest/test_nocasedict.py
@@ -365,7 +365,7 @@ def test_NocaseDict_init(testcase,
KeyError if TEST_AGAINST_DICT else AttributeError, None, True
),
(
"Empty dict, with empty string key (not found)",
"Empty dict, with empty unicode string key (not found)",
dict(
obj=NocaseDict(),
key='',
@@ -374,14 +374,32 @@ def test_NocaseDict_init(testcase,
KeyError, None, True
),
(
"Empty dict, with non-empty key (not found)",
"Empty dict, with empty byte string key (not found)",
dict(
obj=NocaseDict(),
key=b'',
exp_value=None,
),
KeyError, None, True
),
(
"Empty dict, with non-empty unicode string key (not found)",
dict(
obj=NocaseDict(),
key='Dog',
exp_value=None,
),
KeyError, None, True
),
(
"Empty dict, with non-empty byte string key (not found)",
dict(
obj=NocaseDict(),
key=b'Dog',
exp_value=None,
),
KeyError, None, True
),

# Non-empty NocaseDict
(
@@ -394,7 +412,7 @@ def test_NocaseDict_init(testcase,
KeyError, None, True
),
(
"Non-empty dict, with empty string key (not found)",
"Non-empty dict, with empty unicode string key (not found)",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='',
@@ -403,7 +421,17 @@ def test_NocaseDict_init(testcase,
KeyError, None, True
),
(
"Non-empty dict, with non-empty non-existing key (not found)",
"Non-empty dict, with empty byte string key (not found)",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key=b'',
exp_value=None,
),
KeyError, None, True
),
(
"Non-empty dict, with non-empty non-existing unicode string key "
"(not found)",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='invalid',
@@ -412,7 +440,17 @@ def test_NocaseDict_init(testcase,
KeyError, None, True
),
(
"Non-empty dict, with existing key in original case",
"Non-empty dict, with non-empty non-existing byte string key "
"(not found)",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key=b'invalid',
exp_value=None,
),
KeyError, None, True
),
(
"Non-empty dict, with existing unicode string key in original case",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='Dog',
@@ -421,7 +459,17 @@ def test_NocaseDict_init(testcase,
None, None, True
),
(
"Non-empty dict, with existing key in non-original upper case",
"Non-empty dict, with existing byte string key in original case",
dict(
obj=NocaseDict([(b'Dog', 'Cat'), (b'Budgie', 'Fish')]),
key=b'Dog',
exp_value='Cat',
),
None, None, True
),
(
"Non-empty dict, with existing unicode string key in non-original "
"upper case",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='DOG',
@@ -430,7 +478,18 @@ def test_NocaseDict_init(testcase,
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty dict, with existing key in non-original lower case",
"Non-empty dict, with existing byte string key in non-original "
"upper case",
dict(
obj=NocaseDict([(b'Dog', 'Cat'), (b'Budgie', 'Fish')]),
key=b'DOG',
exp_value='Cat',
),
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty dict, with existing unicode string key in non-original "
"lower case",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='dog',
@@ -439,14 +498,53 @@ def test_NocaseDict_init(testcase,
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty dict, with existing key in non-original mixed case",
"Non-empty dict, with existing byte string key in non-original "
"lower case",
dict(
obj=NocaseDict([(b'Dog', 'Cat'), (b'Budgie', 'Fish')]),
key=b'dog',
exp_value='Cat',
),
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty dict, with existing unicode string key in non-original "
"mixed case",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key='doG',
exp_value='Cat',
),
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty dict, with existing byte string key in non-original "
"mixed case",
dict(
obj=NocaseDict([(b'Dog', 'Cat'), (b'Budgie', 'Fish')]),
key=b'doG',
exp_value='Cat',
),
KeyError if TEST_AGAINST_DICT else None, None, True
),
(
"Non-empty unicode string dict, with same byte string key",
dict(
obj=NocaseDict([('Dog', 'Cat'), ('Budgie', 'Fish')]),
key=b'Dog',
exp_value='Cat',
),
KeyError, None, True
),
(
"Non-empty byte string dict, with same unicode string key",
dict(
obj=NocaseDict([(b'Dog', 'Cat'), (b'Budgie', 'Fish')]),
key='Dog',
exp_value='Cat',
),
KeyError, None, True
),
]
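
The last two testcases expect `KeyError` when a unicode string dictionary is looked up with a byte string key and vice versa. A small sketch of why that follows from the fallback: the casefolded key keeps its type, and in Python 3 a byte string never compares equal to a unicode string. `casefold_key` is a hypothetical helper and plain `dict` objects stand in for the real class.

```python
def casefold_key(key):
    """Hypothetical helper: casefold() for str, lower() as the fallback."""
    try:
        return key.casefold()
    except AttributeError:
        return key.lower()


# Plain dicts keyed by the casefolded key, as a stand-in for the real class
str_store = {casefold_key("Dog"): "Cat"}
bytes_store = {casefold_key(b"Dog"): "Cat"}

print(casefold_key("DOG") in str_store)     # True
print(casefold_key(b"DOG") in bytes_store)  # True
print(casefold_key(b"Dog") in str_store)    # False: b'dog' != 'dog'
print(casefold_key("Dog") in bytes_store)   # False: 'dog' != b'dog'
```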


