BUG: Set type check is too strict when creating Series from dict keys #36044

Closed
2 of 3 tasks
krassowski opened this issue Sep 1, 2020 · 2 comments · Fixed by #36054
Labels
Bug, Constructors (Series/DataFrame/Index/pd.array Constructors)
Milestone

krassowski commented Sep 1, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

from pandas import Series
Series({'a': 1, 'b': 2}.keys())

Problem description

Since Python 3.7, dictionaries preserve insertion order, and so do their key views. In older pandas versions (prior to 1.1?) it was possible to create a Series from the keys of a dictionary. The fix for #32582, which was intended to prevent an issue with sets, seems to have broken the creation of a Series from dictionary keys.
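
The ordering claim can be checked with the standard library alone. This is a minimal sketch, assuming Python 3.7 or later; the dictionary contents are made up for illustration:

```python
d = {'b': 2, 'a': 1}
keys = d.keys()

# The view iterates in insertion order (guaranteed since Python 3.7),
# not in sorted or arbitrary order.
ordered = list(keys)
print(ordered)  # ['b', 'a']

# The view stays live and keeps the order as new keys are added.
d['c'] = 3
print(list(keys))  # ['b', 'a', 'c']
```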

Currently the following is raised:

/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    325                     data = data.copy()
    326             else:
--> 327                 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
    328 
    329                 data = SingleBlockManager.from_array(data, index)

/pandas/core/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    450         subarr = _try_cast(arr, dtype, copy, raise_cast_failure)
    451     elif isinstance(data, abc.Set):
--> 452         raise TypeError("Set type is unordered")
    453     elif lib.is_scalar(data) and index is not None and dtype is not None:
    454         data = maybe_cast_to_datetime(data, dtype)

TypeError: Set type is unordered

Expected Output

Same as for Series(list({'a': 1, 'b': 2}.keys())), i.e.:

0    a
1    b
dtype: object
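
Until the check is relaxed, converting the keys view to a list first appears to sidestep the error. A minimal sketch of that workaround, assuming pandas 1.1.x as in the report:

```python
from pandas import Series

d = {'a': 1, 'b': 2}

# Materialise the keys view into a list before passing it to Series;
# a list is not an abc.Set, so the "Set type is unordered" check is not hit.
s = Series(list(d.keys()))
print(s)
```

The cost is one extra copy of the keys, which should be negligible for typical dictionary sizes.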

Output of pd.show_versions()

INSTALLED VERSIONS

commit : f2ca0a2
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-42-generic
Version : #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.1.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.2.3
setuptools : 41.2.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

@krassowski added the Bug and Needs Triage labels Sep 1, 2020

krassowski commented Sep 1, 2020

The current check:

from collections import abc

# `data` is the object passed to the Series constructor
if isinstance(data, abc.Set):
    raise TypeError("Set type is unordered")

will also fail on OrderedSet, a recipe linked from the collections.abc documentation (see the "See also" notes at the very bottom of the page).

It will also fail on dict().items(). Both failures, on keys and on items, occur because abc.KeysView and abc.ItemsView each inherit from abc.MappingView and abc.Set. The inheritance from abc.Set is a statement about uniqueness, not about being unordered.
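
To make the inheritance point concrete, here is a small stdlib-only sketch. The helper `is_unordered_set` is hypothetical and only illustrates one possible narrower check; it is not taken from pandas and is not claimed to be what the eventual fix does:

```python
from collections import abc

d = {'a': 1, 'b': 2}

# Both views register as abc.Set, so an abc.Set check rejects them.
assert isinstance(d.keys(), abc.Set)
assert isinstance(d.items(), abc.Set)
# The values view is not a Set, because values need not be unique.
assert not isinstance(d.values(), abc.Set)


def is_unordered_set(data):
    """Hypothetical helper: flag only the built-in unordered set types."""
    return isinstance(data, (set, frozenset))


print(is_unordered_set({'a', 'b'}))  # True
print(is_unordered_set(d.keys()))    # False
print(is_unordered_set(d.items()))   # False
```

Checking against the concrete `set` and `frozenset` types would still reject genuinely unordered input while letting ordered, unique containers such as dictionary views through.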

simonjayhawkins (Member) commented

cc @dsaxton

@dsaxton added the Constructors label and removed the Needs Triage label Sep 1, 2020
@dsaxton added this to the 1.1.2 milestone Sep 1, 2020
3 participants