BUG: duplicate indexing with embedded non-orderables #17610

Closed
gloryfromca opened this Issue Sep 21, 2017 · 8 comments

@gloryfromca
Contributor

gloryfromca commented Sep 21, 2017

The code below shows where the issue occurs.

for user_id, row in pending_data_df.iterrows():

    df = DataFrame(
        columns=[
            Names.call.same_call_counts,
            Names.call.phonebook_detail
        ],
        index=[user_id]
    )

    phoneBookPath = row[Names.app.phonebookpath]
    call_number_list = row.get(Names.call.call_number_list)

Problem description

The error occurs when I try to get a set out of a Series.

This code had worked well for a long time, but it suddenly started failing with the following traceback:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/datascience/zh/suijiesuihuan/datascience/my_threads.py", line 87, in run
ret_data, result = make_decision(int(self.layerId), df_input)
File "/home/datascience/zh/suijiesuihuan/datascience/layered_decision.py", line 52, in make_decision
data_df, result_df = load_features_group_loader(features_group_loader_id, mocking)(data_df)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 166, in inner
return load_and_make_decision(data_df, config_new)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 111, in load_and_make_decision
new_feature_group_df, error_df = _load_data(pending_df, features_loader_id, features_loader, mocking)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 76, in _load_data
data_df, result_df = features_loader(data_df)
File "/home/datascience/zh/suijiesuihuan/dynamic_contents/feature_group_loader/feature_group_loader_0003/feature_group_loader.py", line 55, in load
call_number_list = row.get(Names.call.call_number_list)
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/generic.py", line 1633, in get
return self[key]
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/series.py", line 611, in __getitem__
dtype=self.dtype).__finalize__(self)
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/series.py", line 227, in __init__
"".format(data.__class__.__name__))
TypeError: 'set' type is unordered

Does anyone know what happened?

@nmusolino

Contributor

nmusolino commented Sep 21, 2017

First, it is not clear at all that this is a bug. A bug report typically contains a short, runnable example of the problem, and a description of why the observed behavior is wrong.

I tried to reproduce this with pandas 0.19 but could not.

In [1]: import pandas

In [2]: s = pandas.Series({'a': 0, 'b': 1})

In [3]: s.get(['b', 'a'])
Out[3]:
b    1
a    0
dtype: int64

In [4]: s.get({'b', 'a'})  # note set
Out[4]:
a    0
b    1
dtype: int64

Could you try calling get with a list argument, like this?

        call_number_list = row.get(list(Names.call.call_number_list))
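As a minimal sketch of the list-instead-of-set idea, assuming the same toy Series as in the session above: a set has no defined order, and whether pandas accepts one as a key depends on the pandas version, so converting it to a sorted list first gives a deterministic result. The name `wanted` is only illustrative.

```python
import pandas as pd

s = pd.Series({"a": 0, "b": 1})

# Hypothetical set of labels to look up; sets carry no ordering.
wanted = {"b", "a"}

# Convert to a sorted list so the lookup order is deterministic.
labels = sorted(wanted)

# Label-based selection with a list returns a Series in list order.
result = s.loc[labels]
print(result)
```

This only covers the case where the key itself is a set; it does not address a set stored as a value inside the Series.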

@gfyoung gfyoung added the Can't Repro label Sep 21, 2017

@gfyoung

Member

gfyoung commented Sep 21, 2017

@gloryfromca: Also, could you provide version information (the output of pandas.show_versions())?

@gloryfromca

Contributor

gloryfromca commented Oct 9, 2017

@gloryfromca

Contributor

gloryfromca commented Oct 9, 2017

@jreback

Contributor

jreback commented Oct 9, 2017

I guess this is a bug, though I am not sure why you would ever do this. Embedding non-scalars (e.g. a set) is non-idiomatic. Using duplicate indices requires care as well.

I'll mark it, but it would require a community pull request to fix.

In [2]: s = Series({'1':333,'s':set([1,2,3])})

In [3]: s
Out[3]: 
1          333
s    {1, 2, 3}
dtype: object

In [13]: s2 = s.append(Series({'1':2}))

In [14]: s2
Out[14]: 
1          333
s    {1, 2, 3}
1            2
dtype: object

In [15]: s2[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-c2e7db717c6c> in <module>()
----> 1 s2[1]

~/pandas/pandas/core/series.py in __getitem__(self, key)
    629                         result = self._constructor(
    630                             result, index=[key] * len(result),
--> 631                             dtype=self.dtype).__finalize__(self)
    632 
    633             return result

~/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    239             elif isinstance(data, (set, frozenset)):
    240                 raise TypeError("{0!r} type is unordered"
--> 241                                 "".format(data.__class__.__name__))
    242             else:
    243 

TypeError: 'set' type is unordered
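Until a fix lands, one way to sidestep the failing path is to check for duplicate labels up front and read the embedded set by position, or after dropping the repeated labels. The following is only a sketch under those assumptions, not the eventual pandas fix. It rebuilds a Series with the same shape as `s2` above directly from a list, because `Series.append` is not available in every pandas version.

```python
import pandas as pd

# Same shape as s2 above: object dtype, a set in one cell,
# and the label "1" appearing twice.
s2 = pd.Series([333, {1, 2, 3}, 2], index=["1", "s", "1"], dtype=object)

# 1. Detect duplicate labels before indexing.
has_duplicates = not s2.index.is_unique
duplicate_labels = s2.index[s2.index.duplicated()].unique().tolist()

# 2. Find the cells that hold unordered containers.
set_labels = [label for label, value in s2.items()
              if isinstance(value, (set, frozenset))]

# 3. Read the set by position; .iloc does not consult the labels.
value = s2.iloc[1]

# 4. Or drop repeated labels first (keeping the first occurrence),
#    then look the set up by its label on the now-unique index.
deduplicated = s2[~s2.index.duplicated(keep="first")]
value_by_label = deduplicated.loc["s"]

print(has_duplicates, duplicate_labels, set_labels)
print(value, value_by_label)
```

Dropping duplicates discards data (here the second "1" entry), so that step is only appropriate when the repeated labels are unintended.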

@jreback jreback added this to the Next Major Release milestone Oct 9, 2017

@jreback jreback changed the title from "raise TypeError: 'set' type is unordered when I try to get set from a series" to "BUG: duplicate indexing with embedded non-orderables" Oct 9, 2017

@gloryfromca

Contributor

gloryfromca commented Oct 10, 2017

@jreback

Contributor

jreback commented Oct 10, 2017

> Should I create a pull request for it?

Sure, the contributing docs are at: http://pandas.pydata.org/pandas-docs/stable/contributing.html

@gloryfromca

Contributor

gloryfromca commented Oct 11, 2017
