Series.str.get_dummies fails if one of the categorical variables is called 'name' #12180

Closed
DGrady opened this issue Jan 29, 2016 · 3 comments
Labels
Bug Strings String extension data type and string data

@DGrady
Contributor

DGrady commented Jan 29, 2016

This works as expected:

>>> s = pd.Series(['Name', 'email|Name|address', 'address|email'])
>>> s.str.get_dummies(sep='|')
   Name  address  email
0     1        0      0
1     1        1      1
2     0        1      1

However, if any of the categorical variables is named exactly 'name', then there's a problem.

>>> s = pd.Series(['name', 'email|Name|address', 'address|email'])
>>> s.str.get_dummies(sep='|')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-1a39a6dcd56b> in <module>()
----> 1 s.str.get_dummies(sep='|')

/Users/dgrady/anaconda/envs/python3/lib/python3.5/site-packages/pandas/core/strings.py in get_dummies(self, sep)
   1377         data = self._orig.astype(str) if self._is_categorical else self._data
   1378         result = str_get_dummies(data, sep)
-> 1379         return self._wrap_result(result, use_codes=(not self._is_categorical))
   1380 
   1381     @copy(str_translate)

/Users/dgrady/anaconda/envs/python3/lib/python3.5/site-packages/pandas/core/strings.py in _wrap_result(self, result, use_codes, name)
   1100         if not hasattr(result, 'ndim'):
   1101             return result
-> 1102         name = name or getattr(result, 'name', None) or self._orig.name
   1103 
   1104         if result.ndim == 1:

/Users/dgrady/anaconda/envs/python3/lib/python3.5/site-packages/pandas/core/generic.py in __nonzero__(self)
    729         raise ValueError("The truth value of a {0} is ambiguous. "
    730                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 731                          .format(self.__class__.__name__))
    732 
    733     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This is presumably related to the `_wrap_result` code in pandas/core/strings.py shown in the stack trace.
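The failure can be reproduced without `get_dummies` at all. The sketch below assumes only that the intermediate result is a DataFrame with a column labelled 'name'; the variable names are illustrative and not taken from the pandas source.

```python
import pandas as pd

# A DataFrame shaped like the get_dummies result, with a column labelled 'name'.
result = pd.DataFrame({"Name": [0, 1, 0], "name": [1, 0, 0]})

# A DataFrame has no `name` attribute of its own, so attribute access falls
# back to column lookup and returns the 'name' column as a Series, not None.
attr = getattr(result, "name", None)
print(type(attr).__name__)

# `a or b` evaluates bool(a); bool() of a multi-element Series raises ValueError.
error = None
try:
    name = None or attr or "fallback"
except ValueError as exc:
    error = exc
    print(exc)
```

This mirrors the `name = name or getattr(result, 'name', None) or self._orig.name` line in the traceback: the middle operand is a Series, and the `or` chain tries to take its truth value.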

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 18.5
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 4.0.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.0
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Jinja2: 2.8

@jreback
Contributor

jreback commented Jan 30, 2016

I think this is a bug in `_wrap_result`. The `getattr(result, 'name', None)` should only be checked if 'name' is in `_metadata`; in other words, if it's a Series.
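One way to express that guard is sketched below. `resolve_name` is a hypothetical helper, not a pandas function, and it uses an `isinstance` check in place of the `_metadata` lookup suggested above, so it is an illustration of the idea rather than the fix that was merged.

```python
import pandas as pd


def resolve_name(result, name=None, fallback=None):
    """Pick a name for a wrapped result (hypothetical helper).

    Only Series and Index objects carry a real `name` attribute; on a
    DataFrame, `result.name` would be a column lookup, so it is skipped.
    """
    if name is not None:
        return name
    if isinstance(result, (pd.Series, pd.Index)) and result.name is not None:
        return result.name
    return fallback


# A DataFrame with a 'name' column no longer triggers a truth-value check.
frame = pd.DataFrame({"Name": [0, 1], "name": [1, 0]})
print(resolve_name(frame, fallback="orig"))

# A named Series still contributes its own name.
print(resolve_name(pd.Series([1, 2], name="x"), fallback="orig"))
```

Comparing against `None` instead of chaining `or` also avoids calling `bool()` on anything array-like, at the cost of treating an empty-string name as a valid name.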

@jreback added the Bug, Strings (String extension data type and string data), and Difficulty Novice labels Jan 30, 2016
@jreback added this to the Next Major Release milestone Jan 30, 2016
@jreback
Contributor

jreback commented Jan 30, 2016

@DGrady want to take a crack at it?

@DGrady
Contributor Author

DGrady commented Jan 31, 2016

Absolutely — I'll take a look at it later today.

DGrady added a commit to DGrady/pandas that referenced this issue Feb 1, 2016
@jreback modified the milestones: 0.18.0, Next Major Release Feb 1, 2016
@jreback closed this as completed in 34d9839 Feb 1, 2016
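
The failing case from the report can be turned into a small regression check. This is a sketch that assumes a pandas version containing the fix (0.18.0 or later, going by the milestone above); the expected column order relies on `get_dummies` sorting the distinct tokens, as in the working example at the top of the issue.

```python
import pandas as pd

# Same input as the failing example in the report.
s = pd.Series(["name", "email|Name|address", "address|email"])
dummies = s.str.get_dummies(sep="|")

# One indicator column per distinct token, including the one called 'name'.
expected_columns = ["Name", "address", "email", "name"]
expected_rows = [
    [0, 0, 0, 1],  # 'name'
    [1, 1, 1, 0],  # 'email|Name|address'
    [0, 1, 1, 0],  # 'address|email'
]

assert list(dummies.columns) == expected_columns
assert dummies.values.tolist() == expected_rows
print(dummies)
```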