Regression in series.map? #8024

Closed
jankatins opened this issue Aug 14, 2014 · 8 comments · Fixed by #8026
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions

@jankatins
Contributor

import pandas
from statsmodels import datasets


# load the data and clean it a bit
affairs = datasets.fair.load_pandas()
datas = affairs.exog
# any time greater than 0 is cheating
datas['cheated'] = affairs.endog > 0
# sort by the marriage quality and give meaningful name
# [rate_marriage, age, yrs_married, children,
# religious, educ, occupation, occupation_husb]
datas = datas.sort(['rate_marriage', 'religious'])
num_to_desc = {1: 'awful', 2: 'bad', 3: 'intermediate',
                  4: 'good', 5: 'wonderful'}
datas['rate_marriage'] = datas['rate_marriage'].map(num_to_desc)
num_to_faith = {1: 'non religious', 2: 'poorly religious', 3: 'religious',
                  4: 'very religious'}
datas['religious'] = datas['religious'].map(num_to_faith)
num_to_cheat = {False: 'faithful', True: 'cheated'}
datas['cheated'] = datas['cheated'].map(num_to_cheat)

This is part of the following test, which fails on the pythonxy Ubuntu testing builds:

ERROR: statsmodels.graphics.tests.test_mosaicplot.test_mosaic

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/dist-packages/numpy/testing/decorators.py", line 146, in skipper_func
    return f(*args, **kwargs)
  File "/build/buildd/statsmodels-0.6.0ppa18revno/debian/python-statsmodels/usr/lib/python2.7/dist-packages/statsmodels/graphics/tests/test_mosaicplot.py", line 124, in test_mosaic
    datas['cheated'] = datas['cheated'].map(num_to_cheat)
  File "/usr/lib/pymodules/python2.7/pandas/core/series.py", line 1960, in map
    indexer = arg.index.get_indexer(values)
  File "/usr/lib/pymodules/python2.7/pandas/core/index.py", line 1460, in get_indexer
    if not self.is_unique:
  File "properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:38722)
  File "/usr/lib/pymodules/python2.7/pandas/core/index.py", line 571, in is_unique
    return self._engine.is_unique
  File "index.pyx", line 205, in pandas.index.IndexEngine.is_unique.__get__ (pandas/index.c:4338)
  File "index.pyx", line 234, in pandas.index.IndexEngine._do_unique_check (pandas/index.c:4790)
  File "index.pyx", line 247, in pandas.index.IndexEngine._ensure_mapping_populated (pandas/index.c:4995)
  File "index.pyx", line 253, in pandas.index.IndexEngine.initialize (pandas/index.c:5092)
  File "hashtable.pyx", line 731, in pandas.hashtable.PyObjectHashTable.map_locations (pandas/hashtable.c:12440)
ValueError: Does not understand character buffer dtype format string ('?')


This works on pandas '0.13.1' but not on '0.14.1-202-g7d702e9'.
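
For readers following the traceback: the failing frame in `series.py` is `indexer = arg.index.get_indexer(values)`, which suggests the dict is turned into a Series and the boolean values are looked up in its index. The sketch below only illustrates that lookup pattern on a current pandas release; it is not the actual pandas source, and the names `values`, `lookup`, and `mapped` are made up for illustration.

```python
import pandas as pd

# Illustrative only: mimic the "dict -> Series -> index lookup" pattern
# hinted at by the traceback. Not the real implementation of Series.map.
values = pd.Series([True, False, True, False], name="cheated")

# The dict becomes a Series whose index holds the boolean keys.
lookup = pd.Series({False: "faithful", True: "cheated"})

# Find the position of each boolean value in the lookup index ...
indexer = lookup.index.get_indexer(values.to_numpy())

# ... and pick the corresponding labels by position.
mapped = lookup.to_numpy().take(indexer)
print(list(mapped))
```

The error in the report is raised inside this kind of index lookup, when the hash table for the boolean keys is populated.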
@jreback
Contributor

jreback commented Aug 14, 2014

Can you post the series right before the map, so we can make a test from this?

@jankatins
Contributor Author

import pandas as pd

# minimal reproduction: map a boolean Series through a dict
a = pd.Series([True, False, True, False], name="cheated")
conversion = {False: 'faithful', True: 'cheated'}
a.map(conversion)
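
A regression check built from this snippet might look roughly like the following. This is a hedged sketch, not the test that was added in #8026; the function name `check_bool_series_map` is hypothetical.

```python
import pandas as pd


def check_bool_series_map():
    """Hypothetical regression check for mapping a boolean Series via a dict."""
    a = pd.Series([True, False, True, False], name="cheated")
    conversion = {False: 'faithful', True: 'cheated'}
    result = a.map(conversion)
    # Each boolean should be replaced by its label, in the original order.
    assert result.tolist() == ['cheated', 'faithful', 'cheated', 'faithful']
    # The Series name should survive the mapping.
    assert result.name == "cheated"
    return result


result = check_bool_series_map()
```

Comparing via `tolist()` keeps the check independent of whichever string dtype the installed pandas version infers for the result.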

@jankatins
Contributor Author

Maybe it would be nice to have a CI run that tests the most recent stable release of the major dependent libs (statsmodels,...?) against pandas master?

@jreback
Contributor

jreback commented Aug 14, 2014

We do this for numpy (and numpy does it for pandas).

Sure, though it's a bit non-trivial (see the numpy_master tester) and is not compatible with conda.

@jreback
Contributor

jreback commented Aug 14, 2014

merged

@cpcloud
Member

cpcloud commented Aug 14, 2014

I'm going to start doing nightly linux64 builds of some dev packages. It's possible to test against those. It's not incompatible with conda; you just have to use the GitHub tarball.

@cpcloud
Member

cpcloud commented Aug 14, 2014

There's also a continuum dev channel that you can use, though I haven't looked to see what's there.

@josef-pkt

FYI: I'm looking at the pythonxy nightly test results which use master for several packages, including pandas, patsy and statsmodels.
https://code.launchpad.net/~pythonxy
I never tried to see if there is an automatic way to get the test results out.
