regex option for DataFrame.filter raises error on numeric column names #10506

Closed
cyrusmaher opened this issue Jul 4, 2015 · 2 comments
@cyrusmaher
Contributor

See PR below. Includes tests and release doc edits.

#10384

I'm just having a little trouble resolving a simple two-line merge conflict. I'm in the process of learning, so be gentle =P
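
For context, the error in the title can be reproduced without pandas. The sketch below assumes the pre-fix code handed each column label to the compiled regex unconverted; the names search_raw and search_as_text are hypothetical and are not part of pandas.

```python
import re

matcher = re.compile("1")


def search_raw(label):
    # Hands the label straight to the regex; only works for string labels.
    return matcher.search(label) is not None


def search_as_text(label):
    # Converts the label to text before matching, so integer labels work too.
    return matcher.search(str(label)) is not None


# An integer column name makes the raw search fail with a TypeError.
try:
    search_raw(1)
    raw_error = None
except TypeError as exc:
    raw_error = exc

# After conversion, the integer label 1 matches and 2 does not.
numeric_hits = [label for label in [1, 2, "a1"] if search_as_text(label)]
```

Converting every label to text is the simplest way to make the regex option accept numeric column names.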

jreback added the Indexing and API Design labels Jul 5, 2015
jreback added this to the 0.17.0 milestone Jul 5, 2015
@jreback
Contributor

jreback commented Jul 6, 2015

closed by #10384

jreback closed this as completed Jul 6, 2015
@griai

griai commented May 6, 2016

This change breaks if the DataFrame has unicode column names containing non-ASCII characters. On Python 2, the str(x) call added by the fix tries to encode each label as ASCII.

import pandas as pd
df = pd.DataFrame({u'a': [1, 2, 3], u'ä': [4, 5, 6]})
df.filter(regex=u'a')

This raises a

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-10-9de5a19c260e> in <module>()
----> 1 df.filter(regex=u'a')

C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in filter(self, items, like, regex, axis)
   2013             matcher = re.compile(regex)
   2014             return self.select(lambda x: matcher.search(str(x)) is not None,
-> 2015                                axis=axis_name)
   2016         else:
   2017             raise TypeError('Must pass either `items`, `like`, or `regex`')

C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in select(self, crit, axis)
   1545         if len(axis_values) > 0:
   1546             new_axis = axis_values[
-> 1547                 np.asarray([bool(crit(label)) for label in axis_values])]
   1548         else:
   1549             new_axis = axis_values

C:\Users\...\AppData\Local\Continuum\32bit\Anaconda\envs\test\lib\site-packages\pandas\core\generic.pyc in <lambda>(x)
   2012         elif regex:
   2013             matcher = re.compile(regex)
-> 2014             return self.select(lambda x: matcher.search(str(x)) is not None,
   2015                                axis=axis_name)
   2016         else:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
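
One possible way around this, sketched under assumptions rather than taken from pandas: leave labels that are already text untouched and convert only the others (such as integers). The helper make_label_matcher and the alias text_type are hypothetical names introduced for illustration. The snippet also isolates the step that fails, which is the ASCII encoding that str() performs implicitly on a unicode object under Python 2.

```python
import re

# Text type on either major version: unicode on Python 2, str on Python 3.
text_type = type(u"")


def make_label_matcher(pattern):
    """Hypothetical helper: build a predicate that regex-searches axis labels."""
    matcher = re.compile(pattern)

    def matches(label):
        # Text labels are searched as-is; only non-text labels are converted,
        # so a non-ASCII label is never pushed through an ASCII encode.
        text = label if isinstance(label, text_type) else text_type(label)
        return matcher.search(text) is not None

    return matches


# The failing step in isolation: encoding u'\xe4' (the letter a-umlaut) as ASCII.
try:
    u"\xe4".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Mixed labels: plain text, non-ASCII text, and an integer.
matches_a = make_label_matcher(u"a")
kept = [label for label in [u"a", u"\xe4", 1] if matches_a(label)]
```

Here kept holds only the label u'a', which is what df.filter(regex=u'a') would be expected to select in the example above.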
