API: DataFrame.select_dtypes should accept scalar #16855

Closed
chris-b1 opened this Issue Jul 7, 2017 · 3 comments

@chris-b1
Contributor

chris-b1 commented Jul 7, 2017

In [164]: df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})

In [165]: df.select_dtypes(include='object')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-165-04044faa1a5a> in <module>()
----> 1 df.select_dtypes(include='object')

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in select_dtypes(self, include, exclude)
   2355         include, exclude = include or (), exclude or ()
   2356         if not (is_list_like(include) and is_list_like(exclude)):
-> 2357             raise TypeError('include and exclude must both be non-string'
   2358                             ' sequences')
   2359         selection = tuple(map(frozenset, (include, exclude)))

TypeError: include and exclude must both be non-string sequences

In [166]: df.select_dtypes(include=['object'])
Out[166]: 
   b
0  a
1  b
2  c

Problem description

Only a convenience thing, but basically anywhere else we take list-likes we also accept a single string, and I think we should do the same here.

pandas 0.20.2
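As a rough illustration of the request, the normalization step could look something like the sketch below. It is not the pandas implementation: `as_selection` is a hypothetical helper name, and the only behaviour assumed is that a lone string should be treated as a single dtype name rather than iterated character by character.

```python
from collections.abc import Iterable


def as_selection(value):
    """Hypothetical helper: coerce None, a scalar, or a list-like to a frozenset."""
    if value is None:
        return frozenset()
    # Strings are iterable, but here they name one dtype, so treat them as scalars.
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        return frozenset([value])
    return frozenset(value)


# A single string and a one-element list normalize to the same selection.
include = as_selection('object')
exclude = as_selection(None)
same = include == as_selection(['object'])
```

With this kind of coercion in front of the existing check, `include='object'` and `include=['object']` would follow the same code path.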

@chris-b1 chris-b1 added the API Design label Jul 7, 2017

@chris-b1 chris-b1 added this to the Next Major Release milestone Jul 7, 2017

Contributor

TomAugspurger commented Jul 7, 2017

+100 :) We should do the same for exclude

Contributor

Ffisegydd commented Jul 8, 2017

I was looking at picking this up as one of my first contributions. A quick question for clarity though.

select_dtypes allows strings ('category', 'datetimetz', etc) but also allows numpy.number. For example:

df.select_dtypes(include=['category', numpy.number])

is valid.

My question is: what's the expected behaviour for df.select_dtypes(include=np.number)? The original issue only mentions allowing strings, but it seems silly to exclude np.number. I'll continue on the assumption that scalars like np.number should be accepted too, but it would be good to get some clarity.
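One way to picture the behaviour being asked about is a thin wrapper that wraps any non-list-like argument, string or `np.number`, in a list before delegating to `select_dtypes`. This is only a sketch under that assumption: `select_dtypes_scalar` is a hypothetical name, not part of pandas, and it relies on `pandas.api.types.is_list_like` returning False for strings.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like


def select_dtypes_scalar(df, include=None, exclude=None):
    """Hypothetical wrapper: accept a scalar or a list-like for include/exclude."""
    def wrap(value):
        if value is None:
            return None
        # Strings and type objects such as np.number are not list-like here,
        # so both are wrapped in a one-element list.
        return list(value) if is_list_like(value) else [value]

    return df.select_dtypes(include=wrap(include), exclude=wrap(exclude))


df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})
numeric = select_dtypes_scalar(df, include=np.number)      # keeps column 'a'
non_numeric = select_dtypes_scalar(df, exclude=np.number)  # keeps column 'b'
```

Treating every scalar the same way keeps `include` and `exclude` symmetric, which also matches the suggestion above to handle `exclude` too.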

Contributor

TomAugspurger commented Jul 8, 2017

@chris-b1 chris-b1 changed the title from API: DataFrame.select_dtypes should accept single string to API: DataFrame.select_dtypes should accept scalar Jul 8, 2017

@jreback jreback modified the milestones: 0.21.0, Next Major Release Jul 10, 2017
