Skip to content

Adds string subclass check to dataframe.__getitem__#3461

Merged
mrocklin merged 3 commits intodask:masterfrom
jrbourbeau:dataframe_getitem_subclasses
May 2, 2018
Merged

Adds string subclass check to dataframe.__getitem__#3461
mrocklin merged 3 commits intodask:masterfrom
jrbourbeau:dataframe_getitem_subclasses

Conversation

@jrbourbeau
Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau commented May 2, 2018

This PR adds string subclass support to DataFrame.__getitem__

Fixes #3297

  • Tests added / passed
  • Passes flake8 dask
  • Fully documented, including docs/source/changelog.rst for all changes
    and one of the docs/source/*-api.rst files for new API

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

This seems fine to me.

It's odd seeing six in the codebase, which hasn't been used historically and isn't an explicit dependency. I notice though that it has been used once in dataframe/io/sql.py for the last year though. Thoughts on if we should add it as an explicit dependency of dask[dataframe] @jakirkham or work around with our own compatibility.py file?

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

Thanks for taking this on @jrbourbeau !

It's really nice to see people who solve other peoples' problems :)

@jrbourbeau
Copy link
Copy Markdown
Member Author

Thanks @mrocklin!

@jakirkham
Copy link
Copy Markdown
Member

Have no qualms with six other than it's style is a bit dated. Personally would encourage using a more modern Python 2/3 compat library like python-future or eight. Using our own compat module would be an option.

@jrbourbeau
Copy link
Copy Markdown
Member Author

Good to know about string_types in compatibility.py, I hadn't run across that before! I've got no preference re: six and can switch to using dask.compatibility.string_types here.

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

It looks like we have string_types already in compatibility.py.

@jrbourbeau I recommend using this so that we can just avoid the question altogether.

If you felt like fixing the use of six in dask/dataframe/io/sql.py as well that would be welcome :)

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

Ah, you beat me to the comment

@jrbourbeau jrbourbeau mentioned this pull request May 2, 2018
3 tasks
@jrbourbeau
Copy link
Copy Markdown
Member Author

Getting some OSError: [Errno 12] Cannot allocate memory errors for one of the builds on Travis. Should be unrelated to the changes made in this PR.

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018 via email

@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

This looks good to me. Thanks @jrbourbeau !

@jrbourbeau
Copy link
Copy Markdown
Member Author

I guess those were just spurious errors

@mrocklin mrocklin merged commit 1319fc9 into dask:master May 2, 2018
@mrocklin
Copy link
Copy Markdown
Member

mrocklin commented May 2, 2018

Merged! Thanks @jrbourbeau !

@jrbourbeau jrbourbeau deleted the dataframe_getitem_subclasses branch May 2, 2018 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants