Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added set_names to dask Index. #6344

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Added set_names to dask Index. #6344

wants to merge 2 commits into from

Conversation

matheper
Copy link

@matheper matheper commented Jun 24, 2020

This PR is related to issue #6315.
It aims to add set_names method to Dask Index.

  • Tests added / passed
  • Passes black dask / flake8 dask

result = self.map_partitions(M.set_names, names, level, inplace, meta=meta)
return result

names = property(fset=set_names, fget=_get_names)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't think this is what we want. IIUC, doing

df.index.names = ['a', 'b']

would operate on a new index object, rather than updating df.index. Is that correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it is not covering the user case. I noticed the df.index is a property that always returns a new index, that is why the PR does not support any inplace operation.
I have a new commit that caches the df.index and returns it if it exists, otherwise, it does exactly the same as before.
This way it is possible to have inplace operations, but I am not sure if it will cover all the cases. I'll appreciate any feedback you have. Thanks!

dask/dataframe/tests/test_indexing.py Outdated Show resolved Hide resolved
@TomAugspurger
Copy link
Member

Hmm I didn't appreciate that you were working via a Series / DataFrame rather than index directly.

In that case, I'd recommend implementing / using DataFrame.rename_axis. Would that suffice for your use case?

@matheper
Copy link
Author

matheper commented Jul 3, 2020

I have no specific use case, just tried to replicate what was described in the issue.

From @cgi1 comment, I suppose DataFrame.rename_axis should be enough.

The reason for me, why I actually need to rename the Indexes is, that otherwise the reset_index() fails, because there are two Indexes with the same name (to be merged back from the Index site to the non-Index site - And there can be only one column on the non-Index side per name).

@TomAugspurger
Copy link
Member

Thanks. I think that rename_axis will fit better than adding a method to Index, but perhaps @cgi1 can chime in.

@TomAugspurger
Copy link
Member

@matheper are you interested in working on this still? I think that rename_axis is the better option.

Base automatically changed from master to main March 8, 2021 20:19
@jsignell
Copy link
Member

I think that we should explicitly not support inplace=True which would make this boil down to just a map_partitions I think. @matheper are you still interested in working on this?

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants