API: Deep copy not copying index/columns #4202

Closed
hayd opened this issue Jul 11, 2013 · 13 comments · Fixed by #4039 or #4830

Comments

@hayd
Contributor

hayd commented Jul 11, 2013

http://stackoverflow.com/questions/17591104/in-pandas-can-i-deeply-copy-a-dataframe-including-its-index-and-column/17591423#17591423

This means if we change the index in place, e.g. with name:

df1 = df.copy()
df1.index.name = 'ffg'

This changes the name of the df.index.
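The aliasing described here can be reproduced without pandas. The sketch below is a toy model, not pandas code: `ToyIndex` and `ToyFrame` are hypothetical classes, and `ToyFrame.copy` is written on the assumption that the copy duplicates the data but reuses the index object, which is the behaviour reported in this issue.

```python
import copy


class ToyIndex:
    """Hypothetical stand-in for an index: labels plus a mutable name."""

    def __init__(self, labels, name=None):
        self.labels = tuple(labels)
        self.name = name


class ToyFrame:
    """Hypothetical stand-in for a frame: values plus an index object."""

    def __init__(self, values, index):
        self.values = list(values)
        self.index = index

    def copy(self):
        # Copies the values but hands the *same* index object to the new frame.
        return ToyFrame(copy.deepcopy(self.values), self.index)


df = ToyFrame([1], ToyIndex([0], name="foo"))
df1 = df.copy()
df1.index.name = "ffg"

# Both frames hold one shared index object, so the rename is visible on df too.
print(df.index is df1.index)
print(df.index.name)
```

The values are independent after the copy, but any in-place change to index metadata leaks back to the original.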

@jreback
Contributor

jreback commented Jul 11, 2013

related #4039
cc @jtratner

this is the case where reference tracking to the indices would help (but too complicated)

so just force copy indices here

@hayd
Contributor Author

hayd commented Jul 11, 2013

@jreback do you see the index.name thing? apparently not everyone does (my computer is sulking today, so not trusting it): http://stackoverflow.com/a/17591435/1240268

@hayd
Contributor Author

hayd commented Jul 11, 2013

Ah I see, I think that is being done after (which creates a new object):

df2.index = df2.index[::-1]

@jtratner
Contributor

Btw - this should be resolved by shallow copying indices, since they are supposed to be mostly immutable. That allows you to change metadata. After #4039, MultiIndex and Index will handle this transparently when copied with view() or anything that triggers __array_finalize__, __setstate__, etc.
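A rough illustration of why a shallow copy is enough when the underlying data is immutable. This is a toy sketch using the standard library's `copy.copy`, not the pandas implementation; `ToyIndex` is a hypothetical class introduced only for this example.

```python
import copy


class ToyIndex:
    """Hypothetical index: immutable labels, mutable metadata (name)."""

    def __init__(self, labels, name=None):
        self.labels = tuple(labels)  # immutable, so safe to share
        self.name = name             # metadata that callers may change


idx = ToyIndex(["a", "b"], name="foo")

# copy.copy builds a new object whose attributes point at the same values.
shallow = copy.copy(idx)
shallow.name = "bar"

print(shallow.labels is idx.labels)  # label data is shared, not duplicated
print(idx.name, shallow.name)        # names are now independent
```

Because the labels cannot be mutated in place, sharing them is harmless, while each copy gets its own slot for the name.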

@hayd
Contributor Author

hayd commented Jul 31, 2013

@jtratner so this is fixed with #4039 ? (if not, will fix once that's merged)

@jtratner
Contributor

Yes.

@jtratner
Contributor

@hayd I believe this is all fixed and can be closed, yes?

@hayd
Contributor Author

hayd commented Sep 12, 2013

@jtratner test is this:

In [11]: df = pd.DataFrame([1])

In [12]: df.index.name = 'foo'

In [13]: df1 = df.copy()

In [14]: df1.index.name = 'bar'

In [15]: df
Out[15]: 
     0
bar   
0    1

@jtratner
Contributor

I will fix that.

@jtratner
Contributor

@jreback Where is the best opportunity to actually copy an index (both for rows and columns) with the block manager? I think it just needs to be shallow or deep-copied each time and then passed as the new ref_item to the block copying.

@jreback
Contributor

jreback commented Sep 12, 2013

right now there is NO deep copying of the index in BM. I think you just need to change BlockManager.copy to something like this:

    def copy(self, deep=True):
        """
        Make deep or shallow copy of BlockManager

        Parameters
        ----------
        deep : boolean, default True
            If False, return shallow copy (do not copy data)

        Returns
        -------
        copy : BlockManager
        """
        if deep:
            new_axes = [ax.copy(deep=True) for ax in self.axes]
        else:
            new_axes = list(self.axes)
        return self.apply('copy', axes=new_axes, deep=deep, do_integrity_check=False)
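The same idea, copying each axis on a deep copy and reusing the axes on a shallow one, can be tried in isolation. The sketch below is a pandas-free toy: `ToyAxis` and `ToyManager` are hypothetical classes, and nothing here reflects the real `BlockManager` internals beyond the `if deep:` branch shown above.

```python
import copy


class ToyAxis:
    """Hypothetical axis: immutable labels plus a mutable name."""

    def __init__(self, labels, name=None):
        self.labels = tuple(labels)
        self.name = name

    def copy(self):
        # New axis object; the label tuple is immutable so it can be reused.
        return ToyAxis(self.labels, self.name)


class ToyManager:
    """Hypothetical manager holding data blocks and a list of axes."""

    def __init__(self, blocks, axes):
        self.blocks = blocks
        self.axes = axes

    def copy(self, deep=True):
        if deep:
            new_axes = [ax.copy() for ax in self.axes]
            new_blocks = copy.deepcopy(self.blocks)
        else:
            new_axes = list(self.axes)  # new list, same axis objects
            new_blocks = self.blocks
        return ToyManager(new_blocks, new_axes)


mgr = ToyManager([[1]], [ToyAxis([0], name="cols"), ToyAxis([0], name="foo")])

deep = mgr.copy(deep=True)
deep.axes[1].name = "bar"
print(mgr.axes[1].name)  # the original name is untouched after a deep copy

shallow = mgr.copy(deep=False)
print(shallow.axes[1] is mgr.axes[1])  # a shallow copy still shares axes
```

With this shape, renaming an axis on a deep copy leaves the original alone, while a shallow copy keeps sharing axis objects by design.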

@jtratner
Contributor

@jreback ah okay, there it is...now I remember we were hitting this issue before too...need to make sure the ref_items get passed to the copy constructor.

I think this copy needs a separate kwarg for whether index should be deep copied (because, generally, index always needs to be shallow copied).

@jreback
Contributor

jreback commented Sep 12, 2013

@jtratner I don't know, I think it's easiest to just copy everything on deep copy?
