Int64Index and Float64Index (and... ?) do not propagate name of passed index #12309

Closed
toobaz opened this issue Feb 12, 2016 · 13 comments
Labels
Compat pandas objects compatability with Numpy or Python functions Indexing Related to indexing on series/frames, not to indexes themselves

@toobaz
Member

toobaz commented Feb 12, 2016

    In [8]: i = Index(range(3),name='foo')

    In [9]: Index(i)
    Out[9]: Int64Index([0, 1, 2], dtype='int64', name=u'foo')

    In [10]: pd.Int64Index(i)
    Out[10]: Int64Index([0, 1, 2], dtype='int64')

The same happens with Float64Index.
Ideally, the tests should move to Base so that all index types are covered.
See #12288.
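
A check along the lines suggested above, run over several index types, might look like the following sketch. It assumes a recent pandas in which `Int64Index` and `Float64Index` may no longer be public classes, so it goes through `pd.Index` with an explicit dtype instead. The helper name `name_survives` and the `constructors` mapping are hypothetical and not part of pandas.

```python
import pandas as pd


def name_survives(constructor, name="foo"):
    """Return True if rebuilding an index via `constructor` keeps its name."""
    original = pd.Index(range(3), name=name)
    rebuilt = constructor(original)
    return rebuilt.name == name


# Constructors to exercise. The dtype-specific entries are stand-ins
# (an assumption) for the Int64Index / Float64Index classes in the report.
constructors = {
    "Index": pd.Index,
    "int64": lambda idx: pd.Index(idx, dtype="int64"),
    "float64": lambda idx: pd.Index(idx, dtype="float64"),
}

results = {label: name_survives(ctor) for label, ctor in constructors.items()}
print(results)
```

Any entry that maps to `False` points at a constructor that drops the name of the index passed to it.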

jreback added the Indexing, Difficulty Novice, and Compat labels Feb 12, 2016
jreback added this to the Next Major Release milestone Feb 12, 2016
@toobaz
Member Author

toobaz commented Feb 15, 2016

Related question: are the following behaviours, currently in master, desired?

  • explicitly given datatype is respected in Int64Index but not in Float64Index:

    In [3]: Int64Index(np.arange(3, dtype=np.int32), dtype=np.int32).dtype
    Out[3]: dtype('int32')
    
    In [4]: Float64Index(np.arange(3, dtype=np.float32), dtype=np.float32).dtype
    Out[4]: dtype('float64')
    
  • data dtype is not respected by default (even when it could be):

    In [5]: Int64Index(np.arange(3, dtype=np.int32)).dtype
    Out[5]: dtype('int64')
    

@jreback
Contributor

jreback commented Feb 15, 2016

@toobaz No, those are as expected: by default these always coerce to float64/int64 (or raise an error if that's impossible). We technically allow coercing to a smaller dtype size (e.g. 32-bit).

I don't think it's ever useful to have an Int64Index with a dtype of int32 (it also breaks some things); IIRC I added it a while back for compat reasons. You can take this out and see what, if anything, breaks, and we can simply disallow it (and cast as needed).

@toobaz
Member Author

toobaz commented Feb 15, 2016

Not sure I understood correctly: OK for coercing to 64 bits both floats and ints by default. But what about a different, explicit, coercion (i.e. dtype=np.float32)? I understand you're saying that supporting this may make sense for floats but not for ints, right? (the current behaviour is the opposite)

@jreback
Contributor

jreback commented Feb 15, 2016

I am saying it exists for both now, but doesn't make sense for either. IIRC I added this for ints, but I don't really remember why it was needed. Ideally we should just drop this support, as it's unnecessary and confusing, and coerce things like np.float32 -> float64, and similarly for ints.

@toobaz
Member Author

toobaz commented Feb 15, 2016

OK, so the dtype argument would become useless (at least for those two index types), right? Should it be deprecated? Or do we prefer to leave it for compatibility?

@jreback
Contributor

jreback commented Feb 15, 2016

It's for compatibility and assertions (IOW if you pass something and it cannot be coerced, then you raise; this already happens now).

@toobaz
Member Author

toobaz commented Feb 15, 2016

So for instance in Int64Index with dtype=np.int32, I should check I can cast data to np.int32, just for the sake of asserting... and then store as np.int64 anyway?! Seems a bit awkward.

@jreback
Contributor

jreback commented Feb 15, 2016

No, I would simply cast to int64. I suspect some things might break currently if you change this. IIRC it might be a platform issue; I don't remember.
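
The "always cast to 64 bits, raise if that's impossible" behaviour discussed above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the pandas implementation: the helper `coerce_to_int64` is hypothetical, and treating "impossible" as "the values would change" is one possible reading.

```python
import numpy as np


def coerce_to_int64(data):
    """Hypothetical helper: always return an int64 array, or raise."""
    arr = np.asarray(data)
    out = arr.astype(np.int64)
    # Refuse conversions that would alter the values, e.g. 1.5 -> 1.
    if not np.array_equal(out, arr):
        raise TypeError("cannot losslessly cast data to int64")
    return out


small = np.arange(3, dtype=np.int32)
print(coerce_to_int64(small).dtype)  # int64
```

With this reading, a `dtype=np.int32` argument would have no effect on how the data is stored: the input is widened to int64 regardless.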

toobaz added a commit to toobaz/pandas that referenced this issue Feb 22, 2016
jreback added a commit to jreback/pandas that referenced this issue Apr 16, 2016
jreback added a commit that referenced this issue Apr 18, 2016
closes #12881
closes #12866
xref #12309

Author: Jeff Reback <jeff@reback.net>

Closes #12899 from jreback/astype and squashes the following commits:

3c1800b [Jeff Reback] BUG: .astype() of a Float64Index to a Int64Index
toobaz mentioned this issue May 17, 2016
jreback modified the milestones: 0.18.2, Next Major Release Jun 1, 2016
@max-sixty
Contributor

@toobaz Is this related, or a new issue?

    In [8]: s2=pd.Series(2., index=pd.PeriodIndex(start='1995-01-02', end='2016-06-30', freq='B', name='date'))

    In [10]: s1=pd.Series(3., index=pd.PeriodIndex(start='1995-01-02', end='2016-06-03', freq='B', name='date'))

    In [12]: s3=s1*s2

    In [13]: s3
    Out[13]:
    1995-01-02    6.0
    1995-01-03    6.0
    1995-01-04    6.0
    1995-01-05    6.0
    ...
    2016-06-28    NaN
    2016-06-29    NaN
    2016-06-30    NaN
    Freq: B, dtype: float64

    In [14]: s3.index.name
    #none

@toobaz
Member Author

toobaz commented Jun 2, 2016

@MaximilianR Same bug (not that the code paths are perfectly clear to me, but it behaves fine in my branch).

@jreback
Contributor

jreback commented Jun 2, 2016

@toobaz so just add this as a test in your PR

@toobaz
Member Author

toobaz commented Jun 2, 2016

@jreback I thought you didn't want that PR to grow too much :-)

Seriously: the code path leading to PeriodIndex.__new__() isn't clear to me, but the difference is clearly inside it, so we wouldn't be testing anything "logically new". I can add the check for .name in a test on the multiplication of series though...

@jreback
Contributor

jreback commented Jun 2, 2016

Oh, right, this is about Periods. If it works, just include the additional test. Periods are a little bit in flux now.
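
The additional test mentioned here could be sketched roughly as follows. It assumes a recent pandas, where `PeriodIndex(start=..., end=...)` is no longer accepted, so `pd.period_range` is used instead, and it uses daily rather than business-day frequency to keep the sketch simple. The variable names `idx_long` and `idx_short` are illustrative only.

```python
import pandas as pd

# Two period indexes sharing the same name but covering different spans.
idx_long = pd.period_range("1995-01-02", "2016-06-30", freq="D", name="date")
idx_short = pd.period_range("1995-01-02", "2016-06-03", freq="D", name="date")

s2 = pd.Series(2.0, index=idx_long)
s1 = pd.Series(3.0, index=idx_short)

# Multiplication aligns the two series on the union of their indexes.
s3 = s1 * s2

# The check under discussion: the shared index name should survive.
print(s3.index.name)
```

If the name is dropped during alignment, `s3.index.name` comes back as `None`, which is the symptom reported in the session above.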
