Transpose modifies dtype of index, when a PeriodIndex #692

@max-sixty

This is very peculiar & specific, but also fairly impactful for us.

If you

  • Create a Dataset with a coord that is a PeriodIndex
  • Transpose the Dataset (ds.transpose('date'))
  • Add a variable to the Dataset that needs to be reindexed

...then the dtype of the index changes from object to int64. Other arrays subsequently added to that dataset then show up as NaNs throughout.

Here's an example. Note the dtype at the end of each output: dtype('O') for the first three, dtype('int64') for the last.

In [61]:
import numpy as np
import pandas as pd
import xray

series = pd.Series(np.random.rand(10), index=pd.period_range(start='2000', periods=10, name='date'))

ds = xray.Dataset({'number 1':series})
ds['number 2'] = ds['number 1']
ds, ds.date.dtype
Out[61]:
(<xray.Dataset>
 Dimensions:   (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))
In [62]:
ds = ds.transpose('date')
ds, ds.date.dtype
Out[62]:
(<xray.Dataset>
 Dimensions:   (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))
In [63]:
ds['number 3'] = ds['number 1']
ds, ds.date.dtype
Out[63]:
(<xray.Dataset>
 Dimensions:   (date: 10)
 Coordinates:
   * date      (date) object 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 3  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...,
 dtype('O'))
In [64]:
ds['number 4'] = ds['number 1'][:5]
ds, ds.date.dtype
Out[64]:
(<xray.Dataset>
 Dimensions:   (date: 10)
 Coordinates:
   * date      (date) int64 10957 10958 10959 10960 10961 10962 10963 10964 ...
 Data variables:
     number 1  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 2  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 3  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 0.6723 ...
     number 4  (date) float64 0.1133 0.5952 0.5467 0.2035 0.2022 nan nan nan ...,
 dtype('int64'))
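
The two index representations can also be compared outside xray. The following is only a rough sketch using pandas and numpy, not a trace of what xray does internally: freq="D" is an assumption (it matches the ordinals 10957, 10958, ... printed above), and the names as_objects, as_ordinals and shared are made up for illustration. It shows that the object-dtype labels are Period objects, the int64 labels are their bare ordinals, and the two sets of labels have nothing in common, which is one plausible reason label-based alignment against the converted index could come back as NaN.

```python
import numpy as np
import pandas as pd

# Same index as in the example above; freq="D" is an assumption spelled out here.
periods = pd.period_range(start="2000", periods=10, freq="D", name="date")

# Labels as they look before the conversion: Period objects, dtype object.
as_objects = periods.astype(object)

# Labels as they look afterwards: the bare integer ordinals, dtype int64.
as_ordinals = np.array([p.ordinal for p in periods], dtype=np.int64)

print(as_objects.dtype, as_ordinals.dtype)
print(as_ordinals[:3])

# A Period never compares equal to a plain integer, so the two label
# sets share no members even though they describe the same dates.
shared = set(as_objects) & set(as_ordinals.tolist())
print(len(shared))
```

Checking ds.date.dtype after each assignment, as in the outputs above, is still the most direct way to see at which step the conversion happens.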
