API: enable set_levels/set_names/set_labels to accept a list and a level argument to change a single level/value #7792

Closed
Poquaruse opened this Issue Jul 18, 2014 · 3 comments

e.g.

set_names(self, names, inplace=False, level=None)
set_levels(self, levels, level=None, copy=False, validate=True, verify_integrity=False)

If level is not None, treat levels/names as a list (or a list of lists if level is itself a list) and set only those levels.

e.g.

set_names('foo',level=1)
set_names(['foo','bar'],level=[1,2])
set_levels(['a','b','c'],level=1)
set_levels([['a','b','c'],[1,2,3]],level=[1,2])
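
The proposed calls can be tried out as follows. This is only a sketch, assuming a pandas release in which `MultiIndex.set_names` and `MultiIndex.set_levels` accept the `level` argument; the three-level index and its names are made up for illustration.

```python
import pandas as pd

# Illustrative three-level index (not from the original report).
mi = pd.MultiIndex.from_product(
    [["x", "y"], [1, 2], ["p", "q"]],
    names=["first", "second", "third"],
)

# Rename a single level, or several levels at once.
renamed_one = mi.set_names("foo", level=1)
renamed_two = mi.set_names(["foo", "bar"], level=[1, 2])

# Replace the values of a single level, or of several levels at once.
# Each replacement needs at least as many entries as the level it replaces.
relevel_one = mi.set_levels(["a", "b"], level=1)
relevel_two = mi.set_levels([["a", "b"], [10, 20]], level=[1, 2])

print(renamed_two.names)
print(relevel_two.levels)
```

Each call returns a new index and leaves `mi` untouched, so the result has to be assigned somewhere to take effect.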

Hi all,

First of all: I'm not sure whether this is a bug or whether there is just no nice way to do this. I'd like to use tz_convert on a DataFrame with a MultiIndex in pandas 0.14.1.

import numpy as np
import pandas as pd

dt_rng = pd.date_range(start='2014-01-01 00:00', periods=1000, freq='1s', tz='Europe/Berlin')
df = pd.DataFrame({'a': np.random.randn(1000), 'b': np.random.randn(1000)}, index=dt_rng)
df['b'] = df['b'].round()
df = df.groupby('b').resample('1h')
# Fails: index.levels is a FrozenList and cannot be assigned to
df.index.levels[1] = df.index.levels[1].tz_convert('UTC')

This does not work: 'FrozenList' does not support mutable operations.
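
The failure can be reproduced without the groupby/resample step. The sketch below assumes a small hand-built MultiIndex (not the one from the report) and only shows that item assignment on `index.levels` is rejected.

```python
import pandas as pd

# Small illustrative index; the values are arbitrary.
mi = pd.MultiIndex.from_product([["x", "y"], [1, 2]])

try:
    # levels is a FrozenList, so item assignment is not allowed.
    mi.levels[1] = pd.Index([10, 20])
    assignment_failed = False
    message = ""
except TypeError as exc:
    assignment_failed = True
    message = str(exc)

print(assignment_failed, message)
```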

What I currently do:

dt_rng = pd.date_range(start='2014-01-01 00:00', periods = 1000, freq='1s', tz='Europe/Berlin')
df = pd.DataFrame({'a':np.random.randn(1000), 'b': np.random.randn(1000)},index = dt_rng)
df['b'] = df['b'].round()
df = df.groupby('b').resample('1h')
df.index.set_levels([
    df.index.levels[0],
    df.index.levels[1].tz_convert('UTC')
],inplace=True)

This does work as expected, but is it really the way to do it?!
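
With the `level` argument proposed at the top of this issue, only the datetime level would need to be passed. A minimal sketch, assuming a pandas release that supports `set_levels(..., level=...)`; the small frame and the level name `time` are stand-ins for the resampled data above.

```python
import pandas as pd

# Stand-in for the grouped/resampled frame: level 0 is 'b', level 1 is a
# tz-aware datetime level (the name 'time' is an assumption).
times = pd.to_datetime(["2014-01-01 00:00", "2014-01-01 01:00"]).tz_localize("Europe/Berlin")
index = pd.MultiIndex.from_product([[-1.0, 0.0], times], names=["b", "time"])
df = pd.DataFrame({"a": [0.1, 0.2, 0.3, 0.4]}, index=index)

# Convert only level 1 and assign the new index back; level 0 is untouched.
df.index = df.index.set_levels(df.index.levels[1].tz_convert("UTC"), level=1)

print(df.index.levels[1])
```

Assigning the returned index back avoids relying on `inplace=True`, which is not available in every pandas version.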

Thanks and best regards!

PS: INSTALLED VERSIONS

commit: None
python: 3.4.1.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: DE

pandas: 0.14.1
nose: 1.3.3
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.14.0
statsmodels: None
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.1
pytz: 2014.4
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.5
lxml: 3.3.5
bs4: 4.3.1
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.4
pymysql: None
psycopg2: None

Contributor

jreback commented Jul 18, 2014

This is a bit buggy at the moment (because of an issue with resetting a MultiIndex when it has a tz). If it were not buggy, you could normally do something like:

df.reset_index('b').tz_convert('UTC').set_index(['b'],append=True)

which I think is reasonable.

Maybe set_levels could take a dict of level_name -> new_level or something

e.g.

index.set_levels({ 1 : new_levels })
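
The dict idea can be emulated with a small wrapper. `set_levels_from_dict` below is a hypothetical helper, not a pandas function, and it assumes a pandas release where `set_levels` accepts a `level` argument.

```python
import pandas as pd


def set_levels_from_dict(index, new_levels):
    """Hypothetical helper: replace only the levels given as dict keys.

    Keys are level positions (or names); values are the new level values.
    """
    positions = list(new_levels.keys())
    values = [new_levels[p] for p in positions]
    # Delegate to set_levels with matching lists of levels and values.
    return index.set_levels(values, level=positions)


mi = pd.MultiIndex.from_product([["x", "y"], [1, 2]])
out = set_levels_from_dict(mi, {1: [10, 20]})
print(out)
```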

jreback added this to the 0.15.0 milestone Jul 18, 2014

Thanks for letting me know. You're always insanely fast to answer. :-)

I don't really understand that piece of code you posted, but I'll look into it. So far, the above version works...

Contributor

jreback commented Jul 19, 2014

ok, after #7746 was merged, this works.

In [19]: df.reset_index('b').tz_convert('UTC').set_index(['b'],append=True)
Out[19]: 
                                     a
                          b           
2013-12-31 23:00:00+00:00 -4  0.604865
                          -3 -0.122167
                          -2  0.198013
                          -1 -0.063699
                           0 -0.032018
                           1 -0.076646
                           2  0.015499
                           3 -0.479551

In [20]: df.reset_index('b').tz_convert('UTC').set_index(['b'],append=True).index
Out[20]: 
MultiIndex(levels=[[2013-12-31 23:00:00+00:00], [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7]],
           names=[None, u'b'])

In [21]: df.reset_index('b').tz_convert('UTC').set_index(['b'],append=True).index.levels[0]
Out[21]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-12-31 23:00:00+00:00]
Length: 1, Freq: None, Timezone: UTC
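
The same round trip can be run on a small hand-built frame. This is a sketch under the assumption that the datetime level is unnamed and `b` is the other level, as in the session above; the data values are arbitrary.

```python
import pandas as pd

# Hand-built stand-in: level 'b' plus an unnamed tz-aware datetime level.
times = pd.to_datetime(["2014-01-01 00:00", "2014-01-01 01:00"]).tz_localize("Europe/Berlin")
index = pd.MultiIndex.from_product([[-1.0, 0.0, 1.0], times], names=["b", None])
df = pd.DataFrame({"a": range(6)}, index=index)

# Move 'b' out of the index, convert the remaining DatetimeIndex,
# then append 'b' again as the inner level.
out = df.reset_index("b").tz_convert("UTC").set_index("b", append=True)

print(out.index.names)
print(out.index.levels[0])
```

Note that the level order changes: the datetime level ends up first and `b` second, as in the output of `In [20]`.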

But I think it might be nice if set_levels/set_names could take a dict to change ONLY a single level.

jreback changed the title from tz_convert() does not work for MultiIndex DataFrames to API: enable set_levels/set_names to accept a dict for a single level change Jul 19, 2014

jreback modified the milestone: 0.15.1, 0.15.0 Jul 19, 2014

jreback changed the title from API: enable set_levels/set_names to accept a dict for a single level change to API: enable set_levels/set_names to a list and a level argument to change a single level/value Jul 28, 2014

jreback changed the title from API: enable set_levels/set_names to a list and a level argument to change a single level/value to API: enable set_levels/set_names to accept a list and a level argument to change a single level/value Jul 28, 2014

jreback changed the title from API: enable set_levels/set_names to accept a list and a level argument to change a single level/value to API: enable set_levels/set_names/set_labels to accept a list and a level argument to change a single level/value Jul 28, 2014

jreback modified the milestone: 0.15.1, 0.15.0 Jul 30, 2014

jreback closed this in #7874 Jul 30, 2014
