Description
Code Sample, a copy-pastable example if possible
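import numpy as np
import pandas as pd
np.random.seed(42)
# 1000 random integers drawn from 0..998
nums = np.random.choice(range(0, 999), 1000)
# Bin the values into 10 equal-width intervals and count each bin
cut = pd.cut(nums, 10)
discretization = cut.value_counts()
discretization.sort_index(inplace=True)
# Midpoint of each interval, in index order
intervals = list(discretization.index)
mids = [i.mid for i in intervals]
# Expected: a Series indexed by the midpoints
print(discretization.reindex(index=mids))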
Output:
(-0.998, 99.8] 100
(99.8, 199.6] 103
(199.6, 299.4] 97
(299.4, 399.2] 94
(399.2, 499.0] 97
(499.0, 598.8] 85
(598.8, 698.6] 118
(698.6, 798.4] 103
(798.4, 898.2] 108
(898.2, 998.0] 95
dtype: int64
Problem description
Series.reindex() returns the original Series, still indexed by the intervals, even though a different index (the interval midpoints) was passed.
Expected Output
print(pd.Series(discretization.values, index=mids))
produces:
49.401 100
149.700 103
249.500 97
349.300 94
449.100 97
548.900 85
648.700 118
748.500 103
848.300 108
948.100 95
dtype: int64
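As a possible workaround, here is a minimal sketch that produces the expected output without calling reindex() at all. It relies on the fact that reindex() is documented to conform data to new labels by matching them against the existing ones, whereas the result wanted here is a positional relabelling. The names rng, counts and relabelled are illustrative and not from the report, and this has not been confirmed as the recommended fix.

```python
import numpy as np
import pandas as pd

# Same data as the report, but with a local RandomState
# instead of seeding the global generator.
rng = np.random.RandomState(42)
nums = rng.choice(range(0, 999), 1000)

# Count values per bin and order the bins from lowest to highest.
counts = pd.cut(nums, 10).value_counts().sort_index()

# Midpoint of each interval label, in the same order as the counts.
mids = [interval.mid for interval in counts.index]

# Relabel positionally: keep the values, swap the labels.
relabelled = pd.Series(counts.values, index=mids)

print(relabelled)
```

Building a new Series from the values sidesteps label matching entirely, so the result does not depend on how reindex() handles the interval-based index.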
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.39-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.3
pytest: 3.1.3
pip: 9.0.1
setuptools: 36.2.2
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0