Vectorised addition of MonthOffset(n=0) returns different values to item-by-item addition #11370
rekcahpassyla changed the title from "Vectorised addition of `MonthOffset` returns different values to item-by-item addition" to "Vectorised addition of `MonthOffset(n=0)` returns different values to item-by-item addition" Oct 19, 2015
This is from #10744; I didn't have the n=0 semantics right (and apparently didn't test!). It'll be a couple of days, but I'll submit a fix.
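For readers unfamiliar with the n=0 semantics mentioned above: the item-by-item results reported in this issue show each January date rolling forward to 2011-01-31. The following is a minimal plain-Python sketch of that roll-forward behaviour for month ends only. The helper name `month_end_n0` is hypothetical and this is not pandas' implementation; it assumes that a date already on a month end is left unchanged.

```python
import calendar
from datetime import date


def month_end_n0(d):
    """Hypothetical helper: roll d forward to the last day of its month.

    Assumption: a date that is already a month end is returned unchanged.
    """
    # monthrange returns (weekday of first day, number of days in month)
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day)


# The same five dates used in the test script in this issue
dates = [date(2011, 1, day) for day in range(1, 6)]
rolled = [month_end_n0(d) for d in dates]
print(rolled)
```

Under this reading, a vectorised addition should produce the same five month-end dates rather than stepping back to the previous month end.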
jreback added Bug Frequency labels Oct 19, 2015
jreback added this to the 0.17.1 milestone Oct 19, 2015
Many thanks for the quick response!
Test script

import pandas as pd
from pandas.util.testing import assert_index_equal

pd.show_versions()

offsets = [
    pd.offsets.MonthEnd,
    pd.offsets.QuarterEnd,
    pd.offsets.YearEnd,
]
dates = pd.date_range('2011-01-01', '2011-01-05', freq='D')

for offset in offsets:
    # adding each item individually or vectorised should give same answer
    expected_vec = dates + offset(n=0)
    expected = pd.DatetimeIndex([d + offset(n=0) for d in dates])
    msg = "offset: {}, vectorised: {}, individual: {}".format(
        offset, expected_vec, expected
    )
    try:
        if pd.__version__ == '0.17.0':
            assert_index_equal(expected_vec, expected, check_names=False)
        else:
            assert_index_equal(expected_vec, expected)
    except AssertionError as er:
        raise Exception(msg + str(er))

0.17.0

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.17.0
nose: 1.3.7
pip: 7.1.0
setuptools: 18.0.1
Cython: 0.22
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.1
pytz: 2015.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None
Traceback (most recent call last):
File "c:\dev\code\sandbox\pandas_17_vs_15_dateoffsets.py", line 34, in <module>
raise Exception(msg + str(er))
Exception: offset: <class 'pandas.tseries.offsets.MonthEnd'>, vectorised: DatetimeIndex(['2010-12-31', '2010-12-31', '2010-12-31', '2010-12-31',
'2010-12-31'],
dtype='datetime64[ns]', freq=None), individual: DatetimeIndex(['2011-01-31', '2011-01-31', '2011-01-31', '2011-01-31',
'2011-01-31'],
dtype='datetime64[ns]', freq=None)Index are different
Index values are different (100.0 %)
[left]: DatetimeIndex(['2010-12-31', '2010-12-31', '2010-12-31', '2010-12-31',
'2010-12-31'],
dtype='datetime64[ns]', freq=None)
[right]: DatetimeIndex(['2011-01-31', '2011-01-31', '2011-01-31', '2011-01-31',
'2011-01-31'],
dtype='datetime64[ns]', freq=None)

0.15.2

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 26 Stepping 5, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_GB
pandas: 0.15.2
nose: 1.3.7
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: None
IPython: 3.2.1
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.4
xlwt: 0.7.5
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 1.0.7
pymysql: None
psycopg2: None
Probably also wrong for On Mon, Oct 19, 2015 at 10:31 AM, Petra Chong notifications@github.com
rekcahpassyla commented Oct 19, 2015
This code returns different values in 0.17.0 and 0.15.2.