DataFrame.ffill behaves different than DataFrame.interpolate(method='ffill') along axes #12918

EVaisman · 2016-04-18T03:07:51Z

It looks like test_df.ffill(axis=0) has the same behavior as test_df.interpolate(method='ffill', axis=1).

from pandas.util.testing import assert_frame_equal
import numpy as np
import pandas as pd

n = np.nan
test_df = pd.DataFrame([[0, 2, n, n],
                        [1, n, 4, 6],
                        [n, 3, 5, n]])

assert_frame_equal(
    test_df.interpolate(method='ffill', axis=1),
    test_df.ffill(axis=0),
)

Is this the desired behavior?

In [2]: pandas.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3.1
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.2
pytz: 2016.3
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5
matplotlib: 1.4.3
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.7.3
boto: 2.39.0


jorisvandenbossche · 2016-04-18T08:41:46Z

I suppose you mean the axis=1 vs axis=0 to obtain the same result? (or is there something else you didn't expect?).
At first sight, that seems like a bug to me.

For other methods (eg the default 'linear'), the direction is as expected (axis=0 -> filling per column), but for method='ffill' it is swapped.

In [77]: test_df
Out[77]:
     0    1    2    3
0  0.0  2.0  NaN  NaN
1  1.0  NaN  4.0  6.0
2  NaN  3.0  5.0  NaN

In [78]: test_df.interpolate()
Out[78]:
     0    1    2    3
0  0.0  2.0  NaN  NaN
1  1.0  2.5  4.0  6.0
2  1.0  3.0  5.0  6.0

In [81]: test_df.interpolate(axis=0, method='linear')
Out[81]:
     0    1    2    3
0  0.0  2.0  NaN  NaN
1  1.0  2.5  4.0  6.0
2  1.0  3.0  5.0  6.0

In [82]: test_df.interpolate(axis=0, method='ffill')
Out[82]:
     0    1    2    3
0  0.0  2.0  2.0  2.0
1  1.0  1.0  4.0  6.0
2  NaN  3.0  5.0  5.0
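
For reference, here is a small sketch of what the two axes mean for DataFrame.ffill on the same frame. The expected frames are written out by hand; only DataFrame.ffill and pandas.testing.assert_frame_equal are used, and the variable names are just illustrative. The row-wise result is the one that matches Out[82] above.

```python
import numpy as np
import pandas as pd

n = np.nan
test_df = pd.DataFrame([[0, 2, n, n],
                        [1, n, 4, 6],
                        [n, 3, 5, n]])

# axis=0: propagate the last valid value down each column.
down_columns = test_df.ffill(axis=0)
expected_down = pd.DataFrame([[0, 2, n, n],
                              [1, 2, 4, 6],
                              [1, 3, 5, 6]], dtype=float)

# axis=1: propagate the last valid value across each row.
across_rows = test_df.ffill(axis=1)
expected_across = pd.DataFrame([[0, 2, 2, 2],
                                [1, 1, 4, 6],
                                [n, 3, 5, 5]], dtype=float)

pd.testing.assert_frame_equal(down_columns, expected_down)
pd.testing.assert_frame_equal(across_rows, expected_across)
```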

Note also that this behaviour has been there from the beginning (tested back to 0.13). The ability to specify filling methods in interpolate is not documented, though, which may be why there weren't any bug reports.

@TomAugspurger Is there a reason for this behaviour, or is it just a bug?

EVaisman · 2016-04-18T11:31:45Z

You are correct Joris, that's what I was referring to.

[edit: remove email history]

TomAugspurger · 2016-04-18T12:08:01Z

interpolate shares a code path with fillna here: we first try one of the fillna methods and, if the method isn't a valid fill method (e.g. linear), we try an interpolate method.

Might need to flip the axis argument here.
@EVaisman interested in submitting a fix and some tests?
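
As a rough illustration of the dispatch described above, and of the behaviour a fix would aim for, here is a toy sketch. It is not pandas internals: interpolate_sketch and FILL_METHODS are hypothetical names, and the function simply forwards axis unchanged to whichever public method it picks.

```python
import numpy as np
import pandas as pd

# Hypothetical set of fill-method names, for illustration only.
FILL_METHODS = {"ffill", "pad", "bfill", "backfill"}


def interpolate_sketch(df, method="linear", axis=0):
    """Toy dispatcher: try a fill method first, else fall back to interpolate.

    The same axis is passed to both branches, so axis=0 always means
    "work down each column".
    """
    if method in FILL_METHODS:
        if method in ("ffill", "pad"):
            return df.ffill(axis=axis)
        return df.bfill(axis=axis)
    return df.interpolate(method=method, axis=axis)


n = np.nan
test_df = pd.DataFrame([[0, 2, n, n],
                        [1, n, 4, 6],
                        [n, 3, 5, n]])

# With a consistent axis, both spellings agree.
by_column = interpolate_sketch(test_df, method="ffill", axis=0)
assert by_column.equals(test_df.ffill(axis=0))
```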

EVaisman · 2016-04-18T12:58:04Z

Yes! But will probably have to wait until this weekend.

jreback · 2016-04-18T15:13:29Z

Hmm, I don't think we should be accepting the fill methods directly in .interpolate (or if we do, then they should be tested and listed). So let's make this an error.

JoshuaC3 · 2018-09-07T10:25:18Z

@jreback It would be nice to see ffill as a supported interpolation method. The ffill() method does not accept the limit_area kwarg with its 'inside' and 'outside' options, whereas interpolate does.

Alternatively, this kwarg could be added to the .fillna, ffill and bfill methods to achieve the same results.

Which would be more desirable? Thanks.
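
In the meantime, the 'inside' behaviour can be approximated with ffill and a mask. This is only a sketch under the assumption that "inside" means a NaN with a valid value somewhere before and after it; ffill_inside is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd


def ffill_inside(s):
    """Forward-fill only the NaNs that are surrounded by valid values."""
    filled = s.ffill()
    # A position counts as "inside" when a valid value exists both
    # at or before it (ffill is non-NaN) and at or after it (bfill is non-NaN).
    inside = filled.notna() & s.bfill().notna()
    # Everything outside that region is reset to NaN.
    return filled.where(inside)


s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])
result = ffill_inside(s)
# The gap between 1.0 and 4.0 is filled; the leading and trailing NaN remain.
```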

jreback · 2018-09-07T12:35:31Z

I think it's OK to accept them here; ffill is a kind of interpolation.

(But leave the existing API alone, as it's pretty functional as it is.)

IgorFobia · 2019-07-23T14:34:54Z

I guess I am still having the same bug with pandas version 0.24.2

import numpy as np
import pandas as pd
tdf = pd.DataFrame({'a': [0, 1, np.nan], 'b': [5, np.nan, np.nan]})

This gives a dataframe tdf like this one:

a	b
0.0	5.0
1.0	NaN
NaN	NaN

fillna returns what we expect

tdf.fillna(method='pad')

a	b
0.0	5.0
1.0	5.0
1.0	5.0

While

tdf.interpolate(method='pad')

applies the padding per row:

a	b
0.0	5.0
1.0	1.0
NaN	NaN
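
On affected versions, one way to sidestep the problem is to call ffill directly instead of going through interpolate. A minimal sketch on the same frame, with the expected result written out by hand:

```python
import numpy as np
import pandas as pd

tdf = pd.DataFrame({'a': [0, 1, np.nan], 'b': [5, np.nan, np.nan]})

# DataFrame.ffill defaults to axis=0, i.e. padding down each column.
padded = tdf.ffill()

expected = pd.DataFrame({'a': [0.0, 1.0, 1.0], 'b': [5.0, 5.0, 5.0]})
pd.testing.assert_frame_equal(padded, expected)
```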

jorisvandenbossche added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Apr 18, 2016

TomAugspurger added Difficulty Novice labels Apr 18, 2016

TomAugspurger added this to the 0.18.2 milestone Apr 18, 2016

jreback added the Error Reporting Incorrect or improved errors from pandas label Apr 18, 2016

jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 29, 2016

jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017

TomAugspurger added the good first issue label Oct 11, 2017

jreback removed the Difficulty Novice label Dec 15, 2017

jorisvandenbossche added Bug and removed Error Reporting Incorrect or improved errors from pandas labels Sep 7, 2018

cchwala mentioned this issue Sep 17, 2019

ENH: Added max_gap keyword for series.interpolate #25141

Closed

4 tasks

cchwala added a commit to cchwala/pandas that referenced this issue Sep 18, 2019

Added failing test for pandas-dev#12918

4d7b0f1

jbrockmendel removed the Effort Low label Oct 21, 2019

cchwala mentioned this issue Jan 15, 2020

df.interpolate(method='pad') axis is not consistent with df.fillna(method='pad') #29146

Closed

simonjayhawkins mentioned this issue May 12, 2020

fix bfill, ffill and pad when calling with df.interpolate with column… #33959

Merged

7 tasks

jreback modified the milestones: Contributions Welcome, 1.1 Jun 2, 2020

jreback closed this as completed in #33959 Jun 2, 2020
