Series (DataFrame column) inplace interpolate UnboundLocalError #6281

abrakababra · 2014-02-06T08:35:41Z

Hello, I recently stumbled across this:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': [1., 2., 3., 4., np.nan, 6, 7, 8]}, dtype='<f4')
df1.a.interpolate(inplace=True, downcast=None)


---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-9-bf0994f72664> in <module>()
----> 1 df1.a.interpolate(inplace=True, downcast=None)

C:\Program Files\Python27\lib\site-packages\pandas\core\generic.pyc in interpolate(self, method, axis, limit, inplace, downcast, **kwargs)
   2304         if axis == 1:
   2305             res = res.T
-> 2306         return res
   2307 
   2308     #----------------------------------------------------------------------

UnboundLocalError: local variable 'res' referenced before assignment

The problem arises with setting inplace to True regardless of 'downcast'.
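For readers unfamiliar with this error class, here is a minimal sketch of how a function can run into it. This is not the pandas source: `buggy_fill` and `fixed_fill` are hypothetical stand-ins that forward-fill a plain list, and the branch layout is only an assumption about the kind of mistake involved.

```python
def _forward_fill(values):
    """Return a copy of `values` with each None replaced by the previous item."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last
        filled.append(v)
        last = v
    return filled


def buggy_fill(values, inplace=False):
    """Hypothetical example: `res` is only bound on the non-inplace path."""
    filled = _forward_fill(values)
    if inplace:
        values[:] = filled   # mutates the caller's list, but never binds `res`
    else:
        res = filled
    return res               # raises UnboundLocalError when inplace=True


def fixed_fill(values, inplace=False):
    """Same logic, but every path binds a return value."""
    filled = _forward_fill(values)
    if inplace:
        values[:] = filled
        return None          # in-place call: nothing to return
    return filled
```

Calling `buggy_fill(data, inplace=True)` raises `UnboundLocalError`, while `fixed_fill(data, inplace=True)` updates `data` and returns `None`.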

But speaking of which: when running interpolate() without any options, I would expect data types to be kept by default! I often chunk through datasets and glue them together with to_hdf. It took me a while to figure out that a float column in one chunk contained only zeros, so interpolate downcast it to int, and to_hdf then raised on appending.
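One way to guard against this when processing chunks is to cast the result back to the dtype the column started with. The helper below is a sketch under that assumption; `interpolate_keep_dtype` is a hypothetical name, not a pandas function, and it only makes sense for float input.

```python
import numpy as np
import pandas as pd


def interpolate_keep_dtype(series, **kwargs):
    """Interpolate `series`, then cast the result back to its original dtype.

    Hypothetical helper, not part of pandas. Extra keyword arguments are
    passed through to Series.interpolate unchanged.
    """
    result = series.interpolate(**kwargs)
    return result.astype(series.dtype)


# A float32 chunk that happens to contain only zeros and a gap.
chunk = pd.Series([0.0, np.nan, 0.0], dtype="float32")
filled = interpolate_keep_dtype(chunk)
```

With this, `filled` has the same dtype as `chunk`, so every chunk written to the store keeps a consistent column type.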

Thx in advance

jreback · 2014-02-06T11:49:13Z

can you show: python ci/print_versions.py

jreback · 2014-02-06T11:51:12Z

Interpolation will, by definition, generally convert to float64 (you generally also start with float64, because you need NaNs in order to interpolate). dtype conversion is controlled by the downcast keyword to avoid a performance penalty.

jreback · 2014-02-06T12:35:26Z

cc @TomAugspurger can you have a look at this.

TomAugspurger · 2014-02-06T13:26:11Z

I've got a PR for the Error coming, just sloppy coding on my part.

For the dtypes issue, downcast=None works correctly right? It's just that the default seems weird?

jreback · 2014-02-06T13:28:17Z

default is like this because you can have float looking turn back into floats, e.g. 1., 2., 3. would become int, which is odd as they start as float64. In other words, we don't upcast unless explicitly done.

TomAugspurger · 2014-02-06T13:35:30Z

I think @abrakababra was surprised because it did convert from float to int. We have the default as downcast='infer'.

In [11]: df = pd.DataFrame([1., 2., 3., 4.])

In [12]: df.dtypes
Out[12]: 
0    float64
dtype: object

In [13]: df.interpolate().dtypes
Out[13]: 
0    int64
dtype: object

jreback · 2014-02-06T13:41:03Z

@TomAugspurger oh..maybe because of back compat....

what happens if you turn it off?

TomAugspurger · 2014-02-06T13:59:45Z

Previously Series.interpolate just used numpy.interp which preserved the float dtype.

This is from 0.11:

In [18]: s
Out[18]: 
0     1
1     2
2   NaN
3     4
Name: a, dtype: float64

In [19]: s.interpolate()
Out[19]: 
0    1
1    2
2    3
3    4
Name: a, dtype: float64

So I don't think it was because of backwards compat. Turning it off (with downcast=None) preserves the float dtype. A bunch of tests fail, but only because I wrote them to expect the recast to int where possible.
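To illustrate why a numpy.interp-based approach keeps floats as floats, here is a rough sketch of NaN filling built on it. It is not the old pandas implementation; `interp_nans` is a hypothetical helper that assumes evenly spaced positions (0, 1, 2, ...).

```python
import numpy as np


def interp_nans(values):
    """Fill NaNs by linear interpolation over integer positions.

    Hypothetical helper. np.interp holds the edge values constant, so
    leading/trailing NaNs take the nearest valid value rather than
    being extrapolated.
    """
    arr = np.asarray(values, dtype="float64")
    mask = np.isnan(arr)
    if not mask.any() or mask.all():
        return arr.copy()          # nothing to fill, or nothing to fill from
    idx = np.arange(arr.size)
    out = arr.copy()
    out[mask] = np.interp(idx[mask], idx[~mask], arr[~mask])
    return out


result = interp_nans([1.0, 2.0, np.nan, 4.0])
```

The result is always a float64 array, whether or not the filled values happen to be whole numbers.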

abrakababra · 2014-02-06T14:55:52Z

Yes, I was just surprised about downcast='infer' being the default. I'm glad to have this option for whenever it may come in handy, but in my case (appending to an existing column) it was not what I wanted/expected.
@TomAugspurger Thanks for fixing the 'inplace=True' issue :-)

jreback · 2014-02-06T14:58:40Z

@TomAugspurger I have no problem making downcast=None the default; just document it as an API change, and see if there are any problems if you just change the tests.

TomAugspurger · 2014-02-06T15:04:53Z

@abrakababra thanks for the report.

@jreback agreed about the API change. I'll fix it this afternoon.

TomAugspurger · 2014-02-06T19:40:50Z

@jreback Had a problem come up with changing the downcast to None. If you start out with a DataFrame like

In [14]: df = DataFrame({'A': [1, 2, np.nan, 4, 5, np.nan, 7],
   ....:                 'C': [1, 2, 3, 5, 8, 13, 21]})

In [15]: df.dtypes
Out[15]: 
A    float64
C      int64
dtype: object

So one float, one int, with no NaNs in the int. We apply the interpolation method to each block, so it gets applied to the int block/column. The interpolation will generally return a float dtype.

In [18]: df.interpolate(method='cubic', downcast=None).dtypes
Out[18]: 
A    float64
C    float64
dtype: object

So the int column was changed to float. With downcast='infer' we'd get 2 int columns. Thoughts? I suppose I could only apply the interpolation method to columns with at least one null value.

jreback · 2014-02-06T19:43:25Z

@TomAugspurger

could 'skip' columns that don't need interpolation (e.g. don't have nans), sounds ok to me
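The "skip" idea can be sketched at the user level as follows. This is only an illustration of the approach, not the change that went into pandas; `interpolate_only_nulls` is a hypothetical helper and it assumes unique column names.

```python
import numpy as np
import pandas as pd


def interpolate_only_nulls(df, **kwargs):
    """Interpolate only the columns that contain at least one null value.

    Hypothetical helper. Columns without nulls are left untouched, so an
    int column stays int. Keyword arguments go to Series.interpolate
    (note that method='cubic' additionally requires SciPy).
    """
    out = df.copy()
    for col in out.columns:
        if out[col].isna().any():
            out[col] = out[col].interpolate(**kwargs)
    return out


df = pd.DataFrame({'A': [1, 2, np.nan, 4, 5, np.nan, 7],
                   'C': [1, 2, 3, 5, 8, 13, 21]})
result = interpolate_only_nulls(df)
```

Here column A is filled by the default linear method, while column C keeps the integer dtype it started with.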

jreback added the Bug label Feb 6, 2014

jreback added this to the 0.14.0 milestone Feb 6, 2014

TomAugspurger mentioned this issue Feb 6, 2014

BUG: Fix interpolate with inplace=True #6284

Merged

jreback added API Design labels Feb 6, 2014

TomAugspurger mentioned this issue Feb 6, 2014

BUG: Interpolate should be more careful with dtypes #6290

Closed

jreback closed this as completed in #6284 Feb 6, 2014
