BUG: fillna with method segfaults on zero-length input (fixes #2775) #2778

Closed

Conversation

stephenwlin
Contributor

fixes #2775

just added a check for zero-length data to the backfill and pad templates
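
A minimal pure-Python sketch of the kind of guard described above. It assumes the fill routines seed their "last seen value" from the first (or last) element before looping, which is the read that would go out of bounds on empty input in compiled code with bounds checking turned off. The names `pad_inplace` and `backfill_inplace` and their signatures are simplified stand-ins for illustration, not the actual generated templates.

```python
def pad_inplace(values, mask):
    """Forward-fill the masked slots of `values` in place (illustrative only)."""
    n = len(values)
    # The guard under discussion: with nothing to fill, return before
    # touching values[0].
    if n == 0:
        return
    last = values[0]
    for i in range(n):
        if mask[i]:
            values[i] = last      # masked slot: reuse the last valid value
        else:
            last = values[i]      # valid slot: remember it


def backfill_inplace(values, mask):
    """Backward-fill the masked slots of `values` in place (illustrative only)."""
    n = len(values)
    if n == 0:
        return
    last = values[n - 1]
    for i in range(n - 1, -1, -1):
        if mask[i]:
            values[i] = last
        else:
            last = values[i]
```

In plain Python the unguarded `values[0]` would raise an `IndexError`; in compiled code without bounds checks the same read just picks up whatever sits in memory.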

not sure if I should add test coverage? the problem is that the tests will not fail without the fix, but rather segfault, so it might not be a good idea
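
One possible way around the "test segfaults instead of failing" problem is to run the risky call in a child interpreter and inspect its exit code, so a crash shows up as a failed assertion rather than taking down the test runner. This is only a sketch; `run_isolated` and `crashed` are hypothetical helpers, not part of pandas or its test suite.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run `snippet` in a fresh Python interpreter and return its exit code."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return proc.returncode


def crashed(returncode):
    # On POSIX a negative exit code means the child was killed by a signal
    # (for example a segmentation fault). This check is POSIX-only; on
    # Windows a crash surfaces as a large nonzero code instead.
    return returncode < 0


# Possible usage, with the reproduction from the original issue as the snippet:
#   code = run_isolated(
#       "import pandas; "
#       "pandas.DataFrame(columns=['x']).x.fillna(method='pad')"
#   )
#   assert not crashed(code)
```

The trade-off is a slower test that spawns a process, in exchange for a clean failure report on unfixed builds.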

@jreback
Contributor

jreback commented Jan 30, 2013

fyi....this is going to conflict with dtypes branch....maybe do this afterwards? #2708

@stephenwlin
Contributor Author

do you want to just fix it on your branch instead and we can close this PR? it's just some checks for N == 0

@jreback
Contributor

jreback commented Jan 30, 2013

sure...just pad & backfill (all methods)....do we have a test to replicate?

@stephenwlin
Contributor Author

just run fillna on any DataFrame or Series with zero rows, using a method rather than a fill value...
example from original issue was pandas.DataFrame(columns=["x"]).x.fillna(method="pad", inplace=1)

@ghost

ghost commented Jan 30, 2013

I don't think inplace is needed to trigger, and I was unable to replicate with Series().fillna(method="pad", inplace=1)

Not sure why pandas.DataFrame(columns=["x"]).x and Series() behave differently; there may be a deeper issue at play.

@jreback
Contributor

jreback commented Jan 30, 2013

it triggered for me with and w/o inplace....adding as a test

@jreback
Contributor

jreback commented Jan 30, 2013

ok...fixed in #2708

@stephenwlin
Contributor Author

fyi, apparently it only segfaults when dtype=='object', probably because that's the only case in which the value is dereferenced as a pointer. for non-object dtypes, the bug is still there, but it doesn't cause a segfault because the memory is just being read and interpreted as an integer/float

In [2]: p.Series().dtype
Out[2]: dtype('float64')

In [3]: p.Series().fillna(method='pad')
Out[3]: []

In [4]: p.DataFrame(columns=['x'])['x'].dtype
Out[4]: dtype('object')

In [5]: p.DataFrame(columns=['x'])['x'].fillna(method='pad')
Segmentation fault (core dumped)

In [2]: p.Series().astype('int64').fillna(method='pad')
Out[2]: []

In [3]: p.Series().astype('object').fillna(method='pad')
Segmentation fault (core dumped)
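
A small stdlib-only illustration of the point above: any stray 8 bytes decode to *some* float64 or int64 without complaint, but the same bytes treated as a pointer name an arbitrary address, and following it is what can crash. The byte pattern is made up for the example, and the code only computes the address; it never dereferences it.

```python
import struct

# Stand-in for 8 bytes of stray memory read past the end of an empty buffer.
raw = bytes([0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x02, 0x03, 0x04])

as_float = struct.unpack("<d", raw)[0]  # any 8 bytes decode to some float64
as_int = struct.unpack("<q", raw)[0]    # any 8 bytes decode to some int64
as_addr = struct.unpack("<Q", raw)[0]   # the same bytes viewed as an address

print("as float64:", as_float)
print("as int64:  ", as_int)
print("as address:", hex(as_addr))
```

For numeric dtypes the garbage value is simply carried along (and never used when there are no rows to fill); for object dtype the value is a reference that has to be followed, which fits the observation that only the object case segfaults.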

@wesm
Member

wesm commented Jan 31, 2013

Should this be patched in a v0.10.2 release? I think that might happen before #2708 is merged into v0.11 dev branch

@stephenwlin stephenwlin restored the fillna-segfault-fix branch January 31, 2013 19:56
@jreback
Contributor

jreback commented Jan 31, 2013

up 2 you....it's actually hard to trigger this (though the fix is pretty trivial too)...

@stephenwlin
Contributor Author

@wesm, i restored the branch and committed the same test as in #2708, in case you want to merge this first. there will probably be conflicts later if you do, but they'll be easy to resolve

@stephenwlin stephenwlin reopened this Jan 31, 2013
jreback added a commit to jreback/pandas that referenced this pull request Feb 8, 2013
…ndas-dev#622)

     construction of multi numeric dtypes with other types in a dict
     validated get_numeric_data returns correct dtypes
     added blocks attribute (and as_blocks()) method that returns a dict of dtype -> homogeneous Frame to DataFrame
     added keyword 'raise_on_error' to astype, which can be set to false to exclude non-numeric columns
     fixed merging to correctly merge on multiple dtypes with blocks (e.g. float64 and float32 in other merger)
     changed implementation of get_dtype_counts() to use .blocks
     revised DataFrame.convert_objects to use blocks to be more efficient
     added Dtype printing to show on default with a Series
     added convert_dates='coerce' option to convert_objects, to force conversions to datetime64[ns]
     where can upcast integer to float as needed (on inplace ops pandas-dev#2793)
     added fully cythonized support for int8/int16
     no support for float16 (it can exist, but no cython methods for it)

TST: fixed test in test_from_records_sequencelike (dict orders can be different on different arch!)
       NOTE: using tuples will remove dtype info from the input stream (using a record array is ok though!)
     test updates for merging (multi-dtypes)
     added tests for replace (but skipped for now, algos not set for float32/16)
     tests for astype and convert in internals
     fixes for test_excel on 32-bit
     fixed test_resample_median_bug_1688 I believe
     separated out test_from_records_dictlike
     testing of panel constructors (GH pandas-dev#797)
     where ops now have a full test suite
     allow slightly less sensitive decimal tests for less precise dtypes

BUG: fixed GH pandas-dev#2778, fillna on empty frame causes seg fault
     fixed bug in groupby where types were not being cast to original dtype
     respect the dtype of non-natural numeric (Decimal)
     don't upcast ints/bools to floats (if you say we're agging on len, you can get an int)
DOC: added astype conversion examples to whatsnew and docs (dsintro)
     updated RELEASE notes
     whatsnew for 0.10.2
     added upcasting gotchas docs

CLN: updated convert_objects to be more consistent across frame/series
     moved most groupby functions out of algos.pyx to generated.pyx
     fully support cython functions for pad/bfill/take/diff/groupby for float32
     moved more block-like conversion loops from frame.py to internals.py (created apply method)
       (e.g. diff,fillna,where,shift,replace,interpolate,combining), to top-level methods in BlockManager
@stephenwlin
Contributor Author

closed because of merge of #2708

Development

Successfully merging this pull request may close these issues.

segmentation fault in fillna
3 participants