BUG: hashtable memory error causes test_factorize_nan crash #7157

Merged
merged 1 commit into from
May 18, 2014

miketkelly

The ObjectVector class resizes its array without resetting its capacity count, so subsequent appends write to invalid memory.

Mac OS 10.9, Python 2.7.6, numpy 1.9.0.dev-ee49411.

==57654== Invalid write of size 8
==57654==    at 0x137F856: __pyx_f_6pandas_9hashtable_12ObjectVector_append (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x139B16F: __pyx_pw_6pandas_9hashtable_17PyObjectHashTable_25get_labels (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CA9E: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)

==57654==  Address 0x10095efd0 is 16 bytes inside a block of size 256 free'd
==57654==    at 0x7858: realloc (in /usr/local/Cellar/valgrind/3.9.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==57654==    by 0x13C0F55: PyDataMem_RENEW (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1488DE7: PyArray_Resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x14647A0: array_resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1396A12: __pyx_pw_6pandas_9hashtable_12ObjectVector_5to_array (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CFEE: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)
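The trace above is consistent with a shrink in `to_array` followed by an `append` that still trusts the old capacity. Below is a minimal pure-Python sketch of that failure mode, not the pandas Cython code: the class name `GrowableVector`, the `reset_capacity` flag, and `INIT_CAP` are hypothetical names chosen for illustration, and a Python list stands in for the underlying array so that the stale capacity surfaces as an `IndexError` rather than an invalid write.

```python
INIT_CAP = 32  # hypothetical initial capacity, chosen for illustration


class GrowableVector:
    """Pure-Python model of an append-only vector backed by a sized buffer."""

    def __init__(self, reset_capacity=True):
        self.n = 0                  # number of items stored
        self.m = INIT_CAP           # capacity the vector believes it has
        self.buf = [None] * self.m  # stand-in for the underlying array
        self.reset_capacity = reset_capacity

    def append(self, item):
        if self.n == self.m:
            # Grow the buffer; max() keeps growth working when m is 0.
            self.m = max(self.m * 2, INIT_CAP)
            self.buf.extend([None] * (self.m - len(self.buf)))
        # In C this write is unchecked; here a stale `m` raises IndexError.
        self.buf[self.n] = item
        self.n += 1

    def to_array(self):
        # Shrink the buffer to exactly the items stored so far.
        del self.buf[self.n:]
        if self.reset_capacity:
            # Keep the capacity count in sync with the shrunken buffer.
            self.m = self.n
        return self.buf
```

With `reset_capacity=False`, appending after `to_array()` writes past the shrunken buffer because `m` still reports the old size. With the default, the next `append` sees `n == m` and grows the buffer first.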

@jreback
Contributor

jreback commented May 17, 2014

can u construct a test that fails w/o the fix?

this doesn't fail in numpy 1.9 afaict (in 64-bit linux)

as we test this in Travis

@miketkelly
Author

Added a test case that also crashes on Linux.

@jreback
Contributor

jreback commented May 17, 2014

gr8

can u add more dtypes to your test

eg loop thru object, int64, float64

and release note (0.14.0)

thanks

jreback added this to the 0.14.0 milestone May 17, 2014
@miketkelly
Author

Done. Also needed to handle the special case where a vector is resized to 0.
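One plausible reason the zero-size case needs special handling: if capacity grows by doubling, a capacity of 0 never grows. The sketch below only illustrates that arithmetic; the function names and the `MIN_CAP` floor are assumptions for illustration, not taken from the pandas source.

```python
MIN_CAP = 32  # hypothetical floor, not taken from the pandas source


def next_capacity_doubling(m):
    # Plain doubling: stuck at 0 once the capacity has been reset to 0.
    return m * 2


def next_capacity_with_floor(m):
    # Doubling with a minimum, so a vector resized to 0 can grow again.
    return max(m * 2, MIN_CAP)
```

Calling `next_capacity_doubling(0)` leaves the capacity at 0, whereas `next_capacity_with_floor(0)` yields `MIN_CAP`.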

@jreback
Contributor

jreback commented May 18, 2014

looks good
ping when green

jreback merged commit eb97319 into pandas-dev:master May 18, 2014
@jreback
Contributor

jreback commented May 18, 2014

thanks!

Labels: Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Bug