Factorize fails with read-only array #12813

Closed
rabernat opened this issue Apr 6, 2016 · 2 comments

@rabernat

commented Apr 6, 2016

same as in #15286

pd.factorize raises a Cython error when given a read-only array. This seems related to #10043 and #10070. I ran into it through xarray; see pydata/xarray#818.

Code Sample, a copy-pastable example if possible

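import numpy as np
import pandas as pd

a = np.arange(2)
a.flags.writeable = False  # mark the array as read-only
pd.factorize(a)  # raises ValueError on pandas 0.18.0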

This raises the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-54-dddd925d767a> in <module>()
      1 a = np.arange(10)
      2 a.flags.writeable = False
----> 3 pd.factorize(a)

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/core/algorithms.pyc in factorize(values, sort, order, na_sentinel, size_hint)
    194     table = hash_klass(size_hint or len(vals))
    195     uniques = vec_klass()
--> 196     labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
    197 
    198     labels = com._ensure_platform_int(labels)

pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:7893)()

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/hashtable.so in View.MemoryView.memoryview_cwrapper (pandas/hashtable.c:29882)()

/Users/rpa/anaconda/lib/python2.7/site-packages/pandas/hashtable.so in View.MemoryView.memoryview.__cinit__ (pandas/hashtable.c:26251)()

ValueError: buffer source array is read-only

Expected Output

The result should be the same as for a writable array:

>>> pd.factorize(np.arange(2))
(array([0, 1]), array([0, 1]))
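
A possible workaround on affected versions, sketched here as an assumption rather than as the fix pandas adopted: hand factorize a writable copy whenever the input is read-only. The helper name as_writable is hypothetical and not part of pandas or NumPy.

```python
import numpy as np


def as_writable(arr):
    """Return arr unchanged if it is writable, otherwise a writable copy.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    arr = np.asarray(arr)
    if arr.flags.writeable:
        return arr
    # ndarray.copy() allocates a new array, which is writable
    return arr.copy()


a = np.arange(2)
a.flags.writeable = False

b = as_writable(a)
print(b.flags.writeable)  # True: the copy can back a writable buffer
print(a.flags.writeable)  # False: the original stays read-only
```

With this in place, pd.factorize(as_writable(a)) should avoid the read-only buffer error, at the cost of one extra copy for read-only inputs.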

output of pd.show_versions()

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.0
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.22.1
numpy: 1.10.4
scipy: 0.16.0
statsmodels: 0.6.1
xarray: 0.7.2-4-g33efdcd
IPython: 4.0.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.5.0
pytz: 2016.1
blosc: None
bottleneck: 1.0.0
tables: 3.2.1.1
numexpr: 2.5.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: 2.6 (dt dec pq3 ext)
jinja2: 2.8
boto: 2.38.0

@jreback

Contributor

commented Apr 6, 2016

Yeah, this is a Cython bug; you can try the same solution as in #10070.

@jreback jreback added this to the 0.18.1 milestone Apr 6, 2016

@jreback jreback modified the milestones: 0.18.1, 0.18.2 Apr 26, 2016

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 21, 2016

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017

@jreback jreback modified the milestones: Next Minor Release, Next Major Release Apr 1, 2017

@jreback jreback added the Prio-medium label Apr 1, 2017

@jreback jreback modified the milestones: Interesting Issues, Next Major Release Nov 26, 2017

@xhochy xhochy referenced this issue Jul 7, 2018

Merged

Accept constant memoryviews in HashTable.lookup #21688

3 of 3 tasks complete

@jreback jreback modified the milestones: Contributions Welcome, 0.24.0 Jul 7, 2018

@jreback

Contributor

commented Jul 7, 2018

This can be easily fixed after #21688.
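
For anyone wanting to check whether their installed pandas is still affected, here is a rough sketch of a regression check. It is not the unit test that was added to pandas; the function name factorize_matches_writable is made up for this example. It assumes a pandas version that already accepts read-only input (the milestone in this thread points at 0.24.0); on affected versions the call raises the ValueError shown in the report instead of returning.

```python
import numpy as np
import pandas as pd


def factorize_matches_writable(values):
    """Check that factorizing a read-only copy gives the writable result.

    Hypothetical helper for illustration. On affected pandas versions the
    read-only call raises ValueError instead of returning.
    """
    readonly = values.copy()
    readonly.flags.writeable = False

    expected_codes, expected_uniques = pd.factorize(values)
    codes, uniques = pd.factorize(readonly)

    return (np.array_equal(codes, expected_codes)
            and np.array_equal(uniques, expected_uniques))


print(factorize_matches_writable(np.arange(10)))
```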

xhochy added a commit to xhochy/pandas that referenced this issue Jul 8, 2018

@xhochy xhochy referenced this issue Jul 8, 2018

Merged

Add unit test for #12813 #21811

4 of 4 tasks complete

jreback added a commit that referenced this issue Jul 8, 2018

alimcmaster1 added a commit to alimcmaster1/pandas that referenced this issue Aug 12, 2018

Sup3rGeo added a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018
