BUG: 2D ndarray of dtype 'object' is always copied upon construction #39263

Closed
3 tasks done
irgolic opened this issue Jan 19, 2021 · 0 comments · Fixed by #39272
Labels
Bug Internals Related to non-user accessible pandas implementation Reshaping Concat, Merge/Join, Stack/Unstack, Explode

irgolic commented Jan 19, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

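import numpy as np
import pandas as pd

# 1D object array: the DataFrame wraps the original buffer.
a = np.array(['a', 'b'], dtype='object')
df = pd.DataFrame(a)

assert np.shares_memory(df.values, a)  # passes: shares_memory is True

# 2D object array: the DataFrame copies the data on construction.
b = np.array([['a', 'b'],
              ['c', 'd']], dtype='object')
df2 = pd.DataFrame(b)

assert np.shares_memory(df2.values, b)  # fails: shares_memory is False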
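
For readers unfamiliar with the check used in the sample above, here is a minimal NumPy-only sketch of what np.shares_memory reports for a view versus a copy. It does not involve pandas at all; the names view and copied are only illustrative.

```python
import numpy as np

# A 2D object array like the one in the report.
b = np.array([['a', 'b'],
              ['c', 'd']], dtype='object')

view = b.T          # transposing returns a view onto the same buffer
copied = b.copy()   # an explicit copy owns a separate buffer

shares_view = np.shares_memory(view, b)
shares_copy = np.shares_memory(copied, b)

# Writing through the view is visible in the original array,
# while the copy keeps its old contents.
view[0, 1] = 'z'    # view[0, 1] refers to the same element as b[1, 0]
```

This is the behaviour the report expects from DataFrame for the 2D case: a wrapper that behaves like view rather than like copied.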

Problem description

This issue addresses the TODO in pandas/core/internals/construction.py (line 250 on master at the time of writing), which was introduced by #26825. It's quite annoying: I'm trying to use a DataFrame as an interface to my object-dtype NumPy array, but I can't do that without the array being copied first.
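
To see how a given pandas installation behaves, a small diagnostic along these lines may help. It is only a sketch: frame_shares_memory is a hypothetical helper written for this illustration, not a pandas function, and what it reports depends on the pandas version and its copy settings. On the version listed below (1.1.5), the 2D case is the one reported here as copied.

```python
import numpy as np
import pandas as pd


def frame_shares_memory(arr, **kwargs):
    """Return True if a DataFrame built from `arr` still uses arr's buffer.

    Hypothetical helper for illustration only. Extra keyword arguments
    are forwarded unchanged to the DataFrame constructor.
    """
    df = pd.DataFrame(arr, **kwargs)
    return bool(np.shares_memory(df.values, arr))


one_d = np.array(['a', 'b'], dtype='object')
two_d = np.array([['a', 'b'],
                  ['c', 'd']], dtype='object')

# Collect the results instead of asserting, because they are version dependent.
report = {
    '1D default': frame_shares_memory(one_d),
    '2D default': frame_shares_memory(two_d),
    '2D copy=False': frame_shares_memory(two_d, copy=False),
}

print('pandas', pd.__version__)
for label, shared in report.items():
    print(f'{label}: shares memory = {shared}')
```

Passing copy=False explicitly is included because it states the intent of the use case described above; whether it is honoured for 2D object arrays is exactly what this issue is about.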

Expected Output

Exit code 0, i.e. both assertions pass.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : b5958ee
python : 3.6.10.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.1.5
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.3
Cython : 0.29.14
pytest : 5.3.5
hypothesis : 5.4.1
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fsspec : 0.6.2
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.13
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.2.0
numba : 0.48.0

@irgolic irgolic added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 19, 2021
irgolic added a commit to irgolic/pandas that referenced this issue Jan 22, 2021
irgolic added a commit to irgolic/pandas that referenced this issue Jan 22, 2021
@jreback jreback added this to the 1.3 milestone Jan 22, 2021
irgolic added a commit to irgolic/pandas that referenced this issue Jan 28, 2021
@lithomas1 lithomas1 added Internals Related to non-user accessible pandas implementation Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Feb 9, 2021
@lithomas1 lithomas1 removed the Needs Triage Issue that has not been reviewed by a pandas team member label Feb 9, 2021
@simonjayhawkins simonjayhawkins removed this from the 1.3 milestone Jun 8, 2021
@jreback jreback added this to the 1.4 milestone Jul 14, 2021