
BUG: AttributeError: type object 'object' has no attribute 'dtype' with numpy 1.20.x and pandas versions 1.0.4 and earlier #39520

Closed
Lucareful opened this issue Feb 1, 2021 · 29 comments
Labels
Closing Candidate May be closeable, needs more eyeballs Compat pandas objects compatability with Numpy or Python functions Dependencies Required and optional dependencies

Comments

@Lucareful

root@548977c7dc-62l72:/app# pip list | grep pandas
pandas 1.0.3

In IPython, I tried initializing a DataFrame:

`
In [1]: import pandas as pd

In [2]: pd.DataFrame([],columns=['a','b','c'])

AttributeError                            Traceback (most recent call last)
in
----> 1 pd.DataFrame([],columns=['clicks', 'uclicks', 'impressions'])

/usr/local/lib/python3.8/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
488 mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
489 else:
--> 490 mgr = init_dict({}, index, columns, dtype=dtype)
491 else:
492 try:

/usr/local/lib/python3.8/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
237 else:
238 nan_dtype = dtype
--> 239 val = construct_1d_arraylike_from_scalar(np.nan, len(index), nan_dtype)
240 arrays.loc[missing] = [val] * missing.sum()
241

/usr/local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
1438 else:
1439 if not isinstance(dtype, (np.dtype, type(np.dtype))):
-> 1440 dtype = dtype.dtype
1441
1442 if length and is_integer_dtype(dtype) and isna(value):

AttributeError: type object 'object' has no attribute 'dtype'
`

@Lucareful Lucareful added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Feb 1, 2021
@Lucareful Lucareful changed the title BUG: pd.DataFrame([],columns=[]) get type object 'object' has no attribute 'dtype' BUG: python 3.8.7 pandas 1.0.3 pd.DataFrame([],columns=[]) get type object 'object' has no attribute 'dtype' Feb 1, 2021
@ofir-ov

ofir-ov commented Feb 1, 2021

It started happening to me as well, using Python 3.8.7 and Pandas 1.0.2. It started happening in code that used to work:

# python
Python 3.8.7 (default, Jan 12 2021, 17:16:32)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'1.0.2'
>>> my_df = pd.DataFrame(columns=['col1', 'col2', 'col3'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/pandas/core/frame.py", line 435, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/usr/local/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 239, in init_dict
    val = construct_1d_arraylike_from_scalar(np.nan, len(index), nan_dtype)
  File "/usr/local/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1440, in construct_1d_arraylike_from_scalar
    dtype = dtype.dtype
AttributeError: type object 'object' has no attribute 'dtype'
>>>

FWIW it seems like upgrading Pandas (in my case to 1.2.1) solves this.

@rmaguire31

rmaguire31 commented Feb 1, 2021

Found the same issue this morning due to some CI tests failing. Python==3.8.2 and pandas==1.0.1.

CI tests ran again successfully after bumping to pandas==1.2.1.

@jreback
Contributor

jreback commented Feb 1, 2021

Please post the output of pd.show_versions() as instructed. This is most likely because of the numpy 1.20 release.
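The full environment report requested here comes from `pd.show_versions()`. When only the two relevant version numbers are needed, or pandas itself cannot be imported, they can be read with the standard library alone. This is a minimal sketch; `installed_version` is a hypothetical helper name, not part of pandas or NumPy:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages this issue is about.
for name in ("pandas", "numpy"):
    print(name, installed_version(name))
```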

@jreback jreback added Dependencies Required and optional dependencies and removed Dependencies Required and optional dependencies Needs Triage Issue that has not been reviewed by a pandas team member Bug labels Feb 1, 2021
@okapies

okapies commented Feb 1, 2021

I got the same errors in my environment. As @jreback pointed out, the combination of an older version of pandas (at least pandas>=0.25.2) and numpy==1.20.0 produces the AttributeError. It seems the problem is fixed in pandas==1.0.5.

$ pip install -U pandas==1.0.1
$ pip install -U numpy==1.20.0
>>> import pandas as pd
>>> pd.DataFrame(columns=['a'])
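The version combinations reported in this thread can be turned into a quick check. This is only a sketch: the thresholds (NumPy 1.20 or later, pandas before 1.0.5) are taken from the reports here rather than from an exhaustive test, and `parse_version` and `is_affected` are hypothetical helper names:

```python
def parse_version(text):
    """Turn '1.20.0' into (1, 20, 0); assumes plain numeric release versions."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(pandas_version, numpy_version):
    """Return True for the pandas/NumPy combinations reported as failing in this thread."""
    numpy_is_new = parse_version(numpy_version) >= (1, 20)
    pandas_is_old = parse_version(pandas_version) < (1, 0, 5)
    return numpy_is_new and pandas_is_old


print(is_affected("1.0.3", "1.20.0"))  # combination from the original report
print(is_affected("1.2.1", "1.20.0"))  # after upgrading pandas
print(is_affected("1.0.3", "1.19.5"))  # after downgrading NumPy
```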

@okapies

okapies commented Feb 1, 2021

The following also breaks our code:

>>> import pandas as pd
>>> f = lambda x: x
>>> pd.DataFrame.from_records(map(f, []), columns=['a'])
...
AttributeError: type object 'object' has no attribute 'dtype'
>>> pd.DataFrame.from_records(list(map(f, [])), columns=['a'])
Empty DataFrame
Columns: [a]
Index: []
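The two calls above differ only in that the second one materializes the iterator before handing it to pandas. A minimal sketch of that workaround, assuming pandas is installed; `frame_from_iterable` is a hypothetical helper name, not a pandas API:

```python
import pandas as pd


def frame_from_iterable(records, columns):
    """Build a DataFrame from a possibly lazy iterable of records."""
    # list() consumes map objects and generators up front, so from_records
    # receives a concrete (possibly empty) list rather than an iterator.
    return pd.DataFrame.from_records(list(records), columns=columns)


df = frame_from_iterable(map(lambda x: x, []), columns=["a"])
print(df.columns.tolist())
```

This only sidesteps the failing code path; upgrading pandas or pinning NumPy, as discussed elsewhere in the thread, addresses the incompatibility itself.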

@Lucareful
Author

I got the same errors in my environment. As @jreback pointed out, the combination of an older version of pandas (at least pandas>=0.25.2) and numpy==1.20.0 produces the AttributeError. It seems the problem is fixed in pandas==1.0.5.

$ pip install -U pandas==1.0.1
$ pip install -U numpy==1.20.0
>>> import pandas as pd
>>> pd.DataFrame(columns=['a'])

Yes, when I pip install -U pandas==1.0.5 (or change to numpy==0.19.5), the problem is fixed. With pandas==1.0.3 and numpy==1.20, the problem reproduces.

@jreback jreback reopened this Feb 2, 2021
@jreback
Contributor

jreback commented Feb 2, 2021

we'd actually like to see if we can patch this

@jreback jreback added this to the 1.2.2 milestone Feb 2, 2021
@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Feb 2, 2021
@okapies

okapies commented Feb 2, 2021

What is the root cause of the problem? If numpy broke the backward compatibility unintentionally, it should be fixed in numpy==1.20.1.

@jreback
Contributor

jreback commented Feb 2, 2021

The numpy break is intentional, as dtype definitions are being updated (though I'm unsure why this didn't come up in our prior testing).

@jreback
Contributor

jreback commented Feb 2, 2021

cc @seberg @jbrockmendel

@seberg
Contributor

seberg commented Feb 2, 2021

Hmm, I think I saw this recently on a NumPy issue. The unfortunate thing is that isinstance(dtype, (np.dtype, type(np.dtype))) doesn't make much sense, since type(np.dtype) was never really a clear construct (it was just type). But that was enough to make the old code work...

I admit the break is annoying, but trying to swing in a direction that doesn't use a metaclass for dtype now is not something I can do easily/lightly (it would probably be a huge change in direction). I guess we got into a position where pandas fixed this, before I actually did the change in NumPy.
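The effect described above can be reproduced without NumPy. In this sketch, `OldDType`, `DTypeMeta` and `NewDType` are hypothetical stand-ins for `np.dtype` before and after it gained a metaclass; only the `isinstance` check is taken from the traceback in this issue:

```python
class OldDType:
    """Stand-in for np.dtype as an ordinary class, whose type is plain `type`."""


class DTypeMeta(type):
    """Stand-in for a dtype metaclass."""


class NewDType(metaclass=DTypeMeta):
    """Stand-in for np.dtype once its type is the metaclass rather than `type`."""


def passes_check(dtype, dtype_cls):
    # Mirrors the check from the traceback:
    #     isinstance(dtype, (np.dtype, type(np.dtype)))
    return isinstance(dtype, (dtype_cls, type(dtype_cls)))


# type(OldDType) is `type`, and `object` is an instance of `type`.
print(passes_check(object, OldDType))  # True
# type(NewDType) is DTypeMeta, and `object` is not an instance of it.
print(passes_check(object, NewDType))  # False
```

When the check fails for `object`, the old pandas code falls through to `dtype.dtype`, which is where the AttributeError in this issue is raised.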

@mrdavidlaing

FWIW; if you are stuck on pandas==0.24.2 (don't ask); downgrading to numpy==1.19.5 works
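For anyone in the same position, the pin can be written down once so that fresh installs do not pick up NumPy 1.20. This is a sketch that assumes dependencies are declared in a Python list (for example for `install_requires` in a `setup.py`); the project layout is hypothetical and the versions are the ones reported in this thread:

```python
# Hypothetical dependency pins for a project that cannot move off pandas 0.24.2.
INSTALL_REQUIRES = [
    "pandas==0.24.2",
    "numpy<1.20",  # stay below the release that triggers the AttributeError
]

# Emit the same pins in requirements.txt form.
print("\n".join(INSTALL_REQUIRES))
```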

@seberg
Contributor

seberg commented Feb 2, 2021

I am not too happy that you have to pin NumPy, but I guess having an upstream package almost a year newer than the downstream package can be problematic more generally (if there had been a proper Deprecation you would see it kick in around the same time).
To some degree, there is always the question whether downstream packages should set an upper version limit. Ralf suggested doing that generally, although I am not sure pip would actually use it correctly.

If it helps, in this case, I suspect the incredibly minimal fix will be to just replace type(np.dtype) with type. (The code might be slightly weird, but it's unmodified behaviour that counts here.)
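As an illustration of that suggestion, here is a sketch of the check with `type(np.dtype)` replaced by `type`. It assumes NumPy is installed; `resolve_dtype` is a hypothetical name, and this is not the actual pandas patch:

```python
import numpy as np


def resolve_dtype(dtype):
    """Return a dtype-like value, unwrapping objects that carry a .dtype attribute."""
    # np.dtype instances and plain classes (such as `object` or np.float64)
    # pass through unchanged, whatever metaclass np.dtype uses.
    if not isinstance(dtype, (np.dtype, type)):
        # Anything else is assumed to expose its dtype as an attribute.
        dtype = dtype.dtype
    return dtype


print(resolve_dtype(object))
print(resolve_dtype(np.dtype("float64")))
print(resolve_dtype(np.array([1.0])))
```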

@fabiolafs

FWIW; if you are stuck on pandas==0.24.2 (don't ask); downgrading to numpy==1.19.5 works
THANK YOU
also works with pandas==0.25.3

@simonjayhawkins simonjayhawkins changed the title BUG: python 3.8.7 pandas 1.0.3 pd.DataFrame([],columns=[]) get type object 'object' has no attribute 'dtype' BUG: AttributeError: type object 'object' has no attribute 'dtype' with numpy 1.20.x and pandas versions 1.0.4 and earlier Apr 8, 2021
@PaulBremner

PaulBremner commented May 28, 2021

I got the same errors in my environment. As @jreback pointed out, the combination of an older version of pandas (at least pandas>=0.25.2) and numpy==1.20.0 produces the AttributeError. It seems the problem is fixed in pandas==1.0.5.

$ pip install -U pandas==1.0.1
$ pip install -U numpy==1.20.0
>>> import pandas as pd
>>> pd.DataFrame(columns=['a'])

Yes, when I pip install -U pandas==1.0.5 (or change to numpy==0.19.5), the problem is fixed. With pandas==1.0.3 and numpy==1.20, the problem reproduces.

I have tried multiple different versions of pandas and numpy to no avail - I still get this bug. If I understood your post correctly Pandas 1.0.5 with numpy 1.19.5 (the earliest version of numpy I can see is 1 so assume there is a typo) should fix it and it does not. I tried 1.0.5 with the latest version of numpy and it still fails. I have tried several other combinations and still it persists. Please suggest how I can fix this bug.

For reference I am using:
mydict = {'dv':[floats], 'v1':[strings], 'v2':[strings], 'v3':[strings]}
df = pd.DataFrame.from_dict(mydict)
to create my dataframe.

@jbrockmendel
Member

@PaulBremner can you post a copy/paste-able example? I can't reproduce the error using any of the examples in this thread.

@PaulBremner

@PaulBremner can you post a copy/paste-able example? I can't reproduce the error using any of the examples in this thread.

A simplified example that generates the error. I am using Python 3.9, pandas 1.2.4, numpy 1.19.5 (though any combination of numpy and pandas versions I tried has the same result).

anovainput = {'object':[0.1,0.2,0.3,0.4]}
df = pd.DataFrame.from_dict(anovainput)
print(df.dtype)

@jbrockmendel
Member

You want df.dtypes, not df.dtype. I think this is different than the issue in the OP.

@PaulBremner

PaulBremner commented May 29, 2021

You want df.dtypes, not df.dtype. I think this is different than the issue in the OP.

No, I definitely want dtype. When I try to run an ANOVA on my data, it throws the error message mentioned by the OP (AttributeError: type object 'object' has no attribute 'dtype'). My example code just uses the print statement to trigger the error message; I don't actually want to print anything.

@jbrockmendel
Member

No I definitely want dtype.

Thanks for clarifying. When I copy/paste the snippet I get AttributeError: 'DataFrame' object has no attribute 'dtype' which is what we'd expect.

@PaulBremner

PaulBremner commented Jun 2, 2021

No I definitely want dtype.

Thanks for clarifying. When I copy/paste the snippet I get AttributeError: 'DataFrame' object has no attribute 'dtype' which is what we'd expect.

Is there a work around for this bug? As far as I can see I have tried the solutions suggested above of rolling back numpy. I have tested:

pandas==0.24.2 (python 3.6)
pandas==0.25.3 (python 3.6)
pandas==1.0.5 (python 3.9)
pandas==1.2.4 (python 3.9)
with numpy==1.19.5 and 1.20.3 (and just as a stab in the dark test numpy==1.17.5)

Nothing seems to work, I still get the attribute error.

@jreback
Contributor

jreback commented Jun 2, 2021

.dtype is NOT a property on a DataFrame but on a Series
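To make the distinction concrete, here is a short sketch using the same data as the snippet posted earlier in the thread (it assumes pandas is installed):

```python
import pandas as pd

df = pd.DataFrame({"object": [0.1, 0.2, 0.3, 0.4]})

# A DataFrame exposes .dtypes: one dtype per column, returned as a Series.
print(df.dtypes)

# A single column is a Series, and a Series has .dtype.
print(df["object"].dtype)

# The frame itself has no .dtype attribute (and no column named "dtype").
print(hasattr(df, "dtype"))
```

Accessing `df.dtype` therefore raises an AttributeError on any pandas/NumPy combination, which is unrelated to the incompatibility this issue tracks.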

@PaulBremner

.dtype is NOT a property on a DataFrame but on a Series

I did not know that. Though I am still none the wiser about how to get my code to work.

@jbrockmendel
Member

I did not know that. Though I am still none the wiser about how to get my code to work.

If you didn't know that, then we should revisit the idea that you should use df.dtypes

@nehaljwani

Isn't it the case that the Pandas dev team simply needs to backport this commit to the 0.24.x (or any version older than 1)? I did it locally and it seems to work. I'm surprised that this important fix wasn't mentioned in https://pandas.pydata.org/docs/whatsnew/v1.0.5.html

@seberg
Contributor

seberg commented Oct 13, 2021

Yes, probably that is all that is needed. Note that pandas fixed it before NumPy made the upstream change that turned the bug into a serious issue rather than a small one. My guess is that 0.24.x reached end-of-life before the change in NumPy even happened.

@nehaljwani

Many end-users are stuck on pandas 0.24.x because of msgpack. It might be worthwhile to cut one patch release with this backport.

@jreback
Contributor

jreback commented Oct 15, 2021

closing - we do not backport to older branches that are this old
