TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type #18355

Closed
MahendraCherukupalli opened this issue Feb 7, 2021 · 29 comments

Comments

@MahendraCherukupalli

Reproducing code example:

import numpy as np
import pandas as pd
df = pd.read_csv('link')

Calling df.info() or df.describe() raises "TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type", and plotting with df.plot() gives the same error.

Error message:

NumPy/Python version information:

@seberg
Member

seberg commented Feb 7, 2021

There is an unfortunate incompatibility between old pandas versions and NumPy 1.20. Updating pandas to a newer version should fix it, see also pandas-dev/pandas#39520 (comment)

Updating to pandas>=1.0.5 should solve it. Supposedly pandas==0.25.3 also works. If you are stuck with pandas 0.24.x, you may unfortunately not be able to use numpy 1.20.x. In that case, use numpy<1.20.
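
As a rough illustration of the version guidance above, the following sketch flags version pairs that are likely to hit this error. The helper names `release_tuple` and `looks_incompatible` are hypothetical, and the thresholds (NumPy 1.20 and pandas 1.0.5) come only from the reports in this thread, not from an official compatibility matrix. pandas 0.25.3 is treated as suspect here even though it supposedly works, which is a conservative assumption.

```python
import re


def release_tuple(version):
    """Leading numeric components of a version string, e.g. '1.20.2' -> (1, 20, 2)."""
    public = version.split("+")[0]  # drop any local version suffix
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def looks_incompatible(pandas_version, numpy_version):
    """Heuristic from this thread: NumPy >= 1.20 combined with pandas < 1.0.5."""
    return (
        release_tuple(numpy_version) >= (1, 20)
        and release_tuple(pandas_version) < (1, 0, 5)
    )


# Version pairs reported in this thread.
for pandas_version, numpy_version in [("0.24.2", "1.16.5"), ("1.0.1", "1.20.2")]:
    flagged = looks_incompatible(pandas_version, numpy_version)
    print(pandas_version, numpy_version, "suspect" if flagged else "probably fine")
```

A pair flagged as "probably fine" can still fail if the interpreter actually imports different versions than the ones you expect.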

@jessdwitch

@seberg Any idea why someone might be suddenly seeing this error when calling df.to_csv with pandas==0.24.2 and numpy==1.16.5? Based on what you're saying above, it seems like those should be in the "OK" range.

@seberg
Member

seberg commented Feb 8, 2021

@jessdwitch I am not sure if there are other (likely) ways to get this error, and I doubt it is possible with the above example (or a similar one, such as creating an empty dataframe IIRC).

Can you double check np.__version__ and pd.__version__ in your shell python console to make sure you are running the expected versions? Sometimes you might get multiple versions or so, which can quickly become very confusing.
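
A minimal sketch of such a check, meant to be run from the same script or console that raises the error. The helper name `describe_module` is hypothetical. Printing the file path next to the version helps spot a package that is imported from an unexpected location.

```python
import importlib
import sys


def describe_module(name):
    """Return (version, file path) for an importable module, or None if it is missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "unknown")
    return version, location


print("interpreter:", sys.executable)
for name in ("numpy", "pandas"):
    print(name, describe_module(name))
```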

@jessdwitch

Thank you for the quick reply! I did check those already, since there are multiple versions installed (numpy==1.20.0 and pandas==0.25.3 when conda is deactivated, and the versions noted above when the conda environment the script is running in is activated). I double checked the logs, and while I don't have the specific versions printing to the log, I do see the env activating and confirmed 0.24.2 and 1.16.5 were the ones installed to that env.

@seberg
Member

seberg commented Feb 8, 2021

Hmm, can you post an example and the full traceback? I am wondering if I mixed up the issues and this one was not related to 1.20.x at all.

To be honest, if you are not running 1.20.x, then searching/asking on pandas is more likely to be successful.

@jessdwitch

Gotcha. Tbh asking around pandas was my first thought, but this is literally the only google result with this error (well, this and an SO thread Mahendra created). Here's the traceback starting with the to_csv call. If that doesn't spark anything, I'll leave you be and pop in over there, but thank you regardless for taking a look.

  File "/root/miniconda/lib/python3.7/site-packages/annotation/summarize_annotation.py", line 159, in create_cluster_table
    pd.DataFrame(cluster_dict).to_csv(os.path.join(directory, 'report.csv'), index=False)
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/frame.py", line 411, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 257, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 82, in arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 323, in _homogenize
    val, index, dtype=dtype, copy=False, raise_cast_failure=False
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/internals/construction.py", line 712, in sanitize_array
    subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
  File "/root/miniconda/lib/python3.7/site-packages/pandas/core/dtypes/cast.py", line 1233, in construct_1d_arraylike_from_scalar
    subarr = np.empty(length, dtype=dtype)
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type

@seberg
Member

seberg commented Feb 8, 2021

Sorry, do you have a minimal reproducer, preferably printing out the versions? It does look like this issue, but in that case the pandas code in construct_1d_arraylike_from_scalar would include isinstance(dtype, (np.dtype, type(np.dtype))) or so (which is incorrect, because type(np.dtype) changed and was always weird).

Long story short, it looks a lot like this issue, but if you are really not on NumPy 1.20, type(np.dtype) is type and I am not aware of what else might be wrong.
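
The mechanism can be sketched roughly as follows. This is an illustration under assumptions, not the actual pandas code: it assumes that once the isinstance check fails for a scalar class such as np.float64, old pandas ends up reading `.dtype` from the class itself, which yields a descriptor rather than a dtype. The helper name `fails_as_dtype` is hypothetical.

```python
import numpy as np

# On NumPy < 1.20, type(np.dtype) is plain `type`, so a scalar class such as
# np.float64 passes isinstance(..., type(np.dtype)). On newer NumPy it is a
# different metaclass and the check no longer passes.
print(type(np.dtype))
print(isinstance(np.float64, type(np.dtype)))


def fails_as_dtype(obj):
    """Return True if NumPy refuses to interpret obj as a data type."""
    try:
        np.empty(3, dtype=obj)
    except TypeError as exc:
        print("TypeError:", exc)
        return True
    return False


# The scalar class itself is a valid dtype specification ...
print(fails_as_dtype(np.float64))
# ... but `.dtype` read from the class (not an instance) is a descriptor.
print(fails_as_dtype(np.float64.dtype))
```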

@jessdwitch

Hey sorry for disappearing, but you were totally right. For some reason the Conda environment was using the pandas from within the env, but the numpy from outside, causing the conflict. We ended up just downgrading the numpy from outside the env to match the one within the env and everything is happy now. Thank you so much!
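
For anyone hitting the same kind of mix-up, here is a small sketch that reports whether each package would be imported from inside the active environment. The helper names are hypothetical, and it assumes `sys.prefix` is the root of the environment you expect; packages in a user site directory would show up as outside.

```python
import importlib.util
import os
import sys


def module_origin(name):
    """Path a module would be imported from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def inside_prefix(path, prefix=sys.prefix):
    """True if path lies under prefix (the root of the active environment)."""
    path, prefix = os.path.realpath(path), os.path.realpath(prefix)
    try:
        return os.path.commonpath([path, prefix]) == prefix
    except ValueError:  # e.g. paths on different drives on Windows
        return False


for name in ("numpy", "pandas"):
    origin = module_origin(name)
    if origin is None:
        print(f"{name}: not found")
    else:
        print(f"{name}: {origin} (inside environment: {inside_prefix(origin)})")
```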

@mattip
Member

mattip commented Feb 11, 2021

Closing. Please reopen if needed.

@aarthim123

aarthim123 commented Feb 25, 2021

Hi, I tried updating pandas to 1.0.5 and I still get the same error message: TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type. I have numpy version 1.20.1.
Update: I updated pandas using conda update pandas, and now the error I get is: matplotlib is required for plotting when the default backend "matplotlib" is selected. Can someone guide me?

@mattip
Member

mattip commented Feb 26, 2021

@aarthim123 this issue is specifically about the error in the title. For problems with installing pandas via conda, please open an issue on an appropriate issue tracker or use a more appropriate forum. You might have better luck searching for similar error messages. If you do open an issue, you should probably report the output of conda list in order to untangle what you have installed and what you might need.

@MahendraCherukupalli
Author

@aarthim123 use pip install numpy==1.16.5

@RichardScottOZ

RichardScottOZ commented Apr 15, 2021

I just saw this:-

<class 'numpy.ndarray'> <class 'numpy.ndarray'>
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-14380fd16e5f> in <module>
     49         print(type(v_list[i]), type(face_list[i]) )
     50         #dataset = pd.DataFrame({'Group':'Neoprot-Ordovician','Surface': s, 'X': v_list[i][:, 0], 'Y': v_list[i][:, 1], 'Z': v_list[i][:, 2], 'SG': -1   })
---> 51         dataset = pd.DataFrame({'Group':'Neoprot-Ordovician','Surface': s, 'X': v_list[i][:, 0].astype(float)  })
     52         dataset["Name"] = metadata_list[i]["NAME"]
     53         dataset["CRS"] = str(metadata_list[i]["CRS"])

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    433             )
    434         elif isinstance(data, dict):
--> 435             mgr = init_dict(data, index, columns, dtype=dtype)
    436         elif isinstance(data, ma.MaskedArray):
    437             import numpy.ma.mrecords as mrecords

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\internals\construction.py in init_dict(data, index, columns, dtype)
    252             arr if not is_datetime64tz_dtype(arr) else arr.copy() for arr in arrays
    253         ]
--> 254     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    255 
    256 

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\internals\construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype)
     67 
     68     # don't force copy because getting jammed in an ndarray anyway
---> 69     arrays = _homogenize(arrays, index, dtype)
     70 
     71     # from BlockManager perspective

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\internals\construction.py in _homogenize(data, index, dtype)
    320                     val = dict(val)
    321                 val = lib.fast_multiget(val, oindex.values, default=np.nan)
--> 322             val = sanitize_array(
    323                 val, index, dtype=dtype, copy=False, raise_cast_failure=False
    324             )

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    463                 value = maybe_cast_to_datetime(value, dtype)
    464 
--> 465             subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
    466 
    467         else:

~\AppData\Local\Continuum\anaconda3\envs\gemgis\lib\site-packages\pandas\core\dtypes\cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
   1459                 value = ensure_str(value)
   1460 
-> 1461         subarr = np.empty(length, dtype=dtype)
   1462         subarr.fill(value)
   1463 

TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type

@RichardScottOZ

pandas 1.0.1 py38he350917_0 conda-forge
numpy 1.20.2 py38h09042cb_0 conda-forge

@seberg
Member

seberg commented Apr 16, 2021

pandas 1.0.1 py38he350917_0 conda-forge
numpy 1.20.2 py38h09042cb_0 conda-forge

Please just update your pandas. If you want to stay on 1.0.x, that is fine: 1.0.5 is just a bug-fix release that was reported to also include a fix for this. If 1.0.5 is not sufficient to fix it, let us know so others can find a solution more easily.
The pandas version you list is clearly mentioned as incompatible in the pandas issue.

If you don't want to upgrade pandas for whatever reason, you may have to stick to NumPy 1.19.x as well.

@RichardScottOZ

For reference if someone else comes across it - it happened due to installing something else downgrading the pandas version.

@bronxbear

pip install pandas --upgrade

@MigueL71994

pip install --upgrade numpy
pip install --upgrade pandas

@dinhtrang24

Hi, I've already upgraded both pandas (0.24.2) and numpy (1.21.5). When I try data.info(), it still doesn't work. Any thoughts?

`---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
----> 1 data.info()

/opt/apps/apps/binapps/anaconda3/2019.07/lib/python3.7/site-packages/pandas/core/frame.py in info(self, verbose, buf, max_cols, memory_usage, null_counts)
2503 self.index._is_memory_usage_qualified()):
2504 size_qualifier = '+'
-> 2505 mem_usage = self.memory_usage(index=True, deep=deep).sum()
2506 lines.append("memory usage: {mem}\n".format(
2507 mem=_sizeof_fmt(mem_usage, size_qualifier)))

/opt/apps/apps/binapps/anaconda3/2019.07/lib/python3.7/site-packages/pandas/core/frame.py in memory_usage(self, index, deep)
2597 if index:
2598 result = Series(self.index.memory_usage(deep=deep),
-> 2599 index=['Index']).append(result)
2600 return result
2601

/opt/apps/apps/binapps/anaconda3/2019.07/lib/python3.7/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
260 else:
261 data = sanitize_array(data, index, dtype, copy,
--> 262 raise_cast_failure=True)
263
264 data = SingleBlockManager(data, index, fastpath=True)

/opt/apps/apps/binapps/anaconda3/2019.07/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
640
641 subarr = construct_1d_arraylike_from_scalar(
--> 642 value, len(index), dtype)
643
644 else:

/opt/apps/apps/binapps/anaconda3/2019.07/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
1185 value = to_str(value)
1186
-> 1187 subarr = np.empty(length, dtype=dtype)
1188 subarr.fill(value)
1189

TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
`

@jtlz2

jtlz2 commented Feb 15, 2022

Similarly - this is still an issue for me

pandas-1.0.1
numpy-1.21.5

The low pandas version is being enforced by the numpy version.

It does work if I enforce

pip install "pandas>=1.3" though

@seberg
Member

seberg commented Feb 15, 2022

The last time I saw someone who updated their pandas version and still had the problem, they had an environment issue that made them pick up the wrong pandas version after all. Could you please double check pandas.__version__ and numpy.__version__ in whatever you are running (e.g. the script itself)?

@levalencia

levalencia commented Feb 16, 2022

I am having the same issue.
On a new compute instance in Azure ML, I am using Python 3.8 Kernel.

I checked the versions:
pandas 0.25.3
numpy 1.22.2

.describe gives the same error;
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
I also tried:
scaled_df.select_dtypes(include=[np.number])

same error:
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type

What should I do?

Update:
I don't understand why I get different versions when I run pip freeze:
numpy==1.18.5
pandas==1.1.5

@dinhtrang24

dinhtrang24 commented Mar 10, 2022

The last time I saw someone who updated their pandas version and still had the problem, they had an environment issue that made them pick up the wrong pandas version after all. Could you please double check pandas.__version__ and numpy.__version__ in whatever you are running (e.g. the script itself)?

Yes. I have checked both script and terminal again.

pandas == 0.24.2
numpy == 1.21.5

The error doesn't disappear

@seberg
Member

seberg commented Mar 10, 2022

@dinhtrang24 that pandas version is known to be too old. I had asked Luis, because I thought 0.24.3 may be new enough (I am not quite sure). You have to either upgrade pandas, since it is an old version, or, if you are stuck with such an old pandas version, downgrade NumPy.
(Or I suppose apply a patch to pandas, but unless you have very concrete reasons for using that pandas version, you should upgrade.)

@dinhtrang24

@seberg Hi, thanks for the reply. I figured out the problem is that I have installed different versions of pandas and numpy using pip and pip3. After upgrading and matching their versions, the problem was solved.
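
One way to avoid a pip/pip3 mismatch is to invoke pip through the interpreter that actually runs the code. The sketch below assumes pip is installed for that interpreter, and the helper name `pip_command` is hypothetical.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", *args]


# Show which pip (and which site-packages directory) belongs to this interpreter.
result = subprocess.run(pip_command("--version"), capture_output=True, text=True)
print(result.stdout or result.stderr)
```

Running `pip_command("install", "--upgrade", "pandas")` through `subprocess.run` in the same way would then upgrade pandas for that interpreter, rather than for whichever environment a bare `pip` or `pip3` on the PATH points to.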

marota added a commit to marota/ChroniX2Grid that referenced this issue Mar 14, 2022
Solving Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type from
numpy/numpy#18355
@abhijit-z

abhijit-z commented Apr 20, 2022

Hi
I am facing the same error: TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
when I use missingno library.
I checked the versions of my pandas and numpy respectively:
0.25.3
1.22.3

I am not sure what to do and how to get rid of this on Azure ML studio.
Can anyone please guide me here? Thanks

@levalencia

Hi I am facing the same error: TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type when I use missingno library. I checked the versions of my pandas and numpy respectively: 0.25.3 1.22.3

I am not sure what to do and how to get rid of this on Azure ML studio. Can anyone please guide me here? Thanks

The problem I have seen in Azure Notebooks is the way we install dependencies, which is not properly documented.
I figured this out the hard way, through trial and error.

The best thing is:

  1. Create a conda environment
  2. Install the required packages
  3. Create a kernel

Then in notebooks select the kernel.

%pip install --> installs on current running kernel
!pip install --> installs on global environment

More info here. Do it from scratch and please reply if it helped

https://medium.com/@luisevalencia/how-to-properly-manage-azure-notebooks-kernels-and-conda-environments-b0862f3eca51

@abhijit-z

Thanks for prompt response @levalencia
I tried to run the following command on terminal in Azure
$ conda create — name GPSAnalysis
But it says:
WARNING: A conda environment already exists at '/anaconda/envs/azureml_py38'
Remove existing environment (y/[n])?
I should not delete azureml_py38 env, then what should I do?

@levalencia

Check the official conda documentation, maybe a syntax error:
https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
