MCA : 'SparseDataFrame' object has no attribute 'to_numpy' #62

Closed
kaustubrao opened this issue Apr 23, 2019 · 10 comments

Comments

@kaustubrao

I'm trying to run the following code:

import prince
mca = prince.MCA(n_components=10,n_iter=5,copy=True,check_input=True,engine='auto',random_state=10)
mca_model=mca.fit(df_sample)

The problem is that OneHotEncoder has sparse set to True, so it returns a sparse dataframe, but the fit function contains the following code:

if isinstance(X, pd.DataFrame):
            X = X.to_numpy()

But to_numpy is not available on sparse dataframes, so I get the following error:

AttributeError: 'SparseDataFrame' object has no attribute 'to_numpy'

Not sure if this is a bug or if I'm doing something wrong.
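
For reference, here is a minimal sketch of one way to get a dense NumPy array out of a sparse indicator matrix on pandas 0.25 or later, where sparse data lives in regular DataFrame columns. It uses pd.get_dummies rather than prince's own OneHotEncoder, and the small df_sample below is a made-up stand-in for the real data, so treat it as an illustration of the conversion only, not as what prince does internally.

```python
import pandas as pd

# Hypothetical stand-in for df_sample: two categorical columns.
df_sample = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["S", "M", "M", "L"],
})

# One-hot encode into sparse-backed columns (a regular DataFrame, not a SparseDataFrame).
indicator = pd.get_dummies(df_sample, sparse=True)

# Densify through the .sparse accessor, then convert to a float NumPy array.
dense = indicator.sparse.to_dense().to_numpy(dtype=float)

print(dense.shape)
```

Each row of the resulting array has exactly one 1 per original column, which is the indicator-matrix shape that correspondence analysis expects.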

@MaxHalford
Owner

We could handle SparseDataFrames but as far as I know the pandas team is planning on deprecating them in the next version...

@kaustubrao
Author

Just to clarify, my input dataframe df_sample is not sparse. It is prince's OneHotEncoder that returns a sparse dataframe:

https://github.com/MaxHalford/prince/blob/master/prince/one_hot.py

@MaxHalford
Owner

Yep, got it.

I'm on holiday at the moment, but I'll work on it when I get back in ~10 days.

@MaxHalford
Owner

Can you please provide me with a fully working example?

@MaxHalford
Owner

Ping @kaustubrao

@DerpMind

DerpMind commented Feb 5, 2020

Hi Max,
I got the same error when trying to fit an MCA on my data.
With pandas 1.0 the error now reads: "TypeError: SparseDataFrame() takes no arguments"
Best regards
Lorenz

@DerpMind

DerpMind commented Feb 5, 2020

I created a reproducible example, if you find time to check it out:

### Create a dataset with categorical variables
import pandas as pd
import prince
# Note: load_boston was removed in scikit-learn 1.2
from sklearn.datasets import load_boston
data = load_boston()
df = pd.DataFrame(data["data"], columns=data["feature_names"])
df = df[["CHAS", "AGE"]]

df["AGE"] = df["AGE"].astype("float")
df.loc[df["AGE"]<25, "AGE"] = 1
df.loc[(df["AGE"]>=25) & (df["AGE"]<50), "AGE"] = 2
df.loc[df["AGE"]>=50, "AGE"] = 3

df["AGE"] = df["AGE"].astype("category")
df["CHAS"] = df["CHAS"].astype("category")


### Fit an MCA
mca = prince.MCA(
    n_components=2,
    n_iter=3,
    copy=True,
    check_input=True,
    engine='auto',
    random_state=42
)
mca = mca.fit(df)

TypeError                                 Traceback (most recent call last)
<ipython-input-63-8f63821dc142> in <module>
      7     random_state=42
      8 )
----> 9 mca = mca.fit(df)

~\AppData\Local\Continuum\anaconda3\envs\Basics\lib\site-packages\prince\mca.py in fit(self, X, y)
     26 
     27         # Apply CA to the indicator matrix
---> 28         super().fit(self.one_hot_.transform(X))
     29 
     30         # Compute the total inertia

~\AppData\Local\Continuum\anaconda3\envs\Basics\lib\site-packages\prince\one_hot.py in transform(self, X)
     33             columns=self.column_names_,
     34             index=X.index if isinstance(X, pd.DataFrame) else None,
---> 35             default_fill_value=0
     36         )

TypeError: SparseDataFrame() takes no arguments
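
The traceback points at a SparseDataFrame constructor call, which pandas 1.0 no longer supports. As a rough sketch of the replacement pandas offers since 0.25, a SciPy sparse matrix can be wrapped with DataFrame.sparse.from_spmatrix. The matrix, column names, and index below are invented for illustration, and this is not necessarily how prince addressed the issue.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical indicator matrix: 3 rows, 4 one-hot columns.
matrix = sparse.csr_matrix(np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]))
column_names = ["CHAS_0.0", "CHAS_1.0", "AGE_1.0", "AGE_2.0"]  # assumed names
index = pd.Index(["a", "b", "c"])

# Build a regular DataFrame whose columns are sparse arrays.
one_hot = pd.DataFrame.sparse.from_spmatrix(
    matrix, index=index, columns=column_names
)

# Unlike the legacy SparseDataFrame, the result can be densified and converted.
dense = one_hot.sparse.to_dense().to_numpy()
```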

@MaxHalford
Owner

Hey @DerpMind, prince isn't intended to work with pandas 1.0 yet. It will in the next release.

@DerpMind

DerpMind commented Feb 5, 2020

Hi Max, it also fails with pandas 0.25.3, this time with the original error:
'SparseDataFrame' object has no attribute 'to_numpy'

@DerpMind

DerpMind commented Feb 5, 2020

No, sorry. You are right - I got it working now with an older version of pandas.
Thank you!
