Error while applying .transform() #117

nico695 · 2021-08-04T16:22:28Z

This is the same error that was documented in #56.

I tried downgrading to version 0.7.0 using the repository linked in that thread, but I still get the same dimensionality error.

Here is the code:

import numpy as np
import pandas as pd
from prince import FAMD

# Two numeric columns (A, B) and four categorical columns (C-F), each with categories a-e
X_n = pd.DataFrame(data=np.random.rand(10000, 2), columns=list('AB'))
X_c = pd.DataFrame(np.random.choice(list('abcde'), size=(10000, 4), replace=True), columns=list('CDEF'))
X = pd.concat([X_n, X_c], axis=1)

famd = FAMD(n_components=6, n_iter=100)
famd.fit(X)

# Transforming a 9-row subset raises the error below
famd.transform(X.iloc[1:10, :])

I got the same error in versions 0.7.0 and 0.7.1:

ValueError: shapes (9,20) and (22,6) not aligned: 20 (dim 1) != 22 (dim 0)

christophe-williams · 2021-09-16T01:09:57Z

I've run into this issue a few times, and it looks like it comes from how dummies are generated in _build_X_global. When the dataset you are transforming does not contain every category of the categorical variables seen in the larger original dataset, the resulting dummified dataset has fewer columns (in this case, 20 rather than 22).
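The effect is easy to reproduce with pandas alone. The sketch below assumes the dummies are built with something equivalent to pandas.get_dummies; it does not call prince, and the toy frames `full` and `subset` are made up for illustration.

```python
import pandas as pd

# Toy data: two categorical columns, each with categories a-e
full = pd.DataFrame({"C": list("abcde") * 2, "D": list("edcba") * 2})

# The first three rows only contain a, b, c in C and e, d, c in D
subset = full.iloc[:3]

full_dummies = pd.get_dummies(full)
subset_dummies = pd.get_dummies(subset)

# One dummy column per (variable, category) pair that actually occurs
print(full_dummies.shape[1])    # 10
print(subset_dummies.shape[1])  # 6

# Columns the subset never produces, hence the shape mismatch downstream
missing = full_dummies.columns.difference(subset_dummies.columns)
print(list(missing))            # ['C_d', 'C_e', 'D_a', 'D_b']
```

Any matrix product that expects the 10 fitted columns will fail on the 6-column subset, which is the same kind of mismatch as the 20 vs 22 in the traceback.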

My suggested fix for this (and for #56 and #116) is to store the dummified columns in the famd and mfa models. If a new dataset being transformed only has a subset of the categorical values, its dummified dataset would then still have the right number of columns, with one or more columns being all zeroes. If a new dataset being transformed has new categorical values, it should probably raise an error.
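As a rough sketch of that idea, assuming pandas.get_dummies is used for the encoding: `DummyEncoder` below is a hypothetical helper written for this comment, not part of prince, and it only shows the column bookkeeping, not how it would be wired into FAMD or MFA.

```python
import pandas as pd


class DummyEncoder:
    """Hypothetical helper (not part of prince): remembers the dummy columns seen at fit time."""

    def fit(self, X):
        # Store the full set of dummy columns produced by the training data
        self.columns_ = pd.get_dummies(X, dtype=float).columns
        return self

    def transform(self, X):
        dummies = pd.get_dummies(X, dtype=float)
        # Categories that never appeared during fit cannot be projected
        unseen = dummies.columns.difference(self.columns_)
        if len(unseen) > 0:
            raise ValueError(f"Unseen categories at transform time: {list(unseen)}")
        # Add the missing columns as all zeroes and restore the fitted column order
        return dummies.reindex(columns=self.columns_, fill_value=0.0)


train = pd.DataFrame({"C": list("abcde")})
enc = DummyEncoder().fit(train)

# Only categories a and b are present, but all five columns come back
out = enc.transform(train.iloc[:2])
print(out.shape)  # (2, 5)
```

With this, a subset like X.iloc[1:10, :] would always be encoded to the same width as the training data, and an unknown category fails early with a clear message instead of a shape error.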

sibmike · 2021-09-28T20:52:04Z

I had the same issue, so before fitting MCA I had to make sure that my train, validation, and test sets each contain every category of every categorical variable, and drop the columns where they don't:

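# Keep only the categorical columns whose set of categories
# is identical in all three splits
keep = []
for clmn in X_train_cat.columns:
    train_cats = set(X_val_train_cat[clmn].unique())
    val_cats = set(X_val_test_cat[clmn].unique())
    test_cats = set(X_test_cat[clmn].unique())
    # Chained comparison: True only if all three sets are equal
    keep.append(train_cats == val_cats == test_cats)

# Boolean mask over the column index
keep_columns = X_train_cat.columns[keep]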

But that's obviously an awkward temp solution, just to make it work. The dummy matrix workaround @christophe-williams mentioned would be nice to have.

MaxHalford · 2023-02-27T11:45:15Z

Hello there 👋

I apologise for not answering earlier. I was not maintaining Prince anymore. However, I have just refactored the entire codebase. This refactoring should have fixed many bugs.

I don’t have time and energy to check if this fixes your issue, but there is a good chance it does. Feel free to reopen this issue if the problem persists after installing the new version — that is, version 0.8.0 and onwards.

MaxHalford closed this as completed Feb 27, 2023
