Error: Dimension of X_train and y_train is not the same ! #76

Open
ReemMOwn opened this issue Sep 11, 2023 · 2 comments

Comments


ReemMOwn commented Sep 11, 2023

I get this error when I try to use any sampler from smote_variants. My binary dataset has 30 input features and one output.
X_train is an ndarray with shape (227845, 30)
y_train is an ndarray with shape (227845, 1)

/usr/local/lib/python3.10/dist-packages/smote_variants/oversampling/_mwmote.py in sampling_algorithm(self, X, y)
498 return self.return_copies(X, y, "Sampling is not needed")
499
--> 500 X_min = X[y == self.min_label]
501
502 nn_params= {**self.nn_params}

IndexError: boolean index did not match indexed array along dimension 1; dimension is 30 but corresponding boolean dimension is 1

Here's a sample of my code:
X_train, X_test, y_train, y_test = split_data(df, 0.2)
import smote_variants as sv
sampler = sv.MWMOTE()
X_resampled, y_resampled = sampler.sample(X_train, y_train)

gykovacs (Member) commented

Thank you for raising this, I'll look into it.


gykovacs commented Oct 2, 2023

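I think the problem is that your y_train should be a 1D array of shape (227845,), not a 2D array of shape (227845, 1). With a 2D y, the mask y == self.min_label is also 2D, so indexing X with it tries to match the mask's second dimension (1) against X's second dimension (30), which is what the IndexError reports.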
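
As a minimal sketch of that suggestion, the snippet below reproduces the indexing failure with NumPy alone and then flattens the labels. The small arrays are stand-ins for the data in the report (6 samples instead of 227845), and the label value 1 is assumed to play the role of the minority label. It does not call smote_variants itself.

```python
import numpy as np

# Stand-in data (assumption): 6 samples, 30 features, labels stored as a column.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 30))
y_train = np.array([[0], [0], [0], [0], [1], [1]])  # shape (6, 1), as in the report

# A 2D boolean mask of shape (6, 1) cannot index a (6, 30) array:
# NumPy raises IndexError, as in the traceback above.
try:
    X_train[y_train == 1]
    mask_error = None
except IndexError as exc:
    mask_error = str(exc)

# Flatten the labels to 1D so the mask selects whole rows.
y_flat = y_train.ravel()        # shape (6,)
X_min = X_train[y_flat == 1]    # rows with label 1, shape (2, 30)
```

Applied to the code in the report, this would presumably mean passing the flattened labels, for example sampler.sample(X_train, y_train.ravel()). If y_train comes from a pandas object, converting it with to_numpy() before ravel() should give the same 1D shape.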
