MiniBatchKMeans normalizes sample weights for each minibatch #16535

Closed
jeremiedbb opened this issue Feb 24, 2020 · 3 comments

@jeremiedbb
Member

In this line of the minibatch step of MiniBatchKMeans,

nearest_center, inertia = _labels_inertia(X, sample_weight, x_squared_norms, centers)

X is a minibatch, and internally _labels_inertia calls _check_normalize_sample_weight, which normalizes the sample weights according to the minibatch rather than to the full X.

The weights should be normalized once, according to the full X, at the beginning, and not again afterwards.
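
To illustrate why this matters, here is a minimal NumPy sketch. It assumes that normalization rescales the weights so they sum to the number of samples; `normalize_weights` is a hypothetical stand-in written for this example, not the scikit-learn helper itself, and the data are made up.

```python
import numpy as np


def normalize_weights(w):
    """Hypothetical helper: rescale weights so that they sum to len(w)."""
    w = np.asarray(w, dtype=float)
    return w * (len(w) / w.sum())


# Full data set of 4 samples; the last two carry 10x the weight of the first two.
sample_weight = np.array([1.0, 1.0, 10.0, 10.0])

# Option 1: normalize once against the full data, then slice into minibatches.
global_w = normalize_weights(sample_weight)
batch_a_global = global_w[:2]
batch_b_global = global_w[2:]

# Option 2: normalize each minibatch separately (the behavior described above).
batch_a_local = normalize_weights(sample_weight[:2])
batch_b_local = normalize_weights(sample_weight[2:])

# Relative total weight of the second minibatch versus the first.
ratio_global = batch_b_global.sum() / batch_a_global.sum()
ratio_local = batch_b_local.sum() / batch_a_local.sum()

print(ratio_global, ratio_local)
```

With a single normalization the second minibatch keeps ten times the total weight of the first. With per-minibatch normalization both minibatches end up with the same total weight, so the relative importance of samples across minibatches is lost.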

I'm working on it.

@jeremiedbb
Member Author

Not normalizing sample weights, as discussed in #16594, will automatically fix that.

@ogrisel
Member

ogrisel commented Jul 22, 2020

@jeremiedbb it seems that we can close this now that #17848 has been merged. Do you agree?

@jeremiedbb
Member Author

Right, thanks
