[MRG+1] Feature: Implement PowerTransformer #10210
What does this implement/fix? Explain your changes.
This PR implements PowerTransformer.
At the moment, only the Box-Cox transform is supported, which requires strictly positive data. The optimal parameters for stabilizing variance and minimizing skewness are determined using maximum likelihood, and the transformation is applied to the dataset feature-wise.
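For readers who want to see the idea in isolation, here is a minimal standard-library sketch of feature-wise Box-Cox with a maximum-likelihood lambda. It is not the code in this PR: the helper names (`boxcox`, `boxcox_loglik`, `fit_lambda`, `fit_transform_columns`), the golden-section search, and the search interval of [-2, 2] are all assumptions made for illustration, and the search assumes the profile log-likelihood has a single peak in that interval.

```python
import math
import random


def boxcox(x, lmbda):
    """Apply the Box-Cox transform to a list of strictly positive values."""
    if any(v <= 0 for v in x):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lmbda) < 1e-12:
        # Limit of (x**lmbda - 1) / lmbda as lmbda goes to 0.
        return [math.log(v) for v in x]
    return [(v ** lmbda - 1.0) / lmbda for v in x]


def boxcox_loglik(x, lmbda):
    """Profile log-likelihood of lmbda, up to an additive constant."""
    n = len(x)
    y = boxcox(x, lmbda)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # MLE variance (divides by n)
    return -0.5 * n * math.log(var) + (lmbda - 1.0) * sum(math.log(v) for v in x)


def fit_lambda(x, lo=-2.0, hi=2.0, tol=1e-6):
    """Maximize the log-likelihood over [lo, hi] by golden-section search.

    The interval and tolerance are illustrative assumptions.
    """
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if boxcox_loglik(x, c) > boxcox_loglik(x, d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


def fit_transform_columns(rows):
    """Fit one lambda per column (feature) and transform each column with it."""
    columns = [list(col) for col in zip(*rows)]
    lambdas = [fit_lambda(col) for col in columns]
    transformed = [boxcox(col, lam) for col, lam in zip(columns, lambdas)]
    return lambdas, [list(row) for row in zip(*transformed)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two strictly positive, right-skewed features (synthetic data).
    rows = [[math.exp(rng.gauss(0, 1)), rng.gauss(5, 1) ** 2] for _ in range(500)]
    lambdas, _ = fit_transform_columns(rows)
    print("estimated lambdas per feature:", lambdas)
```

The first synthetic feature is log-normal, so its estimated lambda should land close to 0, which corresponds to a plain log transform.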
Any other comments?
We will consider implementing the Yeo-Johnson transform - a power transformation that can be applied to negative data - in a future PR.
Thanks to @maniteja123 for kicking it off!
There are a couple of other small things @glemaitre requested that are unaddressed as far as I can tell.…
On 5 December 2017 at 18:32, Eric Chang wrote: Thanks for the doc fix @glemaitre. Good suggestion on normalizing the distributions, Joel - I used minmax_scale(X, feature_range=(1e-10, 10)).
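To make the rescaling step concrete: Box-Cox needs strictly positive inputs, and mapping each feature onto a range whose lower bound is a tiny positive number guarantees that. The sketch below is a pure-Python stand-in for what `minmax_scale(X, feature_range=(1e-10, 10))` does to a single column; the helper name `minmax_scale_column` is hypothetical, and the handling of a constant column is an assumption.

```python
def minmax_scale_column(values, feature_range=(1e-10, 10.0)):
    """Linearly rescale one feature so its min and max hit feature_range."""
    lo, hi = feature_range
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # Constant column: map every value to the lower bound (assumption).
        return [lo for _ in values]
    return [(v - vmin) / span * (hi - lo) + lo for v in values]


if __name__ == "__main__":
    raw = [-3.0, 0.0, 5.0]  # contains non-positive values
    scaled = minmax_scale_column(raw)
    print(scaled)  # every entry is now strictly positive
```

After this step every value is at least 1e-10, so the data can be passed to a Box-Cox transform without violating the positivity requirement.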
Added the final tweaks.
@amueller, I think the 'skewness' vocabulary came from an earlier review. It's more of an empirical observation - the main purpose of Box-Cox is to make data normal and stabilize variance. Skewness does not necessarily imply higher variance, but it does imply nonnormality, so the description still makes sense, IMO.
edit: fixed flake8 error