standardize data before pca #86
Comments
Usually, you would standardize the data before fitting rather than mixing the two operations together. FYI, there is a data transformation PR pending: JuliaStats/StatsBase.jl#85 |
What about the centering step, are you going to remove that? |
You should probably allow something like:
x = rand(3, 10)
z = fit(ZScoreTransform, x)
p = fit(PCA, transform(z, x))
transform((z, p), xnew)
or maybe even make the objects callable: xnew |> z |> p, and for the reconstruction: y |> inv(p) |> inv(z) |
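The callable-object idea above could look roughly like the following. This is only a sketch in plain Julia: `ZScore`, `InverseOf`, and `fitzscore` are hypothetical names made up for illustration, not types or functions from StatsBase or MultivariateStats, and only the standardization step is shown.

```julia
using Statistics

# Hypothetical z-score transform (NOT the StatsBase type): one mean and one
# standard deviation per variable, with variables stored in rows.
struct ZScore
    μ::Vector{Float64}
    σ::Vector{Float64}
end

# Hypothetical wrapper so that `inv(z)` is itself a callable object.
struct InverseOf{T}
    parent::T
end

# Hypothetical fitting helper: estimate per-row mean and standard deviation.
fitzscore(x::AbstractMatrix) = ZScore(vec(mean(x, dims=2)), vec(std(x, dims=2)))

# Make the objects callable so they can be chained with |>.
(z::ZScore)(x::AbstractMatrix) = (x .- z.μ) ./ z.σ
(i::InverseOf{ZScore})(y::AbstractMatrix) = y .* i.parent.σ .+ i.parent.μ

Base.inv(z::ZScore) = InverseOf(z)

x = rand(3, 10)
z = fitzscore(x)
xstd = x |> z            # standardize
xback = xstd |> inv(z)   # undo the standardization
```

A PCA model would need the same two callable methods for `xnew |> z |> p` and `y |> inv(p) |> inv(z)` to work end to end.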
No, not really. The transformation and reconstruction go as follows:
x = rand(3, 10)
# standardize data
T = fit(ZScoreTransform, x)
xstd = transform(T, x)
# perform PCA
M = fit(PCA, xstd)
ysub = transform(M, xstd)
# reconstruct to original space
ystd = reconstruct(M, ysub)
# reconstruct to original scale
y = reconstruct(T, ystd)
x ≈ y
Even though pipeline |
This seems like an arbitrary choice. |
Currently there is no way to standardize data when doing a PCA; the data can only be centered.
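To illustrate why the distinction matters, here is a minimal sketch using only Julia's standard libraries (`Statistics`, `LinearAlgebra`), not the MultivariateStats `PCA` type. The toy data and variable names are assumptions made up for the example. When one variable has a much larger scale, centering alone lets it dominate the leading component, while standardizing puts all variables on an equal footing.

```julia
using LinearAlgebra, Statistics

# Toy data (an assumption for illustration): 3 variables in rows, 200
# observations in columns; the first variable has a much larger scale.
x = randn(3, 200) .* [100.0, 1.0, 1.0]

# Centering only: subtract the per-variable mean.
xc = x .- mean(x, dims=2)

# Standardizing: additionally divide by the per-variable standard deviation.
xs = xc ./ std(x, dims=2)

# Principal directions are eigenvectors of the covariance matrix.
# With observations in columns, dims=2 yields a 3×3 matrix.
cov_centered = cov(xc, dims=2)
cov_standard = cov(xs, dims=2)

# For a Symmetric matrix, eigen returns the eigenvalues in ascending order.
λc = eigen(Symmetric(cov_centered)).values
λs = eigen(Symmetric(cov_standard)).values

# Share of the total variance captured by the leading component.
share_centered = λc[end] / sum(λc)
share_standard = λs[end] / sum(λs)
```

With centering only, `share_centered` comes out close to 1 because the large-scale variable dominates; after standardizing, the variance is spread far more evenly across components.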