use predict instead of transform #88

Open
gdkrmr opened this issue Jan 22, 2019 · 8 comments

Comments

@gdkrmr
Contributor

gdkrmr commented Jan 22, 2019

Why does MultivariateStats use transform rather than predict, like the rest of the stats ecosystem in Julia?

@wildart
Collaborator

wildart commented Jan 23, 2019

I'm not sure, but @lindahua definitely knows why. After some digging, this may be relevant: JuliaStats/Roadmap.jl#4 (comment)

My understanding is that transform is used in a dimensionality reduction context, and predict in other cases.

@nalimilan
Member

This package is very old, so it could make sense to rename some methods now that the ecosystem has evolved. If that's possible, using predict would be nice as it would avoid clashes with other packages (like data management packages).

@rofinn
Member

rofinn commented Jan 24, 2019

FWIW, I think transform makes more sense for dimensionality reduction, but it probably shouldn't be exported. Maybe StatsBase should define it and MultivariateStats could extend it?

@nalimilan
Member

I'd be fine with defining it but not exporting it. I guess it could be defined in StatsBase, but only if other stats packages use it.

It's the kind of situation where I wish there was a transform function in Base. Then packages could override it with their custom functions without any risk of ambiguity.

@gdkrmr
Contributor Author

gdkrmr commented Jan 24, 2019

The new AbstractDataTransform in StatsBase.jl defines transform but does not export it. JuliaStats/StatsBase.jl@641236d

@wildart
Collaborator

wildart commented Jan 27, 2019

For some methods transform comes with an inverse operation, i.e. reconstruct. Not a perfect name, but it gives a hint as to the appropriate action (in scikit-learn, it's called inverse_transform). It would be hard to come up with a suitable name for the inverse of predict.

@gdkrmr
Contributor Author

gdkrmr commented Jan 27, 2019

transform -- reconstruct
predict -- reconstruct
I don't see the issue :-)

@ablaom

ablaom commented Feb 22, 2019

Names are important. I struggled for some hours to make sense of the MLR documentation because I could not understand what the trafo method was supposed to be. Once I realised that trafo meant transform, everything became infinitely clearer.

Personally, I find it helpful if the name is suggestive of function. A PCA projection will never be a "prediction" in my mind. Define a transform method and overload predict if you want to.

I do like the reconstruct suggestion, especially because the inverse_transform is sometimes only approximate (e.g., discretization) or is only a one-sided inverse. I think we will switch to that in MLJ. :-)
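
To make the "only approximate" and "one-sided inverse" point concrete, here is a minimal sketch in Python of an equal-width discretizer. The `Discretizer` class and its parameters are hypothetical and written only for illustration; this is not the MLJ, MultivariateStats, or scikit-learn API. It only borrows the `transform` / `reconstruct` names from the discussion above.

```python
class Discretizer:
    """Hypothetical equal-width discretizer (illustration only)."""

    def __init__(self, lo, hi, n_bins):
        self.lo = lo
        self.n_bins = n_bins
        self.width = (hi - lo) / n_bins

    def transform(self, x):
        # Map a real value to a bin index in 0..n_bins-1,
        # clamping inputs that fall outside [lo, hi).
        k = int((x - self.lo) // self.width)
        return min(max(k, 0), self.n_bins - 1)

    def reconstruct(self, k):
        # Map a bin index back to the midpoint of its bin.
        # The original value inside the bin is lost.
        return self.lo + (k + 0.5) * self.width


d = Discretizer(lo=0.0, hi=10.0, n_bins=5)

x = 3.7
k = d.transform(x)        # bin index for 3.7
x_hat = d.reconstruct(k)  # midpoint of that bin, not 3.7 itself

# One direction round-trips exactly: every bin index survives.
round_trip_ok = all(d.transform(d.reconstruct(j)) == j for j in range(d.n_bins))
```

In this sketch `transform(reconstruct(k))` returns `k` for every bin, while `reconstruct(transform(x))` only returns a value within half a bin width of `x`. That asymmetry is one reason a name like reconstruct may read more honestly than inverse_transform.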

5 participants