
PR: inverse_transform is implemented for scikit-learn utility #12

Merged
merged 2 commits into hagax8:master on Aug 22, 2020

Conversation

sshojiro
Contributor

@sshojiro sshojiro commented May 1, 2020

Hello,

I have found that there is no inverse_transform method, such as the one PCA provides in scikit-learn.
The new inverse_transform method in this PR maps responsibilities back onto the original data space.

The short sample code is here:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from ugtm.ugtm_sklearn import eGTM
X,y = load_iris(return_X_y=True)
Xtrain, Xtest = train_test_split(X, test_size=0.20,
                random_state=42)
model = eGTM(model='responsibilities')
_ = model.fit_transform(Xtrain)
matR = model.transform(Xtest)
Xhat = model.inverse_transform(matR)
print("original data", Xtest.shape)
print("projected data", Xhat.shape)
# original data (30, 4)
# projected data (30, 4)
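To illustrate what such an inverse mapping does conceptually, here is a minimal NumPy sketch (this is an assumed illustration, not ugtm's actual implementation): in GTM, transform() with model='responsibilities' yields a matrix R of shape (n_samples, n_nodes), and a reconstruction can be formed as the responsibility-weighted average of the node centers mapped into data space.

```python
import numpy as np

# Hypothetical sketch: reconstructing data-space points from a
# responsibility matrix. The node centers here are random stand-ins;
# in GTM they come from the learned latent-to-data mapping.
rng = np.random.default_rng(42)
n_samples, n_nodes, n_features = 30, 256, 4

# Node centers in data space (assumed, for illustration only).
node_centers = rng.normal(size=(n_nodes, n_features))

# Responsibilities: each row is a probability distribution over nodes.
R = rng.random(size=(n_samples, n_nodes))
R /= R.sum(axis=1, keepdims=True)

# Reconstruction: responsibility-weighted average of node centers.
Xhat = R @ node_centers
print(Xhat.shape)  # (30, 4)
```

This matches the shapes in the example above: 30 test samples come back with the original 4 features.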

This may help analyses with GTM.
Thank you in advance for your consideration.

feat: inverse_transform maps responsibilities onto the original data space

refactor: transform and fit_transform are available even in sklearn.pipeline.Pipeline, which ignores `model` input

1. transform(self, X, model) => transform(self, X),
   because extra parameters are not allowed in Pipeline.transform
2. eGTM.__init__(..., model="means"),
   because the output format must be set at initialization
   or via `set_params`
3. fit(self, X) => fit(self, X, y=None),
   because the fit method is expected to accept (self, X, y),
   with y defaulting to None for unsupervised methods
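The three signature changes above can be sketched with a minimal stand-in estimator (MiniGTM is a hypothetical name for illustration, not ugtm code):

```python
# Minimal sketch of the Pipeline-compatible estimator API described above.
class MiniGTM:
    def __init__(self, model="means"):
        # (2) The output format is a constructor parameter, not a
        # transform() argument, since Pipeline.transform passes only X.
        self.model = model

    def set_params(self, **params):
        # Lets a Pipeline address this step, e.g. egtm__model=...
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y=None):
        # (3) y=None so the unsupervised estimator still accepts the
        # (X, y) call signature used by Pipeline.fit.
        self.n_features_ = len(X[0])
        return self

    def transform(self, X):
        # (1) No extra `model` argument: the format is read from self.
        return [[self.model] for _ in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)

est = MiniGTM().set_params(model="responsibilities")
out = est.fit_transform([[1.0, 2.0], [3.0, 4.0]])
print(out)  # [['responsibilities'], ['responsibilities']]
```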
@sshojiro
Contributor Author

sshojiro commented May 2, 2020

The commit 3d1086e (formerly e8f6459) adds sklearn.pipeline.Pipeline integration.
The eGTM class now switches its output format via eGTM.set_params and the eGTM.model parameter.
eGTM.model takes one of 'means', 'modes', 'responsibilities', or 'complete'.

The new feature is confirmed to work as follows:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from ugtm.ugtm_sklearn import eGTM
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler 
X,y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), eGTM())
Xtrain, Xtest = train_test_split(X, test_size=0.20,
                random_state=42)

model.set_params(**{"egtm__model": "responsibilities"})
model.fit_transform(Xtrain) # pass 
model.fit(Xtrain)           # pass 
for mtype in ['means', 'modes', 'responsibilities']:
    model.set_params(**{"egtm__model": mtype})
    Xttest = model.transform(Xtest)
    print(mtype, Xttest.shape)
# means (30, 2)
# modes (30, 2)
# responsibilities (30, 256)

@hagax8 hagax8 merged commit ff4d7d7 into hagax8:master Aug 22, 2020