_LSIWrapper: Access to ds_p outside of fit_transform #188

Open
robguinness opened this issue Oct 5, 2018 · 2 comments

Comments

@robguinness

Hi,

Would you consider a pull request that exposes the ds_p variable computed inside fit_transform() as a member variable of _LSIWrapper? I have a use case where I want to do further processing of the transformed data. I realize I could access it using _load_features, but that means loading it again from disk, which seems like a waste.

This will, of course, increase the memory consumption of _LSIWrapper for other cases where the transformed data is not needed. Another option is to have fit_transform() return ds_p. This would fit more into the pattern of scikit-learn for fit_transform().

@rth
Contributor

rth commented Oct 11, 2018

Thanks for opening this issue @robguinness

> This will, of course, increase the memory consumption of _LSIWrapper for other cases where the transformed data is not needed. Another option is to have fit_transform() return ds_p. This would fit more into the pattern of scikit-learn for fit_transform().

Yes, because datasets (including the LSI-transformed data) can be quite large, I think it's better to avoid storing it as an attribute unless we absolutely need to.

Returning ds_p from fit_transform() sounds like a good idea indeed. That way you could store it as needed for your applications. A PR would be very welcome!
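
For illustration, here is a minimal sketch of the pattern being proposed: fit_transform() writes the transformed data to disk and also returns it, without keeping it as an attribute. Everything in it is an assumption made for the example. `CachedTransformer`, `load_features`, and the `ds_p.pkl` file name are hypothetical stand-ins, not the actual _LSIWrapper code, and the column-centering step is only a placeholder for the real LSI transform.

```python
import os
import pickle
import tempfile


class CachedTransformer:
    """Hypothetical stand-in for _LSIWrapper (not the library's actual code)."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir

    def _cache_path(self):
        # Assumed on-disk location of the transformed data.
        return os.path.join(self.cache_dir, "ds_p.pkl")

    def fit_transform(self, rows):
        # Placeholder "fit": learn per-column means (stands in for the LSI fit).
        n_rows = len(rows)
        self.means_ = [sum(col) / n_rows for col in zip(*rows)]
        # Placeholder "transform": center every row on the learned means.
        ds_p = [[x - m for x, m in zip(row, self.means_)] for row in rows]
        # Persist the result so it can still be reloaded later.
        with open(self._cache_path(), "wb") as fh:
            pickle.dump(ds_p, fh)
        # Return the data instead of storing it on self, so the caller
        # decides whether to keep it in memory.
        return ds_p

    def load_features(self):
        # Reload from disk; only needed if the caller discarded the result.
        with open(self._cache_path(), "rb") as fh:
            return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    model = CachedTransformer(tmp)
    transformed = model.fit_transform([[1.0, 2.0], [3.0, 6.0]])
    # Further processing can use `transformed` directly, with no reload.
```

With this shape, callers that do not need the transformed data simply ignore the return value, so memory use for them is unchanged.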

@robguinness
Author

Thanks. I'll try to write a PR soon. I'm on another project at the moment, but I hope to get back to this soon.
