LDA model persistence #12
You are correct: the LDA transformer is not serialisable yet. Unfortunately,
I just haven't gotten around to implementing it. If you fancy having a go
yourself, feel free to submit a pull request. In the meantime, you
could persist the component parts of the LDA model individually and then
recreate the model from those parts at a later time.
On Sat, 2 Jan 2021 at 22:54, onclue wrote:
Thanks for this library, it seems really useful.
I have been playing around a bit with a feature extractor pipeline of
countvectoriser and tfidf transformer feeding into an LDA transformer, but
I can't seem to save the fitted pipeline to disk and reload it later to
transform new docs. Looking at the serialized pipeline in JSON, it seems
the vocabulary is there, as well as the tokenizer info and various LDA
params, but I don't see the induced topics (matrices). Maybe this is a
problem with the way I serialized it? If you can point to a working example
of how to properly serialize a trained LDA model and re-use it later, that
would be great.
Thanks again!
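For anyone trying the workaround suggested above (persisting the component parts of the LDA model individually), here is a rough, library-agnostic sketch in Python. It does not use this library's types or API. `LDAParts`, `save_parts`, and `load_parts` are hypothetical names, and the choice of fields (a vocabulary, a topic-word matrix, and two prior hyperparameters) is an assumption about which parts of a fitted model need to survive a round trip; the real model may expose different or additional state.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class LDAParts:
    """Hypothetical container for the pieces of a fitted LDA model."""
    vocabulary: List[str]          # term for each column of topic_word
    topic_word: List[List[float]]  # one row per topic, one weight per term
    alpha: float                   # document-topic prior (assumed hyperparameter)
    eta: float                     # topic-word prior (assumed hyperparameter)


def save_parts(parts: LDAParts, path: str) -> None:
    """Write the extracted parts to disk as plain JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(parts), fh)


def load_parts(path: str) -> LDAParts:
    """Read the parts back and check that the matrix matches the vocabulary."""
    with open(path, encoding="utf-8") as fh:
        raw = json.load(fh)
    parts = LDAParts(**raw)
    for row in parts.topic_word:
        if len(row) != len(parts.vocabulary):
            raise ValueError("topic row length does not match vocabulary size")
    return parts
```

The idea would be to copy the learned matrices out of the fitted model into such a container, save it next to the serialized pipeline, and assign the values back onto a freshly constructed LDA transformer after loading. Whether the transformer allows those values to be set from outside is something to check against the library source.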
Thanks, that makes sense. I will have a look.