LDA model persistence #12

trpstra · 2021-01-02T22:54:36Z

Thanks for this library, it seems really useful.
I have been experimenting with a feature-extraction pipeline in which a CountVectoriser and a TF-IDF transformer feed into an LDA transformer, but I can't seem to save the fitted pipeline to disk and reload it later to Transform new docs. Looking at the pipeline serialized as JSON, the vocabulary is there, as well as the tokenizer info and various LDA params, but I don't see the induced topics (matrices). Maybe this is a problem with the way I serialized it? If you can point to a working example of how to properly serialize a trained LDA model and re-use it later, that would be great.
Thanks again!

james-bowman · 2021-01-03T11:42:31Z

You are correct: the LDA transformer is not serialisable yet. Unfortunately, I just haven't gotten around to implementing it. If you fancy having a go yourself, feel free to submit a pull request. In the meantime, you could persist the component parts of the LDA model individually and then recreate it from those parts later.


trpstra · 2021-01-04T08:20:01Z

Thanks, that makes sense. I will have a look.

trpstra changed the title ~~LDA model persisting~~ LDA model persistence Jan 2, 2021

trpstra closed this as completed Jan 4, 2021
