Request - mode for inferring single article topics #3
Comments
Maybe a separate node module that builds upon lda?
Any updates on this? Is there a module I should use to do this?
Not sure, do you have an example? This might be something that could be a separate library, built on top of lda.
Hi Kory, thanks for the amazing module. Would it still be possible to add this functionality within the same module? The main reason is to convert text documents into feature vectors for a supervised learning problem. Something like this (modified documentation example):
const lda = require('lda');
// Example document.
const text = 'Cats are small. Dogs are big. Cats like to chase mice. Dogs like to eat bones.';
// Extract sentences.
const documents = text.match( /[^\.!\?]+[\.!\?]+/g );
// Run LDA to get terms for 2 topics (5 terms each).
const LDAModel = lda(documents, 2, 5);
/*
LDAModel.results = [ [ { term: 'dogs', probability: 0.2 },
{ term: 'cats', probability: 0.2 },
{ term: 'small', probability: 0.1 },
{ term: 'mice', probability: 0.1 },
{ term: 'chase', probability: 0.1 } ],
[ { term: 'dogs', probability: 0.2 },
{ term: 'cats', probability: 0.2 },
{ term: 'bones', probability: 0.11 },
{ term: 'eat', probability: 0.1 },
{ term: 'big', probability: 0.099 } ] ];
*/
// ...and, hopefully, a predict method on the trained model:
LDAModel.predict('cats are big'); // => [0.89,0.3] (calculate theta for new documents)
Again, thanks for the great module @primaryobjects!
Hello again,
Let's say that we've gone through the corpus and created the topic-word distributions.
I now want to use this output to tag single articles.
I know the process is similar and iterative, except that it must not change phi (the topic-word distributions).
I think such a function would be useful for people like myself.
Thanks
Ilan
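To make the request concrete, here is a rough sketch of the kind of inference described above: Gibbs sampling over the topic assignments of one new document, with phi held fixed so that only the document's own topic counts change. None of this is part of the lda module. `inferTheta`, its options, the format of `phi` (one plain object per topic mapping term to probability), the `alpha` default, and the numbers in the toy `phi` are all assumptions made up for illustration, and it presumes the full topic-word distributions are available after training.

```javascript
// Small seeded PRNG (mulberry32) so that runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper, not an lda API.
// tokens: array of terms from the new document.
// phi: array of K objects, term -> P(term | topic). It is only read, never updated.
function inferTheta(tokens, phi, options = {}) {
  const { alpha = 0.1, iterations = 200, burnIn = 50, seed = 1, epsilon = 1e-9 } = options;
  if (iterations <= burnIn) throw new RangeError('iterations must exceed burnIn');
  const K = phi.length;
  const rand = mulberry32(seed);

  // Drop terms the trained model has never seen.
  const words = tokens.filter((w) => phi.some((topic) => topic[w] > 0));
  if (words.length === 0) return new Array(K).fill(1 / K);

  // Random initial topic assignment for every word in the document.
  const counts = new Array(K).fill(0);
  const z = words.map(() => {
    const k = Math.floor(rand() * K);
    counts[k] += 1;
    return k;
  });

  const thetaSum = new Array(K).fill(0);
  let samples = 0;
  for (let it = 0; it < iterations; it++) {
    for (let i = 0; i < words.length; i++) {
      counts[z[i]] -= 1; // remove the word's current assignment
      // P(z_i = k) is proportional to phi[k][word] * (n_k + alpha).
      const weights = phi.map((topic, k) => (topic[words[i]] || epsilon) * (counts[k] + alpha));
      const total = weights.reduce((sum, w) => sum + w, 0);
      let r = rand() * total;
      let k = 0;
      while (k < K - 1 && r >= weights[k]) {
        r -= weights[k];
        k += 1;
      }
      z[i] = k;
      counts[k] += 1;
    }
    if (it >= burnIn) {
      // Accumulate the smoothed topic proportions after burn-in.
      const denom = words.length + K * alpha;
      for (let k = 0; k < K; k++) thetaSum[k] += (counts[k] + alpha) / denom;
      samples += 1;
    }
  }
  return thetaSum.map((s) => s / samples);
}

// Toy topic-word distributions (made-up numbers, not real lda output).
const phi = [
  { cats: 0.4, mice: 0.3, chase: 0.2, dogs: 0.05, bones: 0.05 },
  { dogs: 0.4, bones: 0.3, eat: 0.2, cats: 0.05, mice: 0.05 },
];
const theta = inferTheta(['cats', 'chase', 'mice'], phi);
console.log(theta); // one proportion per topic, summing to 1
```

The returned array could then serve as the feature vector for a new document. Averaging over the post-burn-in sweeps, rather than taking the last sweep alone, is just one way to reduce sampling noise on short documents.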