Multi-label text classification with Probabilistic Topic Model ml-PLSI

mlPLSI

Karpovich S. N. Multi-label text classification with Probabilistic Topic Model ml-PLSI.

Keywords: Multi-label classification, supervised learning, topic model, natural language processing.

SUMMARY The paper proposes a method for multi-label classification of documents using a topic model. Much of the research on clustering and classification algorithms assigns a single label to each document, even though one document can be relevant to several labels, which makes the multi-label task highly relevant. A comparative analysis of multi-label classification algorithms is carried out. The article describes the tools used to implement the multi-label classification algorithm. The topic model is built by supervised learning. We estimate the classification quality and produce a list of proposed categories for a word. The developed approach has proved effective. Probabilistic estimates of a document's assignment to a category make the method usable in collective recognition and associative classification. In future work we will further investigate the possibilities of multi-label classification with a probabilistic topic model.
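
The idea summarized above can be illustrated with a small sketch. It is not the code from mlPLSI.ipynb and does not use the BigARTM API; it is a pure-Python PLSI-style EM written under the assumption that each label corresponds to exactly one topic and that, during training, a document's topic mixture is restricted to its own labels. The function names (`train_label_topics`, `predict_label_probs`, `labels_for_word`), the smoothing constant, the uniform label prior, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter


def train_label_topics(docs, labels, n_iter=30, smooth=0.01):
    """Estimate p(word | label) by EM, assuming one topic per label.

    docs: list of token lists; labels: list of non-empty label sets.
    """
    vocab = sorted({w for d in docs for w in d})
    topics = sorted({t for ls in labels for t in ls})
    counts = [Counter(d) for d in docs]
    phi = {t: {w: 1.0 / len(vocab) for w in vocab} for t in topics}
    # Supervision: each document may only use the topics of its own labels.
    theta = [{t: 1.0 / len(ls) for t in ls} for ls in labels]
    for _ in range(n_iter):
        n_wt = {t: Counter() for t in topics}
        for d, cnt in enumerate(counts):
            if not cnt:
                continue  # skip empty documents
            n_t = Counter()
            for w, n in cnt.items():
                z = sum(phi[t][w] * theta[d][t] for t in theta[d])
                for t in theta[d]:
                    r = n * phi[t][w] * theta[d][t] / z  # E-step share
                    n_wt[t][w] += r
                    n_t[t] += r
            total = sum(n_t.values())
            theta[d] = {t: n_t[t] / total for t in theta[d]}
        for t in topics:  # M-step with additive smoothing
            norm = sum(n_wt[t].values()) + smooth * len(vocab)
            phi[t] = {w: (n_wt[t][w] + smooth) / norm for w in vocab}
    return phi


def predict_label_probs(phi, doc, n_iter=30):
    """Fold in a new document: estimate p(label | doc) with phi held fixed."""
    topics = list(phi)
    known = next(iter(phi.values()))
    cnt = Counter(w for w in doc if w in known)  # ignore unseen words
    theta = {t: 1.0 / len(topics) for t in topics}
    if not cnt:
        return theta
    for _ in range(n_iter):
        n_t = Counter()
        for w, n in cnt.items():
            z = sum(phi[t][w] * theta[t] for t in topics)
            for t in topics:
                n_t[t] += n * phi[t][w] * theta[t] / z
        total = sum(n_t.values())
        theta = {t: n_t[t] / total for t in topics}
    return theta


def labels_for_word(phi, word):
    """Rank labels for one word by p(label | word), assuming a uniform label prior."""
    z = sum(phi[t][word] for t in phi)
    return sorted(((phi[t][word] / z, t) for t in phi), reverse=True)


# Toy data, invented for this example.
docs = [["goal", "match", "team"],
        ["election", "vote", "party"],
        ["team", "vote", "match", "election"]]
labels = [{"sport"}, {"politics"}, {"sport", "politics"}]

phi = train_label_topics(docs, labels)
theta = predict_label_probs(phi, ["goal", "team", "match"])
# Multi-label decision: keep every label above a (hypothetical) threshold.
predicted = [t for t, p in theta.items() if p >= 0.3]
```

Because the output is a probability for every label rather than a single hard decision, several labels can pass the threshold for one document, and the same probabilities can be passed on to other classifiers.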

License

BigARTM is released under the New BSD License, which allows unlimited redistribution for any purpose (even for commercial use) as long as its copyright notices and the license’s disclaimers of warranty are maintained.
