A Sentube corpus analyzer using NLTK
-
This folder contains the results for both stemmed and non-stemmed data
-
The file stemmedOut.json contains the results of the Sentube sentiment analysis on stemmed tokens
-
The file polishedOut.json contains the same results, but with full words instead of stems.
-
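To illustrate the difference between the two outputs, here is a minimal sketch of how full words map to stems. It assumes NLTK's `PorterStemmer`; the README does not say which stemmer was actually used, and `stem_tokens` and the sample words are hypothetical.

```python
from nltk.stem import PorterStemmer

# Assumption: a Porter stemmer; the project may use a different one.
stemmer = PorterStemmer()

def stem_tokens(tokens):
    """Return the stem of each token (hypothetical helper)."""
    return [stemmer.stem(t) for t in tokens]

full_words = ["running", "movies", "loved"]
stems = stem_tokens(full_words)

# Print each full word next to its stem.
for word, stem in zip(full_words, stems):
    print(word, "->", stem)
```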
The dominance of neutral words was resolved by removing all stop words, that is, words such as "a", "the", and "that", which add no meaning to a sentence
-
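Stop-word removal can be sketched roughly as follows. The small `STOP_WORDS` set and the `remove_stop_words` helper are hypothetical and only for illustration; the project itself presumably relies on a fuller list, such as the one returned by NLTK's `nltk.corpus.stopwords.words("english")`.

```python
# Hypothetical, deliberately tiny stop-word list for illustration.
# A fuller alternative is nltk.corpus.stopwords.words("english"),
# which requires the NLTK "stopwords" corpus to be downloaded first.
STOP_WORDS = {"a", "an", "the", "that", "this", "is", "of", "and", "to"}

def remove_stop_words(tokens):
    """Return the tokens that are not stop words (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = "This is a great video and the sound is awesome".split()
filtered = remove_stop_words(tokens)
print(filtered)  # ['great', 'video', 'sound', 'awesome']
```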
Currently, I am working on POS tagging to see whether it can improve the accuracy of the program, which stands at 80% with full words and 86% with stems.