# Available sentiment analysis libraries

Sentiment analysis is a common problem, so many ready-made machine learning solutions exist for it. Most of them aim to be generic and applicable in any domain, but that is not always the case in practice.

The motivation behind this training is that Codete wanted to use sentiment analysis in one of the showcases we present at conferences: it monitors the sentiment of tweets for a given phrase in real time and visualizes the global perception of that topic. Surprisingly, some of the best-known libraries struggle with such short messages, which are often grammatically and linguistically incorrect. Tweets tend to contain a lot of tokens that are not part of the standard language, such as acronyms and hashtags. There are also several other issues that do not occur in standard written English.

We believe that solving a problem with machine learning should never start with designing your own model. It should start with researching whether anyone else has had a similar problem and, if possible, applying their work. We followed that process as well, but the existing solutions did not work well on our dataset, which is a good reason to design our own.

Nevertheless, if you need sentiment analysis in your project, keep an eye on the following libraries:

## NLTK

NLTK is a widely used Python NLP library that ships with a ready-to-use sentiment analyzer: VADER, a lexicon- and rule-based model. Let's look at an example, as it is simple to use out of the box.

In [1]:
import nltk

from nltk.sentiment.vader import SentimentIntensityAnalyzer

# It is required to download the lexicon before starting
nltk.download("vader_lexicon")

# Create the analyzer and try it out
analyzer = SentimentIntensityAnalyzer()
analyzer.polarity_scores("The last game of polish national team was awful")



[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /home/jovyan/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


{'neg': 0.273, 'neu': 0.727, 'pos': 0.0, 'compound': -0.4588}
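
The `compound` value is a normalized score between -1 and 1, and it is usually the one turned into a label. Below is a minimal sketch of that step, assuming the ±0.05 cut-off that the VADER authors suggest as a typical threshold. The helper name `label_from_scores` is ours and is not part of NLTK.

```python
def label_from_scores(scores, threshold=0.05):
    """Map a VADER-style score dict to a coarse label.

    `scores` is the dict returned by `polarity_scores`; `threshold` is an
    assumed cut-off and may need tuning for tweets.
    """
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# The result produced by the analyzer in the cell above
scores = {"neg": 0.273, "neu": 0.727, "pos": 0.0, "compound": -0.4588}
print(label_from_scores(scores))  # prints: negative
```

A wider threshold sends more borderline messages to the neutral class, which may be preferable for noisy, short texts.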

## TextBlob

TextBlob is another Python library that addresses many different NLP problems, sentiment analysis among them. Under the hood TextBlob uses NLTK, but its sentiment analyzer is not the same algorithm as the one shown above. Here is a short demo:

In [1]:
import nltk

from textblob import TextBlob

# Download the language resource used internally
nltk.download("punkt")

# Perform the analysis on given text
blob = TextBlob("The last game of polish national team was awful")
blob.sentiment

[nltk_data] Downloading package punkt to /home/jovyan/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


Sentiment(polarity=-0.4666666666666666, subjectivity=0.48888888888888893)
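
TextBlob reports `polarity` in the range [-1, 1] and `subjectivity` in the range [0, 1]. When many short texts are aggregated, one option is to skip the mostly objective ones before averaging polarity. The sketch below illustrates that idea under stated assumptions: the `Sentiment` tuple is a stand-in with the same fields as TextBlob's result, so the code runs without TextBlob, and both `mean_polarity` and the `min_subjectivity` cut-off are hypothetical choices rather than anything TextBlob provides.

```python
from collections import namedtuple

# Stand-in with the same fields as TextBlob's result (an assumption for this sketch)
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def mean_polarity(sentiments, min_subjectivity=0.3):
    """Average polarity over texts that are subjective enough.

    `min_subjectivity` is a hypothetical cut-off, not a TextBlob default.
    Returns None when no text passes the filter.
    """
    kept = [s.polarity for s in sentiments if s.subjectivity >= min_subjectivity]
    if not kept:
        return None
    return sum(kept) / len(kept)


results = [
    # The result from the cell above
    Sentiment(polarity=-0.4666666666666666, subjectivity=0.48888888888888893),
    # Made-up entry standing in for a purely factual text
    Sentiment(polarity=0.0, subjectivity=0.0),
]
print(mean_polarity(results))
```

With real data, the `blob.sentiment` objects can be passed in directly, since they expose the same two attributes.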

## Stanford CoreNLP

Stanford University is one of the biggest research centers focused on NLP. CoreNLP is its toolkit of methods for solving common problems in this area, including sentiment analysis. The library is written in Java, and some papers claim it is a state-of-the-art tool for this task, so it is definitely worth a try.

Because we are using Jupyter Notebook, it is hard to show a working example here: CoreNLP has a Java interface, which is not as easy to use from a notebook as the previous libraries.
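
One way around the Java interface is the HTTP server that ships with CoreNLP, which can be queried from Python. The following is only a sketch and was not run as part of this notebook. It assumes a CoreNLP server has been started separately and listens on `http://localhost:9000`. The URL, the annotator list, and the helper names `build_request`, `sentence_sentiments`, and `analyze` are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_request(text, server_url="http://localhost:9000"):
    """Build a POST request for a CoreNLP server (assumed to run locally)."""
    properties = {
        # Assumed annotator chain; sentiment needs parsed sentences
        "annotators": "tokenize,ssplit,pos,parse,sentiment",
        "outputFormat": "json",
    }
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return urllib.request.Request(
        f"{server_url}/?{query}", data=text.encode("utf-8"), method="POST"
    )


def sentence_sentiments(response):
    """Extract the per-sentence sentiment labels from the parsed JSON response."""
    return [sentence["sentiment"] for sentence in response["sentences"]]


def analyze(text, server_url="http://localhost:9000"):
    """Send `text` to the server and return one sentiment label per sentence."""
    with urllib.request.urlopen(build_request(text, server_url)) as reply:
        return sentence_sentiments(json.loads(reply.read().decode("utf-8")))
```

With a server running, `analyze("The last game of polish national team was awful")` would return a list with one label per sentence. Unlike the two libraries above, CoreNLP assigns sentiment per sentence, so longer texts need an extra aggregation step.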