# A complete profile for a sentiment analysis model

## Model ideas:
1. Use custom models for different categories (tech, food, books, ...), selected either automatically (through label detection) or manually (by the client). *(different datasets included)*
2. Develop a layered classification **(*fast/slow* classification)**: split the dataset by a confidence index into strong and weak groups; the weak group is then analysed further with an ML model.
3. Aspect-based analysis **(attach sentiment to specific aspects rather than to the whole sentence/opinion)** and a word cloud **(for word frequencies)** to surface insights from the reviews.
4. Use lemmatization, an opinion unit extractor, a subjectivity index and multiclass classification (love, sad, angry, ...) for better performance and data enrichment.
5. Implement a sent-ngrams lexicon sentiment analysis **(SO-CAL)**. (Implementation phase)
6. Use the client dataset to fine-tune the model. (Ideation phase)
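
The layered (fast/slow) idea can be sketched roughly as follows. This is a minimal illustration, not the notebook's implementation: `layered_classify`, `toy_fast`, `toy_slow`, the tiny lexicon and the `0.5` threshold are all hypothetical names and values, and the fast scorer is assumed to return a polarity in `[-1, 1]` whose magnitude serves as the confidence index.

```python
from typing import Callable, List, Tuple


def layered_classify(
    texts: List[str],
    fast_score: Callable[[str], float],
    slow_classify: Callable[[str], int],
    threshold: float = 0.5,
) -> List[Tuple[int, str]]:
    """Label each text with the fast scorer when it is confident, else defer.

    Returns (label, route) pairs, where route is "fast" or "slow".
    """
    results = []
    for text in texts:
        score = fast_score(text)  # assumed to lie in [-1, 1]
        if score != 0 and abs(score) >= threshold:
            # Strong group: trust the sign of the fast score.
            results.append((1 if score > 0 else -1, "fast"))
        else:
            # Weak group: hand the text over to the slower model.
            results.append((slow_classify(text), "slow"))
    return results


# Toy stand-ins for a lexicon scorer and an ML model (illustration only).
TOY_LEXICON = {"love": 0.9, "hate": -0.9, "okay": 0.1}


def toy_fast(text: str) -> float:
    total = sum(TOY_LEXICON.get(word, 0.0) for word in text.lower().split())
    return max(-1.0, min(1.0, total))  # clip to [-1, 1]


def toy_slow(text: str) -> int:
    return 0  # placeholder: a real model would be called here


results = layered_classify(["I love it", "I hate it", "it was okay"], toy_fast, toy_slow)
print(results)
```

In practice, `fast_score` could be a rule-based scorer such as the VADER compound score, and `slow_classify` a trained model; the threshold would need tuning per dataset.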

## Datasets used:
1. Twitter airline 
2. IMDB 
3. Yelp (preprocessing phase)
4. Amazon 
5. Sentiment140 (Twitter)

## Implementation:

### Imports:

In [10]:
from textblob import TextBlob as tb
from textblob.sentiments import NaiveBayesAnalyzer
from textblob import Blobber
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from flair.data import Sentence
from flair.models import TextClassifier
import spacy
from sklearn import metrics
import pandas as pd

### Libraries implementation:

In [1]:
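def metric(true,predict):
    # Returns [accuracy, precision, recall, F1] for the predicted labels.
    analytics=[]
    #metrics.classification_report(true,predict)
    analytics.append(metrics.accuracy_score(true,predict))
    # The labels can take three values (-1, 0, 1), so an averaging mode is required;
    # the default average='binary' raises an error on multiclass targets.
    # 'macro' is an assumed choice here; 'weighted' is another option.
    analytics.append(metrics.precision_score(true,predict,average='macro'))
    analytics.append(metrics.recall_score(true,predict,average='macro'))
    analytics.append(metrics.f1_score(true,predict,average='macro'))
    return analytics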
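
To make explicit what macro averaging does with the three labels used here (-1, 0, 1), the following is a small pure-Python sketch. `macro_scores` is a hypothetical helper written for illustration, and the label lists are made-up toy data, not results from any dataset. scikit-learn's `average='macro'` option is expected to average the per-class scores in the same unweighted way.

```python
from typing import Dict, Sequence


def macro_scores(true: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1."""
    if not true or len(true) != len(predicted):
        raise ValueError("true and predicted must be non-empty and of equal length")

    pairs = list(zip(true, predicted))
    labels = sorted(set(true) | set(predicted))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # A score with an empty denominator is counted as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,  # unweighted mean over classes
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, for illustration only.
toy_true = [1, 1, 0, -1, -1, 0]
toy_pred = [1, 0, 0, -1, 1, 0]
scores = macro_scores(toy_true, toy_pred)
print(scores)
```

Macro averaging treats every class equally regardless of its size, which matters when the neutral class is rare; a weighted average would instead favour the majority class.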

In [2]:
def textblobPattern(df):
    sentiment=[]
    for sentence in df:
        sent=tb(sentence).polarity
        if sent>0:
            sentiment.append(1)
        elif sent<0:
            sentiment.append(-1)
        else:
            sentiment.append(0)
    return sentiment

In [None]:
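def textblobNB(df):
    sentiment=[]
    # Build the Blobber once so the Naive Bayes analyzer is shared across sentences
    tbnb = Blobber(analyzer=NaiveBayesAnalyzer())
    for sentence in df:
        ts=tbnb(sentence).sentiment
        # ts.classification is 'pos' or 'neg'; map it to the 1 / -1 labels used elsewhere
        sentiment.append(1 if ts.classification == 'pos' else -1)
    return sentiment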

In [None]:
def vader(df):
    sentiment=[]
    analyzer = SentimentIntensityAnalyzer()
    for sentence in df:
        vs=analyzer.polarity_scores(sentence)['compound']
        # +/-0.5 is a strict cutoff; the VADER README suggests +/-0.05 as a typical threshold
        if (vs > 0.5):
            sentiment.append(1)
        elif (vs < -0.5):
            sentiment.append(-1)
        else:
            sentiment.append(0)
    return sentiment
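
The pattern-based TextBlob function and the VADER function share the same score-to-label logic and differ only in the width of the neutral band. One possible way to factor that out is sketched below; `score_to_label` and `neutral_band` are hypothetical names, not part of either library.

```python
def score_to_label(score: float, neutral_band: float = 0.0) -> int:
    """Map a polarity score in [-1, 1] to 1 (positive), -1 (negative) or 0 (neutral).

    Scores inside [-neutral_band, neutral_band] are treated as neutral.
    """
    if score > neutral_band:
        return 1
    if score < -neutral_band:
        return -1
    return 0


# neutral_band=0.0 mirrors the TextBlob polarity rule; 0.5 mirrors the VADER rule.
print(score_to_label(0.3))        # TextBlob-style cutoff
print(score_to_label(0.3, 0.5))   # VADER-style cutoff with the 0.5 band
```

Keeping the band as a parameter makes it easy to compare cutoffs on the same dataset without editing each library wrapper.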

In [None]:
def flair(df):
    classifier = TextClassifier.load('en-sentiment')
    sentiment=[]
    for phrase in df:
        text = Sentence(phrase)
        classifier.predict(text)
        sentiment.append(1 if text.labels[0].value == 'POSITIVE' else -1)
    return sentiment

In [None]:
def spacyModel(df):
    # Named spacyModel so the function does not shadow the imported spacy module
    pass

### Training model:

In [None]:
def textblobDTtrain(df):
    pass

In [None]:
def textblobNBtrain(df):
    pass

In [None]:
def flairtrain(df):
    pass

In [None]:
def RoBerta(df):
    pass

### Model testing: (for each dataset)
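
A testing loop over every model and dataset might look like the sketch below. It assumes each model is a function that takes a sequence of sentences and returns a list of integer labels, the same shape as the library functions in this notebook. `evaluate_all`, the toy models and the toy dataset are hypothetical placeholders, and plain accuracy stands in for the fuller metric list.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Model = Callable[[Sequence[str]], List[int]]
Dataset = Tuple[Sequence[str], Sequence[int]]  # (texts, true labels)


def evaluate_all(
    models: Dict[str, Model], datasets: Dict[str, Dataset]
) -> Dict[Tuple[str, str], float]:
    """Return accuracy for every (dataset, model) combination."""
    scores = {}
    for dataset_name, (texts, labels) in datasets.items():
        for model_name, model in models.items():
            predictions = model(texts)
            correct = sum(1 for p, t in zip(predictions, labels) if p == t)
            scores[(dataset_name, model_name)] = correct / len(labels)
    return scores


# Toy models and data, for illustration only.
def always_positive(texts: Sequence[str]) -> List[int]:
    return [1] * len(texts)


def keyword_model(texts: Sequence[str]) -> List[int]:
    negative_words = {"bad", "awful"}
    return [-1 if text in negative_words else 1 for text in texts]


toy_datasets = {"toy": (["good", "bad", "fine", "awful"], [1, -1, 1, -1])}
toy_models = {"always_positive": always_positive, "keyword": keyword_model}

accuracy_table = evaluate_all(toy_models, toy_datasets)
print(accuracy_table)
```

With real data, the dictionary values could be replaced by the full metric list, and the result loaded into a pandas DataFrame for the results section.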

### Main:

## Results:

## Topics covered:
- TextBlob - VADER - Flair - spaCy libraries
- Text operations: lemmatization - tokenization - vectorization - WordNet - tagging - n-gram
- Machine learning concepts: vector space model, k-means clustering, [Naive Bayes, k-NN, SVM] classifiers, decision tree - random forest - transformers (word2vec and wordtree of Stanford).
- Technologies: Jupyter Notebook - Google Colab
- Dataset handling: dataset preprocessing (and pandas library for CSV)
- Sentiment analysis approaches
- Handling multiple machine learning libraries: RoBERTa - BERT - [GloVe - fastText - torchtext]
- Training and fine-tuning a custom model

## References:
- https://neptune.ai/blog/sentiment-analysis-python-textblob-vs-vader-vs-flair
- https://towardsdatascience.com/customer-churn-accuracy-a-4-6-increase-with-feature-engineering-29bcb1b1ee8f (REVIEW)
- https://www.analyticsvidhya.com/blog/2021/01/sentiment-analysis-vader-or-textblob/
- https://pythonprogramming.net/sentiment-analysis-python-textblob-vader/
- https://towardsdatascience.com/sentimental-analysis-using-vader-a3415fef7664
- https://medium.com/geekculture/what-nlp-library-you-should-use-for-your-sentimental-analysis-project-bef6b357a6db
- https://towardsdatascience.com/sentiment-analysis-comparing-3-common-approaches-naive-bayes-lstm-and-vader-ab561f834f89
****
* N-grams rule based model
- https://www.sciencedirect.com/science/article/pii/S095741741830143X
- https://github.com/sfu-discourse-lab/SO-CAL (to be reviewed)
- https://towardsdatascience.com/text-analysis-basics-in-python-443282942ec5
****
- https://towardsdatascience.com/text-classification-with-state-of-the-art-nlp-library-flair-b541d7add21f