Sentiment Analysis

NLP task : postive or negative Review Sentiment Analyzer using IMDB dataset in Python .

Overview

Dataset

The 100K reviews are divided into three datasets: 25K labeled training instances, 25K labeled test instances and 50K unlabeled training instances. Each review has one label representing the sentiment of it: Positive or Negative. These labels are balanced in both the training and the test set.
Text Pre-processing

Text preprocessing was done in order to remove unecessary words or turn verbs and adjectives into a unified word to make the model less confused . Text PreProcessing included
- spell checker
- stop words removal (is,are,the,this ....etc)
- Lemmatization
Vector Spaces and Data Matrix

This step aimed to convert the text of the review after pre-processing into a vector form for further steps.Different alternatives for text representation inclued:
- n-grams.
- Count, TF-IDF.
- DocToVec
Supervised Learning Step

We tried multiple of classification models. We tuned their parameters. We tuned these models parameters on 10% subset of the training set. Model choices are:
- Decision Trees
- Random Forests
- SVM Non-Linear
- Adaboost
- Logistic Regression
- Bagging
Results

With All mentioned Preprocessing:
- Logistic regression :87.5%,
- Adaboost 80%
- Decision trees was 52% before turning and reached 70% after
With preprocessing without lemmatization:
- Random forests was 54% before tuning and reached 85.5% after
- Adaboost 79.852%
Without preprocessing at all expept html removal:
- Naive Bayes 85.4%
- Logistic Regression 88.6% with c set to 1000
- Adaboost 80.2%
- Random Forests was 83.6% before tuning and 86.6% after
With stop words removal and html only:
- Logistic Regression 88.3%
Without any preprocessiong:
- Logistic Refression 88.5%

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.ipynb_checkpoints		.ipynb_checkpoints
Assignment_2_SentimentAnalysis.ipynb		Assignment_2_SentimentAnalysis.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sentiment Analysis

Overview

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Sentiment Analysis

Overview

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages