
IMDB movie reviews sentiment analysis based on Machine Learning, NLP and LSTM model [open source]


bluetickconsultants/sentiment-analysis


IMDB Movie Review Sentiment Analysis



What is Sentiment Analysis?

Sentiment analysis is the computational task of recognising and categorising opinions expressed in a piece of text, especially to discern whether the writer has a positive, negative, or neutral attitude toward a given topic, product, etc.

It analyses a piece of text to determine the sentiment it contains, combining machine learning and natural language processing (NLP) to do so.

This project performs sentiment analysis on movie reviews using Machine Learning, NLP, and LSTM models.


Using LSTM_RNN model

To develop the sentiment analysis model with LSTM layers, several techniques were applied to improve its performance.

Techniques used

 1 . Collecting data from various sources
 2 . Text cleaning
 3 . Balancing data
 4 . Regular expressions
 5 . NLP pipeline
        a . lowercasing text
        b . stemming
        c . lemmatization
        d . stopword removal
        e . spaCy library
        f . NLTK (Natural Language Toolkit)
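
The cleaning and balancing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names `clean_review` and `balance_by_undersampling` and the tiny `STOPWORDS` set are hypothetical, and random undersampling is only one assumed way to balance the classes. Stemming and lemmatization are left out, since in practice they would come from NLTK or spaCy.

```python
import random
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list; NLTK or spaCy provide full ones.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in"}


def clean_review(text):
    """Strip HTML tags and non-letters, lowercase, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)      # remove HTML tags such as <br />
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # keep letters and whitespace only
    tokens = text.lower().split()
    return [tok for tok in tokens if tok not in STOPWORDS]


def balance_by_undersampling(samples, seed=0):
    """Randomly undersample so every label has as many samples as the rarest one.

    `samples` is a list of (text, label) pairs.
    """
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced
```

For example, `clean_review("This movie was <br /> GREAT!!! 10/10")` keeps only the tokens `movie` and `great`.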

Accuracy results using word embedding and LSTM

 1 . Training_Accuracy = 52.8942115768463 (reported as a percentage)
 2 . Test_Accuracy = 0.4898785425101215 (reported as a fraction, about 48.99%)


Developing the model using Machine Learning use cases

Using the same data, we also cleaned the text and developed classical ML models, aiming for a model that generalizes well.

Algorithms used

  1 . SVM (support vector machine)
  2 . Naive Bayes 
  3 . Decision Tree
  4 . Random Forest

Techniques used

  1 . Data cleaning
  2 . Cross-validation to train and evaluate the models more reliably
  3 . Hyperparameter tuning
          1 . GridSearchCV
          2 . RandomizedSearchCV
  4 . AUC and ROC curve
  5 . TPR (True Positive Rate)
  6 . FPR (False Positive Rate)
  7 . classification_report
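
To make the cross-validation step concrete, here is a minimal standard-library sketch of k-fold splitting. It is an illustration under stated assumptions rather than the project's code: `k_fold_indices` and `cross_validate` are hypothetical helpers, and `fit_and_score` stands in for whatever training and scoring routine is used. In practice scikit-learn's `GridSearchCV` and `RandomizedSearchCV` run this kind of loop once per hyperparameter candidate.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k roughly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_validate(fit_and_score, n_samples, k=5):
    """Average a user-supplied score over k folds.

    fit_and_score(train_idx, val_idx) is a hypothetical callback that trains
    a model on train_idx and returns its score on val_idx.
    """
    scores = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Every sample lands in exactly one validation fold, so each model is scored on data it was not trained on in that round.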

SVM

 1 . Training_Accuracy  = 0.9230769230769231
 2 . Test_Accuracy      = 0.7266666666666667

Naive Bayes

 1 . Training_Accuracy  = 0.9247491638795987
 2 . Test_Accuracy      = 0.7866666666666666

Decision Tree

 1 . Training_Accuracy  = 0.9966555183946488
 2 . Test_Accuracy      = 0.7333333333333333

Random Forest

 1 . Training_Accuracy  = 0.9966555183946488
 2 . Test_Accuracy      = 0.7333333333333333
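
The gap between training and test accuracy gives a rough indication of how much each model overfits. The short sketch below computes that gap from the figures reported above. The `generalization_gaps` helper is a hypothetical name introduced for illustration.

```python
# Accuracies copied from the results reported above: (training, test).
reported = {
    "SVM": (0.9230769230769231, 0.7266666666666667),
    "Naive Bayes": (0.9247491638795987, 0.7866666666666666),
    "Decision Tree": (0.9966555183946488, 0.7333333333333333),
    "Random Forest": (0.9966555183946488, 0.7333333333333333),
}


def generalization_gaps(results):
    """Return training accuracy minus test accuracy for each model."""
    return {name: train - test for name, (train, test) in results.items()}


gaps = generalization_gaps(reported)
# Print models from smallest to largest gap.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:.4f}")
```

By this measure Naive Bayes has both the highest test accuracy and the smallest gap among the four models.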

ROC AUC score

1 . ROC AUC = 0.8234352773826458
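
For reference, TPR, FPR, and the area under the ROC curve can be computed as in the following standard-library sketch. It assumes binary labels encoded as 0 and 1 with both classes present. The function names `tpr_fpr` and `roc_auc` are hypothetical, and the pairwise-ranking formulation of AUC is chosen for brevity. It is not necessarily how the score above was produced.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Each point on the ROC curve is one `(FPR, TPR)` pair obtained by thresholding the scores, and the AUC summarizes the curve in a single number between 0 and 1.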

Other Projects

To view all other open source projects visit

Author

Bluetick Consultants LLP
