This repository contains code and resources for performing sentiment analysis on the IMDB dataset of 50K movie reviews. It explores three approaches, Bag of Words (BoW), TF-IDF, and Neural Networks, to classify each review as positive or negative.
The IMDB dataset features 50,000 movie reviews, evenly split between 25,000 positive and 25,000 negative reviews. This balance makes it an excellent resource for binary sentiment classification, providing a standard benchmark for evaluating model performance in natural language processing tasks.
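Because the classes are balanced, always guessing one label scores 50% accuracy, which is the floor any model has to beat. The sketch below illustrates that check on a tiny hand-written sample; `sample_reviews`, `class_balance`, and `majority_baseline` are hypothetical names for illustration and are not part of this repository.

```python
from collections import Counter

# Hypothetical stand-in for the dataset: (review_text, label) pairs.
# The real dataset has 50,000 rows; four are enough to show the idea.
sample_reviews = [
    ("A moving, beautifully shot film.", "positive"),
    ("Dull plot and wooden acting.", "negative"),
    ("One of the best movies this year.", "positive"),
    ("I want those two hours back.", "negative"),
]


def class_balance(rows):
    """Return the fraction of rows carrying each label."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline(rows):
    """Accuracy obtained by always predicting the most common label."""
    return max(class_balance(rows).values())


balance = class_balance(sample_reviews)
baseline = majority_baseline(sample_reviews)
print(balance)
print(baseline)
```

On a balanced set the baseline is 0.5, so plain accuracy is a fair metric here without any class reweighting.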
- Clone the Repository:
git clone https://github.com/AICrafter08/IMDB-Sentiment-Analysis.git
- Navigate to the Repository Folder:
cd IMDB-Sentiment-Analysis
- src/: Contains Python scripts:
  - BOW_classification.py: Contains code specific to Bag of Words classification.
  - TFIDF_classification.py: Dedicated to TF-IDF based classification.
  - Neuralnet_classification.py: Implements Neural Network for sentiment analysis.
- IMDB_dataset_classifier.ipynb: Jupyter notebook with a comprehensive guide covering all methods.
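To make the difference between the first two approaches concrete, here is a minimal sketch of both text representations using only the Python standard library. It is not the code in BOW_classification.py or TFIDF_classification.py, which may work differently; the helper names (`tokenize`, `bag_of_words`, `tf_idf`) are hypothetical, and the TF-IDF formula is the plain textbook form tf * log(N / df), whereas libraries such as scikit-learn apply a smoothed variant by default.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


def bag_of_words(docs):
    """Return (vocabulary, count vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each vocabulary word.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    vectors = []
    for vec in counts:
        total = sum(vec) or 1  # guard against an empty document
        vectors.append(
            [(c / total) * math.log(n_docs / df[j]) for j, c in enumerate(vec)]
        )
    return vocab, vectors


docs = ["Great movie, great cast", "Terrible movie"]
vocab, bow = bag_of_words(docs)
_, weighted = tf_idf(docs)
print(vocab)
print(bow)
print(weighted)
```

BoW keeps raw counts, so "movie" counts as much as any other word. TF-IDF gives "movie" a weight of zero because it appears in every document and therefore carries no signal for telling the two reviews apart.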
Ensure Python 3.x is installed. Install the required libraries using:
pip install -r requirements.txt
- BoW Classification:
python BOW_classification.py
- TF-IDF Classification:
python TFIDF_classification.py
- Neural Network Classification:
python Neuralnet_classification.py
- Comprehensive Analysis (includes all methods):
  - Open and run IMDB_dataset_classifier.ipynb in a Jupyter notebook environment or Google Colab.
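For readers who want to see the classification step itself before running the scripts, the following is a minimal sketch of a single-neuron (logistic) model trained on word-presence features. It is the simplest possible stand-in for a neural network and is an assumption for illustration only: it does not reproduce the architecture in Neuralnet_classification.py, and the names `train_logistic` and `predict`, the toy training sentences, and the hyperparameters are all hypothetical.

```python
import math


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def train_logistic(texts, labels, epochs=200, lr=0.1):
    """Fit one weight per word plus a bias using stochastic gradient descent."""
    vocab = sorted({w for t in texts for w in tokenize(t)})
    index = {w: i for i, w in enumerate(vocab)}
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            ids = [index[w] for w in tokenize(text)]
            z = bias + sum(weights[i] for i in ids)
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid activation
            err = p - y  # gradient of the log loss with respect to z
            bias -= lr * err
            for i in ids:
                weights[i] -= lr * err
    return index, weights, bias


def predict(model, text):
    """Return 1 (positive) or 0 (negative); unknown words are ignored."""
    index, weights, bias = model
    z = bias + sum(weights[index[w]] for w in tokenize(text) if w in index)
    return 1 if z > 0 else 0


# Toy training data: 1 = positive, 0 = negative.
texts = [
    "great film loved it",
    "wonderful acting great story",
    "boring film hated it",
    "awful acting boring story",
]
labels = [1, 1, 0, 0]

model = train_logistic(texts, labels)
print(predict(model, "great wonderful loved"))
print(predict(model, "boring awful hated"))
```

Words that occur only in positive reviews accumulate positive weights and words that occur only in negative reviews accumulate negative ones, while words shared by both classes stay close to zero. A real model for 50,000 reviews would add hidden layers, a held-out test split, and a proper tokenizer.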