# Table of Contents
1. Introduction
2. Dataset Creation
3. Data Preprocessing and Vectorization
4. Model Training and Testing
5. Model Evaluation
6. Testing with Custom Reviews
7. Recommendations for Improvement
8. Reflections and Key Learnings
9. References

# 1. Introduction
This project implements a basic Natural Language Processing (NLP) pipeline that classifies movie reviews as expressing positive or negative sentiment. It prepares a small sample dataset, converts the text into numeric features with CountVectorizer, and trains a Multinomial Naive Bayes model. The model is then evaluated on a held-out test split and on a handful of hand-written reviews, ranging from a single short sentence to full-length, realistic reviews.

In [1]:
# Importing Necessary Libraries

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# 2. Dataset Creation
A sample dataset of 8 movie reviews was created. Each review is labeled as either positive or negative. The dataset is balanced, comprising 4 reviews for each class. This provides a minimal foundation for training a binary classification model.

**Example entries:**
* "I love this movie" → Positive
* "It was terrible" → Negative

The limited size of the dataset is a major constraint and affects the model's generalization ability. However, it serves to demonstrate the basic workflow of text classification.


In [2]:
# Creating a Dataset for the Model

data = {
    'review': [
        "I love this movie",
        "Horrible acting",
        "What a great film",
        "Worst movie ever",
        "Really enjoyed it",
        "It was terrible",
        "Fantastic performance",
        "Not good at all",
    ],
    'label': [
        'positive',
        'negative',
        'positive',
        'negative',
        'positive',
        'negative',
        'positive',
        'negative',
    ],
}
df = pd.DataFrame(data)

# 3. Data Preprocessing and Vectorization

Text data must be converted into numerical form before machine learning algorithms can process it. CountVectorizer does this by tokenizing the text and building a sparse matrix of token counts (Pedregosa et al., 2011).

**Steps:**
* Tokenization: Breaking each review down into individual words.
* Vectorization: Counting how often each vocabulary word occurs in each review.
* Result: A document-term matrix, where each row represents a review and each column a word.





In [3]:
# Converting the Texts in the Dataset to Numeric Format

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df['review'])
y = df['label']

# 4. Model Training and Testing
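The dataset was split into training and testing sets using train_test_split with test_size=0.2. With only 8 reviews, scikit-learn rounds the test set up to 2 reviews, leaving 6 for training. A MultinomialNB model from the scikit-learn library was then trained on the training set.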

**Model Choice Justification:**
- MultinomialNB is designed for discrete count features such as word counts.
- It is a common, fast baseline for text classification and can be trained on small datasets.
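
To make the link between word counts and the model concrete, the sketch below checks that the per-class word log-probabilities learned by MultinomialNB are smoothed relative word counts. It is an illustration under stated assumptions: it uses the default smoothing (`alpha=1.0`) and, unlike the notebook, fits on all eight reviews so the result does not depend on a random split. The names `reviews`, `labels`, `vec`, `nb`, and `manual_log_prob` are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = [
    "I love this movie", "Horrible acting", "What a great film",
    "Worst movie ever", "Really enjoyed it", "It was terrible",
    "Fantastic performance", "Not good at all",
]
labels = ["positive", "negative"] * 4

vec = CountVectorizer()
X_all = vec.fit_transform(reviews)
nb = MultinomialNB(alpha=1.0).fit(X_all, labels)

counts = X_all.toarray()
label_arr = np.array(labels)
alpha = 1.0
vocab_size = counts.shape[1]

# Recompute log P(word | class) by hand with add-alpha (Laplace) smoothing:
# (count of word in class + alpha) / (total words in class + alpha * vocab size)
rows = []
for c in nb.classes_:
    word_counts = counts[label_arr == c].sum(axis=0)
    rows.append(np.log((word_counts + alpha) / (word_counts.sum() + alpha * vocab_size)))
manual_log_prob = np.vstack(rows)

print(np.allclose(manual_log_prob, nb.feature_log_prob_))
```

The smoothing term is what keeps a word seen in only one class from receiving zero probability in the other.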



In [4]:
# Splitting the Data for Training and Testing
# (no random_state is set, so the split differs on every run)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Training the Model

model = MultinomialNB()
model.fit(X_train, y_train)

# 5. Model Evaluation
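The model's performance was measured using accuracy. In the run recorded below, the model scored **0.5** on the test set, meaning one of the two held-out reviews was classified correctly. Because train_test_split is called without a fixed random_state, this figure changes from run to run.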

**Possible Reasons for Poor Performance:**
* Extremely small dataset (only 8 reviews).
* Limited vocabulary: Test reviews may contain words not seen in training.
* Minimal preprocessing: CountVectorizer lowercases text and drops punctuation by default, but no stopword filtering or other normalization was applied.
* Lack of stratification in train_test_split might result in uneven class distribution in training/testing sets.

These limitations highlight the importance of data size and preprocessing in real-world NLP tasks.
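
The stratification point can be illustrated with a seeded, stratified split. This is a sketch, not the notebook's actual split: `random_state=42` and `test_size=0.25` are arbitrary choices made for the example, and it splits the raw texts rather than the vectorized matrix.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

reviews = [
    "I love this movie", "Horrible acting", "What a great film",
    "Worst movie ever", "Really enjoyed it", "It was terrible",
    "Fantastic performance", "Not good at all",
]
labels = ["positive", "negative"] * 4

# stratify=labels keeps the class ratio the same in both subsets;
# random_state makes the split repeatable across runs.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    reviews, labels, test_size=0.25, stratify=labels, random_state=42
)

print("train:", Counter(train_labels))
print("test: ", Counter(test_labels))
```

With stratification, the two test reviews always contain one positive and one negative example, so a single run cannot end up testing on one class only.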


In [5]:
# Evaluating the Model

y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.5


# 6. Testing with Custom Reviews

To evaluate the model further, multiple realistic movie reviews were manually input:

| Review Summary                        | Expected | Predicted |
|---------------------------------------|----------|-----------|
| I hated the plot                      | Negative | Negative  |
| John Wick review (positive)           | Positive | Negative  |
| Manchester by the Sea (neutral/mixed) | Neutral  | Negative  |
| Moonfall (negative)                   | Negative | Negative  |
| Bridesmaids (positive comedy)         | Positive | Negative  |
| Gone Girl (positive thriller)         | Positive | Positive  |

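The model labelled some of these reviews correctly but misclassified others, and because it only has two classes it cannot return "neutral" for a mixed-tone review. The underlying problem is its reliance on the very limited vocabulary of the training set: most words in these reviews never occur in the eight sample reviews, so they contribute nothing to the prediction.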
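
The vocabulary problem can be made visible by checking which tokens of a new review the vectorizer actually knows. The sketch below is illustrative: it fits on all eight sample reviews (the notebook fits the model on a random training split), uses the first sentence of the Bridesmaids review, and the names `train_reviews`, `vec`, `nb`, `known`, and `row` are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_reviews = [
    "I love this movie", "Horrible acting", "What a great film",
    "Worst movie ever", "Really enjoyed it", "It was terrible",
    "Fantastic performance", "Not good at all",
]
train_labels = ["positive", "negative"] * 4

vec = CountVectorizer()
nb = MultinomialNB().fit(vec.fit_transform(train_reviews), train_labels)

new_review = "Hilarious, outrageous, and surprisingly heartfelt."

# Tokenize the review the same way the vectorizer does, then keep only
# the tokens that exist in the training vocabulary.
tokens = vec.build_analyzer()(new_review)
known = [t for t in tokens if t in vec.vocabulary_]
row = vec.transform([new_review])
proba = nb.predict_proba(row)[0]

print("tokens:", tokens)
print("known to the model:", known)
print("non-zero features:", row.nnz)
print("class probabilities:", dict(zip(nb.classes_, proba)))
```

When no token is known, the feature vector is all zeros and the prediction comes from the class prior alone, so the output label says nothing about the review's content.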


In [6]:
# Testing My Own Reviews

My_review = ["I hated the plot"]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['negative']


In [7]:
My_review = ["John Wick is a masterclass in stylized action. Keanu Reeves brings a quiet intensity to the role, and the fight choreography is nothing short of breathtaking. From the neon-lit visuals to the pulsating soundtrack, everything clicks. It's not just a shoot-em-up — it's an operatic ballet of vengeance."]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['negative']


In [8]:
My_review = ["Manchester by the Sea is emotionally resonant but slow in parts. Casey Affleck delivers a powerful, subdued performance, but the pacing and minimalist dialogue might not work for everyone. It's beautifully acted and well-written, though at times it feels too bleak and drawn out."]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['negative']


In [9]:
My_review = ["Moonfall is a disaster movie in every sense of the word — and not in a good way. The science is laughable, the characters are one-dimensional, and the plot feels like it was written by an AI gone rogue. Even the special effects can't save this lunar mess."]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['negative']


In [10]:
My_review = ["Hilarious, outrageous, and surprisingly heartfelt. Kristen Wiig and Melissa McCarthy are comedy gold."]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['negative']


In [11]:
My_review = ["Gone Girl blends social commentary with psychological suspense in a brilliantly twisted narrative. David Fincher's direction is razor-sharp, and Rosamund Pike gives a career-defining performance as the elusive Amy Dunne. The film explores media manipulation, marital discontent, and the masks we wear in relationships. A disturbing yet compelling watch."]
My_review_vector = vectorizer.transform(My_review)
print("Prediction:", model.predict(My_review_vector))

Prediction: ['positive']


# 7. Recommendations for Improvement
To enhance model accuracy and applicability:
* Data Expansion: Train the model on hundreds or thousands of labeled reviews.
* Text Preprocessing: CountVectorizer already lowercases text and strips punctuation by default; consider adding stopword filtering.
* Feature Engineering: Try TfidfVectorizer instead of raw counts. Raw counts are straightforward but treat every word as equally informative, whereas TF-IDF down-weights words that occur in many reviews. Neither captures context or sarcasm, so for larger projects word embeddings may be more suitable (Jurafsky & Martin, 2021).
* Model Experimentation: Try other algorithms like Logistic Regression, SVM, or deep learning models.
* Cross-Validation: Implement K-fold cross-validation for robust evaluation.

These changes would improve generalization and allow the model to better handle complex, realistic reviews.
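
Several of these recommendations can be combined in a few lines. The following is a sketch under assumptions, not a tuned solution: it swaps in TfidfVectorizer and LogisticRegression with default settings, reuses the eight sample reviews only to show the mechanics, and picks 4 folds and `random_state=0` arbitrarily. The names `reviews`, `labels`, `pipeline`, `cv`, and `scores` are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

reviews = [
    "I love this movie", "Horrible acting", "What a great film",
    "Worst movie ever", "Really enjoyed it", "It was terrible",
    "Fantastic performance", "Not good at all",
]
labels = ["positive", "negative"] * 4

# Putting the vectorizer inside the pipeline means the vocabulary is
# learned from the training folds only, never from the held-out fold.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())

# Stratified folds keep one positive and one negative review in each
# held-out fold of this 8-review dataset.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, reviews, labels, cv=cv, scoring="accuracy")

print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

On a dataset this small the scores remain noisy; the value of the pattern is that it carries over unchanged once more labeled reviews are available.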

# 8. Reflections and Key Learnings

This project reinforced fundamental concepts in machine learning and NLP:
- The importance of data preprocessing in text classification.
- How sparse feature matrices affect model accuracy.
- The sensitivity of probabilistic models like Naive Bayes to vocabulary mismatch.
- The value of evaluating on real-world data beyond the test set.

Even though the model's performance was limited, the exercise demonstrated a complete pipeline from raw data to working prediction logic.

# 9. References
- Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. *Journal of Machine Learning Research, 12*, 2825–2830.
- Jurafsky, D., & Martin, J. H. (2021). *Speech and language processing* (3rd ed., draft). Stanford University. https://web.stanford.edu/~jurafsky/slp3/
