<h1>Review Classification using spaCy</h1>

<h4>Import Dependencies</h4>

In [57]:
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
import spacy

<h4>Load the spaCy English model</h4>

In [58]:
nlp = spacy.load("en_core_web_sm")

<h4>Load the dataset</h4>


In [59]:
df = pd.read_csv("IMDB Dataset.csv")

<h3>Convert labels to binary: positive -> 1, negative -> 0</h3>

In [60]:
df['sentiment'] = df['sentiment'].map({'positive': 1, 'negative': 0})

<h3>Divide the dataset into training, validation, and test sets</h3>

In [61]:
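# Hold out 20% of the full dataset as the test set
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
# Split the remaining 80% again: 20% of it (16% of the total) becomes the validation set
train_df, val_df = train_test_split(train_df, test_size=0.2, random_state=42)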
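The two successive 80/20 splits give roughly a 64/16/20 train/validation/test division. The sketch below works through the arithmetic, assuming the CSV holds 50,000 reviews (inferred from the support counts of 10,000 and 8,000 in the reports further down). It uses integer arithmetic as a simplification and does not call `train_test_split` itself.

```python
# Assumption: 50,000 rows in total, inferred from the reported support counts.
n_total = 50_000

# First split: 20% of everything goes to the test set.
n_test = n_total * 20 // 100
n_remaining = n_total - n_test

# Second split: 20% of the remaining rows go to the validation set.
n_val = n_remaining * 20 // 100
n_train = n_remaining - n_val

print("train:", n_train, "validation:", n_val, "test:", n_test)
print("fractions:", n_train / n_total, n_val / n_total, n_test / n_total)
```

So the model is trained on about 64% of the data, with 16% kept for validation and 20% for testing.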

<h3>Create a text classification pipeline</h3>

In [62]:
base_model = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('classifier', MultinomialNB())
])

In [63]:
# Train the base model with default hyperparameters
base_model.fit(train_df['review'], train_df['sentiment'])

<h3>Evaluate the base model on the test set</h3>

In [64]:
test_predictions = base_model.predict(test_df['review'])
test_accuracy = accuracy_score(test_df['sentiment'], test_predictions)
print("Test Accuracy (Base Model):", test_accuracy)
print(classification_report(test_df['sentiment'], test_predictions))

Test Accuracy (Base Model): 0.8608
              precision    recall  f1-score   support

           0       0.84      0.89      0.86      4961
           1       0.89      0.83      0.86      5039

    accuracy                           0.86     10000
   macro avg       0.86      0.86      0.86     10000
weighted avg       0.86      0.86      0.86     10000
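To make the report columns concrete, here is a minimal pure-Python sketch of how precision, recall, F1 and accuracy are derived for the positive class. The labels in `y_true` and `y_pred` are made up for illustration and are not taken from the IMDB data.

```python
# Hypothetical toy labels (1 = positive, 0 = negative), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / len(pairs)

print(precision, recall, f1, accuracy)
```

The `support` column in the report is simply the number of true samples in each class.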


<h3>Evaluate the base model on the validation set</h3>

In [65]:
val_base_predictions = base_model.predict(val_df['review'])
val_base_accuracy = accuracy_score(val_df['sentiment'], val_base_predictions)
print("Validation Accuracy (Base Model):", val_base_accuracy)
print(classification_report(val_df['sentiment'], val_base_predictions))

Validation Accuracy (Base Model): 0.863
              precision    recall  f1-score   support

           0       0.84      0.90      0.87      3959
           1       0.89      0.83      0.86      4041

    accuracy                           0.86      8000
   macro avg       0.86      0.86      0.86      8000
weighted avg       0.87      0.86      0.86      8000


<h3>Define hyperparameters</h3>

In [66]:
param_grid = {
    'vectorizer__ngram_range': [(1, 1), (1, 2)],  # Uni-gram and Bi-gram
    'tfidf__use_idf': [True, False],
    'classifier__alpha': [0.1, 0.01, 0.001, 0.0001]
}
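A grid search tries every combination of the listed values, so the grid above yields 2 × 2 × 4 = 16 candidates. The sketch below enumerates them with `itertools.product` from the standard library. It only counts combinations and does not train anything; `n_folds = 3` is an assumption matching 3-fold cross-validation.

```python
from itertools import product

param_grid = {
    'vectorizer__ngram_range': [(1, 1), (1, 2)],
    'tfidf__use_idf': [True, False],
    'classifier__alpha': [0.1, 0.01, 0.001, 0.0001],
}

# Build one dict per combination of hyperparameter values.
candidates = [
    dict(zip(param_grid, combo))
    for combo in product(*param_grid.values())
]

n_folds = 3  # assumption: 3-fold cross-validation
total_fits = len(candidates) * n_folds

print("candidates:", len(candidates))
print("total fits:", total_fits)
```

Each candidate is fitted once per fold, which is why the cost grows quickly as more values are added to the grid.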

<h3>Perform hyperparameter tuning</h3>

In [67]:
grid_search = GridSearchCV(base_model, param_grid, cv=3, scoring='accuracy', verbose=2)
grid_search.fit(train_df['review'], train_df['sentiment'])
print("Best Hyperparameters:", grid_search.best_params_)
best_model = grid_search.best_estimator_

Fitting 3 folds for each of 16 candidates, totalling 48 fits
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 1); total time=   3.7s
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 1); total time=   3.6s
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 1); total time=   3.6s
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 2); total time=  16.1s
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 2); total time=  16.3s
[CV] END classifier__alpha=0.1, tfidf__use_idf=True, vectorizer__ngram_range=(1, 2); total time=  16.1s
[CV] END classifier__alpha=0.1, tfidf__use_idf=False, vectorizer__ngram_range=(1, 1); total time=   3.6s
[CV] END classifier__alpha=0.1, tfidf__use_idf=False, vectorizer__ngram_range=(1, 1); total time=   3.7s
[CV] END classifier__alpha=0.1, tfidf__use_idf=False, vectorizer__ngram_range=(1, 1); total time=   3.7s


<h3>Evaluate the tuned model on the test set</h3>

In [68]:
test_predictions = best_model.predict(test_df['review'])
test_accuracy = accuracy_score(test_df['sentiment'], test_predictions)
print("Test Accuracy (Tuned Model):", test_accuracy)
print(classification_report(test_df['sentiment'], test_predictions))

Test Accuracy (Tuned Model): 0.8933
              precision    recall  f1-score   support

           0       0.88      0.90      0.89      4961
           1       0.90      0.88      0.89      5039

    accuracy                           0.89     10000
   macro avg       0.89      0.89      0.89     10000
weighted avg       0.89      0.89      0.89     10000


<h3>Evaluate the tuned model on the validation set</h3>

In [69]:
val_predictions = best_model.predict(val_df['review'])
val_accuracy = accuracy_score(val_df['sentiment'], val_predictions)
print("Validation Accuracy (Tuned Model):", val_accuracy)
print(classification_report(val_df['sentiment'], val_predictions))

Validation Accuracy (Tuned Model): 0.89725
              precision    recall  f1-score   support

           0       0.88      0.91      0.90      3959
           1       0.91      0.88      0.90      4041

    accuracy                           0.90      8000
   macro avg       0.90      0.90      0.90      8000
weighted avg       0.90      0.90      0.90      8000


<h2>Model testing</h2>
<hr>

<h3>Predict sentiment using the trained model</h3>

In [71]:
# Input review
review = "This movie was great."

predicted_sentiment = best_model.predict([review])[0]
predicted_sentiment_label = 'positive' if predicted_sentiment == 1 else 'negative'
print("Predicted Sentiment:", predicted_sentiment_label)

Predicted Sentiment: positive
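
Beyond the hard label, a `MultinomialNB` pipeline can also report class probabilities through `predict_proba`. The following is a minimal sketch on a tiny made-up corpus: `toy_reviews`, `toy_labels` and `toy_model` are hypothetical stand-ins, not the IMDB data or the tuned model from this notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus, for illustration only (1 = positive, 0 = negative).
toy_reviews = [
    "great movie, loved it",
    "wonderful and great acting",
    "terrible movie, hated it",
    "boring and terrible plot",
]
toy_labels = [1, 1, 0, 0]

# Same three-step structure as the pipeline used in this notebook.
toy_model = Pipeline([
    ('vectorizer', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('classifier', MultinomialNB()),
])
toy_model.fit(toy_reviews, toy_labels)

review = "great acting"
probabilities = toy_model.predict_proba([review])[0]

# classes_ gives the column order of predict_proba.
class_order = list(toy_model.classes_)
prob_positive = probabilities[class_order.index(1)]
label = 'positive' if prob_positive >= 0.5 else 'negative'

print(f"P(positive) = {prob_positive:.3f} -> {label}")
```

With the tuned model, the same call would be `best_model.predict_proba([review])`, which can help flag reviews where the prediction is close to 50/50.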
