Naive Bayes Bag-of-Words Sentiment Classifier
Trains a Naive Bayes classifier to predict the sentiment of a movie review (positive or negative). The assignment code has been cleaned up and streamlined to make it easier to read and use. The complete solution to the assignment is therefore not included here, only the parts I considered most relevant to share.
Modifications to Instructor Implementations
feature_extractor member that defaults to
tokenize_and_update_model: Switched to use feature_extractor member rather than
Implementations I provided
To train a Naive Bayes classifier on the large_movie_review_dataset data using a feature extractor that stems words and removes both standard and custom stopwords:
This command trains the model with every pseudocount from 1 to 25 (inclusive), creates a graph of pseudocount vs. accuracy, and returns the best pseudocount together with its accuracy.
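The sweep described above can be sketched generically as follows. This is a minimal illustration, not the code from this repository: `sweep_pseudocounts` and `toy_accuracy` are hypothetical names, and the accuracy function is a made-up stand-in for a real train-and-evaluate call.

```python
from typing import Callable, Dict, Tuple


def sweep_pseudocounts(
    evaluate: Callable[[int], float],
    low: int = 1,
    high: int = 25,
) -> Tuple[int, float, Dict[int, float]]:
    """Evaluate every pseudocount in [low, high] and return the best one."""
    # Accuracy for each pseudocount, in increasing order of pseudocount.
    accuracies = {k: evaluate(k) for k in range(low, high + 1)}
    # max() keeps the first maximal key, so ties go to the smallest pseudocount.
    best = max(accuracies, key=accuracies.get)
    return best, accuracies[best], accuracies


def toy_accuracy(k: int) -> float:
    """Hypothetical accuracy curve that peaks at k = 7 (illustration only)."""
    return 0.80 - 0.001 * (k - 7) ** 2


best_k, best_acc, curve = sweep_pseudocounts(toy_accuracy)
print(best_k, round(best_acc, 3))
```

The returned `curve` dictionary holds the points needed for a pseudocount vs. accuracy plot. If `evaluate_classifier_accuracy` returns the accuracy as a number (an assumption; the source does not say), it could be passed in place of `toy_accuracy`.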
    from nb_sentiment_classify import NaiveBayes

    # Initialize model with default feature extractor
    nb = NaiveBayes()

    # Train model on large_movie_review_dataset
    nb.train_model()

    # Evaluate accuracy given a pseudocount (1 used in this example)
    nb.evaluate_classifier_accuracy(1)
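To show where a pseudocount enters a bag-of-words Naive Bayes classifier, here is a small self-contained sketch of additive smoothing. It is a generic multinomial Naive Bayes written for illustration, not the contents of `nb_sentiment_classify`; `train_counts`, `classify`, and the toy reviews are all hypothetical.

```python
import math
from collections import Counter


def train_counts(docs):
    """docs: iterable of (tokens, label). Returns per-class word and document counts."""
    word_counts = {}
    doc_counts = Counter()
    for tokens, label in docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
    return word_counts, doc_counts


def classify(tokens, word_counts, doc_counts, pseudocount=1.0):
    """Return the label with the highest smoothed log-probability."""
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Smoothed denominator: class token total plus pseudocount per vocab word.
        denom = sum(counts.values()) + pseudocount * len(vocab)
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # ignore words never seen in training
                score += math.log((counts[tok] + pseudocount) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data (made up for illustration).
toy_docs = [
    ("great fun great".split(), "pos"),
    ("boring bad".split(), "neg"),
    ("fun plot".split(), "pos"),
    ("bad plot".split(), "neg"),
]
wc, dc = train_counts(toy_docs)
print(classify("great plot".split(), wc, dc))
print(classify("boring bad".split(), wc, dc))
```

With a pseudocount of 0, any word that appears in one class but not the other would give a probability of zero and `math.log` would raise an error, which is why the pseudocount is kept positive and tuned rather than dropped.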