# Improving Sentiment Analysis Accuracy with Python and Machine Learning

Summary of the sentiment analysis model:

1. Data Loading: The model begins by loading the movie_reviews dataset from NLTK, which consists of movie reviews labeled as either 'pos' (positive) or 'neg' (negative).

2. Data Preparation: The reviews are shuffled and then prepared for the model. Each review is transformed into a single string (as opposed to a list of words), and the labels are extracted.

3. Vectorization: The texts of the reviews are vectorized using the CountVectorizer from Scikit-learn. This transforms the text into a matrix of token counts, which can be used as input for the machine learning model.

4. Data Splitting: The dataset is split into a training set and a testing set. 80% of the data is used for training the model, and 20% is used for testing its performance.

5. Model Training: A Multinomial Naive Bayes classifier is trained on the training data. This is a suitable model for classification with discrete features (like word counts).

6. Prediction: The trained model is used to predict the labels (i.e., sentiments) for the test set.

7. Evaluation: Finally, the performance of the model is evaluated by comparing the predicted labels to the true labels. The classification report includes metrics like precision, recall, and F1-score for both the 'pos' and 'neg' classes, as well as the overall accuracy of the model.
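The vectorization, training, and prediction steps above can be sketched on a tiny hand-written corpus before running them on the full dataset. The sentences, labels, and `toy_` variable names below are invented for illustration and are not part of the movie_reviews data; this is a minimal sketch rather than the notebook's actual cell.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus (made up for illustration, not from movie_reviews)
toy_texts = [
    "a wonderful , moving film",
    "great acting and a great story",
    "a dull , boring mess",
    "terrible plot and awful acting",
]
toy_labels = ["pos", "pos", "neg", "neg"]

# Turn each text into a row of token counts
toy_vectorizer = CountVectorizer()
toy_features = toy_vectorizer.fit_transform(toy_texts)

# Fit Multinomial Naive Bayes on the count matrix
toy_classifier = MultinomialNB()
toy_classifier.fit(toy_features, toy_labels)

# New text must go through the same fitted vectorizer (transform, not fit_transform)
new_features = toy_vectorizer.transform(["a great and wonderful story"])
toy_prediction = toy_classifier.predict(new_features)[0]
print(toy_prediction)
```

Note that the new sentence is passed through `transform`, so it is encoded with the vocabulary learned from the training texts; words the vectorizer has never seen are ignored.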

In [1]:
import random

import nltk
from nltk.corpus import movie_reviews
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Download the corpus if it is not already available locally
nltk.download("movie_reviews", quiet=True)

# Load the dataset as (list of words, category) pairs
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]

# Shuffle the documents (unseeded, so the order differs between runs)
random.shuffle(documents)

# Prepare the dataset
texts = [" ".join(doc) for doc, _ in documents]
labels = [label for _, label in documents]

# Vectorize the texts
vectorizer = CountVectorizer()
features = vectorizer.fit_transform(texts)

# Split the dataset into training (80%) and testing (20%) sets
features_train, features_test, labels_train, labels_test = train_test_split(
    features, labels, test_size=0.2, random_state=42)

# Train a Naive Bayes classifier
classifier = MultinomialNB()
classifier.fit(features_train, labels_train)

# Predict the labels for the test set
labels_pred = classifier.predict(features_test)

# Print the classification report
print(classification_report(labels_test, labels_pred))

              precision    recall  f1-score   support

         neg       0.80      0.82      0.81       195
         pos       0.82      0.81      0.82       205

    accuracy                           0.81       400
   macro avg       0.81      0.81      0.81       400
weighted avg       0.81      0.81      0.81       400
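If the numbers in a report like this are needed programmatically rather than as printed text, `classification_report` also accepts `output_dict=True` and returns a nested dictionary. The sketch below uses a small made-up pair of label lists (`demo_true` and `demo_pred` are hypothetical names, not the notebook's test data).

```python
from sklearn.metrics import classification_report

# Hypothetical labels, invented for illustration
demo_true = ["neg", "neg", "pos", "pos"]
demo_pred = ["neg", "pos", "pos", "pos"]

# output_dict=True returns a dictionary instead of a formatted string
report = classification_report(demo_true, demo_pred, output_dict=True)

print(report["accuracy"])          # overall accuracy
print(report["neg"]["precision"])  # per-class metrics are keyed by label
print(report["pos"]["recall"])
print(report["macro avg"]["f1-score"])
```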



## Understanding the classification report

The output above is a classification report, which summarizes the performance of a classification model on the test data. Here's what each term means:

- Precision: For a given class, this is the ratio of true positives (reviews correctly assigned to that class) to the sum of true positives and false positives (reviews incorrectly assigned to that class). In this report, precision is 0.80 for 'neg' and 0.82 for 'pos', which means that 80% of the reviews the model labeled as negative were actually negative, and 82% of the reviews it labeled as positive were actually positive.

- Recall: For a given class, this is the ratio of true positives to the sum of true positives and false negatives (reviews of that class that the model assigned to the other class). In this report, recall is 0.82 for 'neg' and 0.81 for 'pos', which means that the model correctly identified 82% of the negative reviews and 81% of the positive reviews.

- F1-score: This is the harmonic mean of precision and recall. An F1 score reaches its best value at 1 (perfect precision and recall) and its worst at 0. In this report, the F1 score is 0.81 for 'neg' and 0.82 for 'pos', so the model performs about equally well on both classes.

- Support: This is the number of occurrences of each class in the true data. In this report, there were 195 negative reviews and 205 positive reviews in the test data.

- Macro avg: This is the unweighted mean of the metric across classes, so each class counts equally regardless of how many samples it has.

- Weighted avg: This is the mean of the metric across classes, with each class weighted by its support (its share of the true data).

- Accuracy: This is the ratio of the total number of correct predictions to the total number of predictions. In this report, the accuracy of the model is 0.81, which means that the model correctly predicted the sentiment of 81% of the 400 test reviews.
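To make the definitions above concrete, the following sketch computes the same metrics on ten made-up labels, first by hand for the 'neg' class and then with scikit-learn. The label lists are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the notebook's test set.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical true and predicted labels (4 'neg', 6 'pos')
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos", "pos"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos", "neg", "neg"]

# Hand computation for the 'neg' class
pairs = list(zip(y_true, y_pred))
tp = sum(t == "neg" and p == "neg" for t, p in pairs)  # correctly labeled neg
fp = sum(t == "pos" and p == "neg" for t, p in pairs)  # pos wrongly labeled neg
fn = sum(t == "neg" and p == "pos" for t, p in pairs)  # neg wrongly labeled pos
neg_precision = tp / (tp + fp)
neg_recall = tp / (tp + fn)
neg_f1 = 2 * neg_precision * neg_recall / (neg_precision + neg_recall)

# Same metrics for both classes from scikit-learn (arrays ordered as in labels=)
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=["neg", "pos"])

# Macro average ignores class sizes; weighted average uses support as weights
macro_precision = precision.mean()
weighted_precision = (precision * support).sum() / support.sum()
accuracy = accuracy_score(y_true, y_pred)

print(neg_precision, neg_recall, neg_f1)
print(precision, recall, f1, support)
print(macro_precision, weighted_precision, accuracy)
```

Because the two classes have different sizes here, the macro and weighted averages of precision differ slightly; in the report above the classes are nearly balanced, which is why both averages round to 0.81.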