# Simple Sentiment Analysis using NLTK

This notebook demonstrates how to perform basic sentiment analysis on text comments using NLTK (Natural Language Toolkit). We'll break down the process into several steps:

1. Installing and importing required libraries
2. Preparing text data and preprocessing
3. Creating a simple sentiment analyzer
4. Testing the analyzer with example comments

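## Step 1: Install and Import Required Libraries

First, we'll import NLTK, which provides tools for natural language processing, and download the data it needs: the `punkt` tokenizer models, the English stopword list, and the VADER lexicon. If NLTK is not installed yet, run `pip install nltk` (or `%pip install nltk` in a notebook cell) before executing the cell below.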
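The import cell fails with `ModuleNotFoundError` if NLTK is missing. A minimal sketch of a pre-flight check, using only the standard library, is shown here; the helper name `is_installed` is an illustrative assumption and not part of NLTK.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

# Report a hint instead of failing later at import time
if not is_installed("nltk"):
    print("NLTK is not installed; run: pip install nltk")
```

The check only confirms that the package can be found. It does not verify that the `punkt`, `stopwords`, or `vader_lexicon` data has been downloaded.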

In [1]:
import nltk
import re
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer

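# Download required NLTK data (tokenizer models, stopword list, VADER lexicon)
# Note: recent NLTK releases look up 'punkt_tab' rather than 'punkt'. If
# word_tokenize raises a LookupError, also run nltk.download('punkt_tab').
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('vader_lexicon')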

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\KHILJI\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\KHILJI\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\stopwords.zip.
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\KHILJI\AppData\Roaming\nltk_data...


True

## Step 2: Text Preprocessing

We'll create a function that cleans and preprocesses text in five steps:

1. Convert to lowercase
2. Remove special characters and numbers
3. Tokenize the text
4. Remove stopwords
5. Join the remaining tokens back into a single string

In [2]:
def preprocess_text(text):
    # Convert to lowercase
    text = text.lower()
    
    # Remove special characters and numbers
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    
    # Tokenize the text
    tokens = word_tokenize(text)
    
    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    tokens = [token for token in tokens if token not in stop_words]
    
    # Join tokens back into text
    return ' '.join(tokens)

# Example usage
sample_text = "This movie was really great! I enjoyed it a lot. 10/10 would recommend!"
preprocessed_text = preprocess_text(sample_text)
print("Original text:", sample_text)
print("Preprocessed text:", preprocessed_text)

Original text: This movie was really great! I enjoyed it a lot. 10/10 would recommend!
Preprocessed text: movie really great enjoyed lot would recommend
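
To see what the first two steps do on their own, here is a small sketch that uses only the standard library. The helper name `strip_non_letters` is hypothetical; it reuses the same lowercasing and regular expression as `preprocess_text` but skips the NLTK tokenization and stopword steps.

```python
import re

def strip_non_letters(text):
    # Same lowercasing and regex as preprocess_text, without the NLTK steps
    return re.sub(r'[^a-zA-Z\s]', '', text.lower())

# The apostrophe is deleted, so "doesn't" becomes "doesnt"
print(strip_non_letters("It doesn't work at all."))

# Digits and the slash are deleted, so the "10/10" rating disappears
print(strip_non_letters("10/10 would recommend!"))
```

Two side effects are worth keeping in mind: contractions lose their apostrophe, and numeric ratings such as "10/10" are removed entirely, even though both can carry sentiment.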


## Step 3: Sentiment Analysis

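We'll use NLTK's VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analyzer, a lexicon- and rule-based model tuned for social media text and other short, informal comments. For each text it returns the proportions of negative, neutral, and positive content (`neg`, `neu`, `pos`) plus a normalized `compound` score between -1 and 1. We label a text Positive when the compound score is at least 0.05, Negative when it is at most -0.05, and Neutral otherwise.

Note that the comments are passed to VADER as-is rather than through `preprocess_text`. VADER's rules rely on punctuation, capitalization, and negation words, all of which the preprocessing in Step 2 strips out.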

In [3]:
# Create the analyzer once; rebuilding it on every call reloads the lexicon
sid = SentimentIntensityAnalyzer()

def analyze_sentiment(text):
    # Get sentiment scores: 'neg', 'neu', 'pos', and 'compound'
    scores = sid.polarity_scores(text)

    # Determine sentiment based on compound score
    if scores['compound'] >= 0.05:
        sentiment = 'Positive'
    elif scores['compound'] <= -0.05:
        sentiment = 'Negative'
    else:
        sentiment = 'Neutral'

    return sentiment, scores

# Example comments
comments = [
    "This product is amazing! Best purchase ever!",
    "I really hate this, it's terrible and doesn't work at all.",
    "It's okay, nothing special but gets the job done."
]

# Analyze each comment
for comment in comments:
    sentiment, scores = analyze_sentiment(comment)
    print("\nComment:", comment)
    print("Sentiment:", sentiment)
    print("Scores:", scores)


Comment: This product is amazing! Best purchase ever!
Sentiment: Positive
Scores: {'neg': 0.0, 'neu': 0.368, 'pos': 0.632, 'compound': 0.8619}

Comment: I really hate this, it's terrible and doesn't work at all.
Sentiment: Negative
Scores: {'neg': 0.47, 'neu': 0.53, 'pos': 0.0, 'compound': -0.796}

Comment: It's okay, nothing special but gets the job done.
Sentiment: Neutral
Scores: {'neg': 0.162, 'neu': 0.695, 'pos': 0.144, 'compound': -0.0462}
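
The third comment has a slightly negative compound score (-0.0462) yet is labeled Neutral, because it falls inside the band between -0.05 and 0.05. The sketch below isolates that thresholding logic so it can be tried without NLTK. The function name `label_from_compound` and its `threshold` parameter are illustrative assumptions, and the scores are copied from the outputs above.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a sentiment label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

# Compound scores copied from the three outputs above
for compound in (0.8619, -0.796, -0.0462):
    print(compound, "->", label_from_compound(compound))
```

With a narrower band, for example `threshold=0.01`, the third comment would be labeled Negative instead, so the choice of threshold matters most for lukewarm comments like this one.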


## Step 4: Testing with Example Comments

The cell below runs the analyzer on three more comments, one for each expected sentiment. To try your own text, edit the `test_comments` list and rerun the cell.

In [4]:
# Test comments with different sentiments
test_comments = [
    "The customer service was outstanding! They went above and beyond to help me. I'm extremely satisfied!",  # Positive
    "This is the worst experience ever. The product broke after one day and no one helped me. Complete waste of money.",  # Negative
    "The product works as expected. It does what it's supposed to do."  # Neutral
]

print("Analyzing different types of comments:\n")
for i, comment in enumerate(test_comments, 1):
    sentiment, scores = analyze_sentiment(comment)
    print(f"\n=== Comment {i} ===")
    print("Text:", comment)
    print("Sentiment:", sentiment)
    print("Detailed scores:", scores)

Analyzing different types of comments:


=== Comment 1 ===
Text: The customer service was outstanding! They went above and beyond to help me. I'm extremely satisfied!
Sentiment: Positive
Detailed scores: {'neg': 0.0, 'neu': 0.556, 'pos': 0.444, 'compound': 0.8854}

=== Comment 2 ===
Text: This is the worst experience ever. The product broke after one day and no one helped me. Complete waste of money.
Sentiment: Negative
Detailed scores: {'neg': 0.412, 'neu': 0.588, 'pos': 0.0, 'compound': -0.8979}

=== Comment 3 ===
Text: The product works as expected. It does what it's supposed to do.
Sentiment: Neutral
Detailed scores: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
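
When the list of test comments grows, reading each result by eye stops being practical. As a rough sketch, the expected labels from the code comments can be compared with the predicted labels and tallied. The helper name `summarize` is hypothetical, and the label pairs below are copied by hand from the three results above rather than produced by the analyzer.

```python
from collections import Counter

# (expected, predicted) labels copied from the three results above
results = [
    ("Positive", "Positive"),
    ("Negative", "Negative"),
    ("Neutral", "Neutral"),
]

def summarize(pairs):
    """Return the predicted-label counts and the fraction of matches."""
    counts = Counter(predicted for _, predicted in pairs)
    matches = sum(expected == predicted for expected, predicted in pairs)
    accuracy = matches / len(pairs) if pairs else 0.0
    return counts, accuracy

counts, accuracy = summarize(results)
print("Predicted label counts:", dict(counts))
print("Share matching the expected label:", accuracy)
```

Three hand-picked comments are far too few to say anything about how well the analyzer works in general; the tally is only a convenience for spotting mismatches.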
