In [None]:
"""Project Description: Customer Review Automation Using SentimentIntensityAnalyzer
Objective
The objective of this project is to automate the analysis of customer reviews to determine their sentiment. By using the SentimentIntensityAnalyzer from the VADER (Valence Aware Dictionary and sEntiment Reasoner) library, we aim to categorize reviews as positive, negative, or neutral, and generate a summary report.

Steps Involved:

1. Environment Setup
2. Data Collection
3. Data Preprocessing
4. Sentiment Analysis
5. Result Visualization and Reporting


Step-by-Step Guide

Step 1: Environment Setup
Ensure you have Python installed. Then, install the necessary libraries:


!pip install nltk pandas matplotlib
Then download the VADER lexicon:

"""
import nltk
nltk.download('vader_lexicon')
"""
Step 2: Data Collection
For this example, we'll use a CSV file containing customer reviews. The file reviews.csv should have a column named review containing the text of the reviews.
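
If no such file is at hand, a small sample can be generated to try the pipeline. The sketch below is only an illustration: the three reviews are made up, write_sample_reviews is a hypothetical helper, and it assumes that overwriting any existing reviews.csv in the working directory is acceptable.

```python
import csv

# Made-up sample rows; replace with real customer reviews.
sample_reviews = [
    "I love this product, it works perfectly!",
    "Terrible quality, it broke after two days.",
    "The package arrived on Tuesday.",
]

def write_sample_reviews(path, reviews):
    # Write one review per row under a single 'review' header.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["review"])
        for text in reviews:
            writer.writerow([text])

# Assumption: overwriting reviews.csv in the working directory is fine.
write_sample_reviews("reviews.csv", sample_reviews)
```

The csv module quotes fields that contain commas, so each review stays in a single column when the file is read back.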

Step 3: Data Preprocessing
Load and preprocess the data:
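
The code for this step only loads the file. If the review column may contain empty cells or stray whitespace, a cleaning helper along the following lines could be applied after loading. This is a sketch under the assumption that missing or blank reviews should be dropped; clean_review is a hypothetical name, not a pandas or nltk function.

```python
import math

def clean_review(value):
    # Hypothetical helper: return a trimmed review string,
    # or None when the cell is missing or blank.
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None  # pandas represents empty CSV cells as NaN
    text = " ".join(str(value).split())  # collapse repeated whitespace
    return text or None

raw_reviews = ["  Great   product! ", float("nan"), "", "Too slow"]
cleaned = [clean_review(r) for r in raw_reviews]
print(cleaned)
```

Once the dataframe is loaded, the helper could be applied with df['review'] = df['review'].map(clean_review), followed by df = df.dropna(subset=['review']) to discard the rows it marked as missing.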

"""
import pandas as pd

# Load the data
df = pd.read_csv('reviews.csv')

# Display the first few rows
print(df.head())
"""
Step 4: Sentiment Analysis
Perform sentiment analysis using the SentimentIntensityAnalyzer:

"""
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Initialize the SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()

# Define a function to categorize sentiment
def analyze_sentiment(review):
    score = sia.polarity_scores(review)
    if score['compound'] >= 0.05:
        return 'Positive'
    elif score['compound'] <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

# Apply the function to the dataframe
df['sentiment'] = df['review'].apply(analyze_sentiment)

# Display the updated dataframe
print(df.head())
"""
Step 5: Result Visualization and Reporting
Visualize the sentiment distribution and save the results:

"""
import matplotlib.pyplot as plt

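# Plot the sentiment distribution in a fixed order so each color matches its label
# (value_counts() sorts by frequency, which would otherwise shuffle the colors)
order = ['Positive', 'Negative', 'Neutral']
counts = df['sentiment'].value_counts().reindex(order, fill_value=0)
counts.plot(kind='bar', color=['green', 'red', 'blue'])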
plt.title('Sentiment Distribution')
plt.xlabel('Sentiment')
plt.ylabel('Number of Reviews')
plt.show()

# Save the result to a new CSV file
df.to_csv('reviews_with_sentiment.csv', index=False)

In [None]:
#Full Python Script
#Here is the complete script for the project:
import pandas as pd
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import nltk

# Ensure VADER lexicon is downloaded
nltk.download('vader_lexicon')

# Load the data
df = pd.read_csv('reviews.csv')

# Initialize the SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()

# Define a function to categorize sentiment
def analyze_sentiment(review):
    score = sia.polarity_scores(review)
    if score['compound'] >= 0.05:
        return 'Positive'
    elif score['compound'] <= -0.05:
        return 'Negative'
    else:
        return 'Neutral'

# Apply the function to the dataframe
df['sentiment'] = df['review'].apply(analyze_sentiment)

# Display the updated dataframe
print(df.head())

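# Plot the sentiment distribution in a fixed order so each color matches its label
# (value_counts() sorts by frequency, which would otherwise shuffle the colors)
order = ['Positive', 'Negative', 'Neutral']
counts = df['sentiment'].value_counts().reindex(order, fill_value=0)
counts.plot(kind='bar', color=['green', 'red', 'blue'])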
plt.title('Sentiment Distribution')
plt.xlabel('Sentiment')
plt.ylabel('Number of Reviews')
plt.show()

# Save the result to a new CSV file
df.to_csv('reviews_with_sentiment.csv', index=False)

"""
Conclusion
This project demonstrates how to automate customer review sentiment analysis using the SentimentIntensityAnalyzer from the VADER library. By following these steps, you can quickly categorize customer reviews and generate insightful visualizations and reports.
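
The script saves the labeled reviews and a chart, but it does not print a text summary. One possible sketch of such a report follows, assuming the summary is simply the count and percentage of each label. summarize_sentiments is a hypothetical helper and the sample labels are made up for illustration.

```python
from collections import Counter

def summarize_sentiments(labels):
    # Hypothetical helper: count each label and compute its share of the total.
    counts = Counter(labels)
    total = sum(counts.values())
    report = {}
    for label in ("Positive", "Negative", "Neutral"):
        n = counts.get(label, 0)
        share = round(100 * n / total, 1) if total else 0.0
        report[label] = {"count": n, "percent": share}
    return report

# Made-up labels standing in for df['sentiment'].tolist()
sample = ["Positive", "Positive", "Negative", "Neutral", "Positive"]
for label, row in summarize_sentiments(sample).items():
    print(label, row["count"], row["percent"])
```

Iterating over a fixed tuple of labels keeps the report order stable and lists a label with a zero count even when no review received it.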

"""

In [None]:
"""
Explanation of the VADER Library and SentimentIntensityAnalyzer

The VADER (Valence Aware Dictionary and sEntiment Reasoner) library is a lexicon- and rule-based tool designed for sentiment analysis of text. It was built for social media text such as tweets, and it also works well on other short, informal text such as customer reviews. VADER ships with the Natural Language Toolkit (nltk) as the nltk.sentiment.vader module and is also available as the standalone vaderSentiment package. It is widely used because it is simple to apply and needs no training data.

Key Features of VADER
Human-Curated Lexicon:

VADER includes a lexicon of words with associated sentiment intensity scores. These scores range from -4 (most negative) to +4 (most positive). The scores were collected from multiple human raters, who judged the perceived sentiment intensity of each entry, and were then averaged per entry.
Valence Shifting:

VADER takes into account the effect of punctuation, capitalization, and degree modifiers (such as "extremely", "very", "slightly") on the sentiment intensity. For example, "This is good!" will have a different score compared to "This is GOOD!!!".
Handling of Conjunctions:

The library considers the impact of conjunctions like "but" that can shift the sentiment of a sentence. For instance, in the sentence "The movie was good, but the ending was bad", the sentiment after "but" has more weight.
Emoji and Slang Recognition:

VADER can understand and interpret emojis and common internet slang, enhancing its ability to accurately gauge sentiment from social media and informal text.
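
The conjunction rule can be illustrated with a toy sketch. It is not VADER's implementation: the two-word lexicon and its valences are invented for the example, and only the idea of damping words before "but" (assumed factor 0.5) and boosting words after it (assumed factor 1.5) is modeled.

```python
# Toy valences invented for illustration; not VADER's real lexicon values.
TOY_LEXICON = {"good": 2.0, "bad": -2.0}

def but_weighted_valences(sentence, before=0.5, after=1.5):
    # Score each token, then damp scores before "but" and boost scores after it.
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    scores = [TOY_LEXICON.get(t, 0.0) for t in tokens]
    if "but" in tokens:
        pivot = tokens.index("but")
        scores = [
            s * before if i < pivot else s * after if i > pivot else s
            for i, s in enumerate(scores)
        ]
    return scores

weighted = but_weighted_valences("The movie was good, but the ending was bad")
print(sum(weighted))
```

With equal raw valences for "good" and "bad", the weighted sum comes out negative, which is the shift toward the clause after "but" described above.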
SentimentIntensityAnalyzer
The SentimentIntensityAnalyzer is the primary class in VADER for performing sentiment analysis. Given a piece of text, it returns positive, negative, and neutral proportions together with a single compound score that summarizes the overall sentiment.

How SentimentIntensityAnalyzer Works
Initialization:

You first need to initialize the SentimentIntensityAnalyzer from the VADER library.
Polarity Scores:

The polarity_scores method of SentimentIntensityAnalyzer is used to get a dictionary of scores for a given text. The dictionary contains:
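pos: Proportion of the text rated as positive.
neu: Proportion of the text rated as neutral.
neg: Proportion of the text rated as negative. Together, pos, neu, and neg sum to approximately 1.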
compound: A normalized score that sums up the overall sentiment of the text. The compound score ranges from -1 (most extreme negative) to +1 (most extreme positive).
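
The normalization behind the compound score can be sketched as follows. In the VADER reference implementation, the summed word valences x are mapped to x / sqrt(x^2 + alpha) with alpha = 15. The function name normalize_valence is illustrative, and the sketch omits the rule-based adjustments that are applied before the sum is taken.

```python
import math

def normalize_valence(total, alpha=15):
    # Squash an unbounded valence sum into the open interval (-1, 1).
    return total / math.sqrt(total * total + alpha)

for total in (-4.0, 0.0, 1.0, 4.0):
    print(total, round(normalize_valence(total), 4))
```

Larger absolute sums approach -1 or +1 without reaching them, so texts of different lengths can be compared on one scale.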
Example of Using SentimentIntensityAnalyzer
Here's an example to illustrate how the SentimentIntensityAnalyzer works:
"""

from nltk.sentiment.vader import SentimentIntensityAnalyzer
import nltk

# Ensure VADER lexicon is downloaded
nltk.download('vader_lexicon')

# Initialize the SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()

# Example text
text = "The movie was great, but the ending was terrible!"

# Analyze the sentiment
scores = sia.polarity_scores(text)

print(scores)
"""Output:

{'neg': 0.297, 'neu': 0.466, 'pos': 0.237, 'compound': -0.296}

In this example (the values shown are illustrative, and the numbers printed when you run the code may differ):

The neg score (0.297) indicates the proportion of the text that is negative.
The neu score (0.466) indicates the proportion of the text that is neutral.
The pos score (0.237) indicates the proportion of the text that is positive.
The compound score (-0.296) is the overall sentiment score, suggesting that the sentiment is slightly negative.
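
As a quick check of this reading, the sketch below takes the scores shown above, confirms that the three proportions sum to about 1, and applies the same cut-offs as analyze_sentiment in Step 4. label_from_compound is a hypothetical helper name.

```python
# Scores copied from the example output above
scores = {'neg': 0.297, 'neu': 0.466, 'pos': 0.237, 'compound': -0.296}

def label_from_compound(compound, threshold=0.05):
    # Same cut-offs as analyze_sentiment in Step 4
    if compound >= threshold:
        return 'Positive'
    if compound <= -threshold:
        return 'Negative'
    return 'Neutral'

proportions = scores['neg'] + scores['neu'] + scores['pos']
label = label_from_compound(scores['compound'])
print(round(proportions, 3))
print(label)
```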

Conclusion
VADER and its SentimentIntensityAnalyzer provide a robust, simple-to-use tool for sentiment analysis. 
It leverages a well-curated lexicon and sophisticated rules to accurately gauge the sentiment of text data, 
making it particularly useful for social media and review analysis.
"""