## Step 1: Load the Dataset

Let's start by importing the necessary libraries and loading the dataset.

In [77]:
!pip install --upgrade nltk





In [78]:

import pandas as pd 

df = pd.read_csv("sentiment_dataset_large.csv")


In [79]:
df

Unnamed: 0,ID,Text,Sentiment
0,1,"A total waste of money, do not buy this.",Negative
1,2,"I absolutely love this product, it works great!",Positive
2,3,"This was a terrible experience, I regret it co...",Negative
3,4,"I absolutely love this product, it works great!",Positive
4,5,"The product is average, works as expected.",Neutral
...,...,...,...
995,996,"It's okay, nothing special but not bad either.",Neutral
996,997,"It's okay, nothing special but not bad either.",Neutral
997,998,"Poor quality and bad service, not worth the mo...",Negative
998,999,"A total waste of money, do not buy this.",Negative


In [80]:
df.head()

Unnamed: 0,ID,Text,Sentiment
0,1,"A total waste of money, do not buy this.",Negative
1,2,"I absolutely love this product, it works great!",Positive
2,3,"This was a terrible experience, I regret it co...",Negative
3,4,"I absolutely love this product, it works great!",Positive
4,5,"The product is average, works as expected.",Neutral


## Step 2: Text Preprocessing for NLP  
Before performing sentiment analysis, we clean the text using NLP techniques:

- Tokenization: Splitting text into individual words  
- Lemmatization: Converting words to their base form  
- Removing Stopwords: Eliminating common words that contribute little meaning  
- Vectorization: Transforming text into a numerical format (listed for completeness; VADER scores text directly, so this notebook does not vectorize)
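
The notebook never carries out the vectorization step, so here is a minimal bag-of-words sketch of what it could look like. It uses only the standard library; `build_vocabulary` and `vectorize` are hypothetical helper names introduced for illustration, not part of NLTK or of the original notebook.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.split()})
    return {tok: idx for idx, tok in enumerate(tokens)}

def vectorize(document, vocabulary):
    """Return a count vector with one entry per vocabulary token."""
    counts = Counter(document.split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # ignore tokens unseen when the vocabulary was built
            vector[vocabulary[tok]] = n
    return vector

# Strings shaped like the Processed_Text values produced in this step.
docs = ["total waste money buy", "absolutely love product work great"]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
```

Each review becomes a list of token counts aligned to a shared vocabulary. A library vectorizer would normally replace this in practice; the sketch only shows the idea.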

In [81]:
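import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
import string

# Download the resources used below (skipped if already up to date).
# Recent NLTK releases need "punkt_tab" for word_tokenize; older ones use "punkt".
for resource in ("punkt", "punkt_tab", "stopwords", "wordnet"):
    nltk.download(resource, quiet=True)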




In [82]:
stop_words = set(stopwords.words("english"))

lemmatizer = WordNetLemmatizer()

lemmatizer


<WordNetLemmatizer>

In [83]:
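def preprocess_text(text):
    """Lowercase, tokenize, drop stopwords and punctuation, then lemmatize."""
    tokens = word_tokenize(text.lower())
    # lemmatize() treats each token as a noun unless a POS tag is passed.
    tokens = [
        lemmatizer.lemmatize(word)
        for word in tokens
        if word not in stop_words and word not in string.punctuation
    ]
    return " ".join(tokens)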

In [84]:
df['Processed_Text'] = df['Text'].apply(preprocess_text)
df[['Text', 'Processed_Text']].head()



Unnamed: 0,Text,Processed_Text
0,"A total waste of money, do not buy this.",total waste money buy
1,"I absolutely love this product, it works great!",absolutely love product work great
2,"This was a terrible experience, I regret it co...",terrible experience regret completely
3,"I absolutely love this product, it works great!",absolutely love product work great
4,"The product is average, works as expected.",product average work expected


## Step 3: Perform Sentiment Analysis using NLP
We will now analyze the sentiment using VADER (Valence Aware Dictionary and sEntiment Reasoner).

In [85]:
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')
sia = SentimentIntensityAnalyzer()


[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\naman\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


In [86]:
def get_sentiment(text):
    score = sia.polarity_scores(text)
    if score['compound'] >= 0.05:
        return "Positive"
    elif score['compound'] <= -0.05:
        return "Negative"
    else:
        return "Neutral"

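# Apply sentiment analysis.
# Note: VADER is built for raw text (it uses negations, punctuation, and
# capitalization), and the stopword step removes words such as "not", so
# scoring df['Text'] directly may be worth comparing.
df['Predicted_Sentiment'] = df['Processed_Text'].apply(get_sentiment)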

# Compare actual vs predicted sentiment
df[['Text', 'Sentiment', 'Predicted_Sentiment']].head()


Unnamed: 0,Text,Sentiment,Predicted_Sentiment
0,"A total waste of money, do not buy this.",Negative,Negative
1,"I absolutely love this product, it works great!",Positive,Positive
2,"This was a terrible experience, I regret it co...",Negative,Negative
3,"I absolutely love this product, it works great!",Positive,Positive
4,"The product is average, works as expected.",Neutral,Neutral
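
The five rows above all match, but `head()` is not a measure of accuracy. A minimal sketch of a fuller comparison follows, using only the standard library. `accuracy` and `confusion_counts` are hypothetical helpers, and the label lists are made up for illustration; in the notebook they would come from `df['Sentiment'].tolist()` and `df['Predicted_Sentiment'].tolist()`.

```python
from collections import Counter

def accuracy(actual, predicted):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("label lists must not be empty")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)

def confusion_counts(actual, predicted):
    """Count each (actual, predicted) label pair."""
    return Counter(zip(actual, predicted))

# Hypothetical labels for illustration only; not results from the dataset.
actual = ["Negative", "Positive", "Neutral", "Negative"]
predicted = ["Negative", "Positive", "Positive", "Neutral"]

score = accuracy(actual, predicted)
pairs = confusion_counts(actual, predicted)
```

The pair counts show which classes are confused with which, something a single accuracy figure hides.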



## Step 4: Build an Interactive Dashboard with Dash
Now, we will create a dashboard using Dash to visualize the sentiment analysis results.

In [95]:
import dash
from dash import dcc, html, dash_table  # dash_table ships with the dash package in Dash 2+
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd


In [96]:
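# Small hard-coded sample so this cell runs on its own. Note that it overwrites
# the df built in Steps 1-3; remove this assignment to use the full dataset instead.
df = pd.DataFrame({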
    'Text': ['Great product!', 'Terrible experience.', 'It’s okay.', 'Loved it!', 'Not good'],
    'Predicted_Sentiment': ['Positive', 'Negative', 'Neutral', 'Positive', 'Negative']
})

app = dash.Dash(__name__)

app.layout = html.Div([
    html.H1("Sentiment Analysis Dashboard", style={
        'textAlign': 'center',
        'color': '#1e3799',
        'fontSize': '32px',
        'textShadow': '2px 2px 4px #aaa',
        'marginBottom': '20px'
    }),

    dcc.Dropdown(
        id='sentiment-dropdown',
        options=[
            {'label': 'All', 'value': 'All'},
            {'label': 'Positive', 'value': 'Positive'},
            {'label': 'Negative', 'value': 'Negative'},
            {'label': 'Neutral', 'value': 'Neutral'}
        ],
        value='All',
        multi=False,
        style={'width': '50%', 'margin': '0 auto 20px'}
    ),

    dcc.Graph(id='sentiment-graph'),

    html.H3("Sample Sentiment Reviews", style={'color': '#1e3799', 'marginTop': '20px'}),
    
    dash_table.DataTable(
        id='review-table',
        columns=[
            {'name': 'Text', 'id': 'Text'},
            {'name': 'Predicted Sentiment', 'id': 'Predicted_Sentiment'}
        ],
        style_table={'width': '80%', 'margin': '0 auto'},
        style_header={'backgroundColor': '#1e3799', 'color': 'white', 'fontWeight': 'bold'},
        style_cell={'textAlign': 'center', 'padding': '10px', 'fontSize': '16px'}
    )
])

@app.callback(
    [Output('sentiment-graph', 'figure'), Output('review-table', 'data')],
    [Input('sentiment-dropdown', 'value')]
)
def update_dashboard(selected_sentiment):
    filtered_df = df if selected_sentiment == "All" else df[df["Predicted_Sentiment"] == selected_sentiment]

    # If the filtered DataFrame is empty, return an empty graph
    if filtered_df.empty:
        fig = px.histogram(title="No data available for the selected sentiment")
    else:
        fig = px.histogram(filtered_df, x='Predicted_Sentiment', title="Sentiment Distribution",
                           color='Predicted_Sentiment', category_orders={"Predicted_Sentiment": ["Positive", "Negative", "Neutral"]})
        fig.update_layout(template='plotly_white', title_x=0.5)

    # Convert filtered data to a format suitable for DataTable
    table_data = filtered_df.to_dict('records') if not filtered_df.empty else [{'Text': 'No data available', 'Predicted_Sentiment': ''}]

    return fig, table_data

if __name__ == '__main__':
    app.run(debug=True)
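
The filtering rule inside `update_dashboard` can be checked without starting the Dash server. The sketch below restates it as a plain function over a list of row dictionaries, the same shape that `to_dict('records')` produces. `filter_records` is a hypothetical helper and not part of Dash or of the original notebook.

```python
def filter_records(records, selected_sentiment):
    """Return the rows to display for a dropdown value ("All" keeps every row)."""
    if selected_sentiment == "All":
        rows = list(records)
    else:
        rows = [r for r in records if r["Predicted_Sentiment"] == selected_sentiment]
    # Same placeholder row the callback uses when nothing matches.
    return rows or [{"Text": "No data available", "Predicted_Sentiment": ""}]

sample = [
    {"Text": "Great product!", "Predicted_Sentiment": "Positive"},
    {"Text": "Terrible experience.", "Predicted_Sentiment": "Negative"},
    {"Text": "Not good", "Predicted_Sentiment": "Negative"},
]
negative_rows = filter_records(sample, "Negative")
neutral_rows = filter_records(sample, "Neutral")
```

Keeping this logic separate from the callback makes it easy to test each dropdown value, including the case where no rows match.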

## 📊 Sentiment Analysis Dashboard - Conclusion

### Overview
The Sentiment Analysis Dashboard provides an interactive way to visualize and explore text sentiment data. Users can filter by sentiment (Positive, Negative, Neutral) and see the result in both a graph and a table.

### Key Features & Insights
- Sentiment Distribution Graph
  - Displays the frequency of each sentiment category (Positive, Negative, Neutral).
  - Helps in understanding the overall sentiment trend in the dataset.
  - Uses color coding to distinguish the categories.
- Sample Sentiment Reviews Table
  - Shows reviews alongside their predicted sentiment.
  - Lets the reader spot-check each prediction against the review text.
  - Updates dynamically based on the selected sentiment filter.
- Dropdown-Based Filtering
  - Users can select a specific sentiment category or view all data at once.
  - Provides flexibility in data exploration and trend identification.

### Final Conclusion
- The dashboard visualizes the sentiment labels predicted by VADER, making trends easier to interpret.
- It shows how user reviews are distributed across the sentiment classes.
- It can be extended with features such as sentiment confidence scores, word clouds, and AI-driven insights.