# Title: Interactive Dashboard For Streaming Data

## Plots Contained:
- **Line Chart**
- **Pie Chart**
- **Word Cloud**
- **Sentiment Summary Table**

### Line Chart
- **Objective**: Display how the number of records in each sentiment group changes over time.
- **X-axis**: The time frame, either **year** or **month**, depending on the user's selection.
- **Y-axis**: Total data count for each sentiment group (Negative, Neutral, Positive) by year or month.
- **Purpose**: To visualize the distribution and trends of sentiment groups over time.
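
The aggregation behind this chart can be sketched with pandas. The snippet below is a minimal illustration on made-up rows, assuming columns named `Year` and `sentiment_group` as in the dashboard code; the helper name `count_by_period` is hypothetical and is not part of the dashboard.

```python
import pandas as pd

GROUPS = ['negative', 'neutral', 'positive']

def count_by_period(df, period_col):
    """Return one row per period with one count column per sentiment group."""
    wide = (df.groupby([period_col, 'sentiment_group'])
              .size()
              .unstack(fill_value=0)
              # Make sure every group has a column, even if it never occurs
              .reindex(columns=GROUPS, fill_value=0))
    return wide.reset_index()

# Made-up sample rows, for illustration only
sample = pd.DataFrame({
    'Year': [2020, 2020, 2021, 2021, 2021],
    'sentiment_group': ['negative', 'positive', 'neutral', 'neutral', 'positive'],
})

counts = count_by_period(sample, 'Year')
print(counts)
```

Passing `'Month'` instead of `'Year'` as `period_col` would give the monthly version of the same table, provided the frame has a `Month` column.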

### Pie Chart
- **Objective**: Show the proportion of each sentiment group for the selected time period.
- **Selection**: The data can be filtered by either **year** or **month**.
- **Purpose**: To quickly compare the relative proportions of negative, neutral, and positive sentiments for a given time frame.
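
As a rough illustration of what the pie chart encodes, the proportions can be computed directly from the group labels. This sketch uses only the standard library and made-up labels; `sentiment_shares` is a hypothetical helper name.

```python
from collections import Counter

GROUPS = ('negative', 'neutral', 'positive')

def sentiment_shares(labels):
    """Return the fraction of labels falling in each sentiment group."""
    counts = Counter(labels)
    total = sum(counts[g] for g in GROUPS)
    # Guard against an empty selection to avoid dividing by zero
    return {g: (counts[g] / total if total else 0.0) for g in GROUPS}

# Made-up labels, for illustration only
shares = sentiment_shares(['negative', 'neutral', 'neutral', 'positive'])
print(shares)
```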

### Word Cloud
- **Objective**: Display the most frequently used words from the selected time period.
- **Selection**: The time frame (either **year** or **month**) can be selected, and the word cloud will reflect the most common terms for that time period.
- **Purpose**: To highlight the key themes or topics associated with the sentiment data.
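
A word cloud is essentially a picture of word frequencies after stop words are removed. The sketch below shows that counting step with the standard library only; it is a simplified stand-in, not the tokenizer the `wordcloud` package uses, and the stop-word set and the `top_words` helper are assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set; the wordcloud package ships a much larger one
STOP = {'the', 'is', 'a', 'and', 'of', 'to', 'in'}

def top_words(texts, n=50):
    """Return the n most frequent non-stop words across all texts."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(w for w in tokens if w not in STOP)
    return Counter(words).most_common(n)

# Made-up posts, for illustration only
posts = ["The climate is changing", "Climate policy and the climate debate"]
top = top_words(posts, n=2)
print(top)
```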

### Sentiment Summary Table
- **Objective**: Show the total count of each sentiment group for the selected time period.
- **Selection**: The table displays sentiment counts for either the entire year or by specific month, depending on the user's choice.
- **Purpose**: To provide a concise summary of sentiment distribution in a tabular format.
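
The table rows can be thought of as a list of records, one per sentiment group plus a total. The following sketch builds such a list from a plain dictionary of counts; the `sentiment` and `count` keys mirror the column ids used in the dashboard code, while `summary_rows` is a hypothetical helper.

```python
GROUPS = ('negative', 'neutral', 'positive')

def summary_rows(counts):
    """Build table records from a {group: count} mapping, adding a total row."""
    rows = [{'sentiment': g.capitalize(), 'count': counts.get(g, 0)} for g in GROUPS]
    rows.append({'sentiment': 'Total', 'count': sum(r['count'] for r in rows)})
    return rows

# Made-up counts, for illustration only; 'neutral' is deliberately missing
rows = summary_rows({'negative': 3, 'positive': 5})
print(rows)
```

Groups that are missing from the input are reported as zero, so the table always has the same four rows.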

## Filters:
- **Year/Month Selector**: A drop-down menu allows the user to select the desired time period.
  - **By Year**: Displays data aggregated by year.
  - **By Month**: Displays data aggregated by month. After **By Month** is selected, a second drop-down menu appears for choosing the specific year.
- **Time Range Slider**: A slider is provided to select a range of years or months.
  - For **Year**: The user can select a range of years (e.g., 2015–2020). The default is the full range of years in the data (2010 to 2024).
  - For **Month**: The slider allows the user to select a specific time range within a chosen year. Default is 1 to 12.
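
The two filter modes can be sketched as a single function. This is a minimal illustration on made-up rows, assuming `Year` and `Month` columns as in the dashboard code; `filter_period` and its parameter names are hypothetical.

```python
import pandas as pd

def filter_period(df, view, year_range=None, year=None, month_range=(1, 12)):
    """Select rows for a year range ('year' view) or a month range within one year."""
    if view == 'year':
        low, high = year_range
        return df[df['Year'].between(low, high)]  # both ends inclusive
    low, high = month_range
    return df[(df['Year'] == year) & df['Month'].between(low, high)]

# Made-up sample rows, for illustration only
sample = pd.DataFrame({'Year': [2019, 2020, 2020, 2021],
                       'Month': [5, 1, 7, 12]})

by_year = filter_period(sample, 'year', year_range=(2020, 2021))
by_month = filter_period(sample, 'month', year=2020, month_range=(1, 6))
print(len(by_year), len(by_month))
```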


In [3]:
from dash import Dash, dcc, html, Input, Output, dash_table
import dash_bootstrap_components as dbc
import plotly.express as px
import pandas as pd
import base64
from io import BytesIO
from wordcloud import WordCloud, STOPWORDS
import subprocess
from PIL import Image

# Initialize the app
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

# Function to load and clean the latest data
def load_data():

    # Load the data
    df = pd.read_csv('climate_change_sentiment_analysis_streaming.csv')
    #df = pd.read_csv('streaming_sentiment_analysis.csv')
    
    # Ensure 'Date' column exists and is in datetime format
    if 'Date' in df.columns:
        df['Date'] = pd.to_datetime(df['Date'], errors='coerce')  # Coerce invalid formats to NaT
    else:
        raise KeyError("The 'Date' column is missing in the dataset.")
    
    # Drop rows with invalid or missing 'Date' values
    df = df.dropna(subset=['Date'])
    
    # Extract year and month, then normalize 'Date' (strip time information)
    df['Year'] = df['Date'].dt.year
    df['Month'] = df['Date'].dt.month
    df['Date'] = df['Date'].dt.date

    # Bin the sentiment score into groups:
    # (-inf, -0.1] -> negative, (-0.1, 0.1] -> neutral, (0.1, inf) -> positive
    df['sentiment_group'] = pd.cut(df['SelfTextSentimentScore'], bins=[-float('inf'), -0.1, 0.1, float('inf')], 
                                   labels=['negative', 'neutral', 'positive'])
    
    return df


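# Load the data once at startup to set the slider bounds and dropdown options;
# the callback reloads it on every update to pick up newly streamed rows
df = load_data()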

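# Function to generate a word cloud image as a base64-encoded PNG data URI
def generate_wordcloud(text):
    # WordCloud.generate raises ValueError when given no words (e.g. an empty
    # selection), so return an empty image source in that case
    if not text.strip():
        return ""
    wordcloud = WordCloud(width=400, height=400, background_color='white',
                          stopwords=STOPWORDS, max_words=50).generate(text)
    img = BytesIO()
    wordcloud.to_image().save(img, format='PNG')
    return f"data:image/png;base64,{base64.b64encode(img.getvalue()).decode()}"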

# Layout for the app
app.layout = dbc.Container([
    dbc.Row([
        dbc.Col(html.H1("Climate Change Sentiment Analysis Dashboard"), className="mb-2")
    ]),

    # Dropdown for Year or Month View
    dbc.Row([
        dcc.Dropdown(
            id='view-selector',
            options=[
                {'label': 'By Year', 'value': 'year'},
                {'label': 'By Month', 'value': 'month'}
            ],
            value='year',
            clearable=False
        )
    ]),
    
    # Year Slider (Visible for "By Year")
    dbc.Row([
        html.Div(
            dcc.RangeSlider(
                id='year-slider',
                min=df['Year'].min(),
                max=df['Year'].max(),
                step=1,
                value=[df['Year'].min(), df['Year'].max()],
                marks={year: str(year) for year in range(df['Year'].min(), df['Year'].max() + 1)}
            ),
            id='year-slider-container',
            style={'display': 'block'}
        )
    ]),

    # Year Dropdown (Visible for "By Month")
    dbc.Row([
        html.Div(
            dcc.Dropdown(
                id='year-dropdown',
                options=[{'label': str(year), 'value': year} for year in sorted(df['Year'].unique(), reverse=True)],
                value=df['Year'].max(),
                clearable=False
            ),
            id='year-dropdown-container',
            style={'display': 'none'}
        )
    ]),
    
    # Range Slider for Month Selection
    dbc.Row([
        html.Div(
            dcc.RangeSlider(
                id='month-slider',
                min=1,
                max=12,
                step=1,
                value=[1, 12],
                marks={month: str(month) for month in range(1, 13)}
            ),
            id='month-slider-container',
            style={'display': 'none'}
        )
    ]),
    
    dbc.Row([
        dbc.Col(dcc.Graph(id='line-graph'), width=8),
        dbc.Col(dcc.Graph(id='pie-chart'), width=4),
    ]),

    
    dbc.Row([
        # Word Cloud
        dbc.Col([
            dbc.Card(
                dbc.CardBody([
                    html.Img(id='word-cloud', style={'width': '400px', 'height': '400px', 'display': 'block', 'margin': 'auto'})
                ]),
                style={"width": "450px", "margin": "auto", "border": "1px solid #ccc", "padding": "10px"}
            ),
        ], width=8),
        
        # Adding a DataTable for the sentiment counts
        dbc.Col([
            dbc.Card(
                dbc.CardBody([
                    html.H5("Sentiment Summary", className="card-title"),
                    dash_table.DataTable(
                        id='sentiment-table',
                        columns=[
                            {"name": "Sentiment", "id": "sentiment"},
                            {"name": "Count", "id": "count"}
                        ],
                        style_table={'overflowX': 'auto'},
                        style_cell={'textAlign': 'center'},
                        style_header={'fontWeight': 'bold'}
                    )
                ]),
                style={"width": "450px", "margin": "auto", "border": "1px solid #ccc", "padding": "10px"}
            ),
        ], width=4)
    ]),
    
    dbc.Row([
        dbc.Col(dcc.Interval(id='interval-component', interval=60000, n_intervals=0))
    ]) 
], fluid=True)

# Combined Callback for both Graphs, Word Cloud, Table and Styles
@app.callback(
    [
        Output('line-graph', 'figure'),
        Output('pie-chart', 'figure'),
        Output('year-slider-container', 'style'),
        Output('year-dropdown-container', 'style'),
        Output('month-slider-container', 'style'),
        Output('word-cloud', 'src'),
        Output('sentiment-table', 'data')
    ],
    [
        Input('view-selector', 'value'),
        Input('year-slider', 'value'),
        Input('year-dropdown', 'value'),
        Input('month-slider', 'value'),
        Input('interval-component', 'n_intervals')
    ]
)
def update_graphs_and_styles(view_type, year_range, selected_year, month_range, n_intervals):
    # Prepare styles
    year_slider_style = {'display': 'block'} if view_type == 'year' else {'display': 'none'}
    year_dropdown_style = {'display': 'block'} if view_type == 'month' else {'display': 'none'}
    month_slider_style = {'display': 'block'} if view_type == 'month' else {'display': 'none'}

    df = load_data()

    # Filter Data
    if view_type == 'year':
        filtered_df = df[(df['Year'] >= year_range[0]) & (df['Year'] <= year_range[1])]
        
        # Line Graph
        grouped = (filtered_df.groupby(['Year', 'sentiment_group'], observed=False)
                   .size()
                   .unstack(fill_value=0)
                   .reset_index())
        
        line_fig = px.line(grouped, x='Year', y=['negative', 'neutral', 'positive'],
                           labels={'value': 'Count of Sentiments', 'Year': 'Year'},
                           title="Sentiment Groups Over Years",
                           markers=True,
                           color_discrete_map={'negative': 'hotpink', 'neutral': 'steelblue', 'positive': 'mediumseagreen'})

        # Pie Chart
        sentiment_counts = filtered_df['sentiment_group'].value_counts().reset_index()
        sentiment_counts.columns = ['sentiment_group', 'count']
        
        pie_fig = px.pie(
            sentiment_counts, 
            names='sentiment_group', 
            values='count', 
            title="Sentiment Group Distribution for Selected Year Range",
            color='sentiment_group',  # Color by category so color_discrete_map is applied
            color_discrete_map={'negative': 'hotpink', 'neutral': 'steelblue', 'positive': 'mediumseagreen'}
        )

        # Word Cloud
        text = ' '.join(filtered_df['SelfText'].dropna())
        wordcloud_img = generate_wordcloud(text)

        # Sentiment Counts Table
        sentiment_table_data = [
                    {"sentiment": "Negative", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'negative', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'negative', 'count'].size > 0 else 0},
                    {"sentiment": "Neutral", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'neutral', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'neutral', 'count'].size > 0 else 0},
                    {"sentiment": "Positive", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'positive', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'positive', 'count'].size > 0 else 0},
                    {"sentiment": "Total", "count": sentiment_counts['count'].sum()},
                ]

        return line_fig, pie_fig, year_slider_style, year_dropdown_style, month_slider_style, wordcloud_img, sentiment_table_data
  
    elif view_type == 'month':
        filtered_df = df[(df['Year'] == selected_year) & 
                         (df['Month'] >= month_range[0]) & 
                         (df['Month'] <= month_range[1])]
        
        # Line Graph
        grouped = (filtered_df.groupby(['Month', 'sentiment_group'], observed=False)
                   .size()
                   .unstack(fill_value=0)
                   .reset_index())
        
        line_fig = px.line(grouped, x='Month', y=['negative', 'neutral', 'positive'],
                           labels={'value': 'Count of Sentiments', 'Month': 'Month'},
                           title=f"Sentiment Groups for Year {selected_year}",
                           markers=True,
                           color_discrete_map={'negative': 'hotpink', 'neutral': 'steelblue', 'positive': 'mediumseagreen'})

        # Pie Chart
        sentiment_counts = filtered_df['sentiment_group'].value_counts().reset_index()
        sentiment_counts.columns = ['sentiment_group', 'count']
        
        pie_fig = px.pie(
            sentiment_counts, 
            names='sentiment_group', 
            values='count', 
            title="Sentiment Group Distribution for Selected Month Range",
            color='sentiment_group',  # Color by category so color_discrete_map is applied
            color_discrete_map={'negative': 'hotpink', 'neutral': 'steelblue', 'positive': 'mediumseagreen'}
        )

        # Word Cloud
        text = ' '.join(filtered_df['SelfText'].dropna())
        wordcloud_img = generate_wordcloud(text)

        # Sentiment Counts Table
        sentiment_table_data = [
                    {"sentiment": "Negative", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'negative', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'negative', 'count'].size > 0 else 0},
                    {"sentiment": "Neutral", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'neutral', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'neutral', 'count'].size > 0 else 0},
                    {"sentiment": "Positive", "count": sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'positive', 'count'].values[0] 
                     if sentiment_counts.loc[sentiment_counts['sentiment_group'] == 'positive', 'count'].size > 0 else 0},
                    {"sentiment": "Total", "count": sentiment_counts['count'].sum()},
                ]

        return line_fig, pie_fig, year_slider_style, year_dropdown_style, month_slider_style, wordcloud_img, sentiment_table_data

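if __name__ == '__main__':
    # Note: in a plain script, run_server blocks until the server stops, so the
    # Chrome launch below is reached only where the call returns (e.g. some
    # notebook setups). The Chrome path is macOS-specific.
    app.run_server(debug=True, port=8000)
    subprocess.Popen(['/Applications/Google Chrome.app/Contents/MacOS/Google Chrome', 'http://127.0.0.1:8000/'])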
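
One possible way to open the dashboard without a hard-coded browser path is to schedule the standard-library `webbrowser.open` call shortly before the server starts. The sketch below is an assumption about how this could be arranged, not what the notebook does; `open_browser_later` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without launching a browser.

```python
import threading
import webbrowser

def open_browser_later(url, delay=1.0, opener=webbrowser.open):
    """Schedule opener(url) to run on a background thread after `delay` seconds."""
    timer = threading.Timer(delay, opener, args=[url])
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer

# Intended usage (commented out so the snippet has no side effects):
# open_browser_later('http://127.0.0.1:8000/')
# app.run_server(debug=True, port=8000)
```

With `debug=True`, the development reloader may re-run the script, so the page may be opened more than once.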


