# Dash Dashboard - Global Emissions Analysis

This notebook contains a comprehensive Dash application for interactive emissions data analysis.

## 8 Core Visualizations

1. **Choropleth Map** - Global emissions by country (year-selectable)
2. **Top 10 Bar Chart** - Countries with highest emissions (year + N configurable)
3. **Heatmap** - Temporal evolution of top countries across years
4. **Single Country Time Series** - Historical emissions trend for selected country
5. **Multi-Country Comparison** - Overlay 2-5 countries for comparison
6. **ARIMA Forecast** - Historical + predicted global emissions with 95% CI (horizon variable)
7. **Linear Trend & Extrapolation** - Regression line + projection to 2050
8. **Sector Pie Chart** - Emissions breakdown by sector (year-selectable)

## How to use

1. **Execute cells sequentially** from top to bottom. Each cell loads dependencies, defines functions, or sets up the app.
2. **Start the server** by running the last cell:
   ```python
   app.run(debug=True, host='127.0.0.1', port=8050)
   ```
3. **Open browser** to `http://127.0.0.1:8050`

## Dashboard Controls (Sidebar)

- Year slider (1990-2021)
- Top N selector (5/10/15/20)
- Single country dropdown (detailed view)
- Multi-country multi-select (comparison)
- Forecast horizon slider (1-30 years)
- Extrapolation year slider (2021-2050)


## Cell 2: Import Required Libraries

This cell imports all necessary libraries for data processing, visualization, and the Dash web framework.

**Dependencies**: Dash, Plotly, Pandas, NumPy, Scikit-learn, Statsmodels


In [22]:
from dash import Dash, dcc, html, Input, Output
import dash_bootstrap_components as dbc
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
from sklearn.linear_model import LinearRegression
from statsmodels.tsa.arima.model import ARIMA

print("‚úì Libraries imported successfully")

‚úì Libraries imported successfully


## Cell 3: Load and Prepare Data

Loads emissions data from CSV, coerces year columns to numeric format, and extracts metadata.

**Functions called**: None (utility setup)
**Output**: Global variables `df`, `year_cols`, `years`, `countries`


In [23]:
# Load data
DATA_PATH = 'historical_emissions.csv'
df = pd.read_csv(DATA_PATH)

# Coerce year columns to numeric
year_cols = [str(y) for y in range(1990, 2022) if str(y) in df.columns]
for col in year_cols:
    df[col] = pd.to_numeric(df[col], errors='coerce')

# Extract metadata
years = sorted([int(y) for y in year_cols])
countries = sorted(df['Country'].dropna().unique().tolist()) if 'Country' in df.columns else []

print(f"‚úì Data loaded: {len(df)} rows, years {min(years)}-{max(years)}, {len(countries)} countries")
print(f"  Columns: {', '.join(df.columns.tolist()[:5])}...")

‚úì Data loaded: 195 rows, years 1990-2021, 195 countries
  Columns: ISO, Country, Data source, Sector, Gas...


## Cell 4: Sector Pie Chart Function

**What it does**: Creates a pie chart showing emissions breakdown by sector for a selected year.

**What it displays**: Percentage distribution across all sectors (energy, industry, agriculture, etc.)

**Function**: `create_sector_pie(df, year)`


In [24]:
def create_sector_pie(df, year):
    """
    Generate pie chart showing emissions by sector for a given year.
    
    Args:
        df: DataFrame with 'Sector' column and year columns
        year: Year to analyze (int or str)
    
    Returns:
        Plotly figure (pie chart)
    """
    if 'Sector' not in df.columns:
        return go.Figure().add_annotation(text='Sector data not available')
    
    sector_data = df.groupby('Sector')[str(year)].sum().reset_index()
    sector_data.columns = ['Sector', 'Emissions']
    
    fig = px.pie(
        sector_data, 
        values='Emissions', 
        names='Sector',
        title=f'Emissions by Sector - {year}',
        hole=0  # Set to 0.4 for donut chart if preferred
    )
    fig.update_layout(height=400)
    return fig

print("‚úì Sector pie function defined")

‚úì Sector pie function defined


## Cell 5: Choropleth Map Function

**What it does**: Creates a world map visualizing global emissions by country.

**What it displays**: Color-coded countries based on total emissions for selected year (YlOrRd scale)

**Function**: `create_choropleth(df, year)`


In [25]:
def create_choropleth(df, year):
    """
    Generate choropleth map showing emissions by country.
    
    Args:
        df: DataFrame with 'ISO', 'Country' columns and year columns
        year: Year to analyze (int or str)
    
    Returns:
        Plotly figure (choropleth map)
    """
    # Aggregate emissions by country
    data = df.groupby(['ISO', 'Country'])[str(year)].sum().reset_index()
    
    fig = px.choropleth(
        data, 
        locations='ISO', 
        color=str(year), 
        hover_name='Country',
        color_continuous_scale='YlOrRd',
        projection='natural earth',
        title=f'Global Emissions by Country - {year}',
        locationmode='ISO-3'
    )
    fig.update_layout(height=500)
    return fig

print("‚úì Choropleth function defined")

‚úì Choropleth function defined


## Cell 6: Top N Countries Bar Chart Function

**What it does**: Creates a horizontal bar chart ranking countries by emission levels.

**What it displays**: Top N countries sorted by descending emissions for selected year

**Function**: `create_top10_bar(df, year, n)`


In [26]:
def create_top10_bar(df, year, n=10):
    """
    Generate horizontal bar chart of top N emitting countries.
    
    Args:
        df: DataFrame with 'Country' column and year columns
        year: Year to analyze (int or str)
        n: Number of top countries to show (default 10)
    
    Returns:
        Plotly figure (horizontal bar chart)
    """
    top_data = df.groupby('Country')[str(year)].sum().nlargest(n).reset_index()
    top_data.columns = ['Country', 'Emissions']
    
    fig = px.bar(
        top_data, 
        x='Emissions', 
        y='Country', 
        orientation='h',
        title=f'Top {n} Emitting Countries - {year}',
        color='Emissions',
        color_continuous_scale='Reds'
    )
    fig.update_layout(height=400, showlegend=False, yaxis={'categoryorder': 'total ascending'})
    return fig

print("‚úì Top-N bar chart function defined")

‚úì Top-N bar chart function defined


## Cell 7: Heatmap Function

**What it does**: Creates a heatmap showing temporal evolution of emissions for top countries.

**What it displays**: Matrix of top N countries (rows) √ó years (columns) with color intensity = emission level

**Function**: `create_heatmap(df, top_n)`


In [27]:
def create_heatmap(df, top_n=15):
    """
    Generate heatmap of top N countries' emissions across all years.
    
    Args:
        df: DataFrame with 'Country' column and year columns
        top_n: Number of top countries to include (default 15)
    
    Returns:
        Plotly figure (heatmap)
    """
    # Find top N countries by total emissions
    top_countries = df.groupby('Country')[year_cols].sum().sum(axis=1).nlargest(top_n).index
    
    # Create matrix
    heatmap_data = df[df['Country'].isin(top_countries)].groupby('Country')[year_cols].sum()
    heatmap_data.columns = [int(c) for c in heatmap_data.columns]
    
    fig = px.imshow(
        heatmap_data,
        labels=dict(x='Year', y='Country', color='Emissions'),
        color_continuous_scale='Blues',
        title=f'Emissions Evolution - Top {top_n} Countries',
        aspect='auto'
    )
    fig.update_layout(height=500)
    return fig

print("‚úì Heatmap function defined")

‚úì Heatmap function defined


## Cell 8: Single Country Time Series Function

**What it does**: Creates a line chart showing historical emissions trend for a single selected country.

**What it displays**: Year-by-year emissions from 1990-2021 with area fill below the line

**Function**: `create_country_timeseries(df, country)`


In [35]:
def create_country_timeseries(df, country):
    """
    Generate time series line chart for a single country.
    
    Args:
        df: DataFrame with 'Country' column and year columns
        country: Country name to analyze
    
    Returns:
        Plotly figure (line chart with fill)
    """
    country_data = df[df['Country'] == country][year_cols].sum()
    country_data.index = [int(y) for y in country_data.index]
    country_data = country_data.sort_index()
    
    fig = go.Figure()
    fig.add_trace(go.Scatter(
        x=country_data.index, 
        y=country_data.values,
        mode='lines+markers', 
        name=country, 
        fill='tozeroy',
        line=dict(color='steelblue'),
        marker=dict(size=4)
    ))
    fig.update_layout(
        title=f'Emissions Trend - {country}',
        xaxis_title='Year', 
        yaxis_title='Emissions',
        hovermode='x unified', 
        height=400
    )
    return fig

print("‚úì Single country time series function defined")

‚úì Single country time series function defined


## Cell 9: Multi-Country Comparison Function

**What it does**: Creates an overlay line chart comparing emissions trends across 2-5 selected countries.

**What it displays**: Multiple superimposed lines, one per country, for direct comparison

**Function**: `create_multi_country_comparison(df, countries_list)`


In [36]:
def create_multi_country_comparison(df, countries_list):
    """
    Generate line chart comparing multiple countries' emissions.
    
    Args:
        df: DataFrame with 'Country' column and year columns
        countries_list: List of country names to compare
    
    Returns:
        Plotly figure (multi-line chart)
    """
    if not countries_list:
        return go.Figure().add_annotation(text='Select at least one country')
    
    fig = go.Figure()
    for country in countries_list:
        country_data = df[df['Country'] == country][year_cols].sum()
        country_data.index = [int(y) for y in country_data.index]
        country_data = country_data.sort_index()
        
        fig.add_trace(go.Scatter(
            x=country_data.index, 
            y=country_data.values,
            mode='lines', 
            name=country,
            hovertemplate='%{x}<br>%{y:.0f}<extra></extra>'
        ))
    
    fig.update_layout(
        title='Country Comparison',
        xaxis_title='Year', 
        yaxis_title='Emissions',
        hovermode='x unified', 
        height=400,
        legend=dict(x=0.01, y=0.99)
    )
    return fig

print("‚úì Multi-country comparison function defined")

‚úì Multi-country comparison function defined


## Cell 10: ARIMA Forecast Function

**What it does**: Fits ARIMA model to global emissions data and generates predictions with 95% confidence interval.

**What it displays**: Historical data + dashed forecast line + shaded confidence band (95% CI)

**Function**: `create_arima_forecast(df, forecast_years)`


In [37]:
def create_arima_forecast(df, forecast_years=10):
    """
    Generate ARIMA forecast of global emissions.
    
    Args:
        df: DataFrame with year columns
        forecast_years: Number of years to forecast (default 10)
    
    Returns:
        Plotly figure (line + forecast + confidence interval)
    """
    # Aggregate global emissions
    global_ts = df[year_cols].sum(axis=0)
    global_ts.index = [int(y) for y in global_ts.index]
    global_ts = global_ts.sort_index()
    
    # Handle missing values
    global_ts = global_ts.fillna(method='ffill').fillna(method='bfill')
    
    fig = go.Figure()
    
    # Add historical data
    fig.add_trace(go.Scatter(
        x=global_ts.index, 
        y=global_ts.values,
        mode='lines', 
        name='Historical',
        line=dict(color='blue')
    ))
    
    # Fit ARIMA model
    try:
        model = ARIMA(global_ts, order=(1, 1, 1))
        results = model.fit()
        forecast = results.get_forecast(steps=forecast_years)
        forecast_mean = forecast.predicted_mean
        forecast_ci = forecast.conf_int()
        
        future_years = np.arange(global_ts.index.max() + 1, global_ts.index.max() + 1 + forecast_years)
        
        # Add forecast line
        fig.add_trace(go.Scatter(
            x=future_years, 
            y=forecast_mean.values,
            mode='lines', 
            name='ARIMA Forecast',
            line=dict(color='red', dash='dash')
        ))
        
        # Add confidence interval
        fig.add_trace(go.Scatter(
            x=list(future_years) + list(future_years[::-1]),
            y=list(forecast_ci.iloc[:, 1]) + list(forecast_ci.iloc[:, 0][::-1]),
            fill='toself',
            fillcolor='rgba(255,0,0,0.2)',
            line=dict(color='rgba(255,255,255,0)'),
            name='95% CI'
        ))
    except Exception as e:
        print(f"ARIMA failed ({e}), skipping forecast")
    
    fig.update_layout(
        title=f'Global Emissions - ARIMA Forecast ({forecast_years} years)',
        xaxis_title='Year', 
        yaxis_title='Emissions',
        hovermode='x unified', 
        height=400
    )
    return fig

print("‚úì ARIMA forecast function defined")

‚úì ARIMA forecast function defined


## Cell 11: Linear Trend & Extrapolation Function

**What it does**: Fits linear regression to historical emissions and extrapolates to a target year (default 2050).

**What it displays**: Scatter points (historical) + solid regression line + dashed extrapolation line

**Function**: `create_linear_trend(df, extrapolate_to)`


In [31]:
def create_linear_trend(df, extrapolate_to=2030):
    """
    Generate linear regression trend with extrapolation.
    
    Args:
        df: DataFrame with year columns
        extrapolate_to: Target year for extrapolation (default 2030)
    
    Returns:
        Plotly figure (scatter + trend line + extrapolation)
    """
    # Aggregate global emissions
    global_ts = df[year_cols].sum(axis=0)
    global_ts.index = [int(y) for y in global_ts.index]
    global_ts = global_ts.sort_index()
    
    # Fit linear regression
    X = global_ts.index.values.reshape(-1, 1)
    y = global_ts.values
    lr = LinearRegression()
    lr.fit(X, y)
    
    # Predict for full range
    years_pred = np.arange(1990, extrapolate_to + 1)
    y_pred = lr.predict(years_pred.reshape(-1, 1))
    
    fig = go.Figure()
    
    # Add historical scatter
    fig.add_trace(go.Scatter(
        x=global_ts.index, 
        y=global_ts.values,
        mode='markers', 
        name='Historical',
        marker=dict(size=6, color='blue')
    ))
    
    # Add regression line
    fig.add_trace(go.Scatter(
        x=years_pred, 
        y=y_pred,
        mode='lines',
        name='Linear Trend',
        line=dict(color='green')
    ))
    
    # Highlight extrapolation (future part)
    future_mask = years_pred > global_ts.index.max()
    fig.add_trace(go.Scatter(
        x=years_pred[future_mask], 
        y=y_pred[future_mask],
        mode='lines',
        name='Extrapolation',
        line=dict(color='green', dash='dash')
    ))
    
    fig.update_layout(
        title=f'Global Emissions - Linear Trend (extrapolated to {extrapolate_to})',
        xaxis_title='Year', 
        yaxis_title='Emissions',
        hovermode='x unified', 
        height=400
    )
    return fig

print("‚úì Linear trend function defined")

‚úì Linear trend function defined


## Cell 12: Dash App Setup

Initializes the Dash application with responsive Bootstrap layout. Defines:
- Sticky sidebar with 6 filter controls
- Main content area with 5 rows of visualizations (8 total)
- 7 callbacks linking controls to graphs

**Functions called**: All 8 visualization functions above


In [38]:
# Initialize Dash app
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
server = app.server

# Define sidebar controls
controls = dbc.Card([
    html.H5('üìä Filters', className='mb-3 fw-bold text-primary'),
    html.Hr(),
    
    html.Label('üìÖ Year', className='fw-bold mt-3'),
    dcc.Slider(
        id='year-slider', 
        min=min(years), 
        max=max(years), 
        value=max(years),
        marks={str(y): str(y) if y % 5 == 0 else '' for y in years},
        step=1, 
        tooltip={'placement': 'bottom', 'always_visible': True}
    ),
    
    html.Label('üéØ Top N Countries', className='fw-bold mt-4'),
    dcc.RadioItems(
        id='top-n-radio', 
        options=[
            {'label': ' 5', 'value': 5}, 
            {'label': ' 10', 'value': 10},
            {'label': ' 15', 'value': 15}, 
            {'label': ' 20', 'value': 20}
        ], 
        value=10, 
        inline=True, 
        className='mt-2'
    ),
    
    html.Hr(),
    html.Label('üåç Single Country', className='fw-bold mt-3'),
    dcc.Dropdown(
        id='country-single', 
        options=[{'label': c, 'value': c} for c in countries],
        value=countries[0] if countries else 'China', 
        clearable=False
    ),
    
    html.Label('üåç Compare Countries (Multi-select)', className='fw-bold mt-4'),
    dcc.Dropdown(
        id='countries-multi', 
        options=[{'label': c, 'value': c} for c in countries],
        value=['China', 'United States', 'India'] if len(countries) >= 3 else countries[:3], 
        multi=True
    ),
    
    html.Hr(),
    html.Label('üìà ARIMA Forecast Horizon (years)', className='fw-bold mt-3'),
    dcc.Slider(
        id='forecast-slider', 
        min=1, 
        max=30, 
        value=10,
        marks={1:'1y', 10:'10y', 20:'20y', 30:'30y'}, 
        step=1
    ),
    
    html.Label('üîÆ Extrapolation Target Year', className='fw-bold mt-4'),
    dcc.Slider(
        id='extrapolate-slider', 
        min=2021, 
        max=2050, 
        value=2030,
        marks={2021:'2021', 2030:'2030', 2040:'2040', 2050:'2050'}, 
        step=1
    ),
], 
body=True, 
className='sticky-top bg-light',
style={'top': '10px', 'max-height': '90vh', 'overflow-y': 'auto', 'box-shadow': '0 2px 8px rgba(0,0,0,0.1)'}
)

# Define app layout
app.layout = dbc.Container([
    dbc.Row(dbc.Col(
        html.H1('üåç Global Emissions Analysis Dashboard', className='text-center mb-4 mt-3 text-primary'),
        width=12
    )),
    
    dbc.Row([
        # Sidebar (2 cols on lg, full width on smaller screens)
        dbc.Col(controls, width=12, lg=2, className='mb-3 order-last order-lg-first'),
        
        # Main content (10 cols on lg)
        dbc.Col([
            # Row 1: Choropleth Map + Top 10 Bar
            dbc.Row([
                dbc.Col(dcc.Graph(id='choropleth-map'), width=12, lg=6),
                dbc.Col(dcc.Graph(id='top10-bar'), width=12, lg=6)
            ], className='mb-4'),
            
            # Row 2: Heatmap (full width)
            dbc.Row(
                dbc.Col(dcc.Graph(id='heatmap'), width=12),
                className='mb-4'
            ),
            
            # Row 3: Single Country Time Series + Multi-Country Comparison
            dbc.Row([
                dbc.Col(dcc.Graph(id='timeseries-single'), width=12, lg=6),
                dbc.Col(dcc.Graph(id='timeseries-multi'), width=12, lg=6)
            ], className='mb-4'),
            
            # Row 4: ARIMA Forecast + Linear Trend
            dbc.Row([
                dbc.Col(dcc.Graph(id='arima-forecast'), width=12, lg=6),
                dbc.Col(dcc.Graph(id='linear-trend'), width=12, lg=6)
            ], className='mb-4'),
            
            # Row 5: Sector Pie (centered)
            dbc.Row(
                dbc.Col(dcc.Graph(id='sector-pie'), width=12, lg=6, className='offset-lg-3'),
                className='mb-4'
            )
        ], width=12, lg=10)
    ], className='g-3')
], fluid=True, className='bg-light', style={'padding': '20px'})

print("‚úì Dash app layout created")

‚úì Dash app layout created


## Cell 13: Define Callbacks

Connects interactive controls (sliders, dropdowns) to visualization functions. Each callback updates one or more graphs when user changes filter values.

**7 Callbacks**:
1. Year slider ‚Üí Choropleth, Top-10, Sector pie
2. Top-N radio ‚Üí Top-10, Heatmap
3. Single country dropdown ‚Üí Single timeseries
4. Multi-country multi-select ‚Üí Multi-country comparison
5. Forecast horizon slider ‚Üí ARIMA forecast
6. Extrapolation year slider ‚Üí Linear trend


In [39]:
# Callback 1: Update choropleth when year changes
@app.callback(
    Output('choropleth-map', 'figure'), 
    Input('year-slider', 'value')
)
def update_choropleth(year):
    return create_choropleth(df, year)

# Callback 2: Update top-10 bar when year or top-N changes
@app.callback(
    Output('top10-bar', 'figure'), 
    [Input('year-slider', 'value'), Input('top-n-radio', 'value')]
)
def update_top10(year, n):
    return create_top10_bar(df, year, n)

# Callback 3: Update heatmap when top-N changes
@app.callback(
    Output('heatmap', 'figure'), 
    Input('top-n-radio', 'value')
)
def update_heatmap(top_n):
    return create_heatmap(df, top_n)

# Callback 4: Update single country time series when country changes
@app.callback(
    Output('timeseries-single', 'figure'), 
    Input('country-single', 'value')
)
def update_ts_single(country):
    return create_country_timeseries(df, country)

# Callback 5: Update multi-country comparison when countries change
@app.callback(
    Output('timeseries-multi', 'figure'), 
    Input('countries-multi', 'value')
)
def update_ts_multi(countries_list):
    return create_multi_country_comparison(df, countries_list)

# Callback 6: Update ARIMA forecast when horizon changes
@app.callback(
    Output('arima-forecast', 'figure'), 
    Input('forecast-slider', 'value')
)
def update_arima(forecast_years):
    return create_arima_forecast(df, forecast_years)

# Callback 7: Update linear trend when extrapolation year changes
@app.callback(
    Output('linear-trend', 'figure'), 
    Input('extrapolate-slider', 'value')
)
def update_linear(extrapolate_to):
    return create_linear_trend(df, extrapolate_to)

# Callback 8: Update sector pie when year changes
@app.callback(
    Output('sector-pie', 'figure'), 
    Input('year-slider', 'value')
)
def update_sector(year):
    return create_sector_pie(df, year)

print("‚úì All 8 callbacks registered")

‚úì All 8 callbacks registered


## Cell 14: Start Dash Server

Launches the interactive dashboard on `http://127.0.0.1:8050`

**Note**: Uncomment `app.run()` to start the server. Press **Kernel ‚Üí Interrupt** to stop.


In [34]:
# === START DASH SERVER ===

# Uncomment the line below to launch the dashboard:
app.run(debug=True, host='127.0.0.1', port=8050)

print("‚úì Dashboard is ready!")
print("\nTo start the server, uncomment the line above and run this cell.")
print("Then open: http://127.0.0.1:8050")
print("\nTo stop the server: Kernel ‚Üí Interrupt")

‚úì Dashboard is ready!

To start the server, uncomment the line above and run this cell.
Then open: http://127.0.0.1:8050

To stop the server: Kernel ‚Üí Interrupt


‚úì Dashboard is ready!

To start the server, uncomment the line above and run this cell.
Then open: http://127.0.0.1:8050

To stop the server: Kernel ‚Üí Interrupt



Series.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.


An unsupported index was provided and will be ignored when e.g. forecasting.


An unsupported index was provided and will be ignored when e.g. forecasting.


An unsupported index was provided and will be ignored when e.g. forecasting.


Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.


Non-invertible starting MA parameters found. Using zeros as starting parameters.


No supported index is available. Prediction results will be given with an integer index beginning at `start`.


No supported index is available. In the next version, calling this method in a model without a supported index will result in an exception.


No supported index is available. Prediction results will be given with an integer index beginning at `start`.


No supported index is available. In the next version, calling this method in a model without a sup