<a href="https://colab.research.google.com/github/ramahasiba/Zakey/blob/main/Rama_Hasiba_Capstone_Project_Interactive_Dashboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🚀 Capstone Project - Interactive Dashboard

## Project Overview

Welcome to your first capstone project! The goal of this week is to integrate your Pandas data manipulation skills with your new visualization abilities to create a simple, interactive dashboard. We will use **Streamlit** to build the web app and **ngrok** to create a public URL for it, all from within this Google Colab notebook.

This project will challenge you to:
- Use **Markdown cells** for clear documentation.
- Load and prepare a dataset using **Pandas**.
- Build an interactive web application using **Streamlit**.
- Create dashboard components like dropdowns and sliders.
- Display data and plots dynamically based on user input.
- Deploy the app temporarily using **ngrok**.

---

## 🌍 Getting Started

1.  **Open a new Colab notebook.**
2.  **Install Libraries**: We need to install `streamlit` for building the app and `pyngrok` to expose our app to the web.
3.  **Load the Dataset**: We'll use the **Global Temperatures** dataset (`GlobalTemperatures.csv`) for this project.


## 🔧 Project Workflow & Exercises

### Step 1: Install Libraries and Import Dependencies

First, let's get our environment ready by installing the necessary packages and importing our libraries.


In [None]:
# Install Streamlit and pyngrok
!pip install streamlit -q
!pip install pyngrok -q
!pip install python-dotenv -q

# Import libraries
import streamlit as st
import pandas as pd
import plotly.express as px
from pyngrok import ngrok
import os
import subprocess
import time
from IPython.display import display, IFrame
from dotenv import load_dotenv



### Read the ngrok API Key

In [None]:

load_dotenv('.env')

authtoken = os.getenv("NGROK_API_KEY")

if authtoken:
  print("NGROK_API_KEY loaded successfully.")
else:
  print("NGROK_API_KEY not found in .env file.")

NGROK_API_KEY loaded successfully.
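
For reference, `load_dotenv` reads `KEY=VALUE` pairs from the file into the process environment. The sketch below mimics that behaviour with the standard library only, to show what a minimal `.env` is expected to contain. `parse_env_text` is a hypothetical helper written for this illustration (it is not part of python-dotenv), and the token value is a placeholder.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Illustration only: python-dotenv handles quoting, `export` prefixes,
    and variable expansion far more thoroughly than this.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content for a .env file; never commit a real token.
sample_env = """
# ngrok credentials
NGROK_API_KEY="your-token-here"
"""

parsed = parse_env_text(sample_env)
print(parsed)
```

Keeping the token in `.env` rather than in a notebook cell means it does not end up in the saved notebook or its output.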


### Step 2: Load and Prepare the Data

Now, we'll load the dataset and perform some basic cleaning, similar to our previous exercises.




In [None]:
# Load the dataset
url = "https://raw.githubusercontent.com/Steven-Alvarado/Global-Temperature-Analysis/refs/heads/main/GlobalTemperatures.csv"
df = pd.read_csv(url)

# --- Data Cleaning ---
# Convert 'dt' to datetime and extract 'Year'
df['dt'] = pd.to_datetime(df['dt'])
df['Year'] = df['dt'].dt.year

# Handle missing temperatures
df.dropna(subset=['LandAverageTemperature'], inplace=True)
df

Unnamed: 0,dt,LandAverageTemperature,LandAverageTemperatureUncertainty,LandMaxTemperature,LandMaxTemperatureUncertainty,LandMinTemperature,LandMinTemperatureUncertainty,LandAndOceanAverageTemperature,LandAndOceanAverageTemperatureUncertainty,Year
0,1850-01-01,0.749,1.105,8.242,1.738,-3.206,2.822,12.833,0.367,1850
1,1850-02-01,3.071,1.275,9.970,3.007,-2.291,1.623,13.588,0.414,1850
2,1850-03-01,4.954,0.955,10.347,2.401,-1.905,1.410,14.043,0.341,1850
3,1850-04-01,7.217,0.665,12.934,1.004,1.018,1.329,14.667,0.267,1850
4,1850-05-01,10.004,0.617,15.655,2.406,3.811,1.347,15.507,0.249,1850
...,...,...,...,...,...,...,...,...,...,...
1987,2015-08-01,14.755,0.072,20.699,0.110,9.005,0.170,17.589,0.057,2015
1988,2015-09-01,12.999,0.079,18.845,0.088,7.199,0.229,17.049,0.058,2015
1989,2015-10-01,10.801,0.102,16.450,0.059,5.232,0.115,16.290,0.062,2015
1990,2015-11-01,7.433,0.119,12.892,0.093,2.157,0.106,15.252,0.063,2015
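
To see what each cleaning step does in isolation, the same three operations can be run on a tiny inline sample. This is a minimal sketch: the CSV text below is a made-up stand-in for `GlobalTemperatures.csv` with only two of its columns, and the names `sample` and `clean` are chosen for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for the real file; the second row has a missing temperature.
sample_csv = io.StringIO(
    "dt,LandAverageTemperature\n"
    "1850-01-01,0.749\n"
    "1850-02-01,\n"
    "1851-01-01,1.2\n"
)

sample = pd.read_csv(sample_csv)

# Parse the date strings, then derive an integer Year column from them
sample["dt"] = pd.to_datetime(sample["dt"])
sample["Year"] = sample["dt"].dt.year

# Drop rows whose temperature is missing and renumber the index
clean = sample.dropna(subset=["LandAverageTemperature"]).reset_index(drop=True)
print(clean)
```

Returning a new frame from `dropna` instead of using `inplace=True` keeps the original `sample` available for comparison.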


Check the minimum and maximum `Year` values so we can use them later.

In [None]:
print(df['Year'].min())
print(df['Year'].max())

1850
2015
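
These bounds matter because a range slider's default value should lie inside its minimum and maximum. A small sketch of how a hard-coded default could be clipped to the data's bounds follows; `clamp_range` is a hypothetical helper, not a Streamlit or Pandas function.

```python
def clamp_range(default, lower, upper):
    """Clip a (start, end) default so both ends lie within [lower, upper]."""
    start, end = default
    start = min(max(start, lower), upper)
    end = min(max(end, lower), upper)
    # Keep the pair ordered in case the inputs were reversed
    return (min(start, end), max(start, end))


data_min, data_max = 1850, 2015  # the values printed above

print(clamp_range((1900, 2015), data_min, data_max))  # already inside the bounds
print(clamp_range((1800, 2020), data_min, data_max))  # clipped to the bounds
```

The result could be passed as the default value of the year slider so the app keeps working if the dataset's year span changes.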


In [None]:
%%writefile app.py
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

# Page Config
st.set_page_config(
    page_title="Global Temperatures — Original Dashboard",
    page_icon="🌡️",
    layout="wide"
)

# Data
@st.cache_data
def load_data():
    url = "https://raw.githubusercontent.com/Steven-Alvarado/Global-Temperature-Analysis/refs/heads/main/GlobalTemperatures.csv"
    df = pd.read_csv(url)
    df["dt"] = pd.to_datetime(df["dt"])
    df["Year"] = df["dt"].dt.year
    df["Month"] = df["dt"].dt.month
    df = df.dropna(subset=["LandAverageTemperature"])
    return df

df = load_data()

# Sidebar Controls
st.sidebar.header("Controls")
min_year, max_year = int(df["Year"].min()), int(df["Year"].max())
year_range = st.sidebar.slider("Year range", min_year, max_year, (1900, 2015), step=1)

smooth_k = st.sidebar.slider("Smoothing window (years)", 1, 15, 10, step=1)
remove_outliers = st.sidebar.checkbox("Remove monthly outliers (IQR method)", value=False)

st.sidebar.markdown("---")
st.sidebar.subheader("Anomaly baseline")
baseline_map = {
    "1850–1900": (1850, 1900),
    "1901–1930": (1901, 1930),
    "1951–1980": (1951, 1980),
    "1981–2010": (1981, 2010),
}

# Build the dropdown from the map keys so labels and ranges cannot drift apart
baseline = st.sidebar.selectbox(
    "Reference period",
    options=list(baseline_map),
    index=2
)

df = df[(df["Year"] >= year_range[0]) & (df["Year"] <= year_range[1])].copy()

#  outlier removal option (per month)
if remove_outliers:
    def iqr_filter(g):
        q1, q3 = g.quantile(0.25), g.quantile(0.75)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        return g.between(low, high)
    mask = df.groupby("Month")["LandAverageTemperature"].transform(iqr_filter)
    df = df[mask].copy()


# Aggregations
yearly = (
    df.groupby("Year", as_index=False)["LandAverageTemperature"]
      .mean()
      .rename(columns={"LandAverageTemperature": "AvgTemp"})
)

# Smoothing
if smooth_k > 1:
    yearly["Smoothed"] = yearly["AvgTemp"].rolling(window=smooth_k, center=True, min_periods=1).mean()
else:
    yearly["Smoothed"] = yearly["AvgTemp"]

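# Baseline mean for anomaly.
# Use the full cached dataset so the reference period still works when it
# falls partly or wholly outside the selected year range.
b_start, b_end = baseline_map[baseline]
full_df = load_data()
baseline_df = full_df[(full_df["Year"] >= b_start) & (full_df["Year"] <= b_end)]
baseline_mean = baseline_df["LandAverageTemperature"].mean() if not baseline_df.empty else np.nan
yearly["Anomaly"] = yearly["AvgTemp"] - baseline_mean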


st.title("Global Temperature Explorer")

# KPI row: selected period, baseline mean, and the most recent year's anomaly
latest_row = yearly.sort_values("Year").iloc[-1]
delta_vs_baseline = latest_row["Anomaly"]
kpi_cols = st.columns(3)
kpi_cols[0].metric("Selected period", f"{year_range[0]}–{year_range[1]}")
kpi_cols[1].metric("Baseline mean (°C)", f"{baseline_mean:.2f}" if not np.isnan(baseline_mean) else "N/A")
kpi_cols[2].metric("Latest anomaly (°C)", f"{delta_vs_baseline:.2f}" if not np.isnan(delta_vs_baseline) else "N/A")

# Tabs
tab1, tab3 = st.tabs(["Overview", "Data"])

with tab1:
    st.subheader("Long-run Temperature Trend")
    fig = go.Figure()
    fig.add_trace(go.Scatter(
        x=yearly["Year"], y=yearly["AvgTemp"],
        mode="lines+markers",
        name="Yearly average"
    ))
    fig.add_trace(go.Scatter(
        x=yearly["Year"], y=yearly["Smoothed"],
        mode="lines",
        name=f"{smooth_k}-year moving average"
    ))
    fig.update_layout(
        xaxis_title="Year",
        yaxis_title="Average Land Temperature (°C)",
        margin=dict(l=10, r=10, t=40, b=10),
        title=f"Global Average Land Temperature ({year_range[0]}–{year_range[1]})"
    )
    st.plotly_chart(fig, use_container_width=True)

with tab3:
    st.subheader("Data Preview")
    st.write("Filtered rows within your selected period.")
    st.dataframe(df.sort_values("dt").reset_index(drop=True), use_container_width=True)
    st.download_button(
        "Download filtered CSV",
        data=df.to_csv(index=False).encode("utf-8"),
        file_name=f"temperatures_{year_range[0]}_{year_range[1]}.csv",
        mime="text/csv"
    )


Overwriting app.py
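
The smoothing and anomaly steps in `app.py` can be tried on a toy frame outside Streamlit. This is a minimal sketch with made-up temperatures, a 3-year window, and a 2000–2002 reference period chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Made-up yearly averages, standing in for the `yearly` frame built in app.py
yearly = pd.DataFrame({
    "Year": [2000, 2001, 2002, 2003, 2004],
    "AvgTemp": [9.0, 9.3, 9.6, 9.3, 9.8],
})

# Centered moving average; min_periods=1 lets the edges use whatever
# neighbours exist instead of producing NaN.
yearly["Smoothed"] = (
    yearly["AvgTemp"].rolling(window=3, center=True, min_periods=1).mean()
)

# Anomaly = yearly average minus the mean over the reference period
in_baseline = yearly["Year"].between(2000, 2002)
baseline_mean = yearly.loc[in_baseline, "AvgTemp"].mean()
yearly["Anomaly"] = yearly["AvgTemp"] - baseline_mean

print(yearly.round(3))
```

With `center=True`, the first smoothed value averages only 2000 and 2001, which is why the smoothed line in the dashboard is less reliable near both ends of the selected range.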


In [None]:
authtoken = os.getenv("NGROK_API_KEY")

# Set up ngrok tunnel with the provided authtoken
if authtoken:
  try:
    # Kill any existing ngrok process and its tunnels
    ngrok.kill()
    # Set auth token
    ngrok.set_auth_token(authtoken)
    # Run the streamlit app as a background process
    process = subprocess.Popen(["streamlit", "run", "app.py"])
    print("Starting Streamlit server in the background...")

    # Give the app a moment to start up
    time.sleep(5)

    # Open a tunnel to the Streamlit port.
    # ngrok.connect returns an NgrokTunnel object; its public_url
    # attribute holds the URL as a plain string.
    tunnel = ngrok.connect(addr='8501')
    print(f"Streamlit App URL: {tunnel}")
    print("The dashboard is now running below. It may take a moment to load.")

    # Display the app in an IFrame (IFrame needs the URL string, not the tunnel object)
    display(IFrame(tunnel.public_url, width='100%', height=800))

  except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure your authtoken is correct and that Streamlit is installed.")

else:
  print("NGROK_API_KEY not found. Add it to your .env file and re-run the key-loading cell.")

Starting Streamlit server in the background...
Streamlit App URL: NgrokTunnel: "https://lavina-hamulate-crankly.ngrok-free.dev" -> "http://localhost:8501"
The dashboard is now running below. It may take a moment to load.
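
The cell above waits a fixed five seconds before opening the tunnel. One possible alternative is to poll the port until the server accepts connections. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and the demo connects to a throwaway local listener rather than a real Streamlit server.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; pause briefly and retry
            time.sleep(interval)
    return False


# Demo: bind a temporary listener on a free port and wait for it
with socket.socket() as listener:
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    ready = wait_for_port("127.0.0.1", demo_port, timeout=2.0)
    print(ready)
```

In the notebook, a call such as `wait_for_port("localhost", 8501)` could stand in for `time.sleep(5)`, assuming Streamlit is serving on its default port as in the cell above.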
