# 07 - Real-Time Dashboard (Live Monitoring)

Interactive dashboard for monitoring the Gold streaming tables in real time.

**Concept:** The Gold tables are re-read automatically every X seconds (a 2-second pause between refreshes in `app.py`) and the charts are redrawn on each pass.

**Use cases:** air-traffic KPIs, anomaly detection, load monitoring.
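
The refresh principle can be sketched independently of Spark and Streamlit. The following is a minimal illustration, not the dashboard's actual implementation: `poll`, `fetch`, `render`, and `max_iterations` are hypothetical names introduced here, and the bounded iteration count exists only so that the example terminates.

```python
import time
from typing import Callable, List


def poll(fetch: Callable[[], object],
         render: Callable[[object], None],
         interval_s: float = 2.0,
         max_iterations: int = 3) -> int:
    """Call fetch() then render() once per cycle; return the number of successful refreshes."""
    refreshed = 0
    for _ in range(max_iterations):
        try:
            render(fetch())
            refreshed += 1
        except Exception as exc:
            # A failed read must not stop the loop; report it and try again next cycle
            print(f"refresh failed: {exc}")
        time.sleep(interval_s)
    return refreshed


# Usage with stand-in callables (hypothetical data, no Spark involved)
snapshots: List[object] = []
count = poll(lambda: {"flights": 42}, snapshots.append, interval_s=0.01)
```

Keeping the read (`fetch`) and the drawing (`render`) as separate callables makes it possible to exercise the loop without a data lake behind it.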

## Configuration & Imports

In [1]:
import threading
import time

import pandas as pd
import plotly.graph_objects as go
from ipywidgets import VBox, HBox, HTML, Layout
from pyspark.sql.functions import col, desc, max as spark_max, sum as spark_sum

from config import get_s3_path, create_spark_session

# --- Configuration and Spark connection ---
# Locations of the two Gold tables read by the dashboard
GOLD_TRAFFIC_PATH = get_s3_path("gold", "traffic_by_country")
GOLD_METRICS_PATH = get_s3_path("gold", "metrics_by_category")

spark = create_spark_session("DashboardGrafana")

print(f"✅ Source Trafic:  {GOLD_TRAFFIC_PATH}")
print(f"✅ Source Metrics: {GOLD_METRICS_PATH}")

✅ Configuration chargée depuis .env
:: loading settings :: url = jar:file:/opt/conda/lib/python3.12/site-packages/pyspark/jars/ivy-2.5.1.jar!/org/apache/ivy/core/settings/ivysettings.xml


Ivy Default Cache set to: /home/jovyan/.ivy2/cache
The jars for the packages stored in: /home/jovyan/.ivy2/jars
org.apache.hadoop#hadoop-aws added as a dependency
com.amazonaws#aws-java-sdk-bundle added as a dependency
org.apache.spark#spark-hadoop-cloud_2.12 added as a dependency
io.delta#delta-spark_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-0a7a7a82-af5c-4982-af0b-2b2f868f4e11;1.0
	confs: [default]
	found org.apache.hadoop#hadoop-aws;3.3.4 in central
	found com.amazonaws#aws-java-sdk-bundle;1.12.262 in central
	found org.wildfly.openssl#wildfly-openssl;1.0.7.Final in central
	found org.apache.spark#spark-hadoop-cloud_2.12;3.5.3 in central
	found org.apache.hadoop#hadoop-client-runtime;3.3.4 in central
	found org.apache.hadoop#hadoop-client-api;3.3.4 in central
	found org.xerial.snappy#snappy-java;1.1.10.5 in central
	found org.slf4j#slf4j-api;2.0.7 in central
	found commons-logging#commons-logging;1.1.3 in central
	found com.google.c

✅ Spark Session 'DashboardGrafana' configurée
✅ Source Trafic:  s3a://datalake/gold/traffic_by_country
✅ Source Metrics: s3a://datalake/gold/metrics_by_category
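
`get_s3_path` comes from the project's `config` module, which is not shown in this notebook. Judging only from the two paths printed above, it appears to join a bucket, a layer, and a table name into an `s3a://` URI. A hypothetical equivalent might look like the sketch below; the name `build_s3_path` and the default bucket `datalake` are assumptions inferred from that output, and the real helper may behave differently.

```python
def build_s3_path(layer: str, table: str, bucket: str = "datalake") -> str:
    """Hypothetical stand-in for config.get_s3_path (assumed layout: bucket/layer/table)."""
    return f"s3a://{bucket}/{layer}/{table}"


traffic_path = build_s3_path("gold", "traffic_by_country")
metrics_path = build_s3_path("gold", "metrics_by_category")
print(traffic_path)
print(metrics_path)
```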


In [2]:
%%writefile app.py
import streamlit as st
import time
import pandas as pd
import sys
import os

# Add the notebook working directory to the path so that config can be imported
sys.path.insert(0, '/home/jovyan/work')

import plotly.express as px
from pyspark.sql.functions import col

# Import the shared configuration
from config import get_s3_path, create_spark_session

# 1. Page configuration
st.set_page_config(page_title="OpenSky Live", page_icon="✈️", layout="wide")

# Dark-mode CSS
st.markdown("""
<style>
    .stApp { background-color: #0e1117; }
    div[data-testid="stMetricValue"] { font-size: 2.5rem; color: #ff9900; }
</style>
""", unsafe_allow_html=True)

# 2. Spark session (same configuration as the notebooks)
@st.cache_resource
def get_spark():
    try:
        return create_spark_session("StreamlitDashboard")
    except Exception as e:
        st.error(f"❌ Spark error: {e}")
        st.stop()

spark = get_spark()

# S3 paths (same helper as the notebooks)
GOLD_TRAFFIC = get_s3_path("gold", "traffic_by_country")
GOLD_METRICS = get_s3_path("gold", "metrics_by_category")

st.title("✈️ OpenSky Live Monitoring")
st.caption(f"📊 Traffic: {GOLD_TRAFFIC}")

placeholder = st.empty()

# 3. Refresh loop
iteration = 0
while True:
    iteration += 1
    with placeholder.container():
        try:
            # Read the Delta tables (same format as the other notebooks)
            df_traf = spark.read.format("delta").load(GOLD_TRAFFIC)
            df_met = spark.read.format("delta").load(GOLD_METRICS)

            # Convert to pandas (row count limited for performance)
            pdf_traf = df_traf.orderBy(col("window").desc()).limit(100).toPandas()
            pdf_met = df_met.orderBy(col("window").desc()).limit(100).toPandas()

            if not pdf_traf.empty and not pdf_met.empty:
                # Keep only the most recent window
                last_win_traf = pdf_traf['window'].max()
                cur_traf = pdf_traf[pdf_traf['window'] == last_win_traf].sort_values('flight_count', ascending=False)
                
                last_win_met = pdf_met['window'].max()
                cur_met = pdf_met[pdf_met['window'] == last_win_met]

                # KPIs
                total = int(cur_traf['flight_count'].sum())
                top_c = cur_traf.iloc[0]['origin_country'] if len(cur_traf) > 0 else "-"
                avg_vel = int(cur_met['avg_velocity_kmh'].mean()) if 'avg_velocity_kmh' in cur_met.columns else 0
                avg_alt = int(cur_met['avg_altitude_m'].mean()) if 'avg_altitude_m' in cur_met.columns else 0

                # KPI display
                k1, k2, k3, k4 = st.columns(4)
                # time.gmtime() makes the clock genuinely UTC; strftime alone formats local time
                k1.metric("⏰ UTC Time", time.strftime('%H:%M:%S', time.gmtime()))
                k2.metric("✈️ Aircraft", f"{total:,}")
                k3.metric("🌍 Top Country", top_c)
                k4.metric("🚀 Avg. Speed", f"{avg_vel} km/h")

                st.divider()

                # Charts
                c1, c2 = st.columns(2)
                
                with c1:
                    st.subheader("🌍 Top 10 Countries")
                    fig = px.bar(
                        cur_traf.head(10), 
                        x="flight_count", 
                        y="origin_country", 
                        orientation='h',
                        template="plotly_dark",
                        color="flight_count",
                        color_continuous_scale="Oranges"
                    )
                    fig.update_layout(showlegend=False, height=400)
                    st.plotly_chart(fig, use_container_width=True)
                
                with c2:
                    st.subheader("🚀 Performance by Category")
                    if 'category' in cur_met.columns and not cur_met.empty:
                        fig2 = px.scatter(
                            cur_met, 
                            x="avg_velocity_kmh", 
                            y="avg_altitude_m",
                            size="aircraft_count",
                            color="category",
                            template="plotly_dark",
                            labels={"avg_velocity_kmh": "Speed (km/h)", "avg_altitude_m": "Altitude (m)"}
                        )
                        fig2.update_layout(height=400)
                        st.plotly_chart(fig2, use_container_width=True)
                    else:
                        st.info("No category data")

                # Footer stats
                st.caption(f"🔄 Refresh #{iteration} | Window: {str(last_win_traf)[:19] if last_win_traf else 'N/A'}")

            else:
                st.warning("⏳ Waiting for data in the Data Lake...")
                st.info("💡 Check that notebooks 01, 02, 03, and 03b are running")

        except Exception as e:
            st.error(f"❌ Error: {str(e)[:200]}")
            st.info("Retrying in 5 seconds...")
            time.sleep(5)
    
    time.sleep(2)

Overwriting app.py
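
The latest-window filtering and KPI step of the refresh loop can be isolated as a plain pandas function, which makes it easy to check without Spark or S3. This is a sketch on made-up data: the function name `latest_window_kpis` and the sample rows are illustrative, and `window` is assumed here to be a sortable scalar (a timestamp string), which may not match the real Gold schema.

```python
import pandas as pd


def latest_window_kpis(pdf_traf: pd.DataFrame) -> dict:
    """Return the total flight count and the top country for the most recent window."""
    if pdf_traf.empty:
        return {"window": None, "total": 0, "top_country": "-"}
    last_win = pdf_traf["window"].max()
    # Keep only rows of the latest window, busiest country first
    cur = (pdf_traf[pdf_traf["window"] == last_win]
           .sort_values("flight_count", ascending=False))
    return {
        "window": last_win,
        "total": int(cur["flight_count"].sum()),
        "top_country": cur.iloc[0]["origin_country"],
    }


# Hypothetical sample: two countries in the latest window, one in an older window
sample = pd.DataFrame({
    "window": ["2024-01-01 10:00:00", "2024-01-01 10:00:00", "2024-01-01 09:55:00"],
    "origin_country": ["France", "Germany", "Spain"],
    "flight_count": [120, 340, 999],
})
kpis = latest_window_kpis(sample)
print(kpis)
```

The older window is ignored even though it holds the largest count, which is the behavior the dashboard relies on when it shows only the current state of traffic.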


In [None]:
# Launch the Streamlit server
import subprocess

print("🚀 Launching the Streamlit dashboard on http://localhost:8501")
print("⏳ Please wait a few seconds...")

# Start the server as a child process
process = subprocess.Popen([
    "streamlit", "run", "app.py", 
    "--server.port", "8501", 
    "--server.address", "0.0.0.0",
    "--logger.level", "info"
])

print("✅ Streamlit server started!")
print("📊 Open the dashboard: http://localhost:8501")
print("🛑 To stop it, restart the notebook kernel")

# Block the cell while the server runs; interrupting the kernel stops it
try:
    process.wait()
except KeyboardInterrupt:
    print("\n⏹️ Stopping the server...")
    process.terminate()

🚀 Launching the Streamlit dashboard on http://localhost:8501
⏳ Please wait a few seconds...
✅ Streamlit server started!
📊 Open the dashboard: http://localhost:8501
🛑 To stop it, restart the notebook kernel

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.


  You can now view your Streamlit app in your browser.

  URL: http://0.0.0.0:8501
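
Because `process.wait()` keeps the cell busy for as long as the server runs, the notebook cannot execute anything else in the meantime. One alternative, sketched here under assumptions, is to keep the process handle and stop the server explicitly. `start_background` and `stop_background` are hypothetical helper names, and the example launches a harmless sleeping Python process rather than Streamlit so that it runs anywhere.

```python
import subprocess
import sys
from typing import List


def start_background(cmd: List[str]) -> subprocess.Popen:
    """Start cmd without blocking and return the process handle."""
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def stop_background(proc: subprocess.Popen, timeout_s: float = 5.0) -> int:
    """Request termination, force-kill if the process does not exit in time, return its exit code."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode


# Stand-in for ["streamlit", "run", "app.py", "--server.port", "8501"]
proc = start_background([sys.executable, "-c", "import time; time.sleep(60)"])
was_running = proc.poll() is None
exit_code = stop_background(proc)
```

With this pattern the launch cell returns immediately, and a later cell can call `stop_background(proc)` instead of restarting the kernel.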
