# ✨ Interactive Tool: Labour Market Tightness in France ✨

This Jupyter Notebook shows how to combine labour market data with geospatial information to build a fully interactive dashboard. In this project, we will:

- **Load & Prepare** the final labour market indicators.
- **Integrate Geospatial Data** using INSEE's *Zones d’Emploi* shapefile.
- **Build an Interactive Dashboard** with Dash & Plotly:
  - Dynamic **choropleth map** colored by labour tightness score.
  - Linked **line graph** and **data table** that respond to user interactions.
  - An Excel download of the data shown in the table.

---

## 🗺️ Project Overview

This notebook demonstrates a full geospatial analytics workflow:

1. **📂 Data Preparation & Merging**  
   - Load labour tightness indicators from a finalized CSV file.  
   - Load and simplify a shapefile of employment zones.  
   - Merge spatial and statistical data into a single GeoDataFrame.

2. **🧭 Geospatial Processing**  
   - Ensure all geometries are simplified and expressed in WGS84 (EPSG:4326).  
   - Optimize the GeoDataFrame for web-based map rendering.

3. **📊 Interactive Dashboard Development**  
   - Developed in Dash using modular layout and callbacks.  
   - Features:
     - A **map** showing the labour tightness score per employment zone.
     - Filters for classification granularity (FAP 22 / 87), professional family, region, and jobseeker category.
     - A **line chart** comparing a selected zone to national trends.
     - A **responsive data table** and **download button** for Excel export.
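
The granularity and jobseeker filters listed above jointly pick one score column. As a minimal sketch of that mapping, the helper below assembles the `lt_score_*` column names used in the map callback. `score_column` is a hypothetical name introduced for illustration and is not part of the notebook.

```python
def score_column(granularity: str, jobseekers: str) -> str:
    """Return the tightness-score column for a filter combination.

    Hypothetical helper: it only mirrors the column naming used in the
    notebook's map callback, lt_score_<a|abc>_<fap22|fap87>.
    """
    # Anything other than "FAP87" falls back to the coarser FAP 22 level.
    level = "fap87" if granularity == "FAP87" else "fap22"
    # Anything other than "A" is treated as the combined A, B & C group.
    group = "a" if jobseekers == "A" else "abc"
    return f"lt_score_{group}_{level}"


print(score_column("FAP22", "A"))
print(score_column("FAP87", "ABC"))
```

Keeping the mapping in one place would let the map, line chart, and table callbacks agree on column names without repeating the branching logic.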

> **Note:**  
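> Required Python packages include `pandas`, `geopandas`, `dash`, and `plotly`; the Excel export also needs a writer engine such as `openpyxl`. `base64`, `math`, `pathlib`, and `webbrowser` ship with the Python standard library.  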
> Ensure your directory structure and file paths match those referenced in the code.

---
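
The linked line graph and data table depend on which zone was clicked on the map. The sketch below shows one way to read that selection defensively. It assumes the `clickData` payload has the shape the notebook's callbacks read: a `points` list whose first entry carries a `location` key. `zone_from_click` is a hypothetical helper name, and the `"Paris"` default mirrors the fallback used in the callbacks.

```python
from typing import Optional


def zone_from_click(click_data: Optional[dict], default: str = "Paris") -> str:
    """Extract the clicked zone name, falling back to a default zone."""
    # No click has happened yet: Dash passes None for clickData.
    if not click_data:
        return default
    # Guard against a missing or empty "points" list.
    points = click_data.get("points") or []
    if not points:
        return default
    # A point without a usable "location" also falls back to the default.
    return points[0].get("location") or default


print(zone_from_click(None))
print(zone_from_click({"points": [{"location": "Lyon"}]}))
```

Unlike a direct `clickData["points"][0]` lookup, this version does not raise if the `points` list is empty.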


In [1]:
# =====================================================
# REQUIRED LIBRARIES & PACKAGE IMPORTS
# =====================================================

import math
import base64
import webbrowser
import dash
from pathlib import Path
import pandas as pd
import geopandas as gpd
import plotly.express as px
import plotly.graph_objects as go
from dash import html, dcc, Input, Output, State, dash_table


In [2]:
# ---------------------------------------------------
# Define the project's root directory
# ---------------------------------------------------
# Get the current working directory and assume that the project root is one level above.
project_root = Path().resolve().parent

# ---------------------------------------------------
# Construct file paths for each dataset
# ---------------------------------------------------
final_data = project_root / "data" / "3- Final Data" / "ratio_supply_demand.csv"
shapefile = project_root / "data" / "shapefiles" / "Zones d'Emploi" / "ze2020_2022.shp"
img_src1 = f"data:image/png;base64,{base64.b64encode((project_root / 'app' / 'Pictures' / 'image1.png').read_bytes()).decode('ascii')}"
img_src2 = f"data:image/jpeg;base64,{base64.b64encode((project_root / 'app' / 'Pictures' / 'image2.jpg').read_bytes()).decode('ascii')}"

# ---------------------------------------------------
# Import data from files
# ---------------------------------------------------
df = pd.read_csv(final_data).round(3)
ze_shp = gpd.read_file(shapefile)

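# ---------------------------------------------------
# Reproject to WGS84 (EPSG:4326) if the shapefile uses another CRS,
# then simplify the geometries to speed up map rendering.
# The tolerance is expressed in CRS units (degrees here): a higher value
# yields simpler geometries (but may lose detail). Adjust as needed.
# ---------------------------------------------------
if ze_shp.crs is not None and ze_shp.crs.to_epsg() != 4326:
    ze_shp = ze_shp.to_crs(epsg=4326)
ze_shp["geometry"] = ze_shp["geometry"].simplify(tolerance=0.01, preserve_topology=True)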


# ---------------------------------------------------
# Merge the simplified shapefile with your CSV data.
# Assume that 'zone_emploi' in df corresponds to 'libze2020' in ze_shp.
# ---------------------------------------------------
merged_df = df.merge(ze_shp[["geometry", "libze2020"]],
                     left_on="zone_emploi",
                     right_on="libze2020",
                     how="left").drop(columns="libze2020")

# ---------------------------------------------------
# Convert the merged DataFrame into a GeoDataFrame,
# specifying EPSG:4326 as the CRS (latitude/longitude).
# ---------------------------------------------------
gdf = gpd.GeoDataFrame(merged_df, geometry="geometry", crs="EPSG:4326")
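
The left merge above keeps every row of the CSV, so a zone whose name does not match the shapefile exactly (accents, apostrophes, spacing) ends up with a missing geometry and never appears on the map. A minimal sketch of how such mismatches could be listed with pandas' `indicator=True` option follows. The two small frames are toy stand-ins, not the project's data, and the placeholder strings in `geometry` only mark where polygons would be.

```python
import pandas as pd

# Toy stand-ins for the CSV data and the shapefile attribute table.
stats = pd.DataFrame({"zone_emploi": ["Paris", "Lyon", "Marseille"],
                      "score": [3, 4, 2]})
zones = pd.DataFrame({"libze2020": ["Paris", "Lyon"],
                      "geometry": ["<polygon>", "<polygon>"]})

# indicator=True adds a "_merge" column; in a left merge its values are
# "both" (key found on the right) or "left_only" (no matching key).
checked = stats.merge(zones, left_on="zone_emploi", right_on="libze2020",
                      how="left", indicator=True)

# Zones present in the statistics but absent from the shapefile.
unmatched = checked.loc[checked["_merge"] == "left_only", "zone_emploi"].unique().tolist()
print(unmatched)
```

Running the same check on the real `df` and `ze_shp` before building `gdf` would show whether any employment zones need their names harmonised.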



In [43]:
# ======================================================
# 1) INITIALIZE THE DASH APP & LAYOUT CONFIGURATION
# ======================================================

# --- App Initialization ---
app = dash.Dash(__name__)

# --- App Layout ---
app.layout = html.Div([

    # ---------------------
    # 1.1) Header Section
    # ---------------------
    html.Div([
        html.Img(src=img_src1, style={"height": "80px"}),
        html.Img(src=img_src2, style={"height": "120px"})
    ], style={"display": "flex", "justifyContent": "space-between", 
              "alignItems": "center", "padding": "10px"}),

    html.Div([
        html.H1(
            "Visualising Labour Market Tightness In France",
            style={
                "textAlign": "center", "border": "1px solid #ccc",
                "padding": "12px", "backgroundColor": "#f4f3ee",
                "fontFamily": "'Roboto', sans-serif", "fontWeight": "400",
                "letterSpacing": "0.5px", "width": "100%"
            }
        )
    ], style={"display": "flex", "justifyContent": "center"}),

    # ---------------------
    # 1.2) Filters + Map
    # ---------------------
    html.Div([

        # Filter Panel (Left)
        html.Div([
            html.Div([
                html.Label("Choose Job Classification Detail:", style={
                    "fontWeight": "bold", "fontSize": "16px",
                    "fontFamily": "Roboto, sans-serif", "marginBottom": "6px"
                }),
                dcc.RadioItems(
                    id='fap-granularity',
                    options=[
                        {'label': 'FAP 22 – General job sectors', 'value': 'FAP22'},
                        {'label': 'FAP 87 – Specific job types', 'value': 'FAP87'}
                    ],
                    value='FAP22',
                    labelStyle={'display': 'inline-block', 'marginRight': '20px', 'fontSize': '14px'},
                    inputStyle={"marginRight": "6px"}
                )
            ], style={"marginBottom": "30px"}),

            html.Div([
                html.Label("Which Jobseekers to Include?", style={
                    "fontWeight": "bold", "fontSize": "16px",
                    "fontFamily": "Roboto, sans-serif", "marginBottom": "6px"
                }),
                dcc.RadioItems(
                    id='jobseekers-type',
                    options=[
                        {'label': 'Category A only – Fully unemployed', 'value': 'A'},
                        {'label': 'A, B & C – Includes part-time & training', 'value': 'ABC'}
                    ],
                    value='A',
                    labelStyle={'display': 'inline-block', 'marginRight': '20px', 'fontSize': '14px'},
                    inputStyle={"marginRight": "6px"}
                )
            ], style={"marginBottom": "30px"}),

            html.Div([
                html.Label("Select a Category of Jobs or Professions:", style={
                    "fontWeight": "bold", "fontSize": "16px",
                    "fontFamily": "Roboto, sans-serif", "marginBottom": "6px"
                }),
                dcc.Dropdown(
                    id='fap-family',
                    options=[{'label': fam, 'value': fam} for fam in gdf["famille_pro22"].unique()],
                    value=gdf["famille_pro22"].unique()[0],
                    clearable=False,
                    style={"fontSize": "14px"}
                )
            ], style={"marginBottom": "30px"}),

            html.Div([
                html.Label("Zoom on a Region:", style={
                    "fontWeight": "bold", "fontSize": "16px",
                    "fontFamily": "Roboto, sans-serif", "marginBottom": "6px"
                }),
                dcc.Dropdown(
                    id='region',
                    options=[{'label': reg, 'value': reg} for reg in gdf["region"].unique()],
                    placeholder="Select a region",
                    clearable=True,
                    style={"fontSize": "14px"}
                )
            ])
        ], style={
            "width": "25%", "padding": "20px", "minWidth": "280px",
            "fontFamily": "Roboto, sans-serif", "display": "flex",
            "flexDirection": "column", "justifyContent": "center"
        }),

        # Map Display (Right)
        html.Div([
            dcc.Graph(id='map-graph', style={"height": "600px", "width": "100%"})
        ], style={"width": "65%", "padding": "20px", "minWidth": "600px"})

    ], style={"display": "flex", "justifyContent": "center", "alignItems": "flex-start", "flexWrap": "wrap"}),

    # ---------------------
    # 1.3) Graph + Table
    # ---------------------
    html.Div([

        html.Div([
            dcc.Graph(id='line-graph', style={"height": "450px", "width": "100%"})
        ], style={"width": "60%", "padding": "20px", "minWidth": "500px"}),

        html.Div([
            dash_table.DataTable(
                id='data-table',
                columns=[{'name': col, 'id': col} for col in df.columns],
                data=df.to_dict('records'),
                filter_action="native",
                sort_action="native",
                page_size=10,

                style_table={
                    'overflowX': 'auto','border': '1px solid #ccc',
                    'borderRadius': '8px', 'boxShadow': '0 2px 4px rgba(0, 0, 0, 0.1)',
                },

                style_cell={
                    'textAlign': 'left','fontFamily': 'Roboto, sans-serif',
                    'fontSize': '13px', 'padding': '2px 4px'
                },

                style_data={
                    'height': '30px', 'lineHeight': '30px', 'padding': '2px 4px'
                },

                style_header={
                    'backgroundColor': '#f4f4f4', 'fontWeight': 'bold',
                    'borderBottom': '1px solid #aaa', 'height': '32px', 'lineHeight': '32px'
                },

                style_data_conditional=[
                    {'if': {'row_index': 'odd'}, 'backgroundColor': '#fafafa'},
                    {'if': {'state': 'selected'}, 'backgroundColor': '#D2F3FF'}
                ]
            ),
            html.Button("Download Excel", id="download-button", n_clicks=0),
            dcc.Download(id="download")
        ], style={"width": "30%", "padding": "20px", "minWidth": "300px"})

    ], style={"display": "flex", "justifyContent": "center", "alignItems": "flex-start", "flexWrap": "wrap"})
])


# ======================================================
# 2) CALLBACKS
# ======================================================

# ------------------------------------------
# 2.1) Update Professional Family Dropdown
# ------------------------------------------
@app.callback(
    [Output('fap-family', 'options'),
     Output('fap-family', 'value')],
    [Input('fap-granularity', 'value')]
)
def update_fap_family(granularity):
    if granularity == "FAP22":
        families = gdf["famille_pro22"].unique()
    else:
        families = gdf["famille_pro87"].unique()
    options = [{'label': fam, 'value': fam} for fam in families]
    default_value = families[0] if len(families) > 0 else None
    return options, default_value


# ------------------------------------------
# 2.2) Update Map Based on Filters
# ------------------------------------------
@app.callback(
    Output('map-graph', 'figure'),
    [Input('fap-granularity', 'value'),
     Input('jobseekers-type', 'value'),
     Input('fap-family', 'value'),
     Input('region', 'value')]
)
def update_map(granularity, jobseekers, fap_family, region):
    if granularity == "FAP87":
        score_col = 'lt_score_a_fap87' if jobseekers == 'A' else 'lt_score_abc_fap87'
        fam_col = "famille_pro87"
    else:
        score_col = 'lt_score_a_fap22' if jobseekers == 'A' else 'lt_score_abc_fap22'
        fam_col = "famille_pro22"

    filtered_gdf = gdf.copy()
    if fap_family:
        filtered_gdf = filtered_gdf[filtered_gdf[fam_col] == fap_family]
    if region:
        filtered_gdf = filtered_gdf[filtered_gdf["region"] == region]
    # Keep one row per zone: the map is keyed on "zone_emploi".
    filtered_gdf = filtered_gdf.drop_duplicates(subset=["zone_emploi"])

    if filtered_gdf.crs is not None and filtered_gdf.crs.to_string() != 'EPSG:4326':
        filtered_gdf = filtered_gdf.to_crs(epsg=4326)

    if filtered_gdf.empty or score_col not in filtered_gdf.columns:
        return px.scatter_mapbox(
            lat=[], lon=[], mapbox_style="carto-positron",
            title="No data available for the selected filters"
        )

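    # Cast integer-valued scores to "1".."5" (a float column would otherwise give "1.0").
    scores = pd.to_numeric(filtered_gdf[score_col], errors="coerce")
    labels = scores.map(lambda s: str(int(s)) if float(s).is_integer() else "No Data", na_action="ignore")
    filtered_gdf = filtered_gdf.assign(lt_score_cat=labels.where(labels.isin(["1", "2", "3", "4", "5"]), "No Data"))
    geojson = filtered_gdf.__geo_interface__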
    color_discrete_map = {
        "1": "#a6d96a", "2": "#d9ef8b", "3": "#ffffbf",
        "4": "#fdae61", "5": "#f46d43", "No Data": "#d9d9d9"
    }

    fig = px.choropleth_mapbox(
        filtered_gdf,
        geojson=geojson,
        locations="zone_emploi",
        featureidkey="properties.zone_emploi",
        color="lt_score_cat",
        color_discrete_map=color_discrete_map,
        mapbox_style="carto-positron",
        center={"lat": 46.2276, "lon": 2.2137},
        zoom=4.65,
        opacity=0.8,
        category_orders={"lt_score_cat": ["1", "2", "3", "4", "5", "No Data"]},
        hover_data={"lt_score_cat": False, "zone_emploi": False},
        custom_data=["lt_score_cat", "zone_emploi"]
    )

    fig.update_traces(
        hovertemplate="<b>Score</b>: %{customdata[0]}<br><b>Zone d'Emploi</b>: %{customdata[1]}<extra></extra>"
    )
    

    if region:
        minx, miny, maxx, maxy = filtered_gdf.total_bounds
        center = {"lat": (miny + maxy) / 2, "lon": (minx + maxx) / 2}
        width = (maxx - minx) * 1.1 if (maxx - minx) > 0 else 360
        zoom = min(math.log2(360 / width), 15)
        fig.update_layout(mapbox=dict(center=center, zoom=zoom))
    else:
        fig.update_layout(mapbox=dict(center={"lat": 46.2276, "lon": 2.2137}, zoom=4.65))

    fig.update_layout(margin={"r": 0, "t": 0, "l": 0, "b": 0},
                      legend_title_text="Labour Tightness Score")
    return fig


# ------------------------------------------
# 2.3) Update Line Chart Based on Map Click
# ------------------------------------------
@app.callback(
    Output('line-graph', 'figure'),
    [Input('map-graph', 'clickData'),
     Input('fap-granularity', 'value'),
     Input('jobseekers-type', 'value'),
     Input('fap-family', 'value')]
)
def update_line_chart(clickData, granularity, jobseekers, fap_family):
    if granularity == "FAP87":
        ratio_col = "ratio_zscore_a_fap87" if jobseekers == "A" else "ratio_zscore_abc_fap87"
        fam_col = "famille_pro87"
    else:
        ratio_col = "ratio_zscore_a_fap22" if jobseekers == "A" else "ratio_zscore_abc_fap22"
        fam_col = "famille_pro22"

    default_zone_line = "Paris"
    selected_zone = clickData["points"][0].get("location") if clickData and clickData["points"][0].get("location") else default_zone_line

    month_order = ["January", "February", "March", "April", "May", "June",
                   "July", "August", "September", "October", "November", "December"]

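    # Restrict the zone series to the selected family so it matches its legend label.
    zone_df = gdf[(gdf["zone_emploi"] == selected_zone) & (gdf[fam_col] == fap_family)].copy()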
    zone_df["month"] = pd.Categorical(zone_df["month"], categories=month_order, ordered=True)
    zone_series = zone_df.groupby("month", as_index=False)[ratio_col].mean()

    trace1 = go.Scatter(
        x=zone_series["month"], y=zone_series[ratio_col],
        mode="lines+markers", name=f"Selected FAP in {selected_zone}",
        line=dict(width=3), marker=dict(size=6)
    )

    fap_df = gdf[gdf[fam_col] == fap_family].copy()
    fap_df["month"] = pd.Categorical(fap_df["month"], categories=month_order, ordered=True)
    fap_series = fap_df.groupby("month", as_index=False)[ratio_col].mean()

    trace2 = go.Scatter(
        x=fap_series["month"], y=fap_series[ratio_col],
        mode="lines", name="Selected FAP across France",
        line=dict(width=2, dash="dot")
    )

    all_series_df = gdf.copy()
    all_series_df["month"] = pd.Categorical(all_series_df["month"], categories=month_order, ordered=True)
    all_series = all_series_df.groupby("month", as_index=False)[ratio_col].mean()

    trace3 = go.Scatter(
        x=all_series["month"], y=all_series[ratio_col],
        mode="lines", name="All FAPs across France",
        line=dict(width=2, dash="dash")
    )

    fig = go.Figure(data=[trace1, trace2, trace3])
    fig.update_layout(
        title={'text': f"Z-score Ratio of the '{fap_family}' sector in {selected_zone}", 'x': 0.5, 'xanchor': 'center', 'font': dict(size=20)},
        xaxis_title="Month", yaxis_title="Z-score Ratio",
        font=dict(family="Roboto, sans-serif", size=12),
        plot_bgcolor="#ffffff", paper_bgcolor="#f5f3f4",
        hovermode="x unified",
        xaxis=dict(showgrid=True, gridcolor="#eee"),
        yaxis=dict(showgrid=True, gridcolor="#eee"),
        margin={"r": 30, "t": 70, "l": 40, "b": 80},
        legend=dict(orientation="h", yanchor="bottom", y=-0.3, xanchor="center", x=0.5)
    )
    fig.update_traces(hoverinfo="all", line_shape="spline")
    return fig


# ------------------------------------------
# 2.4) Update Table Based on Map Click
# ------------------------------------------
@app.callback(
    Output('data-table', 'data'),
    [Input('map-graph', 'clickData'),
     Input('fap-granularity', 'value'),
     Input('fap-family', 'value')]
)
def update_table(clickData, granularity, fap_family):
    fam_col = "famille_pro87" if granularity == "FAP87" else "famille_pro22"
    selected_zone = clickData["points"][0].get("location") if clickData and clickData["points"][0].get("location") else "Paris"
    dff = df.copy()
    dff = dff[dff[fam_col] == fap_family]
    dff = dff[dff["zone_emploi"] == selected_zone]
    return dff.to_dict('records')


# ------------------------------------------
# 2.5) Download Table Data to Excel
# ------------------------------------------
@app.callback(
    Output("download", "data"),
    [Input("download-button", "n_clicks")],
    [State("data-table", "derived_virtual_data")]
)
def download_excel(n_clicks, table_data):
    if n_clicks is None or n_clicks == 0:
        return dash.no_update
    dff = df if table_data is None else pd.DataFrame(table_data)
    return dcc.send_data_frame(dff.to_excel, "filtered_data.xlsx", index=False)


# ======================================================
# 3) RUN THE APP
# ======================================================

if __name__ == '__main__':
    webbrowser.open("http://127.0.0.1:8050")
    app.run(debug=False)
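
The region zoom in `update_map` relies on a simple heuristic: at zoom level 0 a web map spans roughly 360° of longitude, and each additional level halves that span, so a bounding box of a given width fits at about `log2(360 / width)`. The sketch below isolates that calculation. `zoom_for_bounds` is a hypothetical helper name, and the heuristic ignores the viewport size and the latitude extent, so treat the result as a starting point rather than an exact fit.

```python
import math


def zoom_for_bounds(minx: float, maxx: float,
                    padding: float = 1.1, max_zoom: float = 15.0) -> float:
    """Estimate a map zoom level from a longitude extent in degrees."""
    # Widen the extent slightly so polygons do not touch the map edges.
    width = (maxx - minx) * padding
    # A degenerate extent falls back to the whole-world view.
    if width <= 0:
        width = 360.0
    # Each zoom level halves the visible longitude span; cap at max_zoom.
    return min(math.log2(360.0 / width), max_zoom)


# Example with an arbitrary 4-degree-wide extent.
print(round(zoom_for_bounds(1.0, 5.0), 2))
```

The `padding` and `max_zoom` defaults mirror the `1.1` factor and the cap of `15` used in the callback.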
