<h1 style="background-color:#4E9A06; color:#ffffff; padding:10px 15px; border-radius:5px; margin-top:1rem; margin-bottom:1rem; text-align:center;">
  02b – Interactive EDA for stakeholders
</h1>

<h2 style="color: #4E9A06; margin-top: 1rem; margin-bottom: 0.5rem;">
Objectives
</h2>
<ul>
  <li><strong>Provide non-technical stakeholders</strong> with intuitive, interactive visualizations of key findings from our climate-disease dataset.</li>
  <li><strong>Highlight</strong> spatial, temporal, and anomaly-driven disease risks in wild vs. agricultural systems.</li>
  <li><strong>Enable exploration</strong> of model insights (e.g. feature importances) without code.</li>
</ul>

<h2 style="color: #4E9A06; margin-top: 1rem; margin-bottom: 0.5rem;">
Inputs
</h2>
<ul>
  <li><strong>Cleaned dataset</strong>: <code>data/processed/merged_climate_disease_final.csv</code> (ETL-processed, anomalies computed, incidence zones assigned).</li>
  <li><strong>Model outputs</strong>: Pre-computed feature importances and OLS interaction summaries for Hypotheses 1–5.</li>
  <li><strong>Geographic metadata</strong>: Latitude/longitude, system type (Ag vs. Wild), pathogen group.</li>
</ul>

<h2 style="color: #4E9A06; margin-top: 1rem; margin-bottom: 0.5rem;">
Outputs
</h2>
<ul>
  <li><strong>Interactive world map</strong> showing survey locations colored by temperature and shaped by system type.</li>
  <li><strong>Anomaly vs. incidence plots</strong> (temperature &amp; precipitation) with LOWESS smoothing, colored by system, pathogen group, and transmission mode.</li>
  <li><strong>Stacked bar charts</strong> of pathogen distributions and host orders.</li>
  <li><strong>Feature importance dashboard</strong> (bar chart or table) to reveal top predictors from our ML models.</li>
</ul>

<h2 style="color: #4E9A06; margin-top: 1rem; margin-bottom: 0.5rem;">
Additional Comments
</h2>
<ul>
  <li>All visuals are implemented with Plotly/Streamlit, so hover tooltips, zooming, panning, and legend toggling work without extra code.</li>
  <li>Designed for non-technical audiences: tooltips, legends, and clear titles.</li>
  <li>Future enhancements: geographic filtering sliders, time-series animations, exportable snapshots.</li>
</ul>
<hr>
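
<p>The chart cell below uses <code>df</code>, <code>WILD_COLOR</code>, and <code>AG_COLOR</code> without defining them. A minimal setup sketch, assuming the CSV path from the Inputs list; the helper <code>load_dataset</code>, the column check, and both color values are illustrative placeholders rather than the project's actual settings.</p>

```python
from pathlib import Path

import pandas as pd

# Path taken from the Inputs list above.
DATA_PATH = Path("data/processed/merged_climate_disease_final.csv")

# Placeholder colors (assumption): substitute the project's palette.
WILD_COLOR = "#4E9A06"
AG_COLOR = "#C4A000"

# Columns the stacked bar chart cell reads.
REQUIRED_COLUMNS = ["Antagonist_type_general", "system_type", "Host_order"]


def load_dataset(path=DATA_PATH):
    """Read the cleaned CSV and verify the columns the charts depend on."""
    df = pd.read_csv(path)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return df
```

<p>In the notebook this would be followed by <code>df = load_dataset()</code>. Failing early on a missing column gives a clearer message than an error raised deep inside a <code>groupby</code>.</p>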


In [None]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Expects `df`, `WILD_COLOR`, and `AG_COLOR` to be defined before this cell runs.

# 1) Prepare counts
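agent_counts = (
    df
    .groupby(['Antagonist_type_general', 'system_type'], observed=True)
    .size()
    .unstack(fill_value=0)
    # Fix the display order; reindex fills absent agent types with 0
    # instead of raising a KeyError as .loc would.
    .reindex(index=['Virus', 'Eukaryotic parasite', 'Pest', 'Bacteria'], fill_value=0)
)

host_counts = (
    df[df['Host_order'] != 'Unknown']
      # observed=True keeps the filtered-out 'Unknown' level from
      # reappearing as an empty row when Host_order is categorical.
      .groupby(['Host_order', 'system_type'], observed=True)
      .size()
      .unstack(fill_value=0)
)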
# Sort host orders by total desc
host_counts['total'] = host_counts.sum(axis=1)
host_counts = host_counts.sort_values('total', ascending=False).drop(columns='total')

# 2) Create a 1×2 subplot
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=("a) Disease‐causing Agents by System",
                    "b) Host Plant Orders by System"),
    column_widths=[0.3, 0.7]
)

# --- Left panel: agents ---
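# `system` rather than `sys`, to avoid shadowing the standard-library module name
for system, color in zip(["Natural", "Ag"], [WILD_COLOR, AG_COLOR]):
    fig.add_trace(
        go.Bar(
            x=agent_counts.index,
            y=agent_counts[system],
            name=system,
            legendgroup=system,  # group traces by system type
            marker_color=color
        ),
        row=1, col=1
    )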

fig.update_xaxes(title_text="Disease‐causing Agent", row=1, col=1)
fig.update_yaxes(title_text="Observations", row=1, col=1)

# --- Right panel: host orders ---
for system, color in zip(["Natural", "Ag"], [WILD_COLOR, AG_COLOR]):
    fig.add_trace(
        go.Bar(
            x=host_counts.index,
            y=host_counts[system],
            name=system,
            legendgroup=system,  # group traces by system type
            marker_color=color,
            showlegend=False  # left panel already supplies the legend entries
        ),
        row=1, col=2
    )

fig.update_xaxes(title_text="Host plant order", row=1, col=2, tickangle=45)
fig.update_yaxes(title_text="Observations", row=1, col=2)

# 3) Final layout tweaks
fig.update_layout(
    barmode='stack',
    template='simple_white',
    height=600, width=1000,
    legend=dict(x=0.75, y=0.95)
)

# 4) Render inline
import plotly.io as pio
pio.renderers.default = "notebook_connected"
fig.show()