# Interactive Dashboard for Meteorite Analysis

This Jupyter Notebook builds an **interactive dashboard** with [Panel](https://panel.holoviz.org/), Plotly, hvPlot/HoloViews, and Folium to analyze the meteorite dataset in `data/Meteorite_Landings.csv`. The dashboard supports dynamic filtering and visualization of:
- Temporal trends,
- Mass distribution (with logarithmic scaling),
- Primary classifications,
- An interactive global map,
- Descriptive metrics and correlation analyses.

## Structure of the Solution

- **Section 1**: Import libraries and load the dataset using functions defined in `src/data_loader.py`.
- **Section 2**: Create Panel widgets (controls) for filtering by year, mass, classification, and type (Fell/Found).
- **Section 3**: Generate interactive visualizations using Panel integrated with Plotly, hvPlot, and Folium.
- **Section 4**: Compute and display metrics, including descriptive statistics, correlations, and spatial metrics.
- **Section 5**: Construct dashboard tabs and deploy the complete dashboard.

Install all required dependencies (pandas, NumPy, Panel, hvPlot, HoloViews, Plotly, colorcet, and Folium) before running the notebook so that each cell executes correctly.

> **Note**: The meteorite CSV is expected at `data/Meteorite_Landings.csv`, and the helper modules at `src/data_loader.py`, `src/metrics.py`, and `src/visualization.py`.

## 1. Importing Libraries and Loading the Dataset

In this section, we import the essential Python libraries for data manipulation, visualization, and dashboard creation. Subsequently, we load the meteorite dataset using custom functions defined in `src/data_loader.py`, preparing the data for further analysis and interactive exploration.
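
As a rough illustration of what the loading step involves, the sketch below reads a small in-memory CSV and coerces the numeric columns. `load_csv_sketch` and the sample rows are hypothetical; the actual `load_meteorite_data` in `src/data_loader.py` may clean the data differently.

```python
import io

import pandas as pd

NUMERIC_COLUMNS = ["mass", "year", "reclat", "reclong"]


def load_csv_sketch(source):
    """Hypothetical stand-in for load_meteorite_data (not the project's loader)."""
    df = pd.read_csv(source)
    # Coerce numeric columns; entries that cannot be parsed become NaN
    for col in NUMERIC_COLUMNS:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors="coerce")
    # Rows without a year or a mass can never pass the range filters
    return df.dropna(subset=["year", "mass"]).reset_index(drop=True)


# Made-up rows, only to exercise the function
sample_csv = io.StringIO(
    "name,recclass,mass,fall,year,reclat,reclong\n"
    "Alpha,L5,21,Fell,1880,50.0,6.0\n"
    "Beta,H5,107000,Fell,1952,54.0,-113.0\n"
    "Gamma,H6,not-a-number,Found,1990,10.0,20.0\n"
)
sample = load_csv_sketch(sample_csv)
print(len(sample))  # the row with an unparseable mass is dropped
```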

In [None]:
import pandas as pd
import numpy as np
import panel as pn
import hvplot.pandas  # enables .hvplot on DataFrames
import holoviews as hv
import colorcet as cc
import folium

import os, sys

# Absolute path to the project root folder
BASE_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__) if '__file__' in globals() else os.getcwd(), '..'))

# Add src/ to sys.path
SRC_PATH = os.path.join(BASE_DIR, 'src')
if SRC_PATH not in sys.path:
    sys.path.append(SRC_PATH)

# The project modules can now be imported
from data_loader import load_meteorite_data, convert_to_geodataframe
from metrics import geographic_distribution_metrics



# Panel extension for plots (Plotly, HoloViews/Bokeh) in the notebook
pn.extension('plotly', 'holoviews', sizing_mode="stretch_width")

# Load the dataset from data/ under the project root
meteorites = load_meteorite_data(file_path=os.path.join(BASE_DIR, 'data', 'Meteorite_Landings.csv'))
print("Total meteorite records:", len(meteorites))
meteorites.head(5)

### Quick Overview of Numeric Columns

This cell displays summary statistics for the numeric columns in the dataset. It provides a brief insight into the distribution, central tendency, and dispersion of variables such as mass, year, latitude (reclat), and longitude (reclong).

In [None]:
meteorites[['mass', 'year', 'reclat', 'reclong']].describe()

## 2. Definition of Filters (Panel Widgets)

In this section, interactive widgets are created to allow users to dynamically filter the dataset. The filters include:
- **Year**: A range slider to select the desired time interval.
- **Mass (g)**: A range slider to filter data based on meteorite mass.
- **Classification**: A multi-select widget for choosing specific meteorite classes.
- **Type (Fell/Found)**: A checkbox group to filter by the meteorite fall type.

Additionally, the function `filter_data(...)` applies the selected filters to the DataFrame and returns the filtered subset, so every subsequent visualization reflects the current filter settings. Both range filters are inclusive, and an empty classification or type selection leaves that filter inactive rather than returning no rows.
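
The filtering logic can be tried on a toy DataFrame without any widgets. The sketch below is a minimal, self-contained variant; `filter_rows` and the sample rows are hypothetical and only mirror the behavior described above.

```python
import pandas as pd

# Made-up rows, only to exercise the filter
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "year": [1900, 1950, 2000, 2010],
    "mass": [10.0, 500.0, 2500.0, 40.0],
    "recclass": ["L5", "H5", "L5", "H6"],
    "fall": ["Fell", "Found", "Found", "Fell"],
})


def filter_rows(df, year_range, mass_range, classes=None, types=None):
    """Combine inclusive range checks with optional membership checks."""
    mask = df["year"].between(*year_range) & df["mass"].between(*mass_range)
    if classes:  # an empty selection leaves this filter inactive
        mask &= df["recclass"].isin(classes)
    if types:
        mask &= df["fall"].isin(types)
    return df[mask]


subset = filter_rows(toy, (1940, 2005), (100, 3000), classes=["L5", "H5"], types=["Found"])
print(subset["name"].tolist())  # ['B', 'C']

unfiltered = filter_rows(toy, (1900, 2010), (0, 3000), classes=[], types=[])
print(len(unfiltered))  # 4
```

Building a single boolean mask avoids creating an intermediate DataFrame for each filter step.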

In [None]:
import panel.widgets as pw

# Year range
year_min, year_max = int(meteorites['year'].min()), int(meteorites['year'].max())
year_slider = pw.IntRangeSlider(name='Year', start=year_min, end=year_max, value=(year_min, year_max), step=1)

# Mass range
mass_min, mass_max = meteorites['mass'].min(), meteorites['mass'].max()
mass_slider = pn.widgets.RangeSlider(
    name='Mass (g)',
    start=mass_min,
    end=mass_max,
    value=(mass_min, mass_max),
    step=round((mass_max - mass_min) / 100, 2)  # a reasonable step size
)


# Classifications (recclass); dropna() keeps sorted() from failing on missing values
classes = sorted(meteorites['recclass'].dropna().unique())
class_select = pw.MultiSelect(name='Classification', options=classes, value=classes[:], size=6)

# Type (Fell / Found)
types = meteorites['fall'].unique().tolist()  # expected: ["Fell", "Found"]
type_select = pw.CheckBoxGroup(name='Type', options=types, value=types[:])

# Filter function
def filter_data(df, year_range, mass_range, classes_selected, types_selected):
    df_filtered = df[(df['year'] >= year_range[0]) & (df['year'] <= year_range[1])]
    df_filtered = df_filtered[(df_filtered['mass'] >= mass_range[0]) & (df_filtered['mass'] <= mass_range[1])]
    if classes_selected:
        df_filtered = df_filtered[df_filtered['recclass'].isin(classes_selected)]
    if types_selected:
        df_filtered = df_filtered[df_filtered['fall'].isin(types_selected)]
    return df_filtered

# Test the filter function with the initial widget values
meteorites_filtered = filter_data(
    meteorites, year_slider.value, mass_slider.value, class_select.value, type_select.value
)
print("Records after the initial filter (should equal the total):", len(meteorites_filtered))
meteorites_filtered.head(3)

## 3. Interactive Visualizations

This section is dedicated to generating various interactive visualizations that respond to the filter settings. Each visualization updates dynamically to reflect the subset of data currently selected by the user. The visualizations include a time series plot, a mass distribution histogram, a bar chart for the top classifications, and an interactive global map.

In [None]:
import plotly.graph_objs as go

def make_timeseries_plot(df):
    by_year = df.groupby('year').agg(
        meteorite_count=('name', 'count'),
        total_mass=('mass', 'sum')
    ).reset_index()
    if by_year.empty:
        fig = go.Figure()
        fig.update_layout(title="Temporal evolution (no data)")
        return fig

    fig = go.Figure()

    # Meteorite count (primary y-axis)
    fig.add_trace(
        go.Scatter(x=by_year['year'], y=by_year['meteorite_count'], mode='lines+markers',
                   name='Count', marker_color='blue')
    )

    # Total mass (secondary y-axis)
    fig.add_trace(
        go.Scatter(x=by_year['year'], y=by_year['total_mass'], mode='lines+markers',
                   name='Total mass (g)', marker_color='red', yaxis='y2')
    )

    # Configure the layout with two y-axes
    fig.update_layout(
        title="Meteorites per year vs. total mass per year",
        xaxis_title="Year",
        yaxis=dict(title="Meteorite count"),
        yaxis2=dict(title="Total mass (g)", overlaying='y', side='right'),
        legend=dict(x=0.01, y=0.95)
    )
    return fig

# Bind the plot to the filter widgets
timeseries_plot = pn.bind(
    make_timeseries_plot,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

### 3.1 Temporal Evolution (Meteorite Count vs. Total Mass per Year)

The cell above generates a dual-axis Plotly chart. The primary y-axis shows the number of meteorites per year, and the secondary y-axis shows their total mass for each year, which makes it easy to compare how many meteorites were recorded with how much mass they represent.
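
The per-year aggregation behind the chart can be checked on a few made-up rows. This sketch uses the same pandas named-aggregation pattern; the sample values are hypothetical.

```python
import pandas as pd

# Made-up rows; one mass is missing on purpose
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "year": [2000, 2000, 2001, 2001],
    "mass": [10.0, 30.0, 5.0, None],
})

by_year = toy.groupby("year").agg(
    meteorite_count=("name", "count"),  # counts non-null names per year
    total_mass=("mass", "sum"),         # sum skips missing masses
).reset_index()

print(by_year)
```

Because `count` is taken over `name` while `sum` skips missing masses, a year can report more meteorites than actually contribute to its total mass.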

In [None]:
def make_mass_histogram(df):
    if df.empty or df['mass'].dropna().le(0).all():
        return hv.Curve([]).opts(height=300, width=400, title="Mass histogram (no data)")

    df_mass = df[df['mass'] > 0].copy()

    hist = df_mass.hvplot.hist(
        'mass',
        bins=50,
        title="Meteorite mass distribution (log y)",
        height=300,
        width=400,
    ).opts(logy=True, ylim=(0.1, None))

    return hist


mass_hist_plot = pn.bind(
    make_mass_histogram,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

### 3.2 Histogram of Mass Distribution (Logarithmic Scale)

The cell above uses hvPlot to draw a 50-bin histogram of meteorite masses after dropping non-positive values. The y-axis uses a logarithmic scale so that sparsely populated bins at large masses stay visible next to the heavily populated low-mass bins. The cell below builds a bar chart of the ten most frequent classifications (`recclass`).
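
To see why heavily skewed masses are hard to bin, the sketch below compares equal-width bins with logarithmically spaced bins on made-up values. Note that log-spaced bins are an alternative to the notebook's approach (linear bins with a log-scaled y-axis), shown here only for comparison.

```python
import numpy as np

# Made-up masses in grams spanning several orders of magnitude
masses = np.array([0.5, 2.0, 8.0, 30.0, 150.0, 900.0, 4000.0, 60000.0])

# Equal-width bins: almost everything lands in the first bin
linear_counts, _ = np.histogram(masses, bins=6)

# One bin per decade from 0.1 g to 100 kg
log_edges = np.logspace(-1, 5, num=7)
log_counts, _ = np.histogram(masses, bins=log_edges)

print(linear_counts.tolist())  # [7, 0, 0, 0, 0, 1]
print(log_counts.tolist())     # [1, 2, 1, 2, 1, 1]
```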

In [None]:
def make_classification_bar(df):
    if df.empty:
        return hv.Curve([]).opts(height=300, width=400, title="Classifications (no data)")
    # Count occurrences of each recclass
    top_classes = df['recclass'].value_counts().nlargest(10)
    class_df = top_classes.reset_index()
    class_df.columns = ['Classification', 'Count']
    bars = class_df.hvplot.bar(
        x='Classification', y='Count', rot=45,
        height=300, width=400,
        title="Top 10 meteorite classifications"
    )
    return bars

class_bar_plot = pn.bind(
    make_classification_bar,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

### 3.3 Interactive Global Map (Folium)

This cell creates an interactive world map with Folium. Each meteorite with valid coordinates is drawn as a `CircleMarker` (red for `Fell`, blue otherwise), and nearby markers are grouped with `MarkerCluster` to keep the map readable. The map updates whenever the filters change.
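
The marker attributes can be prepared separately from the Folium calls, which makes the coordinate and color rules easy to test. The sketch below is one possible arrangement; `prepare_markers` and the sample rows are hypothetical and not part of the project modules.

```python
import pandas as pd


def prepare_markers(df):
    """Return one dict per mappable row: location, color, and popup HTML."""
    # Rows without coordinates cannot be placed on the map
    valid = df.dropna(subset=["reclat", "reclong"])
    markers = []
    for row in valid.itertuples(index=False):
        markers.append({
            "location": [row.reclat, row.reclong],
            "color": "red" if row.fall == "Fell" else "blue",
            "popup": f"<b>{row.name}</b><br>Year: {row.year}<br>Mass: {row.mass} g",
        })
    return markers


# Made-up rows; the second one has no latitude
toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "year": [1900, 1950, 2000],
    "mass": [10.0, 500.0, 2500.0],
    "fall": ["Fell", "Found", "Found"],
    "reclat": [10.0, None, -5.0],
    "reclong": [20.0, 30.0, 40.0],
})
markers = prepare_markers(toy)
print([m["color"] for m in markers])  # ['red', 'blue']
```

Each resulting dict maps directly onto the `location`, `color`, and `popup` arguments of `folium.CircleMarker`.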

In [None]:
from folium.plugins import MarkerCluster

def make_map(df):
    m = folium.Map(location=[20, 0], zoom_start=2)  # Map centered at lat=20, lon=0
    marker_cluster = MarkerCluster().add_to(m)
    if df.empty:
        folium.Marker(location=[0, 0], popup="No meteorites in this range").add_to(marker_cluster)
        return m

    for _, row in df.iterrows():
        lat, lon = row.get('reclat'), row.get('reclong')
        if pd.isna(lat) or pd.isna(lon):
            continue
        name = row.get('name', 'Unknown')
        mass = row.get('mass', 'N/A')
        year = row.get('year', 'N/A')
        mtype = row.get('fall', '')  # Fell or Found
        recclass = row.get('recclass', '')
        color = 'red' if mtype == 'Fell' else 'blue'
        popup_text = f"<b>{name}</b><br>Year: {year}<br>Mass: {mass} g<br>Classification: {recclass}"

        folium.CircleMarker(
            location=[lat, lon],
            radius=3,
            color=color,
            fill=True,
            fill_opacity=0.7,
            popup=popup_text,
            tooltip=str(recclass)
        ).add_to(marker_cluster)
    return m

meteorite_map = pn.bind(
    make_map,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

## 4. Metrics Panel

This section is devoted to calculating and displaying various analytical metrics. It includes:
- **Descriptive Statistics**: The mean, median, and standard deviation of meteorite mass, per fall type (Fell/Found) and for the whole filtered set.
- **Correlation Analysis**: Pearson correlation coefficients between the numeric variables (`mass`, `year`, `reclat`, `reclong`), shown both as a table and as a heatmap.
- **Spatial Metrics**: Spatial distribution metrics computed by `geographic_distribution_metrics` on a GeoDataFrame built with `convert_to_geodataframe`.

These metrics provide a quantitative overview of the dataset, offering insights into the underlying data distribution and inter-variable relationships.
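
The grouped statistics and the long-format correlation table that feeds the heatmap can be reproduced on a few made-up rows. This is a minimal sketch with hypothetical values; it only illustrates the pandas calls involved.

```python
import pandas as pd

# Made-up rows, only to exercise the calls
toy = pd.DataFrame({
    "fall": ["Fell", "Fell", "Found", "Found"],
    "mass": [10.0, 30.0, 100.0, 300.0],
    "year": [1900, 1950, 2000, 2010],
})

# Mean, median, and sample standard deviation (ddof=1) per fall type
per_type = toy.groupby("fall")["mass"].agg(["mean", "median", "std"])

# Pearson correlation matrix, then one row per (Var1, Var2) pair
corr = toy[["mass", "year"]].corr()
long_form = corr.stack().reset_index()
long_form.columns = ["Var1", "Var2", "Correlation"]

print(per_type)
print(long_form)
```

The long format has one row per cell of the matrix, which is the shape a heatmap with `x`, `y`, and `C` columns expects.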

In [None]:
# 4.1 Descriptive statistics
def descriptive_stats(df):
    if df.empty:
        return pd.DataFrame(columns=["Group", "Mean (g)", "Median (g)", "Std dev (g)"])

    # Global statistics
    global_stats = df['mass'].agg(['mean', 'median', 'std'])

    # Per fall type (Fell/Found)
    type_stats = df.groupby('fall')['mass'].agg(['mean', 'median', 'std'])
    type_stats = type_stats.rename(index=str).reset_index()
    type_stats = type_stats.rename(columns={
        'fall': 'Group',
        'mean': 'Mean (g)',
        'median': 'Median (g)',
        'std': 'Std dev (g)'
    })

    # Build the global row as a DataFrame and append it
    global_row = {
        'Group': 'Global',
        'Mean (g)': global_stats['mean'],
        'Median (g)': global_stats['median'],
        'Std dev (g)': global_stats['std']
    }
    global_df = pd.DataFrame([global_row])

    type_stats = pd.concat([type_stats, global_df], ignore_index=True)

    # Round the numeric columns
    for col in ["Mean (g)", "Median (g)", "Std dev (g)"]:
        type_stats[col] = type_stats[col].round(2)

    return type_stats

stats_df = pn.bind(
    descriptive_stats,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

In [None]:
# 4.2 Numeric correlations + heatmap
def correlation_matrix(df):
    numeric_cols = []
    for col in ['mass', 'year', 'reclat', 'reclong']:
        if col in df.columns:
            numeric_cols.append(col)
    if not numeric_cols:
        return pd.DataFrame()  # no numeric data
    corr_matrix = df[numeric_cols].corr().round(2)
    return corr_matrix

corr_df = pn.bind(
    correlation_matrix,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

def make_correlation_heatmap(df):
    corr = correlation_matrix(df)
    if corr.empty:
        return hv.Curve([]).opts(title="Correlation heatmap (not available)", height=300, width=400)
    # Convert to long format
    corr_values = corr.stack().reset_index()
    corr_values.columns = ['Var1', 'Var2', 'Correlation']
    # Heatmap with hvPlot
    cmap_use = cc.coolwarm if hasattr(cc, "coolwarm") else "RdBu"
    heatmap = corr_values.hvplot.heatmap(
        x='Var1', y='Var2', C='Correlation', cmap=cmap_use,
        clim=(-1,1), colorbar=True, height=300, width=400,
        title="Correlation matrix"
    )
    heatmap = heatmap.opts(xrotation=45)
    return heatmap

corr_heatmap = pn.bind(
    make_correlation_heatmap,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

In [None]:
import hvplot.pandas

def make_mass_heatmap(df):
    if df.empty or 'mass' not in df.columns or df['mass'].dropna().empty:
        return hv.Curve([]).opts(title="Mass heatmap not available")

    # Keep only rows that have both a year and a positive mass
    df = df[df['year'].notna() & df['mass'].notna()]
    df = df[df['mass'] > 0]

    # Group by year and fall type
    heatmap_data = df.groupby(['year', 'fall'])['mass'].mean().reset_index()

    # Round the mass for display
    heatmap_data['mass'] = heatmap_data['mass'].round(2)

    # Build the heatmap with hvPlot
    heatmap = heatmap_data.hvplot.heatmap(
        x='year',
        y='fall',
        C='mass',
        cmap='viridis',
        colorbar=True,
        height=400,
        width=700,
        title='Mean Mass by Year and Meteorite Type (Fell/Found)',
        xlabel='Year',
        ylabel='Type (Fell/Found)'
    )

    return heatmap


In [None]:
from data_loader import convert_to_geodataframe
from metrics import geographic_distribution_metrics

# 4.3 Spatial metrics via geographic_distribution_metrics
def spatial_metrics_info(df):
    if df.empty:
        return pd.DataFrame(columns=["Metric", "Value"])

    try:
        gdf = convert_to_geodataframe(df)
        metrics = geographic_distribution_metrics(gdf)

        # Convert the output into a display-friendly DataFrame
        if isinstance(metrics, dict):
            return pd.DataFrame(list(metrics.items()), columns=["Metric", "Value"])
        else:
            return pd.DataFrame({"Metric": ["Error"], "Value": ["Unsupported format"]})
    except Exception as e:
        return pd.DataFrame({"Metric": ["Error"], "Value": [str(e)]})
    
spatial_df = pn.bind(
    spatial_metrics_info,
    df=pn.bind(
        filter_data,
        meteorites,
        year_slider,
        mass_slider,
        class_select,
        type_select
    )
)

## 5. Dashboard Construction (Panel Tabs)

In this final section, the complete dashboard is assembled using Panel Tabs. Two main tabs are created:
- **Visualizations**: Contains the time series plot, histogram, bar chart for top classifications, and the interactive global map.
- **Metrics**: Displays descriptive statistics, correlation matrices, heatmaps, and spatial metrics.

The filter widgets are positioned at the top of the layout so that any modifications to the filter parameters are simultaneously reflected in both tabs. This integrated approach ensures a cohesive user experience.

In [None]:
# Visualizations panel: bind the filtered DataFrame once and reuse it
filtered_df = pn.bind(
    filter_data,
    meteorites,
    year_slider,
    mass_slider,
    class_select,
    type_select
)

mass_heatmap = pn.bind(make_mass_heatmap, df=filtered_df)


visualizations_panel = pn.Column(
    pn.pane.Markdown("## <span style='font-size:15px; font-weight:bold'>Interactive visualizations</span>"),
    timeseries_plot,
    pn.Row(mass_hist_plot, class_bar_plot),
    pn.pane.HTML("<hr>", height=10),
    pn.pane.Markdown("**World map of meteorites:**"),
    pn.pane.plot.Folium(meteorite_map, height=400),
    pn.pane.Markdown("**Mass heatmap by year and type:**"),
    mass_heatmap
)



# Metrics panel
metrics_panel = pn.Column(
    pn.pane.Markdown("### Metrics and Statistics"),
    pn.pane.Markdown("**Descriptive statistics (mass in grams):**"),
    pn.bind(lambda df: pn.pane.DataFrame(df, index=False, sizing_mode='stretch_width', height=120), stats_df),
    pn.pane.Markdown("**Correlation (Pearson matrix):**"),
    pn.bind(lambda df: pn.pane.DataFrame(df, height=150), corr_df),
    pn.Row(corr_heatmap),
    pn.pane.Markdown("**Spatial metrics:**"),
    pn.bind(lambda df: pn.pane.DataFrame(df, index=False, sizing_mode='stretch_width', height=150), spatial_df)
)

# Create the tabs
tabs = pn.Tabs(
    ("📈 Visualizations", visualizations_panel),
    ("📊 Metrics", metrics_panel)
)

# Final layout
dashboard = pn.Column(
    pn.pane.Markdown("## Filters"),
    pn.Row(year_slider, mass_slider, class_select, type_select),
    tabs
)

# Display the dashboard
dashboard
