# Visualization Notebook: Publication‑Quality Figures

This notebook generates high‑quality visualizations for the multi‑lab reasoning study, including:
- Ridge plot (auto‑skips if zero variance)
- Violin plot (colored by metric)
- Correlation heatmap

All figures are designed to work even with small or zero‑variance datasets.

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

pd.set_option('display.max_colwidth', None)

DATA_PATH = "../outputs/dataset/merged_dataset.csv"
df = pd.read_csv(DATA_PATH)

score_cols = [
    'correctness_score', 'completeness_score', 'relationship_detection_score',
    'relationship_accuracy_score', 'narrative_drift_score', 'certainty_score',
    'mechanistic_score', 'structure_score', 'total_score'
]

df.head()

## Ridge Plot (Auto‑Skip if Zero Variance)

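JoyPy can fail when all values in a column are identical, because the kernel density estimate behind each ridge is degenerate for zero‑variance data. Every scoring dimension in this dataset has zero variance, so the cell below detects that case and skips the ridge plot.

A fallback bar chart of the shared score profile is shown instead.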

In [None]:
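# Columns with at most one distinct value have zero variance and break the KDE.
constant_cols = [c for c in score_cols if df[c].nunique() <= 1]

if len(constant_cols) == len(score_cols):
    print("Ridge plot skipped: all score columns have zero variance.")
    plt.figure(figsize=(10, 6))
    df[score_cols].iloc[0].plot(kind='bar', color='steelblue')
    plt.title("Score Profile (Identical Across Panels)")
    plt.ylabel("Score")
    plt.tight_layout()
    plt.show()
else:
    import joypy
    # Plot only the columns that vary; joyplot creates its own figure,
    # so a separate plt.figure() call would leave an empty figure behind.
    varying_cols = [c for c in score_cols if c not in constant_cols]
    fig, axes = joypy.joyplot(df[varying_cols], colormap=plt.cm.viridis, figsize=(10, 8))
    fig.suptitle('Ridge Plot of Scoring Dimensions', fontsize=16)
    plt.show()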
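
The zero‑variance check can be tried in isolation on a toy frame. This is a minimal sketch: `constant_columns` and the toy data are hypothetical names introduced for illustration, not part of the study's code. It relies on `nunique()` ignoring missing values by default, so an all‑missing column is also reported as constant.

```python
import pandas as pd

def constant_columns(frame, cols):
    """Return the columns in `cols` with at most one distinct non-null value."""
    # nunique() skips NaN/None by default, so an all-missing column counts as 0.
    counts = frame[cols].nunique()
    return counts[counts <= 1].index.tolist()

# Hypothetical toy data: 'a' is constant, 'b' varies, 'c' is entirely missing.
toy = pd.DataFrame({
    'a': [3, 3, 3],
    'b': [1, 2, 3],
    'c': [None, None, None],
})

flat = constant_columns(toy, ['a', 'b', 'c'])
print(flat)
```

Comparing `len(flat)` with the number of score columns distinguishes "everything is constant" from "only some columns are constant", which the two cases may need to handle differently.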

## Violin Plot (Colored by Metric)

Seaborn 0.13+ deprecates passing `palette` without `hue`, so the plot assigns `hue='metric'` and sets `legend=False` to keep per‑metric colors without a deprecation warning.

In [None]:
df_long = df.melt(value_vars=score_cols, var_name='metric', value_name='value')

plt.figure(figsize=(14, 8))
sns.violinplot(
    data=df_long,
    x='metric',
    y='value',
    hue='metric',
    palette='Set3',
    legend=False
)
plt.xticks(rotation=45)
plt.title('Violin Plots of Scoring Dimensions')
plt.show()
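
The wide‑to‑long reshape that feeds the violin plot is easier to see on a small example. The sketch below uses two of the notebook's column names with made‑up values; the frame `wide` is hypothetical and only illustrates what `melt` returns.

```python
import pandas as pd

# Hypothetical wide-format frame: one row per response, one column per score.
wide = pd.DataFrame({
    'correctness_score': [4, 5],
    'total_score': [30, 34],
})

# melt stacks each score column into (metric, value) rows,
# which is the layout seaborn expects for x='metric', y='value'.
long = wide.melt(
    value_vars=['correctness_score', 'total_score'],
    var_name='metric',
    value_name='value',
)
print(long)
```

Each original column contributes one block of rows, so the long frame has `len(wide) * number_of_score_columns` rows.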

## Correlation Heatmap

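Pearson correlation is undefined for a zero‑variance column, so `corr()` returns NaN for every pair that involves one and the heatmap leaves those cells blank. With zero variance in all columns, the cell still runs, but the heatmap is empty rather than uniform.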

In [None]:
plt.figure(figsize=(10, 8))
sns.heatmap(df[score_cols].corr(), annot=True, cmap='coolwarm', vmin=-1, vmax=1)
plt.title('Correlation Heatmap of Scoring Dimensions')
plt.show()
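
How constant columns affect the correlation matrix can be checked on toy data. The sketch below is illustrative only: `corr_without_constants` and the toy frame are hypothetical, and dropping constant columns before plotting is one possible guard, not necessarily what the study requires.

```python
import pandas as pd

def corr_without_constants(frame, cols):
    """Pearson correlation over the columns in `cols` that actually vary."""
    varying = [c for c in cols if frame[c].nunique() > 1]
    return frame[varying].corr(), varying

# Hypothetical toy data: 'flat' has zero variance, 'y' is exactly 2 * 'x'.
toy = pd.DataFrame({
    'flat': [2.0, 2.0, 2.0, 2.0],
    'x': [1.0, 2.0, 3.0, 4.0],
    'y': [2.0, 4.0, 6.0, 8.0],
})

# Every correlation involving the constant column is NaN.
raw = toy.corr()
print(raw)

# Restricting to varying columns gives a fully defined matrix.
clean, kept = corr_without_constants(toy, ['flat', 'x', 'y'])
print(kept)
print(clean)
```

If `kept` comes back empty, there is nothing to correlate, and a short printed message is more informative than a blank heatmap.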