# EDA-3: Interactive Visualizations for Bearing Health Analysis

**Professional Interactive Dashboard for XJTU-SY Bearing Dataset**

This notebook provides comprehensive interactive visualizations using Plotly for exploring:
- Bearing health progression across lifecycle
- Time-frequency spectrogram analysis
- Cross-bearing comparisons
- 3D waterfall PSD diagrams

## Key Features
1. **Health Dashboard**: Interactive overview with condition/bearing selectors
2. **Time-Frequency Spectrograms**: Dynamic heatmaps with lifecycle sliders
3. **Cross-Bearing Comparison**: Side-by-side analysis with dropdowns
4. **3D Waterfall Plots**: Professional PSD evolution visualization
5. **Feature Explorer**: Interactive feature distributions and correlations

## Dataset Parameters
- **Bearings**: 15 (5 per condition)
- **Conditions**: 35Hz/12kN, 37.5Hz/11kN, 40Hz/10kN
- **Sampling Rate**: 25.6 kHz
- **Total Files**: ~9,216

In [None]:
import sys
sys.path.insert(0, '..')

import numpy as np
import pandas as pd
from scipy import signal
from scipy.fft import rfft, rfftfreq

# Plotly for interactive visualizations
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Matplotlib/Seaborn for static baselines
import matplotlib.pyplot as plt
import seaborn as sns

# Project modules
from src.data.loader import XJTUBearingLoader, SAMPLING_RATE, BEARINGS_PER_CONDITION
from src.features.frequency_domain import (
    get_characteristic_frequencies_for_condition,
    compute_fft,
    NYQUIST_FREQ
)

# Configure matplotlib style
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = [14, 6]
plt.rcParams['figure.dpi'] = 100

# Plotly default template
import plotly.io as pio
pio.templates.default = 'plotly_white'

print('Libraries loaded successfully!')
print(f'Sampling Rate: {SAMPLING_RATE:,} Hz')
print(f'Nyquist Frequency: {NYQUIST_FREQ:,} Hz')

In [None]:
# Initialize data loader and load feature dataset
loader = XJTUBearingLoader(data_root='../assets/Data/XJTU-SY_Bearing_Datasets')
metadata = loader.get_metadata()

# Load pre-computed features
df = pd.read_csv('../outputs/features/features_v2.csv')

# Add lifecycle percentage
df['lifecycle_pct'] = (df['file_idx'] / df['total_files'] * 100).round(2)

print(f'Loaded {len(df):,} samples from {df["bearing_id"].nunique()} bearings')
print(f'Conditions: {df["condition"].unique().tolist()}')
print(f'Features: {len([c for c in df.columns if c.startswith(("h_", "v_"))])} (h_/v_ columns only)')

---

## 1. Interactive Bearing Health Dashboard

Comprehensive overview of bearing health with interactive controls for condition and bearing selection.
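
The bearing selector in this section works by keeping every trace in the figure and toggling a per-trace visibility mask from a dropdown. The sketch below isolates that pattern without drawing anything. The bearing names and the `visibility_mask` helper are illustrative assumptions, not part of the project code; only the dict layout (`label`, `method`, `args`) follows what Plotly's `updatemenus` buttons take.

```python
# Hypothetical bearing IDs; the notebook reads them from df['bearing_id'].
bearings = ['Bearing1_1', 'Bearing1_2', 'Bearing1_3']
traces_per_bearing = 2  # one horizontal and one vertical RMS trace per bearing


def visibility_mask(selected_index, n_bearings, per_bearing):
    """Return one boolean per trace, True only for the selected bearing's traces."""
    mask = [False] * (n_bearings * per_bearing)
    start = selected_index * per_bearing
    for offset in range(per_bearing):
        mask[start + offset] = True
    return mask


# One dropdown entry per bearing, in the dict shape used for updatemenus buttons.
buttons = [
    dict(
        label=name,
        method='update',
        args=[{'visible': visibility_mask(i, len(bearings), traces_per_bearing)}],
    )
    for i, name in enumerate(bearings)
]

print(buttons[1]['label'], buttons[1]['args'][0]['visible'])
```

The mask length must equal the total number of traces in the figure, so any trace added later (for example a threshold line drawn as a trace) needs an entry in every mask.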

In [None]:
# Color scheme for conditions
CONDITION_COLORS = {
    '35Hz12kN': '#3498db',     # Blue
    '37.5Hz11kN': '#2ecc71',   # Green  
    '40Hz10kN': '#e74c3c'      # Red
}

# Health status color scale
HEALTH_COLORSCALE = [
    [0.0, '#2ecc71'],   # Healthy - Green
    [0.5, '#f39c12'],   # Degrading - Orange
    [1.0, '#e74c3c']    # Failed - Red
]

In [None]:
# Dataset Overview - Interactive Sunburst Chart
# Shows hierarchical structure: Condition -> Bearing -> File count

sunburst_data = df.groupby(['condition', 'bearing_id']).agg({
    'filename': 'count',
    'h_rms': 'mean',
    'h_kurtosis': 'mean'
}).reset_index()
sunburst_data.columns = ['condition', 'bearing_id', 'file_count', 'mean_rms', 'mean_kurtosis']

fig = px.sunburst(
    sunburst_data,
    path=['condition', 'bearing_id'],
    values='file_count',
    color='mean_rms',
    color_continuous_scale='RdYlGn_r',
    title='<b>Dataset Overview</b><br><sup>Size represents file count, color represents mean RMS</sup>',
    hover_data={'mean_rms': ':.3f', 'mean_kurtosis': ':.3f'}
)

fig.update_layout(
    width=800,
    height=700,
    font=dict(size=12),
    coloraxis_colorbar=dict(title='Mean RMS')
)

fig.show()

In [None]:
# Interactive RUL Distribution by Condition and Bearing
fig = px.violin(
    df,
    x='bearing_id',
    y='rul',
    color='condition',
    color_discrete_map=CONDITION_COLORS,
    box=True,
    points='outliers',
    hover_data=['lifecycle_pct', 'h_rms', 'h_kurtosis'],
    title='<b>RUL Distribution by Bearing</b><br><sup>Violin plots with embedded box plots</sup>'
)

fig.update_layout(
    width=1200,
    height=500,
    xaxis_title='Bearing ID',
    yaxis_title='Remaining Useful Life (RUL)',
    legend_title='Condition',
    xaxis_tickangle=-45
)

fig.show()

In [None]:
# Health Indicator Progression - Interactive Line Chart with Range Slider
# Select a sample bearing for demonstration
sample_bearing = df[df['bearing_id'] == 'Bearing3_2'].sort_values('file_idx').copy()

fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        '<b>RMS Evolution</b>',
        '<b>Kurtosis Evolution</b>',
        '<b>Spectral Centroid</b>',
        '<b>Crest Factor</b>'
    ),
    vertical_spacing=0.12,
    horizontal_spacing=0.08
)

# RMS
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['h_rms'],
               name='Horizontal', line=dict(color='#3498db'), legendgroup='h'),
    row=1, col=1
)
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['v_rms'],
               name='Vertical', line=dict(color='#e74c3c'), legendgroup='v'),
    row=1, col=1
)

# Kurtosis
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['h_kurtosis'],
               name='Horizontal', line=dict(color='#3498db'), legendgroup='h', showlegend=False),
    row=1, col=2
)
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['v_kurtosis'],
               name='Vertical', line=dict(color='#e74c3c'), legendgroup='v', showlegend=False),
    row=1, col=2
)

# Spectral Centroid
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['h_spectral_centroid']/1000,
               name='Horizontal', line=dict(color='#3498db'), legendgroup='h', showlegend=False),
    row=2, col=1
)
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['v_spectral_centroid']/1000,
               name='Vertical', line=dict(color='#e74c3c'), legendgroup='v', showlegend=False),
    row=2, col=1
)

# Crest Factor
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['h_crest_factor'],
               name='Horizontal', line=dict(color='#3498db'), legendgroup='h', showlegend=False),
    row=2, col=2
)
fig.add_trace(
    go.Scatter(x=sample_bearing['lifecycle_pct'], y=sample_bearing['v_crest_factor'],
               name='Vertical', line=dict(color='#e74c3c'), legendgroup='v', showlegend=False),
    row=2, col=2
)

# Add 80% lifecycle marker
for row in [1, 2]:
    for col in [1, 2]:
        fig.add_vline(x=80, line_dash='dash', line_color='gray', opacity=0.5, row=row, col=col)

fig.update_layout(
    height=700,
    width=1200,
    title='<b>Health Indicator Evolution: Bearing3_2 (40Hz10kN)</b><br><sup>Drag to zoom, double-click to reset</sup>',
    legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='center', x=0.5),
    hovermode='x unified'
)

fig.update_xaxes(title_text='Lifecycle (%)', row=2, col=1)
fig.update_xaxes(title_text='Lifecycle (%)', row=2, col=2)
fig.update_yaxes(title_text='RMS', row=1, col=1)
fig.update_yaxes(title_text='Kurtosis', row=1, col=2)
fig.update_yaxes(title_text='Spectral Centroid (kHz)', row=2, col=1)
fig.update_yaxes(title_text='Crest Factor', row=2, col=2)

# Add a range slider to the bottom-left x-axis only (Spectral Centroid panel)
fig.update_xaxes(rangeslider=dict(visible=True), row=2, col=1)

fig.show()

In [None]:
# Interactive Bearing Selector Dashboard with Dropdown
# Prepare data for all bearings
all_bearings = df['bearing_id'].unique().tolist()

# Create figure with initial bearing
initial_bearing = 'Bearing3_2'

fig = go.Figure()

# Add traces for each bearing (initially hidden except first)
for i, bearing in enumerate(all_bearings):
    bearing_data = df[df['bearing_id'] == bearing].sort_values('file_idx')
    visible = (bearing == initial_bearing)
    
    # RMS trace
    fig.add_trace(go.Scatter(
        x=bearing_data['lifecycle_pct'],
        y=bearing_data['h_rms'],
        name=f'{bearing} - H_RMS',
        mode='lines',
        line=dict(color='#3498db', width=2),
        visible=visible,
        hovertemplate='Lifecycle: %{x:.1f}%<br>H_RMS: %{y:.3f}<extra></extra>'
    ))
    
    fig.add_trace(go.Scatter(
        x=bearing_data['lifecycle_pct'],
        y=bearing_data['v_rms'],
        name=f'{bearing} - V_RMS',
        mode='lines',
        line=dict(color='#e74c3c', width=2),
        visible=visible,
        hovertemplate='Lifecycle: %{x:.1f}%<br>V_RMS: %{y:.3f}<extra></extra>'
    ))

# Create dropdown buttons
buttons = []
for i, bearing in enumerate(all_bearings):
    visibility = [False] * (len(all_bearings) * 2)
    visibility[i*2] = True
    visibility[i*2 + 1] = True
    
    condition = df[df['bearing_id'] == bearing]['condition'].iloc[0]
    
    buttons.append(dict(
        label=f'{bearing} ({condition})',
        method='update',
        args=[{'visible': visibility},
              {'title': f'<b>RMS Evolution: {bearing}</b><br><sup>Condition: {condition}</sup>'}]
    ))

fig.update_layout(
    updatemenus=[
        dict(
            active=all_bearings.index(initial_bearing),
            buttons=buttons,
            direction='down',
            showactive=True,
            x=0.15,
            xanchor='left',
            y=1.15,
            yanchor='top',
            bgcolor='white',
            bordercolor='#ccc'
        )
    ],
    annotations=[
        dict(
            text='<b>Select Bearing:</b>',
            x=0,
            xref='paper',
            y=1.12,
            yref='paper',
            align='left',
            showarrow=False
        )
    ],
    title=f'<b>RMS Evolution: {initial_bearing}</b><br><sup>Condition: {df[df["bearing_id"]==initial_bearing]["condition"].iloc[0]}</sup>',
    xaxis_title='Lifecycle (%)',
    yaxis_title='RMS Value',
    height=550,
    width=1100,
    hovermode='x unified',
    legend=dict(orientation='h', yanchor='bottom', y=-0.2)
)

fig.add_vline(x=80, line_dash='dash', line_color='gray', opacity=0.5,
              annotation_text='80% Lifecycle', annotation_position='top')

fig.show()

---

## 2. Interactive Time-Frequency Spectrograms

Dynamic heatmaps showing frequency content evolution with lifecycle slider control.
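
The spectrogram cells in this section use two window settings (`nperseg=1024, noverlap=512` for the stage comparison and `nperseg=512, noverlap=384` for the animation). As a rough guide to what those choices mean at the 25.6 kHz sampling rate, the sketch below computes the frequency bin width and the time step between spectrogram columns. The helper name `stft_resolution` is an assumption made for illustration; the formulas assume the FFT length equals `nperseg`, which is the `scipy.signal.spectrogram` default.

```python
SAMPLING_RATE = 25_600  # Hz, as listed under Dataset Parameters


def stft_resolution(nperseg, noverlap, fs=SAMPLING_RATE):
    """Return (frequency bin width in Hz, hop between columns in ms)."""
    bin_width_hz = fs / nperseg
    hop_ms = (nperseg - noverlap) / fs * 1000
    return bin_width_hz, hop_ms


for nperseg, noverlap in [(1024, 512), (512, 384)]:
    bin_hz, hop_ms = stft_resolution(nperseg, noverlap)
    print(f'nperseg={nperseg}, noverlap={noverlap}: '
          f'{bin_hz:.1f} Hz bins, {hop_ms:.1f} ms between columns')
```

The shorter window trades frequency resolution for finer time resolution, which suits an animation where transient impacts matter more than narrow spectral lines.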

In [None]:
# Load sample signals for spectrogram visualization
sample_condition = '40Hz10kN'
sample_bearing_id = 'Bearing3_2'

# Load all signals for this bearing
signals_array, filenames = loader.load_bearing(sample_condition, sample_bearing_id)
num_files = len(filenames)

print(f'Loaded {num_files} files for {sample_bearing_id} ({sample_condition})')
print(f'Signal shape per file: {signals_array[0].shape}')

In [None]:
def compute_spectrogram(signal_data, nperseg=1024, noverlap=512):
    """Compute spectrogram using scipy.signal.spectrogram."""
    f, t, Sxx = signal.spectrogram(signal_data, fs=SAMPLING_RATE, 
                                    nperseg=nperseg, noverlap=noverlap)
    Sxx_db = 10 * np.log10(Sxx + 1e-10)
    return f, t, Sxx_db

# Create spectrograms at key lifecycle stages
stages = {
    'Healthy (5%)': int(num_files * 0.05),
    'Degrading (50%)': int(num_files * 0.50),
    'Failed (95%)': int(num_files * 0.95)
}

fig = make_subplots(
    rows=2, cols=3,
    subplot_titles=[f'<b>{stage}</b>' for stage in stages.keys()] * 2,
    row_titles=['<b>Horizontal</b>', '<b>Vertical</b>'],
    vertical_spacing=0.12,
    horizontal_spacing=0.05
)

for col_idx, (stage_name, file_idx) in enumerate(stages.items(), 1):
    sig = signals_array[file_idx]
    
    for row_idx, (channel, channel_name) in enumerate([(0, 'H'), (1, 'V')], 1):
        f, t, Sxx_db = compute_spectrogram(sig[:, channel])
        
        # Limit frequency range for visibility
        freq_mask = f <= 6000
        
        fig.add_trace(
            go.Heatmap(
                z=Sxx_db[freq_mask],
                x=t * 1000,  # Convert to ms
                y=f[freq_mask] / 1000,  # Convert to kHz
                colorscale='Viridis',
                showscale=(col_idx == 3 and row_idx == 1),
                colorbar=dict(title='dB', x=1.02) if (col_idx == 3 and row_idx == 1) else None,
                hovertemplate='Time: %{x:.1f}ms<br>Freq: %{y:.2f}kHz<br>Power: %{z:.1f}dB<extra></extra>'
            ),
            row=row_idx, col=col_idx
        )

fig.update_layout(
    height=700,
    width=1300,
    title=f'<b>Time-Frequency Spectrograms: {sample_bearing_id}</b><br><sup>Hover for detailed values</sup>'
)

fig.update_xaxes(title_text='Time (ms)', row=2)
fig.update_yaxes(title_text='Frequency (kHz)', col=1)

fig.show()

In [None]:
# Interactive Spectrogram with Animation Slider
# Sample 15 evenly spaced lifecycle points for the animation
n_frames = 15
frame_indices = np.linspace(0, num_files - 1, n_frames, dtype=int)

# Compute spectrograms for all frames
spectrograms = []
for idx in frame_indices:
    sig = signals_array[idx]
    f, t, Sxx_db = compute_spectrogram(sig[:, 0], nperseg=512, noverlap=384)
    freq_mask = f <= 6000
    spectrograms.append(Sxx_db[freq_mask])

# Create figure with slider
fig = go.Figure()

# Add initial frame
fig.add_trace(
    go.Heatmap(
        z=spectrograms[0],
        x=t * 1000,
        y=f[freq_mask] / 1000,
        colorscale='Inferno',
        colorbar=dict(title='Power (dB)'),
        hovertemplate='Time: %{x:.1f}ms<br>Freq: %{y:.2f}kHz<br>Power: %{z:.1f}dB<extra></extra>'
    )
)

# Create frames
frames = []
for i, idx in enumerate(frame_indices):
    lifecycle_pct = idx / num_files * 100
    frames.append(
        go.Frame(
            data=[go.Heatmap(z=spectrograms[i])],
            name=str(i),
            layout=dict(title=f'<b>Spectrogram at {lifecycle_pct:.0f}% Lifecycle</b><br><sup>File {idx+1}/{num_files}</sup>')
        )
    )

fig.frames = frames

# Create slider steps
steps = []
for i, idx in enumerate(frame_indices):
    lifecycle_pct = idx / num_files * 100
    steps.append(dict(
        args=[[str(i)], dict(frame=dict(duration=100, redraw=True), mode='immediate')],
        label=f'{lifecycle_pct:.0f}%',
        method='animate'
    ))

sliders = [dict(
    active=0,
    yanchor='top',
    xanchor='left',
    currentvalue=dict(
        font=dict(size=14),
        prefix='Lifecycle: ',
        visible=True,
        xanchor='center'
    ),
    transition=dict(duration=100),
    pad=dict(b=10, t=50),
    len=0.9,
    x=0.05,
    y=0,
    steps=steps
)]

# Add play/pause buttons
updatemenus = [dict(
    type='buttons',
    showactive=False,
    y=0,
    x=-0.05,
    xanchor='right',
    yanchor='top',
    pad=dict(t=50, r=10),
    buttons=[
        dict(
            label='Play',
            method='animate',
            args=[None, dict(frame=dict(duration=500, redraw=True), fromcurrent=True, mode='immediate')]
        ),
        dict(
            label='Pause',
            method='animate',
            args=[[None], dict(frame=dict(duration=0, redraw=False), mode='immediate')]
        )
    ]
)]

fig.update_layout(
    title=f'<b>Spectrogram at 0% Lifecycle</b><br><sup>File 1/{num_files}</sup>',
    xaxis_title='Time (ms)',
    yaxis_title='Frequency (kHz)',
    height=600,
    width=1000,
    sliders=sliders,
    updatemenus=updatemenus
)

fig.show()

---

## 3. Cross-Bearing Comparison Tools

Side-by-side comparison of any two bearings with synchronized views.
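
Bearings in a run-to-failure dataset have different lifetimes, so their recordings do not line up file by file. Plotting against lifecycle percentage aligns them visually; to compare values numerically, one option is to resample each trajectory onto a shared percentage grid. The sketch below does this with `np.interp` on made-up linear trends. The two bearings, their file counts, and the RMS values are all hypothetical.

```python
import numpy as np

# Two hypothetical bearings with different numbers of recordings.
pct_a = np.linspace(0, 100, 50)      # short-lived bearing, 50 files
pct_b = np.linspace(0, 100, 200)     # long-lived bearing, 200 files
rms_a = 0.5 + 0.02 * pct_a           # made-up linear RMS trends
rms_b = 0.5 + 0.01 * pct_b

# Resample both onto a shared 0-100 % grid so values can be compared point by point.
grid = np.linspace(0, 100, 101)
rms_a_on_grid = np.interp(grid, pct_a, rms_a)
rms_b_on_grid = np.interp(grid, pct_b, rms_b)

gap = rms_a_on_grid - rms_b_on_grid
print(f'RMS gap at 50%: {gap[50]:.3f}, at 100%: {gap[100]:.3f}')
```

Linear interpolation is a simple choice here; it smooths nothing and can blur sharp end-of-life spikes if the grid is much coarser than the original sampling.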

In [None]:
# Cross-Bearing Feature Comparison with Dual Dropdowns
# Create comparison figure

bearing_list = df['bearing_id'].unique().tolist()
feature_list = ['h_rms', 'v_rms', 'h_kurtosis', 'v_kurtosis', 'h_crest_factor', 
                'h_spectral_centroid', 'cross_correlation']

# Initial bearings
bearing1, bearing2 = 'Bearing1_1', 'Bearing3_2'
initial_feature = 'h_rms'

fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=['<b>Left Bearing</b>', '<b>Right Bearing</b>'],  # static titles; the dropdowns show the current selection
    horizontal_spacing=0.08
)

# Add all bearing traces (hidden by default)
trace_count = 0
for bearing in bearing_list:
    for feature in feature_list:
        data = df[df['bearing_id'] == bearing].sort_values('file_idx')
        
        # Left panel
        visible1 = (bearing == bearing1 and feature == initial_feature)
        fig.add_trace(
            go.Scatter(
                x=data['lifecycle_pct'],
                y=data[feature],
                name=bearing,
                mode='lines',
                line=dict(color=CONDITION_COLORS.get(data['condition'].iloc[0], '#666')),
                visible=visible1,
                hovertemplate=f'{bearing}<br>{feature}: %{{y:.3f}}<br>Lifecycle: %{{x:.1f}}%<extra></extra>'
            ),
            row=1, col=1
        )
        trace_count += 1
        
        # Right panel
        visible2 = (bearing == bearing2 and feature == initial_feature)
        fig.add_trace(
            go.Scatter(
                x=data['lifecycle_pct'],
                y=data[feature],
                name=bearing,
                mode='lines',
                line=dict(color=CONDITION_COLORS.get(data['condition'].iloc[0], '#666')),
                visible=visible2,
                showlegend=False,
                hovertemplate=f'{bearing}<br>{feature}: %{{y:.3f}}<br>Lifecycle: %{{x:.1f}}%<extra></extra>'
            ),
            row=1, col=2
        )
        trace_count += 1

# Build one dropdown per panel. Left-panel traces sit at even indices and
# right-panel traces at odd indices, so each dropdown restyles only its own panel.
n_pairs = len(bearing_list) * len(feature_list)
left_trace_ids = list(range(0, 2 * n_pairs, 2))
right_trace_ids = list(range(1, 2 * n_pairs, 2))

def make_bearing_buttons(trace_ids):
    """Return one button per bearing that toggles visibility of the given traces."""
    buttons = []
    for selected in bearing_list:
        visibility = [
            bearing == selected and feature == initial_feature
            for bearing in bearing_list
            for feature in feature_list
        ]
        buttons.append(dict(
            label=selected,
            method='restyle',
            args=[{'visible': visibility}, trace_ids]
        ))
    return buttons

fig.update_layout(
    updatemenus=[
        dict(
            active=bearing_list.index(bearing1),
            buttons=make_bearing_buttons(left_trace_ids),  # all bearings, left panel
            direction='down',
            showactive=True,
            x=0.1,
            xanchor='left',
            y=1.18,
            yanchor='top',
            bgcolor='white',
            bordercolor='#3498db'
        ),
        dict(
            active=bearing_list.index(bearing2),
            buttons=make_bearing_buttons(right_trace_ids),  # all bearings, right panel
            direction='down',
            showactive=True,
            x=0.6,
            xanchor='left',
            y=1.18,
            yanchor='top',
            bgcolor='white',
            bordercolor='#e74c3c'
        )
    ],
    annotations=[
        dict(text='<b>Left Bearing:</b>', x=0, y=1.15, xref='paper', yref='paper', showarrow=False),
        dict(text='<b>Right Bearing:</b>', x=0.5, y=1.15, xref='paper', yref='paper', showarrow=False)
    ],
    title='<b>Cross-Bearing Comparison: RMS Evolution</b>',
    height=500,
    width=1200,
    hovermode='x unified'
)

fig.update_xaxes(title_text='Lifecycle (%)')
fig.update_yaxes(title_text='RMS')

fig.show()

In [None]:
# All Bearings Overlay - Comparative Health Trajectories
fig = go.Figure()

# Add traces for all bearings, colored by condition
for bearing in df['bearing_id'].unique():
    bearing_data = df[df['bearing_id'] == bearing].sort_values('file_idx')
    condition = bearing_data['condition'].iloc[0]
    
    fig.add_trace(go.Scatter(
        x=bearing_data['lifecycle_pct'],
        y=bearing_data['h_rms'],
        name=bearing,
        mode='lines',
        line=dict(color=CONDITION_COLORS[condition], width=1.5),
        opacity=0.7,
        legendgroup=condition,
        hovertemplate=f'{bearing}<br>RMS: %{{y:.3f}}<br>Lifecycle: %{{x:.1f}}%<extra>{condition}</extra>'
    ))

fig.update_layout(
    title='<b>All Bearings: RMS Health Trajectories</b><br><sup>Colored by operating condition</sup>',
    xaxis_title='Lifecycle (%)',
    yaxis_title='Horizontal RMS',
    height=600,
    width=1100,
    legend=dict(
        title='Bearing',
        yanchor='top',
        y=0.99,
        xanchor='left',
        x=1.02
    ),
    hovermode='closest'
)

# Add vertical lines at key lifecycle stages
fig.add_vline(x=25, line_dash='dot', line_color='green', opacity=0.5, annotation_text='25%')
fig.add_vline(x=50, line_dash='dot', line_color='orange', opacity=0.5, annotation_text='50%')
fig.add_vline(x=75, line_dash='dot', line_color='red', opacity=0.5, annotation_text='75%')

fig.show()

---

## 4. 3D Waterfall PSD Diagram

Professional 3D visualization showing PSD evolution across the bearing lifecycle.
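
A waterfall plot is built from a 2D matrix with one PSD row per lifecycle sample and one column per frequency bin; Plotly's `Surface` reads `z` with rows following `y` and columns following `x`. The sketch below builds such a matrix from synthetic signals to show the layout. It uses a simplified averaged periodogram in NumPy (non-overlapping Hann segments, no density scaling) as a stand-in for `scipy.signal.welch`, so the dB levels are not comparable with the notebook's. The 1 kHz tone, its amplitudes, and the five stages are hypothetical.

```python
import numpy as np

fs = 25_600          # Hz, dataset sampling rate
nperseg = 2048       # segment length, matching the Welch call in this section
tone_hz = 1000.0     # hypothetical fault-related tone
n_stages = 5         # hypothetical number of lifecycle samples

t = np.arange(8 * nperseg) / fs
window = np.hanning(nperseg)
freqs = np.arange(nperseg // 2 + 1) * fs / nperseg   # 12.5 Hz spacing

rows = []
for stage in range(n_stages):
    amplitude = 0.1 * (stage + 1)                    # tone grows toward failure
    sig = amplitude * np.sin(2 * np.pi * tone_hz * t)
    segments = sig.reshape(-1, nperseg) * window     # non-overlapping Hann segments
    power = np.mean(np.abs(np.fft.rfft(segments, axis=1)) ** 2, axis=0)
    rows.append(10 * np.log10(power + 1e-10))        # unnormalised dB

psd_matrix = np.array(rows)                          # shape: (lifecycle, frequency)
peak_bin = int(psd_matrix[-1].argmax())
print(psd_matrix.shape, freqs[peak_bin])
```

Keeping the matrix in (lifecycle, frequency) order means the same array can feed the surface, the line waterfall, and the top-down heatmap without transposing.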

In [None]:
# 3D Waterfall PSD - Interactive Surface Plot
# Sample lifecycle points for waterfall
n_lifecycle_points = 30
sample_indices = np.linspace(0, num_files - 1, n_lifecycle_points, dtype=int)

# Compute PSD for each sample
psd_matrix = []
lifecycle_pcts = []

for idx in sample_indices:
    sig = signals_array[idx][:, 0]  # Horizontal channel
    freqs, psd = signal.welch(sig, fs=SAMPLING_RATE, nperseg=2048)
    psd_db = 10 * np.log10(psd + 1e-10)
    psd_matrix.append(psd_db)
    lifecycle_pcts.append(idx / num_files * 100)

psd_matrix = np.array(psd_matrix)

# Limit frequency range
freq_mask = freqs <= 5000
freqs_masked = freqs[freq_mask]
psd_matrix_masked = psd_matrix[:, freq_mask]

print(f'PSD matrix shape: {psd_matrix_masked.shape}')
print(f'Frequency range: 0 - {freqs_masked[-1]:.0f} Hz')
print(f'Lifecycle samples: {len(lifecycle_pcts)}')

In [None]:
# 3D Surface Waterfall Plot
fig = go.Figure(data=[go.Surface(
    z=psd_matrix_masked,
    x=freqs_masked / 1000,  # kHz
    y=lifecycle_pcts,
    colorscale='Viridis',
    colorbar=dict(title='PSD (dB)', x=1.05),
    hovertemplate='Freq: %{x:.2f} kHz<br>Lifecycle: %{y:.1f}%<br>PSD: %{z:.1f} dB<extra></extra>',
    lighting=dict(ambient=0.6, diffuse=0.8, specular=0.3, roughness=0.5),
    lightposition=dict(x=0, y=0, z=2000)
)])

fig.update_layout(
    title=f'<b>3D Waterfall PSD: {sample_bearing_id}</b><br><sup>Horizontal channel - Drag to rotate</sup>',
    scene=dict(
        xaxis_title='Frequency (kHz)',
        yaxis_title='Lifecycle (%)',
        zaxis_title='PSD (dB)',
        xaxis=dict(gridcolor='lightgray', backgroundcolor='white'),
        yaxis=dict(gridcolor='lightgray', backgroundcolor='white'),
        zaxis=dict(gridcolor='lightgray', backgroundcolor='white'),
        camera=dict(
            eye=dict(x=1.5, y=-1.8, z=0.8),
            center=dict(x=0, y=0, z=-0.2)
        ),
        aspectratio=dict(x=1.5, y=1.2, z=0.6)
    ),
    height=700,
    width=1000,
    margin=dict(l=50, r=50, t=80, b=50)
)

fig.show()

In [None]:
# Alternative: Wireframe/Line 3D Waterfall for Cleaner Look
fig = go.Figure()

# Add lines for each lifecycle point
n_lines = 20
line_indices = np.linspace(0, len(lifecycle_pcts)-1, n_lines, dtype=int)

for i in line_indices:
    lifecycle = lifecycle_pcts[i]
    color_val = lifecycle / 100
    
    # Color from green (healthy) to red (failed)
    r = int(255 * color_val)
    g = int(255 * (1 - color_val))
    color = f'rgb({r},{g},50)'
    
    fig.add_trace(go.Scatter3d(
        x=freqs_masked / 1000,
        y=[lifecycle] * len(freqs_masked),
        z=psd_matrix_masked[i],
        mode='lines',
        line=dict(color=color, width=3),
        name=f'{lifecycle:.0f}%',
        hovertemplate=f'Lifecycle: {lifecycle:.0f}%<br>Freq: %{{x:.2f}} kHz<br>PSD: %{{z:.1f}} dB<extra></extra>'
    ))

fig.update_layout(
    title=f'<b>3D Waterfall PSD Lines: {sample_bearing_id}</b><br><sup>Color: Green=Healthy, Red=Failed</sup>',
    scene=dict(
        xaxis_title='Frequency (kHz)',
        yaxis_title='Lifecycle (%)',
        zaxis_title='PSD (dB)',
        camera=dict(eye=dict(x=1.8, y=-1.5, z=0.9)),
        aspectratio=dict(x=1.5, y=1, z=0.6)
    ),
    height=700,
    width=1000,
    showlegend=False
)

fig.show()

In [None]:
# 2D Heatmap Alternative (Top-Down View of Waterfall)
fig = go.Figure(data=go.Heatmap(
    z=psd_matrix_masked,
    x=freqs_masked / 1000,
    y=lifecycle_pcts,
    colorscale='Viridis',
    colorbar=dict(title='PSD (dB)'),
    hovertemplate='Freq: %{x:.2f} kHz<br>Lifecycle: %{y:.1f}%<br>PSD: %{z:.1f} dB<extra></extra>'
))

# Add characteristic frequency annotations
char_freqs = get_characteristic_frequencies_for_condition(sample_condition)
freq_labels = {'bpfo': 'BPFO', 'bpfi': 'BPFI', 'bsf': 'BSF', 'ftf': 'FTF'}
freq_colors = {'bpfo': '#e74c3c', 'bpfi': '#2ecc71', 'bsf': '#9b59b6', 'ftf': '#f39c12'}

for name, freq in char_freqs.items():
    if freq / 1000 < freqs_masked[-1] / 1000:
        fig.add_vline(
            x=freq / 1000, 
            line_dash='dash', 
            line_color=freq_colors[name],
            annotation_text=freq_labels[name],
            annotation_position='top',
            opacity=0.7
        )

fig.update_layout(
    title=f'<b>PSD Evolution Heatmap: {sample_bearing_id}</b><br><sup>With characteristic frequency markers</sup>',
    xaxis_title='Frequency (kHz)',
    yaxis_title='Lifecycle (%)',
    height=600,
    width=1100
)

fig.show()

---

## 5. Interactive Feature Explorer

Multi-dimensional feature visualization with dynamic filtering.
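
The parallel-coordinates cell in this section min-max scales each feature so that axes with very different ranges are comparable. A plain `(x - min) / (max - min)` divides by zero when a column is constant in the sampled data. The sketch below shows one way to guard against that; the helper name `min_max_scale` and the choice to map a constant column to zeros are assumptions, not the notebook's behavior.

```python
import numpy as np


def min_max_scale(values):
    """Scale values to [0, 1]; a constant column maps to all zeros instead of NaN."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


print(min_max_scale([2.0, 4.0, 6.0]))   # varying feature
print(min_max_scale([3.0, 3.0, 3.0]))   # constant feature, guarded
```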

In [None]:
# Interactive Scatter Matrix for Key Features
key_features = ['h_rms', 'h_kurtosis', 'h_crest_factor', 'h_spectral_centroid']

# Sample for performance
df_sample = df.sample(min(2000, len(df)), random_state=42)

fig = px.scatter_matrix(
    df_sample,
    dimensions=key_features,
    color='condition',
    color_discrete_map=CONDITION_COLORS,
    symbol='condition',
    title='<b>Feature Scatter Matrix</b><br><sup>Lower triangle of pairwise scatter plots, colored by condition</sup>',
    opacity=0.5,
    hover_data=['bearing_id', 'lifecycle_pct', 'rul']
)

fig.update_layout(
    height=800,
    width=1000,
    legend=dict(title='Condition')
)

fig.update_traces(diagonal_visible=False, showupperhalf=False)

fig.show()

In [None]:
# 3D Feature Space Visualization
fig = px.scatter_3d(
    df_sample,
    x='h_rms',
    y='h_kurtosis',
    z='h_spectral_centroid',
    color='rul',
    color_continuous_scale='RdYlGn',
    symbol='condition',
    size='h_crest_factor',
    size_max=15,
    opacity=0.7,
    hover_data=['bearing_id', 'lifecycle_pct'],
    title='<b>3D Feature Space</b><br><sup>Color=RUL, Size=Crest Factor</sup>'
)

fig.update_layout(
    height=700,
    width=1000,
    scene=dict(
        xaxis_title='RMS',
        yaxis_title='Kurtosis',
        zaxis_title='Spectral Centroid'
    )
)

fig.show()

In [None]:
# Animated Feature Evolution Over Lifecycle
# Aggregate by lifecycle bins
df['lifecycle_bin'] = pd.cut(df['lifecycle_pct'], bins=20, labels=False) * 5

df_agg = df.groupby(['condition', 'lifecycle_bin']).agg({
    'h_rms': 'mean',
    'h_kurtosis': 'mean',
    'h_crest_factor': 'mean',
    'bearing_id': 'count'
}).reset_index()
df_agg.columns = ['condition', 'lifecycle_bin', 'h_rms', 'h_kurtosis', 'h_crest_factor', 'count']

fig = px.scatter(
    df_agg,
    x='h_rms',
    y='h_kurtosis',
    color='condition',
    color_discrete_map=CONDITION_COLORS,
    size='count',
    size_max=40,
    animation_frame='lifecycle_bin',
    animation_group='condition',
    range_x=[0, df['h_rms'].quantile(0.99)],
    range_y=[df['h_kurtosis'].quantile(0.01), df['h_kurtosis'].quantile(0.99)],
    title='<b>Feature Evolution Animation</b><br><sup>Watch how features change across lifecycle</sup>'
)

fig.update_layout(
    height=600,
    width=900,
    xaxis_title='Mean RMS',
    yaxis_title='Mean Kurtosis'
)

fig.show()

In [None]:
# Interactive Parallel Coordinates Plot
features_for_parallel = ['h_rms', 'v_rms', 'h_kurtosis', 'v_kurtosis', 
                         'h_crest_factor', 'cross_correlation', 'rul']

# Normalize features for better visualization
df_norm = df_sample.copy()
for col in features_for_parallel[:-1]:  # Don't normalize RUL
    df_norm[col] = (df_norm[col] - df_norm[col].min()) / (df_norm[col].max() - df_norm[col].min())

fig = px.parallel_coordinates(
    df_norm,
    dimensions=features_for_parallel,
    color='rul',
    color_continuous_scale='RdYlGn',
    title='<b>Parallel Coordinates: Feature Relationships</b><br><sup>Drag axes to filter, color=RUL</sup>'
)

fig.update_layout(
    height=600,
    width=1200
)

fig.show()

---

## 6. Static Baseline Visualizations (Matplotlib/Seaborn)

High-quality static plots for publication and reports.
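
One of the static plots in this section ranks features by their Pearson correlation with RUL. As a sanity check on how to read the sign, the sketch below correlates a synthetic rising RMS trend with a decreasing RUL. The data are made up, and the linear "files remaining" definition of RUL is an assumption; the notebook takes `rul` from the precomputed feature file.

```python
import numpy as np

rng = np.random.default_rng(0)
n_files = 200                                    # hypothetical bearing with 200 recordings
rul = np.arange(n_files - 1, -1, -1)             # assumed linear RUL: files remaining
rms = 0.5 + 0.01 * np.arange(n_files) + rng.normal(0, 0.05, n_files)  # rising, noisy

r = np.corrcoef(rms, rul)[0, 1]                  # Pearson correlation coefficient
print(f'corr(rms, rul) = {r:.2f}')
```

A feature that grows as the bearing degrades correlates negatively with RUL, which is why the strongest degradation indicators appear at the negative end of the bar chart.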

In [None]:
# Publication-Quality RMS Trends
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

conditions = ['35Hz12kN', '37.5Hz11kN', '40Hz10kN']
colors_mpl = ['#3498db', '#2ecc71', '#e74c3c']

for ax, (cond, color) in zip(axes, zip(conditions, colors_mpl)):
    cond_bearings = df[df['condition'] == cond]['bearing_id'].unique()
    
    for bearing in cond_bearings:
        data = df[(df['condition'] == cond) & (df['bearing_id'] == bearing)].sort_values('file_idx')
        ax.plot(data['lifecycle_pct'], data['h_rms'], alpha=0.7, linewidth=1.2, label=bearing)
    
    ax.set_xlabel('Lifecycle (%)', fontsize=11)
    ax.set_ylabel('Horizontal RMS', fontsize=11)
    ax.set_title(f'{cond}', fontsize=12, fontweight='bold')
    ax.axvline(x=80, color='red', linestyle='--', alpha=0.5, linewidth=1)
    ax.legend(fontsize=8, loc='upper left')
    ax.grid(True, alpha=0.3)
    ax.set_xlim(0, 100)

plt.suptitle('RMS Degradation Trends by Condition', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

In [None]:
# Horizontal bar chart: Pearson correlation of each feature with RUL
feature_cols = [c for c in df.columns if c.startswith(('h_', 'v_')) and 'band' not in c]
corr_with_rul = df[feature_cols + ['rul']].corr()['rul'].drop('rul').sort_values()

fig, ax = plt.subplots(figsize=(10, 12))

colors_diverging = ['#e74c3c' if c < 0 else '#2ecc71' for c in corr_with_rul]
bars = ax.barh(corr_with_rul.index, corr_with_rul.values, color=colors_diverging)

ax.axvline(x=0, color='black', linewidth=0.5)
ax.axvline(x=-0.5, color='gray', linestyle='--', alpha=0.5)
ax.axvline(x=0.5, color='gray', linestyle='--', alpha=0.5)

ax.set_xlabel('Correlation with RUL', fontsize=11)
ax.set_title('Feature-RUL Correlation\n(Negative = increases as RUL decreases)', fontsize=12, fontweight='bold')
ax.grid(True, alpha=0.3, axis='x')

plt.tight_layout()
plt.show()

In [None]:
# Static Waterfall Plot using Matplotlib 3D
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import PolyCollection

fig = plt.figure(figsize=(14, 10))
ax = fig.add_subplot(111, projection='3d')

# Prepare data for waterfall
verts = []
colors_waterfall = []

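z_floor = psd_matrix_masked.min()  # baseline that closes each polygon

for psd, lifecycle in zip(psd_matrix_masked[::2], lifecycle_pcts[::2]):
    xs = freqs_masked / 1000
    # Anchor both ends at the baseline so the fill lies under the PSD curve
    verts.append([(xs[0], z_floor), *zip(xs, psd), (xs[-1], z_floor)])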
    
    # Color based on lifecycle
    r = lifecycle / 100
    colors_waterfall.append((r, 1-r, 0.2, 0.7))

poly = PolyCollection(verts, facecolors=colors_waterfall, edgecolors='k', linewidths=0.5)
ax.add_collection3d(poly, zs=lifecycle_pcts[::2], zdir='y')

ax.set_xlabel('Frequency (kHz)')
ax.set_ylabel('Lifecycle (%)')
ax.set_zlabel('PSD (dB)')
ax.set_xlim(0, freqs_masked[-1]/1000)
ax.set_ylim(0, 100)
ax.set_zlim(psd_matrix_masked.min(), psd_matrix_masked.max())

ax.view_init(elev=25, azim=-60)
ax.set_title(f'3D Waterfall PSD: {sample_bearing_id} (Static)', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

In [None]:
# Multi-Bearing Comparison Grid
fig, axes = plt.subplots(3, 5, figsize=(18, 10))

for idx, bearing in enumerate(df['bearing_id'].unique()):
    row = idx // 5
    col = idx % 5
    ax = axes[row, col]
    
    data = df[df['bearing_id'] == bearing].sort_values('file_idx')
    condition = data['condition'].iloc[0]
    color = {'35Hz12kN': '#3498db', '37.5Hz11kN': '#2ecc71', '40Hz10kN': '#e74c3c'}[condition]
    
    ax.plot(data['lifecycle_pct'], data['h_rms'], color=color, linewidth=1)
    ax.fill_between(data['lifecycle_pct'], 0, data['h_rms'], alpha=0.3, color=color)
    
    ax.set_title(bearing, fontsize=9, fontweight='bold')
    ax.set_xlim(0, 100)
    ax.set_ylim(0, data['h_rms'].max() * 1.1)
    
    if col == 0:
        ax.set_ylabel('H_RMS', fontsize=8)
    if row == 2:
        ax.set_xlabel('Lifecycle (%)', fontsize=8)
    
    ax.tick_params(labelsize=7)

plt.suptitle('All Bearings: RMS Degradation Overview', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

---

## 7. Summary and Key Insights

### Interactive Visualization Capabilities

1. **Health Dashboard**: Sunburst chart showing dataset hierarchy, violin plots for RUL distribution
2. **Bearing Selector**: Dropdown-based bearing selection for RMS evolution analysis
3. **Time-Frequency Spectrograms**: Animated spectrogram with lifecycle slider
4. **Cross-Bearing Comparison**: Side-by-side comparison with dual dropdowns
5. **3D Waterfall PSD**: Surface and wireframe 3D plots showing frequency evolution
6. **Feature Explorer**: Scatter matrix, 3D feature space, parallel coordinates

### Static Baselines
- Publication-quality matplotlib plots
- Feature-RUL correlation bar chart
- 3D waterfall using matplotlib
- Multi-bearing comparison grid

### Key Findings
- RMS shows consistent increase toward end-of-life across all bearings
- Kurtosis exhibits spikes indicating impact events
- Characteristic frequencies become more prominent near failure
- Different operating conditions show distinct degradation patterns

In [None]:
# Cleanup
if 'lifecycle_bin' in df.columns:
    df = df.drop(columns=['lifecycle_bin'])

print('EDA-3 Interactive Visualizations Complete!')
print('\nVisualization Summary:')
print('- 6 Interactive Plotly dashboards with dropdowns/sliders')
print('- 3D waterfall PSD (surface and wireframe)')
print('- Animated spectrogram with lifecycle slider')
print('- Cross-bearing comparison tools')
print('- Static matplotlib/seaborn baselines')