I'll provide a comprehensive explanation of Fourier analysis for categorical time series, covering the mathematical foundations, practical implementation, and interpretation strategies. Let me examine your current notebook to understand the context better.

Perfect! I can see the Fourier analysis method in your notebook. Let me provide a comprehensive explanation of how Fourier analysis works for categorical time series analysis.

## **Fourier Analysis for Categorical Time Series: Complete Mathematical Framework**

### **1. MATHEMATICAL FOUNDATIONS**

#### **A. Core Concept: From Categorical to Continuous**

Fourier analysis requires **numerical values sampled at regular intervals**, but categorical data consists of labels with no numeric scale. The key transformation:

**Categorical → Proportions → Time Series → Frequency Domain**

```python
# Step 1: Categorical data over time
# Gender: [M, F, M, M, F, F, M, F, ...]
# Time:   [1, 2, 3, 4, 5, 6, 7, 8, ...]

# Step 2: Convert to proportions per time period
# Time Period 1: P(Male)=0.6, P(Female)=0.4
# Time Period 2: P(Male)=0.3, P(Female)=0.7
# Time Period 3: P(Male)=0.8, P(Female)=0.2
```
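
As a minimal runnable sketch of Step 2, the snippet below turns row-level categorical observations into one proportion per category per day. The toy `df`, its column names, and the use of `pd.crosstab` are illustrative assumptions, not taken from your notebook.

```python
import pandas as pd

# Hypothetical toy data: one categorical observation per row, with a timestamp.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01"] * 5 + ["2024-01-02"] * 4),
    "gender": ["M", "M", "M", "F", "F", "F", "F", "F", "M"],
})

# Count categories per day, then normalise each row so its proportions sum to 1.
proportions = pd.crosstab(df["date"], df["gender"], normalize="index")
print(proportions)
```

Each column of `proportions` is then an ordinary numeric time series that can be passed to an FFT.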

#### **B. Discrete Fourier Transform (DFT) Mathematics**

For a time series of proportions **x[n]** with **N** time points:

**X[k] = Σ(n=0 to N-1) x[n] × e^(-i2πkn/N)**

Where:
- **X[k]** = Complex frequency component at frequency k
- **x[n]** = Proportion value at time n
- **k** = Frequency index (0 to N-1)
- **i** = Imaginary unit (√-1)
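
To make the definition concrete, here is a small sketch that evaluates the sum directly and compares it with NumPy's FFT. The function name `naive_dft` and the sample values are assumptions chosen for illustration; in practice the FFT is what you would use.

```python
import numpy as np

def naive_dft(x):
    """Evaluate X[k] = sum_n x[n] * exp(-2j*pi*k*n/N) directly (O(N^2))."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)              # one row per frequency index k
    return np.exp(-2j * np.pi * k * n / N) @ x

# Hypothetical proportion series with N = 8 time points
x = np.array([0.6, 0.3, 0.8, 0.5, 0.4, 0.7, 0.2, 0.5])
X = naive_dft(x)

print(np.allclose(X, np.fft.fft(x)))   # direct sum agrees with the FFT
print(np.isclose(X[0].real, x.sum()))  # X[0] is the sum of the series (the DC term)
```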

#### **C. Key Frequency Domain Properties**

1. **Magnitude**: **|X[k]| = √(Real[X[k]]² + Imag[X[k]]²)**
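2. **Phase**: **φ[k] = atan2(Imag[X[k]], Real[X[k]])** (four-quadrant arctangent, so the sign of both parts is respected)
3. **Power**: **P[k] = |X[k]|²**
4. **Frequency**: **f[k] = k / N** (cycles per sample, for k ≤ N/2; the period in samples is N / k)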
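
The four quantities above can be checked on a signal whose answer is known in advance. This is a sketch under assumed values (a proportion oscillating around 0.5 with an 8-sample period); `np.angle` computes the four-quadrant phase.

```python
import numpy as np

N = 64
n = np.arange(N)
# Assumed test signal: amplitude 0.2, period 8 samples, phase pi/3
x = 0.5 + 0.2 * np.cos(2 * np.pi * n / 8 + np.pi / 3)

X = np.fft.rfft(x - x.mean())        # remove the mean, keep non-negative frequencies
freqs = np.fft.rfftfreq(N, d=1.0)    # f[k] = k / N, in cycles per sample
magnitude = np.abs(X)
phase = np.angle(X)                  # atan2(Imag, Real)
power = magnitude ** 2

k_peak = int(np.argmax(magnitude))
period = 1 / freqs[k_peak]               # period in samples
amplitude = 2 * magnitude[k_peak] / N    # one-sided amplitude, no window applied

print(k_peak, period, amplitude, phase[k_peak])
```

The factor 2 in `amplitude` compensates for dropping the negative-frequency half of the spectrum; it applies only when no window is used.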

### **2. ENHANCED IMPLEMENTATION WITH DETAILED ANALYSIS**

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import signal
from scipy.fft import fft, fftfreq
import seaborn as sns

def comprehensive_fourier_categorical_analysis(df, cat_col, time_col, sampling_period='D'):
    """
    Complete Fourier analysis implementation for categorical time series
    """
    
    # STEP 1: Data Preparation and Validation
    print("=== STEP 1: DATA PREPARATION ===")
    
    # Ensure time column is datetime
    df[time_col] = pd.to_datetime(df[time_col])
    df = df.sort_values(time_col)
    
    # Create regular time grid (handle missing periods)
    full_time_range = pd.date_range(
        start=df[time_col].min(), 
        end=df[time_col].max(), 
        freq=sampling_period
    )
    
    print(f"Original data points: {len(df)}")
    print(f"Full time range: {len(full_time_range)} periods")
    print(f"Categories: {df[cat_col].unique()}")
    
    # STEP 2: Convert to Proportion Time Series
    print("\n=== STEP 2: PROPORTION TIME SERIES ===")
    
    # Group by time periods and calculate proportions
    time_grouped = df.groupby(df[time_col].dt.floor(sampling_period))
    proportion_series = time_grouped[cat_col].value_counts(normalize=True).unstack(fill_value=0)
    
    # Reindex to full time range (interpolate gaps, then fill any leading/trailing NaNs).
    # .bfill()/.ffill() replace the deprecated fillna(method=...) form.
    proportion_series = proportion_series.reindex(full_time_range).interpolate(method='linear').bfill().ffill()
    
    print(f"Proportion series shape: {proportion_series.shape}")
    print("Sample proportions:")
    print(proportion_series.head())
    
    # STEP 3: Fourier Transform for Each Category
    print("\n=== STEP 3: FOURIER TRANSFORM ===")
    
    results = {}
    N = len(proportion_series)
    
    # Express frequencies in cycles per sampling period (fs = 1 sample per period),
    # so that 1/frequency is a period measured in samples: days for 'D', hours for 'H'.
    # This keeps the period thresholds used in the interpretation step consistent.
    fs = 1.0
    freq_unit = {
        'D': 'cycles/day',
        'H': 'cycles/hour',
        'M': 'cycles/month',
    }.get(sampling_period, 'cycles/period')
    
    for category in proportion_series.columns:
        print(f"\nAnalyzing category: {category}")
        
        # Get time series for this category
        y = proportion_series[category].values
        
        # Remove DC component (mean) for better analysis; note this does not remove a linear trend
        y_detrended = y - np.mean(y)
        
        # Apply window function to reduce spectral leakage
        window = signal.windows.hann(N)
        y_windowed = y_detrended * window
        
        # Compute FFT
        Y = fft(y_windowed)
        frequencies = fftfreq(N, d=1/fs)
        
        # Only keep positive frequencies (due to symmetry)
        positive_freq_idx = frequencies > 0
        frequencies_pos = frequencies[positive_freq_idx]
        Y_pos = Y[positive_freq_idx]
        
        # Calculate magnitude, phase, and power
        magnitude = np.abs(Y_pos)
        phase = np.angle(Y_pos)
        power = magnitude ** 2
        
        # Normalize power spectrum
        power_normalized = power / np.sum(power)
        
        # Find dominant frequencies
        peak_indices = signal.find_peaks(magnitude, height=np.max(magnitude)*0.1)[0]
        dominant_frequencies = frequencies_pos[peak_indices]
        dominant_magnitudes = magnitude[peak_indices]
        
        # Convert frequencies to periods (more intuitive)
        dominant_periods = 1 / dominant_frequencies[dominant_frequencies > 0]
        
        results[category] = {
            'time_series': y,
            'detrended_series': y_detrended,
            'frequencies': frequencies_pos,
            'magnitude': magnitude,
            'phase': phase,
            'power': power,
            'power_normalized': power_normalized,
            'dominant_frequencies': dominant_frequencies,
            'dominant_periods': dominant_periods,
            'dominant_magnitudes': dominant_magnitudes,
            'peak_indices': peak_indices,
            'total_variance': np.var(y),
            # Share of windowed spectral power located in the peak bins (a proxy for explained variance)
            'explained_variance_ratio': np.sum(power_normalized[peak_indices])
        }
        
        print(f"  Dominant periods: {dominant_periods}")
        print(f"  Explained variance: {results[category]['explained_variance_ratio']:.3f}")
    
    return results, proportion_series

def interpret_fourier_results(results, sampling_period='D'):
    """
    Comprehensive interpretation of Fourier analysis results
    """
    
    print("=== FOURIER ANALYSIS INTERPRETATION ===")
    
    interpretations = {}
    
    for category, data in results.items():
        print(f"\n--- CATEGORY: {category} ---")
        
        # Extract key metrics
        dominant_periods = data['dominant_periods']
        dominant_magnitudes = data['dominant_magnitudes']
        explained_variance = data['explained_variance_ratio']
        
        # Interpretation framework
        interpretation = {
            'seasonality_strength': 'weak',
            'primary_cycles': [],
            'seasonal_patterns': [],
            'trend_component': 'stable',
            'noise_level': 'low'
        }
        
        # 1. SEASONALITY STRENGTH ASSESSMENT
        if explained_variance > 0.7:
            interpretation['seasonality_strength'] = 'very_strong'
        elif explained_variance > 0.5:
            interpretation['seasonality_strength'] = 'strong'
        elif explained_variance > 0.3:
            interpretation['seasonality_strength'] = 'moderate'
        elif explained_variance > 0.1:
            interpretation['seasonality_strength'] = 'weak'
        else:
            interpretation['seasonality_strength'] = 'very_weak'
        
        print(f"Seasonality Strength: {interpretation['seasonality_strength']}")
        print(f"Explained Variance: {explained_variance:.3f}")
        
        # 2. CYCLE IDENTIFICATION
        if len(dominant_periods) > 0:
            # Sort by magnitude (strongest cycles first)
            sorted_indices = np.argsort(dominant_magnitudes)[::-1]
            
            for i, idx in enumerate(sorted_indices[:3]):  # Top 3 cycles
                period = dominant_periods[idx]
                magnitude = dominant_magnitudes[idx]
                
                # Interpret period length
                if sampling_period == 'D':
                    if 6 <= period <= 8:
                        cycle_type = 'weekly'
                    elif 28 <= period <= 32:
                        cycle_type = 'monthly'
                    elif 85 <= period <= 95:
                        cycle_type = 'quarterly'
                    elif 360 <= period <= 370:
                        cycle_type = 'annual'
                    elif period < 6:
                        cycle_type = 'short_term'
                    else:
                        cycle_type = f'{period:.1f}_day'
                elif sampling_period == 'H':
                    if 23 <= period <= 25:
                        cycle_type = 'daily'
                    elif 167 <= period <= 169:
                        cycle_type = 'weekly'
                    else:
                        cycle_type = f'{period:.1f}_hour'
                else:
                    cycle_type = f'{period:.1f}_period'
                
                interpretation['primary_cycles'].append({
                    'period': period,
                    'cycle_type': cycle_type,
                    'magnitude': magnitude,
                    'rank': i + 1
                })
                
                print(f"Cycle {i+1}: {cycle_type} ({period:.1f} periods, magnitude: {magnitude:.3f})")
        
        # 3. TREND ANALYSIS (from DC component and low frequencies)
        dc_component = np.mean(data['time_series'])
        low_freq_power = np.sum(data['power'][:5]) / np.sum(data['power'])
        
        if low_freq_power > 0.3:
            interpretation['trend_component'] = 'strong_trend'
        elif low_freq_power > 0.1:
            interpretation['trend_component'] = 'moderate_trend'
        else:
            interpretation['trend_component'] = 'stable'
        
        print(f"Trend Component: {interpretation['trend_component']}")
        print(f"Low Frequency Power: {low_freq_power:.3f}")
        
        # 4. NOISE ASSESSMENT
        high_freq_power = np.sum(data['power'][len(data['power'])//2:]) / np.sum(data['power'])
        
        if high_freq_power > 0.4:
            interpretation['noise_level'] = 'high'
        elif high_freq_power > 0.2:
            interpretation['noise_level'] = 'moderate'
        else:
            interpretation['noise_level'] = 'low'
        
        print(f"Noise Level: {interpretation['noise_level']}")
        print(f"High Frequency Power: {high_freq_power:.3f}")
        
        interpretations[category] = interpretation
    
    return interpretations
```
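
One caveat on the trend assessment above: subtracting the mean leaves any linear drift in the series, and that drift shows up as low-frequency power. The sketch below compares mean removal with `scipy.signal.detrend` on an assumed synthetic series (a slow drift plus a weekly cycle); the helper `low_freq_share` is hypothetical and mirrors the "first 5 bins" rule used in `interpret_fourier_results`.

```python
import numpy as np
from scipy import signal

N = 365
t = np.arange(N)
# Assumed series: slow linear drift plus a weekly cycle
y = 0.4 + 0.2 * t / N + 0.05 * np.sin(2 * np.pi * t / 7)

def low_freq_share(series, n_bins=5):
    """Fraction of spectral power in the first n_bins positive-frequency bins."""
    power = np.abs(np.fft.rfft(series))[1:] ** 2   # drop the k = 0 bin
    return power[:n_bins].sum() / power.sum()

share_mean_removed = low_freq_share(y - y.mean())
share_detrended = low_freq_share(signal.detrend(y, type="linear"))

print(f"mean removed:     {share_mean_removed:.3f}")
print(f"linear detrended: {share_detrended:.3f}")
```

With mean removal only, the drift dominates the lowest bins; after linear detrending the low-frequency share is small and the weekly cycle stands out.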

### **3. COMPREHENSIVE INTERPRETATION FRAMEWORK**

#### **A. Frequency Domain Insights**

#### **B. Advanced Interpretation Methods**

```python
def advanced_fourier_interpretation(results, time_index):
    """
    Advanced interpretation techniques for Fourier analysis
    """
    
    advanced_insights = {}
    
    for category, data in results.items():
        insights = {
            'spectral_characteristics': {},
            'seasonal_decomposition': {},
            'predictability_metrics': {},
            'comparative_analysis': {}
        }
        
        # 1. SPECTRAL CHARACTERISTICS
        frequencies = data['frequencies']
        magnitude = data['magnitude']
        power = data['power']
        
        # Spectral centroid (weighted average frequency)
        spectral_centroid = np.sum(frequencies * magnitude) / np.sum(magnitude)
        
        # Spectral spread (frequency distribution width)
        spectral_spread = np.sqrt(np.sum(((frequencies - spectral_centroid) ** 2) * magnitude) / np.sum(magnitude))
        
        # Spectral rolloff (frequency below which 85% of energy is contained)
        cumulative_power = np.cumsum(power)
        rolloff_threshold = 0.85 * cumulative_power[-1]
        rolloff_idx = np.where(cumulative_power >= rolloff_threshold)[0][0]
        spectral_rolloff = frequencies[rolloff_idx]
        
        insights['spectral_characteristics'] = {
            'centroid': spectral_centroid,
            'spread': spectral_spread,
            'rolloff': spectral_rolloff,
            'bandwidth': spectral_rolloff - frequencies[0]
        }
        
        # 2. SEASONAL DECOMPOSITION INSIGHTS
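        # Reconstruct signal from dominant frequencies
        dominant_indices = data['peak_indices']
        n_samples = len(data['time_series'])
        t = np.arange(n_samples)
        reconstructed_signal = np.zeros(n_samples)
        
        # The spectrum came from a Hann-windowed series and only positive frequencies
        # were kept, so amplitudes scale as 2*|X| / sum(window), not |X| / N.
        # (`signal` is scipy.signal, imported with the main analysis function.)
        window_sum = np.sum(signal.windows.hann(n_samples))
        
        for idx in dominant_indices:
            freq = frequencies[idx]
            amplitude = 2 * magnitude[idx] / window_sum
            phase = data['phase'][idx]
            
            # Reconstruct sinusoidal component
            component = amplitude * np.cos(2 * np.pi * freq * t + phase)
            reconstructed_signal += component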
        
        # Calculate reconstruction quality
        original_signal = data['detrended_series']
        reconstruction_error = np.mean((original_signal - reconstructed_signal) ** 2)
        reconstruction_r2 = 1 - reconstruction_error / np.var(original_signal)
        
        insights['seasonal_decomposition'] = {
            'reconstructed_signal': reconstructed_signal,
            'reconstruction_r2': reconstruction_r2,
            'residual_variance': reconstruction_error,
            'seasonal_component_strength': np.var(reconstructed_signal) / np.var(original_signal)
        }
        
        # 3. PREDICTABILITY METRICS
        # Entropy of power spectrum (measure of predictability)
        power_normalized = power / np.sum(power)
        spectral_entropy = -np.sum(power_normalized * np.log2(power_normalized + 1e-10))
        max_entropy = np.log2(len(power))
        normalized_entropy = spectral_entropy / max_entropy
        
        # Regularity index (concentration of power in dominant frequencies)
        dominant_power = np.sum(power[dominant_indices])
        total_power = np.sum(power)
        regularity_index = dominant_power / total_power
        
        insights['predictability_metrics'] = {
            'spectral_entropy': spectral_entropy,
            'normalized_entropy': normalized_entropy,
            'regularity_index': regularity_index,
            'predictability_score': 1 - normalized_entropy  # Higher = more predictable
        }
        
        advanced_insights[category] = insights
    
    return advanced_insights

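def business_interpretation_framework(results, category_context=None):
    """
    Translate Fourier analysis results into business insights.

    Each entry of `results` is expected to also contain the 'primary_cycles' list
    built by interpret_fourier_results and the 'regularity_index' value built by
    advanced_fourier_interpretation. Merge those into the per-category dict before
    calling; otherwise the cycle and anomaly-detection branches are skipped.
    """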
    
    business_insights = {}
    
    for category, data in results.items():
        insights = {
            'business_patterns': [],
            'operational_implications': [],
            'forecasting_potential': 'low',
            'anomaly_detection_capability': 'low'
        }
        
        # Analyze dominant cycles for business meaning
        if 'primary_cycles' in data:
            for cycle in data['primary_cycles']:
                period = cycle['period']
                cycle_type = cycle['cycle_type']
                magnitude = cycle['magnitude']
                
                # Business pattern interpretation
                if cycle_type == 'weekly':
                    insights['business_patterns'].append({
                        'pattern': 'Weekly Business Cycle',
                        'description': f'{category} shows {magnitude:.2f} strength weekly pattern',
                        'implication': 'Staff scheduling and inventory planning opportunities'
                    })
                elif cycle_type == 'monthly':
                    insights['business_patterns'].append({
                        'pattern': 'Monthly Consumer Cycle',
                        'description': f'{category} follows monthly spending/behavior patterns',
                        'implication': 'Align marketing campaigns with monthly cycles'
                    })
                elif cycle_type == 'quarterly':
                    insights['business_patterns'].append({
                        'pattern': 'Quarterly Business Seasonality',
                        'description': f'{category} shows quarterly seasonal variation',
                        'implication': 'Strategic planning and budget allocation considerations'
                    })
                elif cycle_type == 'annual':
                    insights['business_patterns'].append({
                        'pattern': 'Annual Seasonal Pattern',
                        'description': f'{category} has strong yearly seasonality',
                        'implication': 'Long-term capacity planning and seasonal hiring'
                    })
        
        # Forecasting potential assessment
        explained_variance = data.get('explained_variance_ratio', 0)
        if explained_variance > 0.6:
            insights['forecasting_potential'] = 'high'
            insights['operational_implications'].append('High forecasting accuracy possible')
        elif explained_variance > 0.3:
            insights['forecasting_potential'] = 'medium'
            insights['operational_implications'].append('Moderate forecasting accuracy expected')
        else:
            insights['forecasting_potential'] = 'low'
            insights['operational_implications'].append('Limited forecasting reliability')
        
        # Anomaly detection capability
        regularity = data.get('regularity_index', 0)
        if regularity > 0.5:
            insights['anomaly_detection_capability'] = 'high'
            insights['operational_implications'].append('Strong baseline for anomaly detection')
        elif regularity > 0.2:
            insights['anomaly_detection_capability'] = 'medium'
            insights['operational_implications'].append('Moderate anomaly detection possible')
        
        business_insights[category] = insights
    
    return business_insights
```
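
The amplitude scaling used when reconstructing components depends on the window. The sketch below, with assumed values, shows that dividing the peak magnitude by N underestimates the amplitude of a Hann-windowed sinusoid by a factor of four, while `2 * |X| / sum(window)` recovers it. It uses the periodic Hann variant (`sym=False`) and a series length that is a whole number of cycles so the result is exact; the symmetric window used in the main function behaves almost identically for long series.

```python
import numpy as np
from scipy import signal

N = 700                      # 100 full weeks, so the cycle falls on an exact bin
t = np.arange(N)
true_amp, true_phase = 0.15, 0.4
y = true_amp * np.cos(2 * np.pi * t / 7 + true_phase)

window = signal.windows.hann(N, sym=False)   # periodic Hann window
Y = np.fft.rfft(y * window)
k = int(np.argmax(np.abs(Y)))

# Factor 2 restores the discarded negative frequencies; dividing by the
# window sum (instead of N) undoes the attenuation introduced by the window.
amp_est = 2 * np.abs(Y[k]) / window.sum()
naive_amp = np.abs(Y[k]) / N
phase_est = np.angle(Y[k])

print(f"period: {N / k:.1f} samples")
print(f"corrected amplitude: {amp_est:.4f}  (true {true_amp})")
print(f"|X|/N amplitude:     {naive_amp:.4f}")
print(f"phase: {phase_est:.3f}  (true {true_phase})")
```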

### **4. PRACTICAL EXAMPLES & CASE STUDIES**

#### **A. Example 1: Strong Weekly Pattern**

```python
def create_weekly_pattern_example():
    """
    Generate synthetic data with strong weekly seasonality
    """
    
    # Create 2 years of daily data
    dates = pd.date_range('2022-01-01', '2023-12-31', freq='D')
    n_days = len(dates)
    
    # Create strong weekly pattern: More males on weekdays, more females on weekends
    np.random.seed(42)
    
    data = []
    for i, date in enumerate(dates):
        day_of_week = date.dayofweek  # 0=Monday, 6=Sunday
        
        # Weekly pattern: Higher male proportion on weekdays
        if day_of_week < 5:  # Weekdays
            male_prob = 0.7 + 0.1 * np.sin(2 * np.pi * i / 7)  # Weekly sine wave
        else:  # Weekends
            male_prob = 0.3 + 0.1 * np.sin(2 * np.pi * i / 7)
        
        # Add some noise
        male_prob += np.random.normal(0, 0.05)
        male_prob = np.clip(male_prob, 0.1, 0.9)
        
        # Generate observations for this day
        n_obs = np.random.poisson(50)  # Average 50 observations per day
        genders = np.random.choice(['Male', 'Female'], size=n_obs, p=[male_prob, 1-male_prob])
        
        for gender in genders:
            data.append({'date': date, 'gender': gender})
    
    df = pd.DataFrame(data)
    
    # Apply Fourier analysis
    results, proportion_series = comprehensive_fourier_categorical_analysis(
        df, 'gender', 'date', sampling_period='D'
    )
    
    print("\n=== WEEKLY PATTERN EXAMPLE RESULTS ===")
    for category, result in results.items():
        print(f"\nCategory: {category}")
        print(f"Dominant periods: {result['dominant_periods']}")
        print(f"Expected: ~7 days (weekly pattern)")
        print(f"Explained variance: {result['explained_variance_ratio']:.3f}")
    
    return df, results, proportion_series

# Example 2: Monthly + Annual Pattern
def create_complex_seasonal_example():
    """
    Generate data with both monthly and annual seasonality
    """
    
    dates = pd.date_range('2020-01-01', '2023-12-31', freq='D')
    n_days = len(dates)
    
    data = []
    for i, date in enumerate(dates):
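        # Annual pattern: sine with a 365.25-day period
        # (with this phase it peaks about a quarter-year after the start date)
        annual_component = 0.1 * np.sin(2 * np.pi * i / 365.25)
        
        # Monthly pattern: sine with a 30.44-day period
        # (average month length; not aligned to calendar months)
        monthly_component = 0.05 * np.sin(2 * np.pi * i / 30.44)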
        
        # Base probability + seasonal components
        prob_A = 0.5 + annual_component + monthly_component
        prob_A = np.clip(prob_A, 0.1, 0.9)
        
        # Generate observations
        n_obs = np.random.poisson(30)
        categories = np.random.choice(['A', 'B'], size=n_obs, p=[prob_A, 1-prob_A])
        
        for cat in categories:
            data.append({'date': date, 'category': cat})
    
    df = pd.DataFrame(data)
    
    results, proportion_series = comprehensive_fourier_categorical_analysis(
        df, 'category', 'date', sampling_period='D'
    )
    
    print("\n=== COMPLEX SEASONAL EXAMPLE RESULTS ===")
    for category, result in results.items():
        periods = result['dominant_periods']
        print(f"\nCategory: {category}")
        print(f"Dominant periods: {periods}")
        print("Expected: ~30 days (monthly) and ~365 days (annual)")
        print(f"Explained variance: {result['explained_variance_ratio']:.3f}")
    
    return df, results, proportion_series

# Example 3: Trend + Seasonality
def create_trend_seasonal_example():
    """
    Generate data with both trend and seasonal components
    """
    
    dates = pd.date_range('2020-01-01', '2023-12-31', freq='D')
    n_days = len(dates)
    
    data = []
    for i, date in enumerate(dates):
        # Linear trend: Category A proportion increases over time
        trend_component = 0.1 * (i / n_days)  # 0 to 0.1 over the period
        
        # Seasonal component: Annual cycle
        seasonal_component = 0.15 * np.sin(2 * np.pi * i / 365.25)
        
        # Combine components
        prob_A = 0.4 + trend_component + seasonal_component
        prob_A = np.clip(prob_A, 0.1, 0.9)
        
        # Generate observations
        n_obs = np.random.poisson(40)
        categories = np.random.choice(['A', 'B'], size=n_obs, p=[prob_A, 1-prob_A])
        
        for cat in categories:
            data.append({'date': date, 'category': cat})
    
    df = pd.DataFrame(data)
    
    results, proportion_series = comprehensive_fourier_categorical_analysis(
        df, 'category', 'date', sampling_period='D'
    )
    
    print("\n=== TREND + SEASONAL EXAMPLE RESULTS ===")
    for category, result in results.items():
        periods = result['dominant_periods']
        print(f"\nCategory: {category}")
        print(f"Dominant periods: {periods}")
        print("Expected: ~365 days (annual) + low-frequency trend")
        print(f"Explained variance: {result['explained_variance_ratio']:.3f}")
    
    return df, results, proportion_series
```

#### **B. Real-World Application Scenarios**

```python
def retail_customer_analysis_example():
    """
    Example: Retail customer gender patterns analysis
    """
    
    # Simulate retail customer data with realistic patterns
    dates = pd.date_range('2022-01-01', '2023-12-31', freq='H')  # Hourly data
    
    data = []
    for i, date in enumerate(dates):
        hour = date.hour
        day_of_week = date.dayofweek
        month = date.month
        
        # Daily pattern: More females during daytime shopping hours
        if 9 <= hour <= 17:
            female_prob = 0.65
        elif 18 <= hour <= 21:  # Evening shopping
            female_prob = 0.55
        else:  # Early morning/late night
            female_prob = 0.45
        
        # Weekly pattern: More families (balanced) on weekends
        if day_of_week >= 5:  # Weekend
            female_prob = 0.52
        
        # Seasonal pattern: Holiday shopping (November-December)
        if month in [11, 12]:
            female_prob += 0.08  # More female shoppers during holidays
        
        # Add noise
        female_prob += np.random.normal(0, 0.03)
        female_prob = np.clip(female_prob, 0.3, 0.8)
        
        # Generate customers for this hour
        n_customers = max(0, np.random.poisson(20) - 10)  # Variable traffic
        if n_customers > 0:
            genders = np.random.choice(['Male', 'Female'], 
                                     size=n_customers, 
                                     p=[1-female_prob, female_prob])
            
            for gender in genders:
                data.append({'datetime': date, 'gender': gender})
    
    df = pd.DataFrame(data)
    
    # Analyze at different time scales
    print("=== RETAIL CUSTOMER ANALYSIS ===")
    
    # Daily patterns (hourly sampling)
    results_hourly, _ = comprehensive_fourier_categorical_analysis(
        df, 'gender', 'datetime', sampling_period='H'
    )
    
    # Weekly patterns (daily sampling)  
    results_daily, _ = comprehensive_fourier_categorical_analysis(
        df, 'gender', 'datetime', sampling_period='D'
    )
    
    print("\nHourly Analysis (Daily Patterns):")
    for category, result in results_hourly.items():
        periods = result['dominant_periods']
        daily_periods = [p for p in periods if 20 <= p <= 28]  # ~24 hour periods
        if daily_periods:
            print(f"  {category}: Daily pattern detected at {daily_periods[0]:.1f} hours")
    
    print("\nDaily Analysis (Weekly/Seasonal Patterns):")
    for category, result in results_daily.items():
        periods = result['dominant_periods']
        weekly_periods = [p for p in periods if 6 <= p <= 8]  # ~7 day periods
        seasonal_periods = [p for p in periods if 360 <= p <= 370]  # ~365 day periods
        
        if weekly_periods:
            print(f"  {category}: Weekly pattern at {weekly_periods[0]:.1f} days")
        if seasonal_periods:
            print(f"  {category}: Annual pattern at {seasonal_periods[0]:.1f} days")
    
    return df, results_hourly, results_daily
```
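
The period filters in this example (for instance 360 to 370 days) can only match if a DFT bin actually falls inside that range. Bin k of an N-sample series corresponds to a period of N / k samples, so long cycles are resolved very coarsely. The helper below is a hypothetical illustration of that spacing, not part of the analysis functions above.

```python
import numpy as np

def resolvable_periods(n_samples, max_bins=6):
    """Periods (in samples) of the lowest positive-frequency DFT bins: N / k."""
    k = np.arange(1, max_bins + 1)
    return n_samples / k

# About two years of daily data: the annual cycle sits on bin k = 2,
# and its nearest neighbours are 730 and ~243 days.
periods_2y = resolvable_periods(730)
print(np.round(periods_2y, 1))
```

With only two years of data, an annual cycle is represented by a single bin, so a longer history or an explicit seasonal model may be needed to characterise it reliably.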

### **5. COMPREHENSIVE VISUALIZATION FRAMEWORK**

```python
def create_fourier_visualization_suite(results, proportion_series, category_name):
    """
    Create comprehensive visualization suite for Fourier analysis results
    """
    
    fig = plt.figure(figsize=(20, 16))
    
    # Get data for the specified category
    data = results[category_name]
    
    # 1. TIME SERIES PLOT
    ax1 = plt.subplot(3, 3, 1)
    plt.plot(proportion_series.index, proportion_series[category_name], 'b-', alpha=0.7, linewidth=1)
    plt.title(f'Time Series: {category_name} Proportions', fontsize=12, fontweight='bold')
    plt.xlabel('Time')
    plt.ylabel('Proportion')
    plt.grid(True, alpha=0.3)
    
    # 2. DETRENDED TIME SERIES
    ax2 = plt.subplot(3, 3, 2)
    plt.plot(data['detrended_series'], 'g-', alpha=0.7, linewidth=1)
    plt.title(f'Detrended Series: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Time Index')
    plt.ylabel('Detrended Proportion')
    plt.grid(True, alpha=0.3)
    
    # 3. MAGNITUDE SPECTRUM
    ax3 = plt.subplot(3, 3, 3)
    plt.semilogy(data['frequencies'], data['magnitude'], 'r-', linewidth=2)
    
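    # Highlight dominant frequencies (peak_indices is ordered by frequency, so sort by magnitude first)
    peaks = data['peak_indices']
    top_peaks = peaks[np.argsort(data['magnitude'][peaks])[::-1][:5]]
    for i, peak_idx in enumerate(top_peaks):  # Top 5 peaks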
        plt.semilogy(data['frequencies'][peak_idx], data['magnitude'][peak_idx], 
                    'ro', markersize=8, label=f'Peak {i+1}' if i < 3 else '')
    
    plt.title(f'Magnitude Spectrum: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Frequency (cycles/period)')
    plt.ylabel('Magnitude (log scale)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # 4. POWER SPECTRUM
    ax4 = plt.subplot(3, 3, 4)
    plt.loglog(data['frequencies'], data['power'], 'purple', linewidth=2)
    plt.title(f'Power Spectrum: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Frequency (cycles/period)')
    plt.ylabel('Power (log scale)')
    plt.grid(True, alpha=0.3)
    
    # 5. PERIOD SPECTRUM (more intuitive)
    ax5 = plt.subplot(3, 3, 5)
    periods = 1 / data['frequencies'][data['frequencies'] > 0]
    power_periods = data['power'][data['frequencies'] > 0]
    
    # Filter reasonable periods (avoid very short or very long)
    valid_mask = (periods >= 2) & (periods <= len(proportion_series) / 2)
    periods_filtered = periods[valid_mask]
    power_filtered = power_periods[valid_mask]
    
    plt.semilogx(periods_filtered, power_filtered, 'orange', linewidth=2)
    plt.title(f'Period Spectrum: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Period (time units)')
    plt.ylabel('Power')
    plt.grid(True, alpha=0.3)
    
    # Add period labels for common cycles
    common_periods = [7, 30, 91, 365]  # Weekly, monthly, quarterly, annual
    for period in common_periods:
        if period <= max(periods_filtered):
            plt.axvline(x=period, color='red', linestyle='--', alpha=0.5)
            plt.text(period, max(power_filtered)*0.8, f'{period}d', 
                    rotation=90, verticalalignment='bottom')
    
    # 6. PHASE SPECTRUM
    ax6 = plt.subplot(3, 3, 6)
    plt.plot(data['frequencies'], np.degrees(data['phase']), 'cyan', linewidth=1)
    plt.title(f'Phase Spectrum: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Frequency (cycles/period)')
    plt.ylabel('Phase (degrees)')
    plt.grid(True, alpha=0.3)
    
    # 7. RECONSTRUCTED SIGNAL COMPARISON
    ax7 = plt.subplot(3, 3, 7)
    
    # Reconstruct signal from the three strongest peaks
    n_samples = len(data['detrended_series'])
    reconstructed = np.zeros(n_samples)
    t = np.arange(n_samples)
    
    # Amplitude scaling accounts for the Hann window and the one-sided spectrum
    window_sum = np.sum(signal.windows.hann(n_samples))
    peaks = data['peak_indices']
    strongest = peaks[np.argsort(data['magnitude'][peaks])[::-1][:3]]
    
    for peak_idx in strongest:  # Top 3 components by magnitude
        freq = data['frequencies'][peak_idx]
        amplitude = 2 * data['magnitude'][peak_idx] / window_sum
        phase = data['phase'][peak_idx]
        component = amplitude * np.cos(2 * np.pi * freq * t + phase)
        reconstructed += component
    
    plt.plot(data['detrended_series'], 'b-', alpha=0.7, label='Original', linewidth=1)
    plt.plot(reconstructed, 'r--', alpha=0.8, label='Reconstructed (Top 3)', linewidth=2)
    plt.title(f'Signal Reconstruction: {category_name}', fontsize=12, fontweight='bold')
    plt.xlabel('Time Index')
    plt.ylabel('Amplitude')
    plt.legend()
    plt.grid(True, alpha=0.3)
    
    # 8. DOMINANT CYCLES BAR CHART
    ax8 = plt.subplot(3, 3, 8)
    
    if len(data['dominant_periods']) > 0:
        # Sort by magnitude
        sorted_indices = np.argsort(data['dominant_magnitudes'])[::-1]
        top_periods = data['dominant_periods'][sorted_indices][:5]
        top_magnitudes = data['dominant_magnitudes'][sorted_indices][:5]
        
        bars = plt.bar(range(len(top_periods)), top_magnitudes, 
                      color=['red', 'orange', 'yellow', 'green', 'blue'][:len(top_periods)])
        
        # Add period labels
        labels = [f'{period:.1f}' for period in top_periods]
        plt.xticks(range(len(top_periods)), labels)
        
        plt.title(f'Top Dominant Cycles: {category_name}', fontsize=12, fontweight='bold')
        plt.xlabel('Period (time units)')
        plt.ylabel('Magnitude')
        plt.grid(True, alpha=0.3, axis='y')
        
        # Add value labels on bars
        for i, (bar, mag) in enumerate(zip(bars, top_magnitudes)):
            plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(top_magnitudes)*0.01,
                    f'{mag:.3f}', ha='center', va='bottom', fontsize=10)
    
    # 9. SUMMARY STATISTICS
    ax9 = plt.subplot(3, 3, 9)
    ax9.axis('off')
    
    # Create summary text
    summary_text = f"""
    FOURIER ANALYSIS SUMMARY
    Category: {category_name}
    
    Key Metrics:
    • Explained Variance: {data['explained_variance_ratio']:.3f}
    • Total Variance: {data['total_variance']:.6f}
    • Dominant Cycles: {len(data['dominant_periods'])}
    
    Top 3 Periods:
    """
    
    if len(data['dominant_periods']) > 0:
        sorted_indices = np.argsort(data['dominant_magnitudes'])[::-1]
        for i in range(min(3, len(sorted_indices))):
            idx = sorted_indices[i]
            period = data['dominant_periods'][idx]
            magnitude = data['dominant_magnitudes'][idx]
            summary_text += f"    {i+1}. {period:.1f} units (mag: {magnitude:.3f})\n"
    
    # Interpretation
    explained_var = data['explained_variance_ratio']
    if explained_var > 0.7:
        interpretation = "Very Strong Seasonality"
    elif explained_var > 0.5:
        interpretation = "Strong Seasonality"
    elif explained_var > 0.3:
        interpretation = "Moderate Seasonality"
    elif explained_var > 0.1:
        interpretation = "Weak Seasonality"
    else:
        interpretation = "Very Weak/No Seasonality"
    
    summary_text += f"\n    Interpretation: {interpretation}"
    
    plt.text(0.05, 0.95, summary_text, transform=ax9.transAxes, fontsize=11,
             verticalalignment='top', fontfamily='monospace',
             bbox=dict(boxstyle='round', facecolor='lightgray', alpha=0.8))
    
    plt.tight_layout()
    plt.suptitle(f'Comprehensive Fourier Analysis: {category_name}', 
                 fontsize=16, fontweight='bold', y=0.98)
    
    return fig

def create_comparative_analysis_plot(results):
    """
    Create comparative analysis across all categories
    """
    
    categories = list(results.keys())
    n_categories = len(categories)
    
    fig, axes = plt.subplots(2, 2, figsize=(16, 12))
    
    # 1. Explained Variance Comparison
    ax1 = axes[0, 0]
    explained_vars = [results[cat]['explained_variance_ratio'] for cat in categories]
    bars1 = ax1.bar(categories, explained_vars, color=['skyblue', 'lightcoral', 'lightgreen', 'gold'][:n_categories])
    ax1.set_title('Seasonality Strength Comparison', fontsize=14, fontweight='bold')
    ax1.set_ylabel('Explained Variance Ratio')
    ax1.grid(True, alpha=0.3, axis='y')
    
    # Add value labels
    for bar, val in zip(bars1, explained_vars):
        ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
                f'{val:.3f}', ha='center', va='bottom', fontweight='bold')
    
    # 2. Dominant Period Comparison
    ax2 = axes[0, 1]
    
    # Get primary period for each category
    primary_periods = []
    for cat in categories:
        if len(results[cat]['dominant_periods']) > 0:
            # Get period with highest magnitude
            max_idx = np.argmax(results[cat]['dominant_magnitudes'])
            primary_periods.append(results[cat]['dominant_periods'][max_idx])
        else:
            primary_periods.append(0)
    
    bars2 = ax2.bar(categories, primary_periods, color=['skyblue', 'lightcoral', 'lightgreen', 'gold'][:n_categories])
    ax2.set_title('Primary Cycle Period Comparison', fontsize=14, fontweight='bold')
    ax2.set_ylabel('Period (time units)')
    ax2.grid(True, alpha=0.3, axis='y')
    
    # Add value labels
    for bar, val in zip(bars2, primary_periods):
        if val > 0:
            ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(primary_periods)*0.01,
                    f'{val:.1f}', ha='center', va='bottom', fontweight='bold')
    
    # 3. Power Spectrum Overlay
    ax3 = axes[1, 0]
    colors = ['blue', 'red', 'green', 'orange']
    
    for i, cat in enumerate(categories):
        data = results[cat]
        # Normalize power for comparison
        normalized_power = data['power'] / np.max(data['power'])
        ax3.loglog(data['frequencies'], normalized_power, 
                  color=colors[i % len(colors)], label=cat, linewidth=2, alpha=0.7)
    
    ax3.set_title('Power Spectrum Comparison', fontsize=14, fontweight='bold')
    ax3.set_xlabel('Frequency (cycles/period)')
    ax3.set_ylabel('Normalized Power')
    ax3.legend()
    ax3.grid(True, alpha=0.3)
    
    # 4. Cycle Distribution Heatmap
    ax4 = axes[1, 1]
    
    # Create matrix of cycle strengths
    common_periods = [2, 3, 7, 14, 30, 91, 182, 365]  # Common business cycles
    cycle_matrix = np.zeros((len(categories), len(common_periods)))
    
    for i, cat in enumerate(categories):
        data = results[cat]
        for j, target_period in enumerate(common_periods):
            # Find closest detected period
            if len(data['dominant_periods']) > 0:
                period_diffs = np.abs(data['dominant_periods'] - target_period)
                closest_idx = np.argmin(period_diffs)
                
                # If close enough (within 20%), record the magnitude
                if period_diffs[closest_idx] / target_period < 0.2:
                    cycle_matrix[i, j] = data['dominant_magnitudes'][closest_idx]
    
    im = ax4.imshow(cycle_matrix, cmap='YlOrRd', aspect='auto')
    ax4.set_title('Cycle Strength Heatmap', fontsize=14, fontweight='bold')
    ax4.set_xlabel('Period Type')
    ax4.set_ylabel('Category')
    ax4.set_xticks(range(len(common_periods)))
    ax4.set_xticklabels([f'{p}d' for p in common_periods], rotation=45)
    ax4.set_yticks(range(len(categories)))
    ax4.set_yticklabels(categories)
    
    # Add colorbar
    plt.colorbar(im, ax=ax4, label='Magnitude')
    
    # Add text annotations
    for i in range(len(categories)):
        for j in range(len(common_periods)):
            if cycle_matrix[i, j] > 0:
                ax4.text(j, i, f'{cycle_matrix[i, j]:.2f}', 
                        ha='center', va='center', fontweight='bold', 
                        color='white' if cycle_matrix[i, j] > np.max(cycle_matrix)*0.5 else 'black')
    
    plt.tight_layout()
    plt.suptitle('Comparative Fourier Analysis Results', fontsize=16, fontweight='bold', y=0.98)
    
    return fig
```

## **6. KEY INSIGHTS FOURIER ANALYSIS PROVIDES**

### **A. Seasonality Detection & Quantification**

1. **Cycle Identification**: Automatically detects dominant cycles (weekly, monthly, annual)
2. **Strength Measurement**: Quantifies how strong each seasonal pattern is
3. **Multiple Seasonality**: Can detect overlapping seasonal patterns simultaneously
4. **Phase Information**: Reveals when in the cycle peaks/troughs occur
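
As a rough end-to-end illustration of cycle identification, the sketch below builds synthetic categorical data with a weekly rhythm, converts it to a daily proportion series, and reads the dominant period off the spectrum. The data, the 140-day window, and all variable names are assumptions made for this example; they do not come from the notebook.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable

# Synthetic data (assumption): 20 events per day for 140 days, where the
# probability of category "A" follows a 7-day cycle.
days = np.repeat(np.arange(140), 20)
p_a = 0.5 + 0.3 * np.cos(2 * np.pi * days / 7)
categories = np.where(rng.random(days.size) < p_a, "A", "B")

# Categorical -> proportion of "A" per day
counts_a = np.bincount(days, weights=(categories == "A").astype(float))
counts_all = np.bincount(days)
prop_a = counts_a / counts_all

# Remove the mean so the DC bin does not dominate, then take the one-sided spectrum
spectrum = np.abs(np.fft.rfft(prop_a - prop_a.mean()))
freqs = np.fft.rfftfreq(prop_a.size, d=1.0)  # cycles per day

k_peak = int(np.argmax(spectrum[1:]) + 1)    # skip the DC bin
dominant_period = 1.0 / freqs[k_peak]
print(f"Dominant period: {dominant_period:.1f} days")
```

The window length is a multiple of 7 here, so the weekly cycle lands in a single bin; with real data the peak is usually spread over a few neighbouring bins.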

### **B. Trend Analysis Capabilities**

1. **Low-Frequency Components**: DC and near-DC components reveal long-term trends
2. **Trend vs Seasonality**: Separates trend from cyclical components
3. **Trend Strength**: Quantifies how much variation is due to trend vs seasonality

### **C. Predictability Assessment**

1. **Signal-to-Noise Ratio**: Distinguishes predictable patterns from random noise
2. **Forecasting Potential**: High explained variance indicates good forecasting potential
3. **Regularity Metrics**: Measures how consistent the patterns are over time

### **D. Business Intelligence**

1. **Operational Planning**: Identifies cycles relevant for staffing, inventory, marketing
2. **Anomaly Detection**: Establishes baseline patterns for detecting unusual behavior
3. **Resource Allocation**: Informs when to scale resources up/down based on cycles

## **7. MATHEMATICAL ADVANTAGES OVER CHI-SQUARED**

| **Aspect** | **Chi-Squared Test** | **Fourier Analysis** |
|------------|---------------------|---------------------|
| **Pattern Detection** | Tests for any dependence | Identifies specific cycles |
| **Multiple Cycles** | Cannot separate | Resolves cycles at distinct frequencies simultaneously |
| **Strength Quantification** | Test statistic and p-value, no per-cycle strength | Continuous strength per frequency |
| **Phase Information** | None | Reveals timing of cycles |
| **Trend Detection** | Limited | Slow trends show up at low frequencies |
| **Noise Handling** | No explicit noise model | Narrow-band cycles stand out above broadband noise |
| **Forecasting** | Not applicable | Enables prediction |

## **SUMMARY: Complete Fourier Framework**

**✅ Mathematical Foundation**: DFT transforms categorical proportions to frequency domain

**✅ Implementation**: Comprehensive analysis with proper preprocessing and windowing  

**✅ Interpretation**: Multi-level framework from spectral metrics to business insights

**✅ Examples**: Practical cases showing different seasonal patterns

**✅ Visualization**: Complete suite for presenting results effectively

**Key Advantage**: Fourier analysis does more than detect seasonality. It **quantifies and characterizes** cyclical patterns in categorical data and, when those patterns are stable, supports **prediction**, which a single test of independence cannot do.

I'll provide a deep mathematical explanation of why Fourier analysis is so powerful at detecting these properties. Let me break down each mathematical property and show exactly how the formulae enable these capabilities.

## **Mathematical Properties of Fourier Analysis: Deep Dive into Why It Works**

### **1. PATTERN DETECTION: The Correlation Principle**

#### **A. Mathematical Foundation**

The DFT formula: **X[k] = Σ(n=0 to N-1) x[n] × e^(-i2πkn/N)**

**Why this detects patterns:**

The exponential term **e^(-i2πkn/N)** is a **complex sinusoid** at frequency k. The DFT essentially computes the **correlation** between your data and sinusoids of all possible frequencies.

```python
def explain_pattern_detection_mathematics():
    """
    Mathematical explanation of why DFT detects patterns
    """
    
    # The DFT is mathematically equivalent to:
    # X[k] = Σ x[n] × [cos(2πkn/N) - i×sin(2πkn/N)]
    
    # This can be rewritten as two separate correlations:
    # Real part: Σ x[n] × cos(2πkn/N)  [correlation with cosine]
    # Imag part: Σ x[n] × sin(2πkn/N)   [correlation with sine]
    
    N = 100
    n = np.arange(N)
    
    # Example: Data with hidden 10-period cycle
    true_signal = np.cos(2 * np.pi * n / 10)  # 10-period cycle
    noise = np.random.normal(0, 0.5, N)
    data = true_signal + noise
    
    # Manual correlation calculation for frequency k=10 (10-period cycle)
    k = 10
    test_cosine = np.cos(2 * np.pi * k * n / N)
    test_sine = np.sin(2 * np.pi * k * n / N)
    
    # Correlation coefficients
    real_correlation = np.sum(data * test_cosine)
    imag_correlation = -np.sum(data * test_sine)  # Note the minus sign in DFT
    
    # Magnitude (strength of correlation)
    magnitude = np.sqrt(real_correlation**2 + imag_correlation**2)
    
    print("=== PATTERN DETECTION MATHEMATICS ===")
    print(f"Data contains 10-period cycle")
    print(f"Correlation with 10-period cosine: {real_correlation:.3f}")
    print(f"Correlation with 10-period sine: {imag_correlation:.3f}")
    print(f"Combined magnitude: {magnitude:.3f}")
    
    # Compare with other frequency bins (bin k corresponds to period N/k)
    print("\nCorrelation with other frequencies:")
    for test_k in [5, 15, 20]:
        test_cos = np.cos(2 * np.pi * test_k * n / N)
        correlation = np.sum(data * test_cos)
        print(f"  k={test_k} (period {N / test_k:.1f}): {correlation:.3f}")
    
    return {
        'data': data,
        'true_frequency': 10,
        'correlations': {
            'real': real_correlation,
            'imag': imag_correlation,
            'magnitude': magnitude
        }
    }

# Mathematical insight: High correlation = Strong pattern match
result = explain_pattern_detection_mathematics()
```

**Key Mathematical Insight:**
- **High magnitude |X[k]|** means the data **strongly correlates** with a sinusoid of frequency k
- **Low magnitude** means **weak or no correlation** with that frequency
- This is why peaks in the magnitude spectrum indicate **dominant patterns**
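
The correlation view can be checked numerically. The sketch below, which assumes only NumPy and uses a hypothetical helper name `dft_bin_by_correlation`, computes one DFT bin as explicit cosine and sine sums and compares it with the value returned by `np.fft.fft`.

```python
import numpy as np

def dft_bin_by_correlation(x, k):
    """Compute X[k] as explicit cosine and sine sums (hypothetical helper)."""
    N = len(x)
    n = np.arange(N)
    real_part = np.sum(x * np.cos(2 * np.pi * k * n / N))
    imag_part = -np.sum(x * np.sin(2 * np.pi * k * n / N))  # minus sign from e^(-i...)
    return complex(real_part, imag_part)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
N = 100
n = np.arange(N)
x = np.cos(2 * np.pi * n / 10) + rng.normal(0, 0.5, N)  # 10-period cycle plus noise

X = np.fft.fft(x)
k_peak = int(np.argmax(np.abs(X[1:N // 2])) + 1)  # skip DC, positive bins only
manual = dft_bin_by_correlation(x, k_peak)

print(f"Peak bin: k={k_peak}, period={N / k_peak:.1f}")
print(f"Manual sum matches FFT: {np.isclose(manual, X[k_peak])}")
```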

### **2. MULTIPLE CYCLES: Orthogonality and Linear Superposition**

#### **A. Mathematical Foundation: Orthogonality**

**Why DFT can separate multiple cycles simultaneously:**

The complex exponentials **e^(-i2πkn/N)** for different values of k are **orthogonal**:

**Σ(n=0 to N-1) e^(-i2πk₁n/N) × e^(i2πk₂n/N) = N if k₁=k₂, 0 if k₁≠k₂**

```python
def demonstrate_orthogonality_principle():
    """
    Mathematical demonstration of why orthogonality enables multiple cycle detection
    """
    
    N = 64
    n = np.arange(N)
    
    # Create data with multiple cycles
    cycle1 = 2 * np.cos(2 * np.pi * n / 8)    # 8-period cycle, amplitude 2
    cycle2 = 1.5 * np.cos(2 * np.pi * n / 16)  # 16-period cycle, amplitude 1.5
    cycle3 = 1 * np.cos(2 * np.pi * n / 32)    # 32-period cycle, amplitude 1
    
    # Combined signal
    combined_signal = cycle1 + cycle2 + cycle3
    
    print("=== ORTHOGONALITY AND MULTIPLE CYCLES ===")
    print("Signal contains:")
    print("  - 8-period cycle (amplitude 2)")
    print("  - 16-period cycle (amplitude 1.5)")  
    print("  - 32-period cycle (amplitude 1)")
    
    # Manual DFT calculation to show orthogonality
    def manual_dft_component(data, k, N):
        """Calculate single DFT component manually"""
        n = np.arange(N)
        complex_exponential = np.exp(-1j * 2 * np.pi * k * n / N)
        return np.sum(data * complex_exponential)
    
    # Periods present in the signal, with the amplitude of each cycle
    test_periods = [8, 16, 32]
    amplitudes = [2, 1.5, 1]
    test_k_values = [N // period for period in test_periods]  # Convert periods to bin indices (k = N / period)
    
    print(f"\nDFT Analysis (N={N}):")
    for period, k, amp in zip(test_periods, test_k_values, amplitudes):
        dft_value = manual_dft_component(combined_signal, k, N)
        magnitude = abs(dft_value)
        expected_magnitude = N * amp / 2  # For a cosine of amplitude A: |X[k]| = N*A/2
        
        print(f"  {period}-period cycle:")
        print(f"    DFT[{k}] = {dft_value:.3f}")
        print(f"    Magnitude = {magnitude:.3f}")
        print(f"    Expected ≈ {expected_magnitude:.1f}")
    
    # Show orthogonality in action
    print(f"\nOrthogonality Verification:")
    
    # Create basis functions
    basis_8 = np.exp(-1j * 2 * np.pi * (N//8) * n / N)
    basis_16 = np.exp(-1j * 2 * np.pi * (N//16) * n / N)
    basis_32 = np.exp(-1j * 2 * np.pi * (N//32) * n / N)
    
    # Cross-correlations (should be ~0 due to orthogonality)
    cross_8_16 = abs(np.sum(basis_8 * np.conj(basis_16)))
    cross_8_32 = abs(np.sum(basis_8 * np.conj(basis_32)))
    cross_16_32 = abs(np.sum(basis_16 * np.conj(basis_32)))
    
    print(f"  |⟨basis_8, basis_16⟩| = {cross_8_16:.6f} (should be ~0)")
    print(f"  |⟨basis_8, basis_32⟩| = {cross_8_32:.6f} (should be ~0)")
    print(f"  |⟨basis_16, basis_32⟩| = {cross_16_32:.6f} (should be ~0)")
    
    # Auto-correlations (should be N)
    auto_8 = abs(np.sum(basis_8 * np.conj(basis_8)))
    print(f"  |⟨basis_8, basis_8⟩| = {auto_8:.1f} (should be {N})")
    
    return {
        'combined_signal': combined_signal,
        'individual_cycles': [cycle1, cycle2, cycle3],
        'orthogonality_verified': max(cross_8_16, cross_8_32, cross_16_32) < 1e-10
    }

result = demonstrate_orthogonality_principle()
```

#### **B. Linear Superposition Property**

**Mathematical Property:** If **x[n] = s₁[n] + s₂[n] + s₃[n]**, then:
**DFT{x[n]} = DFT{s₁[n]} + DFT{s₂[n]} + DFT{s₃[n]}**

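This **linearity** combined with **orthogonality** means that, when each cycle completes a whole number of periods within the N samples:
- Each cycle contributes **independently** to its own frequency bin
- There is no **interference** between different cycles
- Overlapping patterns separate **exactly**

When a period does not divide N evenly, its energy spreads into neighbouring bins (**spectral leakage**), and the separation is only approximate.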
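
A short numerical check of both properties, written as a sketch with NumPy only. The helper `occupied_bins` and its tolerance are assumptions chosen for this illustration. The last case uses a period that does not fit the window a whole number of times, to show where the clean separation stops holding.

```python
import numpy as np

N = 64
n = np.arange(N)

s1 = 2.0 * np.cos(2 * np.pi * n / 8)    # 8 full cycles in the window
s2 = 1.5 * np.cos(2 * np.pi * n / 16)   # 4 full cycles in the window

# Linearity: the DFT of the sum equals the sum of the DFTs
linear_ok = np.allclose(np.fft.fft(s1 + s2), np.fft.fft(s1) + np.fft.fft(s2))

def occupied_bins(x, tol=1e-6):
    """Count positive-frequency bins with non-negligible magnitude."""
    mag = np.abs(np.fft.rfft(x))
    return int(np.sum(mag > tol * mag.max()))

# Whole number of cycles: each cosine occupies exactly one bin
clean_bins = occupied_bins(s1 + s2)

# A 10-sample period gives 6.4 cycles in 64 samples, so energy leaks into other bins
leaky = np.cos(2 * np.pi * n / 10)
leaky_bins = occupied_bins(leaky)

print(f"Linearity holds: {linear_ok}")
print(f"Bins used by the two integer-cycle cosines: {clean_bins}")
print(f"Bins used by the single 10-period cosine: {leaky_bins}")
```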

### **3. STRENGTH QUANTIFICATION: Magnitude and Parseval's Theorem**

#### **A. Mathematical Foundation: Magnitude as Correlation Strength**

**Magnitude Formula:** **|X[k]| = √(Real[X[k]]² + Imag[X[k]]²)**

**Why this quantifies strength:**

```python
def explain_strength_quantification():
    """
    Mathematical explanation of how magnitude quantifies pattern strength
    """
    
    N = 100
    n = np.arange(N)
    
    # Create signals with different amplitudes (strengths)
    weak_signal = 0.5 * np.cos(2 * np.pi * n / 10)      # Amplitude 0.5
    medium_signal = 1.0 * np.cos(2 * np.pi * n / 10)    # Amplitude 1.0  
    strong_signal = 2.0 * np.cos(2 * np.pi * n / 10)    # Amplitude 2.0
    
    signals = {
        'weak': weak_signal,
        'medium': medium_signal, 
        'strong': strong_signal
    }
    
    print("=== STRENGTH QUANTIFICATION MATHEMATICS ===")
    print("Signals with same frequency (10-period) but different amplitudes:")
    
    for name, signal in signals.items():
        # Calculate DFT manually for the dominant frequency
        k = N // 10  # Frequency bin for 10-period cycle
        
        # DFT calculation
        dft_value = np.sum(signal * np.exp(-1j * 2 * np.pi * k * n / N))
        magnitude = abs(dft_value)
        
        # Theoretical relationship: For cosine with amplitude A
        # |X[k]| = N * A / 2
        theoretical_magnitude = N * np.max(signal) / 2
        
        print(f"\n{name.upper()} signal (amplitude {np.max(signal)}):")
        print(f"  DFT magnitude: {magnitude:.3f}")
        print(f"  Theoretical: {theoretical_magnitude:.3f}")
        print(f"  Ratio: {magnitude/theoretical_magnitude:.6f}")
    
    return signals

# Mathematical relationship: |X[k]| ∝ Amplitude of cycle at frequency k
explain_strength_quantification()
```

#### **B. Parseval's Theorem: Energy Conservation**

**Parseval's Theorem:** **Σ|x[n]|² = (1/N) × Σ|X[k]|²**

**Why this enables strength comparison:**

```python
def demonstrate_parsevals_theorem():
    """
    Show how Parseval's theorem enables relative strength measurement
    """
    
    N = 64
    n = np.arange(N)
    
    # Create signal with multiple components
    component1 = 3 * np.cos(2 * np.pi * n / 8)   # Strong 8-period
    component2 = 1 * np.cos(2 * np.pi * n / 16)  # Weak 16-period
    component3 = 2 * np.cos(2 * np.pi * n / 32)  # Medium 32-period
    
    signal = component1 + component2 + component3
    
    # Calculate DFT
    X = np.fft.fft(signal)
    frequencies = np.fft.fftfreq(N)
    
    # Time domain energy
    time_energy = np.sum(np.abs(signal)**2)
    
    # Frequency domain energy
    freq_energy = np.sum(np.abs(X)**2) / N
    
    print("=== PARSEVAL'S THEOREM: ENERGY CONSERVATION ===")
    print(f"Time domain energy: {time_energy:.3f}")
    print(f"Frequency domain energy: {freq_energy:.3f}")
    print(f"Ratio (should be 1.0): {freq_energy/time_energy:.6f}")
    
    # Energy distribution analysis
    power_spectrum = np.abs(X)**2
    total_power = np.sum(power_spectrum)
    
    # Find peaks corresponding to our components
    peak_indices = [N//8, N//16, N//32]  # Expected frequency bins
    
    print(f"\nEnergy Distribution:")
    for period, k in zip([8, 16, 32], peak_indices):
        # A real cosine splits its energy between bins k and N-k, so count both
        component_power = power_spectrum[k] + power_spectrum[N - k]
        relative_power = component_power / total_power
        
        print(f"  {period}-period cycle:")
        print(f"    Power: {component_power:.1f}")
        print(f"    Relative: {relative_power:.3f} ({relative_power*100:.1f}%)")
    
    # This shows how DFT magnitude directly relates to signal energy/strength
    return {
        'signal': signal,
        'power_spectrum': power_spectrum,
        'energy_conservation_verified': abs(freq_energy/time_energy - 1.0) < 1e-10
    }

demonstrate_parsevals_theorem()
```

**Mathematical Insight:**
- **|X[k]|²** represents the **energy** at frequency k
- **Relative magnitudes** show **relative importance** of different cycles
- **Total energy is conserved** between time and frequency domains
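
One way to turn these energies into comparable strengths is to express each bin as a share of the total non-DC energy. The helper below is a sketch under that assumption; the name `cycle_energy_shares` is hypothetical and the function is not the notebook's explained-variance metric. It uses the one-sided spectrum and doubles interior bins, because a real series splits each cycle's energy between bins k and N-k.

```python
import numpy as np

def cycle_energy_shares(x):
    """Share of non-DC energy in each positive-frequency bin (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    weights = np.full(power.shape, 2.0)   # interior bins have a mirror at N-k
    weights[0] = 1.0                      # DC bin has no mirror
    if N % 2 == 0:
        weights[-1] = 1.0                 # Nyquist bin has no mirror either
    one_sided = weights * power
    return one_sided / one_sided.sum()

N = 64
n = np.arange(N)
x = (3 * np.cos(2 * np.pi * n / 8)
     + 1 * np.cos(2 * np.pi * n / 16)
     + 2 * np.cos(2 * np.pi * n / 32))

shares = cycle_energy_shares(x)
for period in (8, 16, 32):
    print(f"{period}-period cycle: {shares[N // period]:.3f} of the variance")
```

For amplitudes 3, 1 and 2 the shares follow the squared amplitudes, 9 : 1 : 4. A constant series has no non-DC energy, so the helper would divide by zero in that case.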

### **4. PHASE ANALYSIS: Timing and Synchronization**

#### **A. Mathematical Foundation: Complex Representation**

**Phase Formula:** **φ[k] = arctan(Imag[X[k]] / Real[X[k]])**, evaluated quadrant-aware (atan2), which is what `np.angle` returns

**Why phase reveals timing:**

The DFT represents each frequency component as: **A[k] × e^(iφ[k])** where:
- **A[k] = |X[k]|** is the amplitude (strength)
- **φ[k]** is the phase (timing offset)

```python
def explain_phase_mathematics():
    """
    Mathematical explanation of how phase reveals timing information
    """
    
    N = 100
    n = np.arange(N)
    
    # Create identical cycles with different phase shifts
    reference_cycle = np.cos(2 * np.pi * n / 20)                    # Phase = 0
    shifted_90 = np.cos(2 * np.pi * n / 20 + np.pi/2)             # Phase = π/2 (90°)
    shifted_180 = np.cos(2 * np.pi * n / 20 + np.pi)              # Phase = π (180°)
    shifted_270 = np.cos(2 * np.pi * n / 20 + 3*np.pi/2)          # Phase = 3π/2 (270°)
    
    cycles = {
        'reference (0°)': (reference_cycle, 0),
        'shifted 90°': (shifted_90, np.pi/2),
        'shifted 180°': (shifted_180, np.pi),
        'shifted 270°': (shifted_270, 3*np.pi/2)
    }
    
    print("=== PHASE MATHEMATICS: TIMING DETECTION ===")
    print("Identical 20-period cycles with different phase shifts:")
    
    for name, (signal, expected_phase) in cycles.items():
        # Calculate DFT for the dominant frequency
        k = N // 20  # Frequency bin for 20-period cycle
        
        # Manual DFT calculation
        dft_value = np.sum(signal * np.exp(-1j * 2 * np.pi * k * n / N))
        
        magnitude = abs(dft_value)
        phase = np.angle(dft_value)
        
        # Convert to degrees for readability
        phase_degrees = np.degrees(phase)
        expected_degrees = np.degrees(expected_phase)
        
        print(f"\n{name}:")
        print(f"  DFT value: {dft_value:.3f}")
        print(f"  Magnitude: {magnitude:.3f}")
        print(f"  Phase: {phase:.3f} rad ({phase_degrees:.1f}°)")
        print(f"  Expected: {expected_phase:.3f} rad ({expected_degrees:.1f}°)")
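        # Wrap the difference into (-π, π] so that 270° and -90° compare as equal
        phase_error = abs(np.angle(np.exp(1j * (phase - expected_phase))))
        print(f"  Error: {phase_error:.6f} rad")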
    
    return cycles

explain_phase_mathematics()
```

#### **B. Phase Relationships and Timing**

**Mathematical Relationship:**
For a sinusoid **A×cos(2πft + φ)**, the phase **φ** determines:
- **When the peak occurs** in the cycle
- **Synchronization** between different components
- **Lead/lag relationships** between signals

```python
def demonstrate_phase_timing_relationship():
    """
    Show how phase translates to actual timing in the signal
    """
    
    N = 120  # 120 time points
    n = np.arange(N)
    period = 24  # 24-period cycle
    
    # Create cycles with different phases
    phases = [0, np.pi/4, np.pi/2, np.pi, 3*np.pi/2]
    phase_names = ['0°', '45°', '90°', '180°', '270°']
    
    print("=== PHASE TO TIMING TRANSLATION ===")
    print(f"24-period cycle, N={N} points")
    
    for phase, name in zip(phases, phase_names):
        # Create signal
        signal = np.cos(2 * np.pi * n / period + phase)
        
        # Find peak location within the first cycle (later cycles repeat the same peak)
        peak_index = np.argmax(signal[:period])
        peak_time = peak_index
        
        # Calculate expected peak time from phase
        # Phase shift φ corresponds to time shift of φ/(2π) * period
        expected_peak_time = (-phase / (2 * np.pi)) * period
        if expected_peak_time < 0:
            expected_peak_time += period  # Wrap around
        
        print(f"\nPhase {name} ({phase:.3f} rad):")
        print(f"  Peak occurs at time: {peak_time}")
        print(f"  Expected from phase: {expected_peak_time:.1f}")
        print(f"  Time shift: {expected_peak_time:.1f} time units")
        print(f"  Fraction of cycle: {expected_peak_time/period:.3f}")
    
    # Mathematical insight: Phase directly encodes timing offset
    print(f"\nMathematical relationship:")
    print(f"  Time shift = -φ/(2π) × Period")
    print(f"  Phase φ = -2π × (time_shift/Period)")

demonstrate_phase_timing_relationship()
```

**Key Mathematical Insight:**
- **Phase encodes timing offset** within the cycle
- **Phase differences** reveal synchronization between patterns
- **Positive phase** = peak arrives **earlier**, so the signal **leads** the reference
- **Negative phase** = peak arrives **later**, so the signal **lags** the reference
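
The relation time shift = -φ/(2π) × Period can be applied directly to a phase read from the DFT. The sketch below assumes a cycle that completes a whole number of periods in the window; the helper name `peak_time_from_phase` is hypothetical.

```python
import numpy as np

def peak_time_from_phase(phase, period):
    """First time (in samples) at which A*cos(2*pi*t/period + phase) peaks."""
    return (-phase / (2 * np.pi) * period) % period

N, period = 120, 24
n = np.arange(N)
k = N // period  # bin of the 24-period cycle

reference = np.cos(2 * np.pi * n / period)             # peaks at t = 0, 24, 48, ...
shifted = np.cos(2 * np.pi * n / period + np.pi / 2)   # phase +π/2: peaks a quarter cycle earlier

ref_phase = np.angle(np.fft.fft(reference)[k])
shift_phase = np.angle(np.fft.fft(shifted)[k])
shift_peak = peak_time_from_phase(shift_phase, period)

print(f"Reference phase: {ref_phase:.3f} rad")
print(f"Shifted phase:   {shift_phase:.3f} rad")
print(f"Shifted peak at t = {shift_peak:.1f} (6 samples before the reference peak at t = 24)")
```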

### **5. TREND DETECTION & NOISE SEPARATION: Frequency Localization**

#### **A. Mathematical Foundation: DC Component and Low Frequencies**

**DC Component:** **X[0] = Σ(n=0 to N-1) x[n]** (N times the mean, so the mean is X[0]/N)

**Low Frequencies:** **X[1], X[2], ...** represent **slow variations** (trends)

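**High Frequencies:** bins near **N/2**, such as **X[N/2-1], X[N/2-2], ...**, represent **rapid variations**. White noise spreads across all bins, but it is usually the only thing left at this end of the spectrum.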

```python
def explain_trend_noise_separation():
    """
    Mathematical explanation of trend and noise separation
    """
    
    N = 200
    n = np.arange(N)
    
    # Create signal components
    trend = 0.02 * n  # Linear trend
    seasonal = np.cos(2 * np.pi * n / 50)  # 50-period seasonal
    noise = 0.3 * np.random.normal(0, 1, N)  # High-frequency noise
    
    # Combined signal
    signal = trend + seasonal + noise
    
    # Calculate DFT
    X = np.fft.fft(signal)
    frequencies = np.fft.fftfreq(N)
    magnitude = np.abs(X)
    
    print("=== TREND AND NOISE SEPARATION MATHEMATICS ===")
    
    # 1. DC COMPONENT (Trend baseline)
    dc_component = np.real(X[0]) / N  # Average value (X[0] is real for a real signal)
    print(f"DC component (mean): {dc_component:.6f}")
    print(f"Actual mean: {np.mean(signal):.6f}")
    
    # 2. LOW FREQUENCY ANALYSIS (Trend)
    low_freq_bins = 5  # First 5 frequency bins (excluding DC)
    # Count each bin together with its mirror at N-k so the ratio is not halved
    low_freq_power = 2 * np.sum(magnitude[1:low_freq_bins+1]**2)
    total_power = np.sum(magnitude**2)
    trend_power_ratio = low_freq_power / total_power
    
    print(f"\nLow frequency analysis (trend detection):")
    print(f"  Low frequency power: {low_freq_power:.1f}")
    print(f"  Total power: {total_power:.1f}")
    print(f"  Trend power ratio: {trend_power_ratio:.3f}")
    
    # 3. HIGH FREQUENCY ANALYSIS (Noise)
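    high_freq_threshold = N // 4  # Bins at or above a quarter cycle per sample
    # Bins N//4 .. N - N//4 are the fastest positive and negative frequencies;
    # slicing to the end of the array would also pick up slow negative frequencies
    high_freq_power = np.sum(magnitude[high_freq_threshold:N - high_freq_threshold + 1]**2)
    noise_power_ratio = high_freq_power / total_power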
    
    print(f"\nHigh frequency analysis (noise detection):")
    print(f"  High frequency power: {high_freq_power:.1f}")
    print(f"  Noise power ratio: {noise_power_ratio:.3f}")
    
    # 4. SIGNAL RECONSTRUCTION (Filtering)
    # Reconstruct trend (DC + low frequencies)
    X_trend = np.zeros_like(X)
    X_trend[0] = X[0]  # DC
    X_trend[1:low_freq_bins+1] = X[1:low_freq_bins+1]  # Low frequencies
    X_trend[-low_freq_bins:] = X[-low_freq_bins:]  # Negative frequencies (symmetry)
    
    trend_reconstructed = np.real(np.fft.ifft(X_trend))
    
    # Reconstruct seasonal (mid frequencies)
    seasonal_bin = N // 50  # 50-period cycle
    X_seasonal = np.zeros_like(X)
    X_seasonal[seasonal_bin] = X[seasonal_bin]
    X_seasonal[-seasonal_bin] = X[-seasonal_bin]  # Negative frequency
    
    seasonal_reconstructed = np.real(np.fft.ifft(X_seasonal))
    
    print(f"\nSignal reconstruction quality:")
    print(f"  Original trend correlation: {np.corrcoef(trend, trend_reconstructed)[0,1]:.3f}")
    print(f"  Original seasonal correlation: {np.corrcoef(seasonal, seasonal_reconstructed)[0,1]:.3f}")
    
    return {
        'original_signal': signal,
        'components': {'trend': trend, 'seasonal': seasonal, 'noise': noise},
        'reconstructed': {'trend': trend_reconstructed, 'seasonal': seasonal_reconstructed},
        'frequency_analysis': {
            'dc_component': dc_component,
            'trend_power_ratio': trend_power_ratio,
            'noise_power_ratio': noise_power_ratio
        }
    }

result = explain_trend_noise_separation()
```

#### **B. Mathematical Properties Enabling Separation**

**1. Frequency Localization:**
- **Trends** → **Low frequencies** (slow changes)
- **Seasonality** → **Specific frequencies** (regular cycles)  
- **Noise** → **Broadband**, and usually the only component left at **high frequencies**

**2. Filtering Mathematics:**
```python
def mathematical_filtering_explanation():
    """
    Show the mathematical basis for frequency-domain filtering
    """
    
    print("=== MATHEMATICAL BASIS FOR FILTERING ===")
    
    # Mathematical relationships:
    print("1. FREQUENCY-TIME RELATIONSHIP:")
    print("   Low frequency ↔ Slow time variation (trends)")
    print("   High frequency ↔ Fast time variation (noise)")
    print("   f = k/N, where k is frequency bin, N is signal length")
    
    print("\n2. FILTERING OPERATIONS:")
    print("   Low-pass filter: Keep X[k] for k ≤ k_cutoff")
    print("   High-pass filter: Keep X[k] for k ≥ k_cutoff") 
    print("   Band-pass filter: Keep X[k] for k₁ ≤ k ≤ k₂")
    
    print("\n3. RECONSTRUCTION FORMULA:")
    print("   x_filtered[n] = IFFT{H[k] × X[k]}")
    print("   where H[k] is the filter function")
    
    # Example filter functions
    N = 100
    k = np.arange(N)
    
    # Low-pass filter (keeps trends): bins 0..cutoff_low-1 and their mirrors
    cutoff_low = 5
    H_lowpass = np.zeros(N)
    H_lowpass[:cutoff_low] = 1
    H_lowpass[-(cutoff_low - 1):] = 1  # Mirror of bins 1..cutoff_low-1 (symmetric for real signals)
    
    # High-pass filter (keeps noise/rapid changes)
    H_highpass = 1 - H_lowpass
    
    # Band-pass filter (keeps specific frequency range)
    H_bandpass = np.zeros(N)
    H_bandpass[10:15] = 1  # Keep frequencies 10-14
    H_bandpass[-14:-9] = 1  # Mirror bins N-14..N-10 (symmetric)
    
    print(f"\n4. EXAMPLE FILTERS (N={N}):")
    print(f"   Low-pass: Keeps frequencies 0-{cutoff_low - 1} (trends)")
    print(f"   High-pass: Keeps frequencies {cutoff_low}-{N//2} (noise)")
    print(f"   Band-pass: Keeps frequencies 10-14 (specific cycles)")
    
    return {
        'filter_functions': {
            'lowpass': H_lowpass,
            'highpass': H_highpass, 
            'bandpass': H_bandpass
        }
    }

mathematical_filtering_explanation()
```

**Key Mathematical Insight:**
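- **Frequency domain separation** allows **clean filtering** when components occupy non-overlapping frequency bands
- **Different signal components** tend to occupy **different frequency ranges**, although broadband noise overlaps all of them
- **Linear filtering** preserves the structure of the kept bands, but abrupt (brick-wall) cutoffs can introduce ringing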
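
For real-valued series, one way to avoid building the mirrored half of the mask by hand is to filter on the one-sided spectrum with `np.fft.rfft` and `np.fft.irfft`. The sketch below is a minimal brick-wall low-pass under that approach; the helper name `lowpass_rfft` and the synthetic signal are assumptions for this example.

```python
import numpy as np

def lowpass_rfft(x, keep_bins):
    """Zero every bin above keep_bins in the one-sided spectrum (hypothetical helper)."""
    X = np.fft.rfft(x)
    X[keep_bins + 1:] = 0              # keep DC and bins 1..keep_bins
    return np.fft.irfft(X, n=len(x))   # irfft supplies the mirrored half, so the output is real

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable
N = 200
n = np.arange(N)
seasonal = np.cos(2 * np.pi * n / 50)      # 4 full cycles, so it sits in bin 4
noisy = seasonal + rng.normal(0, 0.3, N)

smoothed = lowpass_rfft(noisy, keep_bins=5)

rmse_before = np.sqrt(np.mean((noisy - seasonal) ** 2))
rmse_after = np.sqrt(np.mean((smoothed - seasonal) ** 2))
print(f"RMSE vs clean cycle before filtering: {rmse_before:.3f}")
print(f"RMSE vs clean cycle after filtering:  {rmse_after:.3f}")
```

The noise that falls inside the kept bins survives the filter, so the error shrinks but does not reach zero.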

### **6. FORECASTING: Reconstruction and Extrapolation**

#### **A. Mathematical Foundation: Inverse Transform and Extrapolation**

**Reconstruction Formula:** **x[n] = (1/N) × Σ(k=0 to N-1) X[k] × e^(i2πkn/N)**

**Forecasting Principle:** If we know **X[k]**, **A[k]**, and **φ[k]** for dominant frequencies, we can **extrapolate** beyond the observed data.

```python
def explain_forecasting_mathematics():
    """
    Mathematical explanation of how Fourier analysis enables forecasting
    """
    
    N = 100
    n = np.arange(N)
    
    # Create signal with known components
    true_component1 = 2 * np.cos(2 * np.pi * n / 20)  # 20-period cycle
    true_component2 = 1 * np.cos(2 * np.pi * n / 10 + np.pi/4)  # 10-period cycle with phase
    noise = 0.2 * np.random.normal(0, 1, N)
    
    observed_signal = true_component1 + true_component2 + noise
    
    print("=== FORECASTING MATHEMATICS ===")
    
    # 1. ANALYZE OBSERVED DATA
    X = np.fft.fft(observed_signal)
    frequencies = np.fft.fftfreq(N)
    magnitude = np.abs(X)
    phase = np.angle(X)
    
    # Find dominant frequencies
    # Exclude DC and very high frequencies
    search_range = slice(1, N//4)
    dominant_indices = np.argsort(magnitude[search_range])[-3:] + 1  # Top 3 peaks
    
    print("Dominant frequency components detected:")
    for i, k in enumerate(dominant_indices):
        period = N / k if k > 0 else np.inf
        freq = frequencies[k]
        mag = magnitude[k]
        ph = phase[k]
        
        print(f"  Component {i+1}:")
        print(f"    Frequency bin k={k}, Period≈{period:.1f}")
        print(f"    Magnitude: {mag:.3f}, Phase: {ph:.3f} rad")
    
    # 2. MATHEMATICAL FORECASTING FORMULA
    print(f"\n=== FORECASTING FORMULA ===")
    print("For each dominant component k:")
    print("  x_forecast[n] = (2/N) × |X[k]| × cos(2πkn/N + φ[k])")
    print("Combined forecast:")
    print("  x_total[n] = Σ x_forecast_k[n] + DC_component")
    
    # 3. GENERATE FORECAST
    forecast_length = 50  # Forecast 50 points into the future
    n_extended = np.arange(N, N + forecast_length)
    
    # Reconstruct each dominant component
    forecast = np.zeros(forecast_length)
    
    # Add DC component
    dc_component = np.real(X[0]) / N
    forecast += dc_component
    
    print(f"\nForecasting {forecast_length} points beyond observed data:")
    
    for k in dominant_indices:
        if k < N//2:  # Only positive frequencies
            # Extract component parameters
            amplitude = 2 * magnitude[k] / N  # Factor of 2 for real signals
            frequency = k / N  # Normalized frequency
            phase_shift = phase[k]
            
            # Generate forecast for this component
            component_forecast = amplitude * np.cos(2 * np.pi * frequency * n_extended + phase_shift)
            forecast += component_forecast
            
            print(f"  Component k={k}: amplitude={amplitude:.3f}, freq={frequency:.4f}, phase={phase_shift:.3f}")
    
    # 4. VALIDATION: Compare with the true (noise-free) continuation
    true_continuation1 = 2 * np.cos(2 * np.pi * n_extended / 20)
    true_continuation2 = 1 * np.cos(2 * np.pi * n_extended / 10 + np.pi/4)
    # The true signal has zero mean, so the estimated DC term is not added here
    true_forecast = true_continuation1 + true_continuation2
    
    # Calculate forecast accuracy
    forecast_error = np.mean((forecast - true_forecast)**2)
    forecast_correlation = np.corrcoef(forecast, true_forecast)[0, 1]
    
    print(f"\nForecast validation:")
    print(f"  MSE: {forecast_error:.6f}")
    print(f"  Correlation with true signal: {forecast_correlation:.3f}")
    
    return {
        'observed_signal': observed_signal,
        'forecast': forecast,
        'true_forecast': true_forecast,
        'dominant_components': dominant_indices,
        'forecast_accuracy': {
            'mse': forecast_error,
            'correlation': forecast_correlation
        }
    }

forecast_result = explain_forecasting_mathematics()
```

#### **B. Mathematical Properties Enabling Forecasting**

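**1. Periodicity Assumption:**
If **X[k]** is significant, the corresponding sinusoid is assumed to **continue indefinitely**. Because k is an integer, shifting by one full window of N samples leaves the component unchanged:
**x_future[n+N] = (2/N) × |X[k]| × cos(2πk(n+N)/N + φ[k]) = (2/N) × |X[k]| × cos(2πkn/N + φ[k])**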

**2. Stationarity Assumption:**
The **frequency content remains constant** over time.

**3. Superposition Principle:**
**Total forecast = Σ(individual component forecasts)**
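
The periodicity assumption has a consequence that is easy to check numerically: if every DFT coefficient is kept, evaluating the inverse transform past the end of the window just replays the observed samples. The sketch below illustrates this with arbitrary random data; the helper name `evaluate_inverse_dft` is hypothetical and not part of the notebook.

```python
import numpy as np

def evaluate_inverse_dft(X, n):
    """Evaluate x[n] = (1/N) * sum_k X[k] * exp(i*2*pi*k*n/N) at any integer n."""
    N = len(X)
    k = np.arange(N)
    n = np.asarray(n)
    # One row per requested time index, one column per frequency bin
    basis = np.exp(2j * np.pi * np.outer(n, k) / N)
    return (basis @ X).real / N

rng = np.random.default_rng(0)
N = 64
x = rng.normal(size=N)          # arbitrary data with no real periodicity
X = np.fft.fft(x)

inside = evaluate_inverse_dft(X, np.arange(N))         # reconstruction of the window
beyond = evaluate_inverse_dft(X, np.arange(N, 2 * N))  # "forecast" of the next window

print(np.allclose(inside, x))   # the window is reconstructed
print(np.allclose(beyond, x))   # the extrapolation repeats the window
```

This is why a useful forecast keeps only a few dominant components: the full spectrum reproduces the noise as well as the cycles.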

```python
def demonstrate_forecasting_limitations():
    """
    Show mathematical limitations and assumptions of Fourier forecasting
    """
    
    print("=== MATHEMATICAL ASSUMPTIONS FOR FORECASTING ===")
    
    # 1. STATIONARITY REQUIREMENT
    print("1. STATIONARITY ASSUMPTION:")
    print("   Mathematical requirement: E[x(t)] = constant, Cov[x(t), x(t+τ)] depends only on lag τ")
    print("   Implication: Frequency content doesn't change over time")
    print("   Violation example: Trends, changing seasonality")
    
    # 2. PERIODICITY ASSUMPTION  
    print("\n2. PERIODICITY ASSUMPTION:")
    print("   Mathematical basis: sin(ωt) and cos(ωt) are periodic")
    print("   Implication: Detected cycles continue indefinitely")
    print("   Violation example: Damped oscillations, one-time events")
    
    # 3. LINEARITY ASSUMPTION
    print("\n3. LINEARITY ASSUMPTION:")
    print("   Mathematical basis: DFT is linear transformation")
    print("   Implication: Components combine additively")
    print("   Violation example: Nonlinear interactions, threshold effects")
    
    # Example: Non-stationary signal (changing frequency)
    N = 200
    n = np.arange(N)
    
    # Frequency changes over time (chirp signal)
    instantaneous_freq = 0.05 + 0.02 * n / N  # Frequency increases
    phase = np.cumsum(2 * np.pi * instantaneous_freq)
    chirp_signal = np.cos(phase)
    
    # Try to forecast this with standard Fourier method
    X = np.fft.fft(chirp_signal)
    magnitude = np.abs(X)
    
    # A single dominant bin cannot represent a frequency that drifts over time,
    # so a forecast built from it will be poor
    dominant_k = np.argmax(magnitude[1:N//2]) + 1
    
    print(f"\n=== NON-STATIONARY EXAMPLE ===")
    print(f"Chirp signal with changing frequency")
    print(f"Dominant frequency bin detected: {dominant_k}")
    print(f"But actual frequency changes from {0.05:.3f} to {0.05+0.02:.3f}")
    print(f"Standard Fourier forecast will be inaccurate!")
    
    # 4. MATHEMATICAL SOLUTIONS FOR LIMITATIONS
    print(f"\n=== MATHEMATICAL EXTENSIONS ===")
    print("1. Short-Time Fourier Transform (STFT):")
    print("   X(m,k) = Σ x[n]w[n-m]e^(-i2πkn/N)")
    print("   Captures time-varying frequency content")
    
    print("\n2. Wavelet Transform:")
    print("   W(a,b) = ∫ x(t)ψ*((t-b)/a)dt")
    print("   Better time-frequency localization")
    
    print("\n3. Adaptive Filtering:")
    print("   Update frequency estimates as new data arrives")
    print("   Handle non-stationary behavior")
    
    return {
        'chirp_signal': chirp_signal,
        'limitation_demonstrated': True
    }

demonstrate_forecasting_limitations()
```
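
As a rough illustration of the STFT extension mentioned above, the following sketch tracks the peak frequency of a chirp over time using `scipy.signal.stft`. The series length, sweep range (0.05 to 0.15 cycles per sample), and window size are assumptions chosen so the drift is easy to see; they differ from the shorter chirp in the function above.

```python
import numpy as np
from scipy import signal

N = 2000
n = np.arange(N)
inst_freq = 0.05 + 0.10 * n / N                    # cycles per sample, rising over time
chirp = np.cos(np.cumsum(2 * np.pi * inst_freq))   # phase is the running sum of frequency

# Zxx has shape (len(f), len(t)): one spectrum per windowed segment
f, t, Zxx = signal.stft(chirp, fs=1.0, nperseg=128)
peak_freq = f[np.argmax(np.abs(Zxx), axis=0)]      # strongest frequency in each segment

# Compare a segment near the start with one near the end
early = peak_freq[np.argmin(np.abs(t - 200))]
late = peak_freq[np.argmin(np.abs(t - 1800))]
print(f"peak frequency near t=200:  {early:.3f}")
print(f"peak frequency near t=1800: {late:.3f}")
```

A single FFT over the whole series would smear this sweep across many bins, while the per-segment peaks follow the drift.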

## **MATHEMATICAL SUMMARY: Why Fourier Analysis Works**

```mermaid
graph TD
    A["DFT Formula<br/>X[k] = Σ x[n]e^(-i2πkn/N)"] --> B["Complex Exponentials<br/>e^(-i2πkn/N)"]
    
    B --> C["Orthogonality Property<br/>⟨e^(ik₁), e^(ik₂)⟩ = 0 if k₁≠k₂"]
    B --> D["Correlation Computation<br/>Inner product with basis functions"]
    
    C --> E["Multiple Cycle Separation<br/>Independent frequency bins"]
    D --> F["Pattern Detection<br/>High correlation = Strong pattern"]
    
    A --> G["Magnitude |X[k]|<br/>= √(Real² + Imag²)"]
    A --> H["Phase φ[k]<br/>= arctan(Imag/Real)"]
    
    G --> I["Strength Quantification<br/>Energy at each frequency"]
    H --> J["Timing Information<br/>Phase offset within cycle"]
    
    A --> K["Frequency Localization<br/>k near 0 = Slow trends, k near N/2 = Fast variation"]
    K --> L["Filtering Capability<br/>Separate signal components"]
    
    A --> M["Inverse Transform<br/>x[n] = (1/N)Σ X[k]e^(i2πkn/N)"]
    M --> N["Reconstruction & Forecasting<br/>Extrapolate periodic components"]
    
    style A fill:#e1f5fe,color:#000
    style E fill:#c8e6c9,color:#000
    style F fill:#c8e6c9,color:#000
    style I fill:#fff3e0,color:#000
    style J fill:#fff3e0,color:#000
    style L fill:#f3e5f5,color:#000
    style N fill:#ffebee,color:#000
```

### **CORE MATHEMATICAL PRINCIPLES**

| **Property** | **Mathematical Foundation** | **Why It Works** |
|--------------|----------------------------|------------------|
| **Pattern Detection** | **Correlation**: X[k] = ⟨x, e^(i2πkn/N)⟩ | High correlation with a sinusoid = Strong pattern at that frequency |
| **Multiple Cycles** | **Orthogonality**: ⟨e^(i2πk₁n/N), e^(i2πk₂n/N)⟩ = N·δ_{k₁,k₂} | Cycles that fall exactly on different bins don't interfere; off-bin cycles leak into neighboring bins |
| **Strength Quantification** | **Parseval's Theorem**: Σ\|x[n]\|² = (1/N)Σ\|X[k]\|² | Energy conservation enables comparison |
| **Phase/Timing** | **Complex Representation**: X[k] = \|X[k]\|e^(iφ[k]) | Phase encodes timing offset |
| **Trend Detection** | **Frequency Localization**: k near 0 ↔ Slow variation | Trends = Low frequency components |
| **Noise Separation** | **Frequency Localization**: k near N/2 ↔ Fast variation | Rapid fluctuations concentrate at high frequencies; white noise spreads across all bins |
| **Forecasting** | **Periodicity**: sin(ωt) repeats indefinitely | Sinusoidal components extrapolate naturally |
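
Two of the properties in the table can be verified in a few lines. The sketch below checks orthogonality of the DFT basis vectors and Parseval's theorem on random data, then counts significant bins for an on-bin and an off-bin cosine to show leakage. The 5% threshold and the helper names `basis` and `count_significant_bins` are illustrative assumptions.

```python
import numpy as np

N = 32
n = np.arange(N)

def basis(k):
    """DFT basis vector e^(i*2*pi*k*n/N) for bin k."""
    return np.exp(2j * np.pi * k * n / N)

# np.vdot conjugates its first argument, giving the complex inner product
same = abs(np.vdot(basis(3), basis(3)))        # equals N
different = abs(np.vdot(basis(3), basis(7)))   # equals 0 up to rounding
print(f"<b3, b3> = {same:.3f}, <b3, b7> = {different:.3e}")

# Parseval: energy in time equals energy in frequency divided by N
rng = np.random.default_rng(1)
x = rng.normal(size=N)
X = np.fft.fft(x)
energy_time = np.sum(np.abs(x) ** 2)
energy_freq = np.sum(np.abs(X) ** 2) / N
print(np.isclose(energy_time, energy_freq))

def count_significant_bins(sig, rel_threshold=0.05):
    """Number of rfft bins whose magnitude exceeds a fraction of the peak."""
    spectrum = np.abs(np.fft.rfft(sig))
    return int(np.sum(spectrum > rel_threshold * spectrum.max()))

on_count = count_significant_bins(np.cos(2 * np.pi * 4 * n / N))     # exactly 4 cycles
off_count = count_significant_bins(np.cos(2 * np.pi * 4.5 * n / N))  # 4.5 cycles
print(on_count, off_count)
```

The off-bin cosine spreads over several bins, which is the practical limit on the "different frequencies don't interfere" claim.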

### **KEY MATHEMATICAL INSIGHTS**

1. **Orthogonal Basis Decomposition**: DFT decomposes signals into orthogonal sinusoidal components
2. **Energy Conservation**: Total signal energy is preserved in frequency domain
3. **Frequency-Time Duality**: Slow time variations ↔ Low frequencies, Fast variations ↔ High frequencies  
4. **Complex Representation**: Magnitude + Phase captures both strength and timing
5. **Linear Superposition**: Multiple patterns combine additively
6. **Inverse Transform**: Reconstructs the observed window exactly; forecasting additionally assumes that the retained cycles persist beyond it
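
The frequency-time duality in point 3 is what makes spectral filtering possible. As a minimal sketch, assuming a proportion series with one slow cycle (period 50) plus white noise, the code below zeroes every bin above a cutoff and inverts the transform. The cutoff of 0.05 cycles per sample is a hypothetical choice, not a recommended value.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200
n = np.arange(N)
clean = 0.5 + 0.2 * np.cos(2 * np.pi * n / 50)   # slow cycle around a proportion of 0.5
noisy = clean + 0.05 * rng.normal(size=N)        # add white noise

X = np.fft.rfft(noisy)           # spectrum of a real signal (non-negative frequencies)
freqs = np.fft.rfftfreq(N)       # cycles per sample, from 0 to 0.5
cutoff = 0.05                    # hypothetical low-pass cutoff
X_low = np.where(freqs <= cutoff, X, 0)   # keep only slow components
smoothed = np.fft.irfft(X_low, n=N)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_smoothed = np.mean((smoothed - clean) ** 2)
print(f"MSE before filtering: {mse_noisy:.5f}")
print(f"MSE after filtering:  {mse_smoothed:.5f}")
```

The filter removes only the noise that lies above the cutoff; noise sharing low-frequency bins with the cycle stays in the result.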

**The mathematical elegance of Fourier analysis lies in its ability to transform complex temporal patterns into simple, interpretable frequency components, each with a clear physical meaning and, when the underlying cycles persist, predictive power.**