# Unemployment Rate Analysis

This notebook provides a comprehensive analysis of unemployment rate data, focusing on:
- Data cleaning and exploration
- Trend analysis
- COVID-19 impact assessment
- Seasonal pattern identification
- Policy insights and recommendations

## Table of Contents
1. [Data Loading and Preprocessing](#data-loading)
2. [Exploratory Data Analysis](#eda)
3. [Trend Analysis](#trends)
4. [COVID-19 Impact Analysis](#covid)
5. [Seasonal Analysis](#seasonal)
6. [Policy Insights](#policy)
7. Comprehensive Dashboard
8. [Conclusions](#conclusions)

In [None]:
# Import required libraries
import sys
import os
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots

# Import custom modules
from data_loader import UnemploymentDataLoader
from analysis import UnemploymentAnalyzer
from visualizations import UnemploymentVisualizer

# Configure display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

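# Suppress all warnings to keep notebook output readable
# (note: this also hides deprecation warnings from pandas and matplotlib)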
import warnings
warnings.filterwarnings('ignore')

print("Libraries imported successfully!")

## 1. Data Loading and Preprocessing {#data-loading}

Let's start by loading and preprocessing our unemployment data.

In [None]:
# Initialize data loader
loader = UnemploymentDataLoader('../data/unemployment_data.csv')

# Load and preprocess data
data = loader.preprocess_data()

# Display basic information
print("Dataset Shape:", data.shape)
print("\nColumn Names:")
print(data.columns.tolist())

# Display first few rows
print("\nFirst 10 rows:")
data.head(10)
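
The plots later in this notebook rely on `Year` and `Rate_Change` columns that `preprocess_data()` is expected to produce. The loader's source is not shown here, so the following is only a sketch of how such columns could be derived with pandas. The helper name `add_derived_columns`, the `Month` column, and the three sample rows are illustrative assumptions, not the loader's actual implementation.

```python
import pandas as pd


def add_derived_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: add Year, Month and Rate_Change to a monthly series."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.sort_values("Date").reset_index(drop=True)
    out["Year"] = out["Date"].dt.year
    out["Month"] = out["Date"].dt.month
    # Month-over-month change, in percentage points; the first row has no predecessor.
    out["Rate_Change"] = out["Unemployment_Rate"].diff()
    return out


# Illustrative rows only, not the notebook's dataset.
sample = pd.DataFrame(
    {
        "Date": ["2020-02-01", "2020-03-01", "2020-04-01"],
        "Unemployment_Rate": [3.5, 4.4, 14.8],
    }
)
derived = add_derived_columns(sample)
print(derived)
```

`Rate_Change` is a difference of two percentages, so its unit is percentage points, and the first row is `NaN` because there is no earlier month to compare against.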

In [None]:
# Get data summary
summary = loader.get_data_summary()

print("=== DATA SUMMARY ===")
print(f"Total Records: {summary['total_records']}")
print(f"Date Range: {summary['date_range']['start']} to {summary['date_range']['end']}")
print(f"\nUnemployment Statistics:")
for key, value in summary['unemployment_stats'].items():
    print(f"  {key.replace('_', ' ').title()}: {value:.2f}%")

print(f"\nCOVID-19 Impact:")
for key, value in summary['covid_impact'].items():
    print(f"  {key.replace('_', ' ').title()}: {value:.2f}%")

## 2. Exploratory Data Analysis {#eda}

Let's explore the basic characteristics of our unemployment data.

In [None]:
# Basic statistics
print("Unemployment Rate Statistics:")
print(data['Unemployment_Rate'].describe())

# Check for missing values
print("\nMissing Values:")
print(data.isnull().sum())

# Data types
print("\nData Types:")
print(data.dtypes)

In [None]:
# Create basic visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Time series plot
axes[0, 0].plot(data['Date'], data['Unemployment_Rate'], linewidth=2)
axes[0, 0].set_title('Unemployment Rate Over Time')
axes[0, 0].set_ylabel('Unemployment Rate (%)')
axes[0, 0].tick_params(axis='x', rotation=45)
axes[0, 0].grid(True, alpha=0.3)

# Distribution
axes[0, 1].hist(data['Unemployment_Rate'], bins=20, alpha=0.7, edgecolor='black')
axes[0, 1].set_title('Distribution of Unemployment Rates')
axes[0, 1].set_xlabel('Unemployment Rate (%)')
axes[0, 1].set_ylabel('Frequency')

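# Box plot by year (years are sorted once so boxes and tick labels stay aligned)
years = sorted(data['Year'].unique())
yearly_data = [data.loc[data['Year'] == year, 'Unemployment_Rate'].dropna().values for year in years]
axes[1, 0].boxplot(yearly_data)
# Set tick labels explicitly; the `labels` keyword of boxplot is deprecated in recent matplotlib
axes[1, 0].set_xticklabels([str(year) for year in years])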
axes[1, 0].set_title('Unemployment Rate Distribution by Year')
axes[1, 0].set_ylabel('Unemployment Rate (%)')
axes[1, 0].tick_params(axis='x', rotation=45)

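# Rate changes (on a date axis the bar width is in days; the default 0.8 is barely visible for monthly data)
axes[1, 1].bar(data['Date'], data['Rate_Change'], width=20, alpha=0.7)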
axes[1, 1].set_title('Month-over-Month Rate Changes')
axes[1, 1].set_ylabel('Rate Change (pp)')
axes[1, 1].axhline(y=0, color='red', linestyle='--', alpha=0.7)
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 3. Trend Analysis {#trends}

Now let's analyze the trends in unemployment data across different periods.

In [None]:
# Initialize analyzer
analyzer = UnemploymentAnalyzer(data)

# Perform trend analysis
trend_results = analyzer.trend_analysis()

print("=== TREND ANALYSIS RESULTS ===")
print(f"Overall Trend Direction: {trend_results['overall_trend']['direction']}")
print(f"Linear Slope: {trend_results['overall_trend']['linear_slope']:.6f} per month")
print(f"Linear R²: {trend_results['overall_trend']['linear_r2']:.4f}")
print(f"Polynomial R²: {trend_results['overall_trend']['polynomial_r2']:.4f}")

print("\n=== PERIOD-SPECIFIC TRENDS ===")
for period, stats in trend_results['period_trends'].items():
    period_name = period.replace('_', ' ').title()
    print(f"\n{period_name}:")
    print(f"  Slope: {stats['slope']:.4f} per month")
    print(f"  R²: {stats['r_squared']:.4f}")
    print(f"  Total Change: {stats['change']:.2f} percentage points")
    print(f"  Start Rate: {stats['start_rate']:.2f}%")
    print(f"  End Rate: {stats['end_rate']:.2f}%")

In [None]:
# Visualize trends
viz = UnemploymentVisualizer(data)
viz.plot_unemployment_trend()
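
`trend_analysis()` reports a slope and R² for a linear fit and an R² for a polynomial fit. Its implementation lives in the `analysis` module and is not shown, so this is a minimal sketch of one common way to obtain such numbers with NumPy. It assumes time is indexed in months (0, 1, 2, ...) and that the polynomial fit is quadratic; `fit_trend` and the toy series are hypothetical.

```python
import numpy as np


def fit_trend(rates, degree=1):
    """Fit a polynomial of the given degree; return (coefficients, R^2)."""
    y = np.asarray(rates, dtype=float)
    x = np.arange(len(y))                # month index: 0, 1, 2, ...
    coeffs = np.polyfit(x, y, degree)    # highest-order coefficient first
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)   # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2) # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return coeffs, r2


toy_rates = [4.0, 4.5, 5.0, 5.5, 6.0]    # illustrative values only
linear_coeffs, linear_r2 = fit_trend(toy_rates, degree=1)
quad_coeffs, quad_r2 = fit_trend(toy_rates, degree=2)

print(f"Linear slope: {linear_coeffs[0]:.4f} per month, R^2: {linear_r2:.4f}")
print(f"Quadratic R^2: {quad_r2:.4f}")
```

On the same data a quadratic fit can never score a lower R² than the linear fit, so a higher polynomial R² alone does not show that the trend is non-linear.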

## 4. COVID-19 Impact Analysis {#covid}

Let's analyze the specific impact of COVID-19 on unemployment rates.

In [None]:
# Perform COVID impact analysis
covid_results = analyzer.covid_impact_analysis()

print("=== COVID-19 IMPACT ANALYSIS ===")

print("\nPeriod Statistics:")
for period, stats in covid_results['period_statistics'].items():
    period_name = period.replace('_', ' ').title()
    print(f"\n{period_name}:")
    print(f"  Mean: {stats['mean']:.2f}%")
    print(f"  Std Dev: {stats['std']:.2f} pp")
    print(f"  Min: {stats['min']:.2f}%")
    print(f"  Max: {stats['max']:.2f}%")
    if 'peak_date' in stats:
        print(f"  Peak Date: {stats['peak_date'].strftime('%Y-%m')}")

print("\nImpact Metrics:")
impact = covid_results['impact_metrics']
# Use the '+' format flag so the sign reflects the actual value instead of being hard-coded
print(f"Peak Impact (Absolute): {impact['peak_impact_absolute']:+.2f} percentage points")
print(f"Peak Impact (Percentage): {impact['peak_impact_percentage']:+.1f}%")
print(f"Recovery Period: {impact['recovery_months']} months")
print(f"Full Recovery Achieved: {'Yes' if impact['full_recovery'] else 'No'}")

print("\nStatistical Tests:")
ttest = covid_results['statistical_tests']['pre_vs_post_ttest']
print(f"Pre-COVID vs Post-COVID T-test:")
print(f"  T-statistic: {ttest['statistic']:.4f}")
print(f"  P-value: {ttest['p_value']:.6f}")
print(f"  Significant Difference: {'Yes' if ttest['significant_difference'] else 'No'}")

In [None]:
# Visualize COVID impact
viz.plot_covid_impact()
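
The impact metrics mix two units that are easy to confuse: an absolute change in percentage points and a relative change in percent. The sketch below shows the arithmetic under two assumptions that the source does not confirm: the peak is compared with a single pre-pandemic baseline rate, and recovery is counted as the number of months from the peak until the rate first falls back to that baseline. The function `impact_metrics` is hypothetical, and apart from the 3.5% baseline and 14.8% peak quoted in the Conclusions, the series is made up for illustration.

```python
def impact_metrics(baseline, rates_from_peak):
    """Hypothetical helper. rates_from_peak: monthly rates, starting at the peak month."""
    peak = rates_from_peak[0]
    absolute = peak - baseline             # percentage points
    relative = absolute / baseline * 100.0 # percent of the baseline rate
    # Index of the first month at or below the baseline, or None if never reached.
    recovery_months = next(
        (i for i, rate in enumerate(rates_from_peak) if rate <= baseline),
        None,
    )
    return {
        "peak_impact_absolute": absolute,
        "peak_impact_percentage": relative,
        "recovery_months": recovery_months,
        "full_recovery": recovery_months is not None,
    }


# Shortened, made-up path back to baseline (illustrative only).
toy_metrics = impact_metrics(baseline=3.5, rates_from_peak=[14.8, 11.0, 6.3, 3.5])
print(f"Absolute: {toy_metrics['peak_impact_absolute']:+.1f} pp")
print(f"Relative: {toy_metrics['peak_impact_percentage']:+.1f}%")
print(f"Recovery months: {toy_metrics['recovery_months']}")
```

With a 3.5% baseline and a 14.8% peak, the absolute impact is 11.3 percentage points and the relative impact is about 322.9%.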

## 5. Seasonal Analysis {#seasonal}

Let's identify seasonal patterns in unemployment data.

In [None]:
# Perform seasonal analysis
seasonal_results = analyzer.seasonal_analysis()

print("=== SEASONAL ANALYSIS RESULTS ===")

print("\nSeasonal Summary:")
# Use a distinct name so the data summary from the loading section is not overwritten
seasonal_summary = seasonal_results['seasonal_summary']
print(f"Seasonal Range: {seasonal_summary['seasonal_range']:.2f} percentage points")
print(f"Peak Month: {seasonal_summary['peak_month']['name']} (Month {seasonal_summary['peak_month']['number']})")
print(f"Low Month: {seasonal_summary['low_month']['name']} (Month {seasonal_summary['low_month']['number']})")

seasonality_test = seasonal_summary['seasonality_test']
print("\nSeasonality Test (Kruskal-Wallis):")
print(f"  Statistic: {seasonality_test['kruskal_statistic']:.4f}")
print(f"  P-value: {seasonality_test['p_value']:.6f}")
print(f"  Significant Seasonality: {'Yes' if seasonality_test['is_seasonal'] else 'No'}")

print("\nMonthly Statistics (excluding COVID period):")
monthly_stats = seasonal_results['monthly_statistics']
# Build the month-name lookup once instead of on every loop iteration
month_names = {1: 'January', 2: 'February', 3: 'March', 4: 'April',
               5: 'May', 6: 'June', 7: 'July', 8: 'August',
               9: 'September', 10: 'October', 11: 'November', 12: 'December'}
for month, stats in monthly_stats.items():
    print(f"  {month_names[month]}: {stats['mean']:.2f}% (±{stats['std']:.2f})")

In [None]:
# Visualize seasonal patterns
viz.plot_seasonal_analysis()
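
The seasonality test reported above is a Kruskal-Wallis test, which asks whether the monthly groups of rates plausibly come from the same distribution. As a sketch only, assuming rates are grouped by calendar month and a 0.05 significance threshold (neither is confirmed by the source), the test can be run with `scipy.stats.kruskal`. The three toy groups stand in for months and are not real data.

```python
from scipy.stats import kruskal

# Toy groups standing in for three calendar months (illustrative values only).
groups = {
    "month_a": [3.1, 3.2, 3.3, 3.4, 3.5],
    "month_b": [4.1, 4.2, 4.3, 4.4, 4.5],
    "month_c": [5.1, 5.2, 5.3, 5.4, 5.5],
}

# kruskal takes one sequence per group and returns the H statistic and p-value.
statistic, p_value = kruskal(*groups.values())
is_seasonal = bool(p_value < 0.05)  # assumed threshold

print(f"Statistic: {statistic:.4f}")
print(f"P-value: {p_value:.6f}")
print(f"Significant difference between groups: {'Yes' if is_seasonal else 'No'}")
```

Kruskal-Wallis is rank-based and does not assume normality, but it does treat observations as independent. Monthly unemployment rates are usually autocorrelated, so the p-value is best read as indicative rather than exact.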

## 6. Policy Insights {#policy}

Based on our analysis, let's generate policy insights and recommendations.

In [None]:
# Generate policy insights
policy_results = analyzer.policy_insights()

print("=== POLICY INSIGHTS AND RECOMMENDATIONS ===")

print("\nSummary Metrics:")
metrics = policy_results['summary_metrics']
print(f"Current Rate: {metrics['current_rate']:.1f}%")
print(f"Historical Average: {metrics['historical_average']:.1f}%")
print(f"COVID Peak: {metrics['covid_peak']:.1f}%")
print(f"Recovery Status: {metrics['recovery_status']}")

print("\nKey Insights:")
for i, insight in enumerate(policy_results['insights'], 1):
    print(f"\n{i}. {insight['category']} (Priority: {insight['priority']})")
    print(f"   Finding: {insight['finding']}")
    print(f"   Recommendation: {insight['recommendation']}")

In [None]:
# Perform volatility analysis
volatility_results = analyzer.volatility_analysis()

print("=== VOLATILITY ANALYSIS ===")

print("\nPeriod Volatility:")
for period, stats in volatility_results['period_volatility'].items():
    period_name = period.replace('_', ' ').title()
    print(f"\n{period_name}:")
    print(f"  Standard Deviation: {stats['standard_deviation']:.2f} pp")
    print(f"  Coefficient of Variation: {stats['coefficient_of_variation']:.2f}")
    print(f"  Range: {stats['range']:.2f} percentage points")
    print(f"  Interquartile Range: {stats['interquartile_range']:.2f} percentage points")

print("\nOverall Volatility:")
overall = volatility_results['overall_volatility']
print(f"Total Range: {overall['total_range']:.2f} percentage points")
print(f"Overall Standard Deviation: {overall['overall_std']:.2f} pp")
print(f"Coefficient of Variation: {overall['coefficient_of_variation']:.2f}")
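
The volatility measures printed above can be reproduced for any slice of the series. The following is a minimal sketch with NumPy, assuming the sample standard deviation (`ddof=1`) and a coefficient of variation expressed as a plain ratio rather than a percentage; the analyzer may define either differently. The function `volatility_stats` and the input values are hypothetical.

```python
import numpy as np


def volatility_stats(rates):
    """Hypothetical helper: dispersion measures for a sequence of rates (in %)."""
    values = np.asarray(rates, dtype=float)
    std = values.std(ddof=1)                   # sample standard deviation
    q1, q3 = np.percentile(values, [25, 75])   # first and third quartiles
    return {
        "standard_deviation": std,
        "coefficient_of_variation": std / values.mean(),
        "range": values.max() - values.min(),
        "interquartile_range": q3 - q1,
    }


toy_volatility = volatility_stats([3.0, 4.0, 5.0, 6.0, 7.0])  # illustrative values only
for name, value in toy_volatility.items():
    print(f"{name}: {value:.4f}")
```

The coefficient of variation is unitless, whereas the standard deviation, range, and interquartile range of a rate expressed in percent are all in percentage points.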

## 7. Comprehensive Dashboard

Let's create a comprehensive dashboard showing all key findings.

In [None]:
# Create comprehensive dashboard
viz.create_dashboard()

## 8. Conclusions {#conclusions}

Based on our comprehensive analysis of unemployment data from 2018-2024, we can draw several important conclusions:

### Key Findings:

1. **COVID-19 Impact**: The pandemic caused an unprecedented spike in unemployment from 3.5% to 14.8% in just two months (February to April 2020), a rise of 11.3 percentage points, or roughly a 323% relative increase.

2. **Rapid Recovery**: Despite the severe initial impact, recovery was faster than after previous recessions, with unemployment falling from 14.8% to 6.3% within 8 months.

3. **Return to Baseline**: By late 2021, unemployment had returned to pre-pandemic levels. This is consistent with effective policy interventions, although the analysis here does not isolate their causal effect.

4. **Seasonal Patterns**: Outside the COVID period, unemployment follows a typical seasonal pattern, with slightly higher rates in winter months.

5. **Current Status**: Post-recovery unemployment has stabilized around 3.7-4.0%, similar to pre-pandemic levels.

### Policy Implications:

1. **Emergency Preparedness**: The speed of the 2020 shock highlights the need for flexible unemployment insurance systems and rapid-response economic frameworks.

2. **Intervention Effectiveness**: The combination of fiscal stimulus and monetary policy coincided with an unusually fast recovery and appears to have accelerated it.

3. **Long-term Resilience**: Investment in labor market adaptability and social safety nets is crucial for future crisis preparedness.

4. **Current Focus**: With unemployment at historically low levels, policy focus can shift to inflation control and sustainable growth.

### Recommendations:

- Maintain flexible policy frameworks for future crisis response
- Continue monitoring leading indicators for early intervention
- Invest in job retraining and reskilling programs
- Strengthen social safety nets based on lessons learned from COVID-19
- Focus on sustainable employment growth and economic stability