# ⚡ UK Historic Electricity Demand Analysis - Data Exploration

**Team Member(s)**: [Add your name here]  
**Objective**: Load and explore the UK Historic Electricity Demand dataset (2001-2025) to understand energy consumption patterns, seasonal trends, and grid dynamics.

## 📋 Overview
This notebook performs comprehensive analysis of the UK Historic Electricity Demand dataset containing 25 years of half-hourly electricity demand records. We'll examine demand patterns, renewable generation, interconnector flows, and seasonal variations.

---

## 🔍 **Step 1: Data Exploration and Quality Assessment**

### 1.1 📂 Initial Data Loading and Structure Analysis

**Purpose**: Load a sample from the electricity demand dataset to understand its structure, temporal resolution, and data quality.

**Key Questions**:
- What variables are available in the dataset?
- What is the temporal resolution and coverage?
- What does a typical electricity demand record look like?
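
Because the data is split into one file per year, it can help to index the files by year before picking a sample. The sketch below assumes the `demanddata_<year>.csv` naming pattern used in the loading cell; `index_files_by_year` and `missing_years` are hypothetical helpers for illustration, not part of the dataset's tooling.

```python
import re
from pathlib import Path

# Assumed naming pattern: demanddata_<4-digit year>.csv
_YEAR_PATTERN = re.compile(r"^demanddata_(\d{4})$")


def index_files_by_year(paths):
    """Map year -> Path for every file whose stem matches the assumed pattern."""
    by_year = {}
    for path in paths:
        match = _YEAR_PATTERN.match(Path(path).stem)
        if match:  # files that do not follow the pattern are skipped
            by_year[int(match.group(1))] = Path(path)
    return dict(sorted(by_year.items()))


def missing_years(by_year):
    """List the years absent between the first and last year found."""
    if not by_year:
        return []
    return [y for y in range(min(by_year), max(by_year) + 1) if y not in by_year]
```

Selecting a sample with `index_files_by_year(csv_files)[2011]` then fails loudly with a `KeyError` if that year is absent, rather than silently returning a neighbouring year.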

In [None]:
# ======================================
# Step 1: Load and Inspect UK Electricity Demand Data
# ======================================
# Objective:
# - Understand the multi-file structure (2001-2025)
# - Examine column structure and data types
# - Check temporal resolution and completeness
# ======================================

from pathlib import Path
import pandas as pd
import numpy as np
import glob
from datetime import datetime

# Path to electricity demand data directory
data_dir = Path("Dataset_2_UK_Historic_Electricity_Demand_Data/")

# ------------------------------------------------------------
# Discover all CSV files in the directory
# ------------------------------------------------------------
csv_files = list(data_dir.glob("demanddata_*.csv"))
csv_files.sort()

print(f"✅ Found {len(csv_files)} data files")
print("Years available:")
for file in csv_files:
    year = file.stem.split('_')[-1]
    print(f"  📅 {year}: {file.name}")

# ------------------------------------------------------------
# Load a sample file to understand structure
# ------------------------------------------------------------
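# Select the 2011 file by name; indexing by position would silently pick
# a different year if any file were missing from the directory
sample_file = next(f for f in csv_files if f.stem.split('_')[-1] == '2011')
print(f"\n📂 Loading sample file: {sample_file.name}")

df_sample = pd.read_csv(sample_file)
print("✅ Sample loaded successfully.")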
print("Shape:", df_sample.shape)  # (rows, columns)

print("\n🔍 Column names:")
for i, col in enumerate(df_sample.columns):
    print(f"  {i+1:2d}. {col}")

# Preview the first few rows
print("\n📋 First 5 rows:")
display(df_sample.head())

# ------------------------------------------------------------
# Inspect data types and basic statistics
# ------------------------------------------------------------
print("\n📊 DataFrame Info:")
df_sample.info()

print("\n💡 Basic Statistics:")
display(df_sample.describe())

# ------------------------------------------------------------
# Check for missing values
# ------------------------------------------------------------
print("\n❓ Missing Value Analysis:")
missing_pct = df_sample.isnull().sum() / len(df_sample) * 100
missing_analysis = pd.DataFrame({
    'Missing_Count': df_sample.isnull().sum(),
    'Missing_Percentage': missing_pct.round(2)
}).sort_values('Missing_Percentage', ascending=False)

print(missing_analysis[missing_analysis['Missing_Count'] > 0])

### 1.2 📊 Temporal Analysis and Data Coverage

**Purpose**: Understand the temporal structure of the electricity demand data, including settlement periods, date formats, and data completeness across years.

**Key Focus**:
- Settlement period structure (half-hourly data)
- Date parsing and temporal coverage
- Identification of any data gaps or anomalies
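
Half-hourly settlement gives 48 periods on a normal day, but the two clock-change days differ: the day the clocks go forward has 46 periods and the day they go back has 50. The sketch below assumes the rule that UK clocks change on the last Sunday of March and the last Sunday of October; `last_sunday`, `expected_periods`, and `period_start` are hypothetical helper names used only for illustration.

```python
import calendar
from datetime import date, timedelta


def last_sunday(year, month):
    """Return the date of the last Sunday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # weekday(): Monday=0 ... Sunday=6, so step back to the most recent Sunday
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def expected_periods(day):
    """Expected number of settlement periods on a local calendar date (assumed rule)."""
    if day == last_sunday(day.year, 3):
        return 46  # clocks go forward: one hour is skipped
    if day == last_sunday(day.year, 10):
        return 50  # clocks go back: one hour is repeated
    return 48


def period_start(period):
    """Clock time (hour, minute) at which a period starts on a standard 48-period day."""
    if not 1 <= period <= 48:
        raise ValueError("period must be in 1..48 on a standard day")
    return divmod((period - 1) * 30, 60)
```

Comparing the per-day record count against `expected_periods` separates genuine gaps from the two expected clock-change days each year.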

In [None]:
# ======================================
# Temporal Analysis of Electricity Data
# ======================================

# Parse settlement date and analyze temporal structure
df_sample['SETTLEMENT_DATE'] = pd.to_datetime(df_sample['SETTLEMENT_DATE'], format='%d-%b-%Y')

print("📅 TEMPORAL ANALYSIS:")
print("=" * 50)
print(f"Date range: {df_sample['SETTLEMENT_DATE'].min()} to {df_sample['SETTLEMENT_DATE'].max()}")
print(f"Total days: {df_sample['SETTLEMENT_DATE'].nunique()}")
print(f"Total records: {len(df_sample):,}")

# Analyze settlement periods (1-48 on a standard day; clock-change days have 46 or 50)
print("\n⏰ SETTLEMENT PERIOD ANALYSIS:")
print(f"Period range: {df_sample['SETTLEMENT_PERIOD'].min()} to {df_sample['SETTLEMENT_PERIOD'].max()}")
print(f"Unique periods: {df_sample['SETTLEMENT_PERIOD'].nunique()}")
print("Expected periods per day: 48 (half-hourly)")

# Check for missing settlement periods
expected_periods = set(range(1, 49))
actual_periods = set(df_sample['SETTLEMENT_PERIOD'].unique())
missing_periods = expected_periods - actual_periods

if missing_periods:
    print(f"⚠️ Missing settlement periods: {sorted(missing_periods)}")
else:
    print("✅ All settlement periods (1-48) present")

# Create datetime column for easier analysis.
# Note: this treats every day as 48 consecutive half-hours from midnight, so the
# timestamps are only approximate on the two clock-change days each year.
df_sample['DATETIME'] = df_sample['SETTLEMENT_DATE'] + pd.Timedelta(minutes=30) * (df_sample['SETTLEMENT_PERIOD'] - 1)

print("\n🕐 DATETIME ANALYSIS:")
print(f"First timestamp: {df_sample['DATETIME'].min()}")
print(f"Last timestamp: {df_sample['DATETIME'].max()}")

# Check for daylight saving transitions (days with != 48 periods)
daily_counts = df_sample.groupby('SETTLEMENT_DATE')['SETTLEMENT_PERIOD'].count()
non_standard_days = daily_counts[daily_counts != 48]

if not non_standard_days.empty:
    print("\n⚠️ Days with non-standard period counts:")
    for date, count in non_standard_days.items():
        print(f"  {date.strftime('%Y-%m-%d')}: {count} periods")
else:
    print("\n✅ All days have standard 48 periods")

# Sample of data structure
print("\n📋 SAMPLE RECORDS:")
sample_records = df_sample[['SETTLEMENT_DATE', 'SETTLEMENT_PERIOD', 'ND', 'ENGLAND_WALES_DEMAND', 'DATETIME']].head(10)
display(sample_records)

### 1.3 ⚡ Electricity Demand Variables Analysis

**Purpose**: Examine the key electricity variables including demand metrics, renewable generation, and interconnector flows.

**Key Variables**:
- ND: National Demand
- TSD: Transmission System Demand  
- England & Wales Demand
- Embedded Wind/Solar Generation
- Interconnector Flows

In [None]:
# ======================================
# Electricity Variables Analysis
# ======================================

# Define key variable categories
demand_vars = ['ND', 'TSD', 'ENGLAND_WALES_DEMAND']
renewable_vars = ['EMBEDDED_WIND_GENERATION', 'EMBEDDED_SOLAR_GENERATION', 
                  'EMBEDDED_WIND_CAPACITY', 'EMBEDDED_SOLAR_CAPACITY']
interconnector_vars = [col for col in df_sample.columns if '_FLOW' in col]
storage_vars = ['NON_BM_STOR', 'PUMP_STORAGE_PUMPING']

print("⚡ ELECTRICITY DEMAND VARIABLES ANALYSIS:")
print("=" * 60)

# Analyze demand variables
print("\n💡 DEMAND VARIABLES:")
for var in demand_vars:
    if var in df_sample.columns:
        data = df_sample[var]
        print(f"  {var}:")
        print(f"    Range: {data.min():,.0f} - {data.max():,.0f} MW")
        print(f"    Mean: {data.mean():,.0f} MW")
        print(f"    Std: {data.std():,.0f} MW")

# Analyze renewable generation
print("\n🌱 RENEWABLE GENERATION VARIABLES:")
for var in renewable_vars:
    if var in df_sample.columns:
        data = df_sample[var]
        if 'CAPACITY' in var:
            print(f"  {var}: {data.mean():,.0f} MW (capacity)")
        else:
            print(f"  {var}: {data.min():,.0f} - {data.max():,.0f} MW")

# Analyze interconnector flows
print("\n🔌 INTERCONNECTOR FLOWS:")
for var in interconnector_vars:
    if var in df_sample.columns:
        data = df_sample[var]
        print(f"  {var}: {data.min():,.0f} to {data.max():,.0f} MW")

# Analyze storage variables
print("\n🔋 STORAGE VARIABLES:")
for var in storage_vars:
    if var in df_sample.columns:
        data = df_sample[var]
        print(f"  {var}: {data.min():,.0f} to {data.max():,.0f} MW")

# Create summary statistics table
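print("\n📊 SUMMARY STATISTICS (MW):")
# Keep only the columns that exist in this file, so a missing column cannot raise a KeyError
key_variables = [v for v in ['ND', 'TSD', 'ENGLAND_WALES_DEMAND', 'EMBEDDED_WIND_GENERATION', 'EMBEDDED_SOLAR_GENERATION']
                 if v in df_sample.columns]
summary_stats = df_sample[key_variables].describe().round(0)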
display(summary_stats)

# Correlation analysis between demand variables
print("\n🔗 CORRELATION BETWEEN DEMAND VARIABLES:")
demand_corr = df_sample[demand_vars].corr().round(3)
display(demand_corr)

---

## 🧹 **Step 2: Data Integration and Processing**

### 2.1 🔄 Multi-Year Data Loading and Consolidation

**Purpose**: Load and combine data from all available years (2001-2025) into a unified dataset for comprehensive analysis.

**Approach**: 
- Process each year's data with consistent schema
- Handle any structural changes over time
- Create a master dataset with temporal continuity
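
Structural changes over time usually show up as columns that exist in some yearly files but not in others. A minimal sketch of how such differences could be surfaced before concatenating is shown below; `read_header` and `schema_report` are hypothetical helpers, and the header lists in the usage note are made up for illustration rather than taken from the real files.

```python
import csv


def read_header(path):
    """Read only the header row of a CSV file."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle))


def schema_report(headers_by_year):
    """Return (all_columns, missing), where missing maps year -> columns absent that year."""
    all_columns = []
    for year in sorted(headers_by_year):
        for col in headers_by_year[year]:
            if col not in all_columns:  # keep first-seen order
                all_columns.append(col)
    missing = {}
    for year, cols in headers_by_year.items():
        present = set(cols)
        absent = [c for c in all_columns if c not in present]
        if absent:  # only report years that actually lack something
            missing[year] = absent
    return all_columns, missing
```

For example, `schema_report({2005: ["SETTLEMENT_DATE", "ND"], 2020: ["SETTLEMENT_DATE", "ND", "EMBEDDED_SOLAR_GENERATION"]})` reports that the 2005 entry lacks the solar column. `pd.concat` aligns frames by column name and fills such gaps with `NaN`, so a report like this mainly tells you where those `NaN` values will come from.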

In [None]:
# ======================================
# Multi-Year Data Loading and Integration
# ======================================

def load_and_clean_electricity_data(data_dir, years_subset=None):
    """
    Load and clean electricity demand data from multiple CSV files
    
    Parameters:
    - data_dir: Path to directory containing CSV files
    - years_subset: List of years to load (None for all years)
    
    Returns:
    - Combined DataFrame with all years
    """
    
    # Find all CSV files
    csv_files = list(data_dir.glob("demanddata_*.csv"))
    csv_files.sort()
    
    # Filter by years if specified
    if years_subset:
        filtered_files = []
        for file in csv_files:
            year = int(file.stem.split('_')[-1])
            if year in years_subset:
                filtered_files.append(file)
        csv_files = filtered_files
    
    print(f"📂 Loading data from {len(csv_files)} files...")
    
    all_dataframes = []
    
    for i, file in enumerate(csv_files):
        try:
            year = file.stem.split('_')[-1]
            print(f"  Loading {year}... ", end="")
            
            # Load data
            df_year = pd.read_csv(file)
            
            # Add year column
            df_year['YEAR'] = int(year)
            
            # Parse settlement date
            df_year['SETTLEMENT_DATE'] = pd.to_datetime(df_year['SETTLEMENT_DATE'], format='%d-%b-%Y')
            
            # Create full datetime
            df_year['DATETIME'] = (df_year['SETTLEMENT_DATE'] + 
                                  pd.Timedelta(minutes=30) * (df_year['SETTLEMENT_PERIOD'] - 1))
            
            # Add temporal features
            df_year['MONTH'] = df_year['DATETIME'].dt.month
            df_year['DAY_OF_WEEK'] = df_year['DATETIME'].dt.dayofweek
            df_year['HOUR'] = df_year['DATETIME'].dt.hour
            df_year['MINUTE'] = df_year['DATETIME'].dt.minute
            df_year['SEASON'] = df_year['MONTH'].map({
                12: 'Winter', 1: 'Winter', 2: 'Winter',
                3: 'Spring', 4: 'Spring', 5: 'Spring',
                6: 'Summer', 7: 'Summer', 8: 'Summer',
                9: 'Autumn', 10: 'Autumn', 11: 'Autumn'
            })
            
            all_dataframes.append(df_year)
            print(f"{len(df_year):,} records")
            
        except Exception as e:
            print(f"Error loading {file}: {e}")
            continue
    
    if all_dataframes:
        # Combine all years
        df_combined = pd.concat(all_dataframes, ignore_index=True)
        df_combined = df_combined.sort_values(['DATETIME']).reset_index(drop=True)
        
        print(f"\n✅ Successfully loaded {len(df_combined):,} records")
        print(f"📅 Date range: {df_combined['DATETIME'].min()} to {df_combined['DATETIME'].max()}")
        print(f"📊 Years: {sorted(df_combined['YEAR'].unique())}")
        
        return df_combined
    else:
        print("❌ No data loaded successfully")
        return None

# Load a subset of recent years for initial analysis (to avoid memory issues)
recent_years = [2020, 2021, 2022, 2023, 2024]  # 5 years of recent data

print("Loading recent years for analysis...")
df_electricity = load_and_clean_electricity_data(data_dir, years_subset=recent_years)

if df_electricity is not None:
    # Display basic info about the combined dataset
    print("\n📋 COMBINED DATASET INFO:")
    print(f"Shape: {df_electricity.shape}")
    print(f"Memory usage: {df_electricity.memory_usage(deep=True).sum() / 1024**2:.1f} MB")
    
    # Save processed data
    output_path = Path("data/interim/electricity_demand_processed.csv")
    output_path.parent.mkdir(parents=True, exist_ok=True)
    df_electricity.to_csv(output_path, index=False)
    print(f"💾 Saved processed data to {output_path}")
    
    # Show sample of processed data
    print("\n📋 PROCESSED DATA SAMPLE:")
    display(df_electricity.head(10))

## 📊 **Step 3: Electricity Demand Analysis & Visualizations**

### 3.1 📚 Visualization Setup

**Purpose**: Import libraries and prepare the processed electricity demand data for comprehensive visualization and analysis.

In [None]:
# ======================================
# Visualization Setup for Electricity Data
# ======================================

import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

# Set plotting styles
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (15, 10)
plt.rcParams['font.size'] = 12

print("✅ Visualization libraries imported")

# Load processed data if not already in memory.
# The check also covers the case where the loader returned None.
if globals().get('df_electricity') is None:
    print("📂 Loading processed electricity data...")
    processed_path = Path("data/interim/electricity_demand_processed.csv")
    if processed_path.exists():
        df_electricity = pd.read_csv(processed_path)
        df_electricity['DATETIME'] = pd.to_datetime(df_electricity['DATETIME'])
        df_electricity['SETTLEMENT_DATE'] = pd.to_datetime(df_electricity['SETTLEMENT_DATE'])
        print(f"✅ Loaded {len(df_electricity):,} records")
    else:
        print("❌ Processed data not found. Please run the data loading cell first.")

if globals().get('df_electricity') is not None:
    print(f"📊 Ready for analysis with {len(df_electricity):,} electricity records")
    print(f"📅 Date range: {df_electricity['DATETIME'].min()} to {df_electricity['DATETIME'].max()}")
    print(f"⚡ Peak demand: {df_electricity['ND'].max():,.0f} MW")
    print(f"⚡ Minimum demand: {df_electricity['ND'].min():,.0f} MW")

### 3.2 📈 Electricity Demand Patterns Analysis

**Purpose**: Analyze electricity demand patterns across different time scales - daily, weekly, seasonal, and annual trends.
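
For the annual trend, one common summary is the compound annual growth rate (CAGR): the constant yearly rate that would take the first year's mean demand to the last year's. The sketch below is illustrative only; `cagr` is a hypothetical helper and the numbers in the usage note are made up, not results from the dataset.

```python
def cagr(first_value, last_value, n_intervals):
    """Compound annual growth rate between two values separated by n_intervals years.

    Five calendar years (e.g. 2020 to 2024) span four yearly intervals.
    """
    if n_intervals <= 0:
        raise ValueError("n_intervals must be positive")
    if first_value <= 0:
        raise ValueError("first_value must be positive")
    return (last_value / first_value) ** (1 / n_intervals) - 1
```

As a made-up example, `cagr(100.0, 121.0, 2)` is roughly `0.10`, i.e. 10% per year. Dividing the total change by the number of years instead would mix up years and intervals and ignore compounding.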

In [None]:
# ======================================
# Comprehensive Electricity Demand Patterns Analysis
# ======================================

def create_demand_analysis(df):
    """Create comprehensive electricity demand analysis"""
    
    fig = plt.figure(figsize=(20, 16))
    
    # 1. Annual demand trends
    ax1 = plt.subplot(3, 3, 1)
    annual_stats = df.groupby('YEAR')['ND'].agg(['mean', 'max', 'min'])
    ax1.plot(annual_stats.index, annual_stats['mean'], marker='o', linewidth=3, label='Average', color='steelblue')
    ax1.plot(annual_stats.index, annual_stats['max'], marker='s', linewidth=2, label='Peak', color='red', alpha=0.7)
    ax1.plot(annual_stats.index, annual_stats['min'], marker='^', linewidth=2, label='Minimum', color='green', alpha=0.7)
    ax1.set_title('⚡ Annual Electricity Demand Trends', fontweight='bold', fontsize=14)
    ax1.set_ylabel('Demand (MW)')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # 2. Seasonal patterns
    ax2 = plt.subplot(3, 3, 2)
    seasonal_avg = df.groupby('SEASON')['ND'].mean().reindex(['Spring', 'Summer', 'Autumn', 'Winter'])
    bars = ax2.bar(seasonal_avg.index, seasonal_avg.values, color=['lightgreen', 'gold', 'orange', 'lightblue'], alpha=0.8)
    ax2.set_title('🌍 Seasonal Demand Patterns', fontweight='bold', fontsize=14)
    ax2.set_ylabel('Average Demand (MW)')
    for bar in bars:
        height = bar.get_height()
        ax2.text(bar.get_x() + bar.get_width()/2., height + 500,
                f'{height:,.0f}', ha='center', va='bottom', fontsize=11)
    
    # 3. Daily patterns (by hour)
    ax3 = plt.subplot(3, 3, 3)
    hourly_avg = df.groupby('HOUR')['ND'].mean()
    ax3.plot(hourly_avg.index, hourly_avg.values, marker='o', linewidth=3, color='purple')
    ax3.set_title('🕐 Daily Demand Profile (24-hour)', fontweight='bold', fontsize=14)
    ax3.set_xlabel('Hour of Day')
    ax3.set_ylabel('Average Demand (MW)')
    ax3.grid(True, alpha=0.3)
    ax3.set_xticks(range(0, 24, 2))

    # 4. Weekly patterns
    ax4 = plt.subplot(3, 3, 4)
    day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
    weekly_avg = df.groupby('DAY_OF_WEEK')['ND'].mean()
    bars = ax4.bar(range(7), weekly_avg.values, color='coral', alpha=0.8)
    ax4.set_title('📅 Weekly Demand Patterns', fontweight='bold', fontsize=14)
    ax4.set_xticks(range(7))
    ax4.set_xticklabels(day_names, rotation=45)
    ax4.set_ylabel('Average Demand (MW)')

    # 5. Monthly demand distribution
    ax5 = plt.subplot(3, 3, 5)
    monthly_avg = df.groupby('MONTH')['ND'].mean()
    ax5.plot(monthly_avg.index, monthly_avg.values, marker='s', linewidth=3, color='darkgreen')
    ax5.set_title('📅 Monthly Demand Trends', fontweight='bold', fontsize=14)
    ax5.set_xlabel('Month')
    ax5.set_ylabel('Average Demand (MW)')
    ax5.grid(True, alpha=0.3)
    ax5.set_xticks(range(1, 13))

    # 6. Demand distribution histogram
    ax6 = plt.subplot(3, 3, 6)
    ax6.hist(df['ND'], bins=50, color='skyblue', alpha=0.7, edgecolor='black')
    ax6.axvline(df['ND'].mean(), color='red', linestyle='--', linewidth=2, label=f'Mean: {df["ND"].mean():,.0f} MW')
    ax6.axvline(df['ND'].median(), color='orange', linestyle='--', linewidth=2, label=f'Median: {df["ND"].median():,.0f} MW')
    ax6.set_title('⚡ Demand Distribution', fontweight='bold', fontsize=14)
    ax6.set_xlabel('National Demand (MW)')
    ax6.set_ylabel('Frequency')
    ax6.legend()

    # 7. Peak vs off-peak analysis (hours 07 to 19 inclusive count as peak)
    ax7 = plt.subplot(3, 3, 7)
    df['PERIOD_TYPE'] = df['HOUR'].map(lambda x: 'Peak' if 7 <= x <= 19 else 'Off-Peak')
    # Fix the bar order so the colours always map to Peak (orange) and Off-Peak (light blue)
    period_stats = df.groupby('PERIOD_TYPE')['ND'].mean().reindex(['Peak', 'Off-Peak'])
    bars = ax7.bar(period_stats.index, period_stats.values, color=['orange', 'lightblue'], alpha=0.8)
    ax7.set_title('⚡ Peak vs Off-Peak Demand', fontweight='bold', fontsize=14)
    ax7.set_ylabel('Average Demand (MW)')
    for bar in bars:
        height = bar.get_height()
        ax7.text(bar.get_x() + bar.get_width()/2., height + 500,
                f'{height:,.0f}', ha='center', va='bottom', fontsize=12)

    # 8. COVID-19 impact (2020 vs other years)
    ax8 = plt.subplot(3, 3, 8)
    covid_comparison = df.groupby(['YEAR', 'MONTH'])['ND'].mean().unstack(level=0)
    for year in covid_comparison.columns:
        line_style = '--' if year == 2020 else '-'
        alpha = 1.0 if year == 2020 else 0.6
        linewidth = 3 if year == 2020 else 2
        ax8.plot(covid_comparison.index, covid_comparison[year],
                marker='o', label=f'{year}', linestyle=line_style,
                alpha=alpha, linewidth=linewidth)
    ax8.set_title('🦠 COVID-19 Impact on Demand', fontweight='bold', fontsize=14)
    ax8.set_xlabel('Month')
    ax8.set_ylabel('Average Demand (MW)')
    ax8.legend()
    ax8.grid(True, alpha=0.3)

    # 9. Settlement period analysis
    ax9 = plt.subplot(3, 3, 9)
    period_avg = df.groupby('SETTLEMENT_PERIOD')['ND'].mean()
    ax9.plot(period_avg.index, period_avg.values, linewidth=2, color='darkblue')
    ax9.set_title('📊 Half-Hourly Settlement Periods', fontweight='bold', fontsize=14)
    ax9.set_xlabel('Settlement Period (1-48)')
    ax9.set_ylabel('Average Demand (MW)')
    ax9.grid(True, alpha=0.3)
    ax9.set_xticks(range(1, 49, 4))

    plt.tight_layout()
    plt.show()

    # Print key insights
    print("⚡ KEY ELECTRICITY DEMAND INSIGHTS:")
    print("=" * 60)
    print(f"📊 Peak demand: {df['ND'].max():,.0f} MW on {df.loc[df['ND'].idxmax(), 'DATETIME']}")
    print(f"📊 Minimum demand: {df['ND'].min():,.0f} MW on {df.loc[df['ND'].idxmin(), 'DATETIME']}")

    # Compound annual growth of mean demand; N years of data span N - 1 yearly intervals
    n_intervals = len(annual_stats) - 1
    if n_intervals > 0:
        growth = ((annual_stats['mean'].iloc[-1] / annual_stats['mean'].iloc[0]) ** (1 / n_intervals) - 1) * 100
        print(f"📈 Compound annual growth in mean demand: {growth:.2f}% per year")

    # Seasonal insights
    seasonal_range = seasonal_avg.max() - seasonal_avg.min()
    print(f"🌍 Seasonal variation: {seasonal_range:,.0f} MW ({seasonal_range/seasonal_avg.mean()*100:.1f}%)")
    print(f"🌍 Highest season: {seasonal_avg.idxmax()} ({seasonal_avg.max():,.0f} MW)")
    print(f"🌍 Lowest season: {seasonal_avg.idxmin()} ({seasonal_avg.min():,.0f} MW)")

    # Daily insights
    daily_range = hourly_avg.max() - hourly_avg.min()
    print(f"🕐 Daily variation: {daily_range:,.0f} MW ({daily_range/hourly_avg.mean()*100:.1f}%)")
    print(f"🕐 Peak hour: {hourly_avg.idxmax()}:00 ({hourly_avg.max():,.0f} MW)")
    print(f"🕐 Low hour: {hourly_avg.idxmin()}:00 ({hourly_avg.min():,.0f} MW)")

    # Peak vs off-peak
    peak_premium = (period_stats['Peak'] - period_stats['Off-Peak']) / period_stats['Off-Peak'] * 100
    print(f"⚡ Peak vs off-peak premium: {peak_premium:.1f}%")

# Run comprehensive demand analysis
if 'df_electricity' in locals():
    create_demand_analysis(df_electricity)
else:
    print("❌ Please load electricity data first")

### 3.3 🌱 Renewable Energy Integration Analysis

**Purpose**: Analyze the integration and impact of renewable energy sources (wind and solar) on the UK electricity system.
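
Two ratios carry most of this analysis: the capacity factor (generation as a share of installed capacity) and the penetration rate, taken in this notebook as embedded wind plus solar generation relative to National Demand. The sketch below shows both for a single half-hour. The helper names are hypothetical, and returning `None` for a zero denominator is an illustrative choice for periods where capacity is recorded as zero.

```python
def capacity_factor(generation_mw, capacity_mw):
    """Capacity factor in percent, or None when capacity is zero or missing."""
    if capacity_mw is None or capacity_mw <= 0:
        return None  # avoids a division by zero masquerading as a 100% capacity factor
    return generation_mw / capacity_mw * 100


def penetration(wind_mw, solar_mw, demand_mw):
    """Embedded wind + solar generation as a percentage of demand, or None if demand is not positive."""
    if demand_mw <= 0:
        return None
    return (wind_mw + solar_mw) / demand_mw * 100
```

With made-up inputs, `capacity_factor(500, 2000)` gives 25% and `penetration(3000, 1000, 40000)` gives about 10%. An average capacity factor is best computed from per-period values, because capacity changes over the years.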

In [None]:
# ======================================
# Renewable Energy Integration Analysis
# ======================================

def analyze_renewable_integration(df):
    """Analyze renewable energy integration patterns"""

    fig = plt.figure(figsize=(18, 12))

    # 1. Renewable capacity growth over time
    ax1 = plt.subplot(2, 3, 1)
    renewable_capacity = df.groupby('YEAR')[['EMBEDDED_WIND_CAPACITY', 'EMBEDDED_SOLAR_CAPACITY']].mean()
    ax1.plot(renewable_capacity.index, renewable_capacity['EMBEDDED_WIND_CAPACITY'],
             marker='o', linewidth=3, label='Wind Capacity', color='blue')
    ax1.plot(renewable_capacity.index, renewable_capacity['EMBEDDED_SOLAR_CAPACITY'],
             marker='s', linewidth=3, label='Solar Capacity', color='orange')
    ax1.set_title('🌱 Renewable Capacity Growth', fontweight='bold')
    ax1.set_ylabel('Capacity (MW)')
    ax1.legend()
    ax1.grid(True, alpha=0.3)

    # 2. Renewable generation vs capacity (capacity factors)
    ax2 = plt.subplot(2, 3, 2)
    # Mask zero capacity so those periods become NaN instead of infinities clipped to 100%
    wind_cap = df['EMBEDDED_WIND_CAPACITY'].where(df['EMBEDDED_WIND_CAPACITY'] > 0)
    solar_cap = df['EMBEDDED_SOLAR_CAPACITY'].where(df['EMBEDDED_SOLAR_CAPACITY'] > 0)
    df['WIND_CF'] = (df['EMBEDDED_WIND_GENERATION'] / wind_cap * 100).clip(0, 100)
    df['SOLAR_CF'] = (df['EMBEDDED_SOLAR_GENERATION'] / solar_cap * 100).clip(0, 100)

    monthly_cf = df.groupby('MONTH')[['WIND_CF', 'SOLAR_CF']].mean()
    ax2.plot(monthly_cf.index, monthly_cf['WIND_CF'], marker='o', linewidth=2, label='Wind CF', color='blue')
    ax2.plot(monthly_cf.index, monthly_cf['SOLAR_CF'], marker='s', linewidth=2, label='Solar CF', color='orange')
    ax2.set_title('🔄 Monthly Capacity Factors', fontweight='bold')
    ax2.set_xlabel('Month')
    ax2.set_ylabel('Capacity Factor (%)')
    ax2.legend()
    ax2.grid(True, alpha=0.3)

    # 3. Daily renewable generation patterns
    ax3 = plt.subplot(2, 3, 3)
    hourly_renewables = df.groupby('HOUR')[['EMBEDDED_WIND_GENERATION', 'EMBEDDED_SOLAR_GENERATION']].mean()
    ax3.plot(hourly_renewables.index, hourly_renewables['EMBEDDED_WIND_GENERATION'],
             linewidth=2, label='Wind', color='blue')
    ax3.plot(hourly_renewables.index, hourly_renewables['EMBEDDED_SOLAR_GENERATION'],
             linewidth=2, label='Solar', color='orange')
    ax3.set_title('☀️ Daily Renewable Patterns', fontweight='bold')
    ax3.set_xlabel('Hour of Day')
    ax3.set_ylabel('Generation (MW)')
    ax3.legend()
    ax3.grid(True, alpha=0.3)

    # 4. Renewable penetration (% of demand)
    ax4 = plt.subplot(2, 3, 4)
    df['TOTAL_RENEWABLES'] = df['EMBEDDED_WIND_GENERATION'] + df['EMBEDDED_SOLAR_GENERATION']
    df['RENEWABLE_PENETRATION'] = (df['TOTAL_RENEWABLES'] / df['ND'] * 100).clip(0, 100)

    annual_penetration = df.groupby('YEAR')['RENEWABLE_PENETRATION'].mean()
    bars = ax4.bar(annual_penetration.index, annual_penetration.values,
                   color='green', alpha=0.7, edgecolor='darkgreen')
    ax4.set_title('📊 Renewable Penetration Rate', fontweight='bold')
    ax4.set_ylabel('Penetration (%)')
    ax4.grid(True, alpha=0.3, axis='y')

    # Add value labels on bars
    for bar in bars:
        height = bar.get_height()
        ax4.text(bar.get_x() + bar.get_width()/2., height + 0.1,
                f'{height:.1f}%', ha='center', va='bottom')

    # 5. Wind vs Solar generation correlation
    ax5 = plt.subplot(2, 3, 5)
    # Sample for plotting performance; cap at len(df) so small subsets do not raise
    sample_data = df.sample(min(10000, len(df)), random_state=42)
    ax5.scatter(sample_data['EMBEDDED_WIND_GENERATION'],
               sample_data['EMBEDDED_SOLAR_GENERATION'],
               alpha=0.5, s=10, color='purple')
    ax5.set_xlabel('Wind Generation (MW)')
    ax5.set_ylabel('Solar Generation (MW)')
    ax5.set_title('🌪️☀️ Wind vs Solar Correlation', fontweight='bold')
    ax5.grid(True, alpha=0.3)

    # Calculate correlation
    correlation = df['EMBEDDED_WIND_GENERATION'].corr(df['EMBEDDED_SOLAR_GENERATION'])
    ax5.text(0.05, 0.95, f'Correlation: {correlation:.3f}',
             transform=ax5.transAxes, fontsize=12, bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

    # 6. Seasonal renewable output
    ax6 = plt.subplot(2, 3, 6)
    seasonal_renewables = df.groupby('SEASON')[['EMBEDDED_WIND_GENERATION', 'EMBEDDED_SOLAR_GENERATION']].mean()

    x = range(len(seasonal_renewables.index))
    width = 0.35
    ax6.bar([i - width/2 for i in x], seasonal_renewables['EMBEDDED_WIND_GENERATION'],
            width, label='Wind', color='blue', alpha=0.7)
    ax6.bar([i + width/2 for i in x], seasonal_renewables['EMBEDDED_SOLAR_GENERATION'],
            width, label='Solar', color='orange', alpha=0.7)

    ax6.set_title('🍂 Seasonal Renewable Output', fontweight='bold')
    ax6.set_ylabel('Average Generation (MW)')
    ax6.set_xticks(x)
    ax6.set_xticklabels(seasonal_renewables.index)
    ax6.legend()
    ax6.grid(True, alpha=0.3, axis='y')

    plt.tight_layout()
    plt.show()

    # Print renewable insights
    print("🌱 RENEWABLE ENERGY INSIGHTS:")
    print("=" * 60)

    # Capacity insights (df is sorted by DATETIME, so the last row is the most recent)
    latest_wind_cap = df['EMBEDDED_WIND_CAPACITY'].iloc[-1]
    latest_solar_cap = df['EMBEDDED_SOLAR_CAPACITY'].iloc[-1]
    print(f"🌪️ Current wind capacity: {latest_wind_cap:,.0f} MW")
    print(f"☀️ Current solar capacity: {latest_solar_cap:,.0f} MW")

    # Generation insights; the CF is the mean of per-period values, not average
    # generation over the latest capacity, because capacity changes over time
    avg_wind_gen = df['EMBEDDED_WIND_GENERATION'].mean()
    avg_solar_gen = df['EMBEDDED_SOLAR_GENERATION'].mean()
    print(f"🌪️ Average wind generation: {avg_wind_gen:,.0f} MW ({df['WIND_CF'].mean():.1f}% CF)")
    print(f"☀️ Average solar generation: {avg_solar_gen:,.0f} MW ({df['SOLAR_CF'].mean():.1f}% CF)")

    # Penetration insights
    max_penetration = df['RENEWABLE_PENETRATION'].max()
    avg_penetration = df['RENEWABLE_PENETRATION'].mean()
    print(f"📊 Average renewable penetration: {avg_penetration:.1f}%")
    print(f"📊 Peak renewable penetration: {max_penetration:.1f}%")

    # Variability insights
    wind_cv = df['EMBEDDED_WIND_GENERATION'].std() / df['EMBEDDED_WIND_GENERATION'].mean() * 100
    solar_cv = df['EMBEDDED_SOLAR_GENERATION'].std() / df['EMBEDDED_SOLAR_GENERATION'].mean() * 100
    print(f"📊 Wind variability (CV): {wind_cv:.1f}%")
    print(f"📊 Solar variability (CV): {solar_cv:.1f}%")

# Run renewable analysis
if 'df_electricity' in locals():
    analyze_renewable_integration(df_electricity)
else:
    print("❌ Please load electricity data first")

### 3.4 🔌 Interconnector Flow Analysis

**Purpose**: Analyze electricity imports and exports through international interconnectors to understand the UK's energy trade patterns.
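
Two small steps underpin this analysis: splitting signed flows into imports and exports, and converting half-hourly power readings (MW) into energy (MWh). The sketch below follows this notebook's sign convention (positive is import, negative is export) and assumes each reading is the average power over its half-hour. `split_flows` and `energy_mwh` are hypothetical helper names.

```python
def split_flows(flows_mw):
    """Split the signed flows for one half-hour into (imports, exports), both non-negative."""
    imports = sum(f for f in flows_mw if f > 0)
    exports = sum(-f for f in flows_mw if f < 0)
    return imports, exports


def energy_mwh(power_mw_readings, hours_per_period=0.5):
    """Convert average-power readings (MW) into total energy (MWh)."""
    return sum(power_mw_readings) * hours_per_period
```

With made-up values, `split_flows([1000, -250, 0, 500])` returns `(1500, 250)`, and two half-hours at 1000 MW and 2000 MW amount to 1500 MWh. Summing MW readings without the 0.5 factor would overstate energy by a factor of two.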

In [None]:
# ======================================
# Interconnector Flow Analysis
# ======================================

def analyze_interconnector_flows(df):
    """Analyze electricity imports/exports through interconnectors"""

    # Identify interconnector columns
    interconnector_cols = [col for col in df.columns if '_FLOW' in col]

    if not interconnector_cols:
        print("❌ No interconnector flow data found")
        return

    print(f"🔌 Found {len(interconnector_cols)} interconnectors: {interconnector_cols}")

    fig = plt.figure(figsize=(18, 12))

    # 1. Total interconnector flows over time
    ax1 = plt.subplot(2, 3, 1)
    df['TOTAL_IMPORTS'] = df[interconnector_cols].clip(lower=0).sum(axis=1)
    df['TOTAL_EXPORTS'] = df[interconnector_cols].clip(upper=0).abs().sum(axis=1)
    df['NET_IMPORTS'] = df['TOTAL_IMPORTS'] - df['TOTAL_EXPORTS']

    monthly_flows = df.groupby(['YEAR', 'MONTH'])[['TOTAL_IMPORTS', 'TOTAL_EXPORTS', 'NET_IMPORTS']].mean()

    # Plot net imports over time
    time_index = [f"{year}-{month:02d}" for year, month in monthly_flows.index]
    ax1.plot(range(len(time_index)), monthly_flows['NET_IMPORTS'],
             linewidth=2, color='darkblue', label='Net Imports')
    ax1.axhline(y=0, color='red', linestyle='--', alpha=0.7)
    ax1.set_title('🔌 Net Electricity Imports Over Time', fontweight='bold')
    ax1.set_ylabel('Net Imports (MW)')
    ax1.grid(True, alpha=0.3)

    # Set x-axis labels (roughly ten evenly spaced labels)
    step = max(1, len(time_index) // 10)
    ax1.set_xticks(range(0, len(time_index), step))
    ax1.set_xticklabels([time_index[i] for i in range(0, len(time_index), step)], rotation=45)

    # 2. Individual interconnector flows
    ax2 = plt.subplot(2, 3, 2)
    interconnector_avg = df[interconnector_cols].mean().sort_values(ascending=True)

    colors = ['red' if x < 0 else 'green' for x in interconnector_avg.values]
    bars = ax2.barh(range(len(interconnector_avg)), interconnector_avg.values, color=colors, alpha=0.7)
    ax2.set_yticks(range(len(interconnector_avg)))
    ax2.set_yticklabels([name.replace('_FLOW', '') for name in interconnector_avg.index])
    ax2.set_xlabel('Average Flow (MW)')
    ax2.set_title('🌍 Average Interconnector Flows\n(Positive=Import, Negative=Export)', fontweight='bold')
    ax2.axvline(x=0, color='black', linestyle='-', alpha=0.7)
    ax2.grid(True, alpha=0.3, axis='x')

    # 3. Seasonal interconnector patterns
    ax3 = plt.subplot(2, 3, 3)
    seasonal_flows = df.groupby('SEASON')[['TOTAL_IMPORTS', 'TOTAL_EXPORTS']].mean()

    x = range(len(seasonal_flows.index))
    width = 0.35
    ax3.bar([i - width/2 for i in x], seasonal_flows['TOTAL_IMPORTS'],
            width, label='Imports', color='green', alpha=0.7)
    ax3.bar([i + width/2 for i in x], seasonal_flows['TOTAL_EXPORTS'],
            width, label='Exports', color='red', alpha=0.7)

    ax3.set_title('🍂 Seasonal Import/Export Patterns', fontweight='bold')
    ax3.set_ylabel('Average Flow (MW)')
    ax3.set_xticks(x)
    ax3.set_xticklabels(seasonal_flows.index)
    ax3.legend()
    ax3.grid(True, alpha=0.3, axis='y')

    # 4. Daily import/export patterns
    ax4 = plt.subplot(2, 3, 4)
    hourly_flows = df.groupby('HOUR')[['TOTAL_IMPORTS', 'TOTAL_EXPORTS', 'NET_IMPORTS']].mean()

    ax4.plot(hourly_flows.index, hourly_flows['TOTAL_IMPORTS'],
             linewidth=2, label='Imports', color='green')
    ax4.plot(hourly_flows.index, hourly_flows['TOTAL_EXPORTS'],
             linewidth=2, label='Exports', color='red')
    ax4.plot(hourly_flows.index, hourly_flows['NET_IMPORTS'],
             linewidth=2, label='Net Imports', color='blue', linestyle='--')

    ax4.set_title('🕐 Daily Import/Export Patterns', fontweight='bold')
    ax4.set_xlabel('Hour of Day')
    ax4.set_ylabel('Flow (MW)')
    ax4.legend()
    ax4.grid(True, alpha=0.3)
    ax4.axhline(y=0, color='black', linestyle='-', alpha=0.5)

    # 5. Interconnector utilization correlation with demand
    ax5 = plt.subplot(2, 3, 5)
    # Sample for performance; cap at len(df) so small subsets do not raise
    sample_data = df.sample(min(5000, len(df)), random_state=42)
    ax5.scatter(sample_data['ND'], sample_data['NET_IMPORTS'],
               alpha=0.5, s=10, color='purple')
    ax5.set_xlabel('National Demand (MW)')
    ax5.set_ylabel('Net Imports (MW)')
    ax5.set_title('⚡ Demand vs Net Imports Correlation', fontweight='bold')
    ax5.grid(True, alpha=0.3)

    # Calculate and display correlation
    correlation = df['ND'].corr(df['NET_IMPORTS'])
    ax5.text(0.05, 0.95, f'Correlation: {correlation:.3f}',
             transform=ax5.transAxes, fontsize=12,
             bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

    # 6. Import dependency ratio
    ax6 = plt.subplot(2, 3, 6)
    df['IMPORT_DEPENDENCY'] = (df['TOTAL_IMPORTS'] / df['ND'] * 100).clip(0, 100)
    annual_dependency = df.groupby('YEAR')['IMPORT_DEPENDENCY'].mean()

    bars = ax6.bar(annual_dependency.index, annual_dependency.values,
                   color='orange', alpha=0.7, edgecolor='darkorange')
    ax6.set_title('📊 Import Dependency Ratio', fontweight='bold')
    ax6.set_ylabel('Import Dependency (%)')
    ax6.grid(True, alpha=0.3, axis='y')

    # Add value labels
    for bar in bars:
        height = bar.get_height()
        if height > 0:
            ax6.text(bar.get_x() + bar.get_width()/2., height + 0.1,
                    f'{height:.1f}%', ha='center', va='bottom', fontsize=10)

    plt.tight_layout()
    plt.show()

    # Print interconnector insights
    print("🔌 INTERCONNECTOR FLOW INSIGHTS:")
    print("=" * 60)

    # Overall trade balance: each half-hourly MW reading corresponds to 0.5 MWh
    total_imports = df['TOTAL_IMPORTS'].sum() * 0.5
    total_exports = df['TOTAL_EXPORTS'].sum() * 0.5
    net_trade = total_imports - total_exports

    print(f"📊 Total imports: {total_imports:,.0f} MWh")
    print(f"📊 Total exports: {total_exports:,.0f} MWh")
    print(f"📊 Net trade balance: {net_trade:,.0f} MWh ({'net importer' if net_trade > 0 else 'net exporter'})")

    # Peak flows
    max_import = df['TOTAL_IMPORTS'].max()
    max_export = df['TOTAL_EXPORTS'].max()
    print(f"🔌 Peak total import flow: {max_import:,.0f} MW")
    print(f"🔌 Peak total export flow: {max_export:,.0f} MW")

    # Individual interconnector analysis
    print("\n🌍 INDIVIDUAL INTERCONNECTOR PERFORMANCE:")
    for interconnector in interconnector_cols:
        avg_flow = df[interconnector].mean()
        max_flow = df[interconnector].max()
        min_flow = df[interconnector].min()

        direction = "Import" if avg_flow > 0 else "Export"
   print(f\"  {interconnector.replace('_FLOW', '')}: {avg_flow:+.0f} MW avg ({direction})\")\n        print(f\"    Range: {min_flow:.0f} to {max_flow:.0f} MW\")\n    \n    # Import dependency\n    avg_dependency = df['IMPORT_DEPENDENCY'].mean()\n    max_dependency = df['IMPORT_DEPENDENCY'].max()\n    print(f\"\\nüìä Average import dependency: {avg_dependency:.1f}% of demand\")\n    print(f\"üìä Peak import dependency: {max_dependency:.1f}% of demand\")\n\n# Run interconnector analysis\nif 'df_electricity' in locals():\n    analyze_interconnector_flows(df_electricity)\nelse:\n    print(\"‚ùå Please load electricity data first\")"
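
The interconnector columns hold power in MW for each half-hourly record, while the trade totals are reported as energy in MWh. Converting between the two means multiplying each reading by the length of its period. The following is a minimal sketch of that conversion, assuming every row covers exactly 30 minutes; `flows_to_energy_mwh`, `SETTLEMENT_PERIOD_HOURS`, and the toy column names and values are hypothetical and not part of the dataset.

```python
import pandas as pd

# Assumption: each record covers one 30-minute period
SETTLEMENT_PERIOD_HOURS = 0.5


def flows_to_energy_mwh(flows_mw: pd.DataFrame) -> pd.Series:
    """Convert per-period interconnector flows (MW) into energy totals (MWh).

    Positive values are treated as imports and negative values as exports.
    """
    imports_mw = flows_mw.clip(lower=0).sum(axis=1)
    exports_mw = flows_mw.clip(upper=0).abs().sum(axis=1)

    imports_mwh = imports_mw.sum() * SETTLEMENT_PERIOD_HOURS
    exports_mwh = exports_mw.sum() * SETTLEMENT_PERIOD_HOURS

    return pd.Series({
        "imports_mwh": imports_mwh,
        "exports_mwh": exports_mwh,
        "net_mwh": imports_mwh - exports_mwh,
    })


# Toy data with made-up values (four half-hourly records, two links)
toy_flows = pd.DataFrame({
    "A_FLOW": [1000, -500, 200, 0],
    "B_FLOW": [500, 500, -300, -100],
})

energy = flows_to_energy_mwh(toy_flows)
print(energy)
```

If records are missing or duplicated, a fixed 0.5 h multiplier will over- or under-count energy, so it is worth checking the timestamp spacing before relying on the totals.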

## 🎯 **Step 4: Final Summary & Insights**

### 4.1 📋 Comprehensive Electricity System Analysis

**Purpose**: Summarize the UK electricity system analysis in one place: demand characteristics, seasonal, daily, and weekly patterns, embedded renewable penetration, interconnector trade, and half-hourly ramping.

In [None]:
# ======================================
# Comprehensive UK Electricity System Analysis Summary
# ======================================

def generate_electricity_summary(df):
    """Generate comprehensive analysis summary"""

    print("⚡ UK ELECTRICITY SYSTEM - COMPREHENSIVE ANALYSIS SUMMARY")
    print("=" * 70)

    # Dataset overview
    total_records = len(df)
    date_range = f"{df['DATETIME'].min()} to {df['DATETIME'].max()}"
    years_covered = sorted(df['YEAR'].unique())

    print("📊 Dataset Overview:")
    print(f"   Total records: {total_records:,} half-hourly measurements")
    print(f"   Time coverage: {date_range}")
    print(f"   Years analyzed: {len(years_covered)} years ({min(years_covered)}-{max(years_covered)})")
    print(f"   Data completeness: {(1 - df.isnull().sum().sum() / (len(df) * len(df.columns))) * 100:.1f}%")
    print()

    # Demand characteristics
    demand_stats = df['ND'].agg(['min', 'max', 'mean', 'std'])
    peak_time = df.loc[df['ND'].idxmax(), 'DATETIME']
    min_time = df.loc[df['ND'].idxmin(), 'DATETIME']

    print("⚡ Demand Characteristics:")
    print(f"   Peak demand: {demand_stats['max']:,.0f} MW on {peak_time}")
    print(f"   Minimum demand: {demand_stats['min']:,.0f} MW on {min_time}")
    print(f"   Average demand: {demand_stats['mean']:,.0f} MW")
    print(f"   Demand volatility: {demand_stats['std'] / demand_stats['mean'] * 100:.1f}% (CV)")

    # Peak-to-trough ratio
    peak_ratio = demand_stats['max'] / demand_stats['min']
    print(f"   Peak-to-minimum ratio: {peak_ratio:.2f}x")
    print()

    # Seasonal patterns
    seasonal_stats = df.groupby('SEASON')['ND'].mean().sort_values(ascending=False)
    seasonal_variation = (seasonal_stats.max() - seasonal_stats.min()) / seasonal_stats.mean() * 100

    print("🌍 Seasonal Patterns:")
    for season, demand in seasonal_stats.items():
        print(f"   {season}: {demand:,.0f} MW average")
    print(f"   Seasonal variation: {seasonal_variation:.1f}%")
    print()

    # Daily patterns
    hourly_stats = df.groupby('HOUR')['ND'].mean()
    peak_hour = hourly_stats.idxmax()
    low_hour = hourly_stats.idxmin()
    daily_variation = (hourly_stats.max() - hourly_stats.min()) / hourly_stats.mean() * 100

    print("🕐 Daily Patterns:")
    print(f"   Peak hour: {peak_hour}:00 ({hourly_stats.max():,.0f} MW)")
    print(f"   Low hour: {low_hour}:00 ({hourly_stats.min():,.0f} MW)")
    print(f"   Daily variation: {daily_variation:.1f}%")
    print()

    # Renewable energy analysis (embedded wind and solar only)
    # Defaults keep the later insight and return code safe when the columns are missing
    avg_penetration = None
    max_penetration = None
    renewable_cols = ['EMBEDDED_WIND_GENERATION', 'EMBEDDED_SOLAR_GENERATION']
    if all(col in df.columns for col in renewable_cols):
        wind_stats = df['EMBEDDED_WIND_GENERATION'].agg(['mean', 'max'])
        solar_stats = df['EMBEDDED_SOLAR_GENERATION'].agg(['mean', 'max'])

        # Renewable penetration as a share of national demand
        total_renewables = df[renewable_cols].sum(axis=1)
        penetration = (total_renewables / df['ND'] * 100).clip(0, 100)

        avg_penetration = penetration.mean()
        max_penetration = penetration.max()
        max_pen_time = df.loc[penetration.idxmax(), 'DATETIME']

        print("🌱 Renewable Energy Integration:")
        print(f"   Wind generation: {wind_stats['mean']:,.0f} MW avg, {wind_stats['max']:,.0f} MW peak")
        print(f"   Solar generation: {solar_stats['mean']:,.0f} MW avg, {solar_stats['max']:,.0f} MW peak")
        print(f"   Average renewable penetration: {avg_penetration:.1f}% of demand")
        print(f"   Peak renewable penetration: {max_penetration:.1f}% on {max_pen_time}")
        print()

    # Interconnector analysis
    interconnector_cols = [col for col in df.columns if '_FLOW' in col]
    if interconnector_cols:
        imports_mw = df[interconnector_cols].clip(lower=0).sum(axis=1)
        exports_mw = df[interconnector_cols].clip(upper=0).abs().sum(axis=1)

        # Flows are in MW per half-hourly record, so energy in MWh = sum(MW) * 0.5 h
        # (assumes every row covers one 30-minute period)
        total_imports = imports_mw.sum() * 0.5
        total_exports = exports_mw.sum() * 0.5
        net_balance = total_imports - total_exports

        avg_net_imports = df[interconnector_cols].sum(axis=1).mean()
        max_imports = imports_mw.max()
        max_exports = exports_mw.max()

        print("🔌 International Electricity Trade:")
        print(f"   Net trade balance: {net_balance:,.0f} MWh ({'net importer' if net_balance > 0 else 'net exporter'})")
        print(f"   Average net imports: {avg_net_imports:,.0f} MW")
        print(f"   Peak import flow: {max_imports:,.0f} MW")
        print(f"   Peak export flow: {max_exports:,.0f} MW")

        # Import dependency
        import_dependency = (imports_mw / df['ND'] * 100).mean()
        print(f"   Average import dependency: {import_dependency:.1f}% of demand")
        print()

    # System flexibility metrics: change in demand between consecutive half-hourly records
    half_hourly_ramp = df['ND'].diff().abs()
    max_ramp = half_hourly_ramp.max()
    avg_ramp = half_hourly_ramp.mean()

    print("🔄 System Flexibility:")
    print(f"   Maximum half-hourly ramp: {max_ramp:,.0f} MW")
    print(f"   Average half-hourly ramp: {avg_ramp:,.0f} MW")
    print(f"   Maximum ramp relative to average demand: {max_ramp / demand_stats['mean'] * 100:.1f}%")
    print()

    # Weekly patterns
    weekday_avg = df[df['DAY_OF_WEEK'] < 5]['ND'].mean()  # Monday-Friday
    weekend_avg = df[df['DAY_OF_WEEK'] >= 5]['ND'].mean()  # Saturday-Sunday
    weekday_premium = (weekday_avg - weekend_avg) / weekend_avg * 100

    print("📅 Weekly Patterns:")
    print(f"   Weekday average: {weekday_avg:,.0f} MW")
    print(f"   Weekend average: {weekend_avg:,.0f} MW")
    print(f"   Weekday premium: {weekday_premium:+.1f}%")
    print()

    # Key insights summary
    print("🎯 KEY INSIGHTS:")
    print(f"   • UK electricity demand shows strong seasonal ({seasonal_variation:.1f}%) and daily ({daily_variation:.1f}%) patterns")
    print(f"   • System operates with {peak_ratio:.1f}x variation between peak and minimum demand")

    if avg_penetration is not None:
        print(f"   • Embedded wind and solar provide {avg_penetration:.1f}% of demand on average, with peaks up to {max_penetration:.1f}%")

    if interconnector_cols:
        trade_status = "net importer" if net_balance > 0 else "net exporter"
        print(f"   • UK is a {trade_status} with {import_dependency:.1f}% import dependency on average")

    print(f"   • System flexibility requires managing ramps up to {max_ramp:,.0f} MW per half-hour")
    print(f"   • Weekday demand differs from weekend demand by {abs(weekday_premium):.1f}%")
    print()

    print("✅ ANALYSIS COMPLETE - UK Electricity System comprehensively analyzed!")
    print("=" * 70)

    # Return summary statistics as plain Python numbers so they serialize to JSON
    return {
        'total_records': int(total_records),
        'peak_demand': float(demand_stats['max']),
        'avg_demand': float(demand_stats['mean']),
        'renewable_penetration': float(avg_penetration) if avg_penetration is not None else 0,
        'seasonal_variation': float(seasonal_variation),
        'daily_variation': float(daily_variation),
        'years_analyzed': len(years_covered)
    }

# Generate final summary
if 'df_electricity' in locals():
    summary_stats = generate_electricity_summary(df_electricity)

    # Save summary
    import json
    summary_path = Path("data/interim/electricity_analysis_summary.json")
    summary_path.parent.mkdir(parents=True, exist_ok=True)
    with open(summary_path, 'w') as f:
        json.dump(summary_stats, f, indent=2)
    print(f"💾 Analysis summary saved to {summary_path}")
else:
    print("❌ Please load electricity data first to generate summary")
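
Summary values produced by pandas aggregations are often NumPy scalars, and `json.dump` rejects some of them (for example `numpy.int64`). The following is a small sketch of one way to normalize a summary dictionary before saving it; the helper `to_builtin` and the sample values are hypothetical and only illustrate the idea.

```python
import json

import numpy as np


def to_builtin(value):
    """Return a plain Python scalar for NumPy scalars; pass other values through."""
    if isinstance(value, np.generic):
        return value.item()
    return value


# Made-up summary values mixing NumPy and built-in types
raw_summary = {
    "total_records": np.int64(10),
    "peak_demand": np.float64(50000.5),
    "years_analyzed": 25,
}

clean_summary = {key: to_builtin(val) for key, val in raw_summary.items()}
summary_text = json.dumps(clean_summary, indent=2)
print(summary_text)
```

Converting once, just before serialization, keeps the rest of the analysis free to work with NumPy types.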