# Refactored Mismatch Results Generator

**Date**: 2025-01-04  
**Refactored from**: `25_04_27_Mismatch_results_generator.ipynb`

This notebook provides a clean, high-level interface for running mismatch analysis using the `MismatchAnalysis` class.

## Overview

The mismatch analysis workflow:
1. **Data Loading**: Load and synchronize optimizer telemetry data
2. **Parameter Extraction**: Extract module parameters from .PAN files
3. **I-V Reconstruction**: Reconstruct I-V curves using single-diode models
4. **Mismatch Calculation**: Compare sum-of-MPP vs series-connection power
5. **Results Generation**: Create visualizations and export results

## Core Research Question
How do module-level mismatches and bypass diode activation affect overall system power output in real-world SolarEdge installations?

## Cell 1: Imports and Setup

In [None]:
# Import the refactored MismatchAnalysis class
from mismatch_analysis import MismatchAnalysis

# Import additional libraries for display and configuration
import matplotlib.pyplot as plt
import pandas as pd
import os
from datetime import datetime

# Display configuration
plt.rcParams['figure.figsize'] = (12, 8)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

print("MismatchAnalysis class imported successfully")
print(f"Analysis started at: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

## Cell 2: Configuration

Configure the analysis parameters. Modify these values according to your specific analysis requirements.

In [None]:
# ===================== CONFIGURATION PARAMETERS =====================

# Site selection - modify to analyze different sites
site_id = '4111846'  # Target site ID for analysis
season = 'spring'    # Season to analyze: 'spring', 'summer', 'autumn', 'winter'

# Paths - update these according to your system.
# Note: summary_dir holds the path to the summary workbook (a file), not a directory.
data_dir = r"C:\Users\z5183876\OneDrive - UNSW\Documents\GitHub\24_09_24_Solar_Edge\Data"
results_dir = r"C:\Users\z5183876\OneDrive - UNSW\Documents\GitHub\24_09_24_Solar_Edge\Results\25_06_24_Results"
summary_dir = r"C:\Users\z5183876\OneDrive - UNSW\Documents\GitHub\24_09_24_Solar_Edge\Data\25_05_01_Newsites_summary.xlsx"

# Analysis parameters
num_days_to_analyze = 10  # Number of days to process (for performance management)

# Physics and analysis settings
use_dynamic_vth = True   # Calculate thermal voltage from actual temperature
use_ambient_temp = True  # Use ambient temperature when module sensors appear faulty

# ===================== VALIDATION =====================

# Validate that the required directories and the summary file exist
required_paths = {
    'Data directory': data_dir,
    'Results directory': results_dir,
    'Summary file': summary_dir
}

print("=== Configuration Validation ===")
for name, path in required_paths.items():
    if os.path.exists(path):
        print(f"✅ {name}: {path}")
    else:
        print(f"❌ {name}: {path} - NOT FOUND")

print("\n=== Analysis Configuration ===")
print(f"Site ID: {site_id}")
print(f"Season: {season}")
print(f"Days to analyze: {num_days_to_analyze}")
print(f"Dynamic thermal voltage: {use_dynamic_vth}")
print(f"Use ambient temperature: {use_ambient_temp}")
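
The `use_dynamic_vth` option refers to computing the thermal voltage from the measured temperature rather than fixing it at a reference value. How `MismatchAnalysis` implements this is not shown in this notebook; the sketch below only illustrates the standard relation V_th = k·T/q, and the function name `thermal_voltage` is hypothetical.

```python
# Physical constants (exact SI values)
BOLTZMANN_J_PER_K = 1.380649e-23
ELEMENTARY_CHARGE_C = 1.602176634e-19


def thermal_voltage(temp_celsius: float) -> float:
    """Return the thermal voltage kT/q in volts for a temperature in degrees Celsius.

    Hypothetical helper for illustration; not part of MismatchAnalysis.
    """
    temp_kelvin = temp_celsius + 273.15
    return BOLTZMANN_J_PER_K * temp_kelvin / ELEMENTARY_CHARGE_C


# Compare a cold, a reference, and a hot module temperature
for temp_c in (0.0, 25.0, 60.0):
    print(f"{temp_c:5.1f} degC -> Vth = {thermal_voltage(temp_c) * 1000:.2f} mV")
```

At 25 °C this evaluates to roughly 25.7 mV per junction. A diode model typically multiplies it by the ideality factor and the number of cells in series, so a fixed value can misplace the reconstructed curve when modules run much hotter or colder than the reference.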

## Cell 3: Execute Analysis

Run the complete mismatch analysis workflow using the `MismatchAnalysis` class.

In [None]:
# ===================== INSTANTIATE ANALYSIS CLASS =====================

print("Initializing MismatchAnalysis...")
analysis = MismatchAnalysis(
    site_id=site_id,
    season=season,
    data_dir=data_dir,
    results_dir=results_dir,
    summary_dir=summary_dir
)

# Configure analysis parameters
analysis.num_days_to_plot = num_days_to_analyze
analysis.use_dynamic_vth = use_dynamic_vth
analysis.use_a_T = use_ambient_temp

print("✅ MismatchAnalysis initialized successfully")

# ===================== EXECUTE ANALYSIS WORKFLOW =====================

try:
    print("\n" + "="*60)
    print("STARTING MISMATCH ANALYSIS WORKFLOW")
    print("="*60)
    
    # Step 1: Load and prepare data
    print("\n📊 Step 1: Loading and preparing data...")
    analysis.load_and_prepare_data()
    print(f"✅ Data loaded: {len(analysis.reporter_ids)} optimizers found")
    
    # Step 2: Extract module parameters
    print("\n🔧 Step 2: Extracting module parameters...")
    analysis.extract_module_parameters()
    print(f"✅ Module parameters extracted: {analysis.module_params}")
    
    # Step 3: Run main analysis
    print("\n⚡ Step 3: Running I-V curve analysis...")
    print("This step may take several minutes depending on data size...")
    analysis.run_analysis()
    print("✅ I-V curve analysis completed")
    
    # Step 4: Generate summary plots
    print("\n📈 Step 4: Generating summary plots...")
    analysis.generate_plots()
    print("✅ Summary plots generated")
    
    # Step 5: Save results
    print("\n💾 Step 5: Saving results...")
    analysis.save_results()
    print("✅ Results saved successfully")
    
    # Step 6: Calculate final mismatch loss
    print("\n🎯 Step 6: Calculating mismatch loss...")
    mismatch_loss = analysis.calculate_mismatch_loss()
    print(f"✅ Mismatch loss calculated: {mismatch_loss:.2f}%")
    
    print("\n" + "="*60)
    print("ANALYSIS COMPLETED SUCCESSFULLY")
    print("="*60)
    
except Exception as e:
    print(f"\n❌ Error during analysis: {e}")
    print("\nPlease check the error message and verify:")
    print("1. Data directory contains the specified site ID")
    print("2. Season/month data exists for the site")
    print("3. Required .PAN file is present")
    print("4. File permissions allow read/write access")
    raise

## Cell 4: Results Summary and Visualization

Display key results and generate additional visualizations if needed.

In [None]:
# ===================== RESULTS SUMMARY =====================

print("=== MISMATCH ANALYSIS RESULTS SUMMARY ===")
print()

# Get comprehensive results summary
results_summary = analysis.get_results_summary()

print(f"📍 Site ID: {results_summary['site_id']}")
print(f"🗓️  Season: {results_summary['season']}")
print(f"⚡ Number of optimizers: {results_summary['num_optimizers']}")
print(f"📊 Number of timesteps analyzed: {results_summary['num_timesteps']}")
print(f"📅 Analysis period: {results_summary['analysis_period_days']} days")
print()
print(f"🎯 **MISMATCH LOSS: {results_summary['mismatch_loss_percent']:.2f}%**")
print()
# Unit check: an energy total should be in Wh; W is correct only if these are summed power samples
print(f"💡 Total Sum-of-MPP Energy: {results_summary['total_sum_mpp_energy']:.1f} W")
print(f"🔗 Total Series Energy: {results_summary['total_series_energy']:.1f} W")
print(f"📂 Results saved to: {results_summary['output_directory']}")

# ===================== DISPLAY SAMPLE DATA =====================

print("\n=== SAMPLE ANALYSIS DATA ===")

if not analysis.module_param_df.empty:
    print("\n📋 Module Parameters (first 10 rows):")
    print(analysis.module_param_df.head(10).round(4))

if not analysis.pmppt_data.empty and not analysis.iv_sum_data.empty:
    print("\n⚡ Power Data Sample (first 10 rows):")
    combined_sample = pd.concat(
        [analysis.iv_sum_data.head(10), analysis.pmppt_data['Pmppt (W)'].head(10)],
        axis=1,
    )
    sum_iv = combined_sample['Sum of I*V (W)']
    combined_sample['Mismatch (%)'] = (
        (sum_iv - combined_sample['Pmppt (W)']) / sum_iv * 100
    ).round(2)
    print(combined_sample)

# ===================== SHOW PLOTS =====================

print("\n=== GENERATED PLOTS ===")
print("The following plots have been generated and saved:")
print("1. 📈 Power comparison plot (pmppt_vs_sum_iv.png)")
print("2. 📊 Percentage difference plot (percentage_difference.png)")
print("3. 🎬 Animated GIF showing I-V evolution (combined_iv_curves_long.gif)")
print("4. 📸 Individual timestep plots (long_horizontal_*.png)")

print("\n💡 Use plt.show() to display plots in the notebook if needed")

# ===================== NEXT STEPS =====================

print("\n=== NEXT STEPS ===")
print("1. Review the generated plots and results")
print("2. Analyze the Excel files for detailed timestep data")
print("3. Use the analyzer notebook for cross-site statistical analysis")
print("4. Consider running the analysis for different seasons/sites")
print("\n✅ Analysis complete! Check the results directory for all outputs.")
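
The summary above reports a mismatch percentage alongside two totals. As a rough illustration of how such totals could be formed, the sketch below integrates equally spaced power samples into energy and contrasts the energy-weighted loss with a plain average of per-timestep percentages. The 5-minute interval, the sample values, and both function names are assumptions made for illustration; `MismatchAnalysis` may aggregate differently.

```python
def energy_wh(power_w, interval_minutes):
    """Integrate equally spaced power samples (W) into energy (Wh)."""
    return sum(power_w) * interval_minutes / 60.0


def aggregate_mismatch_percent(sum_mpp_w, series_w, interval_minutes):
    """Energy-weighted mismatch loss in percent over a whole period."""
    e_sum = energy_wh(sum_mpp_w, interval_minutes)
    e_series = energy_wh(series_w, interval_minutes)
    if e_sum <= 0:
        return 0.0  # no production, so no meaningful loss
    return (e_sum - e_series) / e_sum * 100.0


# Made-up samples: one low-light timestep with a large relative mismatch,
# followed by two high-power timesteps with a small one.
sum_mpp_w = [100.0, 2000.0, 3000.0]
series_w = [90.0, 1980.0, 2970.0]
interval_minutes = 5  # assumed telemetry interval

overall = aggregate_mismatch_percent(sum_mpp_w, series_w, interval_minutes)
per_step = [(s - p) / s * 100.0 for s, p in zip(sum_mpp_w, series_w)]
mean_of_steps = sum(per_step) / len(per_step)

print(f"Sum-of-MPP energy: {energy_wh(sum_mpp_w, interval_minutes):.1f} Wh")
print(f"Energy-weighted loss: {overall:.2f}%")
print(f"Mean of per-timestep losses: {mean_of_steps:.2f}%")
```

The two percentages differ because low-power timesteps carry little energy but count equally in a plain average, so it is worth checking which definition a reported loss figure uses.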

## Optional: Quick Data Exploration

Use this cell for additional data exploration or custom visualizations.

In [None]:
# ===================== OPTIONAL: ADDITIONAL ANALYSIS =====================

# Uncomment and modify these sections for additional exploration

# # Display data loading summary
# print("Data Loading Summary:")
# print(f"Merged data shape: {analysis.merged_data.shape}")
# print(f"Columns: {list(analysis.merged_data.columns)}")

# # Show module parameters in detail
# print("\nModule Parameters from .PAN file:")
# for param, value in analysis.module_params.items():
#     print(f"{param}: {value}")

# # Quick statistics on power data
# if not analysis.iv_sum_data.empty:
#     print("\nPower Statistics:")
#     print(analysis.iv_sum_data['Sum of I*V (W)'].describe())

# # Custom plotting (example)
# # fig, ax = plt.subplots(figsize=(10, 6))
# # ax.hist(analysis.module_param_df['FF'].dropna(), bins=20, alpha=0.7)
# # ax.set_xlabel('Fill Factor')
# # ax.set_ylabel('Frequency')
# # ax.set_title('Distribution of Fill Factors')
# # plt.show()

print("Optional analysis section - uncomment code blocks above for additional exploration")

## Notes and Documentation

### Key Features of the Refactored Implementation:

1. **Object-Oriented Design**: All analysis logic encapsulated in the `MismatchAnalysis` class
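2. **Error Handling**: Path validation before the run (Cell 2) and a `try`/`except` around the workflow (Cell 3) that prints troubleshooting hints and re-raises the error
3. **Logging**: Progress messages printed at the start and end of each workflow step
4. **Modularity**: Clear separation between data loading, analysis, and visualization
5. **Configurability**: All user-adjustable parameters are set in the Cell 2 configuration block, so the analysis code itself needs no editing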

### Methodology:

The analysis reconstructs I-V curves from MPP telemetry data using single-diode models, then compares:
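- **Sum-of-MPP**: Power with every module operating independently at its own maximum power point (Σ Vi × Ii over all modules)
- **Series-MPP**: Power with all modules constrained to one shared string current (maximum over I of Vseries(I) × I, where Vseries is the sum of the module voltages at that current)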

**Mismatch Loss** = (Sum-of-MPP - Series-MPP) / Sum-of-MPP × 100%
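
The comparison above can be sketched with a simplified single-diode model. This is not the `MismatchAnalysis` implementation: the shunt resistance is neglected so that voltage can be written explicitly as a function of current, each module is given a single idealized bypass diode, and every parameter value and function name below is hypothetical.

```python
import math


def module_voltage(current, iph, i0=1e-9, rs=0.3, n_ns_vth=1.85, bypass_drop=0.5):
    """Module voltage at a given string current (single-diode model, no shunt).

    Solves I = Iph - I0 * (exp((V + I*Rs) / (n*Ns*Vth)) - 1) for V.
    If the module cannot carry the current, the bypass diode conducts and
    clamps the module to -bypass_drop volts (idealized assumption).
    """
    log_arg = (iph - current) / i0 + 1.0
    if log_arg <= 0.0:
        return -bypass_drop
    voltage = n_ns_vth * math.log(log_arg) - current * rs
    return max(voltage, -bypass_drop)


def max_power(voltage_at, i_max, steps=4000):
    """Scan currents from 0 to i_max and return the largest V(I) * I."""
    best = 0.0
    for k in range(steps + 1):
        current = i_max * k / steps
        best = max(best, voltage_at(current) * current)
    return best


def mismatch_loss_percent(iph_values):
    """Mismatch loss (%) for modules that differ only in photocurrent."""
    i_max = max(iph_values)
    # Independent operation: each module sits at its own MPP
    sum_of_mpp = sum(
        max_power(lambda i, iph=iph: module_voltage(i, iph), i_max)
        for iph in iph_values
    )
    # Series operation: one shared current, module voltages add up
    series_mpp = max_power(
        lambda i: sum(module_voltage(i, iph) for iph in iph_values), i_max
    )
    return (sum_of_mpp - series_mpp) / sum_of_mpp * 100.0


matched_loss = mismatch_loss_percent([9.0, 9.0, 9.0, 9.0])
shaded_loss = mismatch_loss_percent([9.0, 9.0, 8.5, 5.0])  # one heavily shaded module

print(f"Matched string loss: {matched_loss:.2f}%")
print(f"Shaded string loss:  {shaded_loss:.2f}%")
```

With identical modules the two powers coincide and the loss is essentially zero. With one shaded module the series string either runs at the shaded module's low current or bypasses it entirely, and in both cases it delivers less than the sum of the individual maxima.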

### Files Generated:

- `combined_data_{season}_{site_id}.xlsx`: Main results with mismatch calculations
- `module_param_df.csv`: Individual module parameters per timestep
- `pmppt_vs_sum_iv.png`: Power comparison visualization
- `percentage_difference.png`: Mismatch percentage over time
- `combined_iv_curves_long.gif`: Animated I-V curve evolution
- `long_horizontal_*.png`: Individual timestep plots

### Research Applications:

This analysis enables investigation of:
- Module-level mismatch impacts on system performance
- Bypass diode activation frequency and effects
- Seasonal and climatic variations in mismatch losses
- System configuration impacts (shading, orientation, sizing)