## Statistical Analysis of Metrics
#### Loading the Metrics Data
The first step is to load the performance metrics from a CSV file. `load_metrics` reads the file into a pandas DataFrame and returns it unchanged.

In [58]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import wilcoxon
from IPython.display import display  # explicit import so display() also works outside a notebook

# Load the metrics from the CSV file
def load_metrics(file_path):
    metrics_df = pd.read_csv(file_path)
    return metrics_df
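
The later cells index the DataFrame by a `Model` column and by metric columns such as `MAE (Total)` and `RMSE (Total)`, so a missing column only surfaces as a `KeyError` further down. The following is a minimal sketch of a loader that checks for those columns up front. The name `load_metrics_checked`, the `REQUIRED_COLUMNS` list, and the in-memory sample CSV are assumptions made for illustration and are not part of the original notebook.

```python
import io

import pandas as pd

# Columns the later cells rely on (assumed list; extend as needed)
REQUIRED_COLUMNS = ["Model", "MAE (Total)", "RMSE (Total)"]


def load_metrics_checked(source, required=REQUIRED_COLUMNS):
    """Read a metrics CSV and fail early if an expected column is missing."""
    metrics_df = pd.read_csv(source)
    missing = [col for col in required if col not in metrics_df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return metrics_df


# Hypothetical in-memory CSV, for illustration only
sample_csv = io.StringIO(
    "Model,MAE (Total),RMSE (Total)\n"
    "A,1.0,2.0\n"
    "B,3.0,4.0\n"
)
checked_df = load_metrics_checked(sample_csv)
print(checked_df.shape)
```

`pd.read_csv` accepts either a path or a file-like object, so the same function can be called with `'model_performance_metrics.csv'`.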


#### Descriptive Statistics
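To get an overview of the performance metrics, we compute descriptive statistics with `DataFrame.describe()`, which summarizes the numeric columns (count, mean, standard deviation, minimum, quartiles, and maximum). The result is transposed so that each metric occupies one row. The function returns the table rather than printing it.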

In [59]:
# Calculate and return descriptive statistics (one row per metric)
def descriptive_statistics(metrics_df):
    desc_stats = metrics_df.describe().transpose()
    return desc_stats
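
In the output further down, the `MASE (Total)` and `WMAPE (Total)` rows have a mean of `inf` and no standard deviation, which indicates that at least one value in those columns is infinite. One way to keep the summary readable is to treat infinite values as missing before calling `describe()` and to report how many were masked. The sketch below assumes that this is acceptable for the analysis. The function name `descriptive_statistics_finite`, the `n_inf` column, and the demo data are hypothetical.

```python
import numpy as np
import pandas as pd


def descriptive_statistics_finite(metrics_df):
    """Describe numeric columns after treating +/-inf as missing values."""
    numeric = metrics_df.select_dtypes(include="number")
    finite = numeric.replace([np.inf, -np.inf], np.nan)
    desc_stats = finite.describe().transpose()
    # Record how many infinite values were masked in each metric
    desc_stats["n_inf"] = np.isinf(numeric).sum()
    return desc_stats


# Hypothetical data, for illustration only
demo = pd.DataFrame({
    "Model": ["A", "B", "C", "D"],
    "MASE (Total)": [0.2, 0.6, 3.0, np.inf],
})
demo_stats = descriptive_statistics_finite(demo)
print(demo_stats)
```

With this variant, `count` reflects only the finite values, so a count lower than the number of rows is a signal to look at `n_inf`.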


#### Non-Parametric Tests for Statistical Analysis
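To compare the models statistically, we use the Wilcoxon signed-rank test, a non-parametric test for paired samples. The function below runs the test on every pair of models for a given metric. Because the test is paired, each model must contribute the same number of observations in the same order (for example, one error value per fold or forecast window), and it needs several such observations per model to be informative.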

In [60]:
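# Pairwise Wilcoxon signed-rank tests between models for one metric
def non_parametric_tests(metrics_df, metric):
    models = metrics_df['Model'].unique()
    results = []

    # Compare every unordered pair of models exactly once
    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            model1 = models[i]
            model2 = models[j]
            # The test is paired: both samples must have the same length and
            # be in the same order (e.g. the same folds or forecast windows)
            data1 = metrics_df[metrics_df['Model'] == model1][metric].to_numpy()
            data2 = metrics_df[metrics_df['Model'] == model2][metric].to_numpy()
            stat, p_value = wilcoxon(data1, data2)
            results.append((model1, model2, stat, p_value))

    # 'T-Statistic' holds the Wilcoxon statistic, not a t-test statistic
    results_df = pd.DataFrame(results, columns=['Model1', 'Model2', 'T-Statistic', 'P-Value'])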
    return results_df
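
The function above pairs observations purely by row order within each model. If the metrics file held one row per model and evaluation fold, the pairing could be made explicit by pivoting on the fold identifier first. The following is a sketch under that assumption: the `Fold` column, the function name `paired_wilcoxon`, and the demo values are hypothetical and do not appear in the original data.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import wilcoxon


def paired_wilcoxon(metrics_df, metric, pair_col="Fold"):
    """Pairwise Wilcoxon tests with observations aligned on pair_col."""
    # One row per fold, one column per model, so samples are aligned by fold
    wide = metrics_df.pivot(index=pair_col, columns="Model", values=metric).dropna()
    rows = []
    for model1, model2 in combinations(wide.columns, 2):
        stat, p_value = wilcoxon(wide[model1], wide[model2])
        rows.append((model1, model2, len(wide), stat, p_value))
    return pd.DataFrame(rows, columns=["Model1", "Model2", "N", "Statistic", "P-Value"])


# Hypothetical per-fold errors, for illustration only
demo = pd.DataFrame({
    "Model": ["A"] * 6 + ["B"] * 6,
    "Fold": list(range(6)) * 2,
    "MAE (Total)": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0,
                    1.5, 2.7, 3.9, 5.1, 6.3, 7.5],
})
demo_results = paired_wilcoxon(demo, "MAE (Total)")
print(demo_results)
```

Reporting `N` next to each p-value makes it easy to spot comparisons that rest on too few pairs to be meaningful.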


#### Main Function
The main function loads the metrics, computes the descriptive statistics, runs the pairwise Wilcoxon signed-rank tests for `MAE (Total)` and `RMSE (Total)`, and displays the resulting tables.

In [61]:
# Main function
def main(file_path):
    metrics_df = load_metrics(file_path)
    desc_stats = descriptive_statistics(metrics_df)
    
    # Perform non-parametric tests
    mae_results = non_parametric_tests(metrics_df, 'MAE (Total)')
    rmse_results = non_parametric_tests(metrics_df, 'RMSE (Total)')
    
    # Show the results as DataFrames; display() renders them as tables in Jupyter
    print("Descriptive Statistics:")
    display(desc_stats)
    print("\nWilcoxon Signed-Rank Test Results for MAE (Total):")
    display(mae_results)
    print("\nWilcoxon Signed-Rank Test Results for RMSE (Total):")
    display(rmse_results)




In [62]:
# Run main function
file_path = 'model_performance_metrics.csv'
main(file_path)

Descriptive Statistics:


| Metric | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| MAE (Total) | 4.0 | 47943.39 | 95870.965147 | 0.138003 | 4.431373 | 11.79131 | 47950.75 | 191749.8 |
| RMSE (Total) | 4.0 | 48108.0 | 96041.707013 | 0.484652 | 47.480667 | 130.503636 | 48191.02 | 192170.5 |
| MASE (Total) | 4.0 | inf | NaN | 0.207968 | 0.52345 | 3401.415642 | inf | inf |
| WMAPE (Total) | 4.0 | inf | NaN | 0.261855 | 0.65908 | 857.673456 | inf | inf |
| MAE (Seasonal) | 4.0 | 50152.15 | 79767.789015 | 9156.349258 | 9158.674188 | 10835.959729 | 51829.43 | 169780.3 |
| RMSE (Seasonal) | 4.0 | 53038.0 | 78161.850747 | 12458.843139 | 12459.740409 | 14728.131351 | 55306.39 | 170236.9 |



Wilcoxon Signed-Rank Test Results for MAE (Total):


| | Model1 | Model2 | T-Statistic | P-Value |
|---|---|---|---|---|
| 0 | ARIMA | SARIMA | 0.0 | 1.0 |
| 1 | ARIMA | XGBoost | 0.0 | 1.0 |
| 2 | ARIMA | LSTM | 0.0 | 1.0 |
| 3 | SARIMA | XGBoost | 0.0 | 1.0 |
| 4 | SARIMA | LSTM | 0.0 | 1.0 |
| 5 | XGBoost | LSTM | 0.0 | 1.0 |



Wilcoxon Signed-Rank Test Results for RMSE (Total):


| | Model1 | Model2 | T-Statistic | P-Value |
|---|---|---|---|---|
| 0 | ARIMA | SARIMA | 0.0 | 1.0 |
| 1 | ARIMA | XGBoost | 0.0 | 1.0 |
| 2 | ARIMA | LSTM | 0.0 | 1.0 |
| 3 | SARIMA | XGBoost | 0.0 | 1.0 |
| 4 | SARIMA | LSTM | 0.0 | 1.0 |
| 5 | XGBoost | LSTM | 0.0 | 1.0 |

Every comparison returns a statistic of 0.0 and a p-value of 1.0 for both metrics. This reflects the shape of the input rather than the models: the descriptive statistics report a count of 4 across four models, so each model contributes a single value per metric, and a signed-rank test on one pair of values has no power to detect a difference. These p-values should not be read as evidence that the models perform equally.
