## 02-Naive Reproduction: Statistical Tests and ML Models

In this notebook I naively reproduce the following:
1. All the statistical tests mentioned in the original paper:
    - Mann-Whitney U-tests, Chi-square tests,
    - Pearson correlation, and
    - Spearman correlation tests.
2. An explanation of potential discrepancies
3. Notes on details of the statistics that are missing from the paper
4. Reproductions of Tables 2, 3, and 4
5. All the ML feature importance and classification tests:
    - 7 feature importance methods (Decision Tree, Random Forest, XGBoost, Permutation RF, Permutation CART, Permutation KNN, and Permutation XGBoost)
    - Borda Count feature voting method
    - SHAP values for feature contribution
    - Classifier performance with confusion matrix and performance graph
6. A comprehensive explanation of potential discrepancies
7. Notes on details of the ML methods that are missing from the paper
8. Reproductions of Tables 5 to 8 and Figures 2 to 4

#### 1.1 Mann-Whitney U-tests and Chi-square Tests

In [1]:
import pandas as pd
import numpy as np
from scipy.stats import mannwhitneyu, chi2_contingency

In [2]:
# load data
df = pd.read_csv("../raw/heart_dataset.csv")

In [3]:
# list the feature names: every column except the target
colnames = df.columns.drop('target').tolist()
features = colnames

print(f"Feature Names\t: {features}\nFeature Count\t: {len(features)}")

Feature Names	: ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']
Feature Count	: 13


In [4]:
# define function for the tests

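def MWU_Chi(df, features):
    """
    Run a Mann-Whitney U-test and a chi-square test for every feature against `target`.

    Mann-Whitney U-test: a non-parametric test that compares two independent groups
    and does not assume normally distributed data.
    Chi-square test: a non-parametric test of association for categorical data.
    Both tests are applied to every feature, whether it is continuous or categorical.
    """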
    
    results = []
    
    # Run both the Mann-Whitney U and chi-square tests for each feature
    for feature in features:
        group0 = df[df['target'] == 0][feature] # No heart disease
        group1 = df[df['target'] == 1][feature] # Heart disease
        
        # Mann-whitney U-test
        U_stat, p_mann = mannwhitneyu(group0, group1, alternative='two-sided')
        
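        # Chi-square test on the feature-by-target contingency table
        contingency_tab = pd.crosstab(df[feature], df['target'])
        chi_stat, p_chi, freedom, expected = chi2_contingency(contingency_tab)
        # Rule of thumb: True when more than 80% of cells have an expected frequency >= 5
        expected_freq_validation = (expected >= 5).mean() > 0.8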
        
        # Store results
        results.append({
            'Feature' : feature,
            'Mann_Whitney_U' : U_stat,
            'Mann_Whitney_P' : p_mann,
            'Chi_Square_Stats' : chi_stat,
            'Chi_Square_p' : p_chi,
            'Chi_Square_Degree_of_Freedom' : freedom,
            'Expected_Freq_Validation' : expected_freq_validation,
            'n_no_attack' : len(group0),
            'n_attack' : len(group1)
        })
        
    results_df = pd.DataFrame(results)
    
    # formatting results as per original paper
    results_df['Mann_Whitney_p_display'] = results_df['Mann_Whitney_P'].apply(
        lambda x: '<0.001' if x < 0.001 else f'{x:.3f}'
    )
    results_df['Chi_p_display'] = results_df['Chi_Square_p'].apply(
        lambda x: '<0.001' if x < 0.001 else f'{x:.3f}'
    )
    
    return results_df
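
The `Expected_Freq_Validation` flag computed inside `MWU_Chi` can be illustrated on toy data. The following is a minimal sketch, assuming two made-up contingency tables (the counts and the helper name `share_of_adequate_cells` are hypothetical and not taken from the heart dataset). It shows how a table with many thinly populated rows falls below the 80% threshold, which is the situation a many-valued feature can produce.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency tables (made-up counts, NOT from the heart dataset).
dense_table = np.array([[30, 20],
                        [25, 35]])
sparse_table = np.array([[1, 2],
                         [2, 1],
                         [3, 1],
                         [40, 50]])

def share_of_adequate_cells(table):
    """Return the fraction of cells whose expected frequency is at least 5."""
    _, _, _, expected = chi2_contingency(table)
    return (expected >= 5).mean()

dense_share = share_of_adequate_cells(dense_table)
sparse_share = share_of_adequate_cells(sparse_table)

# Same threshold as in MWU_Chi: more than 80% of cells must pass.
print(f"dense : {dense_share:.2f} -> valid = {dense_share > 0.8}")
print(f"sparse: {sparse_share:.2f} -> valid = {sparse_share > 0.8}")
```

When the flag is `False`, the chi-square p-value rests on sparse cells and is best read with caution.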

##### Table 2 Reproduction

In [5]:
tab2_rep = MWU_Chi(df=df, features=features)
tab2_rep

| | Feature | Mann_Whitney_U | Mann_Whitney_P | Chi_Square_Stats | Chi_Square_p | Chi_Square_Degree_of_Freedom | Expected_Freq_Validation | n_no_attack | n_attack | Mann_Whitney_p_display | Chi_p_display |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | age | 167644.0 | 1.467631e-14 | 178.766033 | 1.872323e-19 | 40 | True | 499 | 526 | <0.001 | <0.001 |
| 1 | sex | 165006.0 | 3.757004e-19 | 78.863051 | 6.656821e-19 | 1 | True | 499 | 526 | <0.001 | <0.001 |
| 2 | cp | 65882.5 | 4.683094e-50 | 280.982249 | 1.298066e-60 | 3 | True | 499 | 526 | <0.001 | <0.001 |
| 3 | trestbps | 148623.0 | 0.0002330572 | 156.765973 | 1.824154e-13 | 48 | False | 499 | 526 | <0.001 | <0.001 |
| 4 | chol | 151386.5 | 2.104398e-05 | 597.138624 | 2.463368e-54 | 151 | False | 499 | 526 | <0.001 | <0.001 |
| 5 | fbs | 135088.5 | 0.1878177 | 1.513379 | 0.2186241 | 1 | True | 499 | 526 | 0.188 | 0.219 |
| 6 | restecg | 111749.0 | 2.396495e-06 | 35.784315 | 1.696425e-08 | 2 | True | 499 | 526 | <0.001 | <0.001 |
| 7 | thalach | 66089.0 | 4.785575e-43 | 368.270625 | 2.455372e-35 | 90 | False | 499 | 526 | <0.001 | <0.001 |
| 8 | exang | 185584.5 | 1.230653e-44 | 194.815539 | 2.826637e-44 | 1 | True | 499 | 526 | <0.001 | <0.001 |
| 9 | oldpeak | 196450.5 | 1.446973e-44 | 311.012325 | 4.203278e-44 | 39 | False | 499 | 526 | <0.001 | <0.001 |


In [6]:
table2_reproduced = tab2_rep[['Feature', 'Mann_Whitney_p_display', 'Chi_p_display']]

# Save to CSV for comparison
table2_reproduced.to_csv('../contents/tables/table2_reproduced.csv', index=False)

table2_reproduced

| | Feature | Mann_Whitney_p_display | Chi_p_display |
|---|---|---|---|
| 0 | age | <0.001 | <0.001 |
| 1 | sex | <0.001 | <0.001 |
| 2 | cp | <0.001 | <0.001 |
| 3 | trestbps | <0.001 | <0.001 |
| 4 | chol | <0.001 | <0.001 |
| 5 | fbs | 0.188 | 0.219 |
| 6 | restecg | <0.001 | <0.001 |
| 7 | thalach | <0.001 | <0.001 |
| 8 | exang | <0.001 | <0.001 |
| 9 | oldpeak | <0.001 | <0.001 |


Table 2 as given in the original paper:
<img src="../contents/tables/originals/table2.jpg" width=50%>

### Test Results: Mann-Whitney U and Chi-Square

**Overall Result:** Highly reproducible. Table 2 was reproduced with **25/26 exact matches** (13 features × 2 tests = 26 p-values).
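
A tally like the one above can be computed cell by cell rather than by eye. This is a minimal sketch on a three-feature subset, assuming the paper's Table 2 has been typed in by hand as a DataFrame; the `paper` frame and the variable names are hypothetical, and the values are the display strings discussed in this notebook.

```python
import pandas as pd

# Illustrative subset only: three features with the display values from this notebook.
reproduced = pd.DataFrame({
    'Feature': ['age', 'sex', 'fbs'],
    'Mann_Whitney_p_display': ['<0.001', '<0.001', '0.188'],
    'Chi_p_display': ['<0.001', '<0.001', '0.219'],
}).set_index('Feature')

# Hypothetical hand-typed copy of the paper's Table 2 for the same features.
paper = pd.DataFrame({
    'Feature': ['age', 'sex', 'fbs'],
    'Mann_Whitney_p_display': ['<0.001', '<0.001', '0.188'],
    'Chi_p_display': ['<0.001', '<0.001', '0.188'],
}).set_index('Feature')

# Element-wise comparison, aligned on feature name and column name.
matches = reproduced.eq(paper)
n_matches = int(matches.to_numpy().sum())
n_total = matches.size

# Collect the (feature, column) pairs that disagree.
mismatched_cells = [
    (feature, column)
    for feature in matches.index
    for column in matches.columns
    if not matches.loc[feature, column]
]

print(f"{n_matches}/{n_total} exact matches")
print("Mismatches:", mismatched_cells)
```

Comparing the formatted display strings, not the raw floats, mirrors how the paper reports its p-values.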

##### Discrepancy Found
For the feature `fbs` (Fasting Blood Sugar), the reproduction matches the paper's Mann-Whitney U-test p-value exactly (**0.188**), but the Chi-square test gives a p-value of **0.219** where the paper reports 0.188.

**Possible Reason:** Minor difference in `fbs` distribution due to data preprocessing.

**Impact:** `fbs` is not significantly associated with heart disease under either p-value (0.188 or 0.219), so this discrepancy does not affect the paper's key findings.

**Conclusion:** `fbs` shows no statistically significant association with heart disease in either test. All 12 remaining features showed **p < 0.001** in both tests, exactly matching the paper's reported values.
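
One setting worth checking for the `fbs` chi-square gap is the continuity correction. `scipy.stats.chi2_contingency` applies Yates' correction by default (`correction=True`) when the table has one degree of freedom, which is the case for a binary feature crossed with a binary target. Whether the paper applied this correction is not known here, so this is a hypothesis to test, not a confirmed explanation. A minimal sketch with made-up counts (not the heart dataset) shows the direction of the effect:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table (made-up counts, NOT the fbs-by-target table).
table = np.array([[40, 10],
                  [30, 20]])

# Default behaviour: Yates' continuity correction is applied when dof == 1.
chi_corr, p_corr, dof, _ = chi2_contingency(table, correction=True)

# Same table without the correction.
chi_raw, p_raw, _, _ = chi2_contingency(table, correction=False)

print(f"dof = {dof}")
print(f"with correction   : chi2 = {chi_corr:.3f}, p = {p_corr:.3f}")
print(f"without correction: chi2 = {chi_raw:.3f}, p = {p_raw:.3f}")
```

The correction shrinks the statistic and raises the p-value, so rerunning the `fbs` test with `correction=False` would show whether this accounts for the difference.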