# 02-Naive Reproduction: Statistical Tests

In this notebook I naively reproduce the statistical analysis from the original paper:
1. Run all the statistical tests mentioned in the paper:
    - Mann-Whitney U-tests and Chi-square tests,
    - Pearson correlation tests, and
    - Spearman correlation tests.
2. Explain any potential discrepancies.
3. Note statistical details that are missing from the paper.
4. Reproduce Tables 2, 3, and 4.

### 1.1 Mann-Whitney U-tests and Chi-square Tests

In [1]:
import pandas as pd
import numpy as np
from scipy.stats import mannwhitneyu, chi2_contingency

In [2]:
# load data
df = pd.read_csv("../raw/heart_dataset.csv")

In [3]:
# list the feature names: every column except the target
features = [column_name for column_name in df.columns if column_name != 'target']

print(f"Feature Names\t: {features}\nFeature Count\t: {len(features)}")

Feature Names	: ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']
Feature Count	: 13


In [4]:
# define function for the tests

def MWU_Chi(df, features):
    """
    Mann-Whitney U-test: a non-parametric test that compares two independent groups
    without assuming normally distributed data.
    Chi-square test: a non-parametric test of association between categorical variables.
    """
    
    results = []
    
    # Run both the Mann-Whitney U and Chi-square tests for each feature
    for feature in features:
        group0 = df[df['target'] == 0][feature] # No heart disease
        group1 = df[df['target'] == 1][feature] # Heart disease
        
        # Mann-whitney U-test
        U_stat, p_mann = mannwhitneyu(group0, group1, alternative='two-sided')
        
        # Chi-square test
        contingency_tab = pd.crosstab(df[feature], df['target'])
        chi_stat, p_chi, freedom, expected = chi2_contingency(contingency_tab)
        expected_freq_validation = (expected >= 5).sum() / expected.size > 0.8
        
        # Store results
        results.append({
            'Feature' : feature,
            'Mann_Whitney_U' : U_stat,
            'Mann_Whitney_P' : p_mann,
            'Chi_Square_Stats' : chi_stat,
            'Chi_Square_p' : p_chi,
            'Chi_Square_Degree_of_Freedom' : freedom,
            'Expected_Freq_Validation' : expected_freq_validation,
            'n_no_attack' : len(group0),
            'n_attack' : len(group1)
        })
        
    results_df = pd.DataFrame(results)
    
    # formatting results as per original paper
    results_df['Mann_Whitney_p_display'] = results_df['Mann_Whitney_P'].apply(
        lambda x: '<0.001' if x < 0.001 else f'{x:.3f}'
    )
    results_df['Chi_p_display'] = results_df['Chi_Square_p'].apply(
        lambda x: '<0.001' if x < 0.001 else f'{x:.3f}'
    )
    
    return results_df
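
The `Expected_Freq_Validation` flag checks whether more than 80% of the expected cell counts are at least 5, a common rule of thumb for the Chi-square test. Features with many distinct values tend to fail it, because every distinct value becomes its own row in the contingency table. The sketch below uses synthetic data to show how the share of adequate cells changes once a continuous variable is binned. The column `chol_like`, the helper `share_of_adequate_cells`, and the quartile binning with `pd.qcut` are illustrative assumptions, not steps taken from the paper.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Synthetic stand-in for a continuous feature and a binary target (not the real data)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "chol_like": rng.normal(245, 50, size=400).round(),
    "target": rng.integers(0, 2, size=400),
})

def share_of_adequate_cells(feature_values, target):
    """Return (degrees of freedom, share of expected counts that are >= 5)."""
    table = pd.crosstab(feature_values, target)
    _, _, dof, expected = chi2_contingency(table)
    return dof, (expected >= 5).mean()

# Raw values: one contingency row per distinct value, so most cells are sparse
raw_dof, raw_share = share_of_adequate_cells(demo["chol_like"], demo["target"])

# Quartile bins: four rows, each holding roughly a quarter of the samples
binned = pd.qcut(demo["chol_like"], q=4, duplicates="drop")
bin_dof, bin_share = share_of_adequate_cells(binned, demo["target"])

print(f"raw    : dof = {raw_dof}, share of expected counts >= 5 = {raw_share:.2f}")
print(f"binned : dof = {bin_dof}, share of expected counts >= 5 = {bin_share:.2f}")
```

Binning changes the test being run, so it would no longer be a like-for-like reproduction of the paper's table; it is shown here only to explain why some features are flagged `False`.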

##### Table 2 Reproduction: Mann Whitney U test and Chi squared test outcomes.

In [5]:
tab2_rep = MWU_Chi(df=df, features=features)
tab2_rep

,Feature,Mann_Whitney_U,Mann_Whitney_P,Chi_Square_Stats,Chi_Square_p,Chi_Square_Degree_of_Freedom,Expected_Freq_Validation,n_no_attack,n_attack,Mann_Whitney_p_display,Chi_p_display
0,age,167644.0,1.467631e-14,178.766033,1.872323e-19,40,True,499,526,<0.001,<0.001
1,sex,165006.0,3.7570039999999997e-19,78.863051,6.656820999999999e-19,1,True,499,526,<0.001,<0.001
2,cp,65882.5,4.683094e-50,280.982249,1.298066e-60,3,True,499,526,<0.001,<0.001
3,trestbps,148623.0,0.0002330572,156.765973,1.824154e-13,48,False,499,526,<0.001,<0.001
4,chol,151386.5,2.104398e-05,597.138624,2.463368e-54,151,False,499,526,<0.001,<0.001
5,fbs,135088.5,0.1878177,1.513379,0.2186241,1,True,499,526,0.188,0.219
6,restecg,111749.0,2.396495e-06,35.784315,1.696425e-08,2,True,499,526,<0.001,<0.001
7,thalach,66089.0,4.785575e-43,368.270625,2.455372e-35,90,False,499,526,<0.001,<0.001
8,exang,185584.5,1.230653e-44,194.815539,2.826637e-44,1,True,499,526,<0.001,<0.001
9,oldpeak,196450.5,1.4469729999999998e-44,311.012325,4.203278e-44,39,False,499,526,<0.001,<0.001


Table 2 as given in the original paper:
<img src="../contents/tables/originals/table2.jpg" width=50%>

#### Test Results: Mann-Whitney U and Chi-Square

**Overall Result:** Highly reproducible. Table 2 was reproduced with **25/26 exact matches** (13 features × 2 tests).

##### Discrepancy Found
For the feature `fbs` (fasting blood sugar), the reproduced Mann-Whitney U-test p-value of **0.188** matches the paper exactly, whereas the reproduced Chi-square p-value is **0.219** instead of the paper's 0.188.

**Possible Reason:** Minor difference in the `fbs` distribution due to data preprocessing.

**Impact:** `fbs` is not significantly associated with heart disease in either version, so this discrepancy does not affect the paper's key findings.

**Conclusion:** `fbs` shows no statistically significant association with heart disease. The 12 remaining features all showed **p < 0.001** in both tests, exactly matching the paper's reported values.
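
Another candidate explanation for the `fbs` gap is worth checking: `scipy.stats.chi2_contingency` applies Yates' continuity correction by default (`correction=True`) whenever the table has one degree of freedom, which is the case for `fbs`. The sketch below compares the corrected and uncorrected tests on a 2×2 table. The counts are illustrative assumptions, chosen only to be consistent with the group sizes (499 and 526) reported above; they are not read from the dataset, and the paper does not state whether a correction was used.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative 2x2 table (rows: fbs = 0 / 1, columns: target = 0 / 1).
# These counts are an assumption for demonstration, not values taken from the data.
table = np.array([[417, 455],
                  [82, 71]])

# SciPy default: Yates' continuity correction is applied when dof == 1
chi_corr, p_corr, dof, _ = chi2_contingency(table)

# Same test with the continuity correction switched off
chi_raw, p_raw, _, _ = chi2_contingency(table, correction=False)

print(f"dof = {dof}")
print(f"with correction    : chi2 = {chi_corr:.3f}, p = {p_corr:.3f}")
print(f"without correction : chi2 = {chi_raw:.3f}, p = {p_raw:.3f}")
```

If the uncorrected p-value on the real `pd.crosstab(df['fbs'], df['target'])` table lands near 0.188, the discrepancy would come from the correction setting rather than from preprocessing.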

### 1.2 Pearson Correlation and Spearman Correlation Tests

In [7]:
# import Pearson and Spearman correlation tests
from scipy.stats import pearsonr, spearmanr

In [8]:
# define function for pearson correlation test
def pearson_corr(df, features):
    
    """
    Pearson Correlation: Measures the linear relationship between two continuous variables;
    its significance test assumes normally distributed data. It is sensitive to extreme
    outliers and captures only linear relationships.
    """
    
    pearson_results = []
    
    for feature in features:
        data = df[[feature, 'target']]
        X = data[feature]
        y = data['target']
        
        r_pearson, p_pearson = pearsonr(X, y)

        # Classifying correlation strength like original paper
        def classify_pearsonr(r):
            if abs(r) < 0.3:
                return "No Correlation"
            elif r > 0:
                return "+ Moderate"
            else:
                return "- Moderate"

        pearson_class = classify_pearsonr(r_pearson)

        # Store results
        pearson_results.append({
            'Feature' : feature,
            'r' : r_pearson,
            'p_val' : p_pearson,
            'Classification' : pearson_class,
            'n' : len(X)
        })
    
    pearson_df = pd.DataFrame(pearson_results)
    pearson_df = pearson_df.sort_values('r', ascending=False).reset_index(drop=True)
    
    return pearson_df

In [9]:
# define function for spearman correlation test
def spearman_corr(df, features):
    
    """
    Spearman Correlation: Measures the monotonic relationship between two variables (they
    consistently move in the same or in opposite directions) using ranks instead of raw values.
    It works better when the data are non-normally distributed or the relationship is non-linear.
    """
    
    spearman_results = []
    
    for feature in features:
        data = df[[feature, 'target']]
        X = data[feature]
        y = data['target']
        
        r_spearman, p_spearman = spearmanr(X, y)

        # Classifying correlation strength like original paper
        def classify_spearmanr(r):
            if abs(r) < 0.3:
                return "No Correlation"
            elif r > 0:
                return "+ Moderate"
            else:
                return "- Moderate"

        spearman_class = classify_spearmanr(r_spearman)

        # Store results
        spearman_results.append({
            'Feature' : feature,
            'rho (ρ)' : r_spearman,
            'p_val' : p_spearman,
            'Classification' : spearman_class,
            'n' : len(X)
        })
    
    spearman_df = pd.DataFrame(spearman_results)
    spearman_df = spearman_df.sort_values('rho (ρ)', ascending=False).reset_index(drop=True)
    
    return spearman_df
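
To illustrate the difference the two docstrings describe, here is a minimal sketch on synthetic data (the arrays `x` and `y` are made up for illustration and are unrelated to the heart dataset). A perfectly monotonic but non-linear relationship gets a Spearman ρ of 1, while Pearson's r falls short of 1.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Strictly increasing but clearly non-linear relationship
x = np.arange(1, 21, dtype=float)
y = np.exp(x / 4.0)

r_lin, _ = pearsonr(x, y)      # measures linear association on raw values
rho_mono, _ = spearmanr(x, y)  # measures monotonic association on ranks

print(f"Pearson r    = {r_lin:.3f}")
print(f"Spearman rho = {rho_mono:.3f}")
```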

In [10]:
# reproduce Pearson and Spearman correlation data table
pearson_reproduced = pearson_corr(df=df, features=features)
spearman_reproduced = spearman_corr(df=df, features=features)

In [11]:
print("===== PEARSON CORRELATION TABLE REPRODUCED =====")
pearson_reproduced

===== PEARSON CORRELATION TABLE REPRODUCED =====


,Feature,r,p_val,Classification,n
0,cp,0.434854,1.563206e-48,+ Moderate,1025
1,thalach,0.422895,9.962971e-46,+ Moderate,1025
2,slope,0.345512,4.122053e-30,+ Moderate,1025
3,restecg,0.134468,1.564103e-05,No Correlation,1025
4,fbs,-0.041164,0.1878967,No Correlation,1025
5,chol,-0.099966,0.001352571,No Correlation,1025
6,trestbps,-0.138772,8.233015e-06,No Correlation,1025
7,age,-0.229324,1.067722e-13,No Correlation,1025
8,sex,-0.279501,7.523831e-20,No Correlation,1025
9,thal,-0.337838,8.781192e-29,- Moderate,1025


In [12]:
print("===== SPEARMAN CORRELATION TABLE REPRODUCED =====")
spearman_reproduced

===== SPEARMAN CORRELATION TABLE REPRODUCED =====


,Feature,rho (ρ),p_val,Classification,n
0,cp,0.464894,4.312611e-56,+ Moderate,1025
1,thalach,0.429832,2.429945e-47,+ Moderate,1025
2,slope,0.368808,2.2352229999999998e-34,+ Moderate,1025
3,restecg,0.147402,2.144375e-06,No Correlation,1025
4,fbs,-0.041164,0.1878967,No Correlation,1025
5,trestbps,-0.115009,0.0002243838,No Correlation,1025
6,chol,-0.132926,1.959428e-05,No Correlation,1025
7,age,-0.240326,6.221261e-15,No Correlation,1025
8,sex,-0.279501,7.523831e-20,No Correlation,1025
9,thal,-0.398973,1.902988e-40,- Moderate,1025
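
The `fbs` and `sex` rows are identical in the Pearson and Spearman tables. This is expected when both variables take only two values, as the identical results suggest is the case for `fbs`, `sex`, and `target`: ranking a two-valued variable is just a linear rescaling of it, so the rank correlation equals the linear correlation. A small check on synthetic binary data (hypothetical arrays `a` and `b`, not the dataset):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(42)

# Two binary variables; b copies a about 70% of the time and flips it otherwise
a = rng.integers(0, 2, size=500)
b = np.where(rng.random(500) < 0.7, a, 1 - a)

r_bin, _ = pearsonr(a, b)
rho_bin, _ = spearmanr(a, b)

print(f"Pearson r    = {r_bin:.6f}")
print(f"Spearman rho = {rho_bin:.6f}")
```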


##### Table 3 Reproduction: Pearson correlation outcomes.

In [13]:
tab3_rep = pearson_reproduced[['Feature', 'r', 'Classification']].copy()
tab3_rep_cols = ['Features', 'Value of r', 'Degree of Correlation']
tab3_rep.columns = tab3_rep_cols

tab3_rep['Value of r'] = tab3_rep['Value of r'].round(3)
tab3_rep['Degree of Correlation'] = tab3_rep['Degree of Correlation'].astype(str)

tab3_rep

,Features,Value of r,Degree of Correlation
0,cp,0.435,+ Moderate
1,thalach,0.423,+ Moderate
2,slope,0.346,+ Moderate
3,restecg,0.134,No Correlation
4,fbs,-0.041,No Correlation
5,chol,-0.1,No Correlation
6,trestbps,-0.139,No Correlation
7,age,-0.229,No Correlation
8,sex,-0.28,No Correlation
9,thal,-0.338,- Moderate


In [14]:
# Save to CSV for comparison
tab3_rep.to_csv('../contents/tables/table3_reproduced.csv', index=False)

Table 3 as given in the original paper:
<img src="../contents/tables/originals/table3.jpg" width=50%>

##### Table 4 Reproduction: Spearman correlation outcomes.

In [15]:
tab4_rep = spearman_reproduced[['Feature', 'rho (ρ)', 'Classification']].copy()
tab4_rep_cols = ['Features', 'Value of r', 'Degree of Correlation']
tab4_rep.columns = tab4_rep_cols

tab4_rep['Value of r'] = tab4_rep['Value of r'].round(3)
tab4_rep['Degree of Correlation'] = tab4_rep['Degree of Correlation'].astype(str)

tab4_rep

,Features,Value of r,Degree of Correlation
0,cp,0.465,+ Moderate
1,thalach,0.43,+ Moderate
2,slope,0.369,+ Moderate
3,restecg,0.147,No Correlation
4,fbs,-0.041,No Correlation
5,trestbps,-0.115,No Correlation
6,chol,-0.133,No Correlation
7,age,-0.24,No Correlation
8,sex,-0.28,No Correlation
9,thal,-0.399,- Moderate


In [16]:
# Save to CSV for comparison
tab4_rep.to_csv('../contents/tables/table4_reproduced.csv', index=False)

Table 4 as given in the original paper:
<img src="../contents/tables/originals/table4.jpg" width=50%>

##### Test Results: Pearson Correlation and Spearman Correlation Tests
**Overall Results:** Perfectly reproduced. All correlation values exactly match those reported in the original paper.