<h1 style='font-size: 25px; color: crimson; font-family: Colonna MT; font-weight: 600; text-align: center'>Paired Samples T-Test (e.g., Before and After Data)</h1>

---


The paired t-test is used when comparing two related or matched groups. The data consists of pairs of observations that are naturally linked, such as measurements taken before and after a treatment or intervention on the same subjects. The paired t-test is useful when you want to see if there is a significant change within the same sample over time or under different conditions.

**Example**: <span style='color: green'>*A researcher might want to compare the soil pH level before and after adding lime to the soil. Since both measurements are taken from the same plot, they are not independent, and a paired t-test would be used to determine whether there is a significant difference in the pH levels before and after treatment.*</span>
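To make the mechanics concrete, the sketch below computes the paired t-statistic by hand from the per-pair differences and compares it with `scipy.stats.ttest_rel`. The pH readings are hypothetical values chosen only for illustration; they are not part of the dataset analysed in this notebook.

```python
import numpy as np
from scipy import stats

# Hypothetical before/after pH readings for five plots (illustrative only)
before = np.array([6.4, 6.1, 6.5, 6.2, 6.0])
after = np.array([6.7, 6.5, 6.6, 6.6, 6.4])

# The paired test works on the per-plot differences, not on the two groups
diff = before - after
n = diff.size

# t = mean(diff) / (sd(diff) / sqrt(n)), with n - 1 degrees of freedom
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)  # two-sided p-value

# SciPy's paired test should agree with the manual calculation
t_scipy, p_scipy = stats.ttest_rel(before, after)
print(f"manual: t={t_manual:.4f}, p={p_manual:.4f}")
print(f"scipy:  t={t_scipy:.4f}, p={p_scipy:.4f}")
```

Because the statistic depends only on the differences, a paired t-test is equivalent to a one-sample t-test of those differences against zero.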

**Assumptions for the paired t-test:**

1. The differences between the paired observations should be approximately normally distributed.

2. The data points should be paired in a meaningful way, such as "before and after" measurements.
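The first assumption can be checked informally with a Shapiro-Wilk test on the differences. The sketch below is one possible check, not part of the original analysis: the helper name `check_difference_normality` is hypothetical, and the values mirror the Nitrogen columns of the sample data used later in this notebook. With only five pairs the test has very little power, so a non-significant result is weak evidence at best.

```python
import numpy as np
from scipy.stats import shapiro

def check_difference_normality(before, after, alpha=0.05):
    """Shapiro-Wilk test on the paired differences (hypothetical helper)."""
    diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    stat, p_value = shapiro(diff)
    # A p-value below alpha suggests the differences depart from normality
    return {"W": float(stat), "p_value": float(p_value),
            "looks_normal": bool(p_value >= alpha)}

# Nitrogen (%) values from the sample data in this notebook
nitrogen_before = [6.4, 3.3, 6.5, 5.2, 4.1]
nitrogen_after = [8.7, 9.6, 6.9, 9.5, 6.4]

normality = check_difference_normality(nitrogen_before, nitrogen_after)
print(normality)
```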


**Disclaimer**: <span style='color: red; font-weight: 600;'>*This test was performed under the assumption that the data meet all requirements for a parametric test.*</span>


In [63]:
# Import required libraries
from scipy.stats import ttest_rel
import pandas as pd

# Run a paired t-test for each (before, after) column pair
def paired_t_test(df, pairs, alpha=0.05):
    results = []
    for param, (before_col, after_col) in pairs.items():
        # Skip pairs whose columns are missing from the DataFrame
        if before_col in df.columns and after_col in df.columns:
            # Drop rows missing either value so that each "before"
            # observation stays matched to its own "after" observation
            paired = df[[before_col, after_col]].dropna()
            before_data = paired[before_col]
            after_data = paired[after_col]

            t_stat, p_value = ttest_rel(before_data, after_data)
            conclusion = "Significant Difference" if p_value < alpha else "No Significant Difference"

            results.append({
                "Parameter": param,
                "Before Mean": before_data.mean(),
                "After Mean": after_data.mean(),
                "T-Statistic": t_stat,
                "P-Value": p_value,
                "Alpha": alpha,
                "Conclusion": conclusion
            })

    return pd.DataFrame(results)


# Example usage with a small sample dataset
data = {
    "Soil_pH_Before": [6.4, 6.3, 6.5, 6.2, 6.1], "Soil_pH_After": [6.7, 6.6, 6.8, 6.5, 6.4], 
    "Nitrogen_(%)_Before": [6.4, 3.3, 6.5, 5.2, 4.1], "Nitrogen_(%)_After": [8.7, 9.6, 6.9, 9.5, 6.4],
    "Phosphorous (%)_Before": [6.4, 6.3, 6.5, 6.2, 6.1], "Phosphorous (%)_After": [6.7, 6.6, 6.8, 6.5, 6.4],
    "CEC (Meq/100g)_Before": [9.4, 6.3, 8.5, 6.2, 5.1], "CEC (Meq/100g)_After": [6.7, 8.6, 9.8, 6.5, 7.4]  
    }

data = pd.DataFrame(data)
parameter_pairs = {
    "Soil pH": ("Soil_pH_Before", "Soil_pH_After"),
    "Nitrogen (%)": ("Nitrogen_(%)_Before", "Nitrogen_(%)_After"), 
    "Phosphorous (%)": ("Phosphorous (%)_Before", "Phosphorous (%)_After"), 
    "CEC (Meq/100g)": ("CEC (Meq/100g)_Before", "CEC (Meq/100g)_After"),
    }


results_df = paired_t_test(data, parameter_pairs)
results_df

Unnamed: 0,Parameter,Before Mean,After Mean,T-Statistic,P-Value,Alpha,Conclusion
0,Soil pH,6.3,6.6,-1688026000000000.0,7.389838e-61,0.05,Significant Difference
1,Nitrogen (%),5.1,8.22,-3.100834,0.03619183,0.05,Significant Difference
2,Phosphorous (%),6.3,6.6,-1688026000000000.0,7.389838e-61,0.05,Significant Difference
3,CEC (Meq/100g),7.1,7.8,-0.7548294,0.4923579,0.05,No Significant Difference
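The very large t-statistics for Soil pH and Phosphorous (%) deserve a caveat. In the sample data every "after" value is exactly 0.3 higher than its "before" value, so the standard deviation of the differences is zero apart from floating-point rounding, and the t-statistic divides by that near-zero number. The sketch below reproduces the diagnosis with the Soil pH values from the sample data; the variable names are illustrative.

```python
import numpy as np

# Soil pH values from the sample data above
soil_ph_before = np.array([6.4, 6.3, 6.5, 6.2, 6.1])
soil_ph_after = np.array([6.7, 6.6, 6.8, 6.5, 6.4])

diff = soil_ph_after - soil_ph_before
sd = diff.std(ddof=1)

# True when every pair changed by (numerically) the same amount
is_constant = bool(np.allclose(diff, diff[0]))

print("differences:", diff)
print("sd of differences:", sd)
print("constant shift:", is_constant)
```

When the shift is constant like this, the reported t-statistic and p-value are numerical artifacts rather than meaningful estimates, and it is usually clearer to report the constant change directly.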


<span style='color: purple; font-weight: 600; font-size: 20px;'>Additional Implementations</span>

In [43]:
# Import the dataset containing all samples
pd.set_option('display.max_columns', 10)
filepath = 'Datasets/Pair dataset.xlsx'
df = pd.read_excel(filepath)
df.head(10)

Unnamed: 0,SampleID,Soil pH1,Soil pH2,Nitrogen (%) 1,Nitrogen (%) 2,Phosphorous (%) 1,Phosphorous (%) 2,CEC (Meq/100g) 1,CEC (Meq/100g) 2
0,97153,3.63,5.63,4.81,3.88,5.24,7.62,340.12,399.12
1,46858,5.69,7.69,4.59,3.76,4.12,6.63,274.93,328.93
2,90098,3.8,5.8,2.42,2.18,7.35,4.55,307.0,353.0
3,15254,3.57,5.57,4.81,4.18,2.08,6.89,201.27,246.27
4,83474,5.23,7.23,2.79,1.9,4.68,5.3,331.13,377.13
5,32908,4.52,6.52,3.83,3.43,6.14,6.32,309.9,354.9
6,60502,4.4,6.4,3.92,3.09,5.92,6.37,276.52,328.52
7,76688,4.47,6.47,4.78,4.55,3.72,7.64,332.53,374.53
8,32442,4.21,6.21,4.71,4.14,5.03,6.85,273.09,333.09
9,23943,4.26,6.26,2.22,1.54,6.26,5.06,323.72,363.72



<span style='color: purple; font-weight: 600; font-size: 20px;'>Define the column pairs for each parameter and run the test</span>


In [64]:
parameter_pairs = {
    "Soil pH": ("Soil pH1", "Soil pH2"),
    "Nitrogen (%)": ("Nitrogen (%) 1", "Nitrogen (%) 2"),
    "Phosphorous (%)": ("Phosphorous (%) 1", "Phosphorous (%) 2"),
    "CEC (Meq/100g)": ("CEC (Meq/100g) 1", "CEC (Meq/100g) 2")
}

results = paired_t_test(df, parameter_pairs)
results.round(3)



Unnamed: 0,Parameter,Before Mean,After Mean,T-Statistic,P-Value,Alpha,Conclusion
0,Soil pH,4.382,6.382,-1.028018e+17,0.0,0.05,Significant Difference
1,Nitrogen (%),3.537,3.038,17.047,0.0,0.05,Significant Difference
2,Phosphorous (%),4.941,5.999,-4.961,0.0,0.05,Significant Difference
3,CEC (Meq/100g),272.856,322.436,-84.698,0.0,0.05,Significant Difference
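A P-Value of 0.0 in the table only reflects rounding to three decimals, and a p-value alone says nothing about the size of the change. One way to complement it, sketched here as a suggestion rather than as part of the original analysis, is a confidence interval for the mean paired difference. The helper `mean_difference_ci` is hypothetical, and the example uses only the ten Nitrogen (%) rows shown in the `df.head(10)` preview, not the full dataset.

```python
import numpy as np
from scipy import stats

def mean_difference_ci(before, after, confidence=0.95):
    """Mean of (after - before) with a t-based confidence interval (hypothetical helper)."""
    diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    n = diff.size
    mean_diff = diff.mean()
    sem = diff.std(ddof=1) / np.sqrt(n)  # standard error of the mean difference
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return mean_diff, mean_diff - t_crit * sem, mean_diff + t_crit * sem

# Nitrogen (%) values from the ten preview rows only
nitrogen_1 = [4.81, 4.59, 2.42, 4.81, 2.79, 3.83, 3.92, 4.78, 4.71, 2.22]
nitrogen_2 = [3.88, 3.76, 2.18, 4.18, 1.90, 3.43, 3.09, 4.55, 4.14, 1.54]

mean_diff, lower, upper = mean_difference_ci(nitrogen_1, nitrogen_2)
print(f"mean difference: {mean_diff:.3f}, 95% CI: ({lower:.3f}, {upper:.3f})")
```

An interval that excludes zero points the same way as a significant paired t-test, while also showing how large the change plausibly is.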


---

This analysis was performed by **Jabulente**, a passionate and dedicated data scientist with a strong commitment to using data to drive meaningful insights and solutions. For inquiries, collaborations, or further discussions, please feel free to reach out via the links below.

---

<div align="center">  
    
[![GitHub](https://img.shields.io/badge/GitHub-Jabulente-black?logo=github)](https://github.com/Jabulente)  [![LinkedIn](https://img.shields.io/badge/LinkedIn-Jabulente-blue?logo=linkedin)](https://linkedin.com/in/jabulente-208019349)  [![Email](https://img.shields.io/badge/Email-jabulente@hotmail.com-red?logo=gmail)](mailto:Jabulente@hotmail.com)  

</div>

<h1 style='font-size: 35px; color: Tomato; font-family: Colonna MT; font-weight: 700; text-align: center'>THE END</h1>