<h1 style='font-size: 25px; color: crimson; font-family: Colonna MT; font-weight: 600; text-align: center'>One-Sample T-Test</h1>

---


The one-sample t-test compares the mean of a single sample to a known or hypothesized value, such as a population mean or a theoretical expectation. It assesses whether the difference between the sample mean and that reference value is larger than would be expected from sampling variation alone.
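
For reference, the test statistic can be written as below. The symbols are introduced here for illustration and do not appear elsewhere in this notebook: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $\mu_0$ the hypothesized mean.

```latex
$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
$$
```

If the null hypothesis is true and the assumptions listed further down hold, $t$ follows a Student's t distribution with $n - 1$ degrees of freedom, which is where the p-value comes from.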

**Example**: <span style='color: green'>*Suppose a farmer wants to test whether the average soil nitrogen level in a plot of land is different from the national average of 2.5%. The one-sample t-test would compare the average nitrogen level in the sample from the plot to the national average of 2.5%.*</span>
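
The farmer example can be sketched in a few lines. The nitrogen readings below are made-up illustrative values, not real measurements, and the variable names are assumptions chosen for this sketch. The t-statistic is computed by hand and then compared with SciPy's `ttest_1samp`, which runs a two-sided test by default.

```python
import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical nitrogen readings (%) from one plot; illustrative values only
nitrogen = np.array([2.7, 2.4, 2.9, 2.6, 2.8, 2.5, 3.0, 2.6])
national_average = 2.5  # hypothesized mean taken from the example above

# Manual computation of the t-statistic
n = nitrogen.size
sample_mean = nitrogen.mean()
sample_sd = nitrogen.std(ddof=1)  # sample standard deviation (n - 1 denominator)
t_manual = (sample_mean - national_average) / (sample_sd / np.sqrt(n))

# Same test via SciPy (two-sided by default)
t_scipy, p_value = ttest_1samp(nitrogen, national_average)

print(f"manual t = {t_manual:.4f}, scipy t = {t_scipy:.4f}, p = {p_value:.4f}")
```

The two t values should agree up to floating-point error, which is a quick way to confirm what the library call is doing.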

**Assumptions for the one-sample t-test:**

1. The sample data should be approximately normally distributed. This matters most for small samples (typically fewer than 30 observations).

2. The data points should be independent of each other, meaning that the measurement of one observation does not influence another.
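
One possible way to screen the normality assumption is a Shapiro-Wilk test. This is a minimal sketch rather than part of the original analysis: `check_normality` is a hypothetical helper, and the sample is randomly generated for illustration. Independence cannot be checked this way, since it depends on how the data were collected.

```python
import numpy as np
from scipy.stats import shapiro

def check_normality(values, alpha=0.05):
    """Return (W statistic, p-value, looks_normal) from a Shapiro-Wilk test."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # drop missing values before testing
    w_stat, p_value = shapiro(values)
    # Failing to reject normality is not proof of normality
    return w_stat, p_value, p_value >= alpha

# Illustrative small sample drawn from a normal distribution
rng = np.random.default_rng(0)
sample = rng.normal(loc=5, scale=2, size=25)

w_stat, p_value, looks_normal = check_normality(sample)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}, looks normal: {looks_normal}")
```

A p-value at or above `alpha` only means the test found no evidence against normality. A histogram or Q-Q plot is a useful companion check.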


**Disclaimer**: <span style='color: red; font-weight: 600;'>*This test was performed under the assumption that the data meet all requirements for a parametric test.*</span>


<span style='color: purple; font-weight: 600; font-size: 20px;'> Import required libraries and dataset </span>

In [14]:
# Import required libraries
from scipy.stats import ttest_1samp
import pandas as pd
import numpy as np

# Generate a demonstration dataset
def dataset_generation(sample_size=1000):
    np.random.seed(42)
    Plot = np.random.choice(['Plot 1', 'Plot 2', 'Plot 3', 'Plot 4'], size=sample_size)
    Nitrogen = np.random.normal(5, 10, size=sample_size)
    Phosphorous = np.random.normal(7, 2, size=sample_size) 
    Calicium = np.random.normal(5, 8, size=sample_size)
    Zinc = np.random.normal(2, 5, size=sample_size)
    Magnesium = np.random.normal(5, 10, size=sample_size)
    Sulphur = np.random.normal(3, 9, size=sample_size)
    
    data = pd.DataFrame({
        "Plot": Plot,
        'Nitrogens': Nitrogen,
        'Phosphorous': Phosphorous,
        'Calicium': Calicium,
        'Zinc': Zinc,
        'Magnesium': Magnesium,
        'Sulphur': Sulphur
        
    })
    return data

pd.set_option('display.float_format', lambda x: '%.2f' % x)
pd.set_option('display.max_columns', 10)
df = dataset_generation(sample_size=1000)
display(df)

| | Plot | Nitrogens | Phosphorous | Calicium | Zinc | Magnesium | Sulphur |
|---|---|---|---|---|---|---|---|
| 0 | Plot 3 | 8.42 | 9.60 | 3.16 | -2.08 | -1.86 | 15.16 |
| 1 | Plot 4 | 23.76 | 10.12 | -2.39 | 2.39 | -9.33 | -10.61 |
| 2 | Plot 1 | 14.50 | 7.06 | 12.12 | 6.31 | 6.46 | -8.98 |
| 3 | Plot 3 | -0.77 | 5.49 | 13.28 | 2.70 | 10.85 | -16.60 |
| 4 | Plot 3 | -3.98 | 7.92 | -9.77 | -5.88 | 10.15 | -3.29 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | Plot 1 | 0.12 | 5.09 | 14.52 | 2.88 | -1.20 | -9.95 |
| 996 | Plot 1 | 26.57 | 7.69 | 6.70 | -1.33 | -4.09 | -1.64 |
| 997 | Plot 4 | -1.06 | 6.90 | 13.22 | 2.99 | 21.85 | 1.86 |
| 998 | Plot 4 | 12.42 | 7.07 | 13.85 | -1.04 | 13.42 | 6.96 |


<span style='color: purple; font-weight: 600; font-size: 20px;'>Function Logic and Implementations </span>

In [16]:
def one_sample_t_test(df, columns, population_means, alpha=0.05):
    """Run a two-sided one-sample t-test on each column that has a hypothesized mean.

    Columns missing from `df` or from `population_means` are skipped.
    """
    results = []
    for col in columns:
        if col in df.columns and col in population_means:
            sample_data = df[col].dropna()  # Remove NaN values
            pop_mean = population_means[col]

            # ttest_1samp is imported directly from scipy.stats
            t_stat, p_value = ttest_1samp(sample_data, pop_mean)
            interpretation = "Significant Difference" if p_value < alpha else "No Significant Difference"

            results.append({
                "Parameter": col,
                "Sample Mean": sample_data.mean(),
                "Hypothesized Mean": pop_mean,
                "T-Statistic": t_stat,
                "P-Value": p_value,
                "Alpha": alpha,
                "Conclusion": interpretation
            })

    return pd.DataFrame(results)


# Hypothesized means; keys must match the DataFrame column names exactly
population_means = {"Nitrogens": 2.5, "Phosphorous": 3, "Calicium": 4, "Zinc": 6, "Magnesium": 5.7, "Sulphur": 3}
results_df = one_sample_t_test(df, df.columns, population_means)
display(results_df)

| | Parameter | Sample Mean | Hypothesized Mean | T-Statistic | P-Value | Alpha | Conclusion |
|---|---|---|---|---|---|---|---|
| 0 | Nitrogens | 5.4 | 2.5 | 9.16 | 0.0 | 0.05 | Significant Difference |
| 1 | Phosphorous | 7.08 | 3.0 | 65.52 | 0.0 | 0.05 | Significant Difference |
| 2 | Calicium | 5.01 | 4.0 | 3.97 | 0.0 | 0.05 | Significant Difference |
| 3 | Zinc | 1.76 | 6.0 | -26.49 | 0.0 | 0.05 | Significant Difference |
| 4 | Magnesium | 4.87 | 5.7 | -2.65 | 0.01 | 0.05 | Significant Difference |
| 5 | Sulphur | 2.73 | 3.0 | -0.92 | 0.36 | 0.05 | No Significant Difference |
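
Several P-Value entries above appear as 0.0. Because the `display.float_format` option set earlier rounds to two decimals, these are most likely very small p-values rather than exact zeros (although extremely small values can also underflow to zero in floating point). A minimal sketch of one way to display them more informatively follows. `format_p_value` is a hypothetical helper, and the demo values are made up rather than taken from the results above.

```python
import pandas as pd

def format_p_value(p, threshold=0.001):
    """Format a p-value: scientific notation below `threshold`, else 3 decimals."""
    return f"{p:.2e}" if p < threshold else f"{p:.3f}"

# Made-up p-values for illustration only
demo = pd.DataFrame({"Parameter": ["A", "B"], "P-Value": [3.2e-19, 0.36]})
demo["P-Value (formatted)"] = demo["P-Value"].map(format_p_value)

print(demo["P-Value (formatted)"].tolist())
```

Storing the formatted values in a separate string column keeps the numeric `P-Value` column available for further computation.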


---

This analysis was performed by **Jabulente**, a passionate and dedicated data scientist with a strong commitment to using data to drive meaningful insights and solutions. For inquiries, collaborations, or further discussions, please feel free to reach out via the links below.

---

<div align="center">  
    
[![GitHub](https://img.shields.io/badge/GitHub-Jabulente-black?logo=github)](https://github.com/Jabulente)  [![LinkedIn](https://img.shields.io/badge/LinkedIn-Jabulente-blue?logo=linkedin)](https://linkedin.com/in/jabulente-208019349)  [![Email](https://img.shields.io/badge/Email-jabulente@hotmail.com-red?logo=gmail)](mailto:Jabulente@hotmail.com)  

</div>

<h1 style='font-size: 45px; color: Tomato; font-family: Colonna MT; font-weight: 700; text-align: center'>THE END</h1>