# One-Sample T-Test

The One-Sample T-Test is a statistical method used to determine whether the mean of a single sample is significantly different from a known or hypothesized population mean. This test is ideal for situations where you want to compare a sample statistic against a standard or expectation.

#### Statistical Formula

The formula for the One-Sample T-test is:

$$
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}
$$

where:
- $\bar{x}$ is the sample mean,
- $\mu$ is the population mean (or target mean in this context),
- $s$ is the sample standard deviation,
- $n$ is the sample size.
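As a minimal sketch of the formula, the statistic can be computed by hand and compared with `scipy.stats.ttest_1samp`. The `sample` values and the hypothesized mean `mu` below are made up for illustration and are not part of the dataset used in this section.

```python
import numpy as np
from scipy import stats

# Hypothetical ratings, invented purely for illustration
sample = np.array([7.1, 8.4, 6.9, 7.8, 7.5, 8.0, 6.5, 7.2])
mu = 8.0  # hypothesized population (target) mean

x_bar = sample.mean()      # sample mean
s = sample.std(ddof=1)     # sample standard deviation (n - 1 in the denominator)
n = sample.size            # sample size

# t = (x_bar - mu) / (s / sqrt(n))
t_manual = (x_bar - mu) / (s / np.sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# SciPy's built-in test should agree with the manual calculation
t_scipy, p_scipy = stats.ttest_1samp(sample, mu)

print(f"Manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"SciPy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

Note the `ddof=1` argument: NumPy's default (`ddof=0`) gives the population standard deviation, which would not match the formula above.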

#### Example Dataset: User Experience Ratings

**Description:** This dataset includes user experience ratings for a new feature in a tech product, with ratings on a scale from 1 to 10. The company aims to assess if the average user satisfaction rating significantly deviates from a target rating of 8, considered the benchmark for success.

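**Structure:** Each row in the dataset represents one user's rating for the new feature, stored in a column named `user_ratings`. The simulated dataset also includes `user_id`, `age`, and `engagement_level` columns, which the test itself does not use.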

#### Python Code for One-Sample T-Test

import pandas as pd
import numpy as np
from scipy import stats

np.random.seed(42)  # Ensuring reproducibility

# Generate 5000 random user ratings
user_ratings = np.random.normal(loc=7.5, scale=1.2, size=5000)

# Simulate additional data columns for a more comprehensive dataset
user_ids = range(1, 5001)  # Simulated user IDs
user_ages = np.random.choice(range(18, 65), size=5000)  # Simulated ages of users
user_engagement = np.random.choice(['low', 'medium', 'high'], size=5000)  # Simulated engagement levels

# Create the DataFrame with the additional columns
df_user_experience = pd.DataFrame({
    'user_id': user_ids,
    'age': user_ages,
    'engagement_level': user_engagement,
    'user_ratings': user_ratings
})

# Perform the one-sample t-test against the target mean of 8
target_mean = 8
t_stat, p_value = stats.ttest_1samp(df_user_experience['user_ratings'], target_mean)

print(f"T-statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4e}")


T-statistic: -29.1693
P-value: 7.0894e-173


#### Interpretation
The T-statistic of -29.1693 and the extremely low P-value ($7.0894 \times 10^{-173}$) provide very strong evidence against the null hypothesis that the average user satisfaction rating equals the target rating of 8. The negative sign of the T-statistic shows that the sample mean lies below the target.

**Conclusion**: The results suggest that the average user rating for the new feature significantly deviates from the company's target benchmark of 8, with users rating it lower on average. This significant difference highlights an area for improvement, suggesting that the new feature may not meet user satisfaction levels as expected by the company.
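To see how far below the target the mean rating falls, a 95% confidence interval for the mean can be built from the same quantities the t-test uses. This is only a sketch: it assumes the ratings are regenerated with the same seed and parameters as the simulation in this section, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # same seed as the simulation in this section
ratings = np.random.normal(loc=7.5, scale=1.2, size=5000)
target_mean = 8

n = ratings.size
mean = ratings.mean()
sem = ratings.std(ddof=1) / np.sqrt(n)   # standard error of the mean

# Critical t value for a two-sided 95% interval with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem

print(f"Sample mean: {mean:.3f}")
print(f"95% CI for the mean: ({ci_low:.3f}, {ci_high:.3f})")
print(f"Target inside interval: {ci_low <= target_mean <= ci_high}")
```

With a sample this large the interval is narrow, so even a modest shortfall from the target is statistically significant; whether it matters in practice is a separate judgment.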

# Independent (Two-Sample) T-Test

The independent (two-sample) T-test is a statistical method used to compare the means of two independent groups to determine if there is a statistically significant difference between them. The standard (Student's) version of the test assumes that the data in each group are approximately normally distributed and that the two groups have similar variances.

Note: Different formulas apply depending on whether the sample sizes are equal and whether the variances are similar. A common rule of thumb treats the variances as similar when $\frac{1}{2} < \frac{s_{X_1}}{s_{X_2}} < 2$ and as unequal when $s_{X_1} > 2s_{X_2}$ or $s_{X_2} > 2s_{X_1}$. In the unequal case, Welch's t-test, which does not assume equal variances, is the usual choice.
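The rule of thumb in the note can be sketched as a small helper that picks the `equal_var` argument of `scipy.stats.ttest_ind`. The helper name `variances_look_similar` and the simulated groups are hypothetical, introduced only for illustration.

```python
import numpy as np
from scipy import stats

def variances_look_similar(a, b):
    """Rule of thumb: 1/2 < s_a / s_b < 2, using sample standard deviations."""
    ratio = np.std(a, ddof=1) / np.std(b, ddof=1)
    return bool(0.5 < ratio < 2)

# Simulated groups with clearly different spreads (illustrative values)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=1.0, size=40)
group_b = rng.normal(loc=10.5, scale=5.0, size=60)

equal_var = variances_look_similar(group_a, group_b)

# equal_var=True runs Student's t-test; equal_var=False runs Welch's t-test
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

print(f"Similar variances: {equal_var}")
print(f"T-statistic: {t_stat:.4f}, P-value: {p_value:.4f}")
```

A formal check such as Levene's test is an alternative to this rule of thumb.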

#### Statistical Formula

The formula for the independent two-sample T-test, assuming equal variances and sample sizes, is:

$$
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \cdot \sqrt{\frac{2}{n}}}
$$

where:
- $\bar{X}_1$ and $\bar{X}_2$ are the sample means of the two groups,
- $s_p$ is the pooled standard deviation of the two samples,
- $n$ is the sample size (assuming equal sample sizes for simplicity).

The pooled standard deviation is calculated as:

$$
s_p = \sqrt{\frac{s_{X_1}^2 + s_{X_2}^2}{2}}
$$

where $s_{X_1}^2$ and $s_{X_2}^2$ are the unbiased estimators of the population variance.
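The two formulas above can be checked numerically. The sketch below uses two small made-up samples of equal size and compares the hand-computed statistic with `scipy.stats.ttest_ind` under `equal_var=True`.

```python
import numpy as np
from scipy import stats

# Two hypothetical samples of equal size, invented for illustration
x1 = np.array([5.1, 4.9, 5.6, 5.8, 5.0, 5.4])
x2 = np.array([4.4, 4.7, 4.1, 4.9, 4.5, 4.3])
n = x1.size  # the simplified formula requires x1 and x2 to have the same size

# Pooled standard deviation from the unbiased variance estimates (ddof=1)
s_p = np.sqrt((x1.var(ddof=1) + x2.var(ddof=1)) / 2)

# t = (mean1 - mean2) / (s_p * sqrt(2 / n))
t_manual = (x1.mean() - x2.mean()) / (s_p * np.sqrt(2 / n))

# SciPy's pooled-variance test should give the same statistic
t_scipy, p_scipy = stats.ttest_ind(x1, x2, equal_var=True)

print(f"Manual t: {t_manual:.4f}")
print(f"SciPy t:  {t_scipy:.4f} (p = {p_scipy:.4f})")
```

For unequal sample sizes the pooled variance weights each group by its degrees of freedom, so this simplified form no longer applies.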

#### Example Scenario: Comparing Sepal Widths of Iris Setosa and Iris Versicolor

In this scenario, a botanist wants to determine if there is a significant difference in the sepal widths between two species of the Iris flower: Iris Setosa and Iris Versicolor. The botanist uses the Iris dataset, which includes measurements of sepal widths among other features, for this analysis.

#### Python Code for Independent Two-Sample T-Test

import pandas as pd
from scipy import stats

# Load the Iris dataset
url = "https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv"
iris = pd.read_csv(url)

# Filter the data for Iris Setosa and Iris Versicolor
setosa = iris[iris['species'] == 'setosa']['sepal_width']
versicolor = iris[iris['species'] == 'versicolor']['sepal_width']

# Check sample sizes
print(f"Sample size for Setosa: {len(setosa)}")
print(f"Sample size for Versicolor: {len(versicolor)}")

# Test for equal variances using Levene's test
stat, p = stats.levene(setosa, versicolor)
print(f"Levene's test for equal variances: Statistic={stat:.4f}, P-value={p:.4f}")

# If p-value > 0.05, we fail to reject the null hypothesis of equal variances
if p > 0.05:
    print("Assumption of equal variances is reasonable.")
    # Perform the independent two-sample T-test with equal variances assumed
    t_stat, p_value = stats.ttest_ind(setosa, versicolor, equal_var=True)
    print(f"T-statistic: {round(t_stat, 4)}")
    print(f"P-value: {p_value:.2e}")
else:
    print("Assumption of equal variances is not reasonable.")
    # Consider using Welch's t-test, which does not assume equal variances
    t_stat, p_value = stats.ttest_ind(setosa, versicolor, equal_var=False)
    print(f"Welch's T-statistic: {round(t_stat, 4)}")
    print(f"P-value: {p_value:.2e}")

Sample size for Setosa: 50
Sample size for Versicolor: 50
Levene's test for equal variances: Statistic=0.6635, P-value=0.4173
Assumption of equal variances is reasonable.
T-statistic: 9.2828
P-value: 4.36e-15


#### Interpretation

Before interpreting the t-test results for sepal widths, consider the preliminary step: Levene's test. It checks the assumption of equal variances between the two groups, which determines which version of the t-test to use.

**Formulating Hypotheses in Levene's Test**  
Levene's test examines the null hypothesis ($H_0$) that the variances across groups are equal against the alternative hypothesis ($H_1$) that they are not. This verification step is essential to choose the correct version of the t-test.

**Understanding Statistic and P-value in Levene's Test**  
- **Statistic**: Levene's test statistic quantifies the extent to which the group variances differ from each other. A higher value indicates a greater disparity in variances.
- **P-value**: The probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. A p-value higher than the chosen significance level (typically 0.05) means there is insufficient evidence to reject the null hypothesis of equal variances, which points to the standard two-sample t-test with equal variances assumed.

Here Levene's test gives a P-value of 0.4173, so there is no evidence against equal variances, and we proceed to interpret the independent two-sample t-test with equal variances assumed.
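How the Levene statistic and p-value respond to a difference in spread can be illustrated with constructed data. In this sketch the arrays are artificial: `shifted` has exactly the same spread as `base`, while `stretched` is four times as wide.

```python
import numpy as np
from scipy import stats

# Artificial groups built so that the spreads are known in advance
base = np.linspace(-1, 1, 50)
shifted = base + 5       # different mean, identical spread
stretched = base * 4     # same mean, four times the spread

# Identical spreads: statistic near 0, p-value near 1
stat_same, p_same = stats.levene(base, shifted)

# Very different spreads: large statistic, very small p-value
stat_wide, p_wide = stats.levene(base, stretched)

print(f"Same spread:      statistic={stat_same:.4f}, p-value={p_same:.4f}")
print(f"Different spread: statistic={stat_wide:.4f}, p-value={p_wide:.2e}")
```

The first comparison also shows that Levene's test ignores differences in location: shifting a group changes its mean but not the test result.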

The T-statistic of 9.2828 expresses the difference between the mean sepal widths of Iris Setosa and Iris Versicolor in units of its standard error, that is, standardized by the variability observed in the samples. The positive sign means that Setosa, the first group passed to the test, has the larger sample mean.

The P-value of 4.36e-15 is far below the commonly used significance level (α = 0.05), so we reject the null hypothesis that there is no difference in mean sepal width between the two species.

**Conclusion**: There is a statistically significant difference in the sepal widths between Iris Setosa and Iris Versicolor. This result suggests that sepal width can be one of the distinguishing features between these two Iris species.

Given the extremely low P-value, a difference in means this large would be exceedingly unlikely if the two species had the same mean sepal width. This supports the conclusion that Iris Setosa and Iris Versicolor differ with respect to their sepal widths, based on the dataset analyzed. This information could be valuable for botanists or researchers interested in the morphological differentiation among Iris species, contributing to classification, identification, and understanding of species variation.


# Paired (Matched) T-Test

The Paired (Matched) T-Test is a statistical method used to compare the means of two related groups to determine if there is a statistically significant difference between them. This test is particularly useful for analyzing the effects of a specific condition or treatment on the same subjects.

Note: The Paired T-test assumes that the differences between paired observations are normally distributed. It is not suitable for independent samples or more than two related groups.

#### Statistical Formula

The formula for the Paired T-test is:

$$
t = \frac{\bar{d}}{s_d / \sqrt{n}}
$$

where:
- $\bar{d}$ is the mean difference between paired observations,
- $s_d$ is the standard deviation of the differences,
- $n$ is the number of pairs.
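The formula can be sketched directly from the pairwise differences. The before/after timings below are a small illustrative set; the manual result is compared with `scipy.stats.ttest_rel`, and also with a one-sample t-test of the differences against zero, which is mathematically the same test.

```python
import numpy as np
from scipy import stats

# Small illustrative set of paired measurements
time_before = np.array([60, 62, 65, 63, 66, 67, 68])
time_after = np.array([55, 59, 61, 64, 63, 64, 65])

d = time_before - time_after   # pairwise differences
n = d.size                     # number of pairs
d_bar = d.mean()               # mean difference
s_d = d.std(ddof=1)            # sample standard deviation of the differences

# t = d_bar / (s_d / sqrt(n))
t_manual = d_bar / (s_d / np.sqrt(n))

# Built-in paired test, and the equivalent one-sample test on the differences
t_rel, p_rel = stats.ttest_rel(time_before, time_after)
t_one, p_one = stats.ttest_1samp(d, 0)

print(f"Manual t:      {t_manual:.4f}")
print(f"ttest_rel t:   {t_rel:.4f} (p = {p_rel:.4e})")
print(f"ttest_1samp t: {t_one:.4f} (p = {p_one:.4e})")
```

Because only the differences enter the formula, the normality assumption applies to `d` rather than to the two raw columns.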

#### Example Dataset: Before and After Study on Software Optimization

**Description:** This dataset contains execution times for a set of computational tasks performed by a software application, measured before and after an optimization update. Each row represents a unique task with two columns for execution times: `Time_Before` and `Time_After`.

#### Python Code for Paired T-Test

import pandas as pd
from scipy import stats

# Simulated data for demonstration
time_before = [60, 62, 65, 63, 66, 67, 68]
time_after = [55, 59, 61, 64, 63, 64, 65]

# Creating a DataFrame
df = pd.DataFrame({'Time_Before': time_before, 'Time_After': time_after})

# Perform the paired t-test
t_stat, p_value = stats.ttest_rel(df['Time_Before'], df['Time_After'])

print(f"Paired T-test statistic: {t_stat:.4f}")
print(f"P-value: {p_value:.4e}")

Paired T-test statistic: 4.0544
P-value: 6.6926e-03


#### Interpretation  

Based on the simulated data, the Paired T-test statistic is 4.0544 and the P-value is 6.6926e-03 (about 0.0067), which is below 0.05. This provides strong evidence against the null hypothesis that the mean difference in execution times before and after the optimization is zero. The positive statistic reflects a mean reduction of about 2.86 time units after the optimization.

**Conclusion**: There is a statistically significant improvement in the software's performance post-optimization, as indicated by the reduced execution times for computational tasks. This result underscores the effectiveness of the optimization in enhancing software efficiency.

This analysis demonstrates the value of the Paired T-test in assessing the impact of changes or treatments in before-and-after studies, particularly when the same subjects or items are involved under both conditions.