### Two examples for each statistical test

### 1. One-sample t-test

Suppose you have a sample of exam scores and want to test whether the average score is
significantly different from a hypothesized mean.

In [1]:
import numpy as np
from scipy.stats import ttest_1samp


sample_scores = np.array([72, 78, 85, 88, 95, 90, 82, 88, 76, 80])

# Hypothesized mean
hypothesized_mean = 85

# Perform one-sample t-test
t_statistic, p_value = ttest_1samp(sample_scores, hypothesized_mean)

# Print results
print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")


if p_value < 0.05:
    print("Reject the null hypothesis: The sample mean is significantly different from the hypothesized mean.")
else:
    print("Fail to reject the null hypothesis: There is not enough evidence to conclude a significant difference.")


T-statistic: -0.7152239460985773
P-value: 0.49260628409830287
Fail to reject the null hypothesis: There is not enough evidence to conclude a significant difference.
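
As a rough check on what `ttest_1samp` reports, the t-statistic for the exam scores can be recomputed by hand from the sample mean, the sample standard deviation and the sample size. The sketch below uses only the standard library; the variable names `standard_error` and `t_manual` are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

sample_scores = [72, 78, 85, 88, 95, 90, 82, 88, 76, 80]
hypothesized_mean = 85

n = len(sample_scores)
sample_mean = mean(sample_scores)

# stdev uses the n - 1 denominator (sample standard deviation)
standard_error = stdev(sample_scores) / sqrt(n)

# t = (sample mean - hypothesized mean) / standard error
t_manual = (sample_mean - hypothesized_mean) / standard_error
print(f"Manual t-statistic: {t_manual:.4f}")
```

The value should agree with the T-statistic printed above. Turning it into a p-value additionally requires the t distribution with n - 1 = 9 degrees of freedom, which is the part SciPy supplies.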


Suppose we have a sample of weights from a population and want to test whether the sample mean is
significantly different from the known population mean.

In [2]:
import numpy as np
from scipy.stats import ttest_1samp

# Sample data
sample_weights = np.array([150, 155, 160, 165, 170, 175, 180, 185, 190, 195])

# Known population mean
population_mean = 170

# Perform one-sample t-test
t_statistic, p_value = ttest_1samp(sample_weights, population_mean)

# Print results
print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")


if p_value < 0.05:
    print("Reject the null hypothesis: The sample mean is significantly different from the population mean.")
else:
    print("Fail to reject the null hypothesis: There is not enough evidence to conclude a significant difference.")


T-statistic: 0.5222329678670935
P-value: 0.614117254808394
Fail to reject the null hypothesis: There is not enough evidence to conclude a significant difference.


### 2. Two-sample t-test

A company wants to assess if there is a significant difference in the average sales performance between 
two different sales teams, Team A and Team B.

In [3]:
import numpy as np
from scipy.stats import ttest_ind

# Generate sample data for sales performance
np.random.seed(42)  # for reproducibility
sales_team_a = np.random.normal(loc=500, scale=50, size=30)
sales_team_b = np.random.normal(loc=520, scale=45, size=30)

# Perform two-sample t-test
t_statistic, p_value = ttest_ind(sales_team_a, sales_team_b)

# Interpretation
alpha = 0.05
if p_value < alpha:
    conclusion = "Reject the null hypothesis. There is a significant difference in sales performance."
else:
    conclusion = "Fail to reject the null hypothesis. There is no significant difference in sales performance."

print(conclusion)


Reject the null hypothesis. There is a significant difference in sales performance.


An e-commerce company wants to determine if a website redesign has a significant impact on conversion rates.
They compare the conversion rates before and after the redesign, treating the two sets of measurements as independent samples.

In [4]:
import numpy as np
from scipy.stats import ttest_ind

# Generate sample data for conversion rates
np.random.seed(42)
conversion_before_redesign = np.random.normal(loc=0.1, scale=0.02, size=50)
conversion_after_redesign = np.random.normal(loc=0.12, scale=0.02, size=50)

# Perform two-sample t-test
t_statistic, p_value = ttest_ind(conversion_before_redesign, conversion_after_redesign)

# Interpretation
alpha = 0.05
if p_value < alpha:
    conclusion = "Reject the null hypothesis. The website redesign has a significant impact on conversion rates."
else:
    conclusion = "Fail to reject the null hypothesis. There is no significant impact on conversion rates."

print(conclusion)


Reject the null hypothesis. The website redesign has a significant impact on conversion rates.
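
`ttest_ind` assumes equal population variances by default. When that assumption is doubtful, Welch's t-test is a common alternative (in SciPy, `ttest_ind(..., equal_var=False)`). The following is a minimal sketch of the Welch statistic and its approximate degrees of freedom. The helper `welch_t` and the two small data sets are hypothetical and serve only as illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t-statistic and the Welch-Satterthwaite degrees of freedom."""
    # Per-group variance of the mean (sample variance divided by group size)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical sales figures, not taken from the examples above
team_a = [480, 510, 495, 530, 505]
team_b = [520, 540, 515, 560, 535, 550]

t_welch, df_welch = welch_t(team_a, team_b)
print(f"Welch t-statistic: {t_welch:.3f}, degrees of freedom: {df_welch:.2f}")
```

The degrees of freedom are generally not an integer and fall between the smaller group size minus one and the pooled value `len(a) + len(b) - 2`.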


### 3. One-sample proportion test

A manufacturer claims that only 8% of their products are defective.
To test this claim, a random sample of 200 products is inspected.

In [9]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

total_products = 200
defective_products = 16  
expected_proportion = 0.08


z_statistic, p_value = proportions_ztest(defective_products, total_products, expected_proportion)
print(f'Z-statistic: {z_statistic}')
print(f'P-value: {p_value}')

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: The proportion of defective products is significantly different from 8%.")
else:
    print("Fail to reject the null hypothesis: There is not enough evidence to claim a significant difference from 8%.")


Z-statistic: 0.0
P-value: 1.0
Fail to reject the null hypothesis: There is not enough evidence to claim a significant difference from 8%.
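
The z-statistic above is exactly zero because the observed proportion (16 / 200) equals the claimed 8%. The sketch below recomputes the one-sample z-test by hand. As far as the statsmodels documentation describes it, `proportions_ztest` by default estimates the variance from the sample proportion rather than from the hypothesized one; the helper `one_sample_proportion_z` and its `use_null_variance` flag are hypothetical names introduced here to show both choices, and the count of 24 defects is made up for contrast.

```python
from math import erfc, sqrt


def one_sample_proportion_z(count, nobs, p0, use_null_variance=False):
    """Two-sided z-test for a single proportion against the value p0."""
    p_hat = count / nobs
    # Variance based on the sample proportion, or on p0 if requested
    p_var = p0 if use_null_variance else p_hat
    z = (p_hat - p0) / sqrt(p_var * (1 - p_var) / nobs)
    # Two-sided p-value from the standard normal distribution
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Data from the example above: 16 defective out of 200, claimed rate 8%
z_doc, p_doc = one_sample_proportion_z(16, 200, 0.08)
print(f"z = {z_doc}, p = {p_doc}")

# Hypothetical contrast: 24 defective out of 200
z_sample, _ = one_sample_proportion_z(24, 200, 0.08)
z_null, _ = one_sample_proportion_z(24, 200, 0.08, use_null_variance=True)
print(f"z (sample variance) = {z_sample:.3f}, z (null variance) = {z_null:.3f}")
```

The two variance choices give different statistics whenever the sample proportion differs from the hypothesized one, so it is worth stating which one a reported result uses.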


### 4. Two-sample proportion test

Testing for Website Conversion Rates

In [11]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Example data: Number of conversions and total visitors for two website versions
successes_ctrl, nobs_ctrl = 150, 1000
successes_exp, nobs_exp = 180, 1000

# Perform two-sample proportion test
stat, p_value = proportions_ztest([successes_ctrl, successes_exp], [nobs_ctrl, nobs_exp])


if p_value < 0.05:
    print("There is a statistically significant difference in conversion rates between the two website versions.")
else:
    print("There is no statistically significant difference in conversion rates between the two website versions.")


There is no statistically significant difference in conversion rates between the two website versions.


The two-sample proportion test does not find a statistically significant difference in conversion rates between the control and experimental groups at the 5% level (the p-value is roughly 0.07). The observed increase from 15% to 18% is therefore not enough evidence, on this sample, to conclude that the changes made to the experimental version affect conversion.
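
To see where the decision printed above comes from, the pooled two-proportion z-test can be recomputed by hand. This is a minimal standard-library sketch under the usual assumption of a pooled variance estimate; the names `p_pooled`, `z_manual` and `p_manual` are illustrative.

```python
from math import erfc, sqrt

successes_ctrl, nobs_ctrl = 150, 1000
successes_exp, nobs_exp = 180, 1000

p_ctrl = successes_ctrl / nobs_ctrl
p_exp = successes_exp / nobs_exp

# Pooled proportion under the null hypothesis of equal conversion rates
p_pooled = (successes_ctrl + successes_exp) / (nobs_ctrl + nobs_exp)
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / nobs_ctrl + 1 / nobs_exp))

z_manual = (p_ctrl - p_exp) / standard_error
# Two-sided p-value from the standard normal distribution
p_manual = erfc(abs(z_manual) / sqrt(2))

print(f"z = {z_manual:.4f}, p = {p_manual:.4f}")
```

The p-value lands slightly above 0.05, which is why the test fails to reject the null hypothesis despite the three-point difference in observed rates.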

Product Preference in Market Research

In [12]:
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Example data: Number of customers preferring Product A and Product B
pref_A, total_A = 80, 200
pref_B, total_B = 120, 200

# Perform two-sample proportion test
stat, p_value = proportions_ztest([pref_A, pref_B], [total_A, total_B])

# Business statement
if p_value < 0.05:
    print("There is a statistically significant difference in product preference between Product A and Product B.")
else:
    print("There is no statistically significant difference in product preference between Product A and Product B.")


There is a statistically significant difference in product preference between Product A and Product B.


### 5. ANOVA (analysis of variance) test

In [13]:
import scipy.stats as stats
import pandas as pd

# Sample data for three groups
group1 = [15, 20, 25, 30, 35]
group2 = [10, 18, 25, 32, 40]
group3 = [5, 15, 20, 25, 30]

# Creating a DataFrame
data = pd.DataFrame({'Group1': group1, 'Group2': group2, 'Group3': group3})

# Performing one-way ANOVA
f_statistic, p_value = stats.f_oneway(data['Group1'], data['Group2'], data['Group3'])

# Business statement
if p_value < 0.05:
    print("There is a significant difference in the means of the three groups.")
else:
    print("There is no significant difference in the means of the three groups.")


There is no significant difference in the means of the three groups.


We have three groups, and the one-way ANOVA test is used to determine whether there is a significant difference in their means.
The business statement interprets the result in terms of whether the groups exhibit statistically different means.
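
The F-statistic behind a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below recomputes it for the three groups using only the standard library; the variable names are illustrative, and the p-value (which needs the F distribution) is left to `stats.f_oneway`.

```python
from statistics import mean

groups = [
    [15, 20, 25, 30, 35],
    [10, 18, 25, 32, 40],
    [5, 15, 20, 25, 30],
]

all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_manual = (ss_between / df_between) / (ss_within / df_within)
print(f"F-statistic: {f_manual:.4f}")
```

An F-statistic below 1 means the group means vary less than the within-group noise would suggest, which is consistent with the non-significant result above.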

### 6. Chi-square test

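The chi-square test is a statistical test used to determine whether there is a significant association
between two categorical variables (test of independence) or whether observed category counts match an
expected distribution (goodness-of-fit test).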

Suppose you are a manager at a manufacturing company that produces different types of products (A, B, C). 
You want to assess whether the distribution of produced products matches the expected distribution based on historical data.

In [15]:
import numpy as np
from scipy.stats import chisquare

# Observed data (actual production counts for products A, B, C)
observed_data = np.array([150, 120, 130])

# Expected distribution (based on historical data)
expected_distribution = np.array([0.4, 0.3, 0.3])

# Calculate expected counts (same total as the observed counts)
expected_data = expected_distribution * np.sum(observed_data)

# Perform chi-square goodness-of-fit test.
# chisquare compares observed counts against expected counts;
# chi2_contingency is meant for contingency tables (tests of independence).
chi2, p = chisquare(f_obs=observed_data, f_exp=expected_data)

# Business statement based on the p-value
if p < 0.05:
    print("There is a significant difference between the observed and expected product distribution.")
else:
    print("The observed product distribution is consistent with the expected distribution.")


The observed product distribution is consistent with the expected distribution.
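
For a goodness-of-fit comparison such as the product-distribution scenario above, the chi-square statistic is the sum of (observed - expected)^2 / expected over the categories. A minimal standard-library sketch follows. The closed-form p-value `exp(-chi2 / 2)` used here is valid only for 2 degrees of freedom (three categories), which is an assumption specific to this example; the names `chi2_manual` and `p_manual` are illustrative.

```python
from math import exp

observed = [150, 120, 130]
expected_share = [0.4, 0.3, 0.3]

total = sum(observed)
# Expected counts scaled to the same total as the observed counts
expected = [share * total for share in expected_share]

chi2_manual = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of categories - 1
dof = len(observed) - 1
# Survival function of the chi-square distribution, valid for dof == 2 only
p_manual = exp(-chi2_manual / 2)

print(f"chi2 = {chi2_manual:.4f}, dof = {dof}, p = {p_manual:.4f}")
```

With a p-value well above 0.05, the observed counts do not contradict the historical 40/30/30 split.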