# Assignment: Hypothesis Testing for Business Analytics

This assignment will demonstrate my ability to: 
1. Formulate and test statistical hypotheses based on business data. 
2. Use one-sample, two-sample, and paired t‑tests to compare means. 
3. Compute and interpret 95% confidence intervals. 
4. Apply these statistical methods to real-world business scenarios, supporting decision-making for TechTrends.

In [35]:
import numpy as np
from scipy import stats
import pandas as pd

sales_df = pd.read_csv('sales-data.csv')
prod_perf_df = pd.read_csv('product-performance.csv')

## Task 1: Sales Performance Analysis Using Hypothesis Testing

### Part 1

In [None]:
# Define the benchmark for average monthly sales
benchmark = 650000

# Calculate the average monthly sales in 2024
sales_2024 = sales_df[sales_df['Year'] == 2024]
mean = sales_2024['Sales'].mean()
print(f'The average monthly sales in 2024: ${mean:.2f} \n')

# Perform a one-sample t-test
t_stat, p_value = stats.ttest_1samp(sales_2024['Sales'], benchmark)
print("One-Sample T-Test: \nTesting if average monthly sales in 2024 differ significantly from a benchmark of $650,000:")
print("     T-Statistic:", t_stat)
print("     P-value:", p_value)

# Interpret the result at the 5% significance level
if p_value < 0.05:
    print(f'\nResult: The average monthly sales in 2024 ({mean:.2f}) is significantly different from the benchmark of $650,000.')
else:
    print(f'\nResult: The average monthly sales in 2024 ({mean:.2f}) is NOT significantly different from the benchmark of $650,000.')


The average monthly sales in 2024: $632916.67 

One-Sample T-Test: 
Testing if average monthly sales in 2024 differ significantly from a benchmark of $650,000:
     T-Statistic: -0.4641389266211005
     P-value: 0.6515986342131292

Result: The average monthly sales in 2024 (632916.67) is NOT significantly different from the benchmark of $650,000.
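
A 95% confidence interval for the 2024 mean would complement the one-sample test above, since the assignment objectives also call for interval estimates. The following is a minimal sketch rather than part of the original analysis: the `monthly_sales` values are hypothetical placeholders so the cell runs on its own, and in the notebook they would be replaced by `sales_2024['Sales']`.

```python
import numpy as np
from scipy import stats

# Hypothetical monthly sales figures (assumption, NOT the assignment data);
# in the notebook this would be sales_2024['Sales'].
monthly_sales = np.array([610000, 540000, 720000, 655000, 590000, 700000,
                          480000, 630000, 810000, 575000, 660000, 600000],
                         dtype=float)

n = monthly_sales.size
sample_mean = monthly_sales.mean()
std_err = stats.sem(monthly_sales)  # standard error of the mean (ddof=1)

# Two-sided 95% CI for the mean, using the t distribution with n - 1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, df=n - 1, loc=sample_mean, scale=std_err)
print(f"95% CI for the mean: (${ci_low:,.2f}, ${ci_high:,.2f})")

# Cross-check against the one-sample t-test on the same data
benchmark = 650000
t_stat, p_value = stats.ttest_1samp(monthly_sales, benchmark)
inside = ci_low <= benchmark <= ci_high
print("Benchmark inside the 95% CI:", inside)
print("P-value:", p_value)
```

The interval and the test are two views of the same calculation: the benchmark falls inside the 95% interval exactly when the two-sided p-value is at least 0.05, so the two results should always agree.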


### Part 2

In [None]:
# Select the monthly sales for each year once, then reuse the series
monthly_2023 = sales_df[sales_df['Year'] == 2023]['Sales']
monthly_2024 = sales_df[sales_df['Year'] == 2024]['Sales']

# Calculate and print the average monthly sales for both years
avg_sales_2023 = monthly_2023.mean()
avg_sales_2024 = monthly_2024.mean()
print(f"Average sales in 2023: ${avg_sales_2023:.2f}")
print(f"Average sales in 2024: ${avg_sales_2024:.2f}")

# Perform a two-sample (independent) t-test; ttest_ind assumes equal variances by default
t_stat, p_value = stats.ttest_ind(monthly_2023, monthly_2024)
print("\nTwo-Sample T-Test: Monthly Average Sales 2023-24 Comparison")
print("T-Statistic:", t_stat)
print("P-value:", p_value)

# Interpret the result at the 5% significance level
# (uses avg_sales_2024 from this cell instead of `mean` from Part 1)
if p_value < 0.05:
    print(f'\nResult: The average monthly sales in 2024 ({avg_sales_2024:.2f}) is significantly different from that of 2023.')
else:
    print(f'\nResult: The average monthly sales in 2024 ({avg_sales_2024:.2f}) is NOT significantly different from that of 2023.')


Average sales in 2023: $580000.00
Average sales in 2024: $632916.67

Two-Sample T-Test: Monthly Average Sales 2023-24 Comparison
T-Statistic: -1.089005096540923
P-value: 0.28793754034624225

Result: The average monthly sales in 2024 (632916.67) is NOT significantly different from that of 2023.
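
The two-sample test above relies on the default behaviour of `stats.ttest_ind`, which pools the variances of both groups. One way to check how much that assumption matters is to rerun the comparison as Welch's t-test with `equal_var=False`. This is a sketch under stated assumptions, not a result from the assignment data: both arrays below are hypothetical, and in the notebook they would be the 2023 and 2024 `Sales` series.

```python
import numpy as np
from scipy import stats

# Hypothetical monthly sales (assumption, NOT the assignment data)
sales_2023_demo = np.array([520000, 610000, 480000, 700000, 560000, 590000,
                            450000, 640000, 720000, 530000, 600000, 560000],
                           dtype=float)
sales_2024_demo = np.array([610000, 540000, 720000, 655000, 590000, 700000,
                            480000, 630000, 810000, 575000, 660000, 600000],
                           dtype=float)

# Student's t-test: assumes both years share the same population variance
t_pooled, p_pooled = stats.ttest_ind(sales_2023_demo, sales_2024_demo)

# Welch's t-test: drops the equal-variance assumption
t_welch, p_welch = stats.ttest_ind(sales_2023_demo, sales_2024_demo,
                                   equal_var=False)

print("Pooled  T-Statistic:", t_pooled, " P-value:", p_pooled)
print("Welch   T-Statistic:", t_welch, " P-value:", p_welch)
```

When both groups have the same number of observations, as in this 12-versus-12 sketch, the two t-statistics coincide and only the degrees of freedom differ, so the p-values stay close. The choice matters more when the group sizes or variances are clearly unequal.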
