## Step 1: Problem Statement

An e-commerce company claims that the average customer spends ₹5000 on the first purchase.
We collect a random sample of 50 new customers to test if the average spending has changed.

## Step 2: Hypothesis

1.  Null Hypothesis (H₀): μ = 5000

2.  Alternative Hypothesis (H₁): μ ≠ 5000 (two-tailed)

Significance level: α = 0.05
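
The test behind these hypotheses can be written compactly. This is a sketch that assumes the population standard deviation σ is known, which is the usual condition for a z-test. The symbols $\bar{x}$ (sample mean), $n$ (sample size) and $\mu_0 = 5000$ (hypothesized mean) are notation introduced here for illustration.

$$
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad \text{reject } H_0 \text{ if } |z| > z_{\alpha/2} \approx 1.96 \text{ for } \alpha = 0.05
$$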

## Step 3: Dataset Creation

We'll generate synthetic sample data here; in a real analysis, this would come from the database.

In [71]:
import numpy as np
import pandas as pd
from scipy import stats

# Set random seed for reproducibility
np.random.seed(42)

# Historical mean and std deviation
population_mean = 5000
population_std = 500

# Simulated spending of 50 new customers, drawn from a normal distribution (mean 5100, std 480)
sample_data = np.random.normal(5100, 480, 50)

df = pd.DataFrame({'Customer_Spending': sample_data})
df.head()

|   | Customer_Spending |
|---|-------------------|
| 0 | 5338.422793 |
| 1 | 5033.633135 |
| 2 | 5410.890498 |
| 3 | 5831.054331 |
| 4 | 4987.60638 |


In [72]:
## Compute the Z-Score

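sample_mean = np.mean(sample_data)
# Square root of the sample size, used in the standard error
sqrt_n = np.sqrt(len(sample_data))

In [73]:
# z = (sample mean - hypothesized mean) / standard error
z_score_num = sample_mean - population_mean

# Standard error of the mean, treating population_std as known
z_score_deno = population_std / sqrt_n

z_score = z_score_num / z_score_deno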

print("Z Score value is:", z_score)

Z Score value is: -0.11635406054432011
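
The printed value can be unpacked by hand. Using the population standard deviation of 500 and the sample size of 50 set earlier, the standard error of the mean is roughly:

$$
\frac{\sigma}{\sqrt{n}} = \frac{500}{\sqrt{50}} \approx 70.71
$$

A z-score of about $-0.1164$ therefore corresponds to a gap of

$$
\bar{x} - 5000 \approx -0.1164 \times 70.71 \approx -8.23,
$$

so the sample mean sits near ₹4991.77, only about ₹8 below the claimed average. The symbol $\bar{x}$ is notation introduced here for the sample mean, and the figures are rounded.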


In [74]:
## Calculate the two-tailed p-value

# Use |z| so the result stays within [0, 1] when z is negative
p_value = 2 * (1 - stats.norm.cdf(abs(z_score)))

print(f"P Value is: {p_value:.4f}")

P Value is: 0.9074
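
A two-sided p-value is a probability, so it must lie between 0 and 1, and by the symmetry of the normal curve it depends only on |z|. The sketch below cross-checks this with Python's standard library alone. The helper name `two_sided_p_value` is hypothetical and not part of the notebook, and `z_reported` is simply the z-score printed earlier.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal.

    Hypothetical helper for illustration; not part of the notebook.
    """
    # Symmetry of the normal curve: P(|Z| >= |z|) = 2 * (1 - Phi(|z|))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


z_reported = -0.11635406054432011  # z-score printed by the notebook
p = two_sided_p_value(z_reported)
print(f"two-sided p-value: {p:.4f}")  # prints 0.9074
```

Taking the absolute value first means the helper returns the same result for `z` and `-z`, which is the behavior a two-tailed test needs.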


In [75]:
alpha = 0.05

In [76]:
## Compare the p-value with alpha

if p_value < alpha:
    print("Reject the null hypothesis: there is a significant difference in average spending.")
else:
    print("Fail to reject the null hypothesis: there is no significant difference in average spending.")

Fail to reject the null hypothesis: there is no significant difference in average spending.
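
The same decision can be reached by comparing |z| with a critical value instead of comparing the p-value with alpha. The following is a minimal sketch of that equivalent rule, using only the standard library. The variable names are illustrative assumptions, the numeric inputs are the ones used in the notebook, and `z_reported` is the z-score printed earlier.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
sigma, n, mu0 = 500, 50, 5000      # values used in the notebook
z_reported = -0.11635406054432011  # z-score printed by the notebook

# Two-tailed critical value: the (1 - alpha/2) quantile of N(0, 1)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Largest distance between the sample mean and mu0 that is still not significant
margin = z_crit * sigma / sqrt(n)

reject_h0 = abs(z_reported) > z_crit
print(f"critical value: {z_crit:.2f}")
print(f"non-rejection band: {mu0 - margin:.1f} to {mu0 + margin:.1f}")
print("reject H0:", reject_h0)
```

With these inputs the critical value is about 1.96, far above |z| ≈ 0.12, so this rule also fails to reject the null hypothesis. The band shows how far the sample mean of 50 customers would have to drift from ₹5000 before the test flagged a change.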
