
# Hypothesis Testing

This notebook performs formal hypothesis testing to statistically validate relationships observed during exploratory data analysis.

In [None]:
import pandas as pd
import scipy.stats as stats

In [None]:
# Load the cleaned Superstore dataset produced by the earlier cleaning step
df = pd.read_csv('cleaned_superstore.csv')

## State Hypothesis
Based on patterns observed during exploratory data analysis, we formally test the impact of discounting on profitability.

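H0 (Null Hypothesis):\
Mean profit is the same for discounted and non-discounted transactions (discount has no effect on profit).

H1 (Alternative Hypothesis):\
Mean profit is lower for discounted transactions (discount has a negative effect on profit).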


## Independent Two-Sample t-test

In [None]:
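# Profit for transactions sold at full price (no discount applied)
profit_without_discount = df.loc[df['Discount'] == 0, 'Profit']
# Profit for transactions with any positive discount
profit_with_discount = df.loc[df['Discount'] > 0, 'Profit']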

Transactions were divided into two independent groups based on the presence or absence of discounts.

In [None]:
print(len(profit_without_discount), len(profit_with_discount))

4798 5196


Both groups are large (4,798 and 5,196 transactions). With samples of this size, the Central Limit Theorem implies that the sample means are approximately normally distributed even if profit itself is skewed, which supports the use of a t-test.

## Performing the t-test

In [None]:
# equal_var=False runs Welch's t-test, which does not assume equal group variances
t_stat, p_value = stats.ttest_ind(profit_without_discount, profit_with_discount, equal_var=False)
print(t_stat, p_value)

15.737992941015493 4.356930371141414e-55


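The Welch t-test produced a t-statistic of about 15.74 and a p-value of about 4.36e-55.

Since **p-value < 0.05**, we reject the null hypothesis.
This provides statistical evidence that mean profit differs between the two groups. `ttest_ind` is two-sided by default, so the direction comes from the sign of the t-statistic: it is positive, meaning transactions without discounts have the higher mean profit.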
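Because H1 is directional (discounts lower profit), a one-sided version of the same test may match it more closely. The sketch below assumes SciPy 1.6 or newer, where `ttest_ind` accepts the `alternative` argument. The helper name `one_sided_welch` and the synthetic arrays are illustrative only and are not part of the notebook or the Superstore data.

```python
import numpy as np
from scipy import stats


def one_sided_welch(no_discount, discounted):
    """Welch's t-test of H1: mean(no_discount) > mean(discounted)."""
    # equal_var=False selects Welch's test; alternative='greater' makes it one-sided.
    result = stats.ttest_ind(
        no_discount, discounted, equal_var=False, alternative='greater'
    )
    return result.statistic, result.pvalue


# Synthetic placeholder data (NOT the Superstore data), only to show the call.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=60.0, scale=200.0, size=500)
group_b = rng.normal(loc=-5.0, scale=200.0, size=500)

t_stat_demo, p_one_sided = one_sided_welch(group_a, group_b)
print(t_stat_demo, p_one_sided)
```

In the notebook, the call would be `one_sided_welch(profit_without_discount, profit_with_discount)`. When the t-statistic is positive, the one-sided p-value is half the two-sided one, so the conclusion here would not change.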


## Average Profit comparison with and without discount

In [None]:
print(profit_without_discount.mean(),profit_with_discount.mean())

66.90029245518967 -6.657155792917629


Transactions without discounts show a much higher average profit (about 66.90) than discounted transactions (about -6.66), which lose money on average.
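With samples this large, even small differences can be statistically significant, so it can help to report the size of the gap as well. The following is a minimal sketch of one common effect-size measure, Cohen's d with a pooled standard deviation. The function name `mean_difference_summary` and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_difference_summary(a, b):
    """Return (difference in means, Cohen's d using the pooled standard deviation)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a.mean() - b.mean()
    # Pooled sample variance; ddof=1 gives the unbiased sample variance per group.
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return diff, diff / np.sqrt(pooled_var)


# Toy values (not the Superstore data), only to show the call.
diff, cohens_d = mean_difference_summary([2, 4, 6], [1, 3, 5])
print(diff, cohens_d)

# In the notebook:
# mean_difference_summary(profit_without_discount, profit_with_discount)
```

Cohen's d expresses the gap in units of standard deviation, which makes it easier to judge practical importance alongside the p-value.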

## Correlation test

In [None]:
corr, corr_p = stats.pearsonr(df['Discount'], df['Profit'])
print(corr, corr_p)

-0.21948745637176847 2.702294436198141e-109


The Pearson correlation is negative (r ≈ -0.22) and statistically significant (p ≈ 2.7e-109). Its magnitude is modest, but it further supports the inverse relationship between discount and profit.
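To gauge how precisely the correlation is estimated, a confidence interval can be added using the Fisher z-transformation. This is a sketch under stated assumptions: the helper `pearson_confidence_interval` is hypothetical, and `n = 9994` assumes every row of the two groups above (4798 + 5196) entered the correlation.

```python
import math

from scipy import stats


def pearson_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    z_crit = stats.norm.ppf(0.5 + confidence / 2.0)
    # Transform the interval bounds back to the correlation scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r is the value printed above; n is assumed to be 4798 + 5196.
low, high = pearson_confidence_interval(-0.21948745637176847, 9994)
print(low, high)
```

An interval that lies entirely below zero agrees with the significance test, while its distance from -1 shows that discount level alone explains only part of the variation in profit.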

# **Hypothesis Testing Conclusion:**
The results of the statistical analysis provide strong evidence that **discounted transactions are significantly less profitable on average** (mean profit of about -6.66 versus 66.90 without a discount). Both the t-test and the correlation analysis indicate a **negative association between discount levels and profit**, suggesting that excessive discounting may lead to loss-making transactions.