# **HYPOTHESIS TESTING**

## ***Introduction***
This notebook applies hypothesis tests, including an A/B-style comparison, to job listings in `cleaned_jobs.csv`.
We test for salary differences across locations and job titles, differences in skill demand, and the relationship between experience and salary.

## **Importing Libraries**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## **Step-1: *Load Data***

> Load & Prepare Data

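We load the cleaned dataset and inspect the first rows. Note that `Salary` is stored as text (for example `₹10L per annum`), so it must be converted to a number before any test on salaries.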

In [5]:
df = pd.read_csv('../Data/cleaned_jobs.csv')
print(df.head())

                   Job Title   Company        Location  \
0             Data Scientist    Amazon   Mumbai, India   
1             Data Scientist    Google  Chennai, India   
2             Data Scientist  Flipkart  Chennai, India   
3  Machine Learning Engineer   Infosys     Pune, India   
4  Machine Learning Engineer  Deloitte     Pune, India   

                            Skills  Experience Required          Salary  \
0                Tableau, Excel, R                    6  ₹10L per annum   
1    Data Wrangling, Pandas, Numpy                    6  ₹17L per annum   
2  Machine Learning, Deep Learning                    9   ₹9L per annum   
3  Machine Learning, Deep Learning                    4  ₹19L per annum   
4            Python, Sql, Power Bi                    3   ₹6L per annum   

          Date Posted  
0   Posted 9 days ago  
1  Posted 13 days ago  
2   Posted 7 days ago  
3   Posted 5 days ago  
4   Posted 9 days ago  
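
Since the `Salary` column printed above holds strings such as `₹10L per annum`, a numeric version is needed before running any test. The following is a minimal sketch, assuming every salary follows the `₹<number>L` pattern seen in the first rows; the helper name `parse_salary_lakhs` is hypothetical and not part of the original notebook.

```python
import pandas as pd

def parse_salary_lakhs(salary: pd.Series) -> pd.Series:
    """Return the numeric part of strings like '₹10L per annum', in lakhs per annum.

    Assumption: salaries follow the '₹<number>L' pattern seen in df.head().
    """
    # Capture the number that sits between the rupee sign and the 'L' suffix.
    extracted = salary.astype(str).str.extract(r"₹\s*(\d+(?:\.\d+)?)\s*L", expand=False)
    # Strings that do not match the pattern become NaN instead of raising.
    return pd.to_numeric(extracted, errors="coerce")
```

With the notebook's `df`, a call such as `df['Salary_Lakhs'] = parse_salary_lakhs(df['Salary'])` would add a numeric column (the column name `Salary_Lakhs` is again an illustrative choice). Checking `df['Salary_Lakhs'].isna().sum()` afterwards shows how many rows did not match the assumed pattern.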


## **Step-2: *Hypothesis Testing***

### **Hypothesis 1: *Salary Differences by Location (T-test)***

> ***Question:*** Are Data Scientist salaries in Bangalore higher than in Mumbai?

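We compare Data Scientist salaries in Bangalore and Mumbai using an independent two-sample t-test. Because the alternative hypothesis is directional, the test is one-sided.

- ***Test Used:*** Independent T-test (one-sided)
- ***Null Hypothesis (H₀):*** The mean Data Scientist salary is the same in Bangalore and Mumbai.
- ***Alternative Hypothesis (H₁):*** The mean Data Scientist salary is higher in Bangalore than in Mumbai.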
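
The comparison described above could be sketched as follows. This is an illustration rather than the notebook's actual implementation: it assumes salaries are already available as numbers in a hypothetical column named `Salary_Lakhs`, matches cities by substring in `Location`, and uses Welch's variant of the t-test (which does not assume equal variances) via `scipy.stats.ttest_ind`.

```python
import pandas as pd
from scipy import stats

def salary_ttest_by_city(df, title, city_a, city_b, salary_col="Salary_Lakhs"):
    """One-sided Welch t-test: is the mean salary in city_a higher than in city_b?

    Assumption: `salary_col` is numeric (the default name is hypothetical).
    """
    subset = df[df["Job Title"] == title]
    # Location values look like 'Mumbai, India', so match on the city substring.
    in_a = subset["Location"].str.contains(city_a, case=False, na=False, regex=False)
    in_b = subset["Location"].str.contains(city_b, case=False, na=False, regex=False)
    a = subset.loc[in_a, salary_col].dropna()
    b = subset.loc[in_b, salary_col].dropna()
    # alternative="greater" tests mean(a) > mean(b); requires SciPy 1.6 or newer.
    t_stat, p_value = stats.ttest_ind(a, b, equal_var=False, alternative="greater")
    return float(t_stat), float(p_value)
```

A call such as `salary_ttest_by_city(df, "Data Scientist", "Bangalore", "Mumbai")` returns the t statistic and the one-sided p-value; H₀ is rejected at the 5% level when the p-value is below 0.05. If the dataset spells the city differently (for example "Bengaluru"), the substring passed in has to match that spelling.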

### **Hypothesis 2: *Skill Demand (Chi-Square Test)***

> ***Question:*** Are Python and SQL equally in demand?

We check whether Python and SQL are mentioned in the same number of job listings.

- ***Test Used:*** Chi-Square Test (goodness of fit)
- ***Null Hypothesis (H₀):*** Python and SQL appear equally often in job listings.
- ***Alternative Hypothesis (H₁):*** Python and SQL do not appear equally often; one skill is in higher demand.
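
One way to set up this test is a chi-square goodness-of-fit test on the two listing counts, sketched below. The helper names are hypothetical. The sketch assumes `Skills` is a comma-separated string (as in the printed rows, where SQL appears as `Sql`), so matching is case-insensitive. `scipy.stats.chisquare` uses equal expected frequencies by default, which corresponds to H₀.

```python
import pandas as pd
from scipy import stats

def skill_count(skills: pd.Series, skill: str) -> int:
    """Count listings whose comma-separated skill list contains `skill` (case-insensitive)."""
    tokens = skills.fillna("").str.lower().str.split(",")
    target = skill.lower()
    # Compare whole tokens so that 'r' does not match inside 'power bi'.
    return int(tokens.apply(lambda items: target in [s.strip() for s in items]).sum())

def skill_demand_chisquare(skills: pd.Series, skill_a="python", skill_b="sql"):
    """Chi-square goodness-of-fit test of H0: both skills appear equally often."""
    observed = [skill_count(skills, skill_a), skill_count(skills, skill_b)]
    # With no expected frequencies given, chisquare assumes they are equal.
    chi2, p_value = stats.chisquare(observed)
    return observed, float(chi2), float(p_value)
```

Calling `skill_demand_chisquare(df['Skills'])` returns the observed counts, the chi-square statistic, and the p-value. One caveat: a listing that mentions both skills contributes to both counts, so the two counts are not strictly independent, and the result is best read as an approximation.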

### **Hypothesis 3: *Experience vs. Salary Correlation (Pearson Correlation)***
> ***Question:*** Is more experience associated with higher salaries?

We test whether required experience and salary are positively correlated. A correlation describes association only; on its own it does not show that experience causes higher pay.
- ***Test Used:*** Pearson Correlation
- ***Null Hypothesis (H₀):*** There is no linear correlation between experience and salary.
- ***Alternative Hypothesis (H₁):*** There is a positive linear correlation between experience and salary.
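
A minimal sketch of this test, assuming salaries have already been converted to numbers in a hypothetical column named `Salary_Lakhs`. It uses `scipy.stats.pearsonr`, which reports a two-sided p-value by default; the sketch converts it to a one-sided value to match the directional alternative.

```python
import pandas as pd
from scipy import stats

def experience_salary_correlation(df, exp_col="Experience Required", salary_col="Salary_Lakhs"):
    """Pearson r between experience and salary, with a one-sided p-value for H1: r > 0.

    Assumption: `salary_col` is numeric (the default name is hypothetical).
    """
    data = df[[exp_col, salary_col]].dropna()
    r, p_two_sided = stats.pearsonr(data[exp_col], data[salary_col])
    # Convert the two-sided p-value to one-sided for the alternative r > 0.
    p_one_sided = p_two_sided / 2 if r > 0 else 1 - p_two_sided / 2
    return float(r), float(p_one_sided)
```

`experience_salary_correlation(df)` returns the coefficient `r` and the one-sided p-value. The p-value addresses whether the correlation is positive; the size of `r` indicates how strong the linear relationship is.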

### **Hypothesis 4: *A/B Testing on Job Salaries***
> ***Question:*** Do Data Scientists earn more than ML Engineers?

We compare salaries between Data Scientists and ML Engineers, treating the two job titles as groups A and B of an A/B test.
- ***Test Used:*** Two-Sample T-test (A/B Testing, one-sided)
- ***Null Hypothesis (H₀):*** Data Scientists and ML Engineers have the same mean salary.
- ***Alternative Hypothesis (H₁):*** Data Scientists have a higher mean salary than ML Engineers.
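
The A/B comparison above might look like the sketch below, which reports the group means and their difference alongside the test result so that the size of the gap is visible as well as its significance. It assumes a numeric salary column under the hypothetical name `Salary_Lakhs` and uses Welch's t-test; the function name `ab_test_job_titles` is illustrative.

```python
import pandas as pd
from scipy import stats

def ab_test_job_titles(df, title_a="Data Scientist", title_b="Machine Learning Engineer",
                       salary_col="Salary_Lakhs"):
    """One-sided Welch t-test of H1: mean salary of title_a > mean salary of title_b.

    Assumption: `salary_col` is numeric (the default name is hypothetical).
    """
    a = df.loc[df["Job Title"] == title_a, salary_col].dropna()
    b = df.loc[df["Job Title"] == title_b, salary_col].dropna()
    # alternative="greater" tests mean(a) > mean(b); requires SciPy 1.6 or newer.
    t_stat, p_value = stats.ttest_ind(a, b, equal_var=False, alternative="greater")
    return {
        "mean_a": float(a.mean()),
        "mean_b": float(b.mean()),
        "diff": float(a.mean() - b.mean()),
        "t_stat": float(t_stat),
        "p_value": float(p_value),
    }
```

`ab_test_job_titles(df)` returns a dictionary of summary values. Because job titles are observed rather than randomly assigned, a significant result indicates a difference between the groups in this dataset, not a causal effect of the title.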

## **Conclusion & Insights**

This section summarizes the key insights from the hypothesis tests above.