## Problem Statement

A company has recently implemented a new marketing campaign for one of its products. The company wants to assess if this campaign has significantly increased the product's average monthly sales by more than 15%.
To evaluate the impact of this campaign, the company has compiled a sample dataset named **"monthly_sales_data.csv"**. It contains the following columns:

- **product_id:** A unique identifier for each product.
- **sales_increase_pct:** The percentage increase in monthly sales for each product as a result of the new marketing campaign.


The primary goal of the analysis is to determine whether this campaign increased the product's average monthly sales by more than 15%.


In [6]:
#given population parameters

population_mean = 12  #(This implies that before the new campaign, the average increase in sales was around 12%)
population_std_dev = 5  #(variability)

**Import Necessary Libraries**

### Task1: Data Import

1. Import the data from the "monthly_sales_data.csv" file.
2. Display the number of rows and columns.
3. Display the first few rows of the dataset to get an overview.


In [1]:
import pandas as pd

df = pd.read_csv("monthly_sales_data.csv")
df.head()

| Unnamed: 0 | product_id | sales_increase_pct |
|---|---|---|
| 0 | P0001 | 19.23 |
| 1 | P0002 | 25.47 |
| 2 | P0003 | 19.16 |
| 3 | P0004 | 17.77 |
| 4 | P0005 | 11.35 |
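
The import cell shows the first rows but not the row and column counts that step 2 asks for. A minimal sketch of that step follows. It uses an in-memory stand-in built from the five rows shown above rather than the real file, and it assumes the `Unnamed: 0` column comes from an index that was written into the CSV (a guess about how the file was saved, not something the notebook confirms). The names `csv_text`, `df_raw`, and `df_sample` are illustrative.

```python
import io

import pandas as pd

# Stand-in for monthly_sales_data.csv: only the five rows shown in the output above.
# The empty first header field is an assumption about how the real file was written.
csv_text = """,product_id,sales_increase_pct
0,P0001,19.23
1,P0002,25.47
2,P0003,19.16
3,P0004,17.77
4,P0005,11.35
"""

# Default read: the unnamed first column is kept and labelled "Unnamed: 0".
df_raw = pd.read_csv(io.StringIO(csv_text))
n_rows, n_cols = df_raw.shape
print(f"rows: {n_rows}, columns: {n_cols}")
print(list(df_raw.columns))

# Treating the first column as the index drops the extra column.
df_sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df_sample.shape)
```

With the real file, the same `df.shape` call on the imported DataFrame would report its dimensions.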


### Task2: Define Hypotheses

- State the null and alternative hypotheses based on the given scenario.

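- **Null hypothesis (H0):** the campaign increased average monthly sales by 15% or less (mean of sales_increase_pct ≤ 15).
- **Alternative hypothesis (H1):** the campaign increased average monthly sales by more than 15% (mean of sales_increase_pct > 15).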

### Task3: Calculate Test Statistics

- Compute the sample mean of sales_increase_pct.
- Determine the sample size.
- Calculate the standard error using the provided population standard deviation.
- Compute the Z-score for the test statistic.
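
For reference, the two quantities in this task can be written as follows. The symbols are introduced here for illustration: $\bar{x}$ is the sample mean, $\mu_0$ the reference mean the sample is compared against, $\sigma$ the population standard deviation, and $n$ the sample size.

$$
SE = \frac{\sigma}{\sqrt{n}}, \qquad Z = \frac{\bar{x} - \mu_0}{SE}
$$

With the given $\sigma = 5$ and the sample size of 100 found below, $SE = 5 / \sqrt{100} = 0.5$.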

In [4]:
#1. sample mean of sales_increase_pct

sample_mean = df.sales_increase_pct.mean()
sample_mean

15.4845

In [5]:
#2. sample size
df.shape[0]

100

In [7]:
#3. standard error
import numpy as np

standard_error = population_std_dev / np.sqrt(df.shape[0])
standard_error

0.5

In [8]:
#4. z_score

z_score = (sample_mean - population_mean)/standard_error
z_score

6.969000000000001
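
The z-score above is measured against `population_mean = 12`, while the hypotheses in Task 2 are phrased around a 15% threshold. The sketch below makes the reference value an explicit parameter so the two choices can be compared side by side. It uses the rounded values printed above; the function name `one_sample_z` and the second call against 15 are illustrative additions, not part of the original analysis.

```python
import math


def one_sample_z(sample_mean, reference_mean, population_std_dev, n):
    """Return (standard_error, z_score) for a one-sample z-test."""
    standard_error = population_std_dev / math.sqrt(n)
    z_score = (sample_mean - reference_mean) / standard_error
    return standard_error, z_score


# Values reported in the cells above (sample mean as displayed).
sample_mean = 15.4845
n = 100
population_std_dev = 5

# Reference mean used by the notebook (population_mean = 12).
se, z_vs_12 = one_sample_z(sample_mean, 12, population_std_dev, n)

# Hypothetical alternative: the 15% threshold named in the hypotheses.
_, z_vs_15 = one_sample_z(sample_mean, 15, population_std_dev, n)

print(f"standard error: {se}")
print(f"z against 12: {z_vs_12:.3f}")
print(f"z against 15: {z_vs_15:.3f}")
```

The two z-scores differ by a wide margin, so which reference mean matches the question being asked is worth settling before reading the p-value.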

### Task4: Calculate the P-Value

- Set the significance level (e.g., alpha = 0.05).
- Calculate the p-value associated with the obtained z-score.

In [12]:
#defining significance level

alpha = 0.05

from scipy import stats

# cumulative probability P(Z <= z_score) under the standard normal distribution
percentage = stats.norm.cdf(z_score)
percentage

0.9999999999984039

In [11]:
#p-value (right-tailed test: the area above the z-score)

p_value = 1 - percentage
p_value

1.596056620201125e-12
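
Subtracting a CDF value this close to 1 leaves only a few significant digits. A small sketch of the upper-tail probability computed directly is shown below, using only the standard library; `scipy.stats.norm.sf` is SciPy's counterpart for the same quantity. The helper names `normal_sf` and `normal_cdf` are illustrative.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def normal_cdf(z):
    """Cumulative probability P(Z <= z) for a standard normal variable."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


z_score = 6.969  # value reported above, rounded

# Direct tail computation versus the 1 - CDF route used in the notebook.
p_direct = normal_sf(z_score)
p_subtract = 1 - normal_cdf(z_score)

print(f"direct tail: {p_direct:.6e}")
print(f"1 - cdf:     {p_subtract:.6e}")
```

Both routes agree to the precision that matters here; the direct form becomes more important as z grows and `1 - cdf` rounds to zero.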

### Task5: Decision Making

- Compare the calculated p-value with the alpha.
- Decide whether to reject or fail to reject the null hypothesis.
- Write a conclusion summarizing the findings.

In [14]:
p_value, alpha

# 1.596e-12 means 1.596 × 10^(-12).
# This is equivalent to 0.000000000001596 (the decimal point moves 12 places to the left).

(1.596056620201125e-12, 0.05)
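
The comparison itself can be spelled out as a small decision rule. This is a sketch using the p-value and alpha printed above; the function name `decide` is illustrative. The decision is relative to the reference mean used in the z-score, which is `population_mean = 12` in this notebook.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value and significance level."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Values reported in the cell above.
p_value = 1.596056620201125e-12
alpha = 0.05

decision = decide(p_value, alpha)
print(f"p-value = {p_value:.3e}, alpha = {alpha}: {decision}")
```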

## Summary
