## Problem Statement

A company has recently implemented a new marketing campaign for one of its products. The company wants to assess if this campaign has significantly increased the product's average monthly sales by more than 15%.
To evaluate the impact of this campaign, the company has compiled a sample dataset named **"monthly_sales_data.csv"**. It contains the following columns:

- **product_id:** A unique identifier for each product.
- **sales_increase_pct:** The percentage increase in monthly sales for each product as a result of the new marketing campaign.


The primary goal of the analysis is to determine whether this campaign increased the product's average monthly sales by more than 15%.


In [1]:
#given population parameters

population_mean = 12  #(This implies that before the new campaign, the average increase in sales was around 12%)
population_std_dev = 5  #(variability)

**Import Necessary Libraries**

In [2]:
import numpy as np
import pandas as pd
from scipy import stats

### Task1: Data Import

1. Import the data from the "monthly_sales_data.csv" file.
2. Display the number of rows and columns.
3. Display the first few rows of the dataset to get an overview.


In [3]:
df = pd.read_csv('monthly_sales_data.csv')
print(df.shape)
df.sample(5)  # shows 5 random rows; df.head() would show the first 5 instead

(100, 2)


|    | product_id | sales_increase_pct |
|----|------------|--------------------|
| 91 | P0092      | 14.11              |
| 47 | P0048      | 8.42               |
| 63 | P0064      | 13.21              |
| 68 | P0069      | 17.78              |
| 24 | P0025      | 16.8               |


### Task2: Define Hypotheses

- State the null and alternative hypotheses based on the given scenario.

#### Null Hypothesis (H0): The average monthly sales increase is 15%
#### Alternative Hypothesis (H1): The average monthly sales increase is more than 15%
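
Stated symbolically, the pair of hypotheses above can be sketched as a one-sided (upper-tail) test. The symbol $\mu$ is introduced here as an assumption and stands for the true mean of `sales_increase_pct`, in percent:

```latex
H_0 : \mu = 15 \qquad \text{versus} \qquad H_1 : \mu > 15
```

Because the alternative points in one direction only, the p-value is taken from the upper tail of the standard normal distribution.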

### Task3: Calculate Test Statistics

- Compute the sample mean of sales_increase_pct.
- Determine the sample size.
- Calculate the standard error using the provided population standard deviation.
- Compute the Z-score for the test statistic
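
The steps above correspond to the usual one-sample z-test formulas when the population standard deviation is known. The symbols are assumptions introduced for illustration: $\sigma$ is the population standard deviation, $n$ the sample size, $\bar{x}$ the sample mean, and $\mu_0$ the mean assumed under the null hypothesis:

```latex
SE = \frac{\sigma}{\sqrt{n}}, \qquad z = \frac{\bar{x} - \mu_0}{SE}
```

For example, with $\sigma = 5$ and $n = 100$, the standard error is $5 / \sqrt{100} = 0.5$.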

In [4]:
#1. sample mean of sales_increase_pct
sample_mean = df['sales_increase_pct'].mean()
sample_mean

15.4845

In [5]:
#2. sample size
sample_size = df.shape[0]
sample_size

100

In [6]:
#3. standard error
standard_error = population_std_dev/np.sqrt(sample_size)
standard_error

0.5

In [7]:
#4. z_score (computed against population_mean = 12, the pre-campaign baseline)
z_score = (sample_mean-population_mean)/standard_error
z_score

6.969000000000001
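
The z-score above measures the sample mean against `population_mean = 12`. As a cross-check, here is a minimal sketch of the same calculation against the 15% threshold named in the hypotheses. The name `hypothesized_mean` is introduced here for illustration, the numeric inputs are copied from the cell outputs above, and only the standard library is used so the snippet runs without the CSV file:

```python
import math

# Values copied from the notebook outputs above
sample_mean = 15.4845
sample_size = 100
population_std_dev = 5

# Assumption: H0 places the mean at the 15% threshold from the problem statement
hypothesized_mean = 15

standard_error = population_std_dev / math.sqrt(sample_size)
z_vs_threshold = (sample_mean - hypothesized_mean) / standard_error

# One-sided (upper-tail) p-value: P(Z > z) = 0.5 * erfc(z / sqrt(2))
p_vs_threshold = 0.5 * math.erfc(z_vs_threshold / math.sqrt(2))

print(round(z_vs_threshold, 3), round(p_vs_threshold, 3))
```

Under these assumptions the z-score is about 0.97, which is below the one-sided 5% critical value of 1.645.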

### Task4: Calculate the P-Value

- Set the significance level (e.g., alpha = 0.05).
- Calculate the p-value associated with the obtained z-score.

In [8]:
#defining significance level
alpha = 0.05

In [9]:
#p-value
p_value = 1-stats.norm.cdf(z_score)
p_value

1.596056620201125e-12
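
For a z-score this large, `1 - cdf(z)` subtracts two nearly equal numbers, which can cost precision in the far tail. Below is a sketch of computing the upper-tail probability directly with the complementary error function. `upper_tail_p` is a hypothetical helper written for this example; `scipy.stats.norm.sf` provides the same survival function.

```python
import math

def upper_tail_p(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z_score = 6.969  # value from the notebook output above (rounded)
p_value = upper_tail_p(z_score)
print(f"{p_value:.3e}")
```

The result should agree with the `1 - cdf` value above to several significant digits; the difference only matters for even more extreme z-scores.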

### Task5: Decision Making

- Compare the calculated p-value with the alpha.
- Decide whether to reject or fail to reject the null hypothesis.
- Write a conclusion summarizing the findings.

In [10]:
p_value,alpha

(1.596056620201125e-12, 0.05)
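
The comparison of the p-value with alpha can be written as an explicit decision rule. This is a minimal sketch; `decide` is a hypothetical helper name, not part of the original notebook:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a given p-value and significance level."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # Reject H0 only when the p-value falls below the significance level
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(1.596056620201125e-12, 0.05))
```

Keeping the rule in one function makes it easy to rerun the decision if alpha or the hypothesized mean changes.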

## Summary
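#### Since the p-value (about 1.6e-12) is less than the significance level (0.05), there is strong evidence to reject the null hypothesis. Note that the z-score was computed against the pre-campaign mean of 12%, so this result shows that the average sales increase is significantly greater than 12%. Testing against the 15% threshold gives z = (15.4845 - 15) / 0.5 = 0.969, which is below the one-sided 5% critical value of 1.645, so the data do not support the claim that average monthly sales increased by more than 15% after the campaign.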