## Problem Statement

You are a data scientist at a business efficiency consulting firm. Your client, a multinational corporation, has recently implemented a series of cost-saving measures across various departments. To evaluate the impact of these initiatives, the company has compiled a sample dataset named **"operational_costs_data.csv"**. The dataset tracks the percentage reduction in operational costs for each department after the cost-saving measures were implemented. It includes the following columns:



- **department_id:** A unique identifier for each department.
- **cost_reduction_pct:** The percentage reduction in operational costs for each department following the cost-saving measures.

The primary goal of the analysis is to determine whether these cost-saving measures have effectively reduced operational costs beyond the company's target of 8%. 

In [5]:
# Given population parameters

population_mean = 7  # average cost reduction target of 7% before the series of cost-saving measures
population_std_dev = 3  # variability (population standard deviation)

**Import Necessary Libraries**

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
from scipy import stats as st

### Task 1: Data Import

1. Import the data from the "operational_costs_data.csv" file.
2. Display the number of rows and columns.
3. Display the first few rows of the dataset to get an overview.


In [2]:
# Import the data from the "operational_costs_data.csv" file

df = pd.read_csv("C:/Users/MANISHA/chapter10_exercise1/operational_costs_data.csv")

In [3]:
# display the number of rows and columns

df.shape 

(100, 2)

In [4]:
# display the first few rows of the dataset to get an overview

df.head()

| | department_id | cost_reduction_pct |
|---|---|---|
| 0 | D001 | 7.4 |
| 1 | D002 | 11.31 |
| 2 | D003 | 10.78 |
| 3 | D004 | 8.79 |
| 4 | D005 | 3.29 |


### Task 2: Define Hypotheses

- State the null and alternative hypotheses based on the given scenario.

In [9]:
H0: population_mean <= 8
Ha: population_mean > 8

## H0: The cost-saving measures have not reduced operational costs beyond the company's target of 8%.
## In other words, the mean reduction in operational costs is less than or equal to 8%.

## Ha: The cost-saving measures have reduced operational costs beyond the company's target of 8%.
## In other words, the mean reduction in operational costs is greater than 8%.



### Task 3: Calculate Test Statistics

- Compute the sample mean of cost_reduction_pct.
- Determine the sample size.
- Calculate the standard error using the provided population standard deviation.
- Compute the Z-score for the test statistic.
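
The steps listed above combine into two short formulas. The symbols are introduced here for illustration and do not appear in the notebook: $\bar{x}$ is the sample mean of `cost_reduction_pct`, $\mu_0$ is the hypothesized population mean, $\sigma$ is the given population standard deviation, and $n$ is the sample size.

```latex
\[
\mathrm{SE} = \frac{\sigma}{\sqrt{n}},
\qquad
z = \frac{\bar{x} - \mu_0}{\mathrm{SE}}
\]
```

With $\sigma = 3$ and the 100 rows reported by `df.shape`, the standard error is $3 / \sqrt{100} = 0.3$.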

In [11]:
#1. sample mean of cost_reduction_pct
sample_mean = df.cost_reduction_pct.mean()
sample_mean

7.2562

In [12]:
#2. sample size
sample_size = df.shape[0]
sample_size 

100

In [13]:
#3. standard error
standard_error = population_std_dev/np.sqrt(sample_size)
standard_error 

0.3

In [19]:
#4. z_score

z_score = round((sample_mean - population_mean)/standard_error,2)
z_score

0.85
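
The hypotheses in Task 2 are stated against the 8% target, while the cell above measures the sample mean against `population_mean = 7`. As a minimal sketch, assuming the 8% target is the intended hypothesized mean, the same formula can be evaluated with the values reported in the notebook. The name `target_mean` is hypothetical and not part of the original notebook.

```python
import math

# Values reported in the notebook cells above
sample_mean = 7.2562
sample_size = 100
population_std_dev = 3

# Assumption: the 8% target from the problem statement is the hypothesized mean
target_mean = 8

# Same standard error as in the notebook: sigma / sqrt(n)
standard_error = population_std_dev / math.sqrt(sample_size)

# Z-score of the sample mean measured against the 8% target
z_vs_target = (sample_mean - target_mean) / standard_error
print(round(z_vs_target, 2))
```

Under this assumption the Z-score is about -2.48, which is also below the right-tailed critical value, so the decision reached later in the notebook would not change.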

### Task 4: Determine Z-critical Value

- Set the significance level (e.g., alpha = 0.05).
- Find the critical Z-value corresponding to this alpha level.

In [16]:
#defining significance level

confidence_level = 0.95
alpha = 1 - confidence_level
alpha


0.050000000000000044

In [17]:
#critical value

z_critical_value = st.norm.ppf(1-alpha)

z_critical_value

1.6448536269514722
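
Because the alternative hypothesis is one-sided (greater than), the whole of alpha sits in the upper tail, which is why the cell above uses `1 - alpha` rather than `1 - alpha / 2`. The sketch below contrasts the two cases. It uses the standard library's `statistics.NormalDist` so that it runs without SciPy; this is a swapped-in equivalent of `st.norm.ppf`, not the notebook's own call.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tailed test: all of alpha is placed in the upper tail
z_critical_right = standard_normal.inv_cdf(1 - alpha)

# Two-tailed test, shown only for comparison: alpha is split across both tails
z_critical_two_sided = standard_normal.inv_cdf(1 - alpha / 2)

print(round(z_critical_right, 3), round(z_critical_two_sided, 3))
```

The right-tailed value matches the 1.645 shown above, while a two-tailed test at the same alpha would use a cutoff of about 1.96.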

In [20]:
z_score , z_critical_value

(0.85, 1.6448536269514722)

### Task 5: Decision Making

- Compare the calculated Z-score with the critical Z-value.
- Decide whether to reject or fail to reject the null hypothesis.
- Write a conclusion summarizing the findings.

In [18]:
if z_score > z_critical_value:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

Fail to reject the null hypothesis
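
The same decision can be reached by comparing a p-value with alpha instead of comparing the Z-score with the critical value. The following is a minimal sketch using the Z-score reported in the notebook; it relies on the standard library's `statistics.NormalDist` rather than SciPy, and the names `p_value` and `decision` are hypothetical.

```python
from statistics import NormalDist

z_score = 0.85  # test statistic reported in the notebook
alpha = 0.05    # significance level used in the notebook

# Right-tailed p-value: probability of a Z at least this large under the null
p_value = 1 - NormalDist().cdf(z_score)

# Reject only when the p-value falls below the significance level
if p_value < alpha:
    decision = "Reject the null hypothesis"
else:
    decision = "Fail to reject the null hypothesis"

print(round(p_value, 4), decision)
```

A p-value of roughly 0.198 is well above 0.05, which agrees with the critical-value comparison.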


## Summary



## The calculated Z-score (0.85) is less than the critical Z-value (1.645), so we fail to reject the null hypothesis. At the 5% significance level, the sample does not provide sufficient evidence that the cost-saving measures have reduced operational costs beyond the company's target of 8%.

## In other words, we cannot conclude that the mean reduction in operational costs is greater than 8%.
