# Basic Statistics Case Study

# Business Problem-3

## BACKGROUND:

The New Life Residential Treatment Facility is an NGO that treats teenagers who have shown signs of mental illness. It provides housing and supervision for teenagers who are making the transition from psychiatric hospitals back into the community. Because many of the teenagers were severely abused as children and have been involved with the juvenile justice system, behavioral problems are common at New Life. Employee pay is low and staff turnover (attrition) is high. A reengineering program was instituted at New Life with the goals of reducing behavioral problems among the residents and decreasing employee turnover. As part of this effort, the following changes were made:

- Employee shifts were shortened from 10 hours to 8 hours each day.
- Employees were motivated to become more involved in patient treatments. This included encouraging staff to run various therapeutic treatment sessions and allowing staff to have more say in program changes.
- The activities budget was increased.
- A facility-wide performance evaluation system was put into place that rewarded staff participation and innovation.
- Management and staff instituted a program designed to raise expectations about appropriate behavior from the kids. This included strict compliance with reporting of behavioral violations, insistence on participation in therapeutic sessions, and a lowered tolerance for even moderate behavioral infractions.
    
To determine the effectiveness of the reengineering effort, a data set covering the pre- and post-reengineering periods was compiled. The data contain two measures of behavioral problems: critical incidents and temporary removals. A critical incident occurs when a resident goes AWOL (leaves the premises without permission), destroys property (e.g., punching a hole in a wall or throwing furniture through windows), is caught in possession of street drugs, or engages in assault against other residents or staff members. A teenager is temporarily removed from the facility when s/he is sent to jail or back to a psychiatric hospital.

## BUSINESS PROBLEM:

Determine what effect, if any, the reengineering effort had on the incidence of behavioral problems and staff turnover. In particular, determine whether the reengineering effort changed the critical incidence rate: is there evidence that the critical incidence rate improved?

## DATA AVAILABLE:

The data set contains 20 months of data; the first 13 months were prior to reengineering. The variables in the data include:

- Reengineer: Whether the month was before (Prior) or after (Post) reengineering
- Employee Turnover: The percentage of employees who quit in a given month, out of the total number of employees
- TRFF (%): The percentage of residents who were temporarily removed from the facility, out of the total number of residents
- CI (%): The percentage of critical incident reports written that month, out of the total number of residents
    
#### Import Libraries    

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

%matplotlib inline

In [2]:
treatment_data = pd.read_csv('Treatment_Facility.csv')
treatment_data

Unnamed: 0,Month,Reengineer,Employee_Turnover,VAR4,VAR5
0,1,Prior,0.0,24.390244,42.682927
1,2,Prior,6.0606,19.354839,25.806452
2,3,Prior,12.1212,35.087719,146.19883
3,4,Prior,3.3333,18.404908,110.429448
4,5,Prior,12.9032,17.964072,23.952096
5,6,Prior,9.6774,41.176471,47.058824
6,7,Prior,11.7647,13.422819,0.0
7,8,Prior,11.4286,31.25,25.0
8,9,Prior,23.0769,17.241379,132.183908
9,10,Prior,15.0,16.574586,16.574586
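
The loaded columns `VAR4` and `VAR5` do not carry the names used in the data description. Judging only by column order, `VAR4` appears to be TRFF (%) and `VAR5` appears to be CI (%); that mapping is an assumption, not something the file states. A minimal sketch of an optional rename, using the first three rows of the output above re-entered by hand (the names `TRFF_pct` and `CI_pct` are illustrative):

```python
import pandas as pd

# First three rows of the output above, re-entered by hand for illustration.
sample = pd.DataFrame({
    "Month": [1, 2, 3],
    "Reengineer": ["Prior", "Prior", "Prior"],
    "Employee_Turnover": [0.0, 6.0606, 12.1212],
    "VAR4": [24.390244, 19.354839, 35.087719],
    "VAR5": [42.682927, 25.806452, 146.19883],
})

# Assumed mapping, inferred from the column order in the data description.
renamed = sample.rename(columns={"VAR4": "TRFF_pct", "VAR5": "CI_pct"})
print(renamed.columns.tolist())
```

The rest of this notebook keeps the original column names, so the rename is purely for readability. The `Unnamed: 0` column is typically a saved index; passing `index_col=0` to `pd.read_csv` would use it as the index instead.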


In [8]:
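# Column to analyze. VAR5 appears to be CI (%), judging by the column order
# in the data description (assumption: the file itself does not label it).
usage = "VAR5"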

In [14]:
# Split the selected column into the pre- and post-reengineering months.
Prior = treatment_data.loc[treatment_data.Reengineer == 'Prior', usage]
Post = treatment_data.loc[treatment_data.Reengineer == 'Post', usage]

In [12]:
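def check_normality(data, alpha=0.05):
    """Run the Shapiro-Wilk test; H0 is that the data come from a normal distribution."""
    test_stat_normality, p_value_normality = stats.shapiro(data)
    print("p value:%.4f" % p_value_normality)
    if p_value_normality < alpha:
        print("Reject null hypothesis >> The data is not normally distributed")
    else:
        # Failing to reject does not prove normality; it only means no evidence against it.
        print("Fail to reject null hypothesis >> The data is normally distributed")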

In [13]:
def check_variance_homogeneity(group1, group2, alpha=0.05):
    """Run Levene's test; H0 is that both groups have equal variances."""
    test_stat_var, p_value_var = stats.levene(group1, group2)
    print("p value:%.4f" % p_value_var)
    if p_value_var < alpha:
        print("Reject null hypothesis >> The variances of the samples are different.")
    else:
        # Failing to reject means no evidence of unequal variances at this alpha.
        print("Fail to reject null hypothesis >> The variances of the samples are same.")

### 1. Defining Hypothesis

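The goal is to determine whether the reengineering effort changed the critical incidence rate, i.e. whether there is evidence that the rate improved. Let μ₁ be the mean CI (%) in the Prior months and μ₂ the mean CI (%) in the Post months.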

    H₀: μ₁ = μ₂ (the mean critical incidence rate is the same in the Prior and Post periods).
    H₁: μ₁ ≠ μ₂ (the mean critical incidence rate differs between the two periods).
    
The hypothesis test checks whether the mean critical incidence rate differs between the Prior and Post periods, using a 0.05 significance level.
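
For reference, the comparison can be written out as follows. This is a sketch that assumes the pooled-variance (Student) form of the two-sample t-test, which is what `stats.ttest_ind` computes with its default `equal_var=True`; the subscripts 1 and 2 stand for the Prior and Post groups.

```latex
\[
\begin{aligned}
H_0 &: \mu_1 = \mu_2, \qquad H_1 : \mu_1 \neq \mu_2 \\[4pt]
t &= \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
\end{aligned}
\]
```

With 13 Prior months and 7 Post months, the statistic has $n_1 + n_2 - 2 = 18$ degrees of freedom under H₀.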

### 2. Assumption Check

    H₀: The data is normally distributed.
    H₁: The data is not normally distributed.

    H₀: The variances of the samples are the same.
    H₁: The variances of the samples are different.    

In [15]:
check_normality(Prior)
check_normality(Post)

p value:0.0328
Reject null hypothesis >> The data is not normally distributed
p value:0.9451
Fail to reject null hypothesis >> The data is normally distributed


In [16]:
check_variance_homogeneity(Prior, Post)

p value:0.0575
Fail to reject null hypothesis >> The variances of the samples are same.


### 3. Selecting the Proper Test

In [21]:
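# Independent two-sample t-test (default equal_var=True, consistent with the Levene result).
# Caveat: Shapiro-Wilk rejected normality for Prior, so the t-test's normality assumption
# is questionable here; a rank-based test such as Mann-Whitney U is the usual alternative.
t_stat, p_value = stats.ttest_ind(Prior, Post)
print("p value:%.8f" % p_value)

if p_value < 0.05:
    print("Reject null hypothesis")
else:
    print("Fail to reject null hypothesis")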

p value:0.12091989
Fail to reject null hypothesis
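
Since the Shapiro-Wilk check rejected normality for the Prior group, the choice of test deserves a second look. The sketch below shows one common way to let the assumption checks drive the choice: Mann-Whitney U when either group fails the normality check, otherwise Student's or Welch's t-test depending on Levene's test. The helper `compare_groups` and the demo values are hypothetical and are not taken from the facility data.

```python
from scipy import stats

def compare_groups(group1, group2, alpha=0.05):
    """Choose a two-sample test from the assumption checks (illustrative helper)."""
    # Shapiro-Wilk on each group: a p-value below alpha counts as evidence of non-normality.
    normal = all(stats.shapiro(g).pvalue >= alpha for g in (group1, group2))
    if not normal:
        name = "Mann-Whitney U"
        result = stats.mannwhitneyu(group1, group2, alternative="two-sided")
    else:
        # Levene's test decides between the pooled (Student) and Welch variants.
        equal_var = stats.levene(group1, group2).pvalue >= alpha
        name = "Student t" if equal_var else "Welch t"
        result = stats.ttest_ind(group1, group2, equal_var=equal_var)
    return name, result.pvalue

# Hypothetical values, for illustration only.
demo_prior = [12.0, 15.0, 9.0, 140.0, 11.0, 95.0, 14.0, 10.0, 13.0, 120.0]
demo_post = [20.0, 25.0, 18.0, 30.0, 22.0, 27.0, 24.0]

test_name, demo_p = compare_groups(demo_prior, demo_post)
print(test_name, "p value:%.4f" % demo_p)
```

In the notebook, the same helper could be called as `compare_groups(Prior, Post)`.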


### 4. Decision and Conclusion

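We observe a p-value of 0.1209, which is higher than the 0.05 significance level, so we fail to reject the null hypothesis. This does not prove that the critical incidence rate is unchanged; it means the data do not provide sufficient evidence that the rate differs between the Prior and Post periods. Because the Shapiro-Wilk test rejected normality for the Prior group (p = 0.0328), the t-test result should be read with caution.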
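
The business question asks specifically whether the rate improved, which is a directional claim (Prior mean greater than Post mean). One way to address it is a one-sided test via the `alternative` argument of `stats.ttest_ind`, available in SciPy 1.6 and later. The values below are hypothetical and serve only to show the call.

```python
from scipy import stats

# Hypothetical values, for illustration only (not the facility data).
prior_demo = [40.0, 55.0, 62.0, 48.0, 70.0, 51.0]
post_demo = [38.0, 44.0, 50.0, 41.0, 47.0]

# Two-sided: H1 is that the means differ in either direction.
two_sided = stats.ttest_ind(prior_demo, post_demo)

# One-sided: H1 is that the Prior mean is greater than the Post mean.
one_sided = stats.ttest_ind(prior_demo, post_demo, alternative="greater")

print("two-sided p value:%.4f" % two_sided.pvalue)
print("one-sided p value:%.4f" % one_sided.pvalue)
```

When the t statistic is positive, the one-sided p-value is half the two-sided one. The direction of the test should be fixed before looking at the data.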