In [1]:
# #1. A physician is evaluating a new diet for her patients with a family history of heart disease. 
# To test the effectiveness of this diet, 16 patients are placed on the diet for 6 months. 
# Their weights and triglyceride levels are measured before and after the study, 
# and the physician wants to know if either set of measurements has changed.

In [2]:
import numpy as np
import pandas as pd
import scipy.stats as stats

In [3]:
dietDF = pd.read_csv("dietstudy.csv")
display(dietDF.head())

Unnamed: 0,patid,age,gender,tg0,tg1,tg2,tg3,tg4,wgt0,wgt1,wgt2,wgt3,wgt4
0,1,45,Male,180,148,106,113,100,198,196,193,188,192
1,2,56,Male,139,94,119,75,92,237,233,232,228,225
2,3,50,Male,152,185,86,149,118,233,231,229,228,226
3,4,46,Female,112,145,136,149,82,179,181,177,174,172
4,5,64,Male,156,104,157,79,97,219,217,215,213,214


In [4]:
##Step 1: Check whether the sample represents the population::
#Yes (assumed; the problem statement does not say how the 16 patients were selected)


##Step 2: Define the null and alternative hypotheses (one pair per measurement)::
#Null hypothesis (H0): The mean triglyceride level (or mean weight) of the patients does not change after 6 months 
#on the new diet, i.e., "the mean of the paired differences is zero".

#Alternative hypothesis (H1): The mean triglyceride level (or mean weight) of the patients does change after 6 months 
#on the new diet, i.e., the mean of the paired differences is not zero.


##Step 3: Choose the test based on the data::
#A paired-sample t-test determines whether there is a significant difference between the average values of the 
# same measurement made on the same subjects under two different conditions (here: before and after the diet).
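
To make the mechanics of the paired test concrete, here is a minimal sketch that computes the statistic by hand from the per-patient differences and compares it with `stats.ttest_rel`. It uses only the five rows shown by `dietDF.head()` above, so the numbers it prints are an illustration and not the result for all 16 patients. The variable names (`tg0_head`, `tg4_head`, `t_manual`, `p_manual`) are introduced here for the example.

```python
import numpy as np
import scipy.stats as stats

# First five patients only, copied from the dietDF.head() output above.
# This is NOT the full data set of 16 patients.
tg0_head = np.array([180, 139, 152, 112, 156], dtype=float)  # baseline
tg4_head = np.array([100, 92, 118, 82, 97], dtype=float)     # final measurement

# A paired t-test is a one-sample t-test on the per-patient differences.
diff = tg4_head - tg0_head
n = diff.size

# t = mean(diff) / (sample std of diff / sqrt(n)), with n - 1 degrees of freedom.
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))

# Two-sided p-value from the t distribution.
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The library call should agree with the manual computation.
result = stats.ttest_rel(tg4_head, tg0_head)

print("manual :", t_manual, p_manual)
print("scipy  :", result.statistic, result.pvalue)
```

Note that the sample standard deviation (`ddof=1`) is used for the differences; `np.std` defaults to `ddof=0`, which would give a slightly different statistic.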

In [5]:
##Step 4: Calculate the test statistic (shown in the program output)::
#Triglyceride levels: final measurement (tg4) vs. baseline (tg0)

print('Test for -- triglyceride levels...')
triglyceride = stats.ttest_rel(a=dietDF.tg4, b = dietDF.tg0)
print(triglyceride)

Test for -- triglyceride levels...
Ttest_relResult(statistic=-1.2000008533342437, pvalue=0.24874946576903698)


In [6]:
print("The average triglyceride level of the patients at the start of the study is {}".format(dietDF.tg0.mean()))
print("The average triglyceride level of the patients at the end of the study is {}".format(dietDF.tg4.mean()))

The average triglyceride level of the patients at the start of the study is 138.4375
The average triglyceride level of the patients at the end of the study is 124.375


In [7]:
##Step 5: Final conclusion based on p-value::
#Conclusion - triglyceride ::
# Across all 16 subjects, triglyceride levels dropped by about 14 points on average (138.44 to 124.38) after 6 months 
#of the new diet. The p-value (about 0.249) is greater than 0.05, so this drop is not statistically significant: 
#the data do not show that the diet changed triglyceride levels. Failing to reject H0 does not prove the means are equal.

print(triglyceride.pvalue > 0.05)
print('Conclusion: Since the p-value is greater than 0.05, we fail to reject H0; there is no significant evidence that the new diet changed triglyceride levels.')

True
Conclusion: Since the p-value is greater than 0.05, we fail to reject H0; there is no significant evidence that the new diet changed triglyceride levels.
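
A confidence interval for the mean paired difference can show how much uncertainty sits behind a non-significant result. The following is a minimal sketch under the same assumptions as the paired t-test; `paired_mean_diff_ci` is a hypothetical helper written for this example, and the input is again only the five rows from `dietDF.head()`, not the full study data.

```python
import numpy as np
import scipy.stats as stats

def paired_mean_diff_ci(before, after, confidence=0.95):
    """Return (mean difference, lower bound, upper bound) for after - before.

    Hypothetical helper for illustration; assumes the paired differences
    are roughly normally distributed.
    """
    diff = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    n = diff.size
    mean_diff = diff.mean()
    # Standard error of the mean difference (sample std, ddof=1).
    sem = diff.std(ddof=1) / np.sqrt(n)
    # Two-sided critical value of the t distribution with n - 1 degrees of freedom.
    t_crit = stats.t.ppf(0.5 + confidence / 2, df=n - 1)
    return mean_diff, mean_diff - t_crit * sem, mean_diff + t_crit * sem

# First five patients only, copied from the dietDF.head() output above.
tg0_head = [180, 139, 152, 112, 156]
tg4_head = [100, 92, 118, 82, 97]

mean_diff, lower, upper = paired_mean_diff_ci(tg0_head, tg4_head)
print("mean difference:", mean_diff)
print("95% CI:", (lower, upper))
```

On the full data one would call `paired_mean_diff_ci(dietDF.tg0, dietDF.tg4)`. An interval that contains zero corresponds to a two-sided p-value above 0.05 at the same confidence level.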


In [12]:
np.std(dietDF.tg0)

28.11798523632161

In [8]:
##Weights....

In [9]:
print('Test for -- Weights...')
weight = stats.ttest_rel(a=dietDF.wgt4, b = dietDF.wgt0)
print(weight)

Test for -- Weights...
Ttest_relResult(statistic=-11.174521688532522, pvalue=1.137689414996614e-08)


In [10]:
print("The average weight of the patients at the start of the study: {}".format(dietDF.wgt0.mean()))
print("The average weight of the patients at the end of the study: {}".format(dietDF.wgt4.mean()))

The average weight of the patients at the start of the study: 198.375
The average weight of the patients at the end of the study: 190.3125


In [11]:
##Step 5: Final conclusion based on p-value::
#Conclusion - Weight::
# The subjects lost weight over the course of the study; on average, about 8 units (198.38 to 190.31).
#Since the p-value for the change in weight is far below 0.05, the average loss of 8.06 units per patient 
#is unlikely to be due to chance variation. This is consistent with an effect of the diet, although a 
#before/after comparison on a single group cannot by itself rule out other causes.

print(weight.pvalue > 0.05)
print('Conclusion: Since the p-value is less than 0.05, we reject H0; mean weight at the end (wgt4) is significantly lower than at the start (wgt0).')

False
Conclusion: Since the p-value is less than 0.05, we reject H0; mean weight at the end (wgt4) is significantly lower than at the start (wgt0).
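
The test above is two-sided, so the direction of the change is read from the sign of the statistic: it is negative because the final weights are lower than the baseline weights. If the question is specifically whether weight decreased, a one-sided p-value can be derived from the same statistic. This is a minimal sketch using only the five rows shown by `dietDF.head()`, so its numbers are illustrative; the names `wgt0_head`, `wgt4_head`, and `p_less` are introduced for the example.

```python
import numpy as np
import scipy.stats as stats

# First five patients only, copied from the dietDF.head() output above.
# This is NOT the full data set of 16 patients.
wgt0_head = np.array([198, 237, 233, 179, 219], dtype=float)  # baseline
wgt4_head = np.array([192, 225, 226, 172, 214], dtype=float)  # final measurement

# Two-sided paired t-test, same call pattern as in the notebook.
two_sided = stats.ttest_rel(wgt4_head, wgt0_head)

# One-sided alternative "mean(wgt4 - wgt0) < 0":
# the p-value is the lower-tail probability of the observed t statistic.
df = wgt0_head.size - 1
p_less = stats.t.cdf(two_sided.statistic, df=df)

print("t statistic      :", two_sided.statistic)
print("two-sided p-value:", two_sided.pvalue)
print("one-sided p-value:", p_less)
```

When the statistic is negative, the one-sided p-value is half the two-sided one. The direction of the alternative should be chosen before looking at the data.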
