In [1]:
# Import pandas, numpy, scipy.stats
import pandas as pd
import numpy as np
from scipy import stats

# One Sample T Test

- According to Reynolds Intellectual Ability Scales, the average VIQ (Verbal IQ scores based on the four Wechsler (1981) subtests) is about 109.

- Our sample data contains 40 cases.
- Let's test whether the average VIQ in the population is significantly greater than 109.

In [2]:
# Brain size and weight and IQ data (Willerman et al. 1991)
df = pd.read_csv("brain_size.csv", sep=";", na_values = ".", index_col = 0)

In [4]:
df.head()

Unnamed: 0,Gender,FSIQ,VIQ,PIQ,Weight,Height,MRI_Count
1,Female,133,132,124,118.0,64.5,816932
2,Male,140,150,124,,72.5,1001121
3,Male,139,123,150,143.0,73.3,1038437
4,Male,133,129,128,172.0,68.8,965353
5,Female,137,132,134,147.0,65.0,951545


In [5]:
# H0: mu = 109
# H1: mu > 109

In [5]:
# Calculate the mean of VIQ
xbar = df.VIQ.mean()
xbar

112.35

In [6]:
# Calculate the std of VIQ
s = df.VIQ.std()
s

23.616107063199742

In [7]:
s/np.sqrt(df.shape[0])   # standard deviation of the hypothetical sampling distribution of the sample means, i.e. the standard error

3.7340343893050605

In [8]:
df.shape

(40, 7)

In [9]:
# Calculate the test statistic
t_test = (xbar - 109)/(s/np.sqrt(df.shape[0]))

In [10]:
#test statistic
t_test

0.8971529586323551

In [12]:
# Calculate p-value
1 - stats.t.cdf(t_test, 39)

0.18757115929257173
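
The manual calculation above can also be reproduced from the summary statistics alone. The following is a minimal sketch that hard-codes the mean, standard deviation and sample size printed in the outputs above (instead of reading `brain_size.csv`) and uses the survival function `stats.t.sf` in place of `1 - cdf`. The names `mu0`, `se`, `t_stat` and `p_right` are illustrative and do not appear in the original cells.

```python
import numpy as np
from scipy import stats

# Summary statistics copied from the cell outputs above
xbar = 112.35               # sample mean of VIQ
s = 23.616107063199742      # sample standard deviation of VIQ (ddof=1)
n = 40                      # number of cases
mu0 = 109                   # hypothesised population mean under H0

se = s / np.sqrt(n)                     # standard error of the mean
t_stat = (xbar - mu0) / se              # one-sample t statistic
p_right = stats.t.sf(t_stat, df=n - 1)  # right-tail p-value for H1: mu > mu0

print(t_stat, p_right)
```

`stats.t.sf` returns the upper-tail probability directly, which avoids the small loss of precision that `1 - cdf` can suffer when the p-value is very small.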

In [18]:
help(stats.ttest_1samp)

Help on function ttest_1samp in module scipy.stats._stats_py:

ttest_1samp(a, popmean, axis=0, nan_policy='propagate', alternative='two-sided')
    Calculate the T-test for the mean of ONE group of scores.
    
    This is a test for the null hypothesis that the expected value
    (mean) of a sample of independent observations `a` is equal to the given
    population mean, `popmean`.
    
    Parameters
    ----------
    a : array_like
        Sample observation.
    popmean : float or array_like
        Expected value in null hypothesis. If array_like, then it must have the
        same shape as `a` excluding the axis dimension.
    axis : int or None, optional
        Axis along which to compute test; default is 0. If None, compute over
        the whole array `a`.
    nan_policy : {'propagate', 'raise', 'omit'}, optional
        Defines how to handle when input contains nan.
        The following options are available (default is 'propagate'):
    
          * 'propagate': returns 

In [14]:
# Use stats.ttest_1samp() to calculate the test statistic and p-value
oneSamp = stats.ttest_1samp(df.VIQ, 109)
oneSamp

Ttest_1sampResult(statistic=0.897152958632355, pvalue=0.3751423185851436)

In [15]:
#Display p-value
oneSamp.pvalue

0.3751423185851436

In [16]:
oneSamp.pvalue/2    # ttest_1samp returns a two-sided p-value; our H1 is one-sided and t > 0, so we divide by 2

0.1875711592925718

In [17]:
# Compare p-value and alpha
alpha = 0.05

if oneSamp.pvalue/2 < alpha:
    print("Reject the null")
else:
    print("Fail to reject the null")

Fail to reject the null
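
Halving the two-sided p-value is only correct when the test statistic points in the direction of H1. As a hedged sketch, the `alternative` parameter listed in the `help` output above can request the one-sided test directly (this assumes a SciPy version that supports it, as the signature shown suggests). The `scores` array below is made up for illustration and is not the brain size data.

```python
import numpy as np
from scipy import stats

# Made-up scores for illustration only (NOT the brain_size data)
scores = np.array([112, 125, 98, 131, 107, 119, 104, 140, 95, 122])
mu0 = 109

two_sided = stats.ttest_1samp(scores, mu0)
greater = stats.ttest_1samp(scores, mu0, alternative="greater")

# Manual conversion from the two-sided p-value for H1: mu > mu0.
# If t > 0 the right tail is p/2; if t < 0 it is 1 - p/2.
if two_sided.statistic > 0:
    manual = two_sided.pvalue / 2
else:
    manual = 1 - two_sided.pvalue / 2

print(greater.pvalue, manual)
```

Both numbers should agree, so passing `alternative="greater"` removes the need to remember the sign check.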


# Independent Samples T Test

## Arsenic Example

- Arsenic concentration in public drinking water supplies is a potential health risk. 
- An article in the Arizona Republic (May 27, 2001) reported drinking water arsenic concentrations in parts per billion (ppb) for 10 metropolitan Phoenix communities and 10 communities in rural Arizona.
- You can find the data in the CSV file `arsenic.csv`.

Determine if there is any difference in mean arsenic concentrations between metropolitan Phoenix communities and communities in rural Arizona.

In [19]:
#Import arsenic dataset
arsenic = pd.read_csv("arsenic.csv")

In [21]:
arsenic     # there are 2 different, independent groups

Unnamed: 0,Metro Phoenix,x1,Rural Arizona,x2
0,Phoenix,3,Rimrock,48
1,Chandler,7,Goodyear,44
2,Gilbert,25,New River,40
3,Glendale,10,Apache Junction,38
4,Mesa,15,Buckeye,33
5,Paradise Valley,6,Nogales,21
6,Peoria,12,Black Canyon City,20
7,Scottsdale,25,Sedona,12
8,Tempe,15,Payson,1
9,Sun City,7,Casa Grande,18


In [22]:
arsenic.columns

Index(['Metro Phoenix', 'x1', 'Rural Arizona', 'x2'], dtype='object')

In [20]:
#H0: mu1 = mu2
#H1: mu1 != mu2

In [23]:
#Perform Levene test for equal variances
#H0: The population variances are equal
#H1: There is a difference between the variances in the population
#The small p-value suggests that the populations do not have equal variances.
leveneTest = stats.levene(arsenic.x1, arsenic.x2)
leveneTest

LeveneResult(statistic=7.7015516672169, pvalue=0.012482954069299166)
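
To see why Levene's test rejects equal variances here, it can help to look at the two sample variances side by side. This is a small sketch that types the values in from the table above rather than reading `arsenic.csv`; the names `metro`, `rural` and `equal_var` are illustrative.

```python
import numpy as np
from scipy import stats

# Arsenic concentrations (ppb) copied from the table above
metro = np.array([3, 7, 25, 10, 15, 6, 12, 25, 15, 7])
rural = np.array([48, 44, 40, 38, 33, 21, 20, 12, 1, 18])

# Sample variances (ddof=1 gives the unbiased estimator)
var_metro = metro.var(ddof=1)
var_rural = rural.var(ddof=1)

# Levene's test; SciPy centres on the median by default
levene = stats.levene(metro, rural)

# Treat variances as equal only if we fail to reject H0 at alpha = 0.05
equal_var = levene.pvalue >= 0.05

print(var_metro, var_rural, levene.pvalue, equal_var)
```

The resulting `equal_var` flag could then be passed to `stats.ttest_ind`, which is one way to keep the variance check and the t-test consistent.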

In [24]:
# average Metro Phoenix
arsenic.x1.mean()

12.5

In [25]:
# average Rural Arizona
arsenic.x2.mean()

27.5

Calculate the T-test for the means of two independent samples of scores.

In [None]:
# H0: mu1 = mu2
# H1: mu1 != mu2

In [28]:
help(stats.ttest_ind)        # stats is the module, ttest_ind is the test; it is two-sided by default, so we will not divide the p-value by 2

Help on function ttest_ind in module scipy.stats._stats_py:

ttest_ind(a, b, axis=0, equal_var=True, nan_policy='propagate', permutations=None, random_state=None, alternative='two-sided', trim=0)
    Calculate the T-test for the means of *two independent* samples of scores.
    
    This is a test for the null hypothesis that 2 independent samples
    have identical average (expected) values. This test assumes that the
    populations have identical variances by default.
    
    Parameters
    ----------
    a, b : array_like
        The arrays must have the same shape, except in the dimension
        corresponding to `axis` (the first, by default).
    axis : int or None, optional
        Axis along which to compute test. If None, compute over the whole
        arrays, `a`, and `b`.
    equal_var : bool, optional
        If True (default), perform a standard independent 2 sample test
        that assumes equal population variances [1]_.
        If False, perform Welch's t-test, which

In [29]:
# Calculate test statistics using stats.ttest_ind()
indTest = stats.ttest_ind(arsenic.x1, arsenic.x2, equal_var = False)    # Levene's test rejected equal variances, so we must pass equal_var=False
indTest

Ttest_indResult(statistic=-2.7669395785560558, pvalue=0.015827284816100885)
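
With `equal_var=False`, SciPy performs Welch's t-test. The sketch below recomputes the statistic by hand, using the Welch-Satterthwaite approximation for the degrees of freedom, and compares it with `stats.ttest_ind`. The data are typed in from the table above, and all variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Arsenic concentrations (ppb) copied from the table above
metro = np.array([3, 7, 25, 10, 15, 6, 12, 25, 15, 7])
rural = np.array([48, 44, 40, 38, 33, 21, 20, 12, 1, 18])

n1, n2 = len(metro), len(rural)
# Squared standard errors of each group mean
v1 = metro.var(ddof=1) / n1
v2 = rural.var(ddof=1) / n2

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_manual = (metro.mean() - rural.mean()) / np.sqrt(v1 + v2)
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Two-sided p-value: both tails of the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), df_welch)

welch = stats.ttest_ind(metro, rural, equal_var=False)
print(t_manual, df_welch, p_manual)
print(welch.statistic, welch.pvalue)
```

The degrees of freedom come out below the pooled value of 18, which is how Welch's test accounts for the unequal variances.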

In [30]:
indTest.statistic

-2.7669395785560558

In [31]:
indTest.pvalue

0.015827284816100885

In [32]:
# Decision
alpha = 0.05

if indTest.pvalue < alpha:
    print("Reject the null")
else:
    print("Fail to reject the null")

Reject the null


In [None]:
# The mean arsenic concentration is higher in the rural communities

# Paired (Dependent) Samples T Test

## Prozac Data

- Let us consider a simple example of what is often termed "pre/post" data or "pretest/posttest" data. 
- Suppose you wish to test the effect of Prozac on the well-being of depressed individuals, using a standardised "well-being scale" that sums Likert-type items to obtain a score that could range from 0 to 20. 
- Higher scores indicate greater well-being (that is, Prozac is having a positive effect). 
- While there are flaws in this design (e.g., lack of a control group) it will serve as an example of how to analyse such data.

Determine if Prozac enhances well-being in depressed individuals. Use α = 0.05.


In [33]:
# read prozac dataset
prozac = pd.read_csv("prozac.csv")

In [34]:
prozac    # 0 = unhappy, 20 = very happy; most subjects improve; we will test whether the drug helps

Unnamed: 0,moodpre,moodpost,difference
0,3,5,2
1,0,1,1
2,6,5,-1
3,7,7,0
4,4,10,6
5,3,9,6
6,2,7,5
7,1,11,10
8,4,8,4


In [None]:
# H0: d_bar = 0
# H1: d_bar > 0    # if H1 is supported, the drug works

In [35]:
# Calculate test statistics using stats.ttest_rel()  
# moodpost - moodpre
pairedtest = stats.ttest_rel(prozac.moodpost, prozac.moodpre)
pairedtest

Ttest_relResult(statistic=3.1428571428571423, pvalue=0.013745824394788492)

In [37]:
pairedtest.pvalue/2        # ttest_rel is two-sided but we need a one-sided test, so we divide by 2; the p-value is close to 0, so the drug appears to work

0.006872912197394246
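
A paired t-test is equivalent to a one-sample t-test on the differences against 0. The following sketch checks this by hand using the values from the table above (typed in rather than read from `prozac.csv`); the names `diff`, `t_manual` and `p_right` are illustrative.

```python
import numpy as np
from scipy import stats

# Well-being scores copied from the table above
moodpre = np.array([3, 0, 6, 7, 4, 3, 2, 1, 4])
moodpost = np.array([5, 1, 5, 7, 10, 9, 7, 11, 8])

# Per-subject differences (post minus pre)
diff = moodpost - moodpre
n = len(diff)

# One-sample t statistic on the differences, H0: mean difference = 0
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
# Right-tail p-value for H1: mean difference > 0
p_right = stats.t.sf(t_manual, df=n - 1)

paired = stats.ttest_rel(moodpost, moodpre)
print(t_manual, p_right)
print(paired.statistic, paired.pvalue / 2)
```

Because the statistic is positive, the right-tail p-value matches the halved two-sided p-value from `stats.ttest_rel`.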

In [34]:
# moodpre - moodpost
# H0: d_bar = 0
# H1: d_bar < 0
#stats.ttest_rel(prozac.moodpre, prozac.moodpost, alternative="less")
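
The commented-out call above reverses the order of the arguments, so the alternative flips from "greater" to "less". As a hedged sketch (assuming a SciPy version in which `ttest_rel` accepts `alternative`, and with the data typed in from the table), both formulations should give the same one-sided p-value:

```python
import numpy as np
from scipy import stats

# Well-being scores copied from the table above
moodpre = np.array([3, 0, 6, 7, 4, 3, 2, 1, 4])
moodpost = np.array([5, 1, 5, 7, 10, 9, 7, 11, 8])

# post - pre with H1: mean difference > 0
greater = stats.ttest_rel(moodpost, moodpre, alternative="greater")
# pre - post with H1: mean difference < 0
less = stats.ttest_rel(moodpre, moodpost, alternative="less")

# The statistics have opposite signs, the p-values are identical
print(greater.statistic, greater.pvalue)
print(less.statistic, less.pvalue)
```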

In [38]:
# Decision
alpha = 0.05

if pairedtest.pvalue/2 < alpha:
    print("Reject the Null")
else:
    print("Fail to reject")

Reject the Null


In [37]:
# That means the Prozac treatment appears to improve well-being scores.