In [1]:
import numpy as np
import pandas as pd
import scipy as sc
import scipy.stats  # make sure the sc.stats submodule is loaded
import itertools

from statsmodels.stats.descriptivestats import sign_test
from statsmodels.stats.weightstats import zconfint

%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
failure_times = pd.read_csv('failure times.txt', header = None)
failure_times.columns = ['CPU_seconds']
failure_times

Unnamed: 0,CPU_seconds
0,3
1,33
2,146
3,227
4,342
...,...
131,76057
132,81542
133,82702
134,84566


In [3]:
sample = []
for i in range(1, len(failure_times.CPU_seconds)):
    sample.append(failure_times.CPU_seconds[i] - failure_times.CPU_seconds[i-1])
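
The loop above turns cumulative failure times into times between consecutive failures. As a minimal sketch of the same step, assuming only the first five values shown in the table above, `np.diff` gives the gaps directly (the names `cumulative_times` and `gaps` are illustrative and not part of the notebook):

```python
import numpy as np

# First five cumulative failure times shown in the table above
cumulative_times = np.array([3, 33, 146, 227, 342])

# np.diff returns x[i] - x[i-1] for every consecutive pair
gaps = np.diff(cumulative_times)
print(gaps.tolist())
```

On the full data, `np.diff(failure_times.CPU_seconds.values)` should produce the same values as the loop.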

The dataset gives the failure times (in CPU seconds, measured in terms of execution time) of a real-time command and control software system.

Is the average time between failures greater than 500 CPU seconds? The `sample` list built above holds the times between consecutive failures. Let's test the following hypotheses:

H0: average time between failures is not greater than 500 CPU seconds

H1: average time between failures is greater than 500 CPU seconds

First, let's use Student's t-test. What is its p-value? Round the answer to 4 decimal places.

In [4]:
print('T test:', sc.stats.ttest_1samp(sample, 500, alternative='greater'))

T test: Ttest_1sampResult(statistic=1.7572536270462775, pvalue=0.040579209207804064)
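
To make explicit what the one-sided t-test computes, here is a minimal sketch that builds the statistic and p-value by hand and compares them with `ttest_1samp`. The values in `toy_sample` are hypothetical and are not taken from the dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical inter-failure times, NOT taken from the dataset
toy_sample = np.array([620, 480, 910, 150, 700, 530, 1200, 310], dtype=float)
mu0 = 500

n = len(toy_sample)
# t = (sample mean - mu0) / (sample std / sqrt(n)), using the unbiased std (ddof=1)
t_stat = (toy_sample.mean() - mu0) / (toy_sample.std(ddof=1) / np.sqrt(n))
# One-sided 'greater' p-value: upper tail of Student's t with n - 1 degrees of freedom
p_value = stats.t.sf(t_stat, df=n - 1)

# Same quantities from SciPy, for comparison
reference = stats.ttest_1samp(toy_sample, mu0, alternative='greater')
print(t_stat, p_value)
print(reference.statistic, reference.pvalue)
```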


Let's test the same null hypothesis against the same alternative with the sign test (which, strictly speaking, is a test about the median rather than the mean). First of all, how many observations in the sample are above 500?

In [5]:
sample_gt_500 = list(filter(lambda x: x > 500, sample))
sample_not_gt_500 = list(filter(lambda x: x <= 500, sample))
print('above', len(sample_gt_500))
print('not above', len(sample_not_gt_500))

above 49
not above 86


What is the p-value of the sign test? Round the answer to 4 decimal places.

In [7]:
sc.stats.binom_test(49, 135, alternative='greater')

0.9995002578123924
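
The one-sided sign test reduces to a binomial tail probability: under H0 the number of observations above 500 follows Binomial(135, 0.5), and the p-value is the probability of seeing at least 49 of them. A minimal sketch using the binomial survival function, which may be useful because recent SciPy releases replace `binom_test` with `binomtest`:

```python
from scipy.stats import binom

n_total = 135  # number of observations, from the counts above
n_above = 49   # observations above 500

# P(X >= 49) for X ~ Binomial(135, 0.5).
# binom.sf(k, n, p) is P(X > k), hence the "- 1".
p_value = binom.sf(n_above - 1, n_total, 0.5)
print(p_value)
```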

Good, let's try the signed-rank test now. What p-value does it give? Provide the answer rounded to 4 decimal places.



In [8]:
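# Center the sample at the hypothesized value of 500
new_sample = [x - 500 for x in sample]
# wilcoxon returns a two-sided p-value by default. The observed shift points
# away from H1 here, so the one-sided 'greater' p-value is 1 - p_two_sided / 2.
print("Signed rank test:", 1 - sc.stats.wilcoxon(new_sample, zero_method='zsplit', correction=False)[1]/2)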

Signed rank test: 0.8632079654217537
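
The expression `1 - p / 2` converts a two-sided p-value into a one-sided one, which is valid under the normal approximation only when the data lean away from the alternative. A minimal sketch on hypothetical differences, comparing it with SciPy's own `alternative='greater'` option (the sample has more than 50 values so that SciPy uses the normal approximation rather than the exact distribution):

```python
import numpy as np
from scipy import stats

# Hypothetical centered differences: 35 smaller positive values and
# 25 larger negative values, with no ties and no zeros
magnitudes = np.arange(1, 61, dtype=float)
signs = np.where(magnitudes <= 35, 1.0, -1.0)
diffs = magnitudes * signs

p_two_sided = stats.wilcoxon(diffs).pvalue
p_greater = stats.wilcoxon(diffs, alternative='greater').pvalue

# The positive ranks sum to less than expected under H0,
# so the two values printed last should agree
print(p_two_sided, p_greater, 1 - p_two_sided / 2)
```

If the positive rank sum were above its expected value instead, the one-sided p-value would be `p_two_sided / 2`, so passing `alternative='greater'` is the safer habit.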


Great, let's proceed to the permutation test with the sum of the centered sample as the statistic. What is its p-value? Round the answer to 4 decimal places.

The sample is too big to go through all the permutations, so let's use 10000 of them. To get the same result as us, use the functions from the example notebook, and set the random seed to 0 before calling the permutation_test_1s function.

In [9]:
def permutation_t_stat_1s(sample, mean):
    t_stat = sum(sample - mean)
    return t_stat

def permutation_null_distr_1s(sample, mean, max_permutations = None):
    centered_sample = sample - mean
    if max_permutations:
        signs_array = set([tuple(x) for x in 2 * np.random.randint(2, size = (max_permutations, 
                                                                              len(sample))) - 1 ])
    else:
        signs_array =  itertools.product([-1, 1], repeat = len(sample))
    distr = [permutation_t_stat_1s(centered_sample * np.array(signs), 0) for signs in signs_array]
    return distr

def permutation_test_1s(sample, mean, max_permutations = None, alternative = 'two-sided', return_distr = False):
    if alternative not in ('two-sided', 'less', 'greater'):
        raise ValueError("alternative not recognized\n"
                         "should be 'two-sided', 'less' or 'greater'")
    
    t_stat = permutation_t_stat_1s(sample, mean)
    
    null_distr = permutation_null_distr_1s(sample, mean, max_permutations)
    
    if alternative == 'two-sided':
        p = sum([1. if abs(x) >= abs(t_stat) else 0. for x in null_distr]) / len(null_distr)
    elif alternative == 'less':
        p = sum([1. if x <= t_stat else 0. for x in null_distr]) / len(null_distr)
    else: # alternative == 'greater':
        p = sum([1. if x >= t_stat else 0. for x in null_distr]) / len(null_distr)
        
    if return_distr:
        return {'t': t_stat, 'p': p, 'null_distr': null_distr}
    else:
        return {'t': t_stat, 'p': p}
    
np.random.seed(0)
res = permutation_test_1s(pd.DataFrame(sample, columns=['cpu']).cpu, 500, max_permutations = 10000, alternative='greater')
res

{'t': 21179, 'p': 0.0366}
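
The 10000 random sign patterns above approximate a null distribution that could, for a small sample, be enumerated in full. A minimal sketch of that exact version, assuming a hypothetical three-element sample (the function name `exact_sign_flip_p_value` is illustrative and not part of the example notebook):

```python
import itertools
import numpy as np

def exact_sign_flip_p_value(sample, mean):
    """One-sided ('greater') p-value over all 2**n sign patterns."""
    centered = np.asarray(sample, dtype=float) - mean
    observed = centered.sum()
    # Under H0 (symmetry about `mean`) every sign pattern is equally likely
    null_sums = [
        np.dot(centered, signs)
        for signs in itertools.product([-1, 1], repeat=len(centered))
    ]
    # Share of sign patterns whose sum is at least as large as the observed one
    return sum(s >= observed for s in null_sums) / len(null_sums)

toy_sample = [501, 502, 503]  # hypothetical values, NOT from the dataset
p_value = exact_sign_flip_p_value(toy_sample, 500)
print(p_value)
```

With 135 observations there are 2**135 sign patterns, which is why the notebook samples them at random instead.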

Check the underlying assumptions of each test: which of the four do you think could be trusted for this problem?

**Answer:**
- Sign test
- Student's t-test
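
A likely reason for this choice is that the signed-rank test and the sign-flip permutation test both assume a distribution symmetric about the hypothesized value, while the sign test needs no such assumption and the t-test is fairly robust at this sample size. One way to probe symmetry is the sample skewness. The sketch below uses synthetic exponential data as a stand-in, which is an assumption and not the failure data; in the notebook one would call `stats.skew(sample)` instead.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-in (assumption): right-skewed waiting times, arbitrary scale
synthetic_gaps = rng.exponential(scale=500, size=1000)

# Skewness near 0 suggests symmetry; clearly positive values mean a long right tail
skewness = stats.skew(synthetic_gaps)
# For a symmetric distribution roughly half of the values lie above the mean
share_above_mean = np.mean(synthetic_gaps > synthetic_gaps.mean())
print(skewness, share_above_mean)
```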