In [17]:
import numpy as np
from numpy import loadtxt
from statsmodels.stats.weightstats import DescrStatsW


failure = loadtxt("failure times.txt")

The dataset above gives the failure times (in CPU seconds, measured in terms of execution time) of a real-time command and control software system.

How often do failures happen?

Calculate the average time between failures and report it to two decimal places.

In [21]:
# Gaps between consecutive failures, in CPU seconds
times = np.diff(failure)

np.mean(times)

656.8814814814815

Calculate a 95% confidence interval for the mean time between failures, based on the asymptotic normality of the sample mean. What is its upper bound? Round the answer to two decimal places.

In [22]:
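# Note: tconfint_mean() uses Student's t quantiles; DescrStatsW.zconfint_mean()
# gives the normal-approximation interval, which is slightly narrower.
print("95% confidence interval:", DescrStatsW(times).tconfint_mean())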

95% confidence interval: (480.30808862423476, 833.4548743387282)
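
To make the normal-approximation interval concrete, here is a minimal sketch that computes it by hand as mean ± z · s / sqrt(n). The function name `normal_confint_mean` and the `toy_gaps` values are made up for illustration and are not part of the dataset. On the same input it should agree with `DescrStatsW(...).zconfint_mean()` up to floating-point error, but treat that as an assumption to check rather than a guarantee.

```python
import math
from statistics import NormalDist, mean, stdev


def normal_confint_mean(sample, alpha=0.05):
    """Two-sided confidence interval for the mean via the normal approximation."""
    n = len(sample)
    center = mean(sample)
    # Standard error: sample standard deviation (n - 1 denominator) over sqrt(n)
    std_error = stdev(sample) / math.sqrt(n)
    # Standard normal quantile, about 1.96 for alpha = 0.05
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return center - z * std_error, center + z * std_error


# Hypothetical gaps between failures, for illustration only
toy_gaps = [30.0, 81.0, 113.0, 24.0, 108.0, 88.0, 670.0, 120.0, 26.0, 114.0]
lower, upper = normal_confint_mean(toy_gaps)
print(f"95% normal-approximation interval: ({lower:.2f}, {upper:.2f})")
```

With only ten observations the normal approximation is rough. It is used here purely to show the arithmetic.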


Let's build a 95% confidence interval for the same mean time between failures using the bootstrap. What is its upper bound? Round the answer to two decimal places.

To get exactly the same results as we did:

- use get_bootstrap_samples and percentile_interval functions

- set the random seed to 0 once, before calling get_bootstrap_samples

- use 10000 bootstrap resamples

In [28]:
def get_bootstrap_samples(x, n_resamples):
    # Draw n_resamples resamples of len(x) observations each, with replacement
    indices = np.random.randint(0, len(x), (n_resamples, len(x)))
    resamples = x[indices]
    return resamples


def percentile_interval(stat, alpha):
    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap statistics
    boundaries = np.percentile(stat, [100 * alpha / 2., 100 * (1 - alpha / 2.)])
    return boundaries


np.random.seed(0)  # seed once so the resamples are reproducible
mean_scores = list(map(np.mean, get_bootstrap_samples(np.array(times), 10000)))
print("95% confidence interval for the CPU seconds mean:", percentile_interval(mean_scores, 0.05))

95% confidence interval for the CPU seconds mean: [492.31703704 838.1562963 ]
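
For reference, the percentile bootstrap can also be sketched as one self-contained function built on NumPy's `Generator` API. The name `bootstrap_percentile_interval` and the `toy_gaps` values are hypothetical. Because this variant draws from a different random number generator than `np.random.seed` / `np.random.randint`, it will not reproduce the interval printed above, even with the same seed.

```python
import numpy as np


def bootstrap_percentile_interval(x, n_resamples=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a 1-D sample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Each row of `indices` selects one resample of len(x) points, with replacement
    indices = rng.integers(0, len(x), size=(n_resamples, len(x)))
    resample_means = x[indices].mean(axis=1)
    # Bounds are the alpha/2 and 1 - alpha/2 percentiles of the resampled means
    return np.percentile(resample_means, [100 * alpha / 2, 100 * (1 - alpha / 2)])


# Hypothetical gaps between failures, for illustration only
toy_gaps = [30.0, 81.0, 113.0, 24.0, 108.0, 88.0, 670.0, 120.0, 26.0, 114.0]
lower, upper = bootstrap_percentile_interval(toy_gaps)
print(f"95% bootstrap interval: ({lower:.2f}, {upper:.2f})")
```

Passing the seed into the function keeps the random state local, so repeated calls with the same seed return the same interval without touching NumPy's global generator.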
