## Reliability Statistics for Finite Samples Notebook

This Jupyter notebook shows how to compute reliability engineering
statistics (reliability, confidence, and assurance) for finite population
sizes using the [relistats library](https://github.com/sanjaymjoshi/relistats).

This notebook was used to generate the plots and tables in the paper:

S.M. Joshi, "TBD: Computation of Reliability Statistics for Success-Failure Experiments,"
[TBD: arXiv:2303.03167](https://doi.org/10.48550/arXiv.2303.03167), March 2023.
  

In [None]:
# For online use, such as Google Colab or GitHub Codespaces.
# For local use, run `pipenv install` instead and skip this cell.
!pip install "relistats >= 1.1"
!pip install tabulate


## Set up
Run the following cell first to import the required packages.

In [None]:

import itertools
from relistats.binom_fin import assur_fin, conf_fin, reli_fin
from relistats.binomial import assurance, confidence, reliability
import matplotlib.pyplot as plt
from tabulate import tabulate

#table_format = "latex" # for paper
table_format = "simple" # for notebook

marker = itertools.cycle(('*', '^', '+')) 



## Reliability

Reliability is the probability of success. For success-failure experiments, or test units
that either pass or fail, the samples typically follow a binomial distribution. They are
assumed to be independent (one sample passing or failing has no effect on another sample)
and identically distributed (the reliability is the same for each sample).

The math assumes an infinite number of samples, but we only ever have access to a finite
number of them. By observing a number of successes and failures, we can compute an
estimate of reliability. Based on the number of samples, we qualify this
estimate using 'confidence'. Then we can say something like "Having seen zero failures
in 10 samples, we have 80% confidence that the reliability is at least about 85% for the
whole population". The "population" means all possible samples, which is typically infinite!
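
As a rough check of the figure quoted above, the zero-failure case has a simple closed form under the infinite-population binomial assumption: the confidence is `c = 1 - r**n`, which rearranges to `r = (1 - c)**(1/n)`. The sketch below uses only the standard library. The helper name `zero_failure_reliability` is made up for illustration and is not part of relistats.

```python
def zero_failure_reliability(n: int, c: float) -> float:
    """Minimum reliability after n passes and no failures, at confidence c.

    Assumes an infinite population of independent, identically distributed
    samples, so that c = 1 - r**n and therefore r = (1 - c)**(1/n).
    """
    return (1.0 - c) ** (1.0 / n)


# Zero failures in 10 samples at 80% confidence
r = zero_failure_reliability(n=10, c=0.80)
print(f"{r:.3f}")  # 0.851
```

This covers only `f = 0`. With one or more failures there is no closed form, and the reliability has to be found numerically.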

In real life, we almost always deal with finite population sizes. For example, suppose we
are working on a new part (or new software). We start with a small number
of samples for initial production (or a small number of installations of the new software)
to make sure they are successful. This is typically done with "in-house" testing. There are
practical limits on how many samples we can
afford at this initial stage. Suppose they meet our criteria for success. It may still be
too early to sign up for a theoretically unlimited (practically 'large') number of
samples. It is a good idea to send a small number of samples to a few 'friendly' customers.
Continuing with our example of 10 samples above, suppose we are comfortable with making only
20 more units in our first batch (total population = 10+20 = 30). Are the reliability and
confidence numbers still the same as those for an infinite population? Fortunately, they are better!

In the code block below, the remaining population size, `m`, can be varied. For each value of `m`,
we compute the reliability as the confidence increases. From the plot
below, we see that at the same 80% confidence level, the reliability increases from about 85% to 90% when we limit
the number of additional samples to 20.

If you are using this notebook on Google Colab, you can change these numbers on the right
and see how the plot changes. Change `plot_xy_min` and `plot_xy_max` to tweak the range of
`x` and `y` axes for easier viewing.

In [None]:
#@title Reliability computations { vertical-output: true }
n = 10 #@param {type:"integer"}
f = 0 #@param {type:"integer"}
m_start = 10 #@param {type:"integer"}
m_end = 30 #@param {type:"integer"}
m_step = 10 #@param {type:"integer"}

c_start = 1 #@param {type:"integer"}
c_end = 99 #@param {type:"integer"}
c_step = 1 #@param {type:"integer"}

plot_xy_min = 60 #@param {type:"integer"}
plot_xy_max = 100 #@param {type:"integer"}

all_m = list(range(m_start, m_end+m_step, m_step))
all_c = range(c_start, c_end+c_step, c_step)

for m in all_m:
  rr = []
  cc = []
  for c in all_c:
    r, c2 = reli_fin(n, f, c/100, m)
    cc.append(c2*100)
    rr.append(r*100)
  
  plot_label = m
  plt.plot(cc, rr, next(marker), label=plot_label)

# Infinite samples
rr_inf = [reliability(n,f,c/100)*100 for c in all_c]
plt.plot(all_c, rr_inf, label="infinite")

plt.xlim(plot_xy_min, plot_xy_max)
plt.ylim(plot_xy_min, plot_xy_max)
plt.ylabel('Reliability (%) at remaining population size')
plt.xlabel(f'Confidence (%) with {f} failures in {n} samples')
plt.legend()
plt.grid()
plt.show()


## Confidence

Confidence in reliability is the probability that the actual reliability of the whole
population is at least the estimated reliability.

As the number of samples increases, the confidence in a given reliability value also increases.
For the same number of samples, the confidence in minimum reliability increases as the
reliability level drops. As above, the computations are traditionally done for an infinite
population, but the numbers look better for finite populations!
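
For the infinite-population case, both trends can be seen with a short standard-library sketch. It uses the textbook binomial formulation: the confidence that reliability is at least `r` is one minus the probability of seeing `f` or fewer failures in `n` samples if the reliability were exactly `r`. The helper name `binomial_confidence` is hypothetical, and relistats may compute the same quantity differently.

```python
from math import comb


def binomial_confidence(n: int, f: int, r: float) -> float:
    """Confidence that reliability is at least r, given f failures in n samples.

    Assumes an infinite population (binomial model), not a finite one.
    """
    q = 1.0 - r  # per-sample failure probability if reliability were exactly r
    p_at_most_f = sum(comb(n, k) * q**k * r ** (n - k) for k in range(f + 1))
    return 1.0 - p_at_most_f


# Zero failures in 10 samples, 90% reliability: 1 - 0.9**10
print(f"{binomial_confidence(10, 0, 0.90):.3f}")  # 0.651

# More samples raise the confidence; so does a lower reliability level.
print(binomial_confidence(20, 0, 0.90) > binomial_confidence(10, 0, 0.90))  # True
print(binomial_confidence(10, 0, 0.80) > binomial_confidence(10, 0, 0.90))  # True
```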

In the block below, the number of remaining samples varies from 10 to 30 in steps of 10 and the
number of failures in the first 10 samples is set to 0. If we are looking at 90% reliability,
the confidence is only 65% for infinite samples, but 80% for 20 remaining samples.

If you are using this notebook on Google Colab, you can change these numbers on
the right and see how the plot changes. Change `plot_xy_min` and `plot_xy_max`
to tweak the range of `x` and `y` axes for easier viewing.

In [None]:
#@title Confidence computations { vertical-output: true }
n = 10 #@param {type:"integer"}
f = 0 #@param {type:"integer"}
m_start = 10 #@param {type:"integer"}
m_end = 30 #@param {type:"integer"}
m_step = 10 #@param {type:"integer"}

plot_xy_min = 0 #@param {type:"integer"}
plot_xy_max = 100 #@param {type:"integer"}

all_m = list(range(m_start, m_end+m_step, m_step))
for m in all_m:
  rr = []
  cc = []
  for d in range(m+1):
    c, r2 = conf_fin(n, f, m, d)
    cc.append(c*100)
    rr.append(r2*100)
  plot_label = m if m is not None else "infinite"
  plt.plot(rr, cc, next(marker), label=plot_label)

# Infinite samples
all_r = range(1, 100)
cc_inf = [confidence(n,f,r/100)*100 for r in all_r]
plt.plot(all_r, cc_inf, label="infinite")

plt.xlim(plot_xy_min, plot_xy_max)
plt.ylim(plot_xy_min, plot_xy_max)

plt.ylabel('Confidence (%) at remaining population size')
plt.xlabel(f'Reliability (%) with {f} failures in {n} samples')
plt.legend()
plt.grid()
plt.show()



## Assurance

Assurance simplifies reliability and confidence by setting them equal to each other.
The result is a single number that is easier to communicate. For example, 90% assurance
means 90% reliability with 90% confidence. Given the number of tested samples and the number of
failures, there is exactly one assurance value. As the number of tested samples increases,
the assurance increases.
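
For an infinite population, "setting them equal" means finding the value `a` at which the confidence in reliability `a` is itself `a`. A minimal sketch, assuming the textbook binomial confidence and a plain bisection search, is shown below. The helper names `binomial_confidence` and `assurance_infinite` are hypothetical and are not the relistats API.

```python
from math import comb


def binomial_confidence(n: int, f: int, r: float) -> float:
    """Confidence that reliability >= r after f failures in n samples (infinite population)."""
    q = 1.0 - r
    return 1.0 - sum(comb(n, k) * q**k * r ** (n - k) for k in range(f + 1))


def assurance_infinite(n: int, f: int, tol: float = 1e-9) -> float:
    """Solve confidence(a) == a by bisection on [0, 1].

    Confidence falls as the reliability level rises, so confidence(a) - a
    changes sign at most once on the interval.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binomial_confidence(n, f, mid) > mid:
            lo = mid  # confidence still exceeds reliability: the root is to the right
        else:
            hi = mid
    return (lo + hi) / 2


print(f"{assurance_infinite(3, 0):.3f}")  # 0.682
```

For `n = 3` and `f = 0` this reduces to solving `1 - a**3 = a`.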

In the table below, the first entry in each row is the number of tested samples with 0
failures. The number of remaining samples increases from 1 to 10 across the columns, with
infinite samples as the last column. For example:
 - Suppose you have successfully tested 3 samples with 0 failures. If you plan to
   make 5 more samples, the assurance is 75% (as opposed to 68.2% for infinite samples,
   as shown in the last column). That is, reliability is at least 75%
   with confidence of at least 75%. 
 - Suppose this first batch of 5 samples has zero failures. Now you have tested 8 samples
   with 0 failures. If you want to make 8 more samples in the next batch, the assurance
   is 87.5%.

You might have noticed some strange behavior: the assurance does not always decrease as
the number of remaining samples increases. We look at this more closely further below.

If you are using this notebook on Google Colab,
you can change these numbers on the right and see how the table changes.

In [None]:
#@title Assurance computations { vertical-output: true }
n_start = 3 #@param {type:"integer"}
n_end = 22 #@param {type:"integer"}
n_step = 1 #@param {type:"integer"}

f = 0 #@param {type:"integer"}

m_start = 1 #@param {type:"integer"}
m_end = 10 #@param {type:"integer"}
m_step = 1 #@param {type:"integer"}

all_m = list(range(m_start, m_end+m_step, m_step))
all_a = []
for n in range(n_start, n_end+n_step, n_step):
    a_row = [assur_fin(n, f, m)[0]*100 for m in all_m]
    a_row.insert(0, n)
    a_row.append(assurance(n, f)*100)
    all_a.append(a_row)

all_m.append('inf')
print(tabulate(all_a, headers=all_m, tablefmt=table_format, floatfmt=".1f"))

### Assurance Anomaly
Let's look closely at the possible anomaly in the first row, at `n = 3`. The assurance increases from 71.4% for 4 additional
samples to 75% for 5 additional samples! Shouldn't assurance decrease as we increase the number of additional samples? Assurance here is the minimum of reliability and confidence. Let's look at both of these.

The plots below show the reliability and confidence as the number of defects `d` in the `m` additional samples increases from 0. The number
of initial test samples is `n = 3` with `f = 0` failures.

The reliability at `f+d = 2` failures out of `3+4 = 7` samples is `1 - (0+2)/(3+4) = 71.4%`. For this reliability, the confidence with 0 failures in the first 3 samples is 87.5%, assuming 4 additional samples. The assurance is the minimum of the two, 71.4%. At `f+d = 2` failures out of `3+5 = 8` samples, the reliability is `1 - (0+2)/(3+5) = 75%`, a little higher. The confidence with 0 failures in the first 3 samples drops to
78.4%. The assurance, the minimum of these, is 75%, still higher than 71.4%!

You can think of the assurance as the trade-off between reliability and confidence. The smaller the trade-off, i.e., the closer reliability and confidence are to each other, the higher the assurance. Because reliability can only take discrete values in a finite population, we can sometimes get slightly higher assurance even when the number of additional samples increases.
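
The arithmetic can be reproduced with a small standard-library sketch. It rests on two assumptions inferred from the figures quoted in this section, not from the relistats source: with `d` failures among the `m` additional samples and none in the `n` tested samples, the reliability is `1 - d/(n + m)`, and the confidence matches the zero-failure expression `1 - (1 - d/m)**n`. The helper names `finite_levels` and `best_tradeoff` are hypothetical.

```python
def finite_levels(n: int, m: int):
    """Yield (d, reliability, confidence) for d = 0..m failures in m additional samples.

    Illustration only: assumes zero failures in the n tested samples,
    reliability 1 - d/(n + m), and confidence 1 - (1 - d/m)**n.
    """
    for d in range(m + 1):
        r = 1.0 - d / (n + m)
        c = 1.0 - (1.0 - d / m) ** n
        yield d, r, c


def best_tradeoff(n: int, m: int):
    """Return the (d, r, c) triple with the largest min(r, c)."""
    return max(finite_levels(n, m), key=lambda t: min(t[1], t[2]))


for m in (4, 5):
    d, r, c = best_tradeoff(3, m)
    print(f"m={m}: d={d}, r={r:.3f}, c={c:.3f}, assurance={min(r, c):.3f}")
# m=4: d=2, r=0.714, c=0.875, assurance=0.714
# m=5: d=2, r=0.750, c=0.784, assurance=0.750
```

Only `m + 1` reliability levels are available for each `m`, so the best achievable minimum can move up or down as `m` changes.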

With infinite remaining samples, it is possible to compute confidence at any level of reliability. Therefore, assurance is where the trade-off is zero, i.e., confidence is equal to reliability. You can see this behavior in the third plot below. Note the inverted x axis to keep consistency with the axes in the first two plots. 

In [None]:
def plot_assurance(n, f, m):
    all_cr = [conf_fin(n, f, m, d) for d in range(m+1)]
    plt.plot([c*100 for c,r in all_cr], '*', label='confidence')
    plt.plot([r*100 for c,r in all_cr], '^', label='reliability')
    plt.bar(range(m+1), [min(c,r)*100 for c,r in all_cr], color='cyan', alpha=0.5, width=0.5)
    plt.xlabel(f'Failures in {m} additional samples')
    plt.ylabel(f'assurance at {f} failures in {n} tests')
    plt.grid()
    plt.legend()
    plt.show()
    print([f"(c={c*100:.1f},r={r*100:.1f})" for c,r in all_cr])

n = 3
f = 0
plot_assurance(n, f, 4)
plot_assurance(n, f, 5)

all_r = range(100)
all_c = [confidence(n, f, r/100)*100 for r in all_r]
plt.plot(all_r, all_c, label='confidence')
plt.plot(all_r, all_r, label='reliability')

plt.fill_between(all_r, [min(r,c) for r,c in zip(all_r, all_c)], color='cyan', alpha=0.5)
plt.grid()
plt.legend()
plt.xlabel('reliability')
plt.ylabel(f'assurance at {f} failures in {n} tests')
plt.gca().invert_xaxis()
plt.show()