Skip to content

Kukuster/CI_methods_analyser

Repository files navigation

CI methods analyser

A toolkit for measuring the efficacy of various methods for calculating a confidence interval. Currently provides with a toolkit for measuring the efficacy of methods for a confidence interval for the following statistics:

  • proportion
  • difference between two proportions

This library was mainly inspired by the library: "Five Confidence Intervals for Proportions That You Should Know About" by Dr. Dennis Robert

Installation

https://pypi.org/project/CI-methods-analyser/

Usage

Testing Wald Interval - a popular method for calculating confidence interval for proportion

Wald Interval is defined as so:

$$ (w^-, w^+) = p\,\pm\,z\sqrt{\frac{p(1-p)}{n}} $$
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

# take an already implemented method for calculating CI for proportions
wald_interval = methods_for_CI_for_proportion.wald_interval

# initialize the toolkit
wald_interval_test_toolkit = toolkit(
    method=wald_interval, method_name="Wald Interval")


# calculate the real coverage that the method produces
# for each case of a true population proportion (taken from the list `proportions`)
wald_interval_test_toolkit.calculate_coverage_analytically(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95)
# now you can access the calculated coverage and a few statistics:
# wald_interval_test_toolkit.coverage  # 1-d array of 0-100, the same shape as passed `proportions`
# wald_interval_test_toolkit.average_coverage  # np.longdouble 0-100, avg of `coverage`
# wald_interval_test_toolkit.average_deviation  # np.longdouble 0-100, avg abs diff w/ `confidence`

# plots the calculated coverage in a matplotlib.pyplot figure
wald_interval_test_toolkit.plot_coverage(
    plt_figure_title="Wald Interval coverage")
# you can access the figure here:
# wald_interval_test_toolkit.figure

# shows the figure (non-blocking)
wald_interval_test_toolkit.show_plot()

# because show_plot() is non-blocking,
# you have to pause the execution in order for the figure to be rendered completely
input('press Enter to exit')

This will output the image:

Wald Interval - real coverage

The plot indicates overall bad performance of the method and particularly poor performance for extreme proportions.


You really might want to use a different method. Check out this wonderful medium.com article by Dr. Dennis Robert:



The shortcut function calculate_coverage_and_show_plot will yield the equivalent calculation and render the same picture:

from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage"
)


input('press Enter to exit')

I personally prefer night light-friendly styling:

from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion


toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage",
    theme='dark_background', plot_color="green", line_color="orange"
)


input('press Enter to exit')

Wald Interval - real coverage (dark theme)


Testing custom method for CI for proportion

You can implement your own methods and test them:

from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache

# not a particularly good method for calculating CI for proportion
@lru_cache(100_000)
def im_telling_ya_test(x: int, n: int, conflevel: float = 0.95):
    z = normal_z_score_two_tailed(conflevel)

    p = float(x)/n
    return (
        p - 0.02*z,
        p + 0.02*z
    )


toolkit(
    method=im_telling_ya_test, method_name='"I\'m telling ya" test'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"I\'m telling ya" coverage',
    theme='dark_background', plot_color="green", line_color="orange"
)


input('press Enter to exit')

"I'm telling ya" test - real coverage

from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache

# you could say, this method is "too good"
@lru_cache(100_000)
def God_is_my_witness_score(x: int, n: int, conflevel: float = 0.95):
    z = normal_z_score_two_tailed(conflevel)

    p = float(x)/n
    return (
        (0 + p)/2 - 0.005*z,
        (1 + p)/2 + 0.005*z
    )


toolkit(
    method=God_is_my_witness_score, method_name='"God is my witness" score'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"God is my witness" score coverage', theme='dark_background'
)

input('press Enter to exit')

"God is my witness" score - real coverage

Testing methods for CI for the difference between two proportions

Let's use the implemented Pooled Z test:

$$ (\delta^-, \delta^+) = \hat{p}_T - \hat{p}_C \pm z_{\alpha}\sqrt{\bar{p}(1-\bar{p})(\frac{1}{n_T}+\frac{1}{n_C})} $$
, where:
$$ \bar{p} = \frac{n_T\hat{p}_T + n_C\hat{p}_C}{n_T + n_C} $$
from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods


toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
)

input('press Enter to exit')

Z test (unpooled) - real coverage

As you can see, this test is generally very good for close proportions, unless proportions have extreme values [purple]

Also, this test is extremely concervative for the high and extreme differences between two proportions, i.e. for proportions which values a far apart [green]


You may want to change the color palette (although I wouldn't):

from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods


toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
    colors=("gray", "purple", "white", "orange", "#d62728")
)

input('press Enter to exit')

Z test (unpooled) - real coverage



NOTES

Methods for measuring efficacy of CI methods

Two ways can be used to calculate the efficacy of CI methods:

  • approximately, with random simulation (as implemented in R by Dr. Dennis Robert, see link above). Here: calculate_coverage_randomly
  • precisely, with the analytical solution. Here: calculate_coverage_analytically

Both methods are implemented for CI for both statistics: proportion, and difference between two proportions. For the precise analytical solution, an optimization was made. Theoretically, it is lossy, but practically the error is always negligible (as proven by test_z_precision_difference.py). Optimization is regulated with the parameter z_precision and it is automatically estimated by default.


Various links

1. Equivalence and Noninferiority Testing (as I understand, are fancy terms for 2-sided and 1-sided p tests for the difference between two proportions)

2. Biostatistics course (Dr. Nicolas Padilla Raygoza, et al.)

3. Using z-test instead of a binomial test: