SUMM/ENH/Overview: two sample proportion, independent #4828

josef-pkt · 2018-08-03T13:46:42Z

update: basic non-exact methods merged in #2605, original PR #4829, issue #2605

two sample binomial proportion analogue to Poisson rates
#2718, https://github.com/josef-pkt/misc/wiki/Exact-and-Asymptotic-Tests---Rates,-Poisson
and one-sample proportion
https://github.com/josef-pkt/misc/blob/7fb0160bb7f8fd5eef00e276c61ec7fbe2e20238/notebooks/proportion_one_power.ipynb

#3954 more general issue, rates, proportion, paired sample, ...

results

hypothesis test, pvalue
confidence interval
power and sample size

comparisons/hypothesis

difference in proportion, support non-zero value
(risk) ratio
odds ratio
equality value=0 for diff, or value=1 for ratios
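
A minimal sketch of the three comparisons, using only the Python standard library. The function name `effect_measures` is hypothetical and not part of any existing API. The intervals are the simplest large-sample ones (Wald for the difference, Katz log for the ratio, Woolf logit for the odds ratio), and the sketch assumes no zero cells.

```python
from math import exp, sqrt
from statistics import NormalDist


def effect_measures(count1, nobs1, count2, nobs2, alpha=0.05):
    """Point estimates and simple large-sample intervals (hypothetical helper).

    Assumes all four cells of the 2x2 table are non-zero.
    """
    p1, p2 = count1 / nobs1, count2 / nobs2
    z = NormalDist().inv_cdf(1 - alpha / 2)

    # risk difference with Wald interval, null value is 0
    diff = p1 - p2
    se_diff = sqrt(p1 * (1 - p1) / nobs1 + p2 * (1 - p2) / nobs2)
    ci_diff = (diff - z * se_diff, diff + z * se_diff)

    # risk ratio with Katz log interval, null value is 1
    ratio = p1 / p2
    se_log_ratio = sqrt(1 / count1 - 1 / nobs1 + 1 / count2 - 1 / nobs2)
    ci_ratio = (ratio * exp(-z * se_log_ratio), ratio * exp(z * se_log_ratio))

    # odds ratio with Woolf logit interval, null value is 1
    fail1, fail2 = nobs1 - count1, nobs2 - count2
    odds_ratio = (count1 * fail2) / (count2 * fail1)
    se_log_or = sqrt(1 / count1 + 1 / fail1 + 1 / count2 + 1 / fail2)
    ci_or = (odds_ratio * exp(-z * se_log_or), odds_ratio * exp(z * se_log_or))

    return {
        "diff": (diff, ci_diff),
        "ratio": (ratio, ci_ratio),
        "odds-ratio": (odds_ratio, ci_or),
    }


if __name__ == "__main__":
    for name, (est, ci) in effect_measures(15, 50, 10, 50).items():
        print(name, round(est, 4), tuple(round(b, 4) for b in ci))
```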

Basic version needs two-sided and one-sided alternatives.
TOST equivalence can reuse the one-sided tests with value != 0.
Equality tests can take advantage of permutation or similar "exact" methods, e.g. Fisher's exact test.
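
To illustrate how TOST reuses the one-sided tests, here is a rough sketch with a plain Wald z-test for the difference. The names `ztest_diff` and `tost_diff` and their signatures are made up for this example, and the sketch assumes the unpooled Wald standard error is non-zero.

```python
from math import sqrt
from statistics import NormalDist


def ztest_diff(count1, nobs1, count2, nobs2, value=0.0, alternative="two-sided"):
    """Wald z-test of H0: p1 - p2 = value (hypothetical helper)."""
    p1, p2 = count1 / nobs1, count2 / nobs2
    # unpooled standard error, assumed to be strictly positive
    se = sqrt(p1 * (1 - p1) / nobs1 + p2 * (1 - p2) / nobs2)
    zstat = (p1 - p2 - value) / se
    norm = NormalDist()
    if alternative == "larger":
        pvalue = 1 - norm.cdf(zstat)
    elif alternative == "smaller":
        pvalue = norm.cdf(zstat)
    else:
        pvalue = 2 * (1 - norm.cdf(abs(zstat)))
    return zstat, pvalue


def tost_diff(count1, nobs1, count2, nobs2, low, upp):
    """Equivalence test: reject if the difference is inside (low, upp)."""
    # two one-sided tests, each with a non-zero null value
    _, pv_low = ztest_diff(count1, nobs1, count2, nobs2, low, "larger")
    _, pv_upp = ztest_diff(count1, nobs1, count2, nobs2, upp, "smaller")
    return max(pv_low, pv_upp), pv_low, pv_upp


if __name__ == "__main__":
    print(tost_diff(48, 100, 50, 100, low=-0.2, upp=0.2))
```

The TOST p-value is the larger of the two one-sided p-values, so equivalence is only declared if both one-sided nulls are rejected.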

methods

wald
score
exact and pseudo-/semi-exact

the list of methods differs by references, e.g. PASS/NCSS documentation,
Newcombe 1998 has 11 methods for diff confint
Fagerland et al. 2015 have 6-8 confint methods each for diff, ratio and odds ratio, not including Berger-Boos
https://doi.org/10.1080/10543406.2018.1452028 has 97 methods according to title for one-sided diff inference.

I guess we get at least 5 to 10 including options, e.g.

wald: with and without adjustments, e.g. add 0.5 or 1 observation. fast
score_test: with or without adjustments, e.g. small sample correction (nobs / (nobs - 1)) or skewness/Bartlett correction (Gart and Nam)
(semi-)exact: based on binomial distribution
- e-test: use MLE for nuisance parameter
- sup: sup/max over nuisance parameter
- berger-boos
- conditional only available for odds-ratio cases (analogue to fisher's exact test, both margins fixed)
  most likely I ignore this case for the general functions
- variants for which test statistic is used to rank cases, e.g. Agresti-Min (one two-sided score test) versus two one-sided score tests
- mid-p variants
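
A rough sketch of the (semi-)exact idea for the equality null, using the pooled z statistic to rank outcomes. `nuisance="mle"` plugs in the pooled estimate (e-test style), `nuisance="sup"` takes the maximum over a grid for the common proportion. All function names and the grid search are assumptions for illustration. A real implementation would need a finer or adaptive search, and this brute-force enumeration is only practical for small samples.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def z_pooled(x1, n1, x2, n2):
    """Pooled-variance z statistic for H0: p1 == p2; 0 if the variance is 0."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) / sqrt(var)


def tail_prob(p, x1, n1, x2, n2):
    """P(|Z| >= |z_obs|) if both samples are binomial with common p."""
    z_obs = abs(z_pooled(x1, n1, x2, n2))
    pmf1 = [binom_pmf(i, n1, p) for i in range(n1 + 1)]
    pmf2 = [binom_pmf(j, n2, p) for j in range(n2 + 1)]
    return sum(
        pmf1[i] * pmf2[j]
        for i in range(n1 + 1)
        for j in range(n2 + 1)
        # small tolerance so that ties with the observed statistic count
        if abs(z_pooled(i, n1, j, n2)) >= z_obs - 1e-12
    )


def pvalue_exact_2indep(x1, n1, x2, n2, nuisance="sup", n_grid=200):
    """Two-sided p-value for H0: p1 == p2 (hypothetical helper)."""
    if nuisance == "mle":
        # e-test style: evaluate at the pooled MLE of the common proportion
        return tail_prob((x1 + x2) / (n1 + n2), x1, n1, x2, n2)
    # sup/max over an interior grid for the nuisance parameter
    return max(tail_prob(g / n_grid, x1, n1, x2, n2) for g in range(1, n_grid))


if __name__ == "__main__":
    print(pvalue_exact_2indep(7, 10, 2, 10, nuisance="mle"))
    print(pvalue_exact_2indep(7, 10, 2, 10, nuisance="sup"))
```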

What's available in various issues/PRs and what's missing?

confint
#2605 issue, josef-pkt#5 score_confint code from PropCI
and code in issue comments

#2608 PR hypothesis tests for equality, including exact, berger-boos,
#2607 issue for exact

model analogue, GLM
score and profile loglike confint
#4798 (comment)
#1791

code structure, API

public
confint_proportion_2indep: draft version, uncommitted
test_proportion_2indep, ? not written yet, maybe as analogue to confint function
???

internal

function for constrained estimation in 2 independent sample case: mainly Miettinen and Nurminen 1985
test statistics
ztest_generic usage
confint explicit
confint numerical inversion and root finding

classes or functions, e.g. #2608 uses class for exact and berger-boos
confint_proportion_2indep is a single function with code for wald confint inside, but needs to delegate for score and exact confints
more complex confint will need tests as helper functions, for simple cases we can compute confint directly.
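
A sketch of the "test as helper function" structure for a score confint of the difference. The constrained estimate is found numerically here (ternary search on the concave log-likelihood) instead of with the closed-form Miettinen and Nurminen solution, and the variance omits the nobs / (nobs - 1) correction. The names `constrained_p2`, `score_z` and `confint_score_diff` are hypothetical. The inversion relies on the score statistic being decreasing in the hypothesized difference.

```python
from math import log, sqrt
from statistics import NormalDist

EPS = 1e-12


def _xlogy(x, y):
    # x * log(y) with the convention 0 * log(y) = 0
    return 0.0 if x == 0 else x * log(y)


def constrained_p2(diff, x1, n1, x2, n2):
    """MLE of p2 under the constraint p1 - p2 = diff."""
    lo = max(0.0, -diff) + EPS
    hi = min(1.0, 1.0 - diff) - EPS

    def loglike(p2):
        p1 = p2 + diff
        return (_xlogy(x1, p1) + _xlogy(n1 - x1, 1 - p1)
                + _xlogy(x2, p2) + _xlogy(n2 - x2, 1 - p2))

    for _ in range(100):  # ternary search, loglike is concave in p2
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if loglike(m1) < loglike(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def score_z(diff, x1, n1, x2, n2):
    """Score statistic for H0: p1 - p2 = diff, variance at constrained MLE."""
    p2 = constrained_p2(diff, x1, n1, x2, n2)
    p1 = p2 + diff
    var = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return (x1 / n1 - x2 / n2 - diff) / sqrt(var)


def confint_score_diff(x1, n1, x2, n2, alpha=0.05):
    """Confidence interval by numerical inversion of the score test."""
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    bound = 1 - 1e-8
    dhat = min(max(x1 / n1 - x2 / n2, -bound), bound)

    def solve(target, lo, hi):
        # bisection, assumes score_z is decreasing in diff
        for _ in range(60):
            mid = (lo + hi) / 2
            if score_z(mid, x1, n1, x2, n2) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    return solve(zcrit, -bound, dhat), solve(-zcrit, dhat, bound)


if __name__ == "__main__":
    print(confint_score_diff(15, 50, 10, 50))
```

For the simple Wald case the interval can be computed directly, so only score and exact intervals would go through a helper like `score_z`.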

power, sample size: I haven't looked yet. NCSS uses exact computation for asymptotic/approximate tests and confint, at least for small samples
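
As an illustration of exact computation for an asymptotic test, the following sketch enumerates all outcomes and sums the binomial probabilities of those where a two-sided pooled z-test rejects. This is my reading of the idea, not a reproduction of the NCSS algorithm. The names `rejects` and `exact_power` are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejects(x1, n1, x2, n2, zcrit):
    """Two-sided pooled z-test of H0: p1 == p2."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return False  # all successes or all failures, no rejection
    return abs(x1 / n1 - x2 / n2) / sqrt(var) > zcrit


def exact_power(p1, p2, n1, n2, alpha=0.05):
    """Rejection probability of the z-test under true proportions p1, p2."""
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    pmf1 = [binom_pmf(i, n1, p1) for i in range(n1 + 1)]
    pmf2 = [binom_pmf(j, n2, p2) for j in range(n2 + 1)]
    return sum(
        pmf1[i] * pmf2[j]
        for i in range(n1 + 1)
        for j in range(n2 + 1)
        if rejects(i, n1, j, n2, zcrit)
    )


if __name__ == "__main__":
    # with p1 == p2 this is the actual size of the nominal 5% test
    print("size :", exact_power(0.3, 0.3, 30, 30))
    print("power:", exact_power(0.6, 0.3, 50, 50))
```

Calling it with p1 == p2 gives the actual size of the test, which is a cheap way to see how far an approximate test is from its nominal level in small samples.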


josef-pkt · 2018-08-03T23:20:52Z

overview table from Fagerland et al for confidence intervals

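Table 8. Availability of confidence intervals in standard software packages

Difference between proportions

| Confidence interval | Type | Closed form (a) | CIA 2.1 | R 2.12 | SAS 9.2 | SPSS 18 | Stata 11 | StatXact 9 |
|---|---|---|---|---|---|---|---|---|
| Wald | approximate | ✓ | ✓ | ✓ | ✓ | ✓ (b) | ✓ | – |
| Wald with continuity correction | approximate | ✓ | – | ✓ | ✓ | – | – | – |
| Agresti–Caffo (3) | approximate | ✓ | – | ✓ (c) | – | – | ✓ (d) | – |
| Newcombe hybrid score (2) | approximate | ✓ | ✓ | ✓ (c) | ✓ | – | ✓ (d) | – |
| Miettinen–Nurminen asymptotic score | approximate | – | – | ✓ (e) | – | – | ✓ (d) | ✓ |
| Santner–Snell exact unconditional | exact | – | – | – | ✓ | – | – | ✓ |
| Chan–Zhang exact unconditional | exact | – | – | – | – | – | – | ✓ |
| Agresti–Min exact unconditional (1) | exact | – | – | – | – | – | – | ✓ |

Ratio of proportions

| Confidence interval | Type | Closed form (a) | CIA 2.1 | R 2.12 | SAS 9.2 | SPSS 18 | Stata 11 | StatXact 9 |
|---|---|---|---|---|---|---|---|---|
| Katz log | approximate | ✓ | ✓ | – | ✓ | ✓ | ✓ | – |
| Adjusted log | approximate | ✓ | – | ✓ (c) | – | – | – | – |
| Inverse hyperbolic sine | approximate | ✓ | – | – | – | – | – | – |
| Koopman asymptotic score (1) | approximate | – | – | ✓ (c, e) | – | – | ✓ (f) | ✓ |
| Chan–Zhang exact unconditional | exact | – | – | – | – | – | – | ✓ |
| Agresti–Min exact unconditional (2) | exact | – | – | – | – | – | – | ✓ |

OR

| Confidence interval | Type | Closed form (a) | CIA 2.1 | R 2.12 | SAS 9.2 | SPSS 18 | Stata 11 | StatXact 9 |
|---|---|---|---|---|---|---|---|---|
| Woolf logit | approximate | ✓ | ✓ | – | ✓ | ✓ | ✓ | ✓ |
| Gart adjusted logit (3) | approximate | ✓ | – | ✓ (c) | – | – | ✓ (g) | – |
| Independent-smoothed logit (4) | approximate | ✓ | – | – | – | – | ✓ (g) | – |
| Cornfield mid-p | approximate | – | – | – | – | – | – | – |
| Baptista–Pike mid-p (1) | approximate | – | – | – | – | – | – | – |
| Cornfield exact conditional | exact | – | – | ✓ | ✓ | – | ✓ | ✓ |
| Baptista–Pike exact conditional | exact | – | – | – | – | – | – | – |
| Agresti–Min exact unconditional (2) | exact | – | – | – | – | – | – | – |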

josef-pkt · 2018-08-03T23:55:28Z

Difference between proportions
Approximate intervals

sm	Fagerland et al	other
ci t	Wald
--	Wald with continuity correction
ci t	Agresti–Caffo (3)
x --	Newcombe hybrid score (2)
-- t	Miettinen–Nurminen asymptotic score
Exact intervals
--	--	Berger-Boos
--	Santner–Snell exact unconditional
--	Chan–Zhang exact unconditional
--	Agresti-Min exact unconditional (1)

Ratio of proportions

sm	Fagerland	other
Approximate intervals
ci t	Katz log
ci t	Adjusted log
ci t	Inverse hyperbolic sine
ci t	Koopman asymptotic score (1)	Miettinen-Nurminen ?
Exact intervals
--	--	Berger-Boos
--	Chan–Zhang exact unconditional
--	Agresti–Min exact unconditional (2)

OR

sm	Fagerland	other
Approximate intervals
ci t	Woolf logit
ci t	Gart adjusted logit (3)
ci t	Independent-smoothed logit (4)
-- t	---	score Miettinen-Nurminen
Approximate exact intervals
--	Cornfield mid-p
--	Baptista–Pike mid-p
Exact intervals
--	--	Berger-Boos
--	Cornfield exact conditional
--	Baptista–Pike exact conditional
--	Agresti–Min exact unconditional (2)

josef-pkt · 2020-02-14T18:30:18Z

another reference
Prendergast, Luke A., and Robert G. Staudte. 2014. “Better than You Think: Interval Estimators of the Difference of Binomial Proportions.” Journal of Statistical Planning and Inference 148 (May): 38–48. https://doi.org/10.1016/j.jspi.2013.11.012.

Monte Carlo to compare 5 confidence intervals for diff of two proportions.

josef-pkt · 2020-02-29T19:26:40Z

parking a reference, correction that found a mistake in the Monte Carlo
I don't remember Farrington and Manning 1990

Schoder, Volker. 2002. “Test Statistics and Sample Size Formulae for Comparative Binomial Trials with Null Hypothesis of Non-Zero Risk Difference or Non-Unity Relative Risk by C. P. Farrington and G. Manning, Statistics in Medicine 1990; 9:1447–1454.” Statistics in Medicine 21 (13): 1958–60. https://doi.org/10.1002/sim.1242.

josef-pkt · 2021-12-24T22:06:33Z

I just saw a new book

Pradhan, Vivek, Ashis Gangopadhyay, Sandeep Menon, Cynthia Basu, and Tathagata Banerjee. 2021. Confidence Intervals for Discrete Data in Clinical Research. Boca Raton: Chapman and Hall/CRC. https://doi.org/10.1201/9781315169859.

However, based on TOC, it has less than what we already have
1 and 2 sample binomial, only 1 sample rate (counts)

possibly what we don't fully have yet: 2 sample paired proportions (AFAIR, that's just McNemar, but I don't know if we include confint, it should reduce just to one sample confint for the test statistic)

I didn't check how much the book covers "exact" methods.

josef-pkt · 2022-01-30T19:23:58Z

power for 2-sample equivalence tests

https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Equivalence_Tests_for_the_Difference_Between_Two_Proportions.pdf

josef-pkt · 2022-01-30T19:47:56Z

overview tables for proportions #6399

josef-pkt · 2022-03-22T17:41:12Z

our current equivalence tests for proportions are individual functions

This article compares several methods in a Monte Carlo study

Barker, Lawrence, Henry Rolka, Deborah Rolka, and Cedric Brown. 2001. “Equivalence Testing for Binomial Random Variables.” The American Statistician 55 (4): 279–87. https://doi.org/10.1198/000313001753272213.

(we need more systematic functions in proportions)
that's not correct for 2indep proportions: we have test, confint and tost, but power only for a limited case

josef-pkt added type-enh comp-stats labels Aug 3, 2018

josef-pkt added this to the 0.10 milestone Aug 3, 2018

josef-pkt mentioned this issue Aug 3, 2018

ENH: add two independent proportion inference #4829

Closed

josef-pkt mentioned this issue Aug 12, 2018

ENH: power, sample size for 2 sample negative binomial rates #4882

Open

josef-pkt mentioned this issue Nov 16, 2018

t-test in scipy.stats: combine tests? scipy/scipy#9485

Closed

josef-pkt mentioned this issue Feb 1, 2019

ENH: proportions_ztest, variance options, vectorize #3720

Open

josef-pkt mentioned this issue May 4, 2020

ENH: confidence intervals for two proportions, difference, odds ratio and risk ratio #2605

Closed

josef-pkt mentioned this issue Jan 30, 2022

Bug in /statsmodels/stats/proportion.py? #8049

Closed

josef-pkt mentioned this issue Feb 18, 2022

ENH: 2 sample comparison of poisson rates, including exact #2718

Open

josef-pkt mentioned this issue Aug 28, 2022

ENH: stats for cluster randomized trials - count, binomial #8390

Open
