SUMM/ENH/Overview: two sample proportion, independent #4828

Open
josef-pkt opened this issue Aug 3, 2018 · 8 comments

@josef-pkt

josef-pkt commented Aug 3, 2018

update: basic non-exact methods merged in #2605, original PR #4829, issue #2605


two sample binomial proportion analogue to Poisson rates
#2718, https://github.com/josef-pkt/misc/wiki/Exact-and-Asymptotic-Tests---Rates,-Poisson
and one-sample proportion
https://github.com/josef-pkt/misc/blob/7fb0160bb7f8fd5eef00e276c61ec7fbe2e20238/notebooks/proportion_one_power.ipynb

#3954 more general issue, rates, proportion, paired sample, ...

results

  • hypothesis test, pvalue
  • confidence interval
  • power and sample size

comparisons/hypothesis

  • difference in proportion, support non-zero value
  • (risk) ratio
  • odds ratio
  • equality value=0 for diff, or value=1 for ratios

Basic version needs two-sided and one-sided alternatives.
TOST equivalence can reuse the one-sided tests with value != 0.
Equality tests can take advantage of permutation or similar "exact" methods, e.g. Fisher's exact test.
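
A minimal sketch of how TOST can reuse a one-sided test with a non-zero value, using the plain Wald statistic for the difference. The function names `ztest_diff_wald` and `tost_diff_wald` are made up for this illustration and are not the proposed API; the alternative names "larger" and "smaller" are an assumption of the sketch.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()


def ztest_diff_wald(count1, nobs1, count2, nobs2, value=0.0,
                    alternative="two-sided"):
    """Wald z-test of H0: p1 - p2 = value, using the unpooled variance.

    Hypothetical helper; it does not guard against a zero standard error
    (both sample proportions equal to 0 or 1).
    """
    p1, p2 = count1 / nobs1, count2 / nobs2
    se = sqrt(p1 * (1 - p1) / nobs1 + p2 * (1 - p2) / nobs2)
    z = (p1 - p2 - value) / se
    if alternative == "two-sided":
        pvalue = 2 * (1 - _norm.cdf(abs(z)))
    elif alternative == "larger":    # H1: p1 - p2 > value
        pvalue = 1 - _norm.cdf(z)
    elif alternative == "smaller":   # H1: p1 - p2 < value
        pvalue = _norm.cdf(z)
    else:
        raise ValueError("alternative not recognized")
    return z, pvalue


def tost_diff_wald(count1, nobs1, count2, nobs2, low, upp):
    """Equivalence test: reject if both one-sided tests reject."""
    _, p_low = ztest_diff_wald(count1, nobs1, count2, nobs2,
                               value=low, alternative="larger")
    _, p_upp = ztest_diff_wald(count1, nobs1, count2, nobs2,
                               value=upp, alternative="smaller")
    return max(p_low, p_upp), p_low, p_upp


z, pvalue = ztest_diff_wald(56, 70, 48, 80)
p_tost, p_low, p_upp = tost_diff_wald(56, 70, 48, 80, low=-0.4, upp=0.4)
```

The TOST p-value is the larger of the two one-sided p-values, so any one-sided test that supports a non-zero null value could be plugged in the same way.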

methods

  • wald
  • score
  • exact and pseudo-/semi-exact

the list of methods differs by reference, e.g. PASS/NCSS documentation.
Newcombe 1998 has 11 methods for diff confint.
Fagerland et al. 2015 have 6-8 methods for confint for each of diff, ratio and odds ratio, not including Berger-Boos.
https://doi.org/10.1080/10543406.2018.1452028 has 97 methods, according to its title, for one-sided diff inference.

I guess we get at least 5 to 10 including options, e.g.

  • wald: with and without adjustments, e.g. add 0.5 or 1 observation. fast
  • score_test: with or without adjustments, e.g. small sample correction (nobs / (nobs - 1)) or skewness/Bartlett correction (Gart and Nam)
  • (semi-)exact: based on binomial distribution
    • e-test: use MLE for nuisance parameter
    • sup: sup/max over the nuisance parameter
    • Berger-Boos
    • conditional: only available for odds-ratio cases (analogue to Fisher's exact test, both margins fixed);
      most likely I will ignore this case for the general functions
    • variants for which test statistic is used to rank cases, e.g. Agresti-Min (one two-sided score test) versus two one-sided score tests
    • mid-p variants
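
To make the difference between the e-test and the sup variant concrete, here is a rough sketch for the equality hypothesis p1 == p2. It ranks tables by the pooled z statistic and evaluates the exact rejection probability either at the MLE of the common proportion (e-test) or maximized over a grid (sup). The function names, the grid size and the choice of ranking statistic are assumptions of this sketch, not the code in the PRs.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pooled_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 == p2 (0 if the variance is zero)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) / sqrt(var)


def exact_pvalues_2indep(x1, n1, x2, n2, n_grid=199):
    """Return (e-test p-value, sup p-value) for the two-sided equality test."""
    z_obs = abs(pooled_z(x1, n1, x2, n2))
    # tables that are at least as extreme as the observed one
    region = [(k1, k2)
              for k1 in range(n1 + 1) for k2 in range(n2 + 1)
              if abs(pooled_z(k1, n1, k2, n2)) >= z_obs - 1e-12]

    def pvalue_at(p):
        # probability of the region if both samples are Binomial(n_i, p)
        pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
        pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
        return sum(pmf1[k1] * pmf2[k2] for k1, k2 in region)

    p_mle = (x1 + x2) / (n1 + n2)   # nuisance parameter estimate under H0
    grid = [i / (n_grid + 1) for i in range(1, n_grid + 1)]
    pval_e = pvalue_at(p_mle)
    pval_sup = max(pvalue_at(p) for p in grid + [p_mle])
    return pval_e, pval_sup


pval_e, pval_sup = exact_pvalues_2indep(7, 10, 2, 10)
```

By construction the sup p-value is never smaller than the e-test p-value. The grid search is brute force and only meant for small samples.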

What's available in various issues/PRs and what's missing?

confint
#2605 issue, josef-pkt#5 score_confint code from PropCI
and code in issue comments

#2608 PR hypothesis tests for equality, including exact, berger-boos,
#2607 issue for exact

model analogue, GLM
score and profile loglike confint
#4798 (comment)
#1791

code structure, API

public
confint_proportion_2indep: draft version, uncommitted
test_proportion_2indep: not written yet, maybe as analogue to the confint function
???

internal

  • function for constrained estimation in 2 independent sample case: mainly Miettinen and Nurminen 1985
  • test statistics
  • ztest_generic usage
  • confint explicit
  • confint numerical inversion and root finding

classes or functions: e.g. #2608 uses a class for exact and berger-boos.
confint_proportion_2indep is a single function with the code for the wald confint inside, but it needs to delegate for score and exact confints.
More complex confints will need tests as helper functions; for simple cases we can compute the confint directly.

power, sample size: I haven't looked yet. NCSS uses exact computation for asymptotic/approximate tests and confint, at least for small samples
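
For reference, exact computation of power for an asymptotic test can be sketched by enumerating all possible tables and summing the binomial probabilities of those that the test rejects. This is my reading of the general idea, not NCSS's algorithm; the pooled z-test and the function names are assumptions of the sketch.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejects_pooled_z(x1, n1, x2, n2, zcrit):
    """Two-sided pooled z-test of p1 == p2; never rejects if variance is 0."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return False
    return abs(x1 / n1 - x2 / n2) / sqrt(var) >= zcrit


def exact_power(p1, n1, p2, n2, alpha=0.05):
    """Exact rejection probability of the asymptotic test at (p1, p2)."""
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    pmf1 = [binom_pmf(k, n1, p1) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p2) for k in range(n2 + 1)]
    return sum(pmf1[x1] * pmf2[x2]
               for x1 in range(n1 + 1) for x2 in range(n2 + 1)
               if rejects_pooled_z(x1, n1, x2, n2, zcrit))


size = exact_power(0.5, 30, 0.5, 30)        # actual size under H0
power_small = exact_power(0.8, 20, 0.4, 20)
power_large = exact_power(0.8, 60, 0.4, 60)
```

Evaluating the same function at p1 == p2 gives the actual size of the test, which is how the approximate methods can be checked in small samples.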

@josef-pkt

josef-pkt commented Aug 3, 2018

overview table from Fagerland et al for confidence intervals

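Table 8. Availability of confidence intervals in standard software packages

| Confidence interval | Closed form (a) | CIA 2.1 | R 2.12 | SAS 9.2 | SPSS 18 | Stata 11 | StatXact 9 |
|---|---|---|---|---|---|---|---|
| Difference between proportions | | | | | | | |
| Approximate intervals | | | | | | | |
| Wald | ✓ | ✓ | ✓ | ✓ | ✓ (b) | ✓ | – |
| Wald with continuity correction | ✓ | – | ✓ | ✓ | – | – | – |
| Agresti–Caffo (3) | ✓ | – | ✓ (c) | – | – | ✓ (d) | – |
| Newcombe hybrid score (2) | ✓ | ✓ | ✓ (c) | ✓ | – | ✓ (d) | – |
| Miettinen–Nurminen asymptotic score | – | – | ✓ (e) | – | – | ✓ (d) | ✓ |
| Exact intervals | | | | | | | |
| Santner–Snell exact unconditional | – | – | – | ✓ | – | – | ✓ |
| Chan–Zhang exact unconditional | – | – | – | – | – | – | ✓ |
| Agresti–Min exact unconditional (1) | – | – | – | – | – | – | ✓ |
| Ratio of proportions | | | | | | | |
| Approximate intervals | | | | | | | |
| Katz log | ✓ | ✓ | – | ✓ | ✓ | ✓ | – |
| Adjusted log | ✓ | – | ✓ (c) | – | – | – | – |
| Inverse hyperbolic sine | ✓ | – | – | – | – | – | – |
| Koopman asymptotic score (1) | – | – | ✓ (c,e) | – | – | ✓ (f) | ✓ |
| Exact intervals | | | | | | | |
| Chan–Zhang exact unconditional | – | – | – | – | – | – | ✓ |
| Agresti–Min exact unconditional (2) | – | – | – | – | – | – | ✓ |
| OR | | | | | | | |
| Approximate intervals | | | | | | | |
| Woolf logit | ✓ | ✓ | – | ✓ | ✓ | ✓ | ✓ |
| Gart adjusted logit (3) | ✓ | – | ✓ (c) | – | – | ✓ (g) | – |
| Independent-smoothed logit (4) | ✓ | – | – | – | – | ✓ (g) | – |
| Cornfield mid-p | – | – | – | – | – | – | – |
| Baptista–Pike mid-p (1) | – | – | – | – | – | – | – |
| Exact intervals | | | | | | | |
| Cornfield exact conditional | – | – | ✓ | ✓ | – | ✓ | ✓ |
| Baptista–Pike exact conditional | – | – | – | – | – | – | – |
| Agresti–Min exact unconditional (2) | – | – | – | – | – | – | – |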


@josef-pkt

josef-pkt commented Aug 3, 2018

Difference between proportions

| sm | Fagerland et al | other |
|---|---|---|
| | Approximate intervals | |
| ci t | Wald | |
| -- | Wald with continuity correction | |
| ci t | Agresti–Caffo (3) | |
| x -- | Newcombe hybrid score (2) | |
| -- t | Miettinen–Nurminen asymptotic score | |
| | Exact intervals | |
| -- -- | | Berger-Boos |
| -- | Santner–Snell exact unconditional | |
| -- | Chan–Zhang exact unconditional | |
| -- | Agresti–Min exact unconditional (1) | |

Ratio of proportions

| sm | Fagerland | other |
|---|---|---|
| | Approximate intervals | |
| ci t | Katz log | |
| ci t | Adjusted log | |
| ci t | Inverse hyperbolic sine | |
| ci t | Koopman asymptotic score (1) | Miettinen-Nurminen ? |
| | Exact intervals | |
| -- -- | | Berger-Boos |
| -- | Chan–Zhang exact unconditional | |
| -- | Agresti–Min exact unconditional (2) | |

OR

| sm | Fagerland | other |
|---|---|---|
| | Approximate intervals | |
| ci t | Woolf logit | |
| ci t | Gart adjusted logit (3) | |
| ci t | Independent-smoothed logit (4) | |
| -- t | --- | score Miettinen-Nurminen |
| | Approximate exact intervals | |
| -- | Cornfield mid-p | |
| -- | Baptista–Pike mid-p | |
| | Exact intervals | |
| -- -- | | Berger-Boos |
| -- | Cornfield exact conditional | |
| -- | Baptista–Pike exact conditional | |
| -- | Agresti–Min exact unconditional (2) | |

@josef-pkt

another reference
Prendergast, Luke A., and Robert G. Staudte. 2014. “Better than You Think: Interval Estimators of the Difference of Binomial Proportions.” Journal of Statistical Planning and Inference 148 (May): 38–48. https://doi.org/10.1016/j.jspi.2013.11.012.

Monte Carlo to compare 5 confidence intervals for diff of two proportions.

@josef-pkt

parking a reference, correction that found a mistake in the Monte Carlo
I don't remember Farrington and Manning 1990

Schoder, Volker. 2002. “Test Statistics and Sample Size Formulae for Comparative Binomial Trials with Null Hypothesis of Non-Zero Risk Difference or Non-Unity Relative Risk by C. P. Farrington and G. Manning, Statistics in Medicine 1990; 9:1447–1454.” Statistics in Medicine 21 (13): 1958–60. https://doi.org/10.1002/sim.1242.

@josef-pkt

I just saw a new book

Pradhan, Vivek, Ashis Gangopadhyay, Sandeep Menon, Cynthia Basu, and Tathagata Banerjee. 2021. Confidence Intervals for Discrete Data in Clinical Research. Boca Raton: Chapman and Hall/CRC. https://doi.org/10.1201/9781315169859.

However, based on TOC, it has less than what we already have
1 and 2 sample binomial, only 1 sample rate (counts)

possibly what we don't fully have yet: 2 sample paired proportions (AFAIR, that's just McNemar, but I don't know if we include confint, it should reduce just to one sample confint for the test statistic)
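
The reduction to a one-sample problem is easiest to see in the exact version of McNemar's test: conditional on the number of discordant pairs, the count in one discordant cell is Binomial(n, 0.5) under the null. A small sketch, where the function name and the doubling rule for the two-sided p-value are choices made for this illustration:

```python
from math import comb


def mcnemar_exact_pvalue(n12, n21):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Uses the one-sample binomial distribution with p = 0.5 and doubles
    the smaller tail, capped at 1.
    """
    n = n12 + n21
    k = min(n12, n21)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


pvalue = mcnemar_exact_pvalue(1, 9)
```

A confidence interval for the paired case could presumably start from the same conditional binomial and reuse a one-sample proportion confint.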

I didn't check how much the book covers "exact" methods.

@josef-pkt

overview tables for proportions #6399

@josef-pkt

josef-pkt commented Mar 22, 2022

our current equivalence tests for proportions are individual functions

This article compares several methods in a Monte Carlo study

Barker, Lawrence, Henry Rolka, Deborah Rolka, and Cedric Brown. 2001. “Equivalence Testing for Binomial Random Variables.” The American Statistician 55 (4): 279–87. https://doi.org/10.1198/000313001753272213.

(we need more systematic functions in proportions)
that's not correct for 2indep proportions: we have test, confint and tost, but power only for limited cases
