Sample versus population SD in DEBIT #64

Closed
lhdjung opened this issue Mar 5, 2024 · 1 comment

@lhdjung

lhdjung commented Mar 5, 2024

The two formulas differ only in the denominator of the variance: the population SD divides by N, whereas the sample SD divides by N - 1, which amounts to multiplying the population variance by N / (N - 1). The sample SD, as used in the preprint, is certainly more common, but it is not used exclusively, so that might be worth documenting as a possible reason for discrepancies?

Originally posted by @LukasWallrich in #61 (comment)
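
To make the size of the difference concrete, the relationship can be written as a single correction factor. This is only a sketch; the value $N = 20$ below is an arbitrary example, not taken from the preprint:

$$
SD_{\text{sample}} = \sqrt{\frac{N}{N - 1}} \times SD_{\text{population}}
$$

For $N = 20$, the factor is $\sqrt{20 / 19} \approx 1.026$, so the two SDs differ by about 2.6%. The gap shrinks as $N$ grows, which is why the choice matters most for small groups.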

@lhdjung

lhdjung commented Mar 5, 2024

For reference, the formulas for SDs of binary data are:

  • Sample SD (DEBIT preprint, p. 5): $\sqrt{ \frac{N}{N - 1} \times \frac{ab}{N^2} }$
    where $a$ and $b$ are the group sizes

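  • Population SD (@LukasWallrich here): $\sqrt{ M (1 - M) }$
    where $M = a / N$ is the mean, so that $M (1 - M) = \frac{ab}{N^2}$

Edit: population SD implementation:

# Population SD of binary data, computed from the mean alone.
# There is no division by n here: sqrt(mean * (1 - mean) / n)
# would be the standard error of the mean, not the SD.
sd_binary_population <- function(mean) {
  sqrt(mean * (1 - mean))
}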
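
The sample-versus-population difference can also be checked numerically against base R. The following is a minimal sketch, not the package's implementation: the helper names ending in `_sketch` are hypothetical, and the data (7 ones, 13 zeros) are made up for illustration.

```r
# Hypothetical helpers for illustration only; not part of any package API.
sd_binary_sample_sketch <- function(mean, n) {
  # Sample SD: population variance scaled by n / (n - 1)
  sqrt(n / (n - 1) * mean * (1 - mean))
}

sd_binary_population_sketch <- function(mean) {
  # Population SD: depends on the mean only
  sqrt(mean * (1 - mean))
}

# Made-up binary data: 7 ones and 13 zeros
x <- c(rep(1, 7), rep(0, 13))
n <- length(x)
m <- mean(x)

sample_sd <- sd_binary_sample_sketch(m, n)
population_sd <- sd_binary_population_sketch(m)

# stats::sd() uses the N - 1 denominator, so it should match the sample version
print(all.equal(sample_sd, sd(x)))

# Population SD computed directly from the data, with denominator N
print(all.equal(population_sd, sqrt(mean((x - m)^2))))

# Ratio of the two SDs, which equals sqrt(n / (n - 1))
print(sample_sd / population_sd)
```

Because `sd()` in base R always returns the sample SD, a reported SD that only matches the population formula would show up as a small discrepancy in a check like this one.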

@lhdjung added the documentation label (Improvements or additions to documentation) on Mar 5, 2024