# Power Analysis for Sample Size Calculation

## Objective
The following R code calculates the required sample size based on specified parameters. The calculation uses a one-sample z-test for a proportion (superiority design), applied to the proportion of participants who reach a T-score of 50 or higher on the Goal Attainment Scale (GAS). A Wilcoxon signed-rank test will be used to assess the significance of differences between paired observations obtained at baseline and after 20 weeks of treatment. The primary outcome measure is the participants' degree of goal attainment, measured with the GAS.

## Primary Outcome
The primary endpoint of interest is the degree of goal attainment of participants, assessed using the Goal Attainment Scale (GAS) after 20 weeks of treatment.

## Package Information
- The power analysis is conducted using the `pwrss` package in R.
- The `pwrss` package provides functions for power and sample size analysis in various statistical tests, including the z-test for a single proportion used here (`pwrss.z.prop`) and non-parametric tests like the Wilcoxon test.

## Literature
- Cohen, J. (1988). *Statistical Power Analysis for the Behavioral Sciences* (2nd ed.). Lawrence Erlbaum Associates.
- Navarro, D. J. (2015). *Learning statistics with R: A tutorial for psychology students and other beginners* (Version 0.5). Adelaide, Australia: University of Adelaide.
- R Core Team (2022). *R: A language and environment for statistical computing*. R Foundation for Statistical Computing, Vienna, Austria. [https://www.R-project.org/](https://www.R-project.org/)
- McCue, M., Sarkey, S., Eramo, A. et al. (2021). Using the Goal Attainment Scale adapted for depression to better understand treatment outcomes in patients with major depressive disorder switching to vortioxetine: a phase 4, single-arm, open-label, multicenter study. *BMC Psychiatry, 21*(1), 622. [https://doi.org/10.1186/s12888-021-03608-1](https://doi.org/10.1186/s12888-021-03608-1)



In [4]:
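# Install pwrss only if it is not already available, then load it
if (!requireNamespace("pwrss", quietly = TRUE)) install.packages("pwrss")
library(pwrss)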





Attaching package: 'pwrss'


The following object is masked from 'package:stats':

    power.t.test




In [20]:
# Set the parameters
p <- 0.50     # Proportion of individuals expected to have T-score >= 50
p0 <- 0.50    # Proportion under the null hypothesis
margin <- 0.10  # Difference between p and p0 considered meaningful, e.g., 10%
alpha <- 0.05 # Significance level
power <- 0.90 # Desired statistical power
alternative <- "superior" # Testing for superiority


In [21]:
# Perform the superiority test
result <- pwrss.z.prop(p = p, p0 = p0, margin = margin,
                       alpha = alpha, power = power,
                       alternative = alternative)

# Print the results
print(result)

 Approach: Normal Approximation 
 A Proportion against a Constant (z Test) 
 H0: p - p0 <= margin 
 HA: p - p0 > margin 
 ------------------------------ 
  Statistical power = 0.9 
  n = 215 
 ------------------------------ 
 Alternative = "superior" 
 Non-centrality parameter = -2.926 
 Type I error rate = 0.05 
 Type II error rate = 0.1 
$call
function (name, ...)  .Primitive("call")

$parms
$parms$p
[1] 0.5

$parms$p0
[1] 0.5

$parms$arcsin.trans
[1] FALSE

$parms$alpha
[1] 0.05

$parms$margin
[1] 0.1

$parms$alternative
[1] "superior"

$parms$verbose
[1] TRUE


$test
[1] "z"

$ncp
[1] -2.926405

$power
[1] 0.9

$n
[1] 214.0962

attr(,"class")
[1] "pwrss" "z"     "prop" 
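The printed values can be cross-checked by hand. The sketch below uses only base R and assumes the normal-approximation formula n = (z_alpha + z_beta)^2 * p(1 - p) / (p - p0 - margin)^2, with the variance evaluated at `p`. This formula is inferred from the output above, not taken from the `pwrss` source, and the names `effect`, `n_raw`, and `ncp` are illustrative.

```r
# Parameters as in the cell above
p      <- 0.50
p0     <- 0.50
margin <- 0.10
alpha  <- 0.05
power  <- 0.90

z_alpha <- qnorm(1 - alpha)  # one-sided critical value
z_beta  <- qnorm(power)      # quantile matching the desired power

# Distance between the assumed proportion and the superiority threshold p0 + margin
effect <- p - p0 - margin

# Assumed normal-approximation formula (variance evaluated at p)
n_raw <- (z_alpha + z_beta)^2 * p * (1 - p) / effect^2
ncp   <- effect / sqrt(p * (1 - p) / n_raw)

n_raw           # about 214.1
ceiling(n_raw)  # 215
ncp             # about -2.926
```

Because `effect` is squared, its sign does not affect `n`, but it does appear in the non-centrality parameter. The negative value arises because the assumed proportion `p` equals `p0` and therefore lies below `p0 + margin`. For a superiority claim, `p` would usually be set above that threshold, so the inputs may be worth rechecking.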


Justifying a margin of 0.075 on theoretical grounds requires considering the context of your research, the intervention being studied, and the expected effect size. The following factors can support the choice:

- Clinical relevance: Determine what constitutes a meaningful difference in the context of your study, based on clinical guidelines, expert opinion, or prior research. For example, clinical experts can indicate what improvement in the proportion of individuals achieving a T-score of 50 or higher would be considered clinically significant.
- Minimally important difference (MID): Identify the smallest effect size that would be clinically meaningful, based on patient-reported outcomes, changes in symptom severity, or other relevant measures. The margin of 0.075 should reflect this MID.
- Statistical considerations: Consider the statistical power of your study and the trade-off between Type I and Type II errors. The margin should allow the study to reach adequate power for the expected effect size while controlling the risk of a false positive (Type I error).
- Practical constraints: Take into account sample size limitations, cost, and the feasibility of the intervention. A margin of 0.075 should be detectable within your study design and available resources.
- Literature review: Review existing literature on similar interventions or outcomes for established thresholds or benchmarks for effect sizes in comparable studies. This helps contextualize the choice of margin and provides additional justification.
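To make the trade-off between margin and feasibility concrete, the following sketch tabulates the required sample size for a few candidate margins. It is a base R approximation that assumes the same normal-approximation formula as the calculation above (variance evaluated at `p`). The helper `n_for_margin` and the candidate values are hypothetical and not part of `pwrss`.

```r
# Hypothetical helper: required n for a given superiority margin,
# using a normal approximation with the variance evaluated at p
n_for_margin <- function(margin, p = 0.50, p0 = 0.50,
                         alpha = 0.05, power = 0.90) {
  effect <- p - p0 - margin
  ceiling((qnorm(1 - alpha) + qnorm(power))^2 * p * (1 - p) / effect^2)
}

# Candidate margins, including the 0.075 and 0.10 values discussed here
margins   <- c(0.05, 0.075, 0.10)
n_by_marg <- sapply(margins, n_for_margin)

data.frame(margin = margins, n = n_by_marg)
```

Under these assumptions the three margins require 857, 381, and 215 participants respectively, which shows how strongly the choice of margin affects feasibility.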

FOR STEVEN:

Superiority Tests would be appropriate if you want to demonstrate that the intervention improves outcomes significantly compared to the baseline. In this case, you'd be interested in showing that the proportion of individuals with a T-score of 50 or higher after the intervention is significantly greater than a specified threshold (e.g., 50%).

Hypotheses:

Null hypothesis (Superiority Test): The proportion of individuals with T-scores of 50 or higher after the intervention is less than or equal to 50% plus the superiority margin (H0: p - p0 <= margin, with p0 = 0.50).

Alternative hypothesis (Superiority Test): The proportion of individuals with T-scores of 50 or higher after the intervention is greater than 50% plus the superiority margin (HA: p - p0 > margin).

A margin of 7.5 percentage points would represent a meaningful difference between GAS values before and after the intervention.
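Once data are collected, these hypotheses could be tested along the following lines. This sketch uses an exact binomial test from base R (`binom.test`) rather than the z-test assumed in the sample size calculation. The counts `n_total` and `n_success` are hypothetical placeholders, not study results.

```r
# Hypothetical data: number of participants and number with T-score >= 50
n_total   <- 215
n_success <- 145

p0     <- 0.50   # reference proportion
margin <- 0.075  # superiority margin discussed above

# One-sided exact test of H0: p <= p0 + margin against HA: p > p0 + margin
test_result <- binom.test(x = n_success, n = n_total,
                          p = p0 + margin, alternative = "greater")

test_result$estimate  # observed proportion
test_result$p.value   # one-sided p-value
```

Setting `margin <- 0` reduces this to a plain test against 50%.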



FOR STEVEN:

Increasing the power from the usual 0.8 to 0.9 can be justified for several reasons:

Reducing Type II error: A higher power reduces the risk of failing to detect a true effect (Type II error), which can be particularly important when the consequences of missing an effect are significant.

Increased sensitivity: A higher power makes your study more sensitive to detect small or subtle effects, which might be important for your research question.
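The cost of the higher power can be estimated with a short base R sketch. It assumes the same normal-approximation formula and parameter values as the sample size calculation above. The helper `n_for_power` is hypothetical.

```r
# Hypothetical helper: required n for a given power,
# using a normal approximation with the variance evaluated at p
n_for_power <- function(power, p = 0.50, p0 = 0.50,
                        margin = 0.10, alpha = 0.05) {
  effect <- p - p0 - margin
  ceiling((qnorm(1 - alpha) + qnorm(power))^2 * p * (1 - p) / effect^2)
}

powers     <- c(0.80, 0.90)
n_by_power <- sapply(powers, n_for_power)

data.frame(power = powers, n = n_by_power)
```

Under these assumptions, raising the power from 0.8 to 0.9 increases the required sample from 155 to 215 participants.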

In [1]:
# Export the notebook to an HTML file.
# --execute is omitted: run from inside the notebook, it would re-execute this cell as well.
system("jupyter nbconvert --to html your_notebook.ipynb")