# Table of Content

- [Table of Content](#table-of-content)
- [0-General](#0-general)
  - [Introduction](#introduction)
  - [Study Objectives](#study-objectives)
  - [Study Significance](#study-significance)
  - [Methods - Study Design](#methods---study-design)
  - [Methods - Outcome Measures](#methods---outcome-measures)
  - [Sample Size Calculation Rationale](#sample-size-calculation-rationale)
  - [Simulation-Based Sample Size Calculation for ANCOVA](#simulation-based-sample-size-calculation-for-ancova)
- [1-Sample Size Calculation](#1--sample-size-calculation)
- [2-Conclusion](#2-conclusion)

# 0-General
[Back to Table of Content](#table-of-content)
# Sample Size Calculation for the Nature-Boost Randomized Controlled Trial (RCT)

## Introduction
[Back to Table of Content](#table-of-content)

The Nature-Boost project aims to address growing challenges in long-term care facilities by improving the health of caregivers and residents, and the relationship between them, through nature-based interventions. Rising demand for long-term care services and chronic workforce shortages in caregiving roles burden both residents and staff. Residents often experience reduced mobility, social isolation, and sensory deprivation, which can impair sleep quality, mood, and overall well-being. Caregivers face high levels of stress and burnout, which contributes to a challenging work environment.

To counter these challenges, the Nature-Boost project proposes immersive, multi-sensory interventions replicating nature experiences. These include:
1. **Olfactory stimuli**: Essential oil blends simulating the scent of the Bavarian forest.
2. **Visual stimuli**: VR simulations showing forest scenes.
3. **Auditory stimuli**: Natural sounds, such as forest ambiance.


## Study Objectives
[Back to Table of Content](#table-of-content)

The Nature-Boost study aims to test the health-promoting effects of virtual and sensory nature experiences for both residents and caregiving staff in long-term care settings. Specifically, it examines:
1. **Primary Objective**: Assess whether immersive nature-based experiences improve residents' sleep quality.
2. **Secondary Objectives**: Evaluate improvements in residents' overall well-being, mood, loneliness, cognitive function, and emotional exhaustion. Additionally, assess changes in caregivers' perceived stress and burnout.


## Study Significance
[Back to Table of Content](#table-of-content)

The anticipated outcomes of Nature-Boost align with an urgent need for effective mental and physical health interventions in long-term care. By exploring whether digitally simulated nature experiences can improve sleep and reduce stress for those unable to access outdoor environments directly, the study could pave the way for scalable, accessible solutions for both residents and caregivers.


## Methods - Study Design
[Back to Table of Content](#table-of-content)

This study will use a randomized controlled trial (RCT) design with four arms (three experimental groups and one control group):
- **Experimental Group 1**: Participants receive an immersive sensory experience before sleep, combining forest-scented essential oils, VR forest images, and natural forest sounds.
- **Experimental Group 2**: Participants experience VR images and natural sounds of the forest, without scent stimuli.
- **Experimental Group 3**: Participants are exposed to forest-scented essential oils alone.
- **Control Group**: Participants receive a neutral intervention, such as a non-nature-related video, without accompanying scents or sounds.

## Methods - Outcome Measures
[Back to Table of Content](#table-of-content)

- **Primary Outcome**: Sleep quality, assessed using the Pittsburgh Sleep Quality Index (PSQI).
- **Secondary Outcomes**:
  - Residents' well-being (WHO-5 Well-Being Index), loneliness (UCLA Loneliness Scale), mood (Hospital Anxiety and Depression Scale), and cognitive function.
  - Caregivers' stress (Perceived-Stress-Scale) and burnout risk (Burnout Assessment Tool).


## Sample Size Calculation Rationale
[Back to Table of Content](#table-of-content)

For this randomized controlled trial (RCT), we aim to detect differences in PSQI scores across the four study groups. Because the analysis includes baseline covariates (e.g., baseline PSQI scores, age, sex, medication use), an ANCOVA model is more suitable than a one-way ANOVA: adjusting for covariates that predict the outcome reduces residual variance and therefore increases the precision of the group comparison.
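
The precision gain from covariate adjustment can be illustrated with a small simulation. This is only a sketch on made-up data: the two-arm layout, the baseline slope of 0.5, and the noise SD of 1.5 are illustrative assumptions, not values from the study.

```r
# Hypothetical illustration: all numbers below are assumptions, not study data.
set.seed(1)
n <- 200
baseline <- rnorm(n, mean = 14, sd = 2)
group <- factor(rep(c("control", "treatment"), each = n / 2))

# Post score depends on the group and on the (centered) baseline score
post <- 9.9 - 1.4 * (group == "treatment") +
  0.5 * (baseline - 14) + rnorm(n, sd = 1.5)

fit_anova  <- lm(post ~ group)             # ignores baseline
fit_ancova <- lm(post ~ baseline + group)  # adjusts for baseline

# Standard error of the estimated group difference under each model
se_anova  <- summary(fit_anova)$coefficients["grouptreatment", "Std. Error"]
se_ancova <- summary(fit_ancova)$coefficients["grouptreatment", "Std. Error"]

cat("Residual SD, ANOVA: ", sigma(fit_anova), "\n")
cat("Residual SD, ANCOVA:", sigma(fit_ancova), "\n")
cat("SE of group effect, ANOVA: ", se_anova, "\n")
cat("SE of group effect, ANCOVA:", se_ancova, "\n")
```

When the baseline explains part of the outcome variance, the ANCOVA fit has the smaller residual SD and the smaller standard error for the group effect.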


To determine an appropriate sample size for detecting differences in PSQI scores across the four groups, we consider the Wan et al. (2024) study, which investigated the effects of VR interventions on sleep quality in elderly participants using the PSQI.

### Reference Study Data (Wan et al.)
The study by Wan et al. (2024, PMID: 39213857) provides benchmark values for PSQI changes in a control and a VR group, as shown below:

| Group          | Baseline PSQI Mean | Post-Treatment PSQI Mean | Standard Deviation (SD) |
|----------------|--------------------|--------------------------|--------------------------|
| Control (n=20) | 14.00              | 9.90                     | 1.80                     |
| VR (n=32)      | 13.94              | 8.37                     | 2.17                     |

Wan et al.’s findings indicate a significant improvement in PSQI for the VR group (mean reduction of ~5.6 points) compared to the control (mean reduction of ~4.1 points). We use these values as assumptions for the expected impact of Nature-Boost's VR interventions on PSQI.
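
The arithmetic behind these benchmark values can be reproduced as a quick check. The sketch below also computes a pooled SD and a standardized difference, assuming the SD column of the table refers to the change scores; the table does not state this explicitly, so that reading is an assumption.

```r
# Values taken from the reference table (Wan et al.)
n_control <- 20
n_vr <- 32
change_control <- 9.90 - 14.00   # mean change in the control group
change_vr      <- 8.37 - 13.94   # mean change in the VR group
diff_in_change <- change_vr - change_control

# ASSUMPTION: the reported SDs are SDs of the change scores
sd_control <- 1.80
sd_vr <- 2.17
pooled_sd <- sqrt(((n_control - 1) * sd_control^2 + (n_vr - 1) * sd_vr^2) /
                    (n_control + n_vr - 2))

# Standardized difference in change (Cohen's d under the assumption above)
cohens_d <- diff_in_change / pooled_sd

cat("Mean change, control:", change_control, "\n")
cat("Mean change, VR:     ", change_vr, "\n")
cat("Difference in change:", diff_in_change, "\n")
cat("Pooled SD:           ", pooled_sd, "\n")
cat("Standardized diff.:  ", cohens_d, "\n")
```

Under this reading, the pooled SD comes out near 2.0, which is in line with the `sd <- 2.0` value used for the simulation parameters.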

## Simulation-Based Sample Size Calculation for ANCOVA
[Back to Table of Content](#table-of-content)

To accommodate the complexity of ANCOVA and covariate adjustments, we conduct a simulation-based power analysis rather than relying on traditional formulas. In this simulation, we generate hypothetical data for each group based on:
1. Expected post-intervention PSQI means for each group.
2. The covariate effect (baseline PSQI on post-intervention PSQI).
3. The standard deviation of PSQI scores (using Wan et al.'s reported values as a benchmark).

The simulation will iterate over different sample sizes to determine the minimum number of participants required to achieve a target power of 80% for detecting a significant group effect. We anticipate a 40% dropout rate, so the sample size is inflated accordingly to preserve adequate power after attrition.


# 1- Sample Size Calculation
[Back to Table of Content](#table-of-content)

In [47]:
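# Load packages: pwr for power analysis, dplyr for %>% and mutate() used in the simulation
if (!requireNamespace("pwr", quietly = TRUE)) install.packages("pwr")
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
library(pwr)
library(dplyr)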

In [59]:
# Set parameters based on Wan et al. data for Control Group
# Control Group parameters
baseline_mean_control <- 14.00
baseline_sd_control <- 2.15
post_mean_control <- 9.90
post_sd_control <- 1.68
delta_mean_control <- -4.10
delta_sd_control <- 1.80

In [60]:
# Calculate the Covariate Effect
# Estimate correlation between baseline and post-intervention PSQI
correlation_baseline_post <- (delta_mean_control / post_sd_control) / sqrt((post_sd_control^2 / baseline_sd_control^2) + 1)

In [61]:
# Calculate covariate effect (regression coefficient of baseline PSQI on post-intervention PSQI)
covariate_effect <- (post_sd_control / baseline_sd_control) * correlation_baseline_post

# Output covariate effect to verify
cat("Estimated Covariate Effect of Baseline PSQI on Post-Intervention PSQI:", covariate_effect, "\n")

Estimated Covariate Effect of Baseline PSQI on Post-Intervention PSQI: -1.502638 
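
A correlation has to lie between -1 and 1, which gives a quick plausibility check on `correlation_baseline_post`. One standard way to recover the baseline-post correlation from the three control-group SDs is the variance identity for a difference score. The sketch below applies it, assuming the reported change-score SD is consistent with the baseline and post SDs; the names `r_baseline_post` and `slope_baseline` are introduced here for illustration only.

```r
# Control-group SDs as set in the parameter cell
baseline_sd_control <- 2.15
post_sd_control <- 1.68
delta_sd_control <- 1.80

# Identity: Var(post - baseline) = Var(baseline) + Var(post)
#                                  - 2 * r * SD(baseline) * SD(post)
# Solving for r:
r_baseline_post <- (baseline_sd_control^2 + post_sd_control^2 - delta_sd_control^2) /
  (2 * baseline_sd_control * post_sd_control)

# Regression slope of post-intervention PSQI on baseline PSQI
slope_baseline <- r_baseline_post * post_sd_control / baseline_sd_control

cat("Baseline-post correlation (variance identity):", r_baseline_post, "\n")
cat("Implied slope of post on baseline:", slope_baseline, "\n")
```

With these inputs the correlation is about 0.58 and the implied slope about 0.45. Comparing these against the value printed above is a useful check before the covariate effect is fed into the simulation.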


In [83]:
# Set Simulation Parameters for Power Analysis
n_groups <- 4                # Number of intervention groups
target_power <- 0.80         # Desired power level (80%)
n_simulations <- 10000        # Number of simulations per sample size tested
baseline_mean <- 14          # Mean baseline PSQI score across groups
post_means <- c(9.9, 8.5, 8.0, 9.0)  # Expected post-intervention PSQI means for each group
sd <- 2.0                    # Standard deviation for PSQI scores (from Wan et al.)
alpha <- 0.05                # Significance level for the test
attrition_rate <- 0.40       # Expected attrition rate (40%)
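
As a rough analytic cross-check of these parameters, the assumed group means and SD can be converted to Cohen's f and passed through the noncentral F distribution for a one-way ANOVA. This sketch ignores the baseline covariate, so it is not a substitute for the simulation; the helper `anova_power` and the grid of sample sizes are illustrative choices, not part of the original analysis.

```r
post_means <- c(9.9, 8.5, 8.0, 9.0)  # assumed post-intervention means
sd_psqi <- 2.0                       # assumed within-group SD
alpha <- 0.05
k <- length(post_means)

# Cohen's f: SD of the group means (population form) over the within-group SD
cohens_f <- sqrt(mean((post_means - mean(post_means))^2)) / sd_psqi

# Power of the one-way ANOVA F test for a given per-group sample size
anova_power <- function(n_per_group) {
  df1 <- k - 1
  df2 <- k * (n_per_group - 1)
  ncp <- cohens_f^2 * k * n_per_group      # noncentrality parameter
  f_crit <- qf(1 - alpha, df1, df2)
  pf(f_crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

n_grid <- 5:100
power_grid <- sapply(n_grid, anova_power)
n_required <- n_grid[which(power_grid >= 0.80)[1]]

cat("Cohen's f:", cohens_f, "\n")
cat("Smallest n per group with analytic power >= 0.80:", n_required, "\n")
```

A large gap between this analytic figure and the simulated one would suggest revisiting the simulation settings or the test being extracted.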


In [84]:
# Initialize variables for sample size and power estimation
n_per_group <- 10            # Starting estimate for participants per group (this will increment)
estimated_power <- 0         # Initialize estimated power

In [85]:
# Loop to Find Minimum Sample Size per Group to Reach Target Power
while (estimated_power < target_power) {
  
  # Initialize vector to store results from each simulation
  significant_results <- numeric(n_simulations)
  
  # Loop over the number of simulations for the current sample size
  for (i in 1:n_simulations) {
    
    # Generate data for each group with specified sample size and baseline PSQI
    data <- data.frame(
      group = rep(1:n_groups, each = n_per_group),  # Repeat group numbers to simulate 4 groups
      baseline_psqi = rnorm(n_per_group * n_groups, mean = baseline_mean, sd = sd)  # Simulated baseline PSQI scores
    )
    
    # Create post-intervention PSQI scores based on group means and covariate effect
    data <- data %>%
      mutate(
        # post_psqi is based on baseline PSQI * covariate effect + group effect + random noise
        post_psqi = baseline_psqi * covariate_effect + 
                    rnorm(n_per_group * n_groups, mean = rep(post_means, each = n_per_group), sd = sd)
      )
    
    # Fit ANCOVA model with group as factor and baseline_psqi as covariate
    model <- lm(post_psqi ~ factor(group) + baseline_psqi, data = data)
    
    # Extract p-value for group effect from ANCOVA results
    anova_results <- anova(model)
    p_value <- anova_results$`Pr(>F)`[1]  # First row: group effect (sequential SS, group entered before baseline_psqi)
    
    # Record whether this simulation detected a significant effect (p < alpha)
    significant_results[i] <- ifelse(p_value < alpha, 1, 0)
  }
  
  # Calculate estimated power as the proportion of simulations with significant results
  estimated_power <- mean(significant_results)
  
  # Print intermediate results for each sample size tested
  cat("Sample size per group:", n_per_group, "- Estimated Power:", estimated_power, "\n")
  
  # Check if the estimated power meets or exceeds the target power
  if (estimated_power < target_power) {
    n_per_group <- n_per_group + 2  # Increase sample size by 2 per group for finer control
  }
}

Sample size per group: 10 - Estimated Power: 0.6436 
Sample size per group: 12 - Estimated Power: 0.676 
Sample size per group: 14 - Estimated Power: 0.7121 
Sample size per group: 16 - Estimated Power: 0.7382 
Sample size per group: 18 - Estimated Power: 0.7672 
Sample size per group: 20 - Estimated Power: 0.7962 
Sample size per group: 22 - Estimated Power: 0.8125 
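
One detail worth checking in the loop above: `anova()` on a single `lm` fit reports sequential (Type I) sums of squares, so with `factor(group)` listed first, the first row tests the group effect before `baseline_psqi` is adjusted for. The sketch below compares that extraction with a nested-model F test under a null scenario (identical means in all groups), where the rejection rate should stay close to `alpha`. The slope, the per-group sample size, and the number of replications are illustrative assumptions.

```r
set.seed(42)
n_groups <- 4
n_per_group <- 20
n_sim <- 2000
alpha <- 0.05
slope <- -1.5    # assumption: similar in size to the covariate effect printed above
sd_psqi <- 2.0

simulate_once <- function() {
  n_total <- n_groups * n_per_group
  d <- data.frame(
    group = factor(rep(seq_len(n_groups), each = n_per_group)),
    baseline_psqi = rnorm(n_total, mean = 14, sd = sd_psqi)
  )
  # Null scenario: the same mean (9.9) in every group
  d$post_psqi <- d$baseline_psqi * slope +
    rnorm(n_total, mean = 9.9, sd = sd_psqi)

  # (a) Group listed first: row 1 is the group effect NOT adjusted for baseline
  p_first <- anova(lm(post_psqi ~ group + baseline_psqi, data = d))$`Pr(>F)`[1]

  # (b) Nested-model F test: group effect after adjusting for baseline
  reduced <- lm(post_psqi ~ baseline_psqi, data = d)
  full    <- lm(post_psqi ~ baseline_psqi + group, data = d)
  p_adjusted <- anova(reduced, full)$`Pr(>F)`[2]

  c(first = p_first, adjusted = p_adjusted)
}

p_values <- replicate(n_sim, simulate_once())   # 2 x n_sim matrix
rejection_rates <- rowMeans(p_values < alpha)   # empirical type I error rates
print(rejection_rates)
```

If the rejection rate for approach (a) is well above `alpha` under the null, power estimates obtained with it are not comparable to a nominal 5% test, and the adjusted test in (b) would be the more appropriate basis for the ANCOVA power calculation.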


In [86]:
# Adjust for 40% Attrition Rate
adjusted_n_per_group <- ceiling(n_per_group / (1 - attrition_rate))

In [87]:
# Output the minimum sample size needed to achieve target power, adjusted for attrition
cat("Minimum sample size per group (with 40% attrition rate):", adjusted_n_per_group)

Minimum sample size per group (with 40% attrition rate): 37
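
The attrition adjustment divides the required number of completers by the expected retention rate and rounds up. The short sketch below reproduces that arithmetic for the values used here and contrasts it with simply adding 40% on top, which would leave too few completers; the variable names are illustrative.

```r
n_required <- 22        # completers per group needed for the target power
attrition_rate <- 0.40  # expected dropout
n_groups <- 4

# Inflate so that the expected number of completers still reaches n_required
n_recruit <- ceiling(n_required / (1 - attrition_rate))
total_recruit <- n_recruit * n_groups
expected_completers <- n_recruit * (1 - attrition_rate)

# For comparison: adding 40% on top is not the same as dividing by 0.60
n_naive <- ceiling(n_required * (1 + attrition_rate))
naive_completers <- n_naive * (1 - attrition_rate)

cat("Recruit per group:", n_recruit, "- total:", total_recruit, "\n")
cat("Expected completers per group:", expected_completers, "\n")
cat("Naive inflation per group:", n_naive,
    "- expected completers:", naive_completers, "\n")
```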

# 2-Conclusion
[Back to Table of Content](#table-of-content)

The objective of this analysis was to estimate the minimum sample size required per group to achieve adequate statistical power for detecting differences in ***PSQI (Pittsburgh Sleep Quality Index) scores*** across four intervention groups in the Nature-Boost ***randomized controlled trial***. Given the study’s design and the inclusion of baseline PSQI scores as a covariate, an ***ANCOVA model*** was chosen as the primary method for analysis.

We employed a simulation-based power analysis to determine the appropriate sample size per group, incrementally increasing the sample size until the simulated power reached the target. The target power was set to ***80%***, the conventional threshold in clinical studies. Because the elderly participant population includes individuals who are bed-bound, we assumed a high dropout rate of 40% so that power remains adequate despite participant loss.

The results from the simulation indicated that a sample size of 22 participants per group would be required to achieve the target power of 80% without attrition. However, with a ***40% attrition*** adjustment, the final recommended sample size increased to 37 participants per group. For the four-group design, this yields a total sample size of 148 participants. This adjusted sample size is expected to provide sufficient power, accounting for the elevated dropout rate, which is anticipated due to participants' advanced age and the physical constraints some may face with VR and sensory technology.

In conclusion, recruiting ***148 participants in total (37 per group)*** is expected to allow the Nature-Boost study to detect the assumed differences in sleep quality between groups, even if a substantial proportion of participants withdraw. Under the assumptions of the simulation, this sample size maintains the target statistical power for conclusions about the impact of nature-based interventions on sleep quality in long-term care residents.