In [None]:
# ============================================================================
# Meta-Analysis of L2 Pronunciation Training Effectiveness
# ============================================================================
#
# Quantifying the overall effectiveness of L2 English accent-reduction training and isolating the learner and instructional predictors of improvement via moderator, robustness, and bias analyses.
#
# ============================================================================
# ANALYSIS WORKFLOW
# ============================================================================
#
# STEP 0:   Environment Setup & Data Validation
#
# STEP 1:   Overall Random-Effects Model
#           ‚Ä¢ REML estimation for pooled effect size
#           ‚Ä¢ Heterogeneity quantification (Q, I¬≤, œÑ¬≤)
#           ‚Ä¢ Prediction intervals for generalization
#
# STEP 2:   Moderator Analyses
#   2.1:    Univariate Meta-Regression (single-predictor models)
#   2.2:    Multivariate Meta-Regression (controlling for confounds)
#
# STEP 3:   Robustness & Sensitivity Checks
#   3.1:    Leave-One-Out Analysis (stability testing)
#   3.2:    Influence Diagnostics (outlier detection)
#   3.3:    Robust Variance Estimation (dependency adjustment)
#
# STEP 4:   Publication Bias Assessment
#           ‚Ä¢ Funnel plot asymmetry (Egger's test)
#           ‚Ä¢ Trim-and-fill imputation
#           ‚Ä¢ Fail-safe N calculation
#
# ============================================================================
# DEPENDENCIES
# ============================================================================
#
# metafor  ‚â• 3.0.0    Meta-analytic models (Viechtbauer, 2010)
# robumeta ‚â• 2.0      RVE for clustered effects (Hedges et al., 2010)
#
# ============================================================================
# DATA STRUCTURE
# ============================================================================
#
# Input:  meta_ready_cleaned.csv
#
# Key Variables:
#   Study_ID     Study-level identifier (clustering variable)
#   Effect_ID    Effect-level identifier (unique)
#   Hedges_g     Standardized mean difference (bias-corrected)
#   SE           Standard error (or Variance)
#   [Moderators] Categorical/continuous predictors
#
# ============================================================================
# OUTPUTS
# ============================================================================
#
# 1. overall_meta_analysis_results.csv       Pooled effect & heterogeneity
# 2. univariate_moderator_results.csv        Single-predictor tests
# 3. significant_moderators.csv              p < .05 moderators
# 4. multivariate_model_coefficients.csv     Adjusted moderator effects
# 5. multivariate_model_fit.csv              Model R¬≤ & fit indices
# 6. leave_one_out_analysis.csv              Sensitivity metrics
# 7. influence_diagnostics.csv               Cook's D, DFBETAS, leverage
# 8. rve_overall_effect.csv                  RVE-adjusted pooled effect
# 9. rve_moderator_results.csv               RVE-adjusted moderator tests
# 10. publication_bias_tests.csv             Egger, trim-and-fill, fsN
# 11. publication_bias_trim_and_fill.csv     Imputed studies & adjusted g
# 12. funnel_plot.png                        Visual asymmetry assessment
#
# ============================================================================

# Clear workspace
rm(list = ls())

# Record analysis start time
analysis_start_time <- Sys.time()

# Display header
cat("\n")
cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("L2 PRONUNCIATION TRAINING META-ANALYSIS\n")
cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("Analysis initiated: ", format(analysis_start_time, "%Y-%m-%d %H:%M:%S"), "\n", sep = "")
cat("R version:         ", R.version.string, "\n", sep = "")
cat("Platform:          ", R.version$platform, "\n", sep = "")
cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("\n")


# ============================================================================
# STEP 0: ENVIRONMENT SETUP AND DATA LOADING
# ============================================================================
#
# This step ensures:
#   1. All required R packages are installed and loaded
#   2. R environment is configured for optimal output
#   3. Dataset is loaded and validated
#   4. Sampling variance is calculated
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 0: ENVIRONMENT SETUP AND DATA LOADING\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

# ----------------------------------------------------------------------------
# Step 0.1: Install and Load Required Packages
# ----------------------------------------------------------------------------
cat("Step 0.1: Installing and loading required packages...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Define required packages
required_packages <- c("metafor", "robumeta")

# Check and install packages
for (pkg in required_packages) {
  if (!require(pkg, character.only = TRUE, quietly = TRUE)) {
    cat("  Installing package: ", pkg, "\n", sep = "")
    install.packages(pkg, repos = "https://cloud.r-project.org/", quiet = TRUE)
    cat("  ‚úÖ ", pkg, " installed\n", sep = "")
  }
  # Explicitly load the package (ensures it's loaded even if already installed)
  library(pkg, character.only = TRUE)
  cat("  ‚úÖ ", pkg, " v", as.character(packageVersion(pkg)), " loaded\n", sep = "")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.1.5: Define Helper Function for Safe CSV Writing
# ----------------------------------------------------------------------------
cat("Step 0.1.5: Defining helper functions...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Safe CSV write function with automatic error handling
safe_write_csv <- function(data, filename, show_message = TRUE) {
  # Ensure output_path exists
  if (!exists("output_path", envir = .GlobalEnv) || is.null(get("output_path", envir = .GlobalEnv))) {
    output_path <- getwd()
    assign("output_path", output_path, envir = .GlobalEnv)
  } else {
    output_path <- get("output_path", envir = .GlobalEnv)
  }
  
  # Try to write file
  tryCatch({
    write.csv(data,
              file.path(output_path, filename),
              row.names = FALSE,
              fileEncoding = "UTF-8")
    if (show_message) {
      cat("  ‚úÖ Saved: ", filename, "\n", sep = "")
    }
    return(TRUE)
  }, error = function(e) {
    # If file is locked, use timestamped version
    timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
    base_name <- sub("\\.csv$", "", filename)
    alt_filename <- paste0(base_name, "_", timestamp, ".csv")
    write.csv(data,
              file.path(output_path, alt_filename),
              row.names = FALSE,
              fileEncoding = "UTF-8")
    if (show_message) {
      cat("  ‚ö†Ô∏è  File locked. Saved as: ", alt_filename, "\n", sep = "")
    }
    return(FALSE)
  })
}

cat("  ‚úÖ Helper functions defined\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.2: Configure R Environment
# ----------------------------------------------------------------------------
cat("Step 0.2: Configuring R environment settings...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Set global options
options(
  digits = 4,                    # Decimal places for output
  scipen = 999,                  # Disable scientific notation
  width = 80,                    # Console width
  stringsAsFactors = FALSE,      # Don't auto-convert to factors
  warn = 1                       # Show warnings immediately
)

cat("  ‚úÖ Output precision: 4 decimal places\n")
cat("  ‚úÖ Scientific notation: disabled\n")
cat("  ‚úÖ Console width: 80 characters\n")
cat("  ‚úÖ String handling: no auto-factorization\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.3: Load Meta-Analysis Dataset
# ----------------------------------------------------------------------------
cat("Step 0.3: Loading meta-analysis dataset...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Define data file path
data_file <- "meta_ready_cleaned.csv"

# Check file existence
if (!file.exists(data_file)) {
  cat("  ‚ùå ERROR: Data file not found!\n")
  cat("     Expected file: ", data_file, "\n", sep = "")
  cat("     Working directory: ", getwd(), "\n", sep = "")
  stop("Data file missing. Please ensure 'meta_ready_cleaned.csv' is in the working directory.")
}

# Load data
df <- read.csv(data_file, stringsAsFactors = FALSE, fileEncoding = "UTF-8")

cat("  ‚úÖ Dataset loaded successfully\n")
cat("     File:        ", data_file, "\n", sep = "")
cat("     Effect sizes: ", nrow(df), "\n", sep = "")
cat("     Studies:      ", length(unique(df$Study_ID)), "\n", sep = "")
cat("     Columns:      ", ncol(df), "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.3.5: Set Output Directory for Results
# ----------------------------------------------------------------------------
cat("Step 0.3.5: Configuring output directory...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Get current working directory
output_dir <- getwd()

# Create subdirectory for results (optional - keeps workspace clean)
results_dir <- file.path(output_dir, "meta_analysis_results")
if (!dir.exists(results_dir)) {
  dir.create(results_dir, recursive = TRUE)
  cat("  ‚úÖ Created results directory: ", results_dir, "\n", sep = "")
} else {
  cat("  ‚úÖ Results directory exists: ", results_dir, "\n", sep = "")
}

# Set output directory (use main directory for now to avoid path issues)
output_path <- output_dir  # Change to results_dir if you want subfolder

cat("  ‚úÖ Output path configured\n")
cat("     Directory: ", output_path, "\n", sep = "")
cat("     All CSV files will be saved here\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.4: Validate Required Columns
# ----------------------------------------------------------------------------
cat("Step 0.4: Validating dataset structure...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Check for essential columns
essential_cols <- c("Study_ID", "Effect_ID", "Hedges_g")
missing_cols <- setdiff(essential_cols, names(df))

if (length(missing_cols) > 0) {
  cat("  ‚ùå ERROR: Missing essential columns:\n")
  for (col in missing_cols) {
    cat("     - ", col, "\n", sep = "")
  }
  stop("Essential columns missing from dataset")
}

cat("  ‚úÖ Essential columns present: Study_ID, Effect_ID, Hedges_g\n")

# Check for variance information
if ("SE" %in% names(df)) {
  variance_source <- "SE"
  cat("  ‚úÖ Variance source: Standard Error (SE)\n")
} else if ("Variance" %in% names(df)) {
  variance_source <- "Variance"
  cat("  ‚úÖ Variance source: Variance column\n")
} else {
  cat("  ‚ùå ERROR: No variance information found!\n")
  cat("     Required: 'SE' or 'Variance' column\n")
  stop("Variance information missing")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.5: Calculate Sampling Variance
# ----------------------------------------------------------------------------
cat("Step 0.5: Calculating sampling variance (vi)...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Calculate or assign variance
if (variance_source == "SE") {
  df$vi <- df$SE^2
  cat("  ‚úÖ Variance calculated: vi = SE¬≤\n")
} else {
  df$vi <- df$Variance
  cat("  ‚úÖ Variance assigned: vi = Variance\n")
}

# Display variance statistics
cat("     Mean variance:   ", sprintf("%.4f", mean(df$vi, na.rm = TRUE)), "\n", sep = "")
cat("     Median variance: ", sprintf("%.4f", median(df$vi, na.rm = TRUE)), "\n", sep = "")
cat("     Range:           [", sprintf("%.4f", min(df$vi, na.rm = TRUE)), ", ",
    sprintf("%.4f", max(df$vi, na.rm = TRUE)), "]\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 0.6: Display Effect Size Distribution
# ----------------------------------------------------------------------------
cat("Step 0.6: Effect size distribution summary...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

cat("  Hedges' g statistics:\n")
cat("     N:      ", nrow(df), "\n", sep = "")
cat("     Mean:   ", sprintf("%.4f", mean(df$Hedges_g, na.rm = TRUE)), "\n", sep = "")
cat("     Median: ", sprintf("%.4f", median(df$Hedges_g, na.rm = TRUE)), "\n", sep = "")
cat("     SD:     ", sprintf("%.4f", sd(df$Hedges_g, na.rm = TRUE)), "\n", sep = "")
cat("     Min:    ", sprintf("%.4f", min(df$Hedges_g, na.rm = TRUE)), "\n", sep = "")
cat("     Max:    ", sprintf("%.4f", max(df$Hedges_g, na.rm = TRUE)), "\n", sep = "")
cat("     Q1:     ", sprintf("%.4f", quantile(df$Hedges_g, 0.25, na.rm = TRUE)), "\n", sep = "")
cat("     Q3:     ", sprintf("%.4f", quantile(df$Hedges_g, 0.75, na.rm = TRUE)), "\n", sep = "")

cat("\n")
cat("‚úÖ STEP 0 COMPLETE: Environment configured and data loaded\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")




L2 PRONUNCIATION TRAINING META-ANALYSIS
Analysis initiated: 2025-11-26 15:58:47
R version:         R version 4.5.1 (2025-06-13 ucrt)
Platform:          x86_64-w64-mingw32

L2 PRONUNCIATION TRAINING META-ANALYSIS
Analysis initiated: 2025-11-26 15:58:47
R version:         R version 4.5.1 (2025-06-13 ucrt)
Platform:          x86_64-w64-mingw32

STEP 0: ENVIRONMENT SETUP AND DATA LOADING

Step 0.1: Installing and loading required packages...
-------------------------------------------------------------------------------- 
  ‚úÖ metafor v4.8.0 loaded
  ‚úÖ robumeta v2.1 loaded
STEP 0: ENVIRONMENT SETUP AND DATA LOADING

Step 0.1: Installing and loading required packages...
-------------------------------------------------------------------------------- 
  ‚úÖ metafor v4.8.0 loaded
  ‚úÖ robumeta v2.1 loaded

Step 0.2: Configuring R environment settings...
-------------------------------------------------------------------------------- 
  ‚úÖ Output precision: 4 decimal places
  ‚úÖ Scienti

In [35]:
# ============================================================================
# STEP 1: OVERALL RANDOM-EFFECTS META-ANALYSIS
# ============================================================================
#
# Purpose: Estimate the overall effectiveness of L2 pronunciation training
#          and assess between-study heterogeneity
#
# Question: Is L2 pronunciation training effective overall?
#
# Model Specification:
#   yi ~ N(Œ∏ + ui, vi)
#   ui ~ N(0, œÑ¬≤)
#
#   where:
#     yi = observed effect size (Hedges' g)
#     Œ∏  = true mean effect size
#     ui = random study effect
#     vi = sampling variance (known)
#     œÑ¬≤ = between-study variance (estimated)
#
# Estimation Method: 
#   Restricted Maximum Likelihood (REML)
#
# Expected Outputs:
#   1. Overall mean effect size (pooled g)
#   2. 95% Confidence interval
#   3. Z-test and p-value
#   4. Heterogeneity statistics (Q, I¬≤, œÑ¬≤)
#
# Output File (This Step Produces):
#   ‚Üí overall_meta_analysis_results.csv
#
# Interpretation:
#   - If effect is significant (p < .05): Training is effective overall
#   - If I¬≤ is large (> 50%): Substantial heterogeneity exists
#   - High I¬≤ justifies moderator analyses (STEP 2)
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 1: OVERALL RANDOM-EFFECTS META-ANALYSIS\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Is L2 pronunciation training effective overall?\n\n")

# ----------------------------------------------------------------------------
# Step 1.1: Fit Overall Random-Effects Model
# ----------------------------------------------------------------------------
cat("Step 1.1: Fitting random-effects meta-analysis model (REML)...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Fit intercept-only model (no moderators)
res_overall <- rma(
  yi = Hedges_g,           # Effect sizes
  vi = vi,                 # Sampling variances
  data = df,               # Dataset
  method = "REML",         # Restricted ML estimation
  test = "z"               # Use z-distribution for inference
)

cat("  ‚úÖ Model fitted successfully\n")
cat("     Method:     Restricted Maximum Likelihood (REML)\n")
cat("     Model type: Random-effects (intercept-only)\n")
cat("     Test:       Z-distribution\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 1.2: Extract and Display Overall Effect Size
# ----------------------------------------------------------------------------
cat("Step 1.2: Overall effect size estimate...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Extract estimates
overall_g <- res_overall$b[1]           # Mean effect size
overall_se <- res_overall$se            # Standard error
overall_ci_lb <- res_overall$ci.lb      # CI lower bound
overall_ci_ub <- res_overall$ci.ub      # CI upper bound
overall_z <- res_overall$zval           # Z-value
overall_p <- res_overall$pval           # p-value

# Display results
cat("  OVERALL EFFECT SIZE (Hedges' g):\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Estimate (g):      ", sprintf("%.4f", overall_g), "\n", sep = "")
cat("    Standard Error:    ", sprintf("%.4f", overall_se), "\n", sep = "")
cat("    95% CI:            [", sprintf("%.4f", overall_ci_lb), ", ", 
    sprintf("%.4f", overall_ci_ub), "]\n", sep = "")
cat("    Z-value:           ", sprintf("%.4f", overall_z), "\n", sep = "")
cat("    p-value:           ", sprintf("%.4f", overall_p), sep = "")

# Add significance markers
if (overall_p < 0.001) {
  cat(" ***\n")
  sig_interpretation <- "HIGHLY SIGNIFICANT (p < .001)"
} else if (overall_p < 0.01) {
  cat(" **\n")
  sig_interpretation <- "Very significant (p < .01)"
} else if (overall_p < 0.05) {
  cat(" *\n")
  sig_interpretation <- "Significant (p < .05)"
} else {
  cat("\n")
  sig_interpretation <- "Not statistically significant (p ‚â• .05)"
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 1.3: Heterogeneity Assessment
# ----------------------------------------------------------------------------
cat("Step 1.3: Between-study heterogeneity statistics...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Extract heterogeneity statistics
Q_stat <- res_overall$QE               # Q statistic
Q_df <- res_overall$k - 1              # Degrees of freedom
Q_p <- res_overall$QEp                 # Q test p-value
I2 <- res_overall$I2                   # I¬≤ statistic
tau2 <- res_overall$tau2               # œÑ¬≤ (tau-squared)
tau <- sqrt(tau2)                      # œÑ (tau)
H2 <- res_overall$H2                   # H¬≤ statistic

cat("  HETEROGENEITY STATISTICS:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Q statistic:       ", sprintf("%.4f", Q_stat), 
    " (df = ", Q_df, ", p ", sep = "")
if (Q_p < 0.001) {
  cat("< .001 ***)\n")
} else {
  cat("= ", sprintf("%.4f", Q_p), ")\n", sep = "")
}
cat("    I¬≤ (% total var):  ", sprintf("%.2f", I2), "%\n", sep = "")
cat("    œÑ¬≤ (tau-squared):  ", sprintf("%.4f", tau2), "\n", sep = "")
cat("    œÑ (tau):           ", sprintf("%.4f", tau), "\n", sep = "")
cat("    H¬≤:                ", sprintf("%.4f", H2), "\n", sep = "")

cat("\n")

# Interpret I¬≤
cat("  I¬≤ Interpretation:\n")
if (I2 < 25) {
  I2_interpretation <- "Low heterogeneity (I¬≤ < 25%)"
  cat("    ‚Üí Low heterogeneity (I¬≤ < 25%)\n")
  cat("    ‚Üí Effect sizes are relatively homogeneous\n")
} else if (I2 < 50) {
  I2_interpretation <- "Moderate heterogeneity (25% ‚â§ I¬≤ < 50%)"
  cat("    ‚Üí Moderate heterogeneity (25% ‚â§ I¬≤ < 50%)\n")
  cat("    ‚Üí Some variability across studies\n")
} else if (I2 < 75) {
  I2_interpretation <- "Substantial heterogeneity (50% ‚â§ I¬≤ < 75%)"
  cat("    ‚Üí Substantial heterogeneity (50% ‚â§ I¬≤ < 75%)\n")
  cat("    ‚Üí Considerable variability across studies\n")
  cat("    ‚Üí ‚úÖ MODERATOR ANALYSIS WARRANTED\n")
} else {
  I2_interpretation <- "Very high heterogeneity (I¬≤ ‚â• 75%)"
  cat("    ‚Üí Very high heterogeneity (I¬≤ ‚â• 75%)\n")
  cat("    ‚Üí Large variability across studies\n")
  cat("    ‚Üí ‚úÖ MODERATOR ANALYSIS ESSENTIAL\n")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 1.4: Prediction Interval
# ----------------------------------------------------------------------------
cat("Step 1.4: Prediction interval for true effects...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Calculate 95% prediction interval
# PI = estimate ¬± t(k-2) √ó sqrt(œÑ¬≤ + SE¬≤)
k <- res_overall$k
t_crit <- qt(0.975, df = k - 2)
pi_lower <- overall_g - t_crit * sqrt(tau2 + overall_se^2)
pi_upper <- overall_g + t_crit * sqrt(tau2 + overall_se^2)

cat("  95% PREDICTION INTERVAL:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Interval: [", sprintf("%.4f", pi_lower), ", ", 
    sprintf("%.4f", pi_upper), "]\n", sep = "")
cat("\n")
cat("  Interpretation:\n")
cat("    ‚Üí In 95% of contexts, the true effect is expected to fall\n")
cat("      within this range\n")
cat("    ‚Üí Wide interval indicates substantial heterogeneity\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 1.5: Summary and Interpretation
# ----------------------------------------------------------------------------
cat("Step 1.5: Overall meta-analysis summary...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  SUMMARY:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Effect size interpretation
cat("  1. EFFECT SIZE:\n")
cat("     ‚Üí Training effect: g = ", sprintf("%.4f", overall_g), 
    " [", sprintf("%.4f", overall_ci_lb), ", ", 
    sprintf("%.4f", overall_ci_ub), "]\n", sep = "")
cat("     ‚Üí Interpretation:  ", sig_interpretation, "\n", sep = "")

# Effect size magnitude (Cohen's benchmarks)
if (abs(overall_g) < 0.2) {
  magnitude <- "Negligible to small"
} else if (abs(overall_g) < 0.5) {
  magnitude <- "Small to medium"
} else if (abs(overall_g) < 0.8) {
  magnitude <- "Medium to large"
} else {
  magnitude <- "Large to very large"
}
cat("     ‚Üí Magnitude:       ", magnitude, " (Cohen, 1988)\n", sep = "")

cat("\n")

cat("  2. HETEROGENEITY:\n")
cat("     ‚Üí ", I2_interpretation, "\n", sep = "")
if (I2 > 50) {
  cat("     ‚Üí CONCLUSION: Moderator analysis is justified\n")
  proceed_to_moderators <- TRUE
} else {
  cat("     ‚Üí CONCLUSION: Moderator analysis optional\n")
  proceed_to_moderators <- FALSE
}

cat("\n")

cat("  3. NEXT STEPS:\n")
if (proceed_to_moderators) {
  cat("     ‚úÖ Proceed to STEP 2: Moderator Analyses\n")
  cat("        (High I¬≤ indicates systematic variation to explain)\n")
} else {
  cat("     ‚Üí Moderator analysis not essential (low I¬≤)\n")
  cat("     ‚Üí May still explore moderators for theoretical reasons\n")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 1.6: Export Overall Results
# ----------------------------------------------------------------------------
cat("Step 1.6: Exporting overall meta-analysis results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Create results dataframe
overall_results <- data.frame(
  Analysis = "Overall Effect",
  k = res_overall$k,
  Estimate = round(overall_g, 4),
  SE = round(overall_se, 4),
  CI_Lower = round(overall_ci_lb, 4),
  CI_Upper = round(overall_ci_ub, 4),
  Z_value = round(overall_z, 4),
  p_value = round(overall_p, 4),
  PI_Lower = round(pi_lower, 4),
  PI_Upper = round(pi_upper, 4),
  Q = round(Q_stat, 4),
  Q_df = Q_df,
  Q_p = round(Q_p, 4),
  I2 = round(I2, 2),
  tau2 = round(tau2, 4),
  tau = round(tau, 4),
  H2 = round(H2, 4),
  stringsAsFactors = FALSE
)

# Verify output_path exists (defensive programming)
if (!exists("output_path") || is.null(output_path)) {
  output_path <- getwd()
  cat("  ‚ö†Ô∏è  output_path not found - using working directory\n")
}

cat("  üìÅ Output directory: ", output_path, "\n", sep = "")

# Check if directory exists and is writable
if (!dir.exists(output_path)) {
  cat("  ‚ö†Ô∏è  Directory does not exist, creating...\n")
  dir.create(output_path, recursive = TRUE)
}

# Construct output file path
output_file <- file.path(output_path, "overall_meta_analysis_results.csv")
cat("  üìÑ Full file path: ", output_file, "\n", sep = "")

# Test write permissions
test_file <- file.path(output_path, ".write_test.tmp")
tryCatch({
  writeLines("test", test_file)
  file.remove(test_file)
  cat("  ‚úÖ Write permissions: OK\n")
}, error = function(e) {
  cat("  ‚ùå Write permissions: FAILED\n")
  cat("     Error: ", conditionMessage(e), "\n", sep = "")
  stop("Cannot write to output directory. Check permissions.")
})

# Save to CSV
cat("  üíæ Writing CSV file...\n")

# Attempt to write with error handling
tryCatch({
  write.csv(overall_results, 
            output_file, 
            row.names = FALSE, 
            fileEncoding = "UTF-8")
  cat("  ‚úÖ Results exported: overall_meta_analysis_results.csv\n")
  cat("     Location: ", output_path, "\n", sep = "")
}, error = function(e) {
  # If file is locked, try alternative filename with timestamp
  timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  alt_file <- file.path(output_path, paste0("overall_meta_analysis_results_", timestamp, ".csv"))
  cat("  ‚ö†Ô∏è  Original file locked. Using alternative filename:\n")
  cat("     ", basename(alt_file), "\n", sep = "")
  write.csv(overall_results, 
            alt_file, 
            row.names = FALSE, 
            fileEncoding = "UTF-8")
  cat("  ‚úÖ Results exported successfully\n")
  cat("     Location: ", output_path, "\n", sep = "")
  cat("\n  ‚ÑπÔ∏è  NOTE: Close the original CSV file and re-run to use standard filename\n")
})

cat("\n")
cat("‚úÖ STEP 1 COMPLETE: Overall random-effects meta-analysis finished\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")


STEP 1: OVERALL RANDOM-EFFECTS META-ANALYSIS

Research Question: Is L2 pronunciation training effective overall?

Step 1.1: Fitting random-effects meta-analysis model (REML)...
-------------------------------------------------------------------------------- 
  ‚úÖ Model fitted successfully
STEP 1: OVERALL RANDOM-EFFECTS META-ANALYSIS

Research Question: Is L2 pronunciation training effective overall?

Step 1.1: Fitting random-effects meta-analysis model (REML)...
-------------------------------------------------------------------------------- 
  ‚úÖ Model fitted successfully
     Method:     Restricted Maximum Likelihood (REML)
     Model type: Random-effects (intercept-only)
     Test:       Z-distribution

Step 1.2: Overall effect size estimate...
-------------------------------------------------------------------------------- 

  OVERALL EFFECT SIZE (Hedges' g):
  ----------------------------------------------------------------------------
    Estimate (g):      0.5272
    Standard 

In [None]:
# ============================================================================
# STEP 2: MODERATOR ANALYSES
# ============================================================================
#
# OUTPUT FILES GENERATED IN STEP 2:
#
# (A) UNIVARIATE ANALYSES (STEP 2.1)
#   1. univariate_moderator_results.csv
#       ‚Ä¢ Contains results for all tested moderators
#       ‚Ä¢ Includes: Œ≤, SE, 95% CI, p-values, R¬≤, œÑ¬≤_residual, I¬≤_residual, QM
#
#   2. significant_moderators.csv
#       ‚Ä¢ Contains only moderators with p < .05
#       ‚Ä¢ Used as inputs for STEP 2.2 multivariate analysis
#
#
# (B) MULTIVARIATE ANALYSIS (STEP 2.2) 
#   *Generated only if ‚â• 2 significant moderators from STEP 2.1*
#
#   3. multivariate_model_coefficients.csv
#       ‚Ä¢ Regression coefficients for each moderator (adjusted model)
#       ‚Ä¢ Includes: Œ≤, SE, CI, z, p
#
#   4. multivariate_model_fit.csv
#       ‚Ä¢ Overall model statistics
#       ‚Ä¢ Includes: QM, QE, œÑ¬≤, I¬≤, pseudo-R¬≤, k, p
#
# ============================================================================


# ============================================================================
# STEP 2.1: UNIVARIATE MODERATOR ANALYSES
# ============================================================================
#
# Purpose:  Identify candidate predictors via single-moderator tests
# Model:    yi = Œ≤‚ÇÄ + Œ≤‚ÇÅ(Moderator) + ui + ei
#           where ui ~ N(0,œÑ¬≤), ei ~ N(0,vi)
#
# Rationale:
#   Univariate meta-regression isolates bivariate associations before
#   accounting for confounds. Significant Œ≤‚ÇÅ indicates systematic
#   variation warranting multivariate investigation.
#
# Criteria:
#   ‚Ä¢ n ‚â• 5 complete cases
#   ‚Ä¢ ‚â• 2 moderator levels
#   ‚Ä¢ Model convergence
#
# Outputs:
#   Œ≤‚ÇÅ (moderator effect), SE, 95% CI, p-value, R¬≤ (variance explained)
#
# Decision Rule:
#   p < .05 ‚Üí Retain for STEP 2.2 multivariate model
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 2.1: UNIVARIATE MODERATOR ANALYSES\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Which factors moderate training effectiveness?\n\n")

# ----------------------------------------------------------------------------
# Step 2.1.1: Define Candidate Moderator Variables
# ----------------------------------------------------------------------------
cat("Step 2.1.1: Defining candidate moderator variables...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# ----------------------------------------------------------------------------
# Set Reference Levels for Categorical Moderators
# ----------------------------------------------------------------------------
# Purpose: Establish meaningful baseline categories for meta-regression
#          interpretation (Nature reporting standard)
#
# Rationale: Reference levels should represent:
#   ‚Ä¢ Most common/typical condition (enhances generalizability)
#   ‚Ä¢ Theoretically neutral baseline (facilitates interpretation)
#   ‚Ä¢ Contrast group for effect comparison
#
# Note: All regression coefficients represent difference from reference
# ----------------------------------------------------------------------------

cat("  Setting reference levels for categorical moderators...\n\n")

# LEARNER CHARACTERISTICS
# Adult as reference (most prevalent in dataset)
if ("Age_Group" %in% names(df)) {
  df$Age_Group <- relevel(factor(df$Age_Group), ref = "Adult")
}

# Beginner as reference (lowest proficiency baseline)
if ("Proficiency_Level" %in% names(df)) {
  df$Proficiency_Level <- relevel(factor(df$Proficiency_Level), ref = "Beginner")
}

# Non-English major as reference (typical baseline)
if ("English_Major" %in% names(df)) {
  df$English_Major <- relevel(factor(df$English_Major), ref = "No")
}

# Undergraduate as reference (most common education level)
if ("Education_Stage" %in% names(df)) {
  df$Education_Stage <- relevel(factor(df$Education_Stage), ref = "Undergraduate")
}

# LEARNING ENVIRONMENT
# Foreign language context as reference (more common than L2)
if ("Learning_Context" %in% names(df)) {
  df$Learning_Context <- relevel(factor(df$Learning_Context), ref = "Foreign")
}

# Classroom as reference (most ecological setting)
if ("Training_Context" %in% names(df)) {
  df$Training_Context <- relevel(factor(df$Training_Context), ref = "Classroom")
}

# INSTRUCTIONAL FEATURES
# Production as reference (most common focus type)
if ("Focus_Type" %in% names(df)) {
  df$Focus_Type <- relevel(factor(df$Focus_Type), ref = "Production")
}

# Segmental as reference (traditional pronunciation target)
if ("Target_Feature" %in% names(df)) {
  df$Target_Feature <- relevel(factor(df$Target_Feature), ref = "Segmental")
}

# Explicit feedback as reference (most common feedback type)
if ("Feedback_Type" %in% names(df)) {
  df$Feedback_Type <- relevel(factor(df$Feedback_Type), ref = "Explicit")
}

# Human instructor as reference (traditional instruction mode)
if ("Instructor_Type" %in% names(df)) {
  df$Instructor_Type <- relevel(factor(df$Instructor_Type), ref = "Human")
}

# No peer interaction as reference (typical baseline)
if ("Peer_Interaction" %in% names(df)) {
  df$Peer_Interaction <- relevel(factor(df$Peer_Interaction), ref = "No")
}

# No visual cue as reference (audio-only baseline)
if ("Visual_Cue" %in% names(df)) {
  df$Visual_Cue <- relevel(factor(df$Visual_Cue), ref = "No")
}

# Short duration as reference (minimal treatment baseline)
if ("Treatment_Duration" %in% names(df)) {
  df$Treatment_Duration <- relevel(factor(df$Treatment_Duration), ref = "Short")
}

# METHODOLOGICAL FEATURES
# Active BAU as reference (most common comparator)
if ("Comparator_Type" %in% names(df)) {
  df$Comparator_Type <- relevel(factor(df$Comparator_Type), ref = "Active_BAU")
}

# Quasi-experimental as reference (most common design)
if ("Design_Type" %in% names(df)) {
  df$Design_Type <- relevel(factor(df$Design_Type), ref = "Quasi_Experiment")
}

# Pronunciation accuracy as reference (most common outcome)
if ("Outcome_Domain" %in% names(df)) {
  df$Outcome_Domain <- relevel(factor(df$Outcome_Domain), ref = "Pronunciation_Accuracy")
}

# Human rater as reference (traditional assessment)
if ("Rater_Type" %in% names(df)) {
  df$Rater_Type <- relevel(factor(df$Rater_Type), ref = "Human")
}

cat("  ‚úÖ Reference levels set for all categorical moderators\n")
cat("     Interpretation: Regression coefficients = difference from reference\n\n")

# Organize moderators by conceptual categories

# CATEGORY 1: Learner Characteristics
learner_moderators <- c(
  "Age_Group",          # Adult vs. children/adolescent
  "L1",                 # First language background
  "Proficiency_Level",  # Beginner, intermediate, advanced
  "Education_Stage",    # Educational level
  "English_Major"       # English major vs. non-major
)

# CATEGORY 2: Learning Environment
environment_moderators <- c(
  "Learning_Context",   # L2 vs. FL context
  "Training_Context"    # Classroom vs. lab
)

# CATEGORY 3: Instructional Features
instruction_moderators <- c(
  "Focus_Type",         # Explicit, implicit, reactive
  "Target_Feature",     # Segmentals, suprasegmentals, both
  "Feedback_Type",      # Visual, auditory, combined
  "Instructor_Type",    # Human, computer, peer
  "Peer_Interaction",   # Yes/No
  "Visual_Cue",         # Yes/No
  "Training_TotalMinute",  # Duration (numeric)
  "Training_TotalWeeks",   # Duration (numeric)
  "Treatment_Duration"     # Duration categories
)

# CATEGORY 4: Methodological Features
method_moderators <- c(
  "Comparator_Type",    # Control condition type
  "Design_Type",        # Between vs. within
  "Outcome_Domain",     # Controlled vs. spontaneous
  "Rater_Type"          # Expert vs. non-expert
)

# Combine all moderators
all_moderators <- c(
  learner_moderators,
  environment_moderators,
  instruction_moderators,
  method_moderators
)

# Display moderator categories
cat("  MODERATOR CATEGORIES:\n\n")
cat("  Category 1: Learner Characteristics (n = ", length(learner_moderators), ")\n", sep = "")
for (i in seq_along(learner_moderators)) {
  cat("    ", i, ". ", learner_moderators[i], "\n", sep = "")
}
cat("\n")

cat("  Category 2: Learning Environment (n = ", length(environment_moderators), ")\n", sep = "")
for (i in seq_along(environment_moderators)) {
  cat("    ", i, ". ", environment_moderators[i], "\n", sep = "")
}
cat("\n")

cat("  Category 3: Instructional Features (n = ", length(instruction_moderators), ")\n", sep = "")
for (i in seq_along(instruction_moderators)) {
  cat("    ", i, ". ", instruction_moderators[i], "\n", sep = "")
}
cat("\n")

cat("  Category 4: Methodological Features (n = ", length(method_moderators), ")\n", sep = "")
for (i in seq_along(method_moderators)) {
  cat("    ", i, ". ", method_moderators[i], "\n", sep = "")
}
cat("\n")

cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("  TOTAL CANDIDATE MODERATORS: ", length(all_moderators), "\n", sep = "")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")

cat("\n")

# Filter to only include moderators present in dataset
available_moderators <- intersect(all_moderators, names(df))
missing_moderators <- setdiff(all_moderators, names(df))

if (length(missing_moderators) > 0) {
  cat("  ‚ö†Ô∏è  WARNING: ", length(missing_moderators), " moderator(s) not in dataset:\n", sep = "")
  for (mod in missing_moderators) {
    cat("     - ", mod, "\n", sep = "")
  }
  cat("\n")
}

cat("  ‚úÖ Available moderators: ", length(available_moderators), "/", 
    length(all_moderators), "\n", sep = "")

# Update moderator list
all_moderators <- available_moderators

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.2: Initialize Storage for Results
# ----------------------------------------------------------------------------
cat("Step 2.1.2: Initializing result storage...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Storage containers
univariate_results_list <- list()  # Stores all successful models
usable_moderators <- c()           # Tracks which moderators worked
skipped_moderators <- data.frame(  # Tracks failures
  Moderator = character(),
  Reason = character(),
  stringsAsFactors = FALSE
)

cat("  ‚úÖ Storage containers initialized\n")
cat("     - univariate_results_list: Model results\n")
cat("     - usable_moderators:       Successful moderators\n")
cat("     - skipped_moderators:      Failed/excluded moderators\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.3: Run Univariate Meta-Regression Loop
# ----------------------------------------------------------------------------
cat("Step 2.1.3: Running univariate meta-regression loop...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("Testing ", length(all_moderators), " moderators individually...\n\n", sep = "")

# Progress counter
mod_counter <- 0

# Loop through each moderator
for (m in all_moderators) {
  
  mod_counter <- mod_counter + 1
  
  # Display progress
  cat(paste0(rep("-", 80), collapse = ""), "\n")
  cat("[", mod_counter, "/", length(all_moderators), "] MODERATOR: ", m, "\n", sep = "")
  cat(paste0(rep("-", 80), collapse = ""), "\n")
  
  # --- Sub-step 1: Extract complete cases ---
  cat("  (1) Extracting complete cases... ", sep = "")
  
  temp_df <- df[, c("Study_ID", "Effect_ID", "Hedges_g", "vi", m), drop = FALSE]
  temp_df <- temp_df[complete.cases(temp_df), ]
  
  n_complete <- nrow(temp_df)
  cat(n_complete, " cases\n", sep = "")
  
  # --- Sub-step 2: Check minimum sample size ---
  if (n_complete < 5) {
    cat("  (2) ‚ùå SKIP: Insufficient data (n = ", n_complete, " < 5)\n\n", sep = "")
    skipped_moderators <- rbind(skipped_moderators, data.frame(
      Moderator = m,
      Reason = paste0("Insufficient data (n=", n_complete, ")"),
      stringsAsFactors = FALSE
    ))
    next
  }
  
  cat("  (2) ‚úÖ Sample size adequate (n = ", n_complete, ")\n", sep = "")
  
  # --- Sub-step 3: Check variance ---
  n_unique <- length(unique(temp_df[[m]]))
  cat("  (3) Checking variance... ", n_unique, " unique levels\n", sep = "")
  
  if (n_unique < 2) {
    cat("  (4) ‚ùå SKIP: No variance (only 1 level)\n\n")
    skipped_moderators <- rbind(skipped_moderators, data.frame(
      Moderator = m,
      Reason = "No variance (1 unique level)",
      stringsAsFactors = FALSE
    ))
    next
  }
  
  # --- Sub-step 4: Prepare moderator variable ---
  if (is.character(temp_df[[m]])) {
    temp_df[[m]] <- as.factor(temp_df[[m]])
    cat("  (4) Converted to factor (categorical)\n")
  } else if (is.numeric(temp_df[[m]])) {
    cat("  (4) Numeric moderator (will be centered)\n")
  } else {
    cat("  (4) Variable type: ", class(temp_df[[m]])[1], "\n", sep = "")
  }
  
  # --- Sub-step 5: Fit meta-regression model ---
  cat("  (5) Fitting meta-regression model (REML)... ", sep = "")
  
  model <- tryCatch(
    {
      rma(yi = Hedges_g,
          vi = vi,
          mods = as.formula(paste0("~", m)),
          data = temp_df,
          method = "REML")
    },
    error = function(e) {
      cat("FAILED\n")
      cat("      Error: ", e$message, "\n\n", sep = "")
      return(NULL)
    }
  )
  
  if (is.null(model)) {
    skipped_moderators <- rbind(skipped_moderators, data.frame(
      Moderator = m,
      Reason = "Model convergence failure",
      stringsAsFactors = FALSE
    ))
    next
  }
  
  cat("SUCCESS\n")
  
  # --- Sub-step 6: Extract coefficients ---
  coefs <- coef(model)
  
  if (length(coefs) < 2 || is.na(coefs[2])) {
    cat("  (6) ‚ùå SKIP: No valid moderator coefficient\n\n")
    skipped_moderators <- rbind(skipped_moderators, data.frame(
      Moderator = m,
      Reason = "No valid coefficient",
      stringsAsFactors = FALSE
    ))
    next
  }
  
  # --- Sub-step 7: Extract inference statistics ---
  if (is.null(model$ci.lb) || is.null(model$ci.ub) ||
      length(model$ci.lb) < 2 || length(model$ci.ub) < 2 ||
      is.na(model$ci.lb[2]) || is.na(model$ci.ub[2])) {
    cat("  (6) ‚ùå SKIP: CI extraction failed\n\n")
    skipped_moderators <- rbind(skipped_moderators, data.frame(
      Moderator = m,
      Reason = "CI not available",
      stringsAsFactors = FALSE
    ))
    next
  }
  
  # Get p-value
  model_summary <- summary(model)
  pval <- model_summary$pval[2]
  
  cat("  (6) Coefficient extracted: Œ≤ = ", sprintf("%.4f", coefs[2]), 
      ", p = ", sprintf("%.4f", pval), sep = "")
  
  # Significance marker
  if (pval < 0.001) {
    cat(" ***\n")
  } else if (pval < 0.01) {
    cat(" **\n")
  } else if (pval < 0.05) {
    cat(" *\n")
  } else {
    cat("\n")
  }
  
  # --- Sub-step 8: Calculate R¬≤ (proportion of heterogeneity explained) ---
  # R¬≤ = (œÑ¬≤_overall - œÑ¬≤_residual) / œÑ¬≤_overall
  R2 <- max(0, (tau2 - model$tau2) / tau2) * 100
  
  cat("  (7) Heterogeneity: œÑ¬≤ = ", sprintf("%.4f", model$tau2), 
      ", I¬≤ = ", sprintf("%.2f", model$I2), "%, R¬≤ = ", 
      sprintf("%.2f", R2), "%\n", sep = "")
  
  # --- Sub-step 9: Store results ---
  univariate_results_list[[m]] <- data.frame(
    Moderator = m,
    n = n_complete,
    n_levels = n_unique,
    Estimate = coefs[2],
    SE = model$se[2],
    CI_Lower = model$ci.lb[2],
    CI_Upper = model$ci.ub[2],
    z_value = coefs[2] / model$se[2],
    p_value = pval,
    tau2_residual = model$tau2,
    I2_residual = model$I2,
    R2 = R2,
    QM = model$QM,
    QM_p = model$QMp,
    stringsAsFactors = FALSE
  )
  
  usable_moderators <- c(usable_moderators, m)
  
  cat("  (8) ‚úÖ Results stored successfully\n")
  cat("\n")
  
}  # End of moderator loop

cat(paste0(rep("=", 80), collapse = ""), "\n\n")


# ----------------------------------------------------------------------------
# Step 2.1.4: Summarize Univariate Analysis Loop
# ----------------------------------------------------------------------------
cat("Step 2.1.4: Univariate analysis loop summary...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

n_tested <- length(all_moderators)
n_successful <- length(usable_moderators)
n_skipped <- nrow(skipped_moderators)

cat("  LOOP SUMMARY:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Total moderators tested: ", n_tested, "\n", sep = "")
cat("    Successful analyses:     ", n_successful, " (", 
    sprintf("%.1f", (n_successful/n_tested)*100), "%)\n", sep = "")
cat("    Skipped moderators:      ", n_skipped, " (", 
    sprintf("%.1f", (n_skipped/n_tested)*100), "%)\n", sep = "")

cat("\n")

# Display skipped moderators if any
if (n_skipped > 0) {
  cat("  Skipped moderators:\n")
  for (i in 1:nrow(skipped_moderators)) {
    cat("    ", i, ". ", skipped_moderators$Moderator[i], 
        " (", skipped_moderators$Reason[i], ")\n", sep = "")
  }
  cat("\n")
}

# Check if analysis can proceed
if (n_successful == 0) {
  cat("  ‚ùå ERROR: No moderators could be analyzed\n")
  cat("     Cannot proceed with moderator analysis\n")
  stop("Univariate moderator analysis failed - no valid models")
}

cat("  ‚úÖ Proceeding with ", n_successful, " successfully analyzed moderators\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.5: Combine and Sort Results
# ----------------------------------------------------------------------------
cat("Step 2.1.5: Organizing univariate results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Combine all results into single dataframe
univariate_results <- do.call(rbind, univariate_results_list)
rownames(univariate_results) <- NULL

# Sort by p-value (most significant first)
univariate_results <- univariate_results[order(univariate_results$p_value), ]

cat("  ‚úÖ Results combined and sorted by significance\n")
cat("     Rows: ", nrow(univariate_results), "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.6: Display All Univariate Results
# ----------------------------------------------------------------------------
cat("Step 2.1.6: Displaying all univariate meta-regression results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  ALL UNIVARIATE MODERATOR RESULTS (sorted by p-value):\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Create formatted display table
for (i in 1:nrow(univariate_results)) {
  row <- univariate_results[i, ]
  
  # Significance marker
  sig <- ""
  if (row$p_value < 0.001) {
    sig <- " ***"
  } else if (row$p_value < 0.01) {
    sig <- " **"
  } else if (row$p_value < 0.05) {
    sig <- " *"
  }
  
  cat(sprintf("  %2d. %-25s (n=%2d, k=%d)\n", 
              i, row$Moderator, row$n, row$n_levels))
  cat(sprintf("      Œ≤ = %7.4f, SE = %6.4f, 95%% CI [%7.4f, %7.4f]\n",
              row$Estimate, row$SE, row$CI_Lower, row$CI_Upper))
  cat(sprintf("      p = %7.4f%s, R¬≤ = %5.2f%%, œÑ¬≤ = %6.4f\n",
              row$p_value, sig, row$R2, row$tau2_residual))
  cat("\n")
}

cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.7: Identify Significant Moderators
# ----------------------------------------------------------------------------
cat("Step 2.1.7: Identifying significant moderators (p < .05)...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Filter significant results
significant_moderators <- univariate_results[univariate_results$p_value < 0.05, ]
n_significant <- nrow(significant_moderators)

cat("  SIGNIFICANT MODERATORS (p < .05):\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

if (n_significant > 0) {
  cat("  ‚úÖ Found ", n_significant, " significant moderator(s)\n\n", sep = "")
  
  for (i in 1:n_significant) {
    row <- significant_moderators[i, ]
    
    # Significance level
    if (row$p_value < 0.001) {
      sig_level <- "p < .001 (highly significant)"
      sig_marker <- "***"
    } else if (row$p_value < 0.01) {
      sig_level <- "p < .01 (very significant)"
      sig_marker <- "**"
    } else {
      sig_level <- "p < .05 (significant)"
      sig_marker <- "*"
    }
    
    cat("  ", i, ". ", row$Moderator, " ", sig_marker, "\n", sep = "")
    cat("     Œ≤ = ", sprintf("%.4f", row$Estimate), 
        " [", sprintf("%.4f", row$CI_Lower), ", ", 
        sprintf("%.4f", row$CI_Upper), "]\n", sep = "")
    cat("     ", sig_level, "\n", sep = "")
    cat("     Explains ", sprintf("%.2f", row$R2), "% of heterogeneity\n", sep = "")
    cat("\n")
  }
} else {
  cat("  ‚ö†Ô∏è  No moderators reached statistical significance at p < .05\n")
  cat("     ‚Üí Effect sizes are relatively homogeneous across moderator levels\n")
  cat("     ‚Üí Multivariate analysis may not be warranted\n")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 2.1.8: Export Univariate Results
# ----------------------------------------------------------------------------
cat("Step 2.1.8: Exporting univariate results to CSV...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Round for export
univariate_export <- univariate_results
numeric_cols <- c("Estimate", "SE", "CI_Lower", "CI_Upper", "z_value", 
                  "p_value", "tau2_residual", "I2_residual", "R2", "QM", "QM_p")
for (col in numeric_cols) {
  if (col %in% names(univariate_export)) {
    univariate_export[[col]] <- round(univariate_export[[col]], 4)
  }
}

# Export all results
safe_write_csv(univariate_export, "univariate_moderator_results.csv")

# Export significant results only
if (n_significant > 0) {
  significant_export <- significant_moderators
  for (col in numeric_cols) {
    if (col %in% names(significant_export)) {
      significant_export[[col]] <- round(significant_export[[col]], 4)
    }
  }
  
  safe_write_csv(significant_export, "significant_moderators.csv")
            row.names = FALSE,
            fileEncoding = "UTF-8")
  cat("  ‚úÖ Significant results saved: significant_moderators.csv\n")
}

cat("\n")
cat("‚úÖ STEP 2.1 COMPLETE: Univariate moderator analyses finished\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")


# ============================================================================
# STEP 2.2: MULTIVARIATE META-REGRESSION
# ============================================================================
#
# Purpose:  Identify independent predictors after controlling for confounds
# Model:    yi = Œ≤‚ÇÄ + Œ≤‚ÇÅ(Mod‚ÇÅ) + Œ≤‚ÇÇ(Mod‚ÇÇ) + ... + Œ≤‚Çñ(Mod‚Çñ) + ui + ei
#
# Rationale:
#   Univariate tests may reflect spurious associations due to correlations
#   among moderators. Multivariate models isolate unique contributions.
#
# Selection Criteria:
#   ‚Ä¢ Include p < .05 moderators from STEP 2.1
#   ‚Ä¢ REML estimation for unbiased variance components
#
# Key Outputs:
#   Adjusted Œ≤ (controlling for covariates), SE, 95% CI, p-value
#   Model R¬≤ (total variance explained), QM (omnibus test)
#
# Interpretation:
#   Œ≤ significant in both models ‚Üí Robust independent predictor
#   Œ≤ n.s. in multivariate (sig in univariate) ‚Üí Confounded effect
#   Model R¬≤ ‚Üí Combined explanatory power of moderator set
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 2.2: MULTIVARIATE META-REGRESSION\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Which moderators are unique predictors?\n\n")

# ----------------------------------------------------------------------------
# Step 2.2.1: Determine Moderators for Multivariate Model
# ----------------------------------------------------------------------------
cat("Step 2.2.1: Selecting moderators for multivariate model...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Decision criteria
cat("  SELECTION CRITERIA:\n")
cat("    1. Statistical: p < .05 in univariate analysis\n")
cat("    2. Theoretical: Strong theoretical rationale\n")
cat("    3. Practical: Sufficient complete cases for joint analysis\n\n")

# Identify significant moderators
significant_mod_names <- significant_moderators$Moderator

cat("  Significant moderators from univariate analysis: ", 
    length(significant_mod_names), "\n", sep = "")
if (length(significant_mod_names) > 0) {
  for (i in seq_along(significant_mod_names)) {
    cat("    ", i, ". ", significant_mod_names[i], "\n", sep = "")
  }
} else {
  cat("    (None)\n")
}

cat("\n")

# Check if multivariate analysis is warranted
if (length(significant_mod_names) < 2) {
  cat("  ‚ö†Ô∏è  DECISION: Multivariate model not warranted\n")
  cat("     Reason: Fewer than 2 significant moderators\n")
  cat("     ‚Üí Univariate results are sufficient\n\n")
  
  cat("‚úÖ STEP 2.2 SKIPPED: Insufficient moderators for multivariate analysis\n")
  cat(paste0(rep("=", 80), collapse = ""), "\n\n")
  
  # Set flag to skip multivariate sections
  run_multivariate <- FALSE
  
} else {
  
  run_multivariate <- TRUE
  
  cat("  ‚úÖ DECISION: Proceed with multivariate analysis\n")
  cat("     ", length(significant_mod_names), " significant moderators available\n\n", sep = "")
  
  # ----------------------------------------------------------------------------
  # Step 2.2.2: Prepare Dataset for Multivariate Analysis
  # ----------------------------------------------------------------------------
  cat("Step 2.2.2: Preparing dataset for multivariate model...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n")
  
  # Select columns
  multi_cols <- c("Study_ID", "Effect_ID", "Hedges_g", "vi", significant_mod_names)
  df_multi <- df[, multi_cols, drop = FALSE]
  
  # Remove incomplete cases
  df_multi_complete <- df_multi[complete.cases(df_multi), ]
  
  n_original <- nrow(df)
  n_complete <- nrow(df_multi_complete)
  n_dropped <- n_original - n_complete
  
  cat("  Original dataset:    ", n_original, " effect sizes\n", sep = "")
  cat("  Complete cases:      ", n_complete, " effect sizes\n", sep = "")
  cat("  Dropped (missing):   ", n_dropped, " (", 
      sprintf("%.1f", (n_dropped/n_original)*100), "%)\n", sep = "")
  
  # Check if sufficient cases remain
  if (n_complete < 10) {
    cat("\n  ‚ö†Ô∏è  WARNING: Only ", n_complete, " complete cases\n", sep = "")
    cat("     Multivariate results may be unstable\n")
    cat("     Consider reducing number of moderators\n\n")
  } else {
    cat("\n  ‚úÖ Sufficient complete cases for multivariate analysis\n\n")
  }
  
  # Convert character variables to factors
  for (mod in significant_mod_names) {
    if (is.character(df_multi_complete[[mod]])) {
      df_multi_complete[[mod]] <- as.factor(df_multi_complete[[mod]])
    }
  }
  
  # ----------------------------------------------------------------------------
  # Step 2.2.3: Build and Fit Multivariate Model
  # ----------------------------------------------------------------------------
  cat("Step 2.2.3: Fitting multivariate meta-regression model...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n")
  
  # Build formula
  formula_multi <- as.formula(
    paste("Hedges_g ~", paste(significant_mod_names, collapse = " + "))
  )
  
  cat("\n  MODEL FORMULA:\n")
  cat("    ", deparse(formula_multi), "\n\n", sep = "")
  
  # Fit model
  cat("  Fitting model using REML estimation...\n")
  
  res_multi <- tryCatch(
    {
      rma(yi = Hedges_g,
          vi = vi,
          mods = formula_multi,
          data = df_multi_complete,
          method = "REML")
    },
    error = function(e) {
      cat("  ‚ùå ERROR: Model fitting failed\n")
      cat("     ", e$message, "\n", sep = "")
      return(NULL)
    }
  )
  
  if (is.null(res_multi)) {
    cat("\n  ‚ö†Ô∏è  Multivariate model could not be fitted\n")
    cat("     Possible reasons:\n")
    cat("       - Perfect multicollinearity\n")
    cat("       - Insufficient data\n")
    cat("       - Convergence issues\n\n")
    
    run_multivariate <- FALSE
    
  } else {
    cat("  ‚úÖ Model fitted successfully\n\n")
    
    # ----------------------------------------------------------------------------
    # Step 2.2.4: Display Multivariate Results
    # ----------------------------------------------------------------------------
    cat("Step 2.2.4: Multivariate meta-regression results...\n")
    cat(paste0(rep("-", 80), collapse = ""), "\n\n")
    
    cat("  MODEL SUMMARY:\n")
    cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
    cat("\n")
    
    # Display coefficients table
    coef_table <- data.frame(
      Predictor = rownames(res_multi$b),
      Estimate = res_multi$b[, 1],
      SE = res_multi$se,
      CI_Lower = res_multi$ci.lb,
      CI_Upper = res_multi$ci.ub,
      z_value = res_multi$zval,
      p_value = res_multi$pval,
      stringsAsFactors = FALSE
    )
    
    cat("  COEFFICIENTS:\n\n")
    for (i in 1:nrow(coef_table)) {
      row <- coef_table[i, ]
      
      # Significance marker
      sig <- ""
      if (row$p_value < 0.001) {
        sig <- " ***"
      } else if (row$p_value < 0.01) {
        sig <- " **"
      } else if (row$p_value < 0.05) {
        sig <- " *"
      }
      
      cat(sprintf("  %-30s\n", row$Predictor))
      cat(sprintf("    Œ≤ = %7.4f, SE = %6.4f, 95%% CI [%7.4f, %7.4f]\n",
                  row$Estimate, row$SE, row$CI_Lower, row$CI_Upper))
      cat(sprintf("    z = %7.4f, p = %7.4f%s\n",
                  row$z_value, row$p_value, sig))
      cat("\n")
    }
    
    # ----------------------------------------------------------------------------
    # Step 2.2.5: Model Fit Statistics
    # ----------------------------------------------------------------------------
    cat("  MODEL FIT:\n")
    cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
    
    # Calculate pseudo R¬≤
    R2_multi <- max(0, (tau2 - res_multi$tau2) / tau2) * 100
    
    cat("    QM (model test):         ", sprintf("%.4f", res_multi$QM), 
        " (df = ", res_multi$p - 1, ", p ", sep = "")
    if (res_multi$QMp < 0.001) {
      cat("< .001 ***)\n")
    } else {
      cat("= ", sprintf("%.4f", res_multi$QMp), ")\n", sep = "")
    }
    
    cat("    QE (residual):           ", sprintf("%.4f", res_multi$QE), 
        " (df = ", res_multi$k - res_multi$p, ", p ", sep = "")
    if (res_multi$QEp < 0.001) {
      cat("< .001 ***)\n")
    } else {
      cat("= ", sprintf("%.4f", res_multi$QEp), ")\n", sep = "")
    }
    
    cat("    œÑ¬≤ (residual):           ", sprintf("%.4f", res_multi$tau2), "\n", sep = "")
    cat("    I¬≤ (residual):           ", sprintf("%.2f", res_multi$I2), "%\n", sep = "")
    cat("    R¬≤ (pseudo):             ", sprintf("%.2f", R2_multi), "%\n", sep = "")
    
    cat("\n")
    
    # Interpret R¬≤
    if (R2_multi < 25) {
      cat("    ‚Üí Model explains <25% of heterogeneity (low)\n")
    } else if (R2_multi < 50) {
      cat("    ‚Üí Model explains 25-50% of heterogeneity (moderate)\n")
    } else if (R2_multi < 75) {
      cat("    ‚Üí Model explains 50-75% of heterogeneity (substantial)\n")
    } else {
      cat("    ‚Üí Model explains ‚â•75% of heterogeneity (high)\n")
    }
    
    cat("\n")
    
    # ----------------------------------------------------------------------------
    # Step 2.2.6: Compare Univariate vs Multivariate Results
    # ----------------------------------------------------------------------------
    cat("Step 2.2.6: Comparing univariate vs multivariate results...\n")
    cat(paste0(rep("-", 80), collapse = ""), "\n\n")
    
    cat("  UNIVARIATE VS MULTIVARIATE COMPARISON:\n")
    cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
    cat("\n")
    
    # For each moderator, compare univariate and multivariate results
    for (mod in significant_mod_names) {
      # Get univariate result
      uni_row <- univariate_results[univariate_results$Moderator == mod, ]
      
      # Get multivariate coefficient (may be multiple if categorical)
      # Find rows that contain the moderator name
      multi_rows <- grep(mod, rownames(res_multi$b), value = FALSE)
      
      if (length(multi_rows) > 0) {
        # Take the first coefficient (or average if multiple)
        multi_idx <- multi_rows[1]
        multi_beta <- res_multi$b[multi_idx, 1]
        multi_p <- res_multi$pval[multi_idx]
        
        uni_sig <- ifelse(uni_row$p_value < 0.05, "Sig", "NS")
        multi_sig <- ifelse(multi_p < 0.05, "Sig", "NS")
        
        interpretation <- ""
        if (uni_sig == "Sig" && multi_sig == "Sig") {
          interpretation <- "‚Üí Unique predictor"
        } else if (uni_sig == "Sig" && multi_sig == "NS") {
          interpretation <- "‚Üí Confounded with other moderators"
        } else if (uni_sig == "NS" && multi_sig == "Sig") {
          interpretation <- "‚Üí Suppressor effect"
        }
        
        cat("  ", mod, "\n", sep = "")
        cat(sprintf("    Univariate:   Œ≤ = %7.4f, p = %7.4f (%s)\n",
                    uni_row$Estimate, uni_row$p_value, uni_sig))
        cat(sprintf("    Multivariate: Œ≤ = %7.4f, p = %7.4f (%s)\n",
                    multi_beta, multi_p, multi_sig))
        cat("    ", interpretation, "\n", sep = "")
        cat("\n")
      }
    }
    
    # ----------------------------------------------------------------------------
    # Step 2.2.7: Export Multivariate Results
    # ----------------------------------------------------------------------------
    cat("Step 2.2.7: Exporting multivariate results...\n")
    cat(paste0(rep("-", 80), collapse = ""), "\n")
    
    # Round coefficients for export
    coef_export <- coef_table
    coef_export$Estimate <- round(coef_export$Estimate, 4)
    coef_export$SE <- round(coef_export$SE, 4)
    coef_export$CI_Lower <- round(coef_export$CI_Lower, 4)
    coef_export$CI_Upper <- round(coef_export$CI_Upper, 4)
    coef_export$z_value <- round(coef_export$z_value, 4)
    coef_export$p_value <- round(coef_export$p_value, 4)
    
    # Add model fit statistics
    model_fit <- data.frame(
      Model = "Multivariate",
      k = res_multi$k,
      p = res_multi$p,
      QM = round(res_multi$QM, 4),
      QM_p = round(res_multi$QMp, 4),
      QE = round(res_multi$QE, 4),
      QE_p = round(res_multi$QEp, 4),
      tau2 = round(res_multi$tau2, 4),
      I2 = round(res_multi$I2, 2),
      R2 = round(R2_multi, 2),
      stringsAsFactors = FALSE
    )
    
    write.csv(coef_export,
              "multivariate_model_coefficients.csv",
              row.names = FALSE,
              fileEncoding = "UTF-8")
    cat("  ‚úÖ Coefficients saved: multivariate_model_coefficients.csv\n")
    
    write.csv(model_fit,
              "multivariate_model_fit.csv",
              row.names = FALSE,
              fileEncoding = "UTF-8")
    cat("  ‚úÖ Model fit saved: multivariate_model_fit.csv\n")
    
    cat("\n")
  }
}

if (run_multivariate) {
  cat("‚úÖ STEP 2.2 COMPLETE: Multivariate meta-regression finished\n")
} else {
  cat("‚úÖ STEP 2.2 COMPLETE: Multivariate analysis not performed\n")
}
cat(paste0(rep("=", 80), collapse = ""), "\n\n")



STEP 2.1: UNIVARIATE MODERATOR ANALYSES

Research Question: Which factors moderate training effectiveness?

Step 2.1.1: Defining candidate moderator variables...
-------------------------------------------------------------------------------- 

  Setting reference levels for categorical moderators...

  ‚úÖ Reference levels set for all categorical moderators
STEP 2.1: UNIVARIATE MODERATOR ANALYSES

Research Question: Which factors moderate training effectiveness?

Step 2.1.1: Defining candidate moderator variables...
-------------------------------------------------------------------------------- 

  Setting reference levels for categorical moderators...

  ‚úÖ Reference levels set for all categorical moderators
     Interpretation: Regression coefficients = difference from reference

  MODERATOR CATEGORIES:

  Category 1: Learner Characteristics (n = 5)
    1. Age_Group
    2. L1
    3. Proficiency_Level
    4. Education_Stage
    5. English_Major

  Category 2: Learning Environment (

In [44]:
# ============================================================================
# STEP 3: SENSITIVITY & ROBUSTNESS ANALYSES (LEAVE-ONE-OUT, INFLUENCE, RVE)
# ============================================================================
#
# Outputs (All CSV files):
#   STEP 3.1 ‚Äî Leave-one-out Sensitivity:
#       ‚Ä¢ leave_one_out_analysis.csv
#
#   STEP 3.2 ‚Äî Influence Diagnostics:
#       ‚Ä¢ influence_diagnostics.csv
#
#   STEP 3.3 ‚Äî Robust Variance Estimation (RVE):
#       ‚Ä¢ rve_overall_effect.csv
#       ‚Ä¢ rve_moderator_results.csv          (only if significant moderators exist)
#
# ============================================================================


# ============================================================================
# STEP 3.1: LEAVE-ONE-OUT SENSITIVITY ANALYSIS
# ============================================================================
#
# Purpose:  Test robustness of pooled effect to individual study exclusion
# Method:   Iteratively refit model with k-1 studies (leave1out)
#
# Influence Criteria:
#   ‚Ä¢ |Œîg| > 10% (estimate changes >10%)
#   ‚Ä¢ Significance flip (p crosses .05 threshold)
#   ‚Ä¢ ŒîI¬≤ > 10% (heterogeneity shift)
#
# Outputs:
#   Effect estimate range, influential cases, stability classification
#
# Interpretation:
#   Narrow range (Œîg <5%) ‚Üí Highly robust
#   Moderate range (5-10%) ‚Üí Robust
#   Wide range (>10%) ‚Üí Potentially fragile, driven by specific studies
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 3.1: LEAVE-ONE-OUT SENSITIVITY ANALYSIS\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Is the overall effect robust to study exclusion?\n\n")

# ----------------------------------------------------------------------------
# Step 3.1.1: Review Overall Effect (Baseline)
# ----------------------------------------------------------------------------
cat("Step 3.1.1: Baseline overall effect (for comparison)...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  BASELINE (All studies included):\n")
cat("    Effect size:  g = ", sprintf("%.4f", overall_g), 
    " [", sprintf("%.4f", overall_ci_lb), ", ",
    sprintf("%.4f", overall_ci_ub), "]\n", sep = "")
cat("    p-value:      ", sprintf("%.4f", overall_p), sep = "")
if (overall_p < 0.001) {
  cat(" ***\n")
} else if (overall_p < 0.01) {
  cat(" **\n")
} else if (overall_p < 0.05) {
  cat(" *\n")
} else {
  cat("\n")
}
cat("    œÑ¬≤:           ", sprintf("%.4f", tau2), "\n", sep = "")
cat("    I¬≤:           ", sprintf("%.2f", I2), "%\n", sep = "")
cat("    k:            ", res_overall$k, " effect sizes\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.1.2: Perform Leave-One-Out Analysis
# ----------------------------------------------------------------------------
cat("Step 3.1.2: Running leave-one-out analysis...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

cat("  Systematically excluding each effect size...\n\n")

# Use metafor's built-in function
loo_results <- leave1out(res_overall)

cat("  ‚úÖ Leave-one-out analysis completed\n")
cat("     Iterations: ", nrow(loo_results), "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.1.3: Organize Leave-One-Out Results
# ----------------------------------------------------------------------------
cat("Step 3.1.3: Organizing sensitivity results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Create comprehensive results dataframe
loo_summary <- data.frame(
  Study_ID = df$Study_ID,
  Effect_ID = df$Effect_ID,
  Estimate = loo_results$estimate,
  SE = loo_results$se,
  CI_Lower = loo_results$ci.lb,
  CI_Upper = loo_results$ci.ub,
  z_value = loo_results$zval,
  p_value = loo_results$pval,
  Q = loo_results$Q,
  Q_p = loo_results$Qp,
  tau2 = loo_results$tau2,
  I2 = loo_results$I2,
  H2 = loo_results$H2,
  stringsAsFactors = FALSE
)

# Calculate change metrics
loo_summary$Change_Estimate <- loo_summary$Estimate - overall_g
loo_summary$Pct_Change_Estimate <- (loo_summary$Change_Estimate / overall_g) * 100
loo_summary$Change_I2 <- loo_summary$I2 - I2

cat("  ‚úÖ Results organized with change metrics\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.1.4: Summary Statistics
# ----------------------------------------------------------------------------
cat("Step 3.1.4: Summary statistics across leave-one-out iterations...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  EFFECT SIZE RANGE WHEN EACH STUDY EXCLUDED:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Minimum:  ", sprintf("%.4f", min(loo_summary$Estimate)), "\n", sep = "")
cat("    Maximum:  ", sprintf("%.4f", max(loo_summary$Estimate)), "\n", sep = "")
cat("    Range:    ", sprintf("%.4f", max(loo_summary$Estimate) - min(loo_summary$Estimate)), "\n", sep = "")
cat("    Mean:     ", sprintf("%.4f", mean(loo_summary$Estimate)), "\n", sep = "")
cat("    Median:   ", sprintf("%.4f", median(loo_summary$Estimate)), "\n", sep = "")
cat("    SD:       ", sprintf("%.4f", sd(loo_summary$Estimate)), "\n", sep = "")

cat("\n")

cat("  I¬≤ RANGE:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Minimum:  ", sprintf("%.2f", min(loo_summary$I2)), "%\n", sep = "")
cat("    Maximum:  ", sprintf("%.2f", max(loo_summary$I2)), "%\n", sep = "")
cat("    Range:    ", sprintf("%.2f", max(loo_summary$I2) - min(loo_summary$I2)), "%\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.1.5: Identify Influential Studies
# ----------------------------------------------------------------------------
cat("Step 3.1.5: Identifying influential studies...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Define influence criteria
influence_threshold_estimate <- 10  # 10% change in estimate
influence_threshold_I2 <- 10        # 10 percentage point change in I¬≤

# Flag influential cases
loo_summary$Influential_Estimate <- abs(loo_summary$Pct_Change_Estimate) > influence_threshold_estimate
loo_summary$Influential_I2 <- abs(loo_summary$Change_I2) > influence_threshold_I2
loo_summary$Influential_Any <- loo_summary$Influential_Estimate | loo_summary$Influential_I2

# Check for significance changes
loo_summary$Sig_Original <- overall_p < 0.05
loo_summary$Sig_LOO <- loo_summary$p_value < 0.05
loo_summary$Sig_Changed <- loo_summary$Sig_Original != loo_summary$Sig_LOO

# Count influential cases
n_influential_estimate <- sum(loo_summary$Influential_Estimate)
n_influential_I2 <- sum(loo_summary$Influential_I2)
n_influential_any <- sum(loo_summary$Influential_Any)
n_sig_changed <- sum(loo_summary$Sig_Changed)

cat("  INFLUENCE DETECTION:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

if (n_influential_any > 0) {
  cat("  ‚ö†Ô∏è  INFLUENTIAL STUDIES DETECTED: ", n_influential_any, "/", nrow(loo_summary), "\n\n", sep = "")
  
  # Show influential studies
  influential_studies <- loo_summary[loo_summary$Influential_Any, ]
  influential_studies <- influential_studies[order(-abs(influential_studies$Pct_Change_Estimate)), ]
  
  for (i in 1:nrow(influential_studies)) {
    row <- influential_studies[i, ]
    cat("  ", i, ". ", row$Study_ID, " (", row$Effect_ID, ")\n", sep = "")
    cat("     When excluded: g = ", sprintf("%.4f", row$Estimate), 
        " (change: ", sprintf("%+.2f", row$Pct_Change_Estimate), "%)\n", sep = "")
    cat("     I¬≤ change: ", sprintf("%+.2f", row$Change_I2), " percentage points\n", sep = "")
    
    if (row$Influential_Estimate) {
      cat("     ‚ö†Ô∏è  Large effect on estimate (>10%)\n")
    }
    if (row$Influential_I2) {
      cat("     ‚ö†Ô∏è  Large effect on heterogeneity (ŒîI¬≤ >10%)\n")
    }
    cat("\n")
  }
  
} else {
  cat("  ‚úÖ NO HIGHLY INFLUENTIAL STUDIES DETECTED\n")
  cat("     All individual exclusions change estimate by <10%\n")
  cat("     All individual exclusions change I¬≤ by <10 percentage points\n\n")
}

# Check significance stability
if (n_sig_changed > 0) {
  cat("  ‚ö†Ô∏è  SIGNIFICANCE INSTABILITY DETECTED\n")
  cat("     ", n_sig_changed, " study/studies alter statistical significance\n\n", sep = "")
  
  sig_changed_studies <- loo_summary[loo_summary$Sig_Changed, ]
  for (i in 1:nrow(sig_changed_studies)) {
    row <- sig_changed_studies[i, ]
    cat("     - ", row$Study_ID, " (", row$Effect_ID, "): ", sep = "")
    cat("p = ", sprintf("%.4f", row$p_value), "\n", sep = "")
  }
  cat("\n")
} else {
  cat("  ‚úÖ SIGNIFICANCE IS STABLE\n")
  cat("     Statistical significance consistent across all iterations\n\n")
}

# ----------------------------------------------------------------------------
# Step 3.1.6: Interpretation and Conclusion
# ----------------------------------------------------------------------------
cat("Step 3.1.6: Sensitivity analysis interpretation...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  ROBUSTNESS ASSESSMENT:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Determine robustness level
if (n_influential_any == 0 && n_sig_changed == 0) {
  robustness_level <- "HIGHLY ROBUST"
  robustness_desc <- "Results are stable across all leave-one-out iterations"
  robustness_icon <- "‚úÖ"
} else if (n_influential_any <= 2 && n_sig_changed == 0) {
  robustness_level <- "ROBUST"
  robustness_desc <- "Minimal influence from individual studies"
  robustness_icon <- "‚úÖ"
} else if (n_sig_changed == 0) {
  robustness_level <- "MODERATELY ROBUST"
  robustness_desc <- "Some individual influence but significance is stable"
  robustness_icon <- "‚ö†Ô∏è "
} else {
  robustness_level <- "FRAGILE"
  robustness_desc <- "Results depend on specific studies"
  robustness_icon <- "‚ö†Ô∏è "
}

cat("  ", robustness_icon, " CONCLUSION: ", robustness_level, "\n", sep = "")
cat("     ", robustness_desc, "\n\n", sep = "")

cat("  RECOMMENDATION:\n")
if (n_influential_any == 0) {
  cat("     ‚Üí Report overall effect with confidence\n")
  cat("     ‚Üí No sensitivity concerns\n")
} else if (n_influential_any <= 2) {
  cat("     ‚Üí Report overall effect\n")
  cat("     ‚Üí Note influential studies in supplementary materials\n")
} else {
  cat("     ‚Üí Report overall effect with caution\n")
  cat("     ‚Üí Conduct influence diagnostics (STEP 3.2)\n")
  cat("     ‚Üí Consider excluding outliers in sensitivity analysis\n")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.1.7: Export Leave-One-Out Results
# ----------------------------------------------------------------------------
cat("Step 3.1.7: Exporting leave-one-out results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Round for export
loo_export <- loo_summary
numeric_cols <- c("Estimate", "SE", "CI_Lower", "CI_Upper", "z_value", "p_value",
                  "Q", "Q_p", "tau2", "I2", "H2", "Change_Estimate", 
                  "Pct_Change_Estimate", "Change_I2")
for (col in numeric_cols) {
  if (col %in% names(loo_export)) {
    loo_export[[col]] <- round(loo_export[[col]], 4)
  }
}

write.csv(loo_export,
          "leave_one_out_analysis.csv",
          row.names = FALSE,
          fileEncoding = "UTF-8")

cat("  ‚úÖ Results exported: leave_one_out_analysis.csv\n")

cat("\n")
cat("‚úÖ STEP 3.1 COMPLETE: Leave-one-out sensitivity analysis finished\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")



# ============================================================================
# STEP 3.2: INFLUENCE DIAGNOSTICS
# ============================================================================
#
# Purpose:  Detect outliers and high-leverage cases via formal diagnostics
# Metrics:  Cook's D, DFBETAS, Hat values, Standardized residuals
#
# Thresholds (established criteria):
#   Cook's D:    Di > 4/(k-p-1)
#   DFBETAS:     |DFBETAS| > 2/‚àök
#   Hat values:  hi > 2p/k (leverage)
#   Residuals:   |ri| > 2 (moderate), |ri| > 3 (extreme)
#
# Outputs:
#   Flagged influential cases, sensitivity analysis excluding outliers
#
# Interpretation:
#   Cases exceeding multiple thresholds warrant exclusion sensitivity tests
#   Compare results with/without influential cases for robustness assessment
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 3.2: INFLUENCE DIAGNOSTICS\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Are there influential outliers affecting results?\n\n")

# ----------------------------------------------------------------------------
# Step 3.2.1: Calculate Influence Diagnostics
# ----------------------------------------------------------------------------
cat("Step 3.2.1: Calculating influence diagnostics...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

cat("  Computing diagnostic metrics:\n")
cat("    ‚Üí Cook's Distance (overall influence)\n")
cat("    ‚Üí DFBETAS (coefficient influence)\n")
cat("    ‚Üí Hat values (leverage)\n")
cat("    ‚Üí Standardized residuals (outliers)\n\n")

# Use metafor's influence function
influence_stats <- influence(res_overall)

cat("  ‚úÖ Diagnostic calculations complete\n")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.2.2: Extract and Organize Diagnostics
# ----------------------------------------------------------------------------
cat("Step 3.2.2: Organizing diagnostic statistics...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Extract diagnostics
cooks_d <- influence_stats$inf$cook.d        # Cook's distance
dfbetas_raw <- influence_stats$inf$dfbs      # DFBETAS (may be matrix or NULL)
hat_values <- influence_stats$inf$hat        # Hat values (leverage)
std_resid_obj <- rstandard(res_overall)      # Standardized residuals object

# Extract standardized residuals
if (is.list(std_resid_obj) && "z" %in% names(std_resid_obj)) {
  std_resid <- std_resid_obj$z
} else if (is.numeric(std_resid_obj)) {
  std_resid <- std_resid_obj
} else {
  std_resid <- as.numeric(std_resid_obj)
}

# Handle DFBETAS - can be matrix, vector, or NULL
if (is.null(dfbetas_raw) || length(dfbetas_raw) == 0) {
  # If DFBETAS not available, calculate it manually
  dfbetas <- rep(0, length(cooks_d))
  cat("  ‚ö†Ô∏è  DFBETAS not available from influence(), using zeros\n")
} else if (is.matrix(dfbetas_raw)) {
  # For each case, take maximum absolute DFBETAS across all coefficients
  dfbetas <- apply(abs(dfbetas_raw), 1, max)
} else if (is.numeric(dfbetas_raw)) {
  dfbetas <- abs(dfbetas_raw)
} else {
  # Fallback: try to convert to numeric
  dfbetas <- abs(as.numeric(dfbetas_raw))
}

# Ensure all vectors have same length
cat("  Vector lengths: Cook's D = ", length(cooks_d), 
    ", DFBETAS = ", length(dfbetas),
    ", Hat = ", length(hat_values),
    ", Resid = ", length(std_resid), "\n", sep = "")

# Verify all have same length as df
if (length(cooks_d) != nrow(df) || length(dfbetas) != nrow(df) || 
    length(hat_values) != nrow(df) || length(std_resid) != nrow(df)) {
  cat("  ‚ö†Ô∏è  Dimension mismatch detected!\n")
  cat("     df has ", nrow(df), " rows\n", sep = "")
  cat("     Adjusting to match...\n")
  
  # Pad shorter vectors with NA
  max_len <- nrow(df)
  if (length(cooks_d) < max_len) cooks_d <- c(cooks_d, rep(NA, max_len - length(cooks_d)))
  if (length(dfbetas) < max_len) dfbetas <- c(dfbetas, rep(NA, max_len - length(dfbetas)))
  if (length(hat_values) < max_len) hat_values <- c(hat_values, rep(NA, max_len - length(hat_values)))
  if (length(std_resid) < max_len) std_resid <- c(std_resid, rep(NA, max_len - length(std_resid)))
  
  # Truncate longer vectors
  cooks_d <- cooks_d[1:max_len]
  dfbetas <- dfbetas[1:max_len]
  hat_values <- hat_values[1:max_len]
  std_resid <- std_resid[1:max_len]
}

# Create comprehensive diagnostics dataframe
influence_df <- data.frame(
  Study_ID = df$Study_ID,
  Effect_ID = df$Effect_ID,
  Hedges_g = df$Hedges_g,
  Cooks_D = cooks_d,
  DFBETAS = dfbetas,
  Hat = hat_values,
  Std_Resid = std_resid,
  stringsAsFactors = FALSE
)

cat("  ‚úÖ Diagnostics organized for ", nrow(influence_df), " effect sizes\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.2.3: Define Influence Thresholds
# ----------------------------------------------------------------------------
cat("Step 3.2.3: Defining influence thresholds...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

k <- res_overall$k          # Number of studies
p <- res_overall$p          # Number of parameters

# Standard thresholds
threshold_cooks <- 4 / (k - p - 1)
threshold_dfbetas <- 2 / sqrt(k)
threshold_hat <- 2 * p / k
threshold_resid_moderate <- 2
threshold_resid_extreme <- 3

cat("  INFLUENCE THRESHOLDS:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Cook's D:         > ", sprintf("%.4f", threshold_cooks), 
    " (4/(k-p-1))\n", sep = "")
cat("    DFBETAS:          > ", sprintf("%.4f", threshold_dfbetas), 
    " (2/‚àök)\n", sep = "")
cat("    Hat values:       > ", sprintf("%.4f", threshold_hat), 
    " (2p/k)\n", sep = "")
cat("    Std. residuals:   > ", threshold_resid_moderate, 
    " (moderate outlier)\n", sep = "")
cat("                      > ", threshold_resid_extreme, 
    " (extreme outlier)\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.2.4: Flag Influential Cases
# ----------------------------------------------------------------------------
cat("Step 3.2.4: Identifying influential cases...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Apply thresholds
influence_df$Flag_Cooks <- influence_df$Cooks_D > threshold_cooks
influence_df$Flag_DFBETAS <- abs(influence_df$DFBETAS) > threshold_dfbetas
influence_df$Flag_Hat <- influence_df$Hat > threshold_hat
influence_df$Flag_Resid_Moderate <- abs(influence_df$Std_Resid) > threshold_resid_moderate
influence_df$Flag_Resid_Extreme <- abs(influence_df$Std_Resid) > threshold_resid_extreme

# Overall influence flag (any diagnostic exceeds threshold)
influence_df$Influential <- (
  influence_df$Flag_Cooks |
  influence_df$Flag_DFBETAS |
  influence_df$Flag_Hat |
  influence_df$Flag_Resid_Extreme
)

# Count influential cases
n_cooks <- sum(influence_df$Flag_Cooks)
n_dfbetas <- sum(influence_df$Flag_DFBETAS)
n_hat <- sum(influence_df$Flag_Hat)
n_resid_mod <- sum(influence_df$Flag_Resid_Moderate)
n_resid_ext <- sum(influence_df$Flag_Resid_Extreme)
n_influential <- sum(influence_df$Influential)

cat("  INFLUENCE DETECTION SUMMARY:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")
cat("    High Cook's D:           ", n_cooks, " cases (", 
    sprintf("%.1f", (n_cooks/k)*100), "%)\n", sep = "")
cat("    High DFBETAS:            ", n_dfbetas, " cases (", 
    sprintf("%.1f", (n_dfbetas/k)*100), "%)\n", sep = "")
cat("    High leverage (hat):     ", n_hat, " cases (", 
    sprintf("%.1f", (n_hat/k)*100), "%)\n", sep = "")
cat("    Moderate outliers (|r|>2): ", n_resid_mod, " cases (", 
    sprintf("%.1f", (n_resid_mod/k)*100), "%)\n", sep = "")
cat("    Extreme outliers (|r|>3):  ", n_resid_ext, " cases (", 
    sprintf("%.1f", (n_resid_ext/k)*100), "%)\n", sep = "")
cat("\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    TOTAL INFLUENTIAL CASES: ", n_influential, "/", k, " (", 
    sprintf("%.1f", (n_influential/k)*100), "%)\n", sep = "")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.2.5: Display Influential Cases
# ----------------------------------------------------------------------------
cat("Step 3.2.5: Detailed information on influential cases...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

if (n_influential > 0) {
  cat("  ‚ö†Ô∏è  INFLUENTIAL CASES DETECTED:\n")
  cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
  cat("\n")
  
  # Sort by Cook's D (descending)
  influential_cases <- influence_df[influence_df$Influential, ]
  influential_cases <- influential_cases[order(-influential_cases$Cooks_D), ]
  
  for (i in 1:nrow(influential_cases)) {
    row <- influential_cases[i, ]
    
    cat("  ", i, ". ", row$Study_ID, " (", row$Effect_ID, ")\n", sep = "")
    cat("     Effect size:      g = ", sprintf("%.4f", row$Hedges_g), "\n", sep = "")
    cat("     Cook's D:         ", sprintf("%.4f", row$Cooks_D), sep = "")
    if (row$Flag_Cooks) cat(" ‚ö†Ô∏è  HIGH")
    cat("\n")
    cat("     DFBETAS:          ", sprintf("%.4f", row$DFBETAS), sep = "")
    if (row$Flag_DFBETAS) cat(" ‚ö†Ô∏è  HIGH")
    cat("\n")
    cat("     Hat value:        ", sprintf("%.4f", row$Hat), sep = "")
    if (row$Flag_Hat) cat(" ‚ö†Ô∏è  HIGH")
    cat("\n")
    cat("     Std. residual:    ", sprintf("%.4f", row$Std_Resid), sep = "")
    if (row$Flag_Resid_Extreme) {
      cat(" ‚ö†Ô∏è  EXTREME OUTLIER")
    } else if (row$Flag_Resid_Moderate) {
      cat(" ‚ö†Ô∏è  MODERATE OUTLIER")
    }
    cat("\n")
    
    # Interpretation
    issues <- c()
    if (row$Flag_Cooks) issues <- c(issues, "high influence")
    if (row$Flag_DFBETAS) issues <- c(issues, "affects coefficients")
    if (row$Flag_Hat) issues <- c(issues, "high leverage")
    if (row$Flag_Resid_Extreme) issues <- c(issues, "extreme outlier")
    
    if (length(issues) > 0) {
      cat("     Issues:           ", paste(issues, collapse = ", "), "\n", sep = "")
    }
    cat("\n")
  }
  
  # Recommendations
  cat("  RECOMMENDATIONS:\n")
  cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
  cat("    1. Review influential studies for data accuracy\n")
  cat("    2. Examine study characteristics (design, sample, etc.)\n")
  cat("    3. Consider sensitivity analysis excluding influential cases\n")
  cat("    4. Report results both with and without influential cases\n")
  
} else {
  cat("  ‚úÖ NO INFLUENTIAL CASES DETECTED\n")
  cat("     All effect sizes fall within acceptable diagnostic ranges\n")
  cat("     ‚Üí Results are robust to individual case influence\n")
}

cat("\n")

# ----------------------------------------------------------------------------
# Step 3.2.6: Sensitivity Analysis (Excluding Influential Cases)
# ----------------------------------------------------------------------------
if (n_influential > 0) {
  cat("Step 3.2.6: Sensitivity analysis excluding influential cases...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n\n")
  
  # Create dataset without influential cases
  df_robust <- df[!influence_df$Influential, ]
  
  cat("  Refitting model without influential cases...\n")
  cat("    Original k:  ", k, " effect sizes\n", sep = "")
  cat("    Excluded:    ", n_influential, " influential cases\n", sep = "")
  cat("    Remaining:   ", nrow(df_robust), " effect sizes\n\n", sep = "")
  
  # Refit overall model
  res_robust <- rma(yi = Hedges_g, vi = vi, data = df_robust, method = "REML")
  
  cat("  ‚úÖ Robust model fitted\n\n")
  
  # Compare results
  cat("  COMPARISON: ORIGINAL vs. ROBUST (without influential cases)\n")
  cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
  cat("\n")
  
  cat("  Effect size:\n")
  cat("    Original:  g = ", sprintf("%.4f", overall_g), 
      " [", sprintf("%.4f", overall_ci_lb), ", ",
      sprintf("%.4f", overall_ci_ub), "]\n", sep = "")
  cat("    Robust:    g = ", sprintf("%.4f", res_robust$b[1]), 
      " [", sprintf("%.4f", res_robust$ci.lb), ", ",
      sprintf("%.4f", res_robust$ci.ub), "]\n", sep = "")
  cat("    Change:    ", sprintf("%+.4f", res_robust$b[1] - overall_g), 
      " (", sprintf("%+.1f", ((res_robust$b[1] - overall_g)/overall_g)*100), "%)\n\n", sep = "")
  
  cat("  Statistical significance:\n")
  cat("    Original:  p = ", sprintf("%.4f", overall_p), sep = "")
  if (overall_p < 0.001) {
    cat(" ***\n")
  } else if (overall_p < 0.01) {
    cat(" **\n")
  } else if (overall_p < 0.05) {
    cat(" *\n")
  } else {
    cat("\n")
  }
  cat("    Robust:    p = ", sprintf("%.4f", res_robust$pval), sep = "")
  if (res_robust$pval < 0.001) {
    cat(" ***\n")
  } else if (res_robust$pval < 0.01) {
    cat(" **\n")
  } else if (res_robust$pval < 0.05) {
    cat(" *\n")
  } else {
    cat("\n")
  }
  
  # Check if significance changed
  original_sig <- overall_p < 0.05
  robust_sig <- res_robust$pval < 0.05
  
  if (original_sig == robust_sig) {
    cat("    ‚Üí Significance UNCHANGED\n\n")
  } else {
    cat("    ‚ö†Ô∏è  Significance CHANGED\n\n")
  }
  
  cat("  Heterogeneity:\n")
  cat("    Original:  I¬≤ = ", sprintf("%.2f", I2), "%, œÑ¬≤ = ", 
      sprintf("%.4f", tau2), "\n", sep = "")
  cat("    Robust:    I¬≤ = ", sprintf("%.2f", res_robust$I2), "%, œÑ¬≤ = ", 
      sprintf("%.4f", res_robust$tau2), "\n\n", sep = "")
  
  # Overall interpretation
  change_pct <- abs((res_robust$b[1] - overall_g)/overall_g) * 100
  
  if (change_pct < 5) {
    cat("  ‚úÖ INTERPRETATION: Results are ROBUST\n")
    cat("     Excluding influential cases changes estimate by <5%\n")
    cat("     ‚Üí Main conclusions remain valid\n")
  } else if (change_pct < 10) {
    cat("  ‚ö†Ô∏è  INTERPRETATION: Results are MODERATELY AFFECTED\n")
    cat("     Excluding influential cases changes estimate by 5-10%\n")
    cat("     ‚Üí Report both original and robust estimates\n")
  } else {
    cat("  ‚ö†Ô∏è  INTERPRETATION: Results are SUBSTANTIALLY AFFECTED\n")
    cat("     Excluding influential cases changes estimate by >10%\n")
    cat("     ‚Üí Influential cases have major impact\n")
    cat("     ‚Üí Consider reporting robust estimate as primary\n")
  }
  
  cat("\n")
}

# ----------------------------------------------------------------------------
# Step 3.2.7: Export Influence Diagnostics
# ----------------------------------------------------------------------------
cat("Step 3.2.7: Exporting influence diagnostics...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Round for export
influence_export <- influence_df
numeric_cols <- c("Hedges_g", "Cooks_D", "DFBETAS", "Hat", "Std_Resid")
for (col in numeric_cols) {
  influence_export[[col]] <- round(influence_export[[col]], 4)
}

write.csv(influence_export,
          "influence_diagnostics.csv",
          row.names = FALSE,
          fileEncoding = "UTF-8")

cat("  ‚úÖ Diagnostics exported: influence_diagnostics.csv\n")

cat("\n")
cat("‚úÖ STEP 3.2 COMPLETE: Influence diagnostics finished\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")



# ============================================================================
# STEP 3.3: ROBUST VARIANCE ESTIMATION (RVE)
# ============================================================================
#
# Purpose:  Correct for statistical dependency in clustered effect sizes
# Problem:  Multiple ES per study violates independence ‚Üí SE underestimation
#
# Method:   Hierarchical weights with robust variance (Hedges et al., 2010)
#   ‚Ä¢ Assumes within-study correlation œÅ (default: 0.8)
#   ‚Ä¢ Small-sample corrections for limited clusters
#   ‚Ä¢ robumeta package implementation
#
# Outputs:
#   RVE-adjusted g, SE inflation %, comparison to standard MA
#
# Interpretation:
#   SE inflation <10% ‚Üí Minimal dependency impact
#   SE inflation 10-25% ‚Üí Moderate dependency, report RVE as primary
#   SE inflation >25% ‚Üí Strong dependency, RVE results essential
#
# Decision Rule:
#   ‚â•15% of studies with multiple ES ‚Üí Conduct RVE analysis
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 3.3: ROBUST VARIANCE ESTIMATION (RVE)\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Purpose: Account for dependency from multiple effect sizes per study\n\n")

# ----------------------------------------------------------------------------
# Step 3.3.1: Assess Data Structure and Dependency
# ----------------------------------------------------------------------------
cat("Step 3.3.1: Assessing data structure...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

# Count effect sizes per study
effects_per_study <- table(df$Study_ID)
n_studies <- length(unique(df$Study_ID))
n_effects <- nrow(df)

cat("  DATA STRUCTURE:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Total studies:              ", n_studies, "\n", sep = "")
cat("    Total effect sizes:         ", n_effects, "\n", sep = "")
cat("    Average ES per study:       ", sprintf("%.2f", n_effects/n_studies), "\n", sep = "")
cat("    Median ES per study:        ", median(effects_per_study), "\n", sep = "")
cat("    Range ES per study:         ", min(effects_per_study), " to ", 
    max(effects_per_study), "\n", sep = "")

cat("\n")

# Identify studies with multiple effect sizes
multi_es_studies <- names(effects_per_study[effects_per_study > 1])
n_multi_es <- length(multi_es_studies)
n_multi_effects <- sum(effects_per_study[effects_per_study > 1])

cat("  DEPENDENCY ASSESSMENT:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Studies with >1 effect:     ", n_multi_es, " (", 
    sprintf("%.1f", (n_multi_es/n_studies)*100), "% of studies)\n", sep = "")
cat("    Effect sizes from these:    ", n_multi_effects, " (", 
    sprintf("%.1f", (n_multi_effects/n_effects)*100), "% of total ES)\n", sep = "")

cat("\n")

# Display distribution
cat("  Effect sizes per study (distribution):\n")
es_dist <- as.data.frame(table(effects_per_study))
names(es_dist) <- c("ES_Count", "N_Studies")
for (i in 1:nrow(es_dist)) {
  cat("    ", es_dist$ES_Count[i], " ES: ", es_dist$N_Studies[i], 
      " studies (", sprintf("%.1f", (as.numeric(es_dist$N_Studies[i])/n_studies)*100), 
      "%)\n", sep = "")
}

cat("\n")

# Decision on RVE necessity
if (n_multi_es == 0) {
  cat("  ‚úÖ DECISION: RVE not necessary\n")
  cat("     All studies contribute exactly 1 effect size\n")
  cat("     ‚Üí No dependency issues\n")
  cat("     ‚Üí Standard meta-analysis is appropriate\n\n")
  
  run_rve <- FALSE
  
} else {
  cat("  ‚ö†Ô∏è  DECISION: RVE is ESSENTIAL\n")
  cat("     Multiple effect sizes per study create statistical dependency\n")
  cat("     ‚Üí RVE adjusts standard errors for within-study correlation\n")
  cat("     ‚Üí RVE results should be reported as primary analysis\n\n")
  
  run_rve <- TRUE
}

# ----------------------------------------------------------------------------
# Conditional RVE Analysis (only if needed)
# ----------------------------------------------------------------------------
if (run_rve) {
  
  # --------------------------------------------------------------------------
  # Step 3.3.2: Fit RVE Model for Overall Effect
  # --------------------------------------------------------------------------
  cat("Step 3.3.2: Fitting RVE model for overall effect...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n\n")
  
  # Check if robumeta is loaded
  if (!require(robumeta, quietly = TRUE)) {
    cat("  Installing robumeta package...\n")
    install.packages("robumeta", repos = "https://cloud.r-project.org/")
    library(robumeta)
  }
  
  # Set assumed within-study correlation
  rho_assumed <- 0.8  # Common default (can be varied in sensitivity analysis)
  
  cat("  MODEL SPECIFICATIONS:\n")
  cat("    Assumed œÅ (within-study correlation): ", rho_assumed, "\n", sep = "")
  cat("    Small sample correction: Yes\n")
  cat("    Estimation method: Hierarchical effects\n\n")
  
  # Fit RVE model
  cat("  Fitting RVE model...\n")
  
  rve_overall <- robu(
    formula = Hedges_g ~ 1,
    data = df,
    studynum = Study_ID,
    var.eff.size = vi,
    rho = rho_assumed,
    small = TRUE
  )
  
  cat("  ‚úÖ RVE model fitted successfully\n\n")
  
  # --------------------------------------------------------------------------
  # Step 3.3.3: Display RVE Results
  # --------------------------------------------------------------------------
  cat("Step 3.3.3: RVE overall effect estimate...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n\n")
  
  cat("  RVE OVERALL EFFECT:\n")
  cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
  cat("\n")
  
  rve_g <- rve_overall$reg_table$b.r[1]
  rve_se <- rve_overall$reg_table$SE[1]
  rve_ci_lb <- rve_overall$reg_table$CI.L[1]
  rve_ci_ub <- rve_overall$reg_table$CI.U[1]
  rve_t <- rve_overall$reg_table$t[1]
  rve_p <- rve_overall$reg_table$prob[1]
  rve_df <- rve_overall$reg_table$dfs[1]
  
  cat("    Estimate (g):     ", sprintf("%.4f", rve_g), "\n", sep = "")
  cat("    Standard Error:   ", sprintf("%.4f", rve_se), "\n", sep = "")
  cat("    95% CI:           [", sprintf("%.4f", rve_ci_lb), ", ", 
      sprintf("%.4f", rve_ci_ub), "]\n", sep = "")
  cat("    t-value:          ", sprintf("%.4f", rve_t), "\n", sep = "")
  cat("    df:               ", sprintf("%.1f", rve_df), "\n", sep = "")
  cat("    p-value:          ", sprintf("%.4f", rve_p), sep = "")
  
  if (rve_p < 0.001) {
    cat(" ***\n")
  } else if (rve_p < 0.01) {
    cat(" **\n")
  } else if (rve_p < 0.05) {
    cat(" *\n")
  } else {
    cat("\n")
  }
  
  cat("\n")
  
  # Heterogeneity from RVE
  rve_tau2 <- rve_overall$mod_info$tau.sq
  rve_I2 <- rve_overall$mod_info$I.2
  
  cat("    œÑ¬≤:               ", sprintf("%.4f", rve_tau2), "\n", sep = "")
  cat("    I¬≤:               ", sprintf("%.2f", rve_I2), "%\n", sep = "")
  
  cat("\n")
  
  # --------------------------------------------------------------------------
  # Step 3.3.4: Compare Standard vs. RVE
  # --------------------------------------------------------------------------
  cat("Step 3.3.4: Comparing standard meta-analysis vs. RVE...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n\n")
  
  cat("  STANDARD vs. RVE COMPARISON:\n")
  cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
  cat("\n")
  
  cat("  Effect Size Estimate:\n")
  cat("    Standard MA: g = ", sprintf("%.4f", overall_g), 
      " [", sprintf("%.4f", overall_ci_lb), ", ",
      sprintf("%.4f", overall_ci_ub), "]\n", sep = "")
  cat("    RVE:         g = ", sprintf("%.4f", rve_g), 
      " [", sprintf("%.4f", rve_ci_lb), ", ",
      sprintf("%.4f", rve_ci_ub), "]\n", sep = "")
  cat("    Difference:      ", sprintf("%+.4f", rve_g - overall_g), "\n\n", sep = "")
  
  cat("  Standard Error:\n")
  cat("    Standard MA: SE = ", sprintf("%.4f", overall_se), "\n", sep = "")
  cat("    RVE:         SE = ", sprintf("%.4f", rve_se), "\n", sep = "")
  
  se_ratio <- rve_se / overall_se
  se_inflation <- (se_ratio - 1) * 100
  
  cat("    SE Ratio:        ", sprintf("%.2f", se_ratio), "x\n", sep = "")
  cat("    SE Inflation:    +", sprintf("%.1f", se_inflation), "%\n\n", sep = "")
  
  cat("  Confidence Interval Width:\n")
  ci_width_standard <- overall_ci_ub - overall_ci_lb
  ci_width_rve <- rve_ci_ub - rve_ci_lb
  cat("    Standard MA: ", sprintf("%.4f", ci_width_standard), "\n", sep = "")
  cat("    RVE:         ", sprintf("%.4f", ci_width_rve), "\n", sep = "")
  cat("    Increase:    +", sprintf("%.1f", ((ci_width_rve/ci_width_standard - 1)*100)), "%\n\n", sep = "")
  
  cat("  Statistical Significance:\n")
  cat("    Standard MA: p = ", sprintf("%.4f", overall_p), sep = "")
  if (overall_p < 0.05) cat(" (significant)")
  cat("\n")
  cat("    RVE:         p = ", sprintf("%.4f", rve_p), sep = "")
  if (rve_p < 0.05) cat(" (significant)")
  cat("\n\n")
  
  # Check if significance changed
  standard_sig <- overall_p < 0.05
  rve_sig <- rve_p < 0.05
  
  if (standard_sig == rve_sig) {
    cat("    ‚Üí Significance UNCHANGED\n")
  } else {
    cat("    ‚ö†Ô∏è  Significance CHANGED with RVE adjustment\n")
  }
  
  cat("\n")
  
  # --------------------------------------------------------------------------
  # Step 3.3.5: Interpret Impact of Dependency
  # --------------------------------------------------------------------------
  cat("Step 3.3.5: Interpreting dependency impact...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n\n")
  
  cat("  DEPENDENCY IMPACT ASSESSMENT:\n")
  cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
  cat("\n")
  
  if (se_inflation < 10) {
    cat("  ‚úÖ MINIMAL DEPENDENCY IMPACT (SE inflation < 10%)\n")
    cat("     ‚Üí Within-study correlation has limited effect\n")
    cat("     ‚Üí Standard and RVE results are similar\n")
    cat("     ‚Üí Either approach acceptable, but RVE preferred for transparency\n")
  } else if (se_inflation < 25) {
    cat("  ‚ö†Ô∏è  MODERATE DEPENDENCY IMPACT (SE inflation 10-25%)\n")
    cat("     ‚Üí Within-study correlation moderately affects inference\n")
    cat("     ‚Üí RVE provides more conservative estimates\n")
    cat("     ‚Üí RECOMMEND: Report RVE as primary analysis\n")
  } else {
    cat("  ‚ö†Ô∏è  SUBSTANTIAL DEPENDENCY IMPACT (SE inflation > 25%)\n")
    cat("     ‚Üí Within-study correlation substantially affects inference\n")
    cat("     ‚Üí Standard MA underestimates uncertainty\n")
    cat("     ‚Üí ESSENTIAL: Report RVE as primary analysis\n")
    cat("     ‚Üí Standard MA results may be misleading\n")
  }
  
  cat("\n")
  
  # --------------------------------------------------------------------------
  # Step 3.3.6: RVE for Significant Moderators (if any)
  # --------------------------------------------------------------------------
  if (exists("significant_moderators") && nrow(significant_moderators) > 0) {
    
    cat("Step 3.3.6: RVE analysis for significant moderators...\n")
    cat(paste0(rep("-", 80), collapse = ""), "\n\n")
    
    cat("  Testing ", nrow(significant_moderators), " significant moderator(s) with RVE...\n\n", sep = "")
    
    rve_moderator_results <- list()
    
    for (i in 1:nrow(significant_moderators)) {
      mod_name <- significant_moderators$Moderator[i]
      
      cat("  [", i, "/", nrow(significant_moderators), "] ", mod_name, "\n", sep = "")
      
      # Check complete cases
      temp_rve <- df[!is.na(df[[mod_name]]), ]
      
      if (nrow(temp_rve) < 5) {
        cat("      ‚ö†Ô∏è  Insufficient data (n = ", nrow(temp_rve), ") ‚Üí SKIP\n\n", sep = "")
        next
      }
      
      # Fit RVE model with moderator
      rve_mod <- tryCatch(
        {
          robu(
            formula = as.formula(paste("Hedges_g ~", mod_name)),
            data = temp_rve,
            studynum = Study_ID,
            var.eff.size = vi,
            rho = rho_assumed,
            small = TRUE
          )
        },
        error = function(e) {
          cat("      ‚ùå RVE model failed: ", e$message, "\n\n", sep = "")
          return(NULL)
        }
      )
      
      if (is.null(rve_mod) || nrow(rve_mod$reg_table) < 2) {
        cat("      ‚ö†Ô∏è  No moderator coefficient ‚Üí SKIP\n\n")
        next
      }
      
      # Extract results (second row is moderator effect)
      rve_mod_results <- data.frame(
        Moderator = mod_name,
        Estimate_RVE = rve_mod$reg_table$b.r[2],
        SE_RVE = rve_mod$reg_table$SE[2],
        CI_Lower_RVE = rve_mod$reg_table$CI.L[2],
        CI_Upper_RVE = rve_mod$reg_table$CI.U[2],
        t_value = rve_mod$reg_table$t[2],
        df = rve_mod$reg_table$dfs[2],
        p_value_RVE = rve_mod$reg_table$prob[2],
        n_studies = length(unique(temp_rve$Study_ID)),
        n_effects = nrow(temp_rve),
        stringsAsFactors = FALSE
      )
      
      rve_moderator_results[[mod_name]] <- rve_mod_results
      
      cat("      ‚úÖ Œ≤ = ", sprintf("%.4f", rve_mod_results$Estimate_RVE), 
          ", SE = ", sprintf("%.4f", rve_mod_results$SE_RVE), 
          ", p = ", sprintf("%.4f", rve_mod_results$p_value_RVE), sep = "")
      
      if (rve_mod_results$p_value_RVE < 0.001) {
        cat(" ***\n")
      } else if (rve_mod_results$p_value_RVE < 0.01) {
        cat(" **\n")
      } else if (rve_mod_results$p_value_RVE < 0.05) {
        cat(" *\n")
      } else {
        cat("\n")
      }
      
      cat("\n")
    }
    
    # Combine RVE moderator results
    if (length(rve_moderator_results) > 0) {
      rve_mod_df <- do.call(rbind, rve_moderator_results)
      rownames(rve_mod_df) <- NULL
      
      # Compare standard vs RVE for each moderator
      cat("  MODERATOR COMPARISON: Standard MA vs. RVE\n")
      cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
      cat("\n")
      
      for (mod in rve_mod_df$Moderator) {
        # Get standard result
        std_result <- significant_moderators[significant_moderators$Moderator == mod, ]
        rve_result <- rve_mod_df[rve_mod_df$Moderator == mod, ]
        
        cat("  ", mod, "\n", sep = "")
        cat("    Standard: Œ≤ = ", sprintf("%.4f", std_result$Estimate), 
            ", SE = ", sprintf("%.4f", std_result$SE), 
            ", p = ", sprintf("%.4f", std_result$p_value), "\n", sep = "")
        cat("    RVE:      Œ≤ = ", sprintf("%.4f", rve_result$Estimate_RVE), 
            ", SE = ", sprintf("%.4f", rve_result$SE_RVE), 
            ", p = ", sprintf("%.4f", rve_result$p_value_RVE), "\n", sep = "")
        
        se_ratio_mod <- rve_result$SE_RVE / std_result$SE
        cat("    SE Ratio: ", sprintf("%.2f", se_ratio_mod), "x", sep = "")
        
        # Check if significance changed
        std_sig_mod <- std_result$p_value < 0.05
        rve_sig_mod <- rve_result$p_value_RVE < 0.05
        
        if (std_sig_mod != rve_sig_mod) {
          cat(" ‚ö†Ô∏è  Significance changed!")
        }
        cat("\n\n")
      }
    }
  }
  
  # --------------------------------------------------------------------------
  # Step 3.3.7: Export RVE Results
  # --------------------------------------------------------------------------
  cat("Step 3.3.7: Exporting RVE results...\n")
  cat(paste0(rep("-", 80), collapse = ""), "\n")
  
  # Verify output_path exists
  if (!exists("output_path") || is.null(output_path)) {
    output_path <- getwd()
  }
  
  # Overall effect
  rve_overall_export <- data.frame(
    Analysis = "RVE Overall Effect",
    rho_assumed = rho_assumed,
    k_studies = n_studies,
    k_effects = n_effects,
    Estimate = round(rve_g, 4),
    SE = round(rve_se, 4),
    CI_Lower = round(rve_ci_lb, 4),
    CI_Upper = round(rve_ci_ub, 4),
    t_value = round(rve_t, 4),
    df = round(rve_df, 1),
    p_value = round(rve_p, 4),
    tau2 = round(rve_tau2, 4),
    I2 = round(rve_I2, 2),
    SE_ratio_vs_standard = round(se_ratio, 2),
    SE_inflation_pct = round(se_inflation, 1),
    stringsAsFactors = FALSE
  )
  
  # Write RVE overall effect with error handling
  tryCatch({
    write.csv(rve_overall_export,
              file.path(output_path, "rve_overall_effect.csv"),
              row.names = FALSE,
              fileEncoding = "UTF-8")
    cat("  ‚úÖ RVE overall effect saved: rve_overall_effect.csv\n")
  }, error = function(e) {
    timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
    alt_file <- file.path(output_path, paste0("rve_overall_effect_", timestamp, ".csv"))
    write.csv(rve_overall_export, alt_file, row.names = FALSE, fileEncoding = "UTF-8")
    cat("  ‚ö†Ô∏è  Using timestamped file: rve_overall_effect_", timestamp, ".csv\n", sep = "")
  })
  
  # Moderator results (if any)
  if (exists("rve_mod_df")) {
    rve_mod_export <- rve_mod_df
    numeric_cols <- c("Estimate_RVE", "SE_RVE", "CI_Lower_RVE", "CI_Upper_RVE",
                      "t_value", "df", "p_value_RVE")
    for (col in numeric_cols) {
      rve_mod_export[[col]] <- round(rve_mod_export[[col]], 4)
    }
    
    tryCatch({
      write.csv(rve_mod_export,
                file.path(output_path, "rve_moderator_results.csv"),
                row.names = FALSE,
                fileEncoding = "UTF-8")
      cat("  ‚úÖ RVE moderator results saved: rve_moderator_results.csv\n")
    }, error = function(e) {
      timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
      alt_file <- file.path(output_path, paste0("rve_moderator_results_", timestamp, ".csv"))
      write.csv(rve_mod_export, alt_file, row.names = FALSE, fileEncoding = "UTF-8")
      cat("  ‚ö†Ô∏è  Using timestamped file: rve_moderator_results_", timestamp, ".csv\n", sep = "")
    })
  }
  
  cat("\n")
}

if (run_rve) {
  cat("‚úÖ STEP 3.3 COMPLETE: Robust variance estimation finished\n")
} else {
  cat("‚úÖ STEP 3.3 COMPLETE: RVE not necessary (no dependency)\n")
}
cat(paste0(rep("=", 80), collapse = ""), "\n\n")



STEP 3.1: LEAVE-ONE-OUT SENSITIVITY ANALYSIS

Research Question: Is the overall effect robust to study exclusion?

Step 3.1.1: Baseline overall effect (for comparison)...
-------------------------------------------------------------------------------- 

  BASELINE (All studies included):
    Effect size:  g = 0.5272 [0.3781, 0.6763]
STEP 3.1: LEAVE-ONE-OUT SENSITIVITY ANALYSIS

Research Question: Is the overall effect robust to study exclusion?

Step 3.1.1: Baseline overall effect (for comparison)...
-------------------------------------------------------------------------------- 

  BASELINE (All studies included):
    Effect size:  g = 0.5272 [0.3781, 0.6763]
    p-value:      0.0000 ***
    œÑ¬≤:           0.0882
    I¬≤:           54.64%
    k:            29 effect sizes
    p-value:      0.0000 ***
    œÑ¬≤:           0.0882
    I¬≤:           54.64%
    k:            29 effect sizes

Step 3.1.2: Running leave-one-out analysis...
---------------------------------------------------

In [47]:
# ============================================================================
# STEP 4: PUBLICATION BIAS ASSESSMENT
# ============================================================================
#
# Purpose: To evaluate the potential presence of publication bias and small‚Äêstudy effects.
#
# Methods:
#   Multi-pronged publication bias detection:
#     1. Funnel plot (visual asymmetry)
#     2. Egger‚Äôs regression test for small-study effects
#     3. Trim-and-fill procedure for bias-adjusted effect estimation
#     4. Rosenthal‚Äôs fail-safe N to quantify robustness to unpublished null studies
#
# Files Generated (Output):
#   ‚Ä¢ funnel_plot.png
#   ‚Ä¢ publication_bias_tests.csv
#   ‚Ä¢ publication_bias_trim_and_fill.csv ‚Üí (Only generated when k_imputed > 0)
#
# Bias Interpretation Framework:
#   0 indicators ‚Üí No evidence of bias
#   1 indicator  ‚Üí Minimal concern
#   2 indicators ‚Üí Moderate concern
#   3 indicators ‚Üí Substantial bias likely present
#
# ============================================================================

cat(paste0(rep("=", 80), collapse = ""), "\n")
cat("STEP 4: PUBLICATION BIAS ASSESSMENT\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")

cat("Research Question: Is there evidence of publication bias?\n\n")

# ----------------------------------------------------------------------------
# Step 4.1: Funnel Plot Asymmetry (Visual Inspection)
# ----------------------------------------------------------------------------
cat("Step 4.1: Funnel plot assessment...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  FUNNEL PLOT INTERPRETATION:\n")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("    Expected (no bias): Symmetric inverted funnel\n")
cat("    Bias indicator:     Asymmetry, especially at bottom (low precision)\n")
cat("    Missing studies:    Gaps in lower right (small non-significant)\n\n")

cat("  NOTE: Funnel plot saved for visual inspection\n")
cat("        (Requires graphical output - not shown in text)\n\n")

# Save funnel plot (if graphical output available)
tryCatch({
  png("funnel_plot.png", width = 800, height = 600, res = 120)
  funnel(res_overall, 
         xlab = "Hedges' g",
         ylab = "Standard Error",
         main = "Funnel Plot for Publication Bias Assessment",
         back = "white",
         shade = "white")
  dev.off()
  cat("  ‚úÖ Funnel plot saved: funnel_plot.png\n\n")
}, error = function(e) {
  cat("  ‚ö†Ô∏è  Funnel plot not saved (graphical device not available)\n\n")
})

# ----------------------------------------------------------------------------
# Step 4.2: Egger's Regression Test
# ----------------------------------------------------------------------------
cat("Step 4.2: Egger's regression test for funnel plot asymmetry...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  METHOD: Regression of standardized effect on precision\n")
cat("  H‚ÇÄ: No small-study effects (intercept = 0)\n")
cat("  H‚ÇÅ: Funnel plot asymmetry present\n\n")

# Perform Egger's test
egger_test <- regtest(res_overall, model = "lm")

cat("  EGGER'S TEST RESULTS:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")
cat("    Intercept:        ", sprintf("%.4f", egger_test$est), "\n", sep = "")
cat("    Standard Error:   ", sprintf("%.4f", egger_test$se), "\n", sep = "")
cat("    z-value:          ", sprintf("%.4f", egger_test$zval), "\n", sep = "")
cat("    p-value:          ", sprintf("%.4f", egger_test$pval), sep = "")

if (egger_test$pval < 0.001) {
  cat(" ***\n")
  egger_interpretation <- "HIGHLY SIGNIFICANT asymmetry (p < .001)"
  egger_conclusion <- "‚ö†Ô∏è  Strong evidence of publication bias"
} else if (egger_test$pval < 0.01) {
  cat(" **\n")
  egger_interpretation <- "Very significant asymmetry (p < .01)"
  egger_conclusion <- "‚ö†Ô∏è  Evidence of publication bias"
} else if (egger_test$pval < 0.05) {
  cat(" *\n")
  egger_interpretation <- "Significant asymmetry (p < .05)"
  egger_conclusion <- "‚ö†Ô∏è  Some evidence of publication bias"
} else {
  cat("\n")
  egger_interpretation <- "No significant asymmetry (p ‚â• .05)"
  egger_conclusion <- "‚úÖ No statistical evidence of publication bias"
}

cat("\n")
cat("  INTERPRETATION:\n")
cat("    ", egger_interpretation, "\n", sep = "")
cat("    ", egger_conclusion, "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 4.3: Trim-and-Fill Analysis
# ----------------------------------------------------------------------------
cat("Step 4.3: Trim-and-fill analysis...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  METHOD: Impute missing studies to restore funnel plot symmetry\n")
cat("  PURPOSE: Estimate effect size adjusted for publication bias\n\n")

# Perform trim-and-fill
taf_results <- trimfill(res_overall)

cat("  TRIM-AND-FILL RESULTS:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Number of imputed studies
k_imputed <- taf_results$k0

cat("    Studies imputed:    ", k_imputed, "\n", sep = "")

if (k_imputed == 0) {
  cat("    ‚Üí No missing studies detected\n")
  cat("    ‚Üí Funnel plot appears symmetric\n\n")
  
  cat("  ADJUSTED ESTIMATE: Same as original (no adjustment needed)\n\n")
  
  taf_conclusion <- "‚úÖ No publication bias detected by trim-and-fill"
  
} else {
  cat("    ‚Üí ", k_imputed, " studies imputed on ", 
      ifelse(taf_results$side == "left", "left", "right"), " side\n", sep = "")
  cat("    ‚Üí Suggests missing ", 
      ifelse(taf_results$side == "left", "negative", "positive"), 
      " studies\n\n", sep = "")
  
  # Adjusted effect size
  taf_g <- taf_results$b[1]
  taf_ci_lb <- taf_results$ci.lb
  taf_ci_ub <- taf_results$ci.ub
  taf_p <- taf_results$pval
  
  cat("  ORIGINAL vs. ADJUSTED EFFECT:\n")
  cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
  cat("    Original:  g = ", sprintf("%.4f", overall_g), 
      " [", sprintf("%.4f", overall_ci_lb), ", ",
      sprintf("%.4f", overall_ci_ub), "], p = ", 
      sprintf("%.4f", overall_p), "\n", sep = "")
  cat("    Adjusted:  g = ", sprintf("%.4f", taf_g), 
      " [", sprintf("%.4f", taf_ci_lb), ", ",
      sprintf("%.4f", taf_ci_ub), "], p = ", 
      sprintf("%.4f", taf_p), "\n", sep = "")
  
  # Calculate change
  taf_change <- taf_g - overall_g
  taf_change_pct <- (taf_change / overall_g) * 100
  
  cat("    Change:    ", sprintf("%+.4f", taf_change), 
      " (", sprintf("%+.1f", taf_change_pct), "%)\n\n", sep = "")
  
  # Interpretation
  if (abs(taf_change_pct) < 10) {
    taf_conclusion <- "‚ö†Ô∏è  Minimal bias impact (adjustment < 10%)"
    taf_recommendation <- "Report original effect; note trim-and-fill in supplementary"
  } else if (abs(taf_change_pct) < 25) {
    taf_conclusion <- "‚ö†Ô∏è  Moderate bias impact (adjustment 10-25%)"
    taf_recommendation <- "Report both original and adjusted effects"
  } else {
    taf_conclusion <- "‚ö†Ô∏è  Substantial bias impact (adjustment > 25%)"
    taf_recommendation <- "Consider adjusted effect as primary estimate"
  }
  
  cat("  INTERPRETATION:\n")
  cat("    ", taf_conclusion, "\n", sep = "")
  cat("    Recommendation: ", taf_recommendation, "\n", sep = "")
  
  cat("\n")
}

# ----------------------------------------------------------------------------
# Step 4.4: Fail-Safe N (Rosenthal's Method)
# ----------------------------------------------------------------------------
cat("Step 4.4: Fail-safe N analysis...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  METHOD: Calculate number of null studies needed to nullify effect\n")
cat("  PURPOSE: Assess robustness to unreported non-significant studies\n\n")

# Calculate fail-safe N using the fitted model
fsn_results <- fsn(res_overall, type = "Rosenthal")

cat("  FAIL-SAFE N RESULTS:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")
cat("    Fail-safe N:      ", fsn_results$fsnum, " studies\n", sep = "")
cat("    Target (5k+5):    ", 5*k + 5, " studies\n", sep = "")

if (fsn_results$fsnum > (5*k + 5)) {
  cat("    ‚Üí Fail-safe N EXCEEDS target\n")
  cat("    ‚Üí Effect is robust to file-drawer problem\n")
  fsn_conclusion <- "‚úÖ Effect robust to unreported null studies"
} else {
  cat("    ‚Üí Fail-safe N BELOW target\n")
  cat("    ‚Üí Effect may be vulnerable to file-drawer problem\n")
  fsn_conclusion <- "‚ö†Ô∏è  Effect potentially vulnerable to unreported studies"
}

cat("\n")
cat("  INTERPRETATION:\n")
cat("    ", fsn_conclusion, "\n", sep = "")
cat("    ", fsn_results$fsnum, " null studies would be needed to nullify effect\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 4.5: Overall Publication Bias Summary
# ----------------------------------------------------------------------------
cat("Step 4.5: Publication bias assessment summary...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n\n")

cat("  PUBLICATION BIAS ASSESSMENT SUMMARY:\n")
cat("  ", paste0(rep("=", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Count bias indicators
bias_indicators <- 0
if (egger_test$pval < 0.05) bias_indicators <- bias_indicators + 1
if (k_imputed > 0) bias_indicators <- bias_indicators + 1
if (fsn_results$fsnum < (5*k + 5)) bias_indicators <- bias_indicators + 1

cat("  1. FUNNEL PLOT:\n")
cat("     ‚Üí Visual inspection recommended (see funnel_plot.png)\n\n")

cat("  2. EGGER'S TEST:\n")
cat("     ‚Üí ", egger_conclusion, "\n", sep = "")
cat("     ‚Üí p = ", sprintf("%.4f", egger_test$pval), "\n\n", sep = "")

cat("  3. TRIM-AND-FILL:\n")
cat("     ‚Üí ", taf_conclusion, "\n", sep = "")
if (k_imputed > 0) {
  cat("     ‚Üí ", k_imputed, " studies imputed\n", sep = "")
  cat("     ‚Üí Adjusted g = ", sprintf("%.4f", taf_g), 
      " (change: ", sprintf("%+.1f", taf_change_pct), "%)\n", sep = "")
}
cat("\n")

cat("  4. FAIL-SAFE N:\n")
cat("     ‚Üí ", fsn_conclusion, "\n", sep = "")
cat("     ‚Üí N = ", fsn_results$fsnum, " (target: ", 5*k + 5, ")\n\n", sep = "")

cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("  BIAS INDICATORS PRESENT: ", bias_indicators, "/3\n", sep = "")
cat("  ", paste0(rep("-", 76), collapse = ""), "\n", sep = "")
cat("\n")

# Overall conclusion
if (bias_indicators == 0) {
  overall_bias_conclusion <- "‚úÖ NO EVIDENCE OF PUBLICATION BIAS"
  overall_bias_recommendation <- "Results appear unbiased; report with confidence"
} else if (bias_indicators == 1) {
  overall_bias_conclusion <- "‚ö†Ô∏è  MINIMAL EVIDENCE OF PUBLICATION BIAS"
  overall_bias_recommendation <- "Results likely robust; acknowledge limitations"
} else if (bias_indicators == 2) {
  overall_bias_conclusion <- "‚ö†Ô∏è  MODERATE EVIDENCE OF PUBLICATION BIAS"
  overall_bias_recommendation <- "Report bias tests; consider adjusted estimates"
} else {
  overall_bias_conclusion <- "‚ö†Ô∏è  SUBSTANTIAL EVIDENCE OF PUBLICATION BIAS"
  overall_bias_recommendation <- "Interpret results cautiously; report all bias assessments"
}

cat("  OVERALL CONCLUSION:\n")
cat("    ", overall_bias_conclusion, "\n", sep = "")
cat("    Recommendation: ", overall_bias_recommendation, "\n", sep = "")

cat("\n")

# ----------------------------------------------------------------------------
# Step 4.6: Export Publication Bias Results
# ----------------------------------------------------------------------------
cat("Step 4.6: Exporting publication bias assessment results...\n")
cat(paste0(rep("-", 80), collapse = ""), "\n")

# Verify output_path exists
if (!exists("output_path") || is.null(output_path)) {
  output_path <- getwd()
}

# Compile all bias test results
bias_results <- data.frame(
  Test = c("Egger's Regression", "Trim-and-Fill", "Fail-Safe N"),
  Statistic = c(
    sprintf("Intercept = %.4f", egger_test$est),
    sprintf("%d studies imputed", k_imputed),
    sprintf("N = %d", fsn_results$fsnum)
  ),
  p_value = c(
    round(egger_test$pval, 4),
    ifelse(k_imputed > 0, round(taf_p, 4), NA),
    NA
  ),
  Interpretation = c(
    ifelse(egger_test$pval < 0.05, "Asymmetry detected", "No asymmetry"),
    ifelse(k_imputed > 0, "Bias suspected", "No bias"),
    ifelse(fsn_results$fsnum > (5*k + 5), "Robust", "Vulnerable")
  ),
  Bias_Evidence = c(
    ifelse(egger_test$pval < 0.05, "Yes", "No"),
    ifelse(k_imputed > 0, "Yes", "No"),
    ifelse(fsn_results$fsnum < (5*k + 5), "Yes", "No")
  ),
  stringsAsFactors = FALSE
)

# Add trim-and-fill adjusted effect if imputed
if (k_imputed > 0) {
  taf_effect <- data.frame(
    Analysis = "Trim-and-Fill Adjusted",
    k_original = k,
    k_imputed = k_imputed,
    Estimate_original = round(overall_g, 4),
    Estimate_adjusted = round(taf_g, 4),
    Change = round(taf_change, 4),
    Change_pct = round(taf_change_pct, 2),
    CI_Lower_adjusted = round(taf_ci_lb, 4),
    CI_Upper_adjusted = round(taf_ci_ub, 4),
    p_value_adjusted = round(taf_p, 4),
    stringsAsFactors = FALSE
  )
  
  tryCatch({
    write.csv(taf_effect,
              file.path(output_path, "publication_bias_trim_and_fill.csv"),
              row.names = FALSE,
              fileEncoding = "UTF-8")
    cat("  ‚úÖ Trim-and-fill results saved: publication_bias_trim_and_fill.csv\n")
  }, error = function(e) {
    timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
    alt_file <- file.path(output_path, paste0("publication_bias_trim_and_fill_", timestamp, ".csv"))
    write.csv(taf_effect, alt_file, row.names = FALSE, fileEncoding = "UTF-8")
    cat("  ‚ö†Ô∏è  Using timestamped file: publication_bias_trim_and_fill_", timestamp, ".csv\n", sep = "")
  })
}

tryCatch({
  write.csv(bias_results,
            file.path(output_path, "publication_bias_tests.csv"),
            row.names = FALSE,
            fileEncoding = "UTF-8")
  cat("  ‚úÖ Bias test summary saved: publication_bias_tests.csv\n")
}, error = function(e) {
  timestamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  alt_file <- file.path(output_path, paste0("publication_bias_tests_", timestamp, ".csv"))
  write.csv(bias_results, alt_file, row.names = FALSE, fileEncoding = "UTF-8")
  cat("  ‚ö†Ô∏è  Using timestamped file: publication_bias_tests_", timestamp, ".csv\n", sep = "")
})

cat("\n")
cat("‚úÖ STEP 4 COMPLETE: Publication bias assessment finished\n")
cat(paste0(rep("=", 80), collapse = ""), "\n\n")


STEP 4: PUBLICATION BIAS ASSESSMENT

Research Question: Is there evidence of publication bias?

Step 4.1: Funnel plot assessment...
-------------------------------------------------------------------------------- 

  FUNNEL PLOT INTERPRETATION:
  ----------------------------------------------------------------------------
    Expected (no bias): Symmetric inverted funnel
STEP 4: PUBLICATION BIAS ASSESSMENT

Research Question: Is there evidence of publication bias?

Step 4.1: Funnel plot assessment...
-------------------------------------------------------------------------------- 

  FUNNEL PLOT INTERPRETATION:
  ----------------------------------------------------------------------------
    Expected (no bias): Symmetric inverted funnel
    Bias indicator:     Asymmetry, especially at bottom (low precision)
    Missing studies:    Gaps in lower right (small non-significant)

  NOTE: Funnel plot saved for visual inspection
        (Requires graphical output - not shown in text)

    Bi

"Setting type='General' when using fsn() on a model object."


  FAIL-SAFE N RESULTS:

    Fail-safe N:      281 studies
    Target (5k+5):    150 studies
    ‚Üí Fail-safe N EXCEEDS target
    ‚Üí Effect is robust to file-drawer problem

  INTERPRETATION:
    ‚úÖ Effect robust to unreported null studies
    281 null studies would be needed to nullify effect

    Fail-safe N:      281 studies
    Target (5k+5):    150 studies
    ‚Üí Fail-safe N EXCEEDS target
    ‚Üí Effect is robust to file-drawer problem

  INTERPRETATION:
    ‚úÖ Effect robust to unreported null studies
    281 null studies would be needed to nullify effect

Step 4.5: Publication bias assessment summary...
-------------------------------------------------------------------------------- 

  PUBLICATION BIAS ASSESSMENT SUMMARY:

  1. FUNNEL PLOT:
     ‚Üí Visual inspection recommended (see funnel_plot.png)

  2. EGGER'S TEST:

Step 4.5: Publication bias assessment summary...
-------------------------------------------------------------------------------- 

  PUBLICATION BIAS ASS