# Moderation of Self-Determination on AI Acceptance

Goal:
- Test Hypothesis 2 (H2): whether the association between self-determination (TENS_Life_mean) and AI acceptance for mental-health applications (UTAUT_AI_mean) is moderated by general AI attitudes (GAAIS_mean).

Model: UTAUT_AI_mean ~ TENS_c * GAAIS_c + age_c + C(gender) + C(Country)

Key Steps:
- Load merged cross-cultural dataset
- Define H2 analysis sample
- Center continuous predictors
- Fit main-effects and interaction models
- Inspect coefficients, R², and diagnostics
- Prepare simple slopes / plot-ready data for interpretation

# 0.0 Imports and Path Setup

In [None]:
from __future__ import annotations

import warnings
from pathlib import Path
from typing import Dict, List

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from statsmodels.stats.outliers_influence import variance_inflation_factor

warnings.filterwarnings("ignore", category=FutureWarning)

PROJECT_ROOT = Path.cwd()
DATA_DIR = PROJECT_ROOT / "data"
OUTPUT_DIR = DATA_DIR / "output"
PROCESSED_PATH = OUTPUT_DIR / "processed.csv"

# 1.0 Define H2 Analysis Sample

Goal of H2: Test whether general AI attitudes (GAAIS) moderate the association between self-determination (TENS) and global AI acceptance for mental-health interventions (UTAUT_AI_mean).

- Outcome: UTAUT_AI_mean_imputed

- Predictor (SDT): TENS_Life_mean_imputed

- Moderator: GAAIS_mean_imputed

- Covariates: age_imputed, gender, Country (China vs USA; covariate, not moderator here)

In [None]:
processed = pd.read_csv(PROCESSED_PATH)
print("Processed shape:", processed.shape)

In [None]:
h2_vars = [
    "UTAUT_AI_mean_imputed",    # outcome
    "TENS_Life_mean_imputed",   # SDT predictor
    "GAAIS_mean_imputed",       # moderator
    "age_imputed",              # covariate
    "gender",                   # covariate (categorical)
    "Country",                  # covariate (China vs USA)
]

h2_df = processed[h2_vars].copy()

## 1.1. Keep rows with non-missing categorical covariates

In [None]:
n_total = len(h2_df)
h2_df = h2_df.dropna(subset=["gender", "Country"])
n_analytic = len(h2_df)

In [None]:
print("H2 analytic sample:")
print(f"Total N in processed: {n_total}")
print(f"N with non-missing gender & Country: {n_analytic}")

In [None]:
print("Country distribution (H2 sample):")
print(h2_df["Country"].value_counts(dropna=False))

In [None]:
print("Gender distribution (H2 sample):")
print(h2_df["gender"].value_counts(dropna=False))

# 2.0 Descriptive Statistics and Correlations (H2)

We describe the continuous variables and inspect basic correlations among SDT, GAAIS, age, and global AI acceptance.

In [None]:
continuous_h2 = [
    "UTAUT_AI_mean_imputed",
    "TENS_Life_mean_imputed",
    "GAAIS_mean_imputed",
    "age_imputed",
]

print("Descriptive statistics (H2 continuous variables):")
display(h2_df[continuous_h2].describe().T)

In [None]:
# Correlation matrix
corr_h2 = h2_df[continuous_h2].corr()
print("Correlation matrix (H2):")
display(corr_h2.round(3))

# 3.0 Center Continuous Predictors

We mean-center SDT (TENS), general AI attitudes (GAAIS), and age for interpretability and to align with moderation conventions.

In [None]:
center_cols_h2 = ["TENS_Life_mean_imputed", "GAAIS_mean_imputed", "age_imputed"]

for col in center_cols_h2:
    mean_val = h2_df[col].mean()
    h2_df[f"{col}_c"] = h2_df[col] - mean_val
    print(f"{col} mean for centering: {mean_val:.3f}")

In [None]:
print("Means of centered variables (should be ≈ 0):")
print(h2_df[[f"{c}_c" for c in center_cols_h2]].mean())
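
What centering does to a moderation model can be checked on a small synthetic example. The data and the names `x`, `m`, and `fit_interaction` below are illustrative assumptions, not the study's variables. The sketch suggests that centering leaves the interaction coefficient unchanged, while the lower-order coefficient of the predictor becomes its slope at the mean of the moderator.

```python
import numpy as np

# Synthetic data for illustration only (not the study's data)
rng = np.random.default_rng(0)
n = 500
x = rng.normal(3.0, 1.0, n)   # hypothetical predictor (stand-in for TENS)
m = rng.normal(4.0, 1.0, n)   # hypothetical moderator (stand-in for GAAIS)
y = 1.0 + 0.5 * x + 0.3 * m + 0.2 * x * m + rng.normal(0.0, 1.0, n)


def fit_interaction(x, m, y):
    """OLS of y on [1, x, m, x*m]; returns coefficients in that order."""
    X = np.column_stack([np.ones_like(x), x, m, x * m])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


b_raw = fit_interaction(x, m, y)
b_cen = fit_interaction(x - x.mean(), m - m.mean(), y)

# The interaction coefficient is the same in both parameterizations
print("interaction (raw, centered):", b_raw[3], b_cen[3])
# The centered x coefficient equals the raw slope evaluated at mean(m)
print("x slope at mean(m):", b_raw[1] + b_raw[3] * m.mean(), b_cen[1])
```

Because both fits span the same column space, the fitted values and R² are identical; only the meaning of the lower-order terms changes.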

# 4.0 Baseline H2 Model: Main Effects Only

We first estimate a main-effects model without the interaction:

UTAUT_AI = β_0 + β_1 TENS_c + β_2 GAAIS_c + β_3 age_c + β_4 gender + β_5 Country + ε

In [None]:
formula_h2_main = (
    "UTAUT_AI_mean_imputed ~ "
    "TENS_Life_mean_imputed_c + GAAIS_mean_imputed_c "
    "+ age_imputed_c + C(gender) + C(Country)"
)

h2_main_model = smf.ols(formula=formula_h2_main, data=h2_df).fit()

print("H2 Baseline (main effects only) model summary:")
display(h2_main_model.summary())

In [None]:
print(f"R² (H2 main-effects model): {h2_main_model.rsquared:.3f}")

In [None]:
print(f"Adj. R² (H2 main-effects model): {h2_main_model.rsquared_adj:.3f}")

# 5.0. Full H2 Model: SDT × GAAIS Interaction

We add the interaction between SDT and general AI attitudes:

UTAUT_AI = β_0 + β_1 TENS_c + β_2 GAAIS_c + β_3 (TENS_c × GAAIS_c) + covariates + ε

In [None]:
# Full H2 model with interaction term
formula_h2_full = (
    "UTAUT_AI_mean_imputed ~ "
    "TENS_Life_mean_imputed_c * GAAIS_mean_imputed_c "
    "+ age_imputed_c + C(gender) + C(Country)"
)

h2_full_model = smf.ols(formula=formula_h2_full, data=h2_df).fit()

print("H2 Full model (with interaction) summary:")
display(h2_full_model.summary())

In [None]:
print(f"R² (H2 full model): {h2_full_model.rsquared:.3f}")

In [None]:
print(f"Adj. R² (H2 full model): {h2_full_model.rsquared_adj:.3f}")

## 5.1. Extract Key Coefficients

In [None]:
print("Key H2 coefficients:")
display(
    h2_full_model.params[[
        "TENS_Life_mean_imputed_c",
        "GAAIS_mean_imputed_c",
        "TENS_Life_mean_imputed_c:GAAIS_mean_imputed_c"
    ]]
)

In [None]:
print("Key H2 p-values:")
display(
    h2_full_model.pvalues[[
        "TENS_Life_mean_imputed_c",
        "GAAIS_mean_imputed_c",
        "TENS_Life_mean_imputed_c:GAAIS_mean_imputed_c"
    ]]
)

# 6.0. Model Comparison: Main Effects vs. Interaction

We compare the main-effects model vs. the interaction model using ANOVA and ΔR².

In [None]:
print("ANOVA comparison: main-effects vs interaction model (H2)")
anova_results = anova_lm(h2_main_model, h2_full_model)
display(anova_results)

Adding the interaction term reduced the residual sum of squares (SSR dropped from 3095.81 to 3091.40), but this reduction is not statistically significant at the .05 level:
- F = 3.17
- p = 0.075 (above .05, so at most suggestive)
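
For nested OLS models, the F statistic in this kind of comparison is derived from the drop in residual sum of squares. The helper below is a minimal sketch of that arithmetic; the function name `partial_f` and the numbers passed to it are hypothetical, chosen only to make the calculation easy to follow, and are not the H2 results.

```python
def partial_f(ssr_restricted, ssr_full, df_resid_full, n_added=1):
    """F statistic for adding `n_added` terms to a nested OLS model.

    ssr_restricted: residual sum of squares of the smaller model
    ssr_full:       residual sum of squares of the larger model
    df_resid_full:  residual degrees of freedom of the larger model
    """
    numerator = (ssr_restricted - ssr_full) / n_added
    denominator = ssr_full / df_resid_full
    return numerator / denominator


# Hypothetical toy values (not taken from the H2 models)
f_one_term = partial_f(110.0, 100.0, df_resid_full=50)
f_two_terms = partial_f(110.0, 100.0, df_resid_full=50, n_added=2)
print(f_one_term, f_two_terms)
```

With a single added term, as in the interaction model here, this F equals the square of the t statistic for that term.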

In [None]:
r2_main = h2_main_model.rsquared
r2_full = h2_full_model.rsquared
delta_r2 = r2_full - r2_main

print(f"R² (main effects): {r2_main:.3f}")
print(f"R² (full with interaction): {r2_full:.3f}")
print(f"ΔR² due to interaction: {delta_r2:.3f}")
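
A change in R² is often reported together with Cohen's f² as a local effect size for the added term. The sketch below shows one way this could be computed; `cohens_f2` is a hypothetical helper and the R² values are placeholders, not the fitted H2 values.

```python
def cohens_f2(r2_reduced, r2_full):
    """Cohen's f² for the terms added between two nested models."""
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    return (r2_full - r2_reduced) / (1.0 - r2_full)


# Placeholder values for illustration only
f2_example = cohens_f2(r2_reduced=0.30, r2_full=0.31)
print(f"f² = {f2_example:.4f}")
```

In the notebook, the same function could be called with `r2_main` and `r2_full`.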

# 9.0. Multicollinearity Check

- Reuse the design matrix of the full H2 model (intercept column retained so each auxiliary regression includes a constant)
- Report VIFs for every term except the intercept

In [None]:
X_h2 = h2_full_model.model.exog
vif_data_h2 = []

for i, name in enumerate(h2_full_model.model.exog_names):
    if name == "Intercept":
        continue
    vif = variance_inflation_factor(X_h2, i)
    vif_data_h2.append({"Predictor": name, "VIF": vif})

vif_h2_df = pd.DataFrame(vif_data_h2).sort_values("VIF", ascending=False)

print("Variance Inflation Factors (H2 full model):")
display(vif_h2_df)
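
The VIF of a predictor is 1 / (1 − R²_j), where R²_j comes from regressing that predictor on all the other columns of the design matrix. The NumPy sketch below reproduces that definition on synthetic data; the function `vif` and the columns `a`, `b`, `c` are illustrative assumptions and stand in for no variable in the H2 model.

```python
import numpy as np


def vif(X, j):
    """VIF of column j of X, where X already contains a constant column."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    beta, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ beta
    total_ss = ((target - target.mean()) ** 2).sum()
    r2 = 1.0 - (resid @ resid) / total_ss
    return 1.0 / (1.0 - r2)


# Synthetic design matrix for illustration only
rng = np.random.default_rng(1)
n = 400
a = rng.normal(size=n)
b = 0.8 * a + 0.6 * rng.normal(size=n)   # constructed to correlate with a
c = rng.normal(size=n)                   # independent of a and b
X = np.column_stack([np.ones(n), a, b, c])

names = ["const", "a", "b", "c"]
vifs = {name: vif(X, i) for i, name in enumerate(names) if name != "const"}
print(vifs)
```

Correlated columns (`a`, `b`) get VIFs well above 1, while the independent column stays close to 1.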

# 10.0. Residual Diagnostics (H2 Full Model)

We check linearity, homoscedasticity, and residual distribution.

In [None]:
# Raw residuals and fitted values from the full H2 model
h2_df["resid_h2"] = h2_full_model.resid
h2_df["fitted_h2"] = h2_full_model.fittedvalues

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Residuals vs fitted values
sns.scatterplot(
    x="fitted_h2",
    y="resid_h2",
    data=h2_df,
    ax=axes[0],
    alpha=0.5
)
axes[0].axhline(0, color="black", linestyle="--", linewidth=1)
axes[0].set_xlabel("Fitted values")
axes[0].set_ylabel("Residuals")
axes[0].set_title("H2: Residuals vs Fitted")

# Residual distribution
sns.histplot(h2_df["resid_h2"], kde=True, ax=axes[1])
axes[1].set_xlabel("Residual")
axes[1].set_title("H2: Residual Distribution")

plt.tight_layout()
plt.show()
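
The plots above use raw residuals. If standardized residuals are preferred, one common choice is the internally studentized residual, which divides each residual by its estimated standard deviation using the leverage of that observation. The NumPy sketch below runs on synthetic data; `internally_studentized` is a hypothetical helper. statsmodels exposes a comparable quantity through `get_influence().resid_studentized_internal` on fitted OLS results, which may be the more convenient route in this notebook.

```python
import numpy as np


def internally_studentized(X, y):
    """Internally studentized OLS residuals for design matrix X and outcome y."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage h_ii: diagonal of the hat matrix X (X'X)^-1 X'
    xtx_inv = np.linalg.inv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    return resid / np.sqrt(sigma2 * (1.0 - leverage))


# Synthetic data for illustration only
rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)

r = internally_studentized(X, y)
print("share of |r| > 2:", np.mean(np.abs(r) > 2))
```

Values beyond roughly ±2 or ±3 are the ones usually inspected as potential outliers.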

# 11.0. Simple Slopes / Plot-Ready Data (Low / Mean / High GAAIS)

Even if the interaction is non-significant, reviewers often like seeing simple slopes for interpretability. Here we generate predicted UTAUT scores across SDT levels at low / mean / high GAAIS (±1 SD).

In [None]:
# Ensure centered columns exist
for col in ["TENS_Life_mean_imputed", "GAAIS_mean_imputed", "age_imputed"]:
    c_name = f"{col}_c"
    if c_name not in h2_df.columns:
        h2_df[c_name] = h2_df[col] - h2_df[col].mean()

# Create +/- 1 SD values of moderator (GAAIS) on the *centered* scale
g_mean = h2_df["GAAIS_mean_imputed_c"].mean()   # should be ~0
g_sd   = h2_df["GAAIS_mean_imputed_c"].std()

g_levels = {
    "low_GAAIS":  g_mean - g_sd,
    "mean_GAAIS": g_mean,
    "high_GAAIS": g_mean + g_sd,
}

# Range of TENS (centered) values for simple slopes
tens_min = h2_df["TENS_Life_mean_imputed_c"].min()
tens_max = h2_df["TENS_Life_mean_imputed_c"].max()

tens_grid_c = np.linspace(tens_min, tens_max, 50)

pred_rows = []

for level_name, g_val in g_levels.items():
    for t_val in tens_grid_c:
        row = {
            "TENS_Life_mean_imputed_c": t_val,
            "GAAIS_mean_imputed_c":    g_val,
            "age_imputed_c":           0.0,  # mean-centered age
            "gender":                  h2_df["gender"].mode()[0],
            "Country":                 h2_df["Country"].mode()[0],
            "GAAIS_level":             level_name,
        }
        pred_rows.append(row)

pred_df = pd.DataFrame(pred_rows)

# Use the fitted full H2 model (with interaction) to get predictions
pred_df["UTAUT_pred"] = h2_full_model.predict(pred_df)

# Add a raw-scale TENS variable for nicer plotting
tens_raw_mean = h2_df["TENS_Life_mean_imputed"].mean()
pred_df["TENS_Life_raw"] = pred_df["TENS_Life_mean_imputed_c"] + tens_raw_mean

display(pred_df.head())
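
Beyond predicted values, the simple slope of the focal predictor at a given moderator value w is b1 + b3·w, with variance Var(b1) + w²·Var(b3) + 2w·Cov(b1, b3). The sketch below implements that formula; `simple_slope` is a hypothetical helper and all numbers are made up for illustration. In this notebook the inputs would presumably come from `h2_full_model.params` and `h2_full_model.cov_params()`.

```python
import math


def simple_slope(b1, b3, var_b1, var_b3, cov_b1_b3, w):
    """Slope of the focal predictor at moderator value w, with its standard error."""
    slope = b1 + b3 * w
    variance = var_b1 + (w ** 2) * var_b3 + 2.0 * w * cov_b1_b3
    return slope, math.sqrt(variance)


# Hypothetical coefficients and (co)variances, not the fitted H2 estimates
b1, b3 = 0.20, 0.05
var_b1, var_b3, cov_b1_b3 = 0.0004, 0.0001, 0.0

# Moderator values on the centered scale, in units of a hypothetical SD of 1
for label, w in {"low": -1.0, "mean": 0.0, "high": 1.0}.items():
    slope, se = simple_slope(b1, b3, var_b1, var_b3, cov_b1_b3, w)
    print(f"{label}: slope = {slope:.3f}, SE = {se:.4f}")
```

Dividing each slope by its standard error gives a t statistic on the residual degrees of freedom of the full model.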

# 12.0. Plot simple slopes of TENS → UTAUT at different GAAIS levels

In [None]:
plt.figure(figsize=(8, 6))
sns.lineplot(
    data=pred_df,
    x="TENS_Life_raw",
    y="UTAUT_pred",
    hue="GAAIS_level"
)
plt.xlabel("Self-Determination (TENS, raw scale)")
plt.ylabel("Predicted Global AI Acceptance (UTAUT_AI)")
plt.title("H2: Predicted AI Acceptance Across SDT\nat Low / Mean / High General AI Attitudes")
plt.tight_layout()
plt.show()

# Narrative Summary

In [None]:
beta_sdt = h2_full_model.params["TENS_Life_mean_imputed_c"]
p_sdt = h2_full_model.pvalues["TENS_Life_mean_imputed_c"]

beta_gaais = h2_full_model.params["GAAIS_mean_imputed_c"]
p_gaais = h2_full_model.pvalues["GAAIS_mean_imputed_c"]

beta_int = h2_full_model.params["TENS_Life_mean_imputed_c:GAAIS_mean_imputed_c"]
p_int = h2_full_model.pvalues["TENS_Life_mean_imputed_c:GAAIS_mean_imputed_c"]

print(
    f"In the H2 model, higher self-determination (SDT; TENS) was associated with "
    f"greater global acceptance of AI mental-health interventions "
    f"(β = {beta_sdt:.3f}, p = {p_sdt:.3g}), controlling for age, gender, country, "
    f"and general AI attitudes.\n"
    f"General AI attitudes (GAAIS) also showed a positive association with global AI acceptance "
    f"(β = {beta_gaais:.3f}, p = {p_gaais:.3g}).\n"
    f"The SDT × GAAIS interaction term was "
    f"{'statistically significant' if p_int < 0.05 else 'not statistically significant'} "
    f"(β = {beta_int:.3f}, p = {p_int:.3g}), and the inclusion of the interaction changed R² by "
    f"ΔR² = {delta_r2:.3f} relative to the main-effects model."
)