# Role Moderation of SDT → Intervention-Specific Acceptance (China Sample)

Goal of H3

Test whether the association between self-determination (SDT; TENS_Life_mean_imputed) and intervention-specific acceptance:

- Accept_avatar_imputed (AI avatar / generic AI therapist)
- Accept_chatbot_imputed (AI chatbot)
- Accept_tele_imputed (teletherapy / human therapist)

is moderated by clinical role (role_label: client vs therapist) in the Chinese sample.

Note: Because the USA sample has role_label = "unknown" for all cases, a joint SDT × Country × Role model is not identified. Cross-country differences are instead handled via Country main effects in H1/H2. H3 focuses on role moderation within China where both clients and therapists are observed.
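The identification problem described in the note can be illustrated with a toy design matrix. This is a minimal sketch on made-up rows (not the study data), assuming dummy coding with China and client as reference levels: because every USA row has role "unknown", the "unknown" dummy duplicates the USA dummy and the matrix loses rank.

```python
import numpy as np

# Hypothetical toy layout mirroring the note: China has clients and
# therapists, while every USA row has role "unknown".
country = np.array(["China"] * 4 + ["USA"] * 4)
role = np.array(["client", "therapist", "client", "therapist"] + ["unknown"] * 4)

# Dummy-coded main effects (reference levels: China, client).
intercept = np.ones(len(country))
is_usa = (country == "USA").astype(float)
is_therapist = (role == "therapist").astype(float)
is_unknown = (role == "unknown").astype(float)

X = np.column_stack([intercept, is_usa, is_therapist, is_unknown])
rank = np.linalg.matrix_rank(X)
print(f"columns = {X.shape[1]}, rank = {rank}")

# The "unknown" role dummy is identical to the USA dummy, so their
# effects cannot be separated.
print("role_unknown identical to country_USA:", np.array_equal(is_usa, is_unknown))
```

A rank below the number of columns means at least one coefficient cannot be estimated uniquely, which is why H3 is restricted to the China sample.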

# 0.0 Paths and Data Loading

In [None]:
from __future__ import annotations

import warnings
from pathlib import Path
from typing import Dict, List

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from statsmodels.stats.outliers_influence import variance_inflation_factor

warnings.filterwarnings("ignore", category=FutureWarning)

sns.set(style="whitegrid")
plt.rcParams["figure.figsize"] = (8, 5)
plt.rcParams["axes.titlesize"] = 13
plt.rcParams["axes.labelsize"] = 12
plt.rcParams["font.size"] = 11

PROJECT_ROOT = Path.cwd().resolve()
DATA_DIR = PROJECT_ROOT / "data"
OUTPUT_DIR = DATA_DIR / "output"
PROCESSED_PATH = OUTPUT_DIR / "processed.csv"

processed = pd.read_csv(PROCESSED_PATH)

# 1.0. Define H3 Contextual Sample (China Only) and Descriptives

Variables are taken from processed.csv, using the imputed versions where available:

- Outcomes: Accept_avatar_imputed, Accept_chatbot_imputed, Accept_tele_imputed
- Predictor: TENS_Life_mean_imputed
- Covariates: age_imputed, gender, PHQ5_mean_imputed, SSRPH_mean_imputed, GAAIS_mean_imputed, ET_mean_imputed, plus Country and role_label.

Restricted to:
- Country == "China"
- role_label ∈ {"client", "therapist"}
- non-missing gender + role

Final N = 485 (therapists = 269, clients = 216).

In [None]:
h3_vars = [
    # outcomes
    "Accept_avatar_imputed",
    "Accept_chatbot_imputed",
    "Accept_tele_imputed",
    # SDT predictor
    "TENS_Life_mean_imputed",
    # covariates
    "age_imputed", "gender", "Country", "role_label",
    "PHQ5_mean_imputed", "SSRPH_mean_imputed",
    "GAAIS_mean_imputed", "ET_mean_imputed",
]

context_df = processed[h3_vars].copy()

## 1.1. Focus on the Chinese sample where role_label is known (client vs therapist)

In [None]:
context_df = context_df[
    (context_df["Country"] == "China") &
    (context_df["role_label"].isin(["client", "therapist"]))
].copy()

# Drop rows missing key covariates (gender, role)
context_df = context_df.dropna(subset=["gender", "role_label"])

In [None]:
print("H3 contextual sample (China, client/therapist only):")
print("N =", len(context_df))

In [None]:
print("Role distribution:")
print(context_df["role_label"].value_counts(dropna=False))

In [None]:
print("Gender distribution:")
print(context_df["gender"].value_counts(dropna=False))

In [None]:
print("Descriptives for SDT and outcomes (China only):")
display(
    context_df[
        ["TENS_Life_mean_imputed",
         "Accept_avatar_imputed",
         "Accept_chatbot_imputed",
         "Accept_tele_imputed"]
    ].describe().T
)

# 2.0. Center SDT and Age

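In this China subsample, age_imputed has a mean of ≈ 12.36, which is implausible if age is recorded in years. Centering works the same way regardless of the unit, so this does not break the models, but it points to a data / measurement issue: the coding of age_imputed should be checked upstream before the age coefficient is interpreted.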
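What centering does and does not change can be checked on simulated data. The sketch below uses made-up stand-ins (x for the SDT score, g for a 0/1 role indicator), not the study variables: the slope and the interaction coefficient are identical with and without centering, while the role coefficient shifts from "gap at x = 0" to "gap at the mean of x".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated stand-ins (not the study data): x plays the role of the SDT
# score, g is a 0/1 role indicator (0 = client, 1 = therapist).
x = rng.normal(loc=4.0, scale=1.0, size=n)
g = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.5 * x + 0.3 * g + 0.2 * x * g + rng.normal(scale=0.5, size=n)


def fit(x_used):
    """OLS coefficients for y ~ 1 + x + g + x:g via least squares."""
    X = np.column_stack([np.ones(n), x_used, g, x_used * g])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


b_raw = fit(x)
b_cen = fit(x - x.mean())

# Slope and interaction are unchanged by centering ...
print("x slope     raw vs centered:", b_raw[1], b_cen[1])
print("interaction raw vs centered:", b_raw[3], b_cen[3])
# ... but the role coefficient now refers to the gap at the mean of x
# instead of the gap at x = 0.
print("role effect raw vs centered:", b_raw[2], b_cen[2])
```

Centering therefore leaves the H3 interaction test untouched; it only makes the role main effect interpretable at a typical SDT level.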

In [None]:
# Center key continuous predictors for interpretability
for col in ["TENS_Life_mean_imputed", "age_imputed"]:
    mean_val = context_df[col].mean()
    context_df[f"{col}_c"] = context_df[col] - mean_val
    print(f"{col} mean for centering (China H3 sample): {mean_val:.3f}")

In [None]:
print("Means of centered variables (≈ 0):")
print(context_df[["TENS_Life_mean_imputed_c", "age_imputed_c"]].mean())

# 3.0. Helper Function: Baseline vs Role-Moderation Models for Each Outcome

Baseline models: {outcome} ~ TENS_Life_mean_imputed_c + age_imputed_c + C(gender) + PHQ5_mean_imputed + SSRPH_mean_imputed + GAAIS_mean_imputed + ET_mean_imputed + C(role_label)

Role-moderation models: {outcome} ~ TENS_Life_mean_imputed_c * C(role_label) + age_imputed_c + C(gender) + PHQ5_mean_imputed + SSRPH_mean_imputed + GAAIS_mean_imputed + ET_mean_imputed
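Because role_label has two levels in this sample, the two formulas differ by exactly one coefficient. In that case the nested-model F-test equals the square of the interaction term's t statistic. The sketch below checks this on simulated data with a plain NumPy OLS; the variables and the `ols` helper are illustrative assumptions, not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Simulated stand-ins (not the study data).
x = rng.normal(size=n)                         # centered predictor
g = rng.integers(0, 2, size=n).astype(float)   # 0 = client, 1 = therapist
y = 2.0 + 0.4 * x + 0.3 * g + rng.normal(size=n)


def ols(X, y):
    """Return coefficients, residual sum of squares, and coefficient covariance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    df_resid = X.shape[0] - X.shape[1]
    cov = rss / df_resid * np.linalg.inv(X.T @ X)
    return beta, rss, cov


X_base = np.column_stack([np.ones(n), x, g])          # main effects only
X_mod = np.column_stack([np.ones(n), x, g, x * g])    # adds x:g

_, rss_base, _ = ols(X_base, y)
beta_mod, rss_mod, cov_mod = ols(X_mod, y)

# Partial F-test for the single added term (1 numerator df).
df_resid_mod = n - X_mod.shape[1]
f_stat = (rss_base - rss_mod) / (rss_mod / df_resid_mod)

# t statistic of the interaction coefficient in the larger model.
t_int = beta_mod[3] / np.sqrt(cov_mod[3, 3])

print(f"F = {f_stat:.4f}, t^2 = {t_int ** 2:.4f}")
```

The model comparison and the interaction row of the coefficient table should therefore give the same p-value for each outcome.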

In [None]:
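def fit_role_moderation(outcome: str, data: pd.DataFrame):
    """Fit baseline and SDT × role models for one outcome on complete cases.

    Returns (sub_df, baseline_model, role_model), or (None, None, None)
    if no complete cases are available.
    """
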
    cols = [
        outcome,
        "TENS_Life_mean_imputed_c",
        "age_imputed_c",
        "gender", "role_label",
        "PHQ5_mean_imputed",
        "SSRPH_mean_imputed",
        "GAAIS_mean_imputed",
        "ET_mean_imputed",
    ]

    sub_df = data[cols].dropna().copy()
    if sub_df.empty:
        print(f"\n{outcome}: no complete cases available for role moderation.")
        return None, None, None

    print(f"\n=== H3 Role moderation for {outcome} (China; N={len(sub_df)}) ===")

    # Baseline model: SDT main effect + covariates + role main effect
    baseline_formula = (
        f"{outcome} ~ "
        "TENS_Life_mean_imputed_c "
        "+ age_imputed_c + C(gender) "
        "+ PHQ5_mean_imputed + SSRPH_mean_imputed "
        "+ GAAIS_mean_imputed + ET_mean_imputed "
        "+ C(role_label)"
    )

    baseline_model = smf.ols(formula=baseline_formula, data=sub_df).fit()
    print("\nBaseline model (main effects only):")
    display(baseline_model.summary().tables[1])
    print(f"R² (baseline) = {baseline_model.rsquared:.3f}")

    # Role-moderation model: add SDT × role interaction
    role_formula = (
        f"{outcome} ~ "
        "TENS_Life_mean_imputed_c * C(role_label) "
        "+ age_imputed_c + C(gender) "
        "+ PHQ5_mean_imputed + SSRPH_mean_imputed "
        "+ GAAIS_mean_imputed + ET_mean_imputed"
    )

    role_model = smf.ols(formula=role_formula, data=sub_df).fit()
    print("\nRole-moderation model (TENS × role_label):")
    display(role_model.summary().tables[1])
    print(f"R² (role-moderation) = {role_model.rsquared:.3f}")

    # Model comparison via ANOVA
    print("\nModel comparison (Baseline vs Role-moderation):")
    comp = anova_lm(baseline_model, role_model)
    display(comp)

    return sub_df, baseline_model, role_model

In [None]:
h3_outcomes = [
    "Accept_avatar_imputed",
    "Accept_chatbot_imputed",
    "Accept_tele_imputed",
]

h3_models: Dict[str, Dict[str, object]] = {}

for outcome in h3_outcomes:
    sub_df, base_m, role_m = fit_role_moderation(outcome, context_df)
    h3_models[outcome] = {
        "data": sub_df,
        "baseline": base_m,
        "role_model": role_m,
    }

In the role-moderation model, TENS * C(role_label) automatically expands to three terms:

- main effect of centered TENS,

- main effect of role (therapist vs client),

- interaction TENS × Role.

Covariates are identical across both models, which is important for a clean ANOVA comparison.
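With client as the reference level, the three terms translate into role-specific slopes: the TENS coefficient is the slope for clients, and the TENS coefficient plus the interaction is the slope for therapists. The following is a sketch on simulated data (all names and values are illustrative assumptions) that also derives the standard error of the therapist slope from the coefficient covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

# Simulated stand-ins (not the study data).
x = rng.normal(size=n)                                  # centered SDT score
therapist = rng.integers(0, 2, size=n).astype(float)    # 0 = client, 1 = therapist
y = 3.0 + 0.5 * x + 0.2 * therapist - 0.1 * x * therapist + rng.normal(size=n)

# Columns: intercept, x, therapist, x:therapist
X = np.column_stack([np.ones(n), x, therapist, x * therapist])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)

# Client slope is the x coefficient on its own.
slope_client = beta[1]
se_client = np.sqrt(cov[1, 1])

# Therapist slope adds the interaction; its variance includes the covariance term.
slope_therapist = beta[1] + beta[3]
se_therapist = np.sqrt(cov[1, 1] + cov[3, 3] + 2 * cov[1, 3])

print(f"client slope    = {slope_client:.3f} (SE {se_client:.3f})")
print(f"therapist slope = {slope_therapist:.3f} (SE {se_therapist:.3f})")
```

The interaction coefficient reported in the summary table is the difference between these two slopes.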

## 3.1. Summary Table of SDT × Role Effects Across Outcomes

In [None]:
summary_rows = []

for outcome in h3_outcomes:
    role_model = h3_models[outcome]["role_model"]
    if role_model is None:
        continue

    # Interaction term name: TENS_c:C(role_label)[T.therapist]
    term_name = "TENS_Life_mean_imputed_c:C(role_label)[T.therapist]"
    params = role_model.params
    bse = role_model.bse
    pvals = role_model.pvalues
    conf = role_model.conf_int()

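    if term_name not in params.index:
        # Flag a missing term (e.g., a different reference level) instead of skipping silently
        print(f"{outcome}: interaction term '{term_name}' not found; skipped.")
        continue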

    beta = params[term_name]
    se = bse[term_name]
    p = pvals[term_name]
    ci_low, ci_high = conf.loc[term_name]
    r2 = role_model.rsquared

    summary_rows.append({
        "Outcome": outcome,
        "beta_TENSxRole(therapist_vs_client)": beta,
        "SE": se,
        "p": p,
        "CI_low": ci_low,
        "CI_high": ci_high,
        "R2_role_model": r2,
    })

h3_summary_df = pd.DataFrame(summary_rows)
print("H3: SDT × Role interaction summary (China sample):")
display(h3_summary_df)

We tested whether clinical role (client vs. therapist) moderated the association between self-determination (TENS) and intervention-specific acceptance for AI avatars, AI chatbots, and teletherapy in the Chinese subsample. 

Across all three outcomes, there was no evidence of a statistically significant SDT × role interaction (see Table X). For AI avatar acceptance, the interaction term was very small and non-significant, β = −0.008, SE = 0.043, p = .845, 95% CI [−0.09, 0.08], R² = .365. 

A similar pattern emerged for AI chatbot acceptance, β = −0.016, SE = 0.044, p = .718, 95% CI [−0.10, 0.07], R² = .334, and for teletherapy acceptance, β = −0.007, SE = 0.042, p = .866, 95% CI [−0.09, 0.07], R² = .369. 

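Taken together, these models indicate that the association between self-determination and acceptance is similar in size for clients and therapists, for both AI-based and human-delivered interventions; there is no detectable evidence that role strengthens or weakens the SDT–acceptance link in this sample. A non-significant interaction does not by itself establish that the slopes are equal, but the confidence intervals (all within about ±0.10) bound how large any role difference could plausibly be.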

# 4.0. VIF Check for a Focal H3 Role-Moderation Model

Pick one outcome (e.g., Accept_chatbot_imputed) as the focal model.

In [None]:
focal_outcome = "Accept_chatbot_imputed"
focal_role_model = h3_models[focal_outcome]["role_model"]

if focal_role_model is not None:
    X = focal_role_model.model.exog
    names = focal_role_model.model.exog_names

    vif_rows = []
    for i, name in enumerate(names):
        if name == "Intercept":
            continue
        vif_val = variance_inflation_factor(X, i)
        vif_rows.append({"Predictor": name, "VIF": vif_val})

    vif_h3_df = pd.DataFrame(vif_rows).sort_values("VIF", ascending=False)

    print(f"Variance Inflation Factors – H3 role model for {focal_outcome}:")
    display(vif_h3_df)


All VIFs for the focal model (Accept_chatbot_imputed) are < 3.

SDT and SDT×Role have the highest VIFs (~2.5–3), which is expected in an interaction model because the product term shares variance with its component, and still well below the common rule-of-thumb cutoffs of 5 or 10.

No concerning multicollinearity; SDT and its interaction are distinguishable.
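For reference, the VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others. The sketch below computes it from scratch on simulated data (the `vif` and `design` helpers are illustrative, not the statsmodels implementation) and shows why centering SDT before forming the product term keeps these values low.

```python
import numpy as np


def vif(X, j):
    """VIF of column j: regress it on the other columns plus an intercept."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
    return 1.0 / (1.0 - r2)


rng = np.random.default_rng(3)
n = 500

# Simulated stand-ins (not the study data).
x_raw = rng.normal(loc=4.0, scale=1.0, size=n)
role = rng.integers(0, 2, size=n).astype(float)


def design(x):
    """Predictor columns: x, role, x:role (intercept is added inside vif)."""
    return np.column_stack([x, role, x * role])


for label, x in [("raw", x_raw), ("centered", x_raw - x_raw.mean())]:
    X = design(x)
    print(label, [round(vif(X, j), 2) for j in range(X.shape[1])])
```

With an uncentered predictor the product term is nearly a rescaled copy of the role dummy, which inflates their VIFs; centering removes most of that overlap.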

# 5.0. Residual Diagnostics for the Focal H3 Model

In [None]:
if focal_role_model is not None:
    focal_df = h3_models[focal_outcome]["data"].copy()
    focal_df["fitted_h3"] = focal_role_model.fittedvalues
    focal_df["resid_h3"] = focal_role_model.resid

    fig, axes = plt.subplots(1, 2, figsize=(12, 5))

    # Residuals vs fitted
    sns.scatterplot(
        x="fitted_h3",
        y="resid_h3",
        data=focal_df,
        ax=axes[0],
        alpha=0.6
    )
    axes[0].axhline(0, linestyle="--", linewidth=1)
    axes[0].set_xlabel("Fitted values")
    axes[0].set_ylabel("Residuals")
    axes[0].set_title(f"H3 {focal_outcome}: Residuals vs Fitted")

    # Residual distribution
    sns.histplot(focal_df["resid_h3"], kde=True, ax=axes[1])
    axes[1].set_xlabel("Residual")
    axes[1].set_title(f"H3 {focal_outcome}: Residual Distribution")

    plt.tight_layout()
    plt.show()


# 6.0. Plot-Ready Predictions: SDT × Role (China)
- This builds a grid over SDT (TENS) and generates predicted acceptance for each role_label level (client, therapist), holding the other covariates at typical values, so you can visualize how the SDT slope changes by role.

In [None]:
if focal_role_model is not None:
    focal_df = h3_models[focal_outcome]["data"]

    tens_min = focal_df["TENS_Life_mean_imputed_c"].quantile(0.05)
    tens_max = focal_df["TENS_Life_mean_imputed_c"].quantile(0.95)
    tens_grid = np.linspace(tens_min, tens_max, 50)

    # Typical covariate profile
    gender_ref = focal_df["gender"].mode()[0]
    age_ref = 0.0  # centered
    phq_mean = focal_df["PHQ5_mean_imputed"].mean()
    ssrph_mean = focal_df["SSRPH_mean_imputed"].mean()
    gaa_mean = focal_df["GAAIS_mean_imputed"].mean()
    et_mean = focal_df["ET_mean_imputed"].mean()

    role_levels = ["client", "therapist"]

    pred_rows = []
    for role in role_levels:
        for t_val in tens_grid:
            pred_rows.append({
                "TENS_Life_mean_imputed_c": t_val,
                "age_imputed_c": age_ref,
                "gender": gender_ref,
                "role_label": role,
                "PHQ5_mean_imputed": phq_mean,
                "SSRPH_mean_imputed": ssrph_mean,
                "GAAIS_mean_imputed": gaa_mean,
                "ET_mean_imputed": et_mean,
            })

    pred_df = pd.DataFrame(pred_rows)
    pred_df["pred_accept"] = focal_role_model.predict(pred_df)

    # Use the China sample mean for raw reconstruction
    tens_raw_mean = context_df["TENS_Life_mean_imputed"].mean()
    pred_df["TENS_Life_raw"] = (
        pred_df["TENS_Life_mean_imputed_c"] + tens_raw_mean
    )

    plt.figure(figsize=(8, 6))
    sns.lineplot(
        data=pred_df,
        x="TENS_Life_raw",
        y="pred_accept",
        hue="role_label"
    )
    plt.xlabel("Self-Determination (TENS, raw scale)")
    plt.ylabel(f"Predicted {focal_outcome}")
    plt.title(
        f"H3: Predicted {focal_outcome} across SDT\n"
        "for clients vs therapists (China)"
    )
    plt.tight_layout()
    plt.show()