## Fitting the model

To get the idea of the distribution (for the model):
- Very technical approach: Run random effects model > generate residuals > check distribution of the residuals > choose the model accordingly
- Empirical approach: run the model based on either gaussian distribution (identity link function) or gamma distribution (log link function) and compare which of them fits better

Steps for running stats using GLMM:
- Get the fixed effects table
- Get estimated means of conditions and difference between the means
- Run t-test based on the estmated means

What I want to compare?

1. BL_plan vs MAIN_plan_baseline vs MAIN_plan_adaptation
2. BL_go vs MAIN_go_baseline vs MAIN_go_adaptation
3. MAIN_plan_baseline vs MAIN_plan_adaptation
4. MAIN_go_baseline vs MAIN_go_adaptation


# IMPORTANT NOTES ON CODE

Here’s a compact, step-by-step way to read what you’ve got and what it means.

# 1) What the MixedLM table tells you

* **Model**: `pac_value ~ condition` with a **random intercept per subject** (`groups=sub`).
* **Coding**: The intercept is the mean of the reference level of `condition` (likely `_BL_N/A`). The two listed coefficients are **differences** from that reference.
* **Coefficients**

  * `Intercept`: mean pac_value at `_BL_N/A`.
  * `condition[T._MAIN__adaptation]`: mean difference vs `_BL_N/A` (negative here).
  * `condition[T._MAIN__baseline]`: mean difference vs `_BL_N/A` (positive here).
* **z and p**: Wald tests for whether each coefficient differs from 0. All three fixed-effect lines are highly significant (tiny p’s).
* **CIs**: All tight around the estimates—consistent with strong effects.
* **Random Effect Variance (Group Var ≈ 3.68e-10)**: Between-subject variability (after accounting for `condition`) is extremely small relative to your outcome scale. That can be real or just a scaling/rounding artifact, but it means subjects don’t differ much in baseline levels in this model.

# 2) What the EMM pairwise table adds

You computed **estimated marginal means (EMMs)** and **pairwise contrasts** with **Tukey adjustment** (good for all pairwise comparisons among groups).

Each row compares two conditions:

* **estimate_link**: the **difference in means** on the model’s link scale. Here the link is identity (linear mixed model), so this is a **raw difference in pac_value**.
* **SE**: standard error of that difference.
* **t.ratio**: test statistic for the difference.
* **p.value.adj**: Tukey-adjusted p-value (controls family-wise error across all pairs).

Your results:

* `_BL_N/A – _MAIN__baseline = −0.000149`, **adj p < 1e-6** → `_MAIN__baseline` is **higher** than `_BL_N/A` by ~1.49×10⁻⁴.
* `_BL_N/A – _MAIN__adaptation = +0.000020`, **adj p = 0.0018** → `_BL_N/A` is **slightly higher** than `_MAIN__adaptation` by ~2×10⁻⁵.
* `_MAIN__baseline – _MAIN__adaptation = +0.000170`, **adj p < 1e-6** → `_MAIN__baseline` is **higher** than `_MAIN__adaptation` by ~1.70×10⁻⁴.

So the **ordering of means** is:

```
MAIN__baseline  >  BL_N/A  >  MAIN__adaptation
```

and **all three pairwise differences are statistically significant** after Tukey correction.

# 3) Are these differences big or small?

You said earlier pac_value has **mean ≈ 1×10⁻⁴** and **SD ≈ 5×10⁻⁵**. Using that scale:

* The largest contrast (baseline vs adaptation) ≈ **1.7×10⁻⁴**, which is about **3.4 SDs** (1.7e-4 / 5e-5). That’s **very large** relative to your outcome variability.
* The small contrast (BL_N/A vs adaptation) ≈ **2×10⁻⁵**, ~**0.4 SD**, still meaningful and significant.

(If those summary stats came from a subset, re-check with your actual sample SDs per condition.)

# 4) How to report this

* **Model**: “We fitted a linear mixed-effects model with condition as a fixed effect and random intercepts for subjects.”
* **Main effect**: “Condition had a significant effect on pac_value (all pairwise Tukey-adjusted p’s ≤ 0.0018).”
* **EMM contrasts** (report the three differences with CIs if you also computed EMM CIs):

  * `MAIN__baseline` vs `MAIN__adaptation`: Δ = 0.000170 (Tukey-adj p < 0.001)
  * `MAIN__baseline` vs `BL_N/A`: Δ = 0.000149 (Tukey-adj p < 0.001)
  * `BL_N/A` vs `MAIN__adaptation`: Δ = 0.000020 (Tukey-adj p = 0.0018)
* **Direction/ordering**: “pac_value is highest in MAIN__baseline, intermediate at BL_N/A, lowest in MAIN__adaptation.”
* **Random effects**: “Between-subject variance was near zero on this scale, suggesting limited subject-level heterogeneity after accounting for condition.”

# 5) Practical tips


* **Rescale for readability**: Because values are ~1e-4, consider reporting in **micro-units** (e.g., multiply by 10⁶). Then the big contrast becomes **+170 μ-units** with the same statistics. This improves interpretability without changing inference.
* **Check model diagnostics**: residual plots, influential points, and whether random-effect variance being ~0 is reasonable for your data.
* **Include EMMs**: It’s useful to also print the per-condition EMMs (not just contrasts) with 95% CIs, so readers see the absolute levels.


In [7]:
import statsmodels.formula.api as smf
import pandas as pd
import os
# Load the dataframe
group = 'Y'
roi_dir = f'D:\\BonoKat\\research project\\# study 1\\eeg_data\\set\\{group} group\\roi_source_analysis'
df_pac_roi = pd.read_csv(os.path.join(roi_dir, f'PAC_MI_SOURCE_{group}_ROI.csv'))
df_pac_roi['block'] = df_pac_roi['block'].fillna('N/A')
df_pac_roi["condition"] = df_pac_roi["task"] + "_" + df_pac_roi["block"]
roi = 'G_precentral-lh'  # Example ROI name
df_roi = df_pac_roi.query("roi == @roi")

df_plan = df_roi.query("task_stage == '_plan'") # for model_1
df_go = df_roi.query("task_stage == '_go'") # for model_2
df_plan_main = df_plan.query("task == '_MAIN'") # for model_3
df_go_main = df_go.query("task == '_MAIN'") # for model_4

In [16]:
import statsmodels.api as sm
import statsmodels.formula.api as smf
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning

# Suppress convergence warnings for mixed models with zero random effects
# (These warnings indicate minimal between-subject variance, which is OK)
warnings.filterwarnings('ignore', category=ConvergenceWarning)

# Model templates
print(f"\nROI: {roi}")

# 1 - BL_plan vs MAIN_plan_baseline vs MAIN_plan_adaptation
model_1 = smf.mixedlm(
    "pac_value ~ condition",
    data=df_plan,
    groups=df_plan["sub"]
).fit(reml=True)
print("\n" + "="*70)
print("MODEL 1: Planning Stage (BL vs MAIN baseline vs MAIN adaptation)")
print("="*70)
print(model_1.summary().as_text())
print(f"Random Effect Variance: {model_1.cov_re.iloc[0, 0]:.2e}")

# 2 - BL_go vs MAIN_go_baseline vs MAIN_go_adaptation
model_2 = smf.mixedlm(
    "pac_value ~ condition",
    data=df_go,
    groups=df_go["sub"]
).fit(reml=True)
print("\n" + "="*70)
print("MODEL 2: Go Stage (BL vs MAIN baseline vs MAIN adaptation)")
print("="*70)
print(model_2.summary().as_text())
print(f"Random Effect Variance: {model_2.cov_re.iloc[0, 0]:.2e}")

# 3 - MAIN_plan_baseline vs MAIN_plan_adaptation
model_3 = smf.mixedlm(
    "pac_value ~ block",
    data=df_plan_main,
    groups=df_plan_main["sub"]
).fit(reml=True)
print("\n" + "="*70)
print("MODEL 3: Planning Stage MAIN only (baseline vs adaptation)")
print("="*70)
print(model_3.summary().as_text())
print(f"Random Effect Variance: {model_3.cov_re.iloc[0, 0]:.2e}")

# 4 - MAIN_go_baseline vs MAIN_go_adaptation
model_4 = smf.mixedlm(
    "pac_value ~ block",
    data=df_go_main,
    groups=df_go_main["sub"]
).fit(reml=True)
print("\n" + "="*70)
print("MODEL 4: Go Stage MAIN only (baseline vs adaptation)")
print("="*70)
print(model_4.summary().as_text())
print(f"Random Effect Variance: {model_4.cov_re.iloc[0, 0]:.2e}")


ROI: G_precentral-lh

MODEL 1: Planning Stage (BL vs MAIN baseline vs MAIN adaptation)
                  Mixed Linear Model Regression Results
Model:                  MixedLM       Dependent Variable:       pac_value
No. Observations:       69            Method:                   REML     
No. Groups:             23            Scale:                    0.0000   
Min. group size:        3             Log-Likelihood:           598.5126 
Max. group size:        3             Converged:                Yes      
Mean group size:        3.0                                              
-------------------------------------------------------------------------
                               Coef.  Std.Err.   z    P>|z| [0.025 0.975]
-------------------------------------------------------------------------
Intercept                       0.000    0.000 22.546 0.000  0.000  0.000
condition[T._MAIN__adaptation] -0.000    0.000 -3.250 0.001 -0.000 -0.000
condition[T._MAIN__baseline]    0.000    0

### Diagnostic: Should We Use OLS Instead?

If random effect variances are near zero (< 1e-10), consider using regular OLS models with clustered standard errors.

In [17]:
# Check if we should use OLS models instead
models = {
    'Model 1 (Planning)': model_1,
    'Model 2 (Go)': model_2,
    'Model 3 (MAIN Planning)': model_3,
    'Model 4 (MAIN Go)': model_4
}

print("RANDOM EFFECT VARIANCE DIAGNOSTICS")
print("="*70)
threshold = 1e-8
use_ols_recommendation = []

for name, model in models.items():
    var_re = model.cov_re.iloc[0, 0]
    print(f"{name:25s} | Random Var: {var_re:.2e} | ", end="")
    
    if var_re < threshold:
        print("⚠️  Near zero - consider OLS")
        use_ols_recommendation.append(name)
    else:
        print("✓ Mixed model appropriate")

if use_ols_recommendation:
    print("\n" + "="*70)
    print("RECOMMENDATION: Consider using OLS for models with near-zero random effects.")
    print("This would avoid convergence warnings and provide more interpretable results.")
    print("\nAlternatively, the current mixed models are still valid - the warnings")
    print("simply indicate that subject-level random intercepts add little variance.")
else:
    print("\n✓ All models have meaningful random effects - mixed models appropriate.")

RANDOM EFFECT VARIANCE DIAGNOSTICS
Model 1 (Planning)        | Random Var: 3.68e-10 | ⚠️  Near zero - consider OLS
Model 2 (Go)              | Random Var: 1.45e-10 | ⚠️  Near zero - consider OLS
Model 3 (MAIN Planning)   | Random Var: 6.02e-10 | ⚠️  Near zero - consider OLS
Model 4 (MAIN Go)         | Random Var: 8.69e-11 | ⚠️  Near zero - consider OLS

RECOMMENDATION: Consider using OLS for models with near-zero random effects.

simply indicate that subject-level random intercepts add little variance.


## Model Fitting

The convergence warnings below indicate that random effect variances are estimated at or near zero. This suggests minimal between-subject variability, and we may consider:
1. Using regular OLS models instead
2. Suppressing the warnings if mixed models are theoretically justified
3. Adding diagnostic output to check variance components

## Estimated Marginal Means (EMMs) Implementation

This section implements a comprehensive `emmeans` function for post-hoc testing of mixed linear models.

In [18]:
# Core imports for emmeans
import numpy as np
import pandas as pd
from itertools import product, combinations
from typing import Dict, List, Optional, Union, Tuple
import patsy
from statsmodels.stats.multitest import multipletests
from scipy.stats import norm, t as tdist

# Try to import psturng for proper Tukey adjustment
try:
    from statsmodels.stats.libqsturng import psturng
    _HAVE_PSTURNG = True
except ImportError:
    _HAVE_PSTURNG = False
    
print("EMM dependencies imported successfully")

EMM dependencies imported successfully


In [19]:
"""
Implementation of estimated marginal means (EMMs) computation and contrasts,
adapted from the provided CODEBASE. This module defines the `emmeans` function
along with its supporting utilities. The code is lifted directly from the
original snippet to facilitate testing and debugging in a standalone module.

Note: This file was automatically generated during validation of the
emmeans function. Any changes or bug fixes should be documented in the
appropriate section of the validation report.
"""

from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Union, List, Optional, Dict, Tuple

import numpy as np
import pandas as pd
import patsy
from itertools import product, combinations
from scipy.stats import norm, t as tdist
from statsmodels.stats.multitest import multipletests
import statsmodels.api as sm
import statsmodels.formula.api as smf  # noqa: F401  (kept for user parity)
import re

try:
    # statsmodels >= 0.14
    from statsmodels.stats.libqsturng import psturng
    _HAVE_PSTURNG = True
except Exception:
    _HAVE_PSTURNG = False

from pandas.api.types import CategoricalDtype


# ----------------------------- Utilities -----------------------------

def _groupby(df: pd.DataFrame, by_cols: List[str], **kwargs):
    """
    Wrapper that (a) avoids passing duplicate 'dropna' and
    (b) gracefully supports pandas < 1.1 where 'dropna' is unsupported.
    """
    dropna_kw = kwargs.pop("dropna", None)
    try:
        if dropna_kw is not None:
            return df.groupby(by_cols, dropna=dropna_kw, **kwargs)
        else:
            # default to dropna=False for stable marginalization behavior
            return df.groupby(by_cols, dropna=False, **kwargs)
    except TypeError:
        # pandas < 1.1: retry without 'dropna'
        return df.groupby(by_cols, **kwargs)


def _base_name(patsy_name: str) -> str:
    n = patsy_name.strip()
    if n.startswith("C(") and n.endswith(")"):
        return n[2:-1].strip()
    return n


def _extract_dep_vars(expr: str, data_cols: pd.Index) -> List[str]:
    """
    Heuristic: pull candidate symbols from an EvalFactor's name,
    keep only those that are actual columns in `data`.
    """
    tokens = set(re.findall(r"[A-Za-z_]\w*", expr))
    ignore = {"C", "I", "np", "pd", "math", "sin", "cos", "tan", "exp", "log", "sqrt"}
    return [t for t in tokens if (t not in ignore and t in data_cols)]


def _get_design_info_or_raise(result):
    try:
        return result.model.data.design_info
    except AttributeError:
        raise ValueError("Model must be fitted with a Patsy formula to have design_info.")


def _get_model_params_and_vcov(result, vcov: Optional[np.ndarray]) -> Tuple[np.ndarray, List[str], np.ndarray]:
    """
    Handles MixedLM vs others and aligns covariance to fixed-effect parameter order.
    """
    if hasattr(result, "fe_params"):  # MixedLM
        beta = result.fe_params.values
        param_names = list(result.fe_params.index)
        if vcov is not None:
            V = np.asarray(vcov)
        else:
            V_full = result.cov_params()
            if hasattr(V_full, "loc"):  # DataFrame
                V = V_full.loc[param_names, param_names].to_numpy()
            else:
                # ndarray fallback: assume order matches params; slice by position
                all_names = list(getattr(result, "params").index)
                idx = [all_names.index(n) for n in param_names]
                V = np.asarray(V_full)[np.ix_(idx, idx)]
    else:
        beta = result.params.values
        param_names = list(result.params.index)
        V = np.asarray(vcov) if vcov is not None else np.asarray(result.cov_params())
    return beta, param_names, V


def _get_link_and_deriv(result) -> Tuple[Callable[[np.ndarray], np.ndarray], Callable[[np.ndarray], np.ndarray]]:
    """
    Returns (invlink, deriv) where 'deriv' is derivative of inverse link wrt eta.
    Falls back to identity if not a GLM.
    """
    try:
        link = result.model.family.link
        invlink = link.inverse
        def deriv(eta):
            if hasattr(link, "inverse_deriv"):
                return link.inverse_deriv(eta)
            # symmetric numerical derivative, stable enough for scalar/vector
            eps = 1e-6
            return (invlink(eta + eps) - invlink(eta - eps)) / (2 * eps)
    except AttributeError:
        invlink = lambda x: x
        deriv = lambda x: np.ones_like(np.asarray(x), dtype=float)
    return invlink, deriv


@dataclass
class FactorSpec:
    kind: str                           # "cat" or "num"
    levels: Optional[List] = None       # for categorical
    depends_on: Optional[List[str]] = None  # raw variable deps for transformed factors


@dataclass
class FactorExtraction:
    factors: Dict[str, FactorSpec]
    name_map: Dict[str, str]
    deps_map: Dict[str, List[str]]


def _extract_factors(di, data: pd.DataFrame) -> FactorExtraction:
    factors: Dict[str, FactorSpec] = {}
    name_map: Dict[str, str] = {}
    deps_map: Dict[str, List[str]] = {}

    for fac, info in di.factor_infos.items():
        raw_name = fac.name() if hasattr(fac, "name") else str(fac)
        base = _base_name(raw_name)
        name_map[base] = raw_name

        state = getattr(info, "state", {}) or {}
        levels = [lv for lv in list(state.get("levels", [])) if pd.notna(lv)]

        if levels:
            # Patsy categorical with known levels
            factors[base] = FactorSpec(kind="cat", levels=levels, depends_on=[])
            continue

        if base in data.columns:
            s = data[base]
            if isinstance(s.dtype, CategoricalDtype):
                factors[base] = FactorSpec(kind="cat", levels=list(s.cat.categories), depends_on=[])
            elif not pd.api.types.is_numeric_dtype(s):
                factors[base] = FactorSpec(kind="cat", levels=list(pd.unique(s.dropna())), depends_on=[])
            else:
                factors[base] = FactorSpec(kind="num", levels=None, depends_on=[])
        else:
            # Transformed / EvalFactor (e.g., I(x**2), np.log(x), etc.)
            deps = _extract_dep_vars(raw_name, data.columns)
            factors[base] = FactorSpec(kind="num", levels=None, depends_on=deps)
            deps_map[base] = deps

    return FactorExtraction(factors=factors, name_map=name_map, deps_map=deps_map)


def _validate_at_levels(at: Dict[str, Union[float, str, List[Union[float, str]]]],
                        factors: Dict[str, FactorSpec]):
    for k, v in (at or {}).items():
        spec = factors[k]
        if spec.kind == "cat":
            vals = list(v) if isinstance(v, (list, tuple, np.ndarray, pd.Series)) else [v]
            bad = [x for x in vals if x not in (spec.levels or [])]
            if bad:
                raise ValueError(
                    f"at['{k}'] contains levels not in the model: {bad}. "
                    f"Allowed: {spec.levels}"
                )


def _determine_used_index_and_weights(result, data: pd.DataFrame) -> Tuple[Optional[pd.Index], Optional[pd.Series]]:
    """
    Attempts to recover the actual analysis sample index and (if present) model weights.
    """
    used_idx = None
    weights = None
    try:
        used_idx = getattr(result.model.data, "row_labels", None)
        if used_idx is None and len(getattr(result.model, "endog", [])) == len(data):
            used_idx = data.index
        if used_idx is not None:
            used_idx = pd.Index(used_idx).intersection(data.index)
    except Exception:
        used_idx = None

    try:
        weights = getattr(result.model, "weights", None)
        if weights is not None and used_idx is not None:
            weights = pd.Series(np.asarray(weights).reshape(-1), index=used_idx)
    except Exception:
        weights = None

    return used_idx, weights


def _build_reference_grid(
    data: pd.DataFrame,
    factors: Dict[str, FactorSpec],
    deps_map: Dict[str, List[str]],
    focal_set: set,
    at: Dict[str, Union[float, str, List[Union[float, str]]]],
    di,
    param_names: List[str],
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    # Levels per factor (respect `at` first)
    grid_levels: Dict[str, List] = {}
    for name, meta in factors.items():
        if name in at:
            val = at[name]
            grid_levels[name] = list(val) if isinstance(val, (list, tuple, np.ndarray, pd.Series)) else [val]
        elif name in focal_set:
            if meta.kind == "cat":
                grid_levels[name] = meta.levels or []
            else:
                raise ValueError(
                    f"Numeric focal variable '{name}' requires explicit levels via at={{'{name}':[...]}}"
                )
        else:
            if meta.kind == "cat":
                grid_levels[name] = meta.levels or []
            else:
                if name in data.columns:
                    grid_levels[name] = [float(pd.to_numeric(data[name], errors="coerce").mean())]
                else:
                    # transformed numeric factor: inject raw dependencies later
                    pass

    if deps_map:
        needed = set(v for deps in deps_map.values() for v in deps)
        missing = [v for v in needed if v not in grid_levels]
        for v in missing:
            if v not in data.columns:
                continue
            s = data[v]
            if isinstance(s.dtype, CategoricalDtype):
                grid_levels[v] = list(s.cat.categories)
            elif not pd.api.types.is_numeric_dtype(s):
                grid_levels[v] = list(pd.unique(s.dropna()))
            else:
                grid_levels[v] = [float(pd.to_numeric(s, errors="coerce").mean())]

    # Cartesian grid
    keys = list(grid_levels.keys())
    values = [grid_levels[k] for k in keys]
    grid_raw = pd.DataFrame([dict(zip(keys, tup)) for tup in product(*values)])

    # Design matrix aligned to param order
    X_grid = patsy.build_design_matrices([di], grid_raw, return_type="dataframe")[0]
    Xg = X_grid.reindex(columns=param_names, fill_value=0.0)
    return grid_raw, Xg


def _apply_marginal_weights(
    grid_raw: pd.DataFrame,
    data: pd.DataFrame,
    used_idx: Optional[pd.Index],
    weight: str,
    focal_set: set,
    at: Dict[str, Union[float, str, List[Union[float, str]]]],
    factors: Dict[str, FactorSpec],
    drop_unseen: bool
) -> pd.DataFrame:
    grid_raw = grid_raw.copy()
    grid_raw["_weight"] = 1.0

    if weight == "proportional":
        nonfocal_cats = [
            nm for nm, m in factors.items()
            if (nm not in focal_set) and (nm not in at) and (m.kind == "cat")
        ]
        if nonfocal_cats:
            source_df = data.loc[used_idx] if used_idx is not None else data
            # Avoid pandas warning by passing a single column name instead of a list
            grouper = nonfocal_cats[0] if len(nonfocal_cats) == 1 else nonfocal_cats
            grp = _groupby(source_df[nonfocal_cats], grouper, dropna=False).size()
            total = grp.sum()
            joint_props = (grp / total).to_dict()

            def _row_prop(row):
                key = tuple(row[c] for c in nonfocal_cats)
                return float(joint_props.get(key, 0.0))

            grid_raw["_weight"] = grid_raw.apply(
                (lambda r: _row_prop(r) if nonfocal_cats else 1.0), axis=1
            )

            if drop_unseen:
                grid_raw = grid_raw.loc[grid_raw["_weight"] > 0].copy()

            if np.allclose(grid_raw["_weight"].values.sum(), 0.0):
                grid_raw["_weight"] = 1.0

    return grid_raw


def _normalize_weights_by_cell(grid_raw: pd.DataFrame, cell_keys: List[str]) -> pd.DataFrame:
    grid_raw = grid_raw.copy()
    if len(cell_keys) > 0:
        sums = _groupby(grid_raw, cell_keys, sort=False)["_weight"].transform("sum")
        with np.errstate(divide="ignore", invalid="ignore"):
            grid_raw["_weight_norm_by_cell"] = np.where(sums > 0, grid_raw["_weight"] / sums, 0.0)
    else:
        grid_raw["_weight_norm_by_cell"] = 1.0
    return grid_raw


def _label_for_spec_tuple(specs: List[str], tup: Tuple) -> str:
    if len(specs) == 1:
        return str(tup[0])
    parts = [f"{nm}={val}" for nm, val in zip(specs, tup)]
    return " × ".join(parts)


# ---------------------- Core computational blocks ----------------------

def _compute_emms_per_cell(
    grid_raw: pd.DataFrame,
    Xg: pd.DataFrame,
    beta: np.ndarray,
    V: np.ndarray,
    invlink: Callable,
    deriv: Callable,
    transform: str,
    level: float,
    df_global: float,
    specs: List[str],
    by: List[str],
) -> Tuple[pd.DataFrame, Dict[Tuple, List[Tuple[Tuple, np.ndarray, float, float]]]]:
    cell_keys = specs + by
    # Avoid pandas FutureWarning by not passing a list of length 1 as the grouper
    if cell_keys:
        grouper = cell_keys[0] if len(cell_keys) == 1 else cell_keys
        grouped = _groupby(grid_raw, grouper, sort=False)
    else:
        grouped = [((), grid_raw)]

    emm_rows = []
    by_slices: Dict[Tuple, List[Tuple[Tuple, np.ndarray, float, float]]] = {}

    for key_vals, subgrid in grouped:
        if subgrid.shape[0] == 0:
            continue
        w = subgrid["_weight"].values.astype(float)
        if w.size == 0:
            continue
        total_w = w.sum()
        w = (w / total_w) if total_w > 0 else np.full_like(w, 1.0 / float(w.size))

        L = (Xg.loc[subgrid.index].values.T @ w).reshape(1, -1)

        eta = float((L @ beta).item())
        var_eta = float((L @ V @ L.T).item())
        se_eta = float(np.sqrt(max(var_eta, 0.0)))

        if transform == "response":
            mu = float(invlink(eta))
            se_mu = float(abs(deriv(eta)) * se_eta)
        elif transform == "link":
            mu, se_mu = eta, se_eta
        else:
            raise ValueError("transform must be 'response' or 'link'")

        df_use = df_global
        crit = norm.ppf(0.5 + level / 2) if np.isinf(df_use) else tdist.ppf(0.5 + level / 2, df_use)

        lo_eta, hi_eta = eta - crit * se_eta, eta + crit * se_eta
        if transform == "response":
            lo, hi = float(invlink(lo_eta)), float(invlink(hi_eta))
        else:
            lo, hi = float(lo_eta), float(hi_eta)

        row = {}
        if cell_keys:
            if not isinstance(key_vals, tuple):
                key_vals = (key_vals,)
            for k, v in zip(cell_keys, key_vals):
                row[k] = v

        row.update({
            "emmean": mu,
            "SE": se_mu,
            "df": df_use,
            "lower.CL": lo,
            "upper.CL": hi
        })
        emm_rows.append(row)

        if by:
            by_vals = tuple(row[bk] for bk in by)
            spec_vals = tuple(row[sk] for sk in specs)
        else:
            by_vals = ()
            spec_vals = tuple(row[sk] for sk in specs) if specs else ("grand",)

        by_slices.setdefault(by_vals, []).append((spec_vals, L, eta, se_eta))

    return pd.DataFrame(emm_rows), by_slices


def _pairwise_contrasts_for_slice(
    entries: List[Tuple[Tuple, np.ndarray, float, float]],
    specs: List[str],
    by_vals: Tuple,
    beta: np.ndarray,
    V: np.ndarray,
    invlink: Callable,
    deriv: Callable,
    contrast_transform: str,
    df_global: float,
    df_provider: Optional[Callable[[np.ndarray], float]],
    by: List[str],
) -> List[Dict]:
    rows = []
    spec_keys = [e[0] for e in entries]
    L_rows = [e[1] for e in entries]
    k_groups = len(entries)
    if k_groups < 2:
        return rows

    for (i, j) in combinations(range(k_groups), 2):
        name_i = _label_for_spec_tuple(specs, spec_keys[i])
        name_j = _label_for_spec_tuple(specs, spec_keys[j])
        level_i_raw = spec_keys[i][0] if len(specs) == 1 else spec_keys[i]
        level_j_raw = spec_keys[j][0] if len(specs) == 1 else spec_keys[j]
        Li = L_rows[i]
        Lj = L_rows[j]
        Lc = Li - Lj

        # Link-scale
        delta_link = float((Lc @ beta).item())
        var_link   = float((Lc @ V @ Lc.T).item())
        se_link    = float(np.sqrt(max(var_link, 0.0)))

        # Response-scale (delta method) if requested
        if contrast_transform == "response":
            eta_i = float((Li @ beta).item())
            eta_j = float((Lj @ beta).item())
            mu_i  = float(invlink(eta_i))
            mu_j  = float(invlink(eta_j))
            delta_resp = mu_i - mu_j

            dgi = float(np.squeeze(deriv(eta_i)))
            dgj = float(np.squeeze(deriv(eta_j)))
            var_eta_i = float((Li @ V @ Li.T).item())
            var_eta_j = float((Lj @ V @ Lj.T).item())
            cov_ij    = float((Li @ V @ Lj.T).item())
            var_resp  = max(dgi*dgi*var_eta_i + dgj*dgj*var_eta_j - 2*dgi*dgj*cov_ij, 0.0)
            se_resp   = float(np.sqrt(var_resp))

            delta_out = delta_resp
            se_out    = se_resp
        else:
            delta_out = delta_link
            se_out    = se_link

        if df_provider is not None:
            try:
                df_c = float(df_provider(Lc))
            except Exception:
                df_c = df_global
            if not np.isfinite(df_c):
                df_c = df_global
        else:
            df_c = df_global

        stat = delta_out / se_out if se_out > 0 else np.nan
        if np.isinf(df_c):
            p_raw = 2 * (1 - norm.cdf(abs(stat))) if np.isfinite(stat) else np.nan
        else:
            p_raw = 2 * (1 - tdist.cdf(abs(stat), df_c)) if np.isfinite(stat) else np.nan

        row = {
            "contrast": f"{name_i} - {name_j}",
            ("estimate_link" if contrast_transform == "link" else "estimate"): delta_out,
            "SE": se_out,
            "df": df_c,
            "stat": stat,
            "p.value": p_raw,
            "__delta_link": delta_link,
            "__se_link": se_link,
            "__i": i,
            "__j": j,
            "__name_i": name_i,
            "__name_j": name_j,
            "__level_i": level_i_raw,
            "__level_j": level_j_raw,
        }
        for idx, bname in enumerate(by):
            row[bname] = by_vals[idx]
        rows.append(row)

    return rows


def _custom_contrasts_for_slice(
    entries: List[Tuple[Tuple, np.ndarray, float, float]],
    contrasts: List[Tuple[str, np.ndarray]],
    specs: List[str],
    by_vals: Tuple,
    beta: np.ndarray,
    V: np.ndarray,
    invlink: Callable,
    deriv: Callable,
    contrast_transform: str,
    df_global: float,
    df_provider: Optional[Callable[[np.ndarray], float]],
    by: List[str],
) -> List[Dict]:
    rows = []
    L_rows = [e[1] for e in entries]
    k_groups = len(entries)

    # Precompute for response-scale delta method
    etas = [float((L_rows[g] @ beta).item()) for g in range(k_groups)]
    mus  = [float(invlink(etas[g])) for g in range(k_groups)]
    dgs  = [float(np.squeeze(deriv(etas[g]))) for g in range(k_groups)]
    cov_eta = np.empty((k_groups, k_groups), dtype=float)
    for g in range(k_groups):
        for h in range(k_groups):
            cov_eta[g, h] = float((L_rows[g] @ V @ L_rows[h].T).item())

    for label, wvec in contrasts:
        w = np.asarray(wvec, dtype=float).reshape(-1)
        if w.shape[0] != k_groups:
            raise ValueError(
                f"Custom contrast '{label}' length {w.shape[0]} != number of groups {k_groups} "
                f"in BY-slice {by_vals}."
            )

        # Link-scale linear combination
        Lc = sum(w[g] * L_rows[g] for g in range(k_groups))
        delta_link = float((Lc @ beta).item())
        var_link   = float((Lc @ V @ Lc.T).item())
        se_link    = float(np.sqrt(max(var_link, 0.0)))

        if contrast_transform == "response":
            delta_resp = float(np.dot(w, mus))
            Gw = np.outer(w, w) * np.outer(dgs, dgs)
            var_resp = float(np.sum(Gw * cov_eta))
            se_resp  = float(np.sqrt(max(var_resp, 0.0)))
            delta_out, se_out = delta_resp, se_resp
            est_key = "estimate"
        else:
            delta_out, se_out = delta_link, se_link
            est_key = "estimate_link"

        if df_provider is not None:
            try:
                df_c = float(df_provider(Lc))
            except Exception:
                df_c = df_global
            if not np.isfinite(df_c):
                df_c = df_global
        else:
            df_c = df_global

        stat = delta_out / se_out if se_out > 0 else np.nan
        if np.isinf(df_c):
            p_raw = 2 * (1 - norm.cdf(abs(stat))) if np.isfinite(stat) else np.nan
        else:
            p_raw = 2 * (1 - tdist.cdf(abs(stat), df_c)) if np.isfinite(stat) else np.nan

        row = {
            "contrast": str(label),
            est_key: delta_out,
            "SE": se_out,
            "df": df_c,
            "stat": stat,
            "p.value": p_raw
        }
        for idx, bname in enumerate(by):
            row[bname] = by_vals[idx]
        rows.append(row)

    return rows


def _adjust_pvalues_per_slice(
    contrasts_df: pd.DataFrame,
    method: str,
    result,
    data: pd.DataFrame,
    specs: List[str],
    by: List[str],
    used_idx: Optional[pd.Index],
    model_weights: Optional[pd.Series],
) -> pd.Series:
    """
    Returns a Series aligned to contrasts_df.index with adjusted p-values.
    Implements strict Tukey/Tukey–Kramer when valid; otherwise falls back to
    Holm/Bonferroni/Sidak/FDR as requested (with 'tukey' aliasing to Holm).
    """
    if by:
        # Avoid passing a single-element list as the grouper to avoid pandas warnings
        grouper = by[0] if isinstance(by, list) and len(by) == 1 else by
        slice_iter = contrasts_df.groupby(grouper, dropna=False, sort=False)
    else:
        slice_iter = [((), contrasts_df)]

    # conditions for exact Tukey:
    is_ols = isinstance(getattr(result, "model", None), sm.OLS)
    strict_oneway = is_ols and (method == "tukey") and (len(specs) == 1)
    no_weights = (model_weights is None)

    use_real_tukey_global = (method == "tukey" and _HAVE_PSTURNG and strict_oneway and no_weights)

    p_adj_chunks = []
    for by_vals, subdf in slice_iter:
        if use_real_tukey_global:
            factor = specs[0]
            if used_idx is None:
                p_adj_chunks.append(_holm_chunk(subdf))
                continue

            # filter analysis rows by BY levels if any
            if by:
                mask = pd.Series(True, index=used_idx)
                for bname, bval in zip(by, by_vals if isinstance(by_vals, tuple) else (by_vals,)):
                    mask &= (data.loc[used_idx, bname] == bval)
                used_in_slice = used_idx[mask]
            else:
                used_in_slice = used_idx

            data_slice = data.loc[used_in_slice]

            if factor not in data_slice.columns:
                p_adj_chunks.append(_holm_chunk(subdf))
                continue

            # group sizes per level
            # Use single-column grouper when the factor list has length 1 to avoid pandas warnings
            level_counts = (
                _groupby(data_slice[[factor]], factor, dropna=False)
                .size().astype(float)
            )

            try:
                mse = float(result.mse_resid)
                df_tukey = int(result.df_resid)
            except Exception:
                p_adj_chunks.append(_holm_chunk(subdf))
                continue

            # Tukey–Kramer q statistics using LINK-scale deltas
            q_vals = []
            for _, r in subdf.iterrows():
                key_i = r["__level_i"] if "__level_i" in r else r["__name_i"]
                key_j = r["__level_j"] if "__level_j" in r else r["__name_j"]

                ni = float(level_counts.get(key_i, np.nan))
                nj = float(level_counts.get(key_j, np.nan))
                if not np.isfinite(ni) or not np.isfinite(nj) or ni <= 1 or nj <= 1:
                    q_vals.append(np.nan)
                    continue

                if "__delta_link" in r and np.isfinite(r["__delta_link"]):
                    delta_link_here = float(r["__delta_link"])
                else:
                    delta_link_here = float(r["estimate_link"])

                denom = np.sqrt(mse * 0.5 * (1.0/ni + 1.0/nj))
                q_vals.append(abs(delta_link_here) / denom if denom > 0 else np.nan)

            k = int(level_counts.shape[0])
            padj = [1 - psturng(q, k, df_tukey) if np.isfinite(q) else np.nan for q in q_vals]
            p_adj_chunks.append(pd.Series(padj, index=subdf.index))
        else:
            # General models or 'tukey' fallback → Holm
            eff = "holm" if method == "tukey" else method
            p = subdf["p.value"].to_numpy(dtype=float)
            mask = np.isfinite(p)
            p_adj = np.full_like(p, np.nan, dtype=float)
            if mask.any():
                _, padj_valid, _, _ = multipletests(p[mask], method=eff)
                p_adj[mask] = padj_valid
            p_adj_chunks.append(pd.Series(p_adj, index=subdf.index))

    return pd.concat(p_adj_chunks).sort_index()


def _holm_chunk(subdf: pd.DataFrame) -> pd.Series:
    p = subdf["p.value"].to_numpy(dtype=float)
    mask = np.isfinite(p)
    p_adj = np.full_like(p, np.nan, dtype=float)
    if mask.any():
        _, padj_valid, _, _ = multipletests(p[mask], method="holm")
        p_adj[mask] = padj_valid
    return pd.Series(p_adj, index=subdf.index)


def _finalize_contrasts_df(contrasts_df: pd.DataFrame) -> pd.DataFrame:
    out = contrasts_df.copy()
    df_vals = out["df"].astype(float).to_numpy()
    stat_vals = out["stat"].astype(float).to_numpy()
    out["t.ratio"] = np.where(np.isfinite(df_vals), stat_vals, np.nan)
    out["z.ratio"] = np.where(~np.isfinite(df_vals), stat_vals, np.nan)

    for _aux in ["__i", "__j", "__name_i", "__name_j", "__level_i", "__level_j", "__delta_link", "__se_link"]:
        if _aux in out.columns:
            out.drop(columns=[_aux], inplace=True)
    return out


# ------------------------------ Public API ------------------------------

def emmeans(
    result,
    data: pd.DataFrame,
    specs: Union[str, List[str]],
    *,
    by: Optional[Union[str, List[str]]] = None,
    at: Optional[Dict[str, Union[float, str, List[Union[float, str]]]]] = None,
    weight: str = "equal",
    transform: str = "response",
    level: float = 0.95,
    contrasts: Union[str, List[Tuple[str, np.ndarray]]] = "pairwise",
    contrast_transform: str = "link",
    adjust: Optional[str] = "tukey",
    df_method: str = "resid",
    vcov: Optional[np.ndarray] = None,
    return_grid: bool = False,
    # ---- extended knobs ----
    df_provider: Optional[Callable[[np.ndarray], float]] = None,
    drop_unseen: bool = False,
) -> Dict[str, Optional[pd.DataFrame]]:
    """
    Compute estimated marginal means (EMMs) and (within-BY) contrasts for statsmodels results.

    Refactored for clarity/DRY while preserving computations and outputs.
    """
    # ---------------- A. Parse/validate inputs ----------------
    specs = [specs] if isinstance(specs, str) else list(specs)
    by = [by] if isinstance(by, str) and by is not None else (list(by) if by else [])

    if weight not in {"equal", "proportional"}:
        raise ValueError("weight must be 'equal' or 'proportional'")

    at = at or {}
    focal_set = set(specs) | set(by)

    beta, param_names, V = _get_model_params_and_vcov(result, vcov)
    invlink, deriv = _get_link_and_deriv(result)

    if contrast_transform not in {"link", "response"}:
        raise ValueError("contrast_transform must be 'link' or 'response'")

    di = _get_design_info_or_raise(result)

    # Degrees of freedom for EMM CIs (global default; per-contrast handled with df_provider)
    df_global = np.inf if df_method == "wald" else getattr(result, "df_resid", np.inf)

    # ---------------- B. Extract factors & validate 'at' ----------------
    fx = _extract_factors(di, data)
    for nm in (set(specs) | set(by) | set(at.keys())):
        if nm not in fx.factors:
            raise ValueError(
                f"'{nm}' is not a recognized model factor. "
                f"Known factors (by base name): {list(fx.factors.keys())}"
            )
    _validate_at_levels(at, fx.factors)

    # ---------------- C. Build reference grid & design ----------------
    grid_raw, Xg = _build_reference_grid(
        data=data,
        factors=fx.factors,
        deps_map=fx.deps_map,
        focal_set=focal_set,
        at=at,
        di=di,
        param_names=param_names,
    )

    # ---------------- D. Analysis index/weights for Tukey conditions ----------------
    used_idx, model_weights = _determine_used_index_and_weights(result, data)

    # ---------------- E. Marginalization weights ----------------
    grid_raw = _apply_marginal_weights(
        grid_raw=grid_raw,
        data=data,
        used_idx=used_idx,
        weight=weight,
        focal_set=focal_set,
        at=at,
        factors=fx.factors,
        drop_unseen=drop_unseen
    )
    # Keep Xg aligned to current grid rows if we dropped unseen
    Xg = Xg.loc[grid_raw.index]

    cell_keys = specs + by
    grid_raw = _normalize_weights_by_cell(grid_raw, cell_keys)

    # ---------------- F. EMMs ----------------
    emm_df, by_slices = _compute_emms_per_cell(
        grid_raw=grid_raw,
        Xg=Xg,
        beta=beta,
        V=V,
        invlink=invlink,
        deriv=deriv,
        transform=transform,
        level=level,
        df_global=df_global,
        specs=specs,
        by=by
    )

    # ---------------- G. Contrasts ----------------
    contrasts_df = None
    if (contrasts == "pairwise" or isinstance(contrasts, list)) and len(emm_df) > 0 and len(specs) > 0:
        all_rows = []
        for by_vals, entries in by_slices.items():
            if contrasts == "pairwise":
                all_rows.extend(
                    _pairwise_contrasts_for_slice(
                        entries=entries,
                        specs=specs,
                        by_vals=by_vals,
                        beta=beta,
                        V=V,
                        invlink=invlink,
                        deriv=deriv,
                        contrast_transform=contrast_transform,
                        df_global=df_global,
                        df_provider=df_provider,
                        by=by
                    )
                )
            else:
                all_rows.extend(
                    _custom_contrasts_for_slice(
                        entries=entries,
                        contrasts=contrasts,  # type: ignore[arg-type]
                        specs=specs,
                        by_vals=by_vals,
                        beta=beta,
                        V=V,
                        invlink=invlink,
                        deriv=deriv,
                        contrast_transform=contrast_transform,
                        df_global=df_global,
                        df_provider=df_provider,
                        by=by
                    )
                )
        if all_rows:
            contrasts_df = pd.DataFrame(all_rows)

            if adjust:
                method = (adjust or "").lower()
                contrasts_df["p.value.adj"] = _adjust_pvalues_per_slice(
                    contrasts_df=contrasts_df,
                    method=method,
                    result=result,
                    data=data,
                    specs=specs,
                    by=by,
                    used_idx=used_idx,
                    model_weights=model_weights,
                )

            contrasts_df = _finalize_contrasts_df(contrasts_df)

    out: Dict[str, Optional[pd.DataFrame]] = {"emm": emm_df, "contrasts": contrasts_df}
    if return_grid:
        out["grid"] = grid_raw
    return out

### Example Usage: Post-hoc Tests for model_1

Now let's apply the `emmeans` function to perform post-hoc pairwise comparisons for our mixed linear model.

In [None]:
model_1 = smf.mixedlm(
    "pac_value ~ condition",
    data=df_plan,
    groups=df_plan["sub"]
).fit(reml=True)
print("\n" + "="*70)
print("MODEL 1: Planning Stage (BL vs MAIN baseline vs MAIN adaptation)")
print("="*70)
print(model_1.summary().as_text())
print(f"Random Effect Variance: {model_1.cov_re.iloc[0, 0]:.2e}")

# Compute EMMs and post-hoc comparisons for model_1
# Using df_method="resid" (default) for conservative small-sample inference
results_model1 = emmeans(
    result=model_1,
    data=df_plan,
    specs="condition",
    adjust="tukey",
    level=0.95,
    return_grid=True
)
# Display Pairwise Contrasts
print("\n" + "="*70)
print("PAIRWISE COMPARISONS (adjusted)")
print("="*70)

ctr = results_model1['contrasts']
if ctr is not None:
    # Show whatever stat column exists
    stat_col = 't.ratio' if 't.ratio' in ctr.columns else ('z.ratio' if 'z.ratio' in ctr.columns else 'stat')
    cols = ['contrast', 'estimate_link', 'SE', stat_col, 'p.value', 'p.value.adj']
    cols = [c for c in cols if c in ctr.columns]  # keep only existing
    print(ctr[cols].to_string(index=False))

    # Significant comparisons
    sig = ctr[ctr['p.value.adj'] < 0.05] if 'p.value.adj' in ctr.columns else ctr[ctr['p.value'] < 0.05]
    print("\n" + "="*70)
    print(f"SIGNIFICANT COMPARISONS (p < 0.05): {len(sig)}/{len(ctr)}")
    print("="*70)
    if len(sig) > 0:
        sig_cols = ['contrast', 'estimate_link', 'SE', 'p.value.adj'] if 'p.value.adj' in ctr.columns else ['contrast', 'estimate_link', 'SE', 'p.value']
        sig_cols = [c for c in sig_cols if c in sig.columns]
        print(sig[sig_cols].to_string(index=False))
    else:
        print("No significant pairwise differences found.")
else:
    print("No contrasts computed")


MODEL 1: Planning Stage (BL vs MAIN baseline vs MAIN adaptation)
                  Mixed Linear Model Regression Results
Model:                  MixedLM       Dependent Variable:       pac_value
No. Observations:       69            Method:                   REML     
No. Groups:             23            Scale:                    0.0000   
Min. group size:        3             Log-Likelihood:           598.5126 
Max. group size:        3             Converged:                Yes      
Mean group size:        3.0                                              
-------------------------------------------------------------------------
                               Coef.  Std.Err.   z    P>|z| [0.025 0.975]
-------------------------------------------------------------------------
Intercept                       0.000    0.000 22.546 0.000  0.000  0.000
condition[T._MAIN__adaptation] -0.000    0.000 -3.250 0.001 -0.000 -0.000
condition[T._MAIN__baseline]    0.000    0.000 24.023 0.000  0.0

In [20]:
# Compute EMMs and post-hoc comparisons for model_1
# Using df_method="resid" (default) for conservative small-sample inference
results_model1 = emmeans(
    result=model_2,
    data=df_go,
    specs="condition",
    adjust="tukey",
    level=0.95,
    return_grid=True
)
# Display Pairwise Contrasts
print("\n" + "="*70)
print("PAIRWISE COMPARISONS (adjusted)")
print("="*70)
print(model_2.summary().as_text())
print(f"Random Effect Variance: {model_2.cov_re.iloc[0, 0]:.2e}")

ctr = results_model1['contrasts']
if ctr is not None:
    # Show whatever stat column exists
    stat_col = 't.ratio' if 't.ratio' in ctr.columns else ('z.ratio' if 'z.ratio' in ctr.columns else 'stat')
    cols = ['contrast', 'estimate_link', 'SE', stat_col, 'p.value', 'p.value.adj']
    cols = [c for c in cols if c in ctr.columns]  # keep only existing
    print(ctr[cols].to_string(index=False))

    # Significant comparisons
    sig = ctr[ctr['p.value.adj'] < 0.05] if 'p.value.adj' in ctr.columns else ctr[ctr['p.value'] < 0.05]
    print("\n" + "="*70)
    print(f"SIGNIFICANT COMPARISONS (p < 0.05): {len(sig)}/{len(ctr)}")
    print("="*70)
    if len(sig) > 0:
        sig_cols = ['contrast', 'estimate_link', 'SE', 'p.value.adj'] if 'p.value.adj' in ctr.columns else ['contrast', 'estimate_link', 'SE', 'p.value']
        sig_cols = [c for c in sig_cols if c in sig.columns]
        print(sig[sig_cols].to_string(index=False))
    else:
        print("No significant pairwise differences found.")
else:
    print("No contrasts computed")


PAIRWISE COMPARISONS (adjusted)
                  Mixed Linear Model Regression Results
Model:                   MixedLM       Dependent Variable:       pac_value
No. Observations:        69            Method:                   REML     
No. Groups:              23            Scale:                    0.0000   
Min. group size:         3             Log-Likelihood:           612.4956 
Max. group size:         3             Converged:                Yes      
Mean group size:         3.0                                              
--------------------------------------------------------------------------
                               Coef.  Std.Err.    z    P>|z| [0.025 0.975]
--------------------------------------------------------------------------
Intercept                       0.000    0.000  33.875 0.000  0.000  0.000
condition[T._MAIN__adaptation] -0.000    0.000 -17.066 0.000 -0.000 -0.000
condition[T._MAIN__baseline]    0.000    0.000   2.883 0.004  0.000  0.000
Group Var  

In [21]:
model_2.params

Intercept                         0.000155
condition[T._MAIN__adaptation]   -0.000092
condition[T._MAIN__baseline]      0.000016
Group Var                         0.433510
dtype: float64

In [8]:
# Compute EMMs and post-hoc comparisons for model_1
# Using df_method="resid" (default) for conservative small-sample inference
results_model1 = emmeans(
    result=model_3,
    data=df_plan_main,
    specs="block",
    adjust="tukey",
    level=0.95,
    return_grid=True
)
# Display Pairwise Contrasts
print("\n" + "="*70)
print("PAIRWISE COMPARISONS (adjusted)")
print("="*70)

ctr = results_model1['contrasts']
if ctr is not None:
    # Show whatever stat column exists
    stat_col = 't.ratio' if 't.ratio' in ctr.columns else ('z.ratio' if 'z.ratio' in ctr.columns else 'stat')
    cols = ['contrast', 'estimate_link', 'SE', stat_col, 'p.value', 'p.value.adj']
    cols = [c for c in cols if c in ctr.columns]  # keep only existing
    print(ctr[cols].to_string(index=False))

    # Significant comparisons
    sig = ctr[ctr['p.value.adj'] < 0.05] if 'p.value.adj' in ctr.columns else ctr[ctr['p.value'] < 0.05]
    print("\n" + "="*70)
    print(f"SIGNIFICANT COMPARISONS (p < 0.05): {len(sig)}/{len(ctr)}")
    print("="*70)
    if len(sig) > 0:
        sig_cols = ['contrast', 'estimate_link', 'SE', 'p.value.adj'] if 'p.value.adj' in ctr.columns else ['contrast', 'estimate_link', 'SE', 'p.value']
        sig_cols = [c for c in sig_cols if c in sig.columns]
        print(sig[sig_cols].to_string(index=False))
    else:
        print("No significant pairwise differences found.")
else:
    print("No contrasts computed")


PAIRWISE COMPARISONS (adjusted)
               contrast  estimate_link       SE   t.ratio  p.value  p.value.adj
_baseline - _adaptation        0.00017 0.000006 27.877833      0.0          0.0

SIGNIFICANT COMPARISONS (p < 0.05): 1/1
               contrast  estimate_link       SE  p.value.adj
_baseline - _adaptation        0.00017 0.000006          0.0


In [9]:
# Compute EMMs and post-hoc comparisons for model_1
# Using df_method="resid" (default) for conservative small-sample inference
results_model1 = emmeans(
    result=model_4,
    data=df_go_main,
    specs="block",
    adjust="tukey",
    level=0.95,
    return_grid=True
)
# Display Pairwise Contrasts
print("\n" + "="*70)
print("PAIRWISE COMPARISONS (adjusted)")
print("="*70)

ctr = results_model1['contrasts']
if ctr is not None:
    # Show whatever stat column exists
    stat_col = 't.ratio' if 't.ratio' in ctr.columns else ('z.ratio' if 'z.ratio' in ctr.columns else 'stat')
    cols = ['contrast', 'estimate_link', 'SE', stat_col, 'p.value', 'p.value.adj']
    cols = [c for c in cols if c in ctr.columns]  # keep only existing
    print(ctr[cols].to_string(index=False))

    # Significant comparisons
    sig = ctr[ctr['p.value.adj'] < 0.05] if 'p.value.adj' in ctr.columns else ctr[ctr['p.value'] < 0.05]
    print("\n" + "="*70)
    print(f"SIGNIFICANT COMPARISONS (p < 0.05): {len(sig)}/{len(ctr)}")
    print("="*70)
    if len(sig) > 0:
        sig_cols = ['contrast', 'estimate_link', 'SE', 'p.value.adj'] if 'p.value.adj' in ctr.columns else ['contrast', 'estimate_link', 'SE', 'p.value']
        sig_cols = [c for c in sig_cols if c in sig.columns]
        print(sig[sig_cols].to_string(index=False))
    else:
        print("No significant pairwise differences found.")
else:
    print("No contrasts computed")


PAIRWISE COMPARISONS (adjusted)
               contrast  estimate_link       SE   t.ratio  p.value  p.value.adj
_baseline - _adaptation       0.000108 0.000004 25.774454      0.0          0.0

SIGNIFICANT COMPARISONS (p < 0.05): 1/1
               contrast  estimate_link       SE  p.value.adj
_baseline - _adaptation       0.000108 0.000004          0.0
