# Practice 3: Multi-Group Analysis

## Objetives:
1. Multi-Group analysis by gender and contract type
2. Report moderation hypotheses tests, direct and indirect effects
    1. Gender Hypotheses:
        1. H5a: The influence of technical quality (QT) on satisfaction is more positive among women than among men
        2. H5b: The influence of financial quality (QF) on satisfaction is more positive among women than among men.
        3. H5c: The influence of service quality (QS) on satisfaction is more positive among women than among men.
	    4. H5d: The influence of satisfaction on loyalty is more positive among women than among men.
    2. Contract Type Hypotheses:
        1. H6a: The influence of technical quality (QT) on satisfaction is more positive among consumers who use postpaid plans than those who use prepaid plans.
	    2. H6b: The influence of financial quality (QF) on satisfaction is more positive among consumers who use postpaid plans than those who use prepaid plans.
	    3. H6c: The influence of service quality (QS) on satisfaction is more positive among consumers who use postpaid plans than those who use prepaid plans.
	    4. H6d: The influence of satisfaction on loyalty is more positive among consumers who use postpaid plans than those who use prepaid plans.


## Data Preparation

In [1]:
import pandas as pd
import pyreadstat

raw_data_df, meta = pyreadstat.read_sav("../data/EXERCICIO_2025.sav")
raw_data_df.head()

Unnamed: 0,CASO,f1,f2,f3,f4,f5,f6,f7,P1.1_QA,P1.2_QA,...,P7,P14,ltv2,tempo,clsocial,SATIS,LEAL,ATEND,FIN,TEC
0,,1.0,3.0,6.0,1.0,1.0,50.0,18.0,8.0,6.0,...,3.0,2.0,1373.0,61.0,5.0,6.5,6.2,7.2,9.333333,8.5
1,,1.0,1.0,5.0,1.0,1.0,36.0,35.0,8.0,7.0,...,1.0,1.0,1002.0,47.0,5.0,5.5,4.4,6.8,6.666667,8.0
2,,1.0,3.0,5.0,1.0,1.0,72.0,73.0,10.0,10.0,...,2.0,2.0,2097.0,83.0,5.0,7.5,8.0,9.6,10.0,8.0
3,,1.0,3.0,9.0,1.0,1.0,24.0,27.0,6.0,6.0,...,1.0,2.0,802.0,35.0,4.0,6.25,6.2,6.4,7.666667,6.5
4,,1.0,4.0,9.0,1.0,1.0,72.0,24.0,2.0,5.0,...,2.0,1.0,4835.0,78.0,4.0,5.25,4.4,4.4,5.666667,5.75


In [2]:
columns_to_filter = [
    'P1.1_QA', 'P1.2_QA', 'P1.3_QA', 'P1.4_QA', 'P1.5_QA', 'P1.15_QA', 'P1.16_QA', 'P1.17_QA',
    'P1.6_QC', 'P1.7_QC', 'P1.8_QC', 'P1.9_QC',
    'P1.10_QT', 'P1.11_QT', 'P1.12_QT', 'P1.13_QT', 'P1.14_QT',
    'P3.1_S', 'P3.2_S', 'P3.3_S', 'P3.4_S',
    'P6.1_L', 'P6.2_L', 'P6.3_L', 'P6.4_L', 'P6.5_L', 'P6.6_L',
    'P14', 'f4'
]

clean_df = raw_data_df[columns_to_filter].copy()

clean_df.rename(columns=lambda c: c.replace('.', '_'), inplace=True)
clean_df.rename(columns=lambda c: c.replace('C', 'F'), inplace=True)
clean_df.rename(columns=lambda c: c.replace('A', 'S'), inplace=True)
print(raw_data_df.shape)
print(clean_df.shape)
print(clean_df.isnull().sum())

(493, 45)
(493, 29)
P1_1_QS     0
P1_2_QS     0
P1_3_QS     0
P1_4_QS     0
P1_5_QS     0
P1_15_QS    0
P1_16_QS    0
P1_17_QS    0
P1_6_QF     0
P1_7_QF     0
P1_8_QF     0
P1_9_QF     0
P1_10_QT    0
P1_11_QT    0
P1_12_QT    0
P1_13_QT    0
P1_14_QT    0
P3_1_S      0
P3_2_S      0
P3_3_S      0
P3_4_S      0
P6_1_L      0
P6_2_L      0
P6_3_L      0
P6_4_L      0
P6_5_L      0
P6_6_L      0
P14         0
f4          0
dtype: int64


In [3]:
male_df = clean_df[clean_df['P14'] == 2].copy()
female_df = clean_df[clean_df['P14'] == 1].copy()
print(male_df.shape)
print(female_df.shape)

(232, 29)
(261, 29)


In [4]:
prepaid_df = clean_df[clean_df['f4'] == 1].copy()
postpaid_df = clean_df[clean_df['f4'] == 2].copy()
print(prepaid_df.shape)
print(postpaid_df.shape)

(288, 29)
(205, 29)


## Gender Models

### Modeling

In [5]:
model_description = """
QS =~ P1_1_QS + P1_2_QS + P1_4_QS + P1_5_QS + P1_16_QS
QF =~ P1_6_QF + P1_7_QF + P1_8_QF
QT =~ P1_10_QT + P1_11_QT + P1_12_QT + P1_13_QT
S =~ P3_1_S + P3_2_S + P3_3_S + P3_4_S
L =~ P6_1_L + P6_2_L + P6_3_L + P6_4_L

S ~ QS + QF + QT
L ~ S
"""

#### Male

In [6]:
from semopy import Model
male_model = Model(model_description)
print(male_model.fit(male_df))

Name of objective: MLW
Optimization method: SLSQP
Optimization successful.
Optimization terminated successfully
Objective value: 1.807
Number of iterations: 50
Params: 0.793 1.123 1.085 0.776 1.642 1.666 1.278 1.042 1.067 0.832 0.996 0.987 1.138 0.704 0.984 0.637 0.024 0.262 0.610 0.906 2.063 0.545 0.859 1.661 2.422 1.547 1.971 2.571 2.084 1.730 1.198 2.791 0.564 0.631 0.536 0.919 0.846 1.294 2.462 0.570 0.892 2.588 0.830 1.108 2.074 0.675 1.087


#### Female

In [7]:
female_model = Model(model_description)
print(female_model.fit(female_df))

Name of objective: MLW
Optimization method: SLSQP
Optimization successful.
Optimization terminated successfully
Objective value: 1.508
Number of iterations: 46
Params: 0.856 1.116 1.141 0.677 1.221 1.011 1.166 1.016 0.902 0.906 1.085 1.105 1.087 0.679 0.955 0.373 0.379 0.102 0.626 1.463 2.726 0.930 1.172 1.498 2.293 1.207 1.634 2.128 1.444 1.530 1.367 3.359 0.787 0.699 0.508 0.797 0.672 1.158 2.807 0.609 1.797 2.894 1.048 1.186 2.074 1.050 0.987


### Direct Effects

#### Male

In [8]:
male_raw_item_estimates = (
    male_model.inspect(std_est=True)
         .query("op == '~' and not lval.str.startswith('P')")
)
male_direct_effects_df = (
    male_raw_item_estimates
    .loc[:, ['lval','rval', 'Est. Std', 'Estimate','p-value', 'z-value', 'Std. Err']]
    .rename(columns={
        'lval': 'Outcome',
        'rval': 'Predictor',
        'Estimate': 'beta',
        'Est. Std': 'beta_std',
        'Std. Err': 'SE'
    })
    .sort_values(by='beta')
)
male_direct_effects_df['hypothesis_result'] = male_direct_effects_df.apply(
    lambda row: 'Supported' if row['p-value'] < 0.05 else 'Not Supported',
    axis=1
)
male_direct_effects_df


Unnamed: 0,Outcome,Predictor,beta_std,beta,p-value,z-value,SE,hypothesis_result
1,S,QF,0.014011,0.024302,0.851981,0.186591,0.13024,Not Supported
2,S,QT,0.230126,0.261745,0.000436,3.517361,0.074415,Supported
3,L,S,0.724064,0.610031,0.0,11.173243,0.054598,Supported
0,S,QS,0.625403,0.63666,0.0,7.914545,0.080442,Supported


#### Female

In [9]:
female_raw_item_estimates = (
    female_model.inspect(std_est=True)
         .query("op == '~' and not lval.str.startswith('P')")
)
female_direct_effects_df = (
    female_raw_item_estimates
    .loc[:, ['lval','rval', 'Est. Std', 'Estimate','p-value', 'z-value', 'Std. Err']]
    .rename(columns={
        'lval': 'Outcome',
        'rval': 'Predictor',
        'Estimate': 'beta',
        'Est. Std': 'beta_std',
        'Std. Err': 'SE'
    })
    .sort_values(by='beta')
)
female_direct_effects_df['hypothesis_result'] = female_direct_effects_df.apply(
    lambda row: 'Supported' if row['p-value'] < 0.05 else 'Not Supported',
    axis=1
)
female_direct_effects_df


Unnamed: 0,Outcome,Predictor,beta_std,beta,p-value,z-value,SE,hypothesis_result
2,S,QT,0.100431,0.101959,0.147797,1.447358,0.070445,Not Supported
0,S,QS,0.433928,0.372888,0.0,6.542589,0.056994,Supported
1,S,QF,0.347743,0.379274,5e-06,4.583658,0.082745,Supported
3,L,S,0.603152,0.625585,0.0,9.603669,0.06514,Supported


#### Wald's Test for Group Differences

$Z = \frac{\beta_1 - \beta_2}{\sqrt{SE_1^2 + SE_2^2}}$

Clogg, C. C., Petkova, E., & Haritou, A. (1995). Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100(5), 1261–1293. https://doi.org/10.1086/230638

Please refer to [Wald test](https://en.wikipedia.org/wiki/Wald_test) and [Testing equality of coefficients from two different regressions](https://stats.stackexchange.com/questions/93540/testing-equality-of-coefficients-from-two-different-regressions?utm_source=chatgpt.com) for more information.

In [10]:
from scipy.stats import norm

def direct_wald_test(beta1, se1, beta2, se2):
    z = (beta1 - beta2) / ((se1**2 + se2**2)**0.5)
    p = 2 * (1 - norm.cdf(abs(z)))
    return z, p

In [11]:
female_df = female_direct_effects_df.copy()
male_df = male_direct_effects_df.copy()

female_df = female_df.rename(columns={
    'beta': 'beta_female',
    'beta_std': 'beta_std_female',
    'p-value': 'p_female',
    'SE': 'SE_female'
})

male_df = male_df.rename(columns={
    'beta': 'beta_male',
    'beta_std': 'beta_std_male',
    'p-value': 'p_male',
    'SE': 'SE_male'
})

In [12]:
gender_direct_effects_df = pd.merge(
    female_df[[
        'Outcome',
        'Predictor',
        'beta_female',
        'beta_std_female',
        'SE_female'
    ]],
    male_df[[
        'Outcome',
        'Predictor',
        'beta_male',
        'beta_std_male',
        'SE_male'
    ]],
    on=['Outcome', 'Predictor']
)

In [13]:
gender_direct_effects_df[['z_wald', 'p_wald']] = gender_direct_effects_df.apply(
    lambda row: direct_wald_test(
        row['beta_female'],
        row['SE_female'],
        row['beta_male'],
        row['SE_male']
    ),
    axis=1, result_type='expand'
)

In [14]:
gender_direct_effects_df['difference_significant'] = (
    gender_direct_effects_df['p_wald']
    .apply(lambda p: 'Yes' if p < 0.05 else 'No')
)

In [15]:
gender_direct_effects_df['SE_female'] = (
    gender_direct_effects_df['SE_female'].astype(float)
)
gender_direct_effects_df['SE_male'] = gender_direct_effects_df['SE_male'].astype(float)
gender_direct_effects_df = gender_direct_effects_df.round(3)
gender_direct_effects_df

Unnamed: 0,Outcome,Predictor,beta_female,beta_std_female,SE_female,beta_male,beta_std_male,SE_male,z_wald,p_wald,difference_significant
0,S,QT,0.102,0.1,0.07,0.262,0.23,0.074,-1.559,0.119,No
1,S,QS,0.373,0.434,0.057,0.637,0.625,0.08,-2.676,0.007,Yes
2,S,QF,0.379,0.348,0.083,0.024,0.014,0.13,2.301,0.021,Yes
3,L,S,0.626,0.603,0.065,0.61,0.724,0.055,0.183,0.855,No


### Indirect Effects

#### Male

In [16]:
male_paths = (
    male_model.inspect(std_est=True, se_robust=True)
         .query("op == '~'")
         .set_index(['lval', 'rval'])
)

##### Mediator-to-outcome path  (M -> Y)

In [17]:
male_b_path_coefficient = float(
    male_paths.loc[('L', 'S'), 'Estimate']
)
male_b_path_standardized_coefficient = float(
    male_paths.loc[('L', 'S'), 'Est. Std']
)
male_b_path_standard_error = float(
    male_paths.loc[('L', 'S'), 'Std. Err']
)

##### Predictor-to-mediator paths (X -> M)

Sobel's z test for mediation is calculated as:

$Z = \frac{a \cdot b}{SE_{ab}}
= \frac{a \cdot b}{\sqrt{b^2 \cdot SE_a^2 + a^2 \cdot SE_b^2}}$

Where:
- a = path coefficient from X -> M
- b = path coefficient from M -> Y
- $SE_a$ = standard error of a
- $SE_b$ = standard error of b

Abu-Bader, S., & Jones, T. V. (2021). Statistical mediation analysis using the Sobel test and Hayes SPSS Process Macro. International Journal of Quantitative and Qualitative Research Methods, 9(1), 42–61.

In [18]:
import math
male_results = []
for predictor in ['QS', 'QF', 'QT']:
    male_a_path_coefficient = float(
        male_paths.loc[('S', predictor), 'Estimate']
    )
    male_a_standardized_path_coefficient = float(
        male_paths.loc[('S', predictor), 'Est. Std']
    )
    male_a_path_standard_error = float(
        male_paths.loc[('S', predictor), 'Std. Err']
    )

    # Indirect effect = a * b
    male_indirect_effect = (
        male_a_path_coefficient * male_b_path_coefficient
    )
    male_standardized_indirect_effect = (
        male_a_standardized_path_coefficient * male_b_path_standardized_coefficient
    )

    # Sobel standard error: sqrt(b^2 * SE_a^2 + a^2 * SE_b^2)
    male_indirect_se = math.sqrt(
        (male_b_path_coefficient ** 2) * (male_a_path_standard_error ** 2) +
        (male_a_path_coefficient ** 2) * (male_b_path_standard_error ** 2)
    )
    male_standardized_indirect_effect = (
        male_a_standardized_path_coefficient * male_b_path_standardized_coefficient
    )

    male_z_value = male_indirect_effect / male_indirect_se
    male_p_value = 2 * (1 - norm.cdf(abs(male_z_value)))

    male_results.append({
        'Path': f'{predictor} -> L (via S)',
        'Indirect Effect': round(male_indirect_effect, 3),
        'Std. Indirect Effect': round(male_standardized_indirect_effect, 3),
        'Standard Error':  round(male_indirect_se, 3),
        'z-value': round(male_z_value, 2),
        'p-value': round(male_p_value, 4)
    })

In [19]:
male_indirect_effects_df = pd.DataFrame(male_results)
male_indirect_effects_df = male_indirect_effects_df.round(3)
male_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.388,0.453,0.069,5.65,0.0
1,QF -> L (via S),0.015,0.01,0.093,0.16,0.874
2,QT -> L (via S),0.16,0.167,0.051,3.15,0.002


#### Female

In [20]:
female_paths = (
    female_model.inspect(std_est=True, se_robust=True)
         .query("op == '~'")
         .set_index(['lval', 'rval'])
)

##### Mediator-to-outcome path  (M -> Y)

In [21]:
female_b_path_coefficient = float(
    female_paths.loc[('L', 'S'), 'Est. Std']
)
female_b_path_standardized_coefficient = float(
    male_paths.loc[('L', 'S'), 'Est. Std']
)
female_b_path_standard_error = float(
    female_paths.loc[('L', 'S'), 'Std. Err']
)

##### Predictor-to-mediator paths (X -> M)

Sobel's z test for mediation is calculated as:

$Z = \frac{a \cdot b}{SE_{ab}}
= \frac{a \cdot b}{\sqrt{b^2 \cdot SE_a^2 + a^2 \cdot SE_b^2}}$

Where:
- a = path coefficient from X -> M
- b = path coefficient from M -> Y
- $SE_a$ = standard error of a
- $SE_b$ = standard error of b

Abu-Bader, S., & Jones, T. V. (2021). Statistical mediation analysis using the Sobel test and Hayes SPSS Process Macro. International Journal of Quantitative and Qualitative Research Methods, 9(1), 42–61.

In [22]:
import math
female_results = []
for predictor in ['QS', 'QF', 'QT']:
    female_a_path_coefficient = float(
        female_paths.loc[('S', predictor), 'Estimate']
    )
    female_a_standardized_path_coefficient = float(
        female_paths.loc[('S', predictor), 'Est. Std']
    )
    female_a_path_standard_error = float(
        female_paths.loc[('S', predictor), 'Std. Err']
    )

    # Indirect effect = a * b
    female_indirect_effect = (
        female_a_path_coefficient * female_b_path_coefficient
    )
    female_standardized_indirect_effect = (
        female_a_standardized_path_coefficient * female_b_path_standardized_coefficient
    )

    # Sobel standard error: sqrt(b^2 * SE_a^2 + a^2 * SE_b^2)
    female_indirect_se = math.sqrt(
        (female_b_path_coefficient ** 2) * (female_a_path_standard_error ** 2) +
        (female_a_path_coefficient ** 2) * (female_b_path_standard_error ** 2)
    )
    female_standardized_indirect_effect = (
        female_a_standardized_path_coefficient * female_b_path_standardized_coefficient
    )

    female_z_value = female_indirect_effect / female_indirect_se
    female_p_value = 2 * (1 - norm.cdf(abs(female_z_value)))

    female_results.append({
        'Path': f'{predictor} -> L (via S)',
        'Indirect Effect': round(female_indirect_effect, 3),
        'Std. Indirect Effect': round(female_standardized_indirect_effect, 3),
        'Standard Error':  round(female_indirect_se, 3),
        'z-value': round(female_z_value, 2),
        'p-value': round(female_p_value, 4)
    })

In [23]:
female_indirect_effects_df = pd.DataFrame(female_results)
female_indirect_effects_df = female_indirect_effects_df.round(3)
female_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.225,0.314,0.051,4.42,0.0
1,QF -> L (via S),0.229,0.252,0.069,3.33,0.001
2,QT -> L (via S),0.061,0.073,0.045,1.38,0.169


## Contract Type Models

### Modeling

#### Prepaid

In [24]:
pre_model = Model(model_description)
print(pre_model.fit(prepaid_df))

Name of objective: MLW
Optimization method: SLSQP
Optimization successful.
Optimization terminated successfully
Objective value: 1.641
Number of iterations: 47
Params: 0.761 0.970 0.974 0.731 0.996 0.720 1.145 1.071 0.875 0.855 1.036 1.048 1.223 0.748 1.009 0.486 0.209 0.175 0.560 1.103 2.498 0.975 1.052 0.774 2.526 0.995 1.756 2.401 1.840 1.065 0.353 4.420 0.687 0.750 0.522 0.914 0.865 1.141 2.648 0.697 2.122 3.039 0.897 1.220 2.289 1.042 1.048


#### Postpaid

In [25]:
post_model = Model(model_description)
print(post_model.fit(postpaid_df))

Name of objective: MLW
Optimization method: SLSQP
Optimization successful.
Optimization terminated successfully
Objective value: 1.843
Number of iterations: 51
Params: 0.942 1.361 1.309 0.714 2.322 2.202 1.342 1.042 1.254 0.868 1.039 1.047 1.006 0.596 0.913 0.469 0.388 0.230 0.704 1.285 2.287 0.629 0.743 1.822 1.887 1.832 1.776 2.171 1.633 2.041 0.929 1.589 0.669 0.588 0.546 0.764 0.537 1.313 2.633 0.447 0.658 2.427 0.702 1.035 1.682 0.493 1.161


### Direct Effects

#### Prepaid

In [26]:
pre_raw_item_estimates = (
    pre_model.inspect(std_est=True)
         .query("op == '~' and not lval.str.startswith('P')")
)
pre_direct_effects_df = (
    pre_raw_item_estimates
    .loc[:, ['lval','rval', 'Est. Std', 'Estimate','p-value', 'z-value', 'Std. Err']]
    .rename(columns={
        'lval': 'Outcome',
        'rval': 'Predictor',
        'Estimate': 'beta',
        'Est. Std': 'beta_std',
        'Std. Err': 'SE'
    })
    .sort_values(by='beta')
)
pre_direct_effects_df['hypothesis_result'] = pre_direct_effects_df.apply(
    lambda row: 'Supported' if row['p-value'] < 0.05 else 'Not Supported',
    axis=1
)
pre_direct_effects_df

Unnamed: 0,Outcome,Predictor,beta_std,beta,p-value,z-value,SE,hypothesis_result
2,S,QT,0.171215,0.175113,0.004432,2.845689,0.061536,Supported
1,S,QF,0.196393,0.208615,0.000468,3.498597,0.059628,Supported
0,S,QS,0.547677,0.486139,0.0,9.056507,0.053678,Supported
3,L,S,0.636721,0.560365,0.0,10.403632,0.053862,Supported


#### Postpaid

In [27]:
post_raw_item_estimates = (
    post_model.inspect(std_est=True)
         .query("op == '~' and not lval.str.startswith('P')")
)
post_direct_effects_df = (
    post_raw_item_estimates
    .loc[:, ['lval','rval', 'Est. Std', 'Estimate','p-value', 'z-value', 'Std. Err']]
    .rename(columns={
        'lval': 'Outcome',
        'rval': 'Predictor',
        'Estimate': 'beta',
        'Est. Std': 'beta_std',
        'Std. Err': 'SE'
    })
    .sort_values(by='beta')
)
post_direct_effects_df['hypothesis_result'] = post_direct_effects_df.apply(
    lambda row: 'Supported' if row['p-value'] < 0.05 else 'Not Supported',
    axis=1
)
post_direct_effects_df


Unnamed: 0,Outcome,Predictor,beta_std,beta,p-value,z-value,SE,hypothesis_result
2,S,QT,0.190922,0.230413,0.007908,2.655984,0.086752,Supported
1,S,QF,0.200919,0.38764,0.011879,2.515713,0.154088,Supported
0,S,QS,0.466579,0.468712,0.0,5.606618,0.0836,Supported
3,L,S,0.697067,0.704154,0.0,10.719871,0.065687,Supported


#### Wald's Test for Group Differences

$Z = \frac{\beta_1 - \beta_2}{\sqrt{SE_1^2 + SE_2^2}}$

Clogg, C. C., Petkova, E., & Haritou, A. (1995). Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100(5), 1261–1293. https://doi.org/10.1086/230638

Please refer to [Wald test](https://en.wikipedia.org/wiki/Wald_test) and [Testing equality of coefficients from two different regressions](https://stats.stackexchange.com/questions/93540/testing-equality-of-coefficients-from-two-different-regressions?utm_source=chatgpt.com) for more information.

In [28]:
pre_df = pre_direct_effects_df.copy()
post_df = post_direct_effects_df.copy()

pre_df = pre_df.rename(columns={
    'beta': 'beta_pre',
    'beta_std': 'beta_std_pre',
    'p-value': 'p_pre',
    'SE': 'SE_pre'
})

post_df = post_df.rename(columns={
    'beta': 'beta_post',
    'beta_std': 'beta_std_post',
    'p-value': 'p_post',
    'SE': 'SE_post'
})

In [29]:
plan_type_direct_effects_df = pd.merge(
    pre_df[[
        'Outcome',
        'Predictor',
        'beta_pre',
        'beta_std_pre',
        'SE_pre'
    ]],
    post_df[[
        'Outcome',
        'Predictor',
        'beta_post',
        'beta_std_post',
        'SE_post'
    ]],
    on=['Outcome', 'Predictor']
)

In [30]:
plan_type_direct_effects_df[['z_wald', 'p_wald']] = plan_type_direct_effects_df.apply(
    lambda row: direct_wald_test(
        row['beta_pre'],
        row['SE_pre'],
        row['beta_post'],
        row['SE_post']
    ),
    axis=1, result_type='expand'
)

In [31]:
plan_type_direct_effects_df['significant_difference'] = (
    plan_type_direct_effects_df['p_wald']
        .apply(lambda p: 'Yes' if p < 0.05 else 'No')
)

In [32]:
plan_type_direct_effects_df['SE_pre'] = (
    plan_type_direct_effects_df['SE_pre'].astype(float)
)
plan_type_direct_effects_df['SE_post'] = (
    plan_type_direct_effects_df['SE_post'].astype(float)
)
plan_type_direct_effects_df = plan_type_direct_effects_df.round(3)
plan_type_direct_effects_df

Unnamed: 0,Outcome,Predictor,beta_pre,beta_std_pre,SE_pre,beta_post,beta_std_post,SE_post,z_wald,p_wald,significant_difference
0,S,QT,0.175,0.171,0.062,0.23,0.191,0.087,-0.52,0.603,No
1,S,QF,0.209,0.196,0.06,0.388,0.201,0.154,-1.084,0.279,No
2,S,QS,0.486,0.548,0.054,0.469,0.467,0.084,0.175,0.861,No
3,L,S,0.56,0.637,0.054,0.704,0.697,0.066,-1.693,0.091,No


### Indirect Effects

#### Prepaid

In [33]:
pre_paths = (
    pre_model.inspect(std_est=True, se_robust=True)
         .query("op == '~'")
         .set_index(['lval', 'rval'])
)

##### Mediator-to-outcome path  (M -> Y)

In [34]:
pre_b_path_coefficient = float(
    pre_paths.loc[('L', 'S'), 'Estimate']
)
pre_b_path_standardized_coefficient = float(
    pre_paths.loc[('L', 'S'), 'Est. Std']
)
pre_b_path_standard_error = float(
    male_paths.loc[('L', 'S'), 'Std. Err']
)

##### Predictor-to-mediator paths (X -> M)

Sobel's z test for mediation is calculated as:

$Z = \frac{a \cdot b}{SE_{ab}}
= \frac{a \cdot b}{\sqrt{b^2 \cdot SE_a^2 + a^2 \cdot SE_b^2}}$

Where:
- a = path coefficient from X -> M
- b = path coefficient from M -> Y
- $SE_a$ = standard error of a
- $SE_b$ = standard error of b

Abu-Bader, S., & Jones, T. V. (2021). Statistical mediation analysis using the Sobel test and Hayes SPSS Process Macro. International Journal of Quantitative and Qualitative Research Methods, 9(1), 42–61.

In [35]:
pre_results = []
for predictor in ['QS', 'QF', 'QT']:
    pre_a_path_coefficient = float(
        pre_paths.loc[('S', predictor), 'Estimate']
    )
    pre_a_standardized_path_coefficient = float(
        pre_paths.loc[('S', predictor), 'Est. Std']
    )
    pre_a_path_standard_error = float(
        pre_paths.loc[('S', predictor), 'Std. Err']
    )

    # Indirect effect = a * b
    pre_indirect_effect = (
        pre_a_path_coefficient * pre_b_path_coefficient
    )
    pre_standardized_indirect_effect = (
        pre_a_standardized_path_coefficient * pre_b_path_standardized_coefficient
    )

    # Sobel standard error: sqrt(b^2 * SE_a^2 + a^2 * SE_b^2)
    pre_indirect_se = math.sqrt(
        (pre_b_path_coefficient ** 2) * (pre_a_path_standard_error ** 2) +
        (pre_a_path_coefficient ** 2) * (pre_b_path_standard_error ** 2)
    )
    pre_standardized_indirect_effect = (
        pre_a_standardized_path_coefficient * pre_b_path_standardized_coefficient
    )

    pre_z_value = pre_indirect_effect / pre_indirect_se
    pre_p_value = 2 * (1 - norm.cdf(abs(pre_z_value)))

    pre_results.append({
        'Path': f'{predictor} -> L (via S)',
        'Indirect Effect': round(pre_indirect_effect, 3),
        'Std. Indirect Effect': round(pre_standardized_indirect_effect, 3),
        'Standard Error':  round(pre_indirect_se, 3),
        'z-value': round(pre_z_value, 2),
        'p-value': round(pre_p_value, 4)
    })

In [36]:
pre_indirect_effects_df = pd.DataFrame(pre_results)
pre_indirect_effects_df = pre_indirect_effects_df.round(3)
pre_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.272,0.349,0.049,5.55,0.0
1,QF -> L (via S),0.117,0.125,0.039,3.02,0.002
2,QT -> L (via S),0.098,0.109,0.034,2.91,0.004


#### Postpaid

In [37]:
post_paths = (
    post_model.inspect(std_est=True, se_robust=True)
         .query("op == '~'")
         .set_index(['lval', 'rval'])
)

##### Mediator-to-outcome path  (M -> Y)

In [38]:
post_b_path_coefficient = float(
    post_paths.loc[('L', 'S'), 'Estimate']
)
post_b_path_standardized_coefficient = float(
    post_paths.loc[('L', 'S'), 'Est. Std']
)
post_b_path_standard_error = float(
    male_paths.loc[('L', 'S'), 'Std. Err']
)

##### Predictor-to-mediator paths (X -> M)

Sobel's z test for mediation is calculated as:

$Z = \frac{a \cdot b}{SE_{ab}}
= \frac{a \cdot b}{\sqrt{b^2 \cdot SE_a^2 + a^2 \cdot SE_b^2}}$

Where:
- a = path coefficient from X -> M
- b = path coefficient from M -> Y
- $SE_a$ = standard error of a
- $SE_b$ = standard error of b

Abu-Bader, S., & Jones, T. V. (2021). Statistical mediation analysis using the Sobel test and Hayes SPSS Process Macro. International Journal of Quantitative and Qualitative Research Methods, 9(1), 42–61.

In [39]:
post_results = []
for predictor in ['QS', 'QF', 'QT']:
    post_a_path_coefficient = float(
        post_paths.loc[('S', predictor), 'Estimate']
    )
    post_a_standardized_path_coefficient = float(
        post_paths.loc[('S', predictor), 'Est. Std']
    )
    post_a_path_standard_error = float(
        post_paths.loc[('S', predictor), 'Std. Err']
    )

    # Indirect effect = a * b
    post_indirect_effect = (
        post_a_path_coefficient * post_b_path_coefficient
    )
    post_standardized_indirect_effect = (
        post_a_standardized_path_coefficient * post_b_path_standardized_coefficient
    )

    # Sobel standard error: sqrt(b^2 * SE_a^2 + a^2 * SE_b^2)
    post_indirect_se = math.sqrt(
        (post_b_path_coefficient ** 2) * (post_a_path_standard_error ** 2) +
        (post_a_path_coefficient ** 2) * (post_b_path_standard_error ** 2)
    )
    post_standardized_indirect_effect = (
        post_a_standardized_path_coefficient * post_b_path_standardized_coefficient
    )

    post_z_value = post_indirect_effect / post_indirect_se
    post_p_value = 2 * (1 - norm.cdf(abs(post_z_value)))

    post_results.append({
        'Path': f'{predictor} -> L (via S)',
        'Indirect Effect': round(post_indirect_effect, 3),
        'Std. Indirect Effect': round(post_standardized_indirect_effect, 3),
        'Standard Error':  round(post_indirect_se, 3),
        'z-value': round(post_z_value, 2),
        'p-value': round(post_p_value, 4)
    })

In [40]:
post_indirect_effects_df = pd.DataFrame(post_results)
post_indirect_effects_df = post_indirect_effects_df.round(3)
post_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.33,0.325,0.071,4.65,0.0
1,QF -> L (via S),0.273,0.14,0.145,1.88,0.06
2,QT -> L (via S),0.162,0.133,0.074,2.2,0.028


## Final Analysis

In [41]:
gender_direct_effects_df

Unnamed: 0,Outcome,Predictor,beta_female,beta_std_female,SE_female,beta_male,beta_std_male,SE_male,z_wald,p_wald,difference_significant
0,S,QT,0.102,0.1,0.07,0.262,0.23,0.074,-1.559,0.119,No
1,S,QS,0.373,0.434,0.057,0.637,0.625,0.08,-2.676,0.007,Yes
2,S,QF,0.379,0.348,0.083,0.024,0.014,0.13,2.301,0.021,Yes
3,L,S,0.626,0.603,0.065,0.61,0.724,0.055,0.183,0.855,No


In [42]:
plan_type_direct_effects_df

Unnamed: 0,Outcome,Predictor,beta_pre,beta_std_pre,SE_pre,beta_post,beta_std_post,SE_post,z_wald,p_wald,significant_difference
0,S,QT,0.175,0.171,0.062,0.23,0.191,0.087,-0.52,0.603,No
1,S,QF,0.209,0.196,0.06,0.388,0.201,0.154,-1.084,0.279,No
2,S,QS,0.486,0.548,0.054,0.469,0.467,0.084,0.175,0.861,No
3,L,S,0.56,0.637,0.054,0.704,0.697,0.066,-1.693,0.091,No


In [43]:
male_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.388,0.453,0.069,5.65,0.0
1,QF -> L (via S),0.015,0.01,0.093,0.16,0.874
2,QT -> L (via S),0.16,0.167,0.051,3.15,0.002


In [44]:
female_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.225,0.314,0.051,4.42,0.0
1,QF -> L (via S),0.229,0.252,0.069,3.33,0.001
2,QT -> L (via S),0.061,0.073,0.045,1.38,0.169


In [45]:
pre_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.272,0.349,0.049,5.55,0.0
1,QF -> L (via S),0.117,0.125,0.039,3.02,0.002
2,QT -> L (via S),0.098,0.109,0.034,2.91,0.004


In [46]:
post_indirect_effects_df

Unnamed: 0,Path,Indirect Effect,Std. Indirect Effect,Standard Error,z-value,p-value
0,QS -> L (via S),0.33,0.325,0.071,4.65,0.0
1,QF -> L (via S),0.273,0.14,0.145,1.88,0.06
2,QT -> L (via S),0.162,0.133,0.074,2.2,0.028


This analysis aimed at examining whether the structural model relationships differ significantly by gender and contract type.

**Moderation by Gender:**

Four hypotheses (H5a–H5d) were tested to determine if men and women have different effects of quality on satisfaction and satisfaction on loyalty.
According to the Wald tests:

1. H5a (QS -> S, effect bigger for women): **Not supported**, the difference between men and women is not significant (p = 0.119)
2. H5b (QF -> S, effect bigger for women): **Supported**, the difference is significant (p < 0.05) and the effect is bigger for women than for men ($\beta^* = .348$ and $\beta^* = .014$ respectively)
3. H5c (QT -> S, effect bigger for women): **Not supported**, although the difference is significant (p < 0.01), the direction of the effect is opposite to what was hypothesized
4. H5d (S -> L, effect bigger for women): **Not supported**, the difference is not significant (p = .855)

Thus, only H5b was confirmed. This suggests that financial quality plays a more influential role in determining satisfaction for female consumers


**Moderation by Contract type:**

Similarly to gender, four hypotheses (H6a–H6d) were tested to determine if prepaid and postpaid consumers have different effects of quality on satisfaction and satisfaction on loyalty.

None of the hypotheses (H6a–H6d) were supported, contract type does not significantly moderate the hypothesized relationships (p ranging from .091  to 0.861).

**Indirect Effects**
There was no hypothesized moderation of the indirect effects, but they were calculated for completeness, only 3 of the 12 mediation paths were not significant (according to Sobel's tests), male QF -> L (p=.874), female QT -> L (p=.169) and postpaid QF -> L (p=.060).