# Completion Times Hypothesis Testing

In [2]:
# Load necessary libraries
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import shapiro

# Load the dataset
file_path = "completion_times.csv"  # Replace with actual file path
df_completion_times = pd.read_csv(file_path)

# Rename CHIRON to SBC for consistency
df_completion_times["Controller"] = df_completion_times["Controller"].replace("CHIRON", "SBC")

# Ensure correct data types
df_completion_times["Modality"] = df_completion_times["Modality"].astype(str)
df_completion_times["Controller"] = df_completion_times["Controller"].astype(str)
df_completion_times["Trial"] = df_completion_times["Trial"].astype(int)

# Check normality using Shapiro-Wilk test
shapiro_test = shapiro(df_completion_times["Total Time"])

# Specify the Linear Mixed-Effects Model (LMM) with a random intercept per subject
lmm_model = smf.mixedlm(
    "Q('Total Time') ~ Controller * Modality * Trial",
    df_completion_times,
    groups="Subject",  # column name keeps the group labels aligned with the data rows
)

# Fit the model
lmm_result = lmm_model.fit()

# Display results
print("Shapiro-Wilk Test for Normality:")
print(f"Statistic={shapiro_test.statistic}, p-value={shapiro_test.pvalue}\n")

print("Linear Mixed-Effects Model Results:")
print(lmm_result.summary())


Shapiro-Wilk Test for Normality:
Statistic=0.9065364599227905, p-value=1.4640572771895677e-06

Linear Mixed-Effects Model Results:
                           Mixed Linear Model Regression Results
Model:                      MixedLM           Dependent Variable:           Q('Total Time')
No. Observations:           107               Method:                       REML           
No. Groups:                 20                Scale:                        7773.2166      
Min. group size:            1                 Log-Likelihood:               -612.4254      
Max. group size:            6                 Converged:                    Yes            
Mean group size:            5.3                                                            
-------------------------------------------------------------------------------------------
                                             Coef.   Std.Err.   z    P>|z|  [0.025   0.975]
--------------------------------------------------------------------

## Statistical Approach and Justification

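During data cleaning, seven task completion times were found to be missing. To analyze completion times, a Linear Mixed-Effects Model (LMM) was fitted to account for repeated measures across trials within participants. The model includes fixed effects for Controller, Modality, and Trial Number, together with their interactions, and a random intercept for each participant. Unlike repeated-measures ANOVA, which requires complete data for every participant, an LMM uses all available observations, so participants with missing trials do not have to be discarded. A Shapiro-Wilk test indicated that the raw completion times deviate from normality (W = 0.907, p < 0.001). Since an LMM assumes approximately normal residuals rather than a normally distributed response, this result is best read as a caveat on the data; the choice of LMM over ANOVA rests mainly on its handling of repeated measures and missing observations.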
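An LMM assumes approximately normal residuals, so a residual-level check is a useful complement to testing the raw response. The sketch below is not part of the original analysis: it fits a random-intercept model to synthetic data and applies Shapiro-Wilk to the model residuals. The column names (`total_time`, `controller`, `modality`, `trial`, `subject`) and all simulated values are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import shapiro

# Synthetic stand-in for the real data: 20 subjects x 6 trials (all values invented).
rng = np.random.default_rng(0)
n_subjects, n_trials = 20, 6
demo = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subjects), n_trials),
    "trial": np.tile(np.arange(1, n_trials + 1), n_subjects),
})
demo["controller"] = np.where(demo["subject"] % 2 == 0, "SBC", "WBC")
demo["modality"] = np.where(demo["trial"] % 2 == 0, "VR", "NoVR")

# Per-subject random offset plus observation noise.
subject_offset = rng.normal(0, 40, n_subjects)[demo["subject"].to_numpy()]
demo["total_time"] = (
    300
    + 60 * (demo["controller"] == "WBC")
    + 50 * (demo["modality"] == "VR")
    - 10 * demo["trial"]
    + subject_offset
    + rng.normal(0, 30, len(demo))
)

# Random-intercept model; groups is passed as a column name.
demo_fit = smf.mixedlm(
    "total_time ~ controller + modality + trial", demo, groups="subject"
).fit()

# Shapiro-Wilk on the residuals (observed minus fitted values).
resid_stat, resid_p = shapiro(demo_fit.resid)
print(f"Residual Shapiro-Wilk: W={resid_stat:.3f}, p={resid_p:.3f}")
```

Applied to the notebook's own model, the equivalent call would be `shapiro(lmm_result.resid)`. A small p-value there would suggest considering a transformation of the response (for example a log transform) before interpreting the coefficients.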

## Results from the Completion Time Analysis

The baseline condition was defined as SBC without VR. Use of the VR HMD significantly increased completion time (+141.80 seconds, p = 0.026), and WBC was significantly slower than SBC (+169.09 seconds, p = 0.025). Because the model includes interaction terms, these coefficients are estimated relative to the baseline condition. Trial Number showed a negative trend (-31.64 seconds per trial, p = 0.120), which may hint at a learning effect over repeated attempts but was not statistically significant. None of the interaction effects among Controller, Modality, and Trial Number were significant (p > 0.05).
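Effect sizes and p-values like those reported here can be collected from a fitted model programmatically instead of being copied from the printed summary. The following is a minimal sketch, assuming a statsmodels `MixedLMResults` object; the helper name `fixed_effects_table` and the synthetic demo data are hypothetical and do not come from the original notebook.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def fixed_effects_table(result, alpha=0.05):
    """Collect fixed-effect estimates, standard errors, p-values and CIs."""
    names = list(result.model.exog_names)  # fixed-effect term names
    k = len(names)
    # In MixedLM the fixed effects come first in the parameter vector,
    # followed by the random-effects variance parameters.
    ci = np.asarray(result.conf_int(alpha=alpha))[:k]
    table = pd.DataFrame(
        {
            "coef": np.asarray(result.params)[:k],
            "std_err": np.asarray(result.bse)[:k],
            "p_value": np.asarray(result.pvalues)[:k],
            "ci_low": ci[:, 0],
            "ci_high": ci[:, 1],
        },
        index=names,
    )
    table["significant"] = table["p_value"] < alpha
    return table

# Synthetic demo data (all values invented): 20 subjects x 6 trials.
rng = np.random.default_rng(1)
n_subjects, n_trials = 20, 6
demo = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subjects), n_trials),
    "trial": np.tile(np.arange(1, n_trials + 1), n_subjects),
})
demo["controller"] = np.where(demo["subject"] % 2 == 0, "SBC", "WBC")
demo["modality"] = np.where(demo["trial"] % 2 == 0, "VR", "NoVR")
demo["total_time"] = (
    300
    + 60 * (demo["controller"] == "WBC")
    + 50 * (demo["modality"] == "VR")
    - 10 * demo["trial"]
    + rng.normal(0, 40, n_subjects)[demo["subject"].to_numpy()]
    + rng.normal(0, 30, len(demo))
)

demo_fit = smf.mixedlm(
    "total_time ~ controller + modality + trial", demo, groups="subject"
).fit()
table = fixed_effects_table(demo_fit)
print(table.round(3))
```

In the notebook itself the helper could be called as `fixed_effects_table(lmm_result)`, which would list every main effect and interaction term in one frame and make it easy to filter on the `significant` column.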


## Conclusion

The analysis indicates that both modality and controller type influence task duration: VR significantly increased completion time, and SBC yielded shorter completion times than WBC.