## Dominant-hand filtering and technique-specific blanking for `data.csv`

This notebook:
1. Loads `data.csv`.
2. Infers each participant's dominant hand from paired t-tests on left/right metric pairs, with each significant pair casting one vote.
3. Blanks the non-dominant hand's measurements for participants labelled `Left` or `Right`; `Ambiguous` participants keep both hands. The dominant-hand label itself is not added to the dataset.
4. Blanks hand metrics for the `Chicken` technique and head metrics for the `Astral`, `Grab`, `Sliding`, `Teleport`, and `Throw` techniques.


In [1]:
import pandas as pd
import numpy as np
from scipy.stats import ttest_rel

raw_path = "data.csv"
df = pd.read_csv(raw_path)
print(f"Loaded {len(df)} rows and {df.shape[1]} columns from {raw_path}")
df.head()

Loaded 414 rows and 25 columns from data.csv


Unnamed: 0,iD,group,technique,trialBlock,trial,headTotalDistance,headExtent,leftTotalDistance,leftExtent,rightTotalDistance,...,rightThumbstickDistance,rightThumbstickExtent,leftTriggerPressure,rightTriggerPressure,leftGripPressure,rightGripPressure,buttonPressCount,movementVariability,targetEnteredCount,axisCrossedCount
0,P001,Impaired,Astral,1,1,0.9,0.32,1.0,0.35,0.2,...,0.1,0.05,0.05,0.02,0.03,0.01,20,0.9,13,22
1,P001,Impaired,Chicken,1,2,1.7,0.65,0.5,0.22,0.25,...,0.04,0.04,0.0,0.0,0.0,0.0,0,0.95,11,20
2,P001,Impaired,Grab,2,1,1.1,0.42,2.6,0.85,0.5,...,0.08,0.06,0.8,0.1,0.2,0.05,6,1.05,14,28
3,P001,Impaired,Sliding,2,2,1.3,0.52,1.05,0.34,0.3,...,0.08,0.06,0.06,0.02,0.05,0.02,38,0.92,13,24
4,P001,Impaired,Teleport,3,1,1.25,0.5,1.6,0.62,0.3,...,0.12,0.08,0.85,0.1,0.12,0.04,18,1.0,15,30


### Metrics used to infer dominance
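For each participant, a paired t-test (alpha = 0.05) compares the left and right values of each metric pair below. A significant pair casts one vote for the hand with the larger mean, and the sign of the summed votes decides the label. Participants with tied votes or no significant pair are labelled `Ambiguous`. The pairs are: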
- Total distance
- Extent
- Thumbstick distance and extent
- Trigger pressure
- Grip pressure
- Head distance
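
To make the sign convention concrete, here is a minimal sketch with made-up numbers (the values and variable names are illustrative and not taken from `data.csv`). `scipy.stats.ttest_rel` tests whether the mean of the row-wise differences is zero; the sign of `right.mean() - left.mean()` then says which hand a significant result favours.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical paired measurements for one participant (one value per trial).
left = np.array([0.50, 0.62, 0.48, 0.55, 0.60, 0.52])
right = np.array([0.91, 1.08, 0.83, 1.02, 0.97, 0.95])

result = ttest_rel(left, right)
mean_diff = right.mean() - left.mean()

# A significant p-value with a positive mean difference is a vote for the right hand.
vote = int(np.sign(mean_diff)) if result.pvalue < 0.05 else 0
print(f"p = {result.pvalue:.4g}, mean_diff = {mean_diff:.3f}, vote = {vote}")
```

With only a handful of trials per participant the test has little power, so a non-significant pair should be read as "no vote" rather than as evidence of equal use.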


In [2]:
hand_pairs = [
    ("leftTotalDistance", "rightTotalDistance", "totalDistance"),
    ("leftExtent", "rightExtent", "extent"),
    ("leftThumbstickDistance", "rightThumbstickDistance", "thumbstickDistance"),
    ("leftThumbstickExtent", "rightThumbstickExtent", "thumbstickExtent"),
    ("leftTriggerPressure", "rightTriggerPressure", "triggerPressure"),
    ("leftGripPressure", "rightGripPressure", "gripPressure"),
    ("leftHeadDistance", "rightHeadDistance", "headDistance"),
]

alpha = 0.05

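def infer_dominant_hand(frame, pairs, alpha=0.05):
    """Label each participant "Left", "Right", or "Ambiguous".

    Each significant left/right pair casts one vote for the hand with the
    larger mean; the sign of the vote total decides the label.
    Returns the {iD: label} mapping and a per-participant, per-metric table.
    """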
    dominant = {}
    rows = []
    for pid, group in frame.groupby("iD"):
        votes = []
        for left, right, label in pairs:
            cols = group[[left, right]].dropna()
            if len(cols) < 2:
                continue
            stat, p = ttest_rel(cols[left], cols[right])
            if np.isnan(p):
                continue
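            # A positive difference means the right hand has the larger mean.
            mean_diff = cols[right].mean() - cols[left].mean()
            if p < alpha:
                # Only significant pairs vote: +1 for right, -1 for left.
                votes.append(np.sign(mean_diff))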
            rows.append({
                "iD": pid,
                "metric": label,
                "left_mean": cols[left].mean(),
                "right_mean": cols[right].mean(),
                "mean_diff": mean_diff,
                "p_value": p,
            })
        if votes:
            score = np.sign(np.sum(votes))
            if score > 0:
                dominant[pid] = "Right"
            elif score < 0:
                dominant[pid] = "Left"
            else:
                dominant[pid] = "Ambiguous"
        else:
            dominant[pid] = "Ambiguous"
    detail_df = pd.DataFrame(rows)
    return dominant, detail_df


In [3]:
dominant_map, ttest_details = infer_dominant_hand(df, hand_pairs, alpha)
dominant_hand_series = df["iD"].map(dominant_map)

print("Dominant hand per participant (computed, not added to dataset):")
dominant_hand_overview = pd.DataFrame(
    sorted(dominant_map.items()), columns=["iD", "dominant_hand"]
)
print(dominant_hand_overview)
print("\nSample of t-test details (per participant x metric):")
display(ttest_details.head())


Dominant hand per participant (computed, not added to dataset):
       iD dominant_hand
0   #REF!         Right
1    P001          Left
2    P002         Right
3    P003         Right
4    P004         Right
5    P005         Right
6    P006         Right
7    P007         Right
8    P008         Right
9    P009     Ambiguous
10   P010         Right
11   P011          Left
12   P012         Right
13   P013     Ambiguous
14   P014         Right
15   P015          Left
16   P016         Right
17   P017         Right
18   P018         Right
19   P019         Right
20   P020          Left
21   P021     Ambiguous
22   P022     Ambiguous
23   P023     Ambiguous
24   P024     Ambiguous
25   P025         Right
26   P026     Ambiguous
27   P027     Ambiguous
28   P028          Left
29   P029         Right
30   P030         Right

Sample of t-test details (per participant x metric):



Unnamed: 0,iD,metric,left_mean,right_mean,mean_diff,p_value
0,#REF!,totalDistance,0.966538,1.133974,0.167436,3.634596e-09
1,#REF!,extent,0.357692,0.432265,0.074573,1.338599e-09
2,#REF!,thumbstickDistance,0.458376,0.360043,-0.098333,0.002884145
3,#REF!,thumbstickExtent,0.230427,0.213803,-0.016624,0.1746247
4,#REF!,triggerPressure,0.233889,0.275684,0.041795,0.0001053491
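
The first entry in the overview has `#REF!` as its `iD`, which is the string spreadsheet applications write for a broken cell reference, so those rows probably do not belong to a single real participant. A minimal sketch for flagging such rows before running the t-tests, assuming valid IDs follow the `P` plus three digits pattern seen in the rest of the output (the pattern and the helper name `flag_invalid_ids` are assumptions, not part of the notebook):

```python
import pandas as pd

def flag_invalid_ids(ids, pattern=r"P\d{3}"):
    """Return a boolean mask that is True where an ID does not fully match the pattern."""
    return ~ids.astype(str).str.fullmatch(pattern)

# Toy example; with the notebook's data the call would be flag_invalid_ids(df["iD"]).
toy_ids = pd.Series(["P001", "#REF!", "P002", "#REF!"])
invalid = flag_invalid_ids(toy_ids)
print(invalid.sum(), "rows with an unexpected iD")
```

Whether flagged rows should be dropped or repaired from the original spreadsheet depends on how the IDs were lost, which the notebook output alone does not show.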


In [4]:
clean_df = df.copy()
right_mask = dominant_hand_series == "Right"
left_mask = dominant_hand_series == "Left"

for left, right, _ in hand_pairs:
    clean_df.loc[right_mask, left] = np.nan
    clean_df.loc[left_mask, right] = np.nan

clean_df.head()


Unnamed: 0,iD,group,technique,trialBlock,trial,headTotalDistance,headExtent,leftTotalDistance,leftExtent,rightTotalDistance,...,rightThumbstickDistance,rightThumbstickExtent,leftTriggerPressure,rightTriggerPressure,leftGripPressure,rightGripPressure,buttonPressCount,movementVariability,targetEnteredCount,axisCrossedCount
0,P001,Impaired,Astral,1,1,0.9,0.32,1.0,0.35,,...,,,0.05,,0.03,,20,0.9,13,22
1,P001,Impaired,Chicken,1,2,1.7,0.65,0.5,0.22,,...,,,0.0,,0.0,,0,0.95,11,20
2,P001,Impaired,Grab,2,1,1.1,0.42,2.6,0.85,,...,,,0.8,,0.2,,6,1.05,14,28
3,P001,Impaired,Sliding,2,2,1.3,0.52,1.05,0.34,,...,,,0.06,,0.05,,38,0.92,13,24
4,P001,Impaired,Teleport,3,1,1.25,0.5,1.6,0.62,,...,,,0.85,,0.12,,18,1.0,15,30
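
Downstream analyses sometimes want one column per metric regardless of handedness. The notebook does not add such columns, but a minimal sketch of how one could be derived is shown here. The helper name `dominant_column` and the toy values are hypothetical, and leaving `Ambiguous` rows empty is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd

def dominant_column(frame, left, right, hand):
    """Pick the dominant-hand value per row; NaN where hand is neither Left nor Right."""
    values = pd.Series(np.nan, index=frame.index, dtype=float)
    values = values.mask(hand == "Left", frame[left])
    values = values.mask(hand == "Right", frame[right])
    return values

# Toy frame using the notebook's column naming (values are made up).
toy = pd.DataFrame({
    "leftExtent": [0.30, np.nan, 0.50],
    "rightExtent": [np.nan, 0.40, 0.60],
})
hand = pd.Series(["Left", "Right", "Ambiguous"])

dominant_extent = dominant_column(toy, "leftExtent", "rightExtent", hand)
print(dominant_extent)
```

In the notebook's terms, `hand` would be `dominant_hand_series`, and the helper would be called once per entry in `hand_pairs`.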


### Technique-specific blanking
- `Chicken`: blank all hand-related columns (left/right total distance and extent, thumbstick distance and extent, trigger and grip pressure, and head distance by hand).
- `Astral`, `Grab`, `Sliding`, `Teleport`, `Throw`: blank the head-related columns (`headTotalDistance` and `headExtent`).


In [5]:
hand_cols = [
    "leftTotalDistance", "rightTotalDistance",
    "leftExtent", "rightExtent",
    "leftHeadDistance", "rightHeadDistance",
    "leftThumbstickDistance", "rightThumbstickDistance",
    "leftThumbstickExtent", "rightThumbstickExtent",
    "leftTriggerPressure", "rightTriggerPressure",
    "leftGripPressure", "rightGripPressure",
]

head_cols = ["headTotalDistance", "headExtent"]

chicken_mask = clean_df["technique"] == "Chicken"
clean_df.loc[chicken_mask, hand_cols] = np.nan

head_mask = clean_df["technique"].isin(["Astral", "Grab", "Sliding", "Teleport", "Throw"])
clean_df.loc[head_mask, head_cols] = np.nan

print("Applied technique-specific blanking:")
print(f"Hand columns blanked for Chicken rows: {chicken_mask.sum()} rows")
print(f"Head columns blanked for Astral/Grab/Sliding/Teleport/Throw rows: {head_mask.sum()} rows")

clean_df.head()


Applied technique-specific blanking:
Hand columns blanked for Chicken rows: 69 rows
Head columns blanked for Astral/Grab/Sliding/Teleport/Throw rows: 345 rows


Unnamed: 0,iD,group,technique,trialBlock,trial,headTotalDistance,headExtent,leftTotalDistance,leftExtent,rightTotalDistance,...,rightThumbstickDistance,rightThumbstickExtent,leftTriggerPressure,rightTriggerPressure,leftGripPressure,rightGripPressure,buttonPressCount,movementVariability,targetEnteredCount,axisCrossedCount
0,P001,Impaired,Astral,1,1,,,1.0,0.35,,...,,,0.05,,0.03,,20,0.9,13,22
1,P001,Impaired,Chicken,1,2,1.7,0.65,,,,...,,,,,,,0,0.95,11,20
2,P001,Impaired,Grab,2,1,,,2.6,0.85,,...,,,0.8,,0.2,,6,1.05,14,28
3,P001,Impaired,Sliding,2,2,,,1.05,0.34,,...,,,0.06,,0.05,,38,0.92,13,24
4,P001,Impaired,Teleport,3,1,,,1.6,0.62,,...,,,0.85,,0.12,,18,1.0,15,30
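
A quick way to confirm the blanking took effect is to count the non-missing values left in the targeted columns. The sketch below uses a toy frame and a hypothetical helper, `count_leftovers`; with the notebook's data it would be called with `clean_df`, `chicken_mask` and `hand_cols`, or with `head_mask` and `head_cols`.

```python
import numpy as np
import pandas as pd

def count_leftovers(frame, row_mask, columns):
    """Count non-missing values left in `columns` for the rows selected by `row_mask`."""
    return int(frame.loc[row_mask, columns].notna().sum().sum())

# Toy frame with the notebook's column naming (values are made up).
toy = pd.DataFrame({
    "technique": ["Chicken", "Grab", "Chicken"],
    "headExtent": [0.65, np.nan, 0.70],
    "leftExtent": [np.nan, 0.85, np.nan],
})

is_chicken = toy["technique"] == "Chicken"
hand_left = count_leftovers(toy, is_chicken, ["leftExtent"])
head_left = count_leftovers(toy, ~is_chicken, ["headExtent"])
print(hand_left, head_left)
```

Both counts should be zero after a correct blanking pass; any other value points at a column missing from `hand_cols` or `head_cols`.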


In [6]:
clean_path = "data_cleaned.csv"
clean_df.to_csv(clean_path, index=False)
print(f"Saved cleaned data to {clean_path} without adding dominant-hand feature columns.")


Saved cleaned data to data_cleaned.csv without adding dominant-hand feature columns.
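
Blanked cells are written as empty fields by `to_csv` and read back as `NaN` by `read_csv` with default settings. The round trip can be sketched in memory with a toy frame (the frame and variable names are illustrative only):

```python
import io

import numpy as np
import pandas as pd

toy = pd.DataFrame({"technique": ["Chicken", "Grab"], "leftExtent": [np.nan, 0.85]})

# Write to an in-memory buffer instead of a file to keep the example self-contained.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Empty fields come back as NaN, so the blanking survives the save and reload.
reloaded = pd.read_csv(io.StringIO(csv_text))
print(reloaded["leftExtent"].isna().tolist())
```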
