# Exploration – Synthetic Data for Stroke Execution Quality

This notebook explores synthetic stroke-related features
and their relationship to execution quality labels.

The goal is to validate the problem formulation and check that the features
are useful before introducing machine learning models.


In [None]:
import numpy as np
import pandas as pd

In [None]:
np.random.seed(42)  # fix the seed so the synthetic sample is reproducible
N = 300             # number of synthetic strokes

data = pd.DataFrame({
    "swing_speed": np.random.normal(30, 5, N),        # km/h
    "racket_angle": np.random.normal(0, 10, N),       # deviation from ideal (degrees)
    "time_pressure": np.random.normal(0.35, 0.05, N), # seconds
    "body_balance": np.random.uniform(0.5, 1.0, N),   # normalized
    "shuttle_height": np.random.normal(1.8, 0.3, N),  # meters
})

data.head()

## Feature Semantics

- **swing_speed**: a proxy for force generation and offensive intent (km/h).
- **racket_angle**: signed deviation from the ideal racket face angle at contact (degrees).
- **time_pressure**: a proxy for temporal constraint during stroke execution,
  reflecting how rushed the player is when initiating the stroke (seconds).
- **body_balance**: overall body stability and coordination at execution (normalized, 0.5 to 1.0).
- **shuttle_height**: contact point height, describing the spatial context of execution (meters).
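
The semantics above imply some physical bounds: speeds, times, and heights should not be negative, and `body_balance` should stay within its normalized range. The following is a minimal sketch of such a range check, assuming the same sampling distributions as the data-generation cell. The `expected_ranges` dictionary and its bounds are illustrative assumptions introduced here, not values defined elsewhere in the notebook.

```python
import numpy as np
import pandas as pd

# Recreate the synthetic sample so this cell is self-contained.
np.random.seed(42)
N = 300
data = pd.DataFrame({
    "swing_speed": np.random.normal(30, 5, N),
    "racket_angle": np.random.normal(0, 10, N),
    "time_pressure": np.random.normal(0.35, 0.05, N),
    "body_balance": np.random.uniform(0.5, 1.0, N),
    "shuttle_height": np.random.normal(1.8, 0.3, N),
})

# Hypothetical plausibility bounds (assumptions for illustration only).
expected_ranges = {
    "swing_speed": (0.0, np.inf),
    "time_pressure": (0.0, np.inf),
    "body_balance": (0.5, 1.0),
    "shuttle_height": (0.0, np.inf),
}

# Count the samples that fall outside each (inclusive) range.
out_of_range = {
    col: int((~data[col].between(lo, hi)).sum())
    for col, (lo, hi) in expected_ranges.items()
}
print(out_of_range)
```

Because the normal distributions are unbounded, a non-zero count here would suggest clipping or resampling before the features are used further.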

In [None]:
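# Synthetic execution score: a linear combination of the features with hand-set weights.
score = (
    0.08 * data["swing_speed"]              # faster swings raise the score
    - 1.2 * data["time_pressure"]           # larger time_pressure values lower it
    + 1.5 * data["body_balance"]            # better balance raises it
    - 0.05 * np.abs(data["racket_angle"])   # deviation in either direction lowers it
    + 0.6 * data["shuttle_height"]          # a higher contact point raises it
)

data["execution_score"] = score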
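
Because the features live on different scales, the raw weights do not show which term actually drives the score. One rough way to compare them is the standard deviation of each weighted term. The sketch below assumes the same sampling distributions and weights as the cells above; `terms` and `spread` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Recreate the synthetic sample so this cell is self-contained.
np.random.seed(42)
N = 300
data = pd.DataFrame({
    "swing_speed": np.random.normal(30, 5, N),
    "racket_angle": np.random.normal(0, 10, N),
    "time_pressure": np.random.normal(0.35, 0.05, N),
    "body_balance": np.random.uniform(0.5, 1.0, N),
    "shuttle_height": np.random.normal(1.8, 0.3, N),
})

# One column per weighted term of the score formula.
terms = pd.DataFrame({
    "swing_speed": 0.08 * data["swing_speed"],
    "time_pressure": -1.2 * data["time_pressure"],
    "body_balance": 1.5 * data["body_balance"],
    "racket_angle": -0.05 * np.abs(data["racket_angle"]),
    "shuttle_height": 0.6 * data["shuttle_height"],
})

# Summing the terms row by row reproduces the execution score.
score = terms.sum(axis=1)

# Spread of each term: a larger value means more influence on score variation.
spread = terms.std().sort_values(ascending=False)
print(spread)
```

Under these distributions the `time_pressure` term has an expected spread of about 1.2 × 0.05 = 0.06, well below the other terms, so it has little influence on the score despite its comparatively large weight.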

In [None]:
# Median split: strokes scoring above the median get label 1, the rest get 0.
threshold = data["execution_score"].median()
data["quality_label"] = (data["execution_score"] > threshold).astype(int)

data[["execution_score", "quality_label"]].head()
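
A first read on feature usefulness can come from the class balance and from each feature's linear correlation with the label. The sketch below assumes the same data generation, score, and median split as the cells above. The `abs_racket_angle` column is a helper introduced here, since the score depends on the magnitude of the angle rather than its sign.

```python
import numpy as np
import pandas as pd

# Recreate the synthetic sample, score, and label so this cell is self-contained.
np.random.seed(42)
N = 300
data = pd.DataFrame({
    "swing_speed": np.random.normal(30, 5, N),
    "racket_angle": np.random.normal(0, 10, N),
    "time_pressure": np.random.normal(0.35, 0.05, N),
    "body_balance": np.random.uniform(0.5, 1.0, N),
    "shuttle_height": np.random.normal(1.8, 0.3, N),
})
data["execution_score"] = (
    0.08 * data["swing_speed"]
    - 1.2 * data["time_pressure"]
    + 1.5 * data["body_balance"]
    - 0.05 * np.abs(data["racket_angle"])
    + 0.6 * data["shuttle_height"]
)
threshold = data["execution_score"].median()
data["quality_label"] = (data["execution_score"] > threshold).astype(int)

# Class balance after the median split.
label_counts = data["quality_label"].value_counts()
print(label_counts)

# Feature columns only, plus the angle magnitude as a hypothetical helper column.
features = data.drop(columns=["execution_score", "quality_label"])
features["abs_racket_angle"] = features["racket_angle"].abs()

# Pearson correlation of each feature with the binary label.
label_corr = features.corrwith(data["quality_label"]).sort_values()
print(label_corr)
```

The signed `racket_angle` should correlate only weakly with the label, while `abs_racket_angle` should correlate negatively. This is a reminder that a purely linear model on the raw features would miss that term.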