# Clustering Offensive Playstyles

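We apply k-means clustering to group teams by offensive tendencies, here pass rate and efficiency measured as EPA (expected points added) per play. The team data in this notebook is randomly generated for illustration.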

## K-Means Objective
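Given observations $x_1, \dots, x_n$ and a chosen number of clusters $k$, k-means seeks cluster assignments and centroids that minimize the within-cluster sum of squared distances:
\[ \sum_{i=1}^n \|x_i - \mu_{c_i}\|^2 \]
where $c_i \in \{1, \dots, k\}$ is the cluster assigned to $x_i$ and $\mu_{c_i}$ is the centroid (mean) of that cluster.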
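
To make the objective concrete, here is a minimal NumPy sketch that evaluates the sum above for a given assignment. The helper name `within_cluster_ss` and the four toy points are illustrative assumptions, not part of the notebook's data.

```python
import numpy as np

def within_cluster_ss(X, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels, dtype=int)
    centroids = np.asarray(centroids, dtype=float)
    diffs = X - centroids[labels]  # row i is x_i - mu_{c_i}
    return float(np.sum(diffs ** 2))

def centroids_for(X, labels, k):
    """Centroid of each cluster is the mean of the points assigned to it."""
    return np.array([X[labels == j].mean(axis=0) for j in range(k)])

# Toy data (hypothetical): two tight pairs of points far apart.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

good = np.array([0, 0, 1, 1])  # groups nearby points together
bad = np.array([0, 1, 0, 1])   # pairs each point with a distant one

good_ss = within_cluster_ss(X, good, centroids_for(X, good, 2))
bad_ss = within_cluster_ss(X, bad, centroids_for(X, bad, 2))
print(good_ss, bad_ss)  # 4.0 100.0
```

The assignment that groups nearby points yields the smaller value, which is the kind of assignment k-means searches for. A fitted scikit-learn `KMeans` model reports this quantity as its `inertia_` attribute.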

In [None]:
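import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic data for 8 teams; the fixed seed makes the random draws reproducible.
np.random.seed(2)
teams = [f'Team {i}' for i in range(8)]
data = pd.DataFrame({
    'team': teams,
    'pass_rate': np.random.uniform(0.4, 0.7, 8),
    'epa_per_play': np.random.normal(0, 0.2, 8)
})

# Cluster on the two playstyle features only (the team name is not a feature).
features = data[['pass_rate', 'epa_per_play']]
# n_init is set explicitly because its default differs across scikit-learn versions.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = km.fit_predict(features)

# Attach each team's cluster label and show the result.
data['cluster'] = clusters
print(data)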


## Visualization
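With only two features, each team can be drawn as a point on a scatter plot, with pass rate on one axis and EPA per play on the other, colored by its cluster label. Teams in the same cluster have similar feature values, which we interpret as similar offensive styles.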
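
One way to draw such a plot is sketched below with matplotlib, which is an assumption here since the notebook does not import a plotting library. The block rebuilds the same synthetic data as the code cell above so that it runs on its own; the colormap, marker styles, and title are arbitrary illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Rebuild the synthetic team data and fit k-means as in the cell above.
np.random.seed(2)
data = pd.DataFrame({
    'team': [f'Team {i}' for i in range(8)],
    'pass_rate': np.random.uniform(0.4, 0.7, 8),
    'epa_per_play': np.random.normal(0, 0.2, 8),
})
features = data[['pass_rate', 'epa_per_play']]
km = KMeans(n_clusters=3, n_init=10, random_state=0)
data['cluster'] = km.fit_predict(features)

fig, ax = plt.subplots(figsize=(6, 4))

# One point per team, colored by cluster label.
ax.scatter(data['pass_rate'], data['epa_per_play'],
           c=data['cluster'], cmap='viridis', s=60)

# Mark the fitted centroids; columns follow the order of the feature columns.
centers = km.cluster_centers_
ax.scatter(centers[:, 0], centers[:, 1],
           marker='X', c='red', s=150, label='centroid')

# Label each point with its team name.
for _, row in data.iterrows():
    ax.annotate(row['team'], (row['pass_rate'], row['epa_per_play']),
                textcoords='offset points', xytext=(5, 5), fontsize=8)

ax.set_xlabel('pass_rate')
ax.set_ylabel('epa_per_play')
ax.set_title('Teams by offensive playstyle (k-means, k=3)')
ax.legend()
fig.tight_layout()
plt.show()
```

Because the two features have different units and spreads, it may be worth standardizing them before clustering on real data; the sketch keeps the raw values to match the cell above.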