# Cluster students given their exam grades

Based on the [Worked student grades k-modes example](https://codinginfinite.com/k-modes-clustering-algorithm-with-numerical-example/) notebook,
with student names from this [Random Name generator](https://randomwordgenerator.com/name.php).

## Import the modules we need

In [None]:
from kmodes.kmodes import KModes
import pandas as pd
import numpy as np

## Read the data from `data/studentGrades.csv`

In [None]:
# Use the "Student" column as the row index so only grade columns are clustered
df = pd.read_csv("data/studentGrades.csv", index_col=["Student"])
display(df)

## Configure the KModes model

1. `n_clusters=3` (number of clusters to form)
2. `random_state=42` (seed used by the RNG to find initial centres and to resolve ties)
3. `n_init=4` (rerun with 4 random starts, take the overall best)

In [None]:
model=KModes(n_clusters=3, random_state=42, n_init=4)
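
K-modes compares records by counting mismatched categories instead of measuring Euclidean distance. The following is a minimal plain-Python sketch of that simple matching dissimilarity, not the library's own code. The function name `matching_dissimilarity` and the third student are hypothetical, introduced only for illustration.

```python
def matching_dissimilarity(a, b):
    """Count the positions at which two equal-length grade lists differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


student_1 = ['A', 'B', 'A', 'C', 'B']
student_2 = ['C', 'A', 'B', 'B', 'A']
student_3 = ['A', 'B', 'A', 'B', 'B']  # hypothetical extra student

print(matching_dissimilarity(student_1, student_2))  # every grade differs
print(matching_dissimilarity(student_1, student_3))  # only one grade differs
```

A smaller count means two students have more grades in common, so they are more likely to end up in the same cluster.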

## Fit the data and display the resulting "centroids"

In [None]:
fittedModel = model.fit(df)  # fit() returns the fitted model itself
print("Cluster centroids - archetypal student grades")
print(fittedModel.cluster_centroids_)
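
The "centroids" are in quotes because they are not averages: each one holds the most frequent grade per exam among the students in that cluster. The sketch below shows the idea in plain Python, assuming a small hypothetical group of students. The helper `column_modes` is illustrative and not part of `kmodes`.

```python
from collections import Counter


def column_modes(rows):
    """Return the most frequent value in each column of a list of rows."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*rows)]


# Hypothetical cluster members, one row of five exam grades per student
group = [
    ['A', 'B', 'A', 'C', 'B'],
    ['A', 'B', 'B', 'C', 'B'],
    ['A', 'C', 'A', 'C', 'A'],
]

print(column_modes(group))
```

The example data has no tied columns. When two grades are equally frequent, this sketch keeps whichever was seen first, which may differ from how the library breaks ties.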

## Show how data has been assigned to clusters, given the fitted "centroids"

In [None]:
clusters = fittedModel.predict(df)
df["ClusterID"] = clusters
print("Allocation of students to clusters")
display(df)

## See how two new students would be assigned to these clusters, given their grades

In [None]:
unseenStudentGrades = [['A','B','A','C','B'], ['C','A','B','B','A']]
clusters = fittedModel.predict(unseenStudentGrades)
print("Allocation of new students to existing clusters")
print(clusters)
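
Conceptually, prediction assigns each new student to the centroid with the fewest mismatched grades. Here is a minimal sketch of that step in plain Python. The centroids are hypothetical placeholders rather than the output of the fitted model, and the helper names are illustrative only.

```python
def matching_dissimilarity(a, b):
    """Count the positions at which two equal-length grade lists differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_centroid(record, centroids):
    """Return the index of the centroid with the fewest mismatches."""
    distances = [matching_dissimilarity(record, c) for c in centroids]
    return distances.index(min(distances))  # ties go to the lowest index here


# Hypothetical centroids, NOT the ones produced by the fitted model
centroids = [
    ['A', 'A', 'A', 'B', 'B'],
    ['C', 'B', 'B', 'C', 'A'],
    ['B', 'C', 'C', 'A', 'C'],
]

unseen = [['A', 'B', 'A', 'C', 'B'], ['C', 'A', 'B', 'B', 'A']]
assignments = [nearest_centroid(student, centroids) for student in unseen]
print(assignments)
```

With the real model, the cluster IDs depend on the fitted centroids, so the printed allocation will generally differ from this illustration.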