# Face Clustering

Face clustering is an unsupervised learning task that groups unlabeled face images so that each cluster corresponds to one unique person. I have created a module, `FacialClustering`, that clusters faces using two methods, `DBSCAN` and `chinese_whispers`, with the facial embedding models from the previous notebooks.

In [1]:
# The module takes a FacePreprocess instance as input
# I designed it this way since we need to load the SSD model and weights
from modules.FacePreprocess import FacePreprocess
ssd_model = r'./models/ssd/deploy.prototxt.txt'
ssd_weights = r'./models/ssd/res10_300x300_ssd_iter_140000.caffemodel'
processor = FacePreprocess(ssd_model, ssd_weights)

## Initialize the `FacialClustering` module

In [2]:
from modules.FacialClustering import FacialClustering

# set input paths --> make sure that every image inside the directory ends with '.jpg' or '.png'
input_paths = [
    r'.\dataset'  # raw string, so the backslash is never read as an escape
]
output_path = r'.\output\face_clustering'
cluster = FacialClustering(
    pathlist = input_paths, 
    processor = processor, 
    out_path = output_path,
    preprocess = True, # the images in our dataset haven't been preprocessed, so set this to True
)

If the module loads correctly, you should see a `log.txt` file inside your output directory. This file records all the clustering parameters used.
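
The comment in the cell above asks that every input image end with `.jpg` or `.png`. A minimal sketch of a pre-flight check follows. `find_unsupported_files` is a hypothetical helper, not part of `FacialClustering`, and the check is case-sensitive on the assumption that the module expects lowercase extensions.

```python
from pathlib import Path

# Extensions named in the notebook comment; treated as case-sensitive here (assumption).
ALLOWED_EXTENSIONS = {'.jpg', '.png'}

def find_unsupported_files(directory):
    """Return a sorted list of files under `directory` that do not end in .jpg or .png."""
    root = Path(directory)
    return sorted(
        p for p in root.rglob('*')
        if p.is_file() and p.suffix not in ALLOWED_EXTENSIONS
    )
```

Calling `find_unsupported_files(path)` for each entry in `input_paths` before building the cluster object should return an empty list when the dataset is ready.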

## Method 1: Chinese Whispers

References:
- https://github.com/zhly0/facenet-face-cluster-chinese-whispers-/blob/master/clustering.py 
- https://en.wikipedia.org/wiki/Chinese_whispers_(clustering_method) 

In [4]:
cluster.chinese_whispers(
    FE = 'kv-resnet50',
    threshold = 0.75, # min distance between clusters
    iterations = 100, # number of iterations
    saveas = False, # save a copy of the clustered faces
)
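
To make the idea behind the method concrete, here is a minimal sketch of Chinese Whispers on a set of embeddings. It is not the code inside `FacialClustering`. It assumes non-zero embedding vectors, uses cosine similarity, and treats `threshold` as the minimum similarity needed to connect two faces, which may differ from how the module interprets its `threshold` argument. The function name `chinese_whispers_sketch` and the `seed` parameter are hypothetical.

```python
import numpy as np

def chinese_whispers_sketch(embeddings, threshold=0.75, iterations=100, seed=0):
    """Cluster row-vector embeddings; returns one integer label per row."""
    rng = np.random.default_rng(seed)
    x = np.asarray(embeddings, dtype=float)
    # L2-normalize so that the dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = x @ x.T
    # Keep an edge only where similarity exceeds the threshold (assumption); no self-loops.
    weights = np.where(sim > threshold, sim, 0.0)
    np.fill_diagonal(weights, 0.0)

    n = len(x)
    labels = np.arange(n)  # every node starts in its own cluster
    for _ in range(iterations):
        changed = False
        # Visit the nodes in a fresh random order on every pass.
        for i in rng.permutation(n):
            neighbors = np.nonzero(weights[i])[0]
            if neighbors.size == 0:
                continue  # an isolated node keeps its own label
            # Sum the edge weights per neighboring label and adopt the heaviest one.
            totals = {}
            for j in neighbors:
                totals[labels[j]] = totals.get(labels[j], 0.0) + weights[i, j]
            best = max(totals, key=totals.get)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # labels are stable, so stop early
    return labels
```

Each face starts as its own cluster and repeatedly adopts the label that carries the most edge weight among its neighbors, so the number of clusters is never specified up front. Because nodes are visited in random order, results can vary between runs unless the seed is fixed.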