- Clustering is an unsupervised machine learning technique
- It groups similar observations when we have no labels that identify the groups
- It is often a preprocessing or an exploratory step in the data science pipeline
- It answers the question: what groupings already exist in the data?
- Text: Document classification, summarization, topic modeling, recommendations
- Geographic: Crime zones, housing prices
- Marketing: Customer segmentation, market research
- Anomaly Detection: Account takeover, security risk, fraud
- Image Processing: Radiology, security
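
To make the idea of grouping unlabeled observations concrete, here is a minimal sketch using k-means, one common clustering algorithm. These notes do not name a specific algorithm, so k-means is an assumption chosen for illustration. The `kmeans` helper, the evenly spaced centroid seeding, and the `customers` data are likewise hypothetical.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Tiny k-means sketch: returns (centroids, cluster index per point)."""
    # Assumption: seed centroids with k evenly spaced points so the result
    # is deterministic. Real implementations usually seed randomly.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each observation to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its assigned observations.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments


# Hypothetical unlabeled data: (monthly visits, average spend) per customer.
customers = [(1, 20), (2, 22), (1, 19), (9, 80), (10, 85), (8, 78)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels were supplied, yet the low-spend and high-spend customers land in separate groups, which is the kind of customer segmentation listed under Marketing. In practice a library implementation would likely replace this hand-rolled loop.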