In this notebook, we use NLP methods and kernel PCA to build a clustering model. The data preparation covers the following steps:
- Column selection
- Null values
- Duplicate values
- Lowercasing and removing newlines, tabs, numbers, and punctuation
- Stopwords removal
- Lemmatization
- Label encoding
- TF-IDF vectorization
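The text-cleaning steps in the list above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code: the names `STOPWORDS`, `clean_text`, and `prepare` are hypothetical, the stop-word list is deliberately tiny, and lemmatization is left out because it needs an external language resource.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; a real run would use a full one.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "of", "to", "in"}


def clean_text(text: str) -> str:
    """Lowercase, remove newlines/tabs, numbers and punctuation, drop stop words."""
    text = text.lower()
    text = re.sub(r"[\n\t]+", " ", text)  # newlines and tabs become spaces
    text = re.sub(r"\d+", " ", text)      # remove numbers
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)


def prepare(rows):
    """Drop null and duplicate texts, then clean the rest (order preserved)."""
    seen = set()
    cleaned = []
    for text in rows:
        if text is None or text in seen:
            continue
        seen.add(text)
        cleaned.append(clean_text(text))
    return cleaned


docs = prepare([
    "The cat is\tin the house!",
    None,                          # null value, dropped
    "The cat is\tin the house!",   # duplicate, dropped
    "3 dogs\nand 2 birds.",
])
print(docs)  # ['cat house', 'dogs birds']
```

Duplicates are detected on the raw text here; deduplicating after cleaning instead would also merge rows that differ only in case or punctuation.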
Next, we apply kernel PCA to reduce the dimensionality of the vectorized data
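A minimal sketch of this step with scikit-learn, assuming the input is a TF-IDF matrix. The toy documents, the RBF kernel, and `n_components=2` are assumptions made for illustration; the notebook's actual settings are not stated here.

```python
from sklearn.decomposition import KernelPCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy, already-cleaned documents (assumed for illustration only).
docs = [
    "cat house pet",
    "dog house pet",
    "stock market price",
    "market price trade",
]

# TF-IDF turns each document into a weighted term vector (one row per document).
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Kernel PCA projects the vectors onto a few nonlinear components.
# The kernel and the number of components are assumptions, not the notebook's values.
kpca = KernelPCA(n_components=2, kernel="rbf")
X_reduced = kpca.fit_transform(X.toarray())

print(X.shape)          # (4, 8)
print(X_reduced.shape)  # (4, 2)
```

Converting to a dense array with `toarray()` is fine for a small corpus; for a large vocabulary the memory cost of the dense matrix may become the limiting factor.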
We first apply the elbow method to choose the best K (the number of clusters), and then build the model
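The elbow method could be sketched as below. This assumes K-means as the clustering algorithm, which the text does not confirm, and uses synthetic blobs in place of the reduced features; `best_k = 3` is picked by hand for this toy data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the reduced features (assumption: three true clusters).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)

# Fit K-means for a range of K and record the inertia
# (sum of squared distances from each point to its cluster centre).
inertias = {}
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))

# The "elbow" is the K after which inertia stops dropping sharply.
# Here it is chosen by hand for the toy data.
best_k = 3
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
labels = model.labels_
```

In practice the inertia values are usually plotted against K and the elbow is read off the curve; picking it is a judgment call rather than an exact computation.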
The score for this task is 0.81
We finish the task with final analyses