szuvarska/NLPClustering

NLPClustering

Project for the Introduction to Machine Learning course

Data: https://www.kaggle.com/datasets/amrwael/nlp-project-fcis-23

The aim of this project was to cluster the provided dataset, a collection of 20,000 documents. We preprocessed the data (stop-word removal, lemmatization, vectorization, etc.) and built several models, focusing on the KMeans method. We also interpreted the final clusters; the interpretation is in our presentation: Presentation/presentation.pdf.
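To illustrate the kind of pipeline described above, here is a minimal sketch of stop-word removal, TF-IDF vectorization, and k-means clustering. It is not the project's code: it uses only the Python standard library so that it runs without extra dependencies, the stop-word list and the four sample documents are made up for the example, the helper names (`tokenize`, `tfidf_vectors`, `kmeans`) are hypothetical, and lemmatization is left out because it needs an external language resource.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the project's list).
STOPWORDS = {"a", "an", "and", "the", "in", "into", "to", "of", "our", "is"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (dense lists) and the vocabulary."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}  # smoothed IDF
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors, vocab


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up sample documents, only for demonstration.
docs = [
    "The rocket launch carried a satellite into orbit",
    "NASA plans another rocket launch to orbit the moon",
    "The hockey team won the game in overtime",
    "Our team lost the hockey game last night",
]
vectors, vocab = tfidf_vectors(docs)
labels, _ = kmeans(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

On a dataset of 20,000 documents, a library implementation with sparse matrices (for example scikit-learn's `TfidfVectorizer` and `KMeans`) would be the more practical choice; the dense lists here are only meant to keep each step visible.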

Authors
