danmontesi/DataMining-KTH

Data Mining

Repository for the laboratories of the Data Mining course (ID2222) at KTH

Description

Repository of all the deliverables developed for the KTH course ID2222 Data Mining, part of the Double Degree Master of Science in Computer Science and Engineering, Data Science & Distributed Systems track, at EIT Digital.

The course deliverables are five labs that implement data mining techniques for large datasets. They are developed in Python or PySpark.

Labs

  • Lab1 - LSH: implementation of document similarity with locality-sensitive hashing (LSH), using a sample corpus of 10 scientific papers
  • Lab2 - MinHashing & AssociationRules: implementation of association rule mining, including the Apriori algorithm
  • Lab3 - HyperBall: implementation of the HyperBall technique for approximate node centrality computation, based on the HyperBall paper
  • Lab4 - GraphSpectra: implementation of graph spectral clustering
  • Lab5 - JABEJA: JABEJA (swap-based) algorithm for partitioning large graphs into clusters of similar size while minimizing the ratio cut
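
As a rough illustration of the idea behind the document similarity lab, the sketch below estimates the Jaccard similarity of two texts with MinHash signatures. It is not the code from this repository: the function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_similarity`), the shingle length `k=5`, the 128 hash functions, and the use of seeded BLAKE2b hashes are all assumptions chosen for the example.

```python
import hashlib


def shingles(text, k=5):
    """Return the set of k-character shingles of a whitespace-normalized text.

    Texts shorter than k yield a single shingle containing the whole text.
    """
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(shingle, seed):
    """Seeded 64-bit hash; each seed stands in for one random permutation."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=128):
    """Signature = minimum hash value of the set under each seeded hash."""
    return [min(_hash(s, seed) for s in shingle_set) for seed in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching signature positions approximates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = "locality sensitive hashing finds similar documents quickly"
    doc_b = "locality sensitive hashing finds near duplicate documents"
    set_a, set_b = shingles(doc_a), shingles(doc_b)
    print("exact Jaccard:   ", jaccard(set_a, set_b))
    print("MinHash estimate:", estimate_similarity(
        minhash_signature(set_a), minhash_signature(set_b)))
```

With more hash functions the estimate gets closer to the exact Jaccard value, at the cost of longer signatures. An LSH step would then split each signature into bands so that only documents sharing a band are compared.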

Team members
