Spectral Clustering

Implementation of Spectral Clustering in Apache Spark.
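For orientation, here is a single-machine sketch of spectral clustering using scikit-learn's `SpectralClustering`. It is not the Spark implementation in this repository. It assumes gamma is the RBF kernel coefficient, which matches scikit-learn's parameter of that name but is not confirmed for this script; the function name, sample count, and seed are also illustrative.

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs


def cluster_points(points, n_clusters=3, gamma=1.0, seed=0):
    """Cluster points with an RBF affinity (assumed meaning of gamma)."""
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="rbf",   # similarity = exp(-gamma * squared distance)
        gamma=gamma,
        random_state=seed,
    )
    # fit_predict returns one integer cluster label per input point
    return model.fit_predict(points)


# Illustrative input: 150 two-dimensional points drawn around 3 centers
points, _ = make_blobs(n_samples=150, centers=3, n_features=2, random_state=0)
labels = cluster_points(points, n_clusters=3, gamma=1.0)
```

The label values are arbitrary identifiers: the same grouping may come back with different numbers on another run or seed.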

The dataset is synthetic: it is generated on the fly with random number generators (specifically, scikit-learn's sample generators) and does not represent any real data.
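As a minimal sketch of how such a file could be produced, the snippet below uses scikit-learn's `make_blobs` and writes the points with `numpy.savetxt`. The function name, sample count, seed, and the space-separated, one-point-per-line layout are assumptions; the format that spectral_clustering.py actually expects is not documented here.

```python
import numpy as np
from sklearn.datasets import make_blobs


def write_blobs(path, n_samples=300, centers=3, seed=0):
    """Generate 2-D Gaussian blobs and save them as space-separated text."""
    # make_blobs returns the points and the index of the blob each came from
    points, labels = make_blobs(
        n_samples=n_samples,
        centers=centers,
        n_features=2,
        random_state=seed,
    )
    # One point per line, written as "x y" (assumed layout)
    np.savetxt(path, points, fmt="%.6f")
    return points, labels
```

Calling `write_blobs("blobs.txt")` would write 300 points to a file in the current directory; the blob labels are returned but not saved.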

Matplotlib is used to plot the clusters.
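A plot of this kind could look roughly like the sketch below. It assumes two-dimensional points and one integer label per point; the function name, output file name, and styling are illustrative and are not taken from spectral_clustering.py.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def plot_clusters(points, labels, out_path="clusters.png"):
    """Scatter-plot 2-D points, one color per cluster label."""
    points = np.asarray(points)
    labels = np.asarray(labels)
    fig, ax = plt.subplots()
    for k in np.unique(labels):
        mask = labels == k  # select the points assigned to cluster k
        ax.scatter(points[mask, 0], points[mask, 1], s=12, label=f"cluster {k}")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)  # release the figure's memory
    return out_path
```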

How to Run

$ spark-submit ~/absolute/path/to/the/directory/spectral_clustering.py 3 10 1.0 a5_data/blobs.txt

Argument 1 is the number of clusters
Argument 2 is the upper bound
Argument 3 is the value of gamma
Argument 4 is the path to the input data file
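The positional arguments in the command above could be read along the lines of the following sketch. The `Args` type, the `parse_args` helper, the field names, and the choice of `int` for the upper bound are hypothetical; the script's real parsing code may differ. The fourth positional value in the example command is treated here as the data file path.

```python
from typing import NamedTuple


class Args(NamedTuple):
    n_clusters: int   # argument 1: number of clusters
    upper_bound: int  # argument 2: upper bound (integer type is an assumption)
    gamma: float      # argument 3: value of gamma
    data_path: str    # fourth value in the example command


def parse_args(argv):
    """Convert the positional arguments (without the script name) to typed values."""
    if len(argv) != 4:
        raise SystemExit(
            "usage: spectral_clustering.py <clusters> <upper_bound> <gamma> <data_file>"
        )
    return Args(int(argv[0]), int(argv[1]), float(argv[2]), argv[3])
```

In a script this would typically be called as `parse_args(sys.argv[1:])`, so that the script name itself is skipped.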

Note: spark-submit must be on your PATH.

nitinsaroha/Spectral-Clustering-on-Apache-Spark

Folders and files

Latest commit

History

Repository files navigation

Spectral Clustering

How to Run

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages