HCA-DBSCAN

This is a proof-of-concept implementation for a new density based clustering algortihm, called the HCA-DBSCAN. Paper

Please refer our associated paper for a detailed explanation of the HCA-DBSCAN algorithm https://goo.gl/eVBCkX

Abstract

Density based clustering has proven to be very efficient and has found numerous applications across domains. The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is capable of finding clusters of varied shapes and is not sensitive to outliers in the data. In this paper we propose a new clustering algorithm, the HyperCube based Accelerated DBSCAN(HCA-DBSCAN) which runs in O(n log n) bettering the O(n^2 ) complexity of the original DBSCAN algorithm without compromising on clustering accuracy. While DBSCAN can achieve the same complexity when spatial indexing is used, that complexity again grows to O(n^2) for higher dimensional data; which does not happen with our approach. We use a combination of distance based aggregation by overlaying the data with customized grids and use representative points to reduce the number of comparisons. Experimental results show that the proposed algorithm achieves a significant run time speed up of upto 52.57% when compared to other improvements that try to reduce the complexity of DBSCAN to O(n log n). Another significant improvement is that we eliminate the need for one of the two input parameters needed by the original DBSCAN algorithm thus making it more accessible to non expert users.

How to Run

Run the visualization.py to test the code on various 2-dimensional data sets and see the results by visulaizing the graphs of the clusters detected in the data

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
Algo.png		Algo.png
HCA-DBSCAN.pdf		HCA-DBSCAN.pdf
Iris.csv		Iris.csv
README.md		README.md
Visualization.py		Visualization.py
dbscan.py		dbscan.py
graph for iris.png		graph for iris.png
main.py		main.py
spherical_6_2_modified 1.csv		spherical_6_2_modified 1.csv
spherical_6_2_modified.csv		spherical_6_2_modified.csv
spherical_6_2_modified_1.csv		spherical_6_2_modified_1.csv
spherical_6_2_modified_300pt.csv		spherical_6_2_modified_300pt.csv
spherical_trace.png		spherical_trace.png
test.py		test.py
tracing iris.png		tracing iris.png
vinu_dataset.csv		vinu_dataset.csv
vinu_dataset1.csv		vinu_dataset1.csv

vin101/HCA-DBSCAN

Folders and files

Latest commit

History

Repository files navigation

HCA-DBSCAN

This is a proof-of-concept implementation for a new density based clustering algortihm, called the HCA-DBSCAN. Paper

Abstract

How to Run

About

Resources

Stars

Watchers

Forks

Languages