
A collection of seven clustering algorithms along with validity indices


thanSkourtan/Cluster-Analysis-Algorithms


Each algorithm category lives in the package named after it. Every algorithm comes with two scripts. The first is the module named after the algorithm, which contains its implementation. The second sits in the folder test of the algorithm's package (this folder is not a module, so it cannot be imported anywhere) and is named after the algorithm with the string “_test” appended. This second file contains a class of type TestCase, and each of the class's methods can be regarded as a unit test. Note, however, that although we borrow the structure of unit tests, these methods do not really test any function; they serve as entry points for running our scripts. An example makes this clearer.

To run the k-means algorithm on a dataset of 4 blobs, 2 features and 500 samples, open ‘./cost_function_optimization/tests/kmeans_test.py’ and run the first function, ‘testBlobs’. To enable it, comment out the line ‘@unittest.skip('no')’ located right above the test function; every test function has such a line. Inside the test function, the parameters of the synthetic data can be changed as desired. In the same way, the algorithms can be run on concentric circles with the function testCircles, or on moon-shaped data with the function testMoons. The same procedure applies to the test functions that execute the relative criteria and the image segmentation scripts. The image segmentation test functions are found only in the kmeans_test, fuzzy_test and possibilistic_test scripts.
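The skip mechanism described above can be sketched as follows. This is a minimal illustration that uses only the standard library, not the repository's actual test script: the class name `KMeansEntryPoints`, the helper `make_blobs_sketch` and the runner `run_entry_points` are hypothetical, and the real `testBlobs` presumably calls the k-means implementation rather than just checking the shape of the data.

```python
import random
import unittest


def make_blobs_sketch(n_samples=500, n_features=2, centers=4, spread=0.5, seed=0):
    """Hypothetical stand-in for the synthetic data set up inside a test function."""
    rng = random.Random(seed)
    blob_centers = [[rng.uniform(-10, 10) for _ in range(n_features)]
                    for _ in range(centers)]
    # Assign samples to blobs in round-robin order and add Gaussian noise.
    return [[rng.gauss(mu, spread) for mu in blob_centers[i % centers]]
            for i in range(n_samples)]


class KMeansEntryPoints(unittest.TestCase):
    # @unittest.skip('no')  # commented out, so this entry point runs
    def testBlobs(self):
        data = make_blobs_sketch(n_samples=500, n_features=2, centers=4)
        # The real script would run the algorithm on the data here.
        self.assertEqual(len(data), 500)
        self.assertEqual(len(data[0]), 2)

    @unittest.skip('no')  # still active, so this entry point is skipped
    def testCircles(self):
        self.fail('not reached while the skip decorator is active')


def run_entry_points():
    """Run the class above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(KMeansEntryPoints)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_entry_points()` executes only `testBlobs`; `testCircles` is reported as skipped until its decorator is commented out as well.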

Of course, one can set the test scripts aside and call the algorithm functions directly with one's own data. In other words, the test scripts are not part of the library; they are left there for convenience, so that the user finds a ready-made environment in which to execute the algorithms and reproduce the results included in this thesis.
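Calling an algorithm directly, without going through a test script, might look roughly like the sketch below. Since the signatures of the repository's functions are not shown here, `kmeans_sketch` is a hypothetical, simplified Lloyd-style k-means written only for this illustration; it is not the library's implementation, and the data in `my_data` is made up.

```python
import random


def kmeans_sketch(data, k, iterations=20, seed=0):
    """Illustrative Lloyd-style k-means; NOT the repository's implementation."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct samples from the data.
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, point in enumerate(data):
            distances = [sum((a - b) ** 2 for a, b in zip(point, c))
                         for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# The user's own data: two well-separated groups of 2-feature samples.
my_data = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centroids, labels = kmeans_sketch(my_data, k=2)
```

With the repository's own modules, the call to `kmeans_sketch` would be replaced by the corresponding algorithm function, using whatever parameters that function actually expects.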

For more information on the final results of running the algorithms, see the file thesis_final.pdf in this repository.
