Stable KMeans clustering using the Monte Carlo simulation, applied on NAPS and NAPS-BE datasets

kburnik/naps-clustering

NAPS & NAPS BE Stable Clustering via Monte-Carlo Simulation

This is a snapshot of the NAPS & NAPS BE stable clustering solution using the Monte-Carlo method.

The datasets are not included in this solution and should be obtained separately from the Nencki Institute (http://en.nencki.gov.pl/).

About the project

The project is structured as a Python library (module).

The main program (entry point) is clustering/analysis.py. It contains the snippets used to generate the tables and graphs for the final work; the remaining files are parts of the library.

For further research, it is recommended to start by adding a snippet to clustering/analysis.py and, if necessary, to create a separate project or script that imports the existing modules.

The results

The results are saved in the clustering/out directory. Most results are produced through a caching mechanism: long-running computations are stored once, so plots and other code that does not affect those results can be modified without recomputing them. Cache files are recognizable by the suffix .cached-result.json.
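As an illustration of the caching idea described above, here is a minimal sketch of a JSON-backed result cache. The helper name `cached_result`, its signature, and the file layout are assumptions made for this example and are not taken from the project's code; only the `.cached-result.json` suffix comes from this README.

```python
import json
import os
import tempfile


def cached_result(name, compute, out_dir):
    """Return compute()'s result, reusing a JSON cache file when one exists.

    Hypothetical helper: the name, signature, and file layout are
    assumptions, not the project's actual API.
    """
    path = os.path.join(out_dir, name + ".cached-result.json")
    if os.path.exists(path):
        # Cache hit: skip the expensive computation entirely.
        with open(path) as f:
            return json.load(f)
    result = compute()  # Must return something JSON-serializable.
    os.makedirs(out_dir, exist_ok=True)
    with open(path, "w") as f:
        json.dump(result, f)
    return result


# Demo with a stand-in for a long-running computation.
calls = []


def slow_clustering():
    calls.append(1)  # Record each real computation.
    return {"labels": [0, 1, 1, 0]}


demo_dir = tempfile.mkdtemp()
first = cached_result("demo", slow_clustering, demo_dir)
second = cached_result("demo", slow_clustering, demo_dir)  # Served from cache.
```

With a scheme like this, deleting a cache file is enough to force the corresponding result to be recomputed on the next run.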

The clustering results are saved in CSV files, with one blank line between consecutive partitions. For programmatic use, it is recommended to read the partition indices from the corresponding .cached-result.json file instead. The order there is the same as the order of the corresponding input data, and the values denote the partition indices.
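The two representations described above can be handled with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the column contents of the CSV files are not specified in this README, and the exact JSON structure of the cache files is not shown here, so the second function simply takes a plain list of partition indices in input order.

```python
import csv


def read_partitions(csv_text):
    """Split CSV text into partitions separated by blank lines.

    Assumes no quoted field spans several lines (each line is parsed alone).
    """
    partitions = []
    current = []
    for line in csv_text.splitlines():
        if line.strip() == "":
            # A blank line closes the current partition, if any.
            if current:
                partitions.append(current)
                current = []
            continue
        current.append(next(csv.reader([line])))
    if current:
        partitions.append(current)
    return partitions


def group_by_partition(labels):
    """Map each partition index to the positions of its input rows."""
    groups = {}
    for position, label in enumerate(labels):
        groups.setdefault(label, []).append(position)
    return groups


# Made-up sample data, for illustration only.
sample_csv = "a,1\nb,2\n\nc,3\n"
parts = read_partitions(sample_csv)
groups = group_by_partition([0, 1, 0])
```

Here `parts` holds two partitions (two rows, then one row), and `groups` maps partition 0 to input rows 0 and 2 and partition 1 to input row 1.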

Installation

Windows:

Note: after installing the above, everything else needs to be done in Git Bash.

Linux:

  • The steps are almost the same: install the software with apt-get and pip. The MSVC++ 2015 Redistributable is not required.

Fetching the code and preparing the development environment

cd ~
git clone https://github.com/kburnik/naps-clustering

cd ~/naps-clustering
virtualenv venv -p /c/python3X/python.exe  # Adjust the path to your Python 3.X installation

Initializing the environment (git bash & python virtualenv)

This needs to be done once at the beginning of each terminal session.

cd ~/naps-clustering
. venv/Scripts/activate # . venv/bin/activate on Linux

Running

# Without arguments, the program prints the available snippets.
python -u clustering/analysis.py

# Run the first snippet.
python -u clustering/analysis.py 1
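
For readers who want to add their own snippet, the following sketch shows one way a numbered snippet runner of this kind could be organized. It is not the actual content of clustering/analysis.py: the `snippet` decorator, the `SNIPPETS` list, the `main` function, and both placeholder snippets are hypothetical names introduced for illustration.

```python
import sys

SNIPPETS = []  # Hypothetical registry; snippets are numbered from 1.


def snippet(func):
    """Register a function as a runnable snippet."""
    SNIPPETS.append(func)
    return func


@snippet
def describe_dataset():
    # Placeholder body; a real snippet would load data and print a table.
    print("describe_dataset ran")


@snippet
def plot_clusters():
    # Placeholder body; a real snippet would render a plot.
    print("plot_clusters ran")


def main(argv):
    """List the snippets when no index is given, otherwise run one."""
    if len(argv) < 2:
        for number, func in enumerate(SNIPPETS, start=1):
            print(f"{number}: {func.__name__}")
        return 0
    index = int(argv[1])
    if not 1 <= index <= len(SNIPPETS):
        print(f"No snippet {index}", file=sys.stderr)
        return 1
    SNIPPETS[index - 1]()
    return 0


# Mirrors the two invocations shown above.
main(["analysis.py"])       # Lists the registered snippets.
main(["analysis.py", "1"])  # Runs the first snippet.
```

In a layout like this, adding a new experiment only requires defining one more decorated function; it then appears in the listing with the next free number.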
