## <div style="text-align: center; font-size: 100px;">Introduction to ensemble clustering</div>

## Idea and purpose of cluster analysis

<figure>
    <img src="k-means-clustering.png" width="900" height="600" alt="K-means Clustering">
    <figcaption>Source: https://www.javatpoint.com/k-means-clustering-algorithm-in-machine-learning</figcaption>
</figure>

Cluster analysis is an example of unsupervised machine learning. It focuses on detecting the internal structure of the data by grouping objects so that each cluster contains objects that are similar to each other.

* Cluster - a group of objects in a dataset that are similar to each other.
* Partition - a division of the entire dataset into a finite number $K$ of subsets (clusters), where each data point belongs to exactly one subset.
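
The definition of a partition can be checked mechanically. The snippet below is a minimal sketch: the helper names `is_partition` and `to_labels` and the toy data are illustrative assumptions, not part of any library. It also shows the label-vector form (one cluster label per object) that is convenient for the later steps.

```python
def is_partition(clusters, n_objects):
    """Return True if `clusters` (a list of lists of object indices)
    covers objects 0..n_objects-1 with no overlaps and no empty cluster."""
    seen = [idx for cluster in clusters for idx in cluster]
    no_empty = all(len(cluster) > 0 for cluster in clusters)
    # Every object must appear exactly once across all clusters.
    return no_empty and sorted(seen) == list(range(n_objects))


def to_labels(clusters, n_objects):
    """Convert a list of clusters into a label vector (one label per object)."""
    labels = [None] * n_objects
    for label, cluster in enumerate(clusters):
        for idx in cluster:
            labels[idx] = label
    return labels


clusters = [[0, 1, 4], [2, 3]]             # K = 2 clusters over 5 objects
print(is_partition(clusters, 5))           # True
print(is_partition([[0, 1], [1, 2]], 3))   # False: object 1 is in two clusters
print(to_labels(clusters, 5))              # [0, 0, 1, 1, 0]
```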

## An overview of the most important clustering algorithm types

A common division of cluster analysis algorithms includes the following:
- partitioning methods,
- hierarchical methods,
- others, such as methods based on probability distributions and methods using graph theory.

<center><img src="Cluster_types.png"/></center>
<div style="font-size: 16px;">Source: Gao, C. X., Wang, S., Zhu, Y., Ziou, M., Teo, S. M., Smith, C. L., ... & Dwyer, D. (2024). Ensemble clustering: A practical tutorial.</div>

## Potential problems of clustering 
* Applying two different clustering algorithms to the same dataset may produce completely different results.
* Evaluating the performance of clustering models is relatively challenging.
* Basic clustering algorithms often struggle to detect unusual cluster shapes.

To tackle these problems, <span style="font-weight: bold;">ensemble clustering</span> can be used.

## What is ensemble clustering (a.k.a. consensus clustering)?
Ensemble clustering creates multiple partitions of a given dataset and uses the relationships observed between them to determine a single final partition. The goal is to find a more reliable, improved solution than an individual clustering would give.
A clustering ensemble always consists of two steps:
* Generate $M$ partitions $P = \{P_1, P_2, \dots, P_M\}$ using selected clustering algorithms.
* Combine the obtained partitions into one final partition $P^{*}$ using a consensus function $\Gamma$.
<br>
<center><img src="clusters_graph.png" style="width: 70%; height: 50%;"/></center>
<div style="font-size: 16px;">Source: Vega-Pons, Sandro, and José Ruiz-Shulcloper. "A survey of clustering ensemble algorithms." International Journal of Pattern Recognition and Artificial Intelligence 25.03 (2011): 337-372.</div>
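
The two steps can be sketched as a small pipeline. Everything here is an illustrative assumption: the threshold "clusterers" are stand-ins for real clustering algorithms, and `intersection_consensus` is a deliberately naive consensus function (objects stay together only if every partition groups them together), chosen because it fits in a few lines.

```python
def make_threshold_clusterer(cut):
    """Stand-in base clusterer (an assumption): split 1-D values at `cut`."""
    def cluster(xs):
        return [0 if x < cut else 1 for x in xs]
    return cluster


def intersection_consensus(partitions):
    """Naive consensus: two objects share a final cluster only if
    they share a cluster in every partition of the ensemble."""
    ids = {}
    return [ids.setdefault(signature, len(ids)) for signature in zip(*partitions)]


def cluster_ensemble(data, base_clusterers, consensus):
    # Step 1 (generation): one partition per base clusterer.
    partitions = [cluster(data) for cluster in base_clusterers]
    # Step 2 (consensus): combine the M partitions into a single one.
    return partitions, consensus(partitions)


data = [1.0, 1.2, 5.0, 5.1, 9.0, 9.3]
base = [make_threshold_clusterer(3.0), make_threshold_clusterer(7.0)]
partitions, final = cluster_ensemble(data, base, intersection_consensus)
print(partitions)  # [[0, 0, 1, 1, 1, 1], [0, 0, 0, 0, 1, 1]]
print(final)       # [0, 0, 1, 1, 2, 2]
```

In this toy case neither two-cluster partition alone separates all three groups, while the combined result does.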



## Properties that an ensemble clustering algorithm should possess
* <span style="font-weight: bold;">Robustness</span> - the combined result should have better average performance than the individual clustering algorithms,
* <span style="font-weight: bold;">Consistency</span> - the combined result should be close to the results of all the individual clustering algorithms,
* <span style="font-weight: bold;">Novelty</span> - ensemble clustering should be able to reach results that are unattainable by standard clustering algorithms,
* <span style="font-weight: bold;">Stability</span> - the results obtained should be less sensitive to noise and outliers.

## First step - Generation mechanism
This step creates multiple diverse partitions (base clusterings), which are then used to obtain a consensus. The main objective is to generate a variety of clustering results by using different approaches and parameters. This helps to highlight different structural aspects of the dataset and can lead to better results.

## Key methods of generating partitions

<center><img src="BC.png"/></center>
<div style="font-size: 16px;">Source: Vega-Pons, Sandro, and José Ruiz-Shulcloper. "A survey of clustering ensemble algorithms." International Journal of Pattern Recognition and Artificial Intelligence 25.03 (2011): 337-372.</div>


## Different Objects Representations
Data objects in a dataset can be represented in multiple ways by transforming or selecting different features. These representations can reveal different characteristics and patterns within the data. For example, in one partition, features can be transformed using a logarithmic scale.
* <span style="font-weight: bold;">Advantage:</span> Transformed features can mitigate the impact of outliers, normalize distributions, and highlight different patterns in the data.
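
A minimal sketch of the logarithmic example, assuming non-negative features and using `log(1 + x)`. The helper name `log_representation` and the data are hypothetical. Each representation would then be passed to a base clustering algorithm to obtain one partition.

```python
import math


def log_representation(rows, columns):
    """Return a copy of `rows` with log(1 + x) applied to the given columns."""
    return [
        [math.log1p(v) if j in columns else v for j, v in enumerate(row)]
        for row in rows
    ]


# The second feature contains one extreme value.
rows = [[1.0, 10.0], [2.0, 100.0], [3.0, 10000.0]]
transformed = log_representation(rows, columns={1})

raw_range = max(r[1] for r in rows) - min(r[1] for r in rows)
log_range = max(r[1] for r in transformed) - min(r[1] for r in transformed)
print(raw_range)   # 9990.0
print(log_range)   # much smaller: the extreme value no longer dominates
```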

## Different Clustering Algorithms

Apply multiple clustering algorithms to the same dataset to obtain diverse partitions.
* <span style="font-weight: bold;">Advantage:</span> The unique properties and assumptions of each algorithm can lead to a broader exploration of clusters.
<br>
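
As a hedged illustration, the sketch below runs two simplified 1-D algorithms on the same data: a basic k-means (Lloyd's algorithm with a fixed, evenly spaced initialization) and single-linkage clustering, which in one dimension amounts to cutting the $K-1$ largest gaps between sorted points. Both functions and the data are assumptions made for this example, not library code.

```python
def kmeans_1d(xs, k=2, n_iter=100):
    """Lloyd's algorithm in 1-D (k >= 2); centroids start evenly spaced
    between the minimum and maximum value."""
    lo, hi = min(xs), max(xs)
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in xs]
        if new_labels == labels:   # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [x for x, l in zip(xs, labels) if l == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


def single_linkage_1d(xs, k=2):
    """Single linkage in 1-D: cut the k-1 largest gaps between sorted points."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    gaps = [(xs[order[i + 1]] - xs[order[i]], i) for i in range(len(order) - 1)]
    cut_after = {i for _, i in sorted(gaps, reverse=True)[: k - 1]}
    labels, current = [0] * len(xs), 0
    for pos, idx in enumerate(order):
        labels[idx] = current
        if pos in cut_after:
            current += 1
    return labels


# An evenly spaced chain 0..20 plus one point slightly further away.
xs = [float(i) for i in range(21)] + [22.0]
print(kmeans_1d(xs))          # splits the chain near the middle
print(single_linkage_1d(xs))  # isolates the last point
```

k-means favours two compact groups, while single linkage follows the largest gap, so the two partitions differ on part of the data.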

## Different Parameters Initialization

Use the same clustering algorithm with different initial parameters to generate different partitions.
* <span style="font-weight: bold;">Advantage:</span> Exploring the sensitivity to initial conditions can increase the robustness and accuracy of the final clustering result.
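
A minimal sketch of this idea, assuming a plain k-means (Lloyd's algorithm) whose initial centroids are drawn at random from the data. The function `kmeans` and the toy points are hypothetical. Only the random seed changes between runs, so different runs may converge to different partitions.

```python
import random


def kmeans(points, k, seed, n_iter=50):
    """Lloyd's k-means on 2-D points; initial centroids are k random data points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 5.0), (0.0, 5.2)]
# M = 5 partitions from the same algorithm, differing only in the seed.
ensemble = [kmeans(points, k=3, seed=s) for s in range(5)]
for labels in ensemble:
    print(labels)
```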

## Projection to Subspaces
This technique projects the data into different subspaces. It works on the entire dataset and includes, for example, dimensionality reduction.
* <span style="font-weight: bold;">Advantage:</span> The method is particularly useful for high-dimensional datasets, as it helps to capture different structural aspects of the data that may not be visible in the original feature space.
<br>
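
One possible way to obtain several subspaces is a random linear projection. This is an assumption made for this sketch, since the text does not prescribe a specific projection method. The hypothetical helper `random_projection` maps every object onto a few random Gaussian directions, and each projected view would then be clustered separately.

```python
import random


def random_projection(rows, n_components, seed):
    """Project rows (lists of floats) onto `n_components` random Gaussian directions."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    directions = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                  for _ in range(n_components)]
    return [[sum(w * x for w, x in zip(direction, row)) for direction in directions]
            for row in rows]


rows = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 0.0, 5.0], [9.0, 8.0, 7.0, 6.0]]
# Three different 2-D views of the same three objects (all objects are kept).
views = [random_projection(rows, n_components=2, seed=s) for s in range(3)]
for view in views:
    print(view)
```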

## Different Subsets of Objects

This approach generates multiple partitions using different subsets of the objects (e.g. randomly selected). It works on subsets of the dataset while keeping the feature space unchanged.
* <span style="font-weight: bold;">Advantage:</span> Suitable for large datasets, because clustering each subset reduces the computational load.
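
A minimal sketch of subsampling, under stated assumptions: `subsample_partition` is a hypothetical helper, the base clusterer is a trivial stand-in that splits 1-D values at their mean, and objects that were not sampled receive the label `None`. A consensus function applied afterwards has to cope with these missing labels.

```python
import random


def subsample_partition(data, cluster_fn, fraction, seed):
    """Cluster a random subset of `data`; objects left out get the label None."""
    rng = random.Random(seed)
    n = len(data)
    chosen = sorted(rng.sample(range(n), max(1, int(fraction * n))))
    sub_labels = cluster_fn([data[i] for i in chosen])
    labels = [None] * n
    for i, label in zip(chosen, sub_labels):
        labels[i] = label
    return labels


def mean_split_clusterer(xs):
    """Stand-in base clusterer (an assumption): split 1-D values at their mean."""
    mean = sum(xs) / len(xs)
    return [0 if x < mean else 1 for x in xs]


data = [1.0, 1.1, 0.9, 8.0, 8.2, 7.9, 1.2, 8.1]
ensemble = [subsample_partition(data, mean_split_clusterer, fraction=0.75, seed=s)
            for s in range(4)]
for labels in ensemble:
    print(labels)
```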

## Second step - Consensus Functions
Consensus functions are mathematical or algorithmic procedures that combine the clustering results into one final solution, i.e. they transform the $M$ partitions in the ensemble $P$ into a final partition $P^*$.

There are two main methods for obtaining consensus:
* using information about which cluster labels were assigned to the observations in each partition. This approach is called <span style="font-weight: bold;">Median Partition</span> and looks for the partition that maximises the similarity to all partitions in the cluster ensemble.
* using information on how often different objects were assigned to the same cluster (known as <span style="font-weight: bold;">Co-Occurrence</span>).
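
Both ideas can be sketched on a toy ensemble. The choices below are assumptions made for illustration. `co_association` stores, for each pair of objects, the fraction of partitions that put them in the same cluster. `best_member` approximates the median partition by searching only among the ensemble members and by using the Rand index as the similarity measure, whereas a full median-partition search considers all possible partitions.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    n, m = len(partitions[0]), len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]


def rand_index(a, b):
    """Share of object pairs on which two partitions agree (together or apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def best_member(partitions):
    """Ensemble member with the highest total Rand index to all members."""
    return max(partitions, key=lambda p: sum(rand_index(p, q) for q in partitions))


partitions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],   # same grouping as the first one, labels swapped
    [0, 0, 0, 1],
]
for row in co_association(partitions):
    print([round(v, 2) for v in row])
# [1.0, 1.0, 0.33, 0.0]
# [1.0, 1.0, 0.33, 0.0]
# [0.33, 0.33, 1.0, 0.67]
# [0.0, 0.0, 0.67, 1.0]
print(best_member(partitions))  # [0, 0, 1, 1]
```

Note that both functions compare groupings rather than label values, so the swapped labels in the second partition do not matter.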


## Main categories of consensus functions
* Direct Methods (Majority Voting) - use the assigned cluster labels directly and apply different majority voting schemes.
* Feature-Based Methods
* Methods based on the co-association matrix
* Graph-based methods
* Information-theoretic and mixture-model methods

<center><img src="consensus_types.png"/></center>
<div style="text-align: center; font-size: 16px;">
    Source: Gao, C. X., Wang, S., Zhu, Y., Ziou, M., Teo, S. M., Smith, C. L., ... & Dwyer, D. (2024). Ensemble clustering: A practical tutorial.
</div>
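
A direct method needs the labels of different partitions to be comparable before votes can be counted. The sketch below is one simple variant under stated assumptions: every partition uses the same number of clusters `k`, labels are aligned to the first partition by brute force over all label permutations (feasible only for small `k`; an assignment solver such as the Hungarian algorithm is a common alternative), and each object then receives its most frequent label.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, k):
    """Relabel `labels` with the permutation of 0..k-1 that best matches `reference`."""
    best = max(permutations(range(k)),
               key=lambda perm: sum(perm[l] == r for l, r in zip(labels, reference)))
    return [best[l] for l in labels]


def majority_vote(partitions, k):
    """Align every partition to the first one, then take the most common label per object."""
    reference = partitions[0]
    aligned = [align_labels(reference, p, k) for p in partitions]
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


partitions = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 2],   # same grouping, labels 0 and 1 swapped
    [0, 0, 1, 2, 2],   # disagrees on the fourth object
]
print(majority_vote(partitions, k=3))  # [0, 0, 1, 1, 2]
```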

Advantages and disadvantages of ensembles

Describe K-means

Co-association matrix + the first method based on it
From the matrix to the spectral ensemble

matrix -> graph -> threshold

First method in Marcin's part: batch majority