# Interval P2Clust
This notebook shows how to use the "Interval P2Clust" module.

## Definition of inputs and problem formalization

In [1]:
import pandas as pd
from modular_parts.preference import compute_preference_indices
from modular_parts.clustering import cluster_using_interval_p2clust
from core.enums import Direction, GeneralCriterion


alternatives = [f"a{i}" for i in range(1, 8)]
profiles = [f"p{i}" for i in range(1, 4)]
criteria = [f"c{i}" for i in range(1, 4)]
criteria_directions = pd.Series([Direction.MAX, Direction.MIN, Direction.MAX], index=criteria)
criteria_weights = pd.Series([0.3, 0.2, 0.5], index=criteria)
generalised_criteria = pd.Series([GeneralCriterion.U_SHAPE,
                                  GeneralCriterion.V_SHAPE_INDIFFERENCE,
                                  GeneralCriterion.USUAL], index=criteria)
preference_thresholds = pd.Series([2, 10, None], index=criteria)
indifference_thresholds = pd.Series([1, 5, None], index=criteria)
standard_deviations = pd.Series([None, None, None], index=criteria)  # None, because no criterion uses GeneralCriterion.GAUSSIAN

alternatives_performances = pd.DataFrame([[15, 83, 21],
                                          [10, 90, 15],
                                          [11, 75, 20],
                                          [18, 59, 20],
                                          [17, 60, 28],
                                          [22, 44, 15],
                                          [13, 62, 22]], index=alternatives, columns=criteria)
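
To make the role of the thresholds and weights more concrete, the sketch below evaluates the three generalised criteria used above for the pair (a1, a2). It assumes the textbook PROMETHEE definitions of the usual, U-shape, and V-shape-with-indifference preference functions; the helper names are hypothetical, and the library's own implementation may differ in details such as how threshold boundaries are treated.

```python
# Illustrative helpers (hypothetical names, not part of the library).
# They follow the textbook PROMETHEE preference functions.

def usual(d):
    """Any positive advantage counts as full preference."""
    return 1.0 if d > 0 else 0.0


def u_shape(d, q):
    """Full preference once the advantage exceeds the indifference threshold q."""
    return 1.0 if d > q else 0.0


def v_shape_indifference(d, p, q):
    """Zero up to q, linear between q and p, full preference from p onwards."""
    if d <= q:
        return 0.0
    if d >= p:
        return 1.0
    return (d - q) / (p - q)


a1 = {"c1": 15, "c2": 83, "c3": 21}
a2 = {"c1": 10, "c2": 90, "c3": 15}
weights = {"c1": 0.3, "c2": 0.2, "c3": 0.5}

# Advantage of a1 over a2 on each criterion; c2 is minimised, so the sign flips.
d_c1 = a1["c1"] - a2["c1"]
d_c2 = a2["c2"] - a1["c2"]
d_c3 = a1["c3"] - a2["c3"]

partial = {
    "c1": u_shape(d_c1, q=1),
    "c2": v_shape_indifference(d_c2, p=10, q=5),
    "c3": usual(d_c3),
}

# Weighted sum of the partial preferences of a1 over a2.
preference_a1_over_a2 = sum(weights[c] * partial[c] for c in weights)
print(partial)
print(round(preference_a1_over_a2, 2))
```

Under these assumptions a1 is fully preferred to a2 on c1 and c3 and partially preferred on c2, where the advantage of 7 falls between the indifference threshold (5) and the preference threshold (10).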

## Usage of Interval P2Clust

### Comparison between n_categories = 2 and n_categories = 3

#### n_categories = 2

In [2]:
assignments, central_profiles, global_quality_index = cluster_using_interval_p2clust(alternatives_performances,
                                                                                    preference_thresholds,
                                                                                    indifference_thresholds,
                                                                                    standard_deviations,
                                                                                    generalised_criteria,
                                                                                    criteria_directions,
                                                                                    criteria_weights,
                                                                                    n_categories=2)

In [3]:
assignments

a1    C1
a2    C1
a3    C1
a4    C2
a5    C2
a6    C1
a7    C1
dtype: object

In [4]:
central_profiles

|    | c1   | c2   | c3   |
|----|------|------|------|
| C1 | 14.2 | 70.8 | 18.6 |
| C2 | 17.5 | 59.5 | 24.0 |
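
One way to read these central profiles: in this example they coincide with the per-cluster means of the alternatives' performances. The sketch below checks that with plain pandas, using the assignments printed above. This is an observation about the output shown here, not a claim about how the library computes the profiles internally.

```python
import pandas as pd

alternatives = [f"a{i}" for i in range(1, 8)]
criteria = [f"c{i}" for i in range(1, 4)]

performances = pd.DataFrame(
    [[15, 83, 21],
     [10, 90, 15],
     [11, 75, 20],
     [18, 59, 20],
     [17, 60, 28],
     [22, 44, 15],
     [13, 62, 22]],
    index=alternatives, columns=criteria)

# Assignments copied from the output for n_categories = 2.
assignments = pd.Series(["C1", "C1", "C1", "C2", "C2", "C1", "C1"],
                        index=alternatives)

# Average the performances of the alternatives within each cluster.
cluster_means = performances.groupby(assignments).mean()
print(cluster_means)
```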


In [5]:
global_quality_index

1.4285714285714286

#### n_categories = 3

In [6]:
assignments, central_profiles, global_quality_index = cluster_using_interval_p2clust(alternatives_performances,
                                                                                    preference_thresholds,
                                                                                    indifference_thresholds,
                                                                                    standard_deviations,
                                                                                    generalised_criteria,
                                                                                    criteria_directions,
                                                                                    criteria_weights,
                                                                                    n_categories=3)

In [7]:
assignments

a1    C2
a2    C1
a3    C2
a4    C3
a5    C3
a6    C2
a7    C2
dtype: object

In [8]:
central_profiles

|    | c1    | c2   | c3   |
|----|-------|------|------|
| C1 | 10.0  | 90.0 | 15.0 |
| C2 | 15.25 | 66.0 | 19.5 |
| C3 | 17.5  | 59.5 | 24.0 |


In [9]:
global_quality_index

1.066921606118547

The global quality index is higher for n_categories = 2 (about 1.43) than for n_categories = 3 (about 1.07), so for this example the best number of clusters is 2.
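
The comparison can be automated when more candidate values are involved. The sketch below is a minimal helper that assumes, in line with the conclusion above, that a higher global quality index is better. The name `best_number_of_clusters` is hypothetical, and the demo replays the two index values recorded in this notebook instead of calling the library.

```python
from typing import Callable, Dict, Iterable, Tuple


def best_number_of_clusters(
    quality_for: Callable[[int], float],
    candidates: Iterable[int],
) -> Tuple[int, Dict[int, float]]:
    """Return the candidate with the highest quality index, plus all scores.

    Assumption: a higher global quality index means a better clustering.
    """
    scores = {n: quality_for(n) for n in candidates}
    best_n = max(scores, key=scores.get)
    return best_n, scores


# Values recorded from the two runs shown in this notebook.
recorded = {2: 1.4285714285714286, 3: 1.066921606118547}

best_n, scores = best_number_of_clusters(recorded.__getitem__, [2, 3])
print(best_n, scores)
```

In a live session, `quality_for` could be a small function that calls `cluster_using_interval_p2clust` with the inputs defined earlier and the given `n_categories`, and returns the third element of the result.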

### Change the max_iterations parameter
This parameter limits the number of iterations, which is useful for large data sets with many alternatives. For an example this small the limit is artificial; it is set to 1 here only to show its effect.

In [33]:
assignments, central_profiles, global_quality_index = cluster_using_interval_p2clust(alternatives_performances,
                                                                                    preference_thresholds,
                                                                                    indifference_thresholds,
                                                                                    standard_deviations,
                                                                                    generalised_criteria,
                                                                                    criteria_directions,
                                                                                    criteria_weights,
                                                                                    n_categories=2,
                                                                                    max_iterations=1)

In [34]:
assignments

a1      C1
a2      C1
a3      C1
a4      C2
a5      C2
a6    C1C2
a7      C1
dtype: object

In [36]:
central_profiles

|    | c1    | c2   | c3   |
|----|-------|------|------|
| C1 | 12.25 | 77.5 | 19.5 |
| C2 | 17.5  | 59.5 | 24.0 |


In [37]:
global_quality_index

1.3454117647058825
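
With a single iteration, a6 carries the label C1C2 and the global quality index (about 1.35) is lower than in the unrestricted run with two clusters (about 1.43). The sketch below shows one way to pick out such alternatives with plain pandas. It assumes that a label which is not a single cluster name, such as C1C2, denotes an assignment to an interval of clusters; that reading is an interpretation of the output above. It also checks that, in this output, the central profiles equal the means of the alternatives assigned to exactly one cluster.

```python
import pandas as pd

alternatives = [f"a{i}" for i in range(1, 8)]
criteria = [f"c{i}" for i in range(1, 4)]

performances = pd.DataFrame(
    [[15, 83, 21], [10, 90, 15], [11, 75, 20], [18, 59, 20],
     [17, 60, 28], [22, 44, 15], [13, 62, 22]],
    index=alternatives, columns=criteria)

# Outputs copied from the max_iterations=1 run above.
assignments = pd.Series(["C1", "C1", "C1", "C2", "C2", "C1C2", "C1"],
                        index=alternatives)
central_profiles = pd.DataFrame(
    [[12.25, 77.5, 19.5], [17.5, 59.5, 24.0]],
    index=["C1", "C2"], columns=criteria)

# A label is "precise" if it names exactly one existing cluster.
is_precise = assignments.isin(central_profiles.index)
interval_assigned = assignments[~is_precise]
print(interval_assigned)

# Means over precisely assigned alternatives only.
precise_means = performances[is_precise].groupby(assignments[is_precise]).mean()
max_gap = (precise_means - central_profiles).abs().max().max()
print(max_gap)
```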