# Choosing the perfect city for a short vacation with indecisive friends and uncertain criteria
## Training part II: Introducing uncertainty in the criteria
The scenario is the same as in Part I of the training. This time, however, the criteria are treated as uncertain values, each modeled by a Probability Density Function (PDF).
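
As a minimal sketch of what modeling a criterion with a PDF means, the snippet below draws random samples of the Cost criterion from a normal distribution, using the `Cost_mean` and `Cost_std` values of the input matrix loaded in section 1. It uses plain NumPy rather than ProMCDA, and the number of samples and the seed are arbitrary assumptions.

```python
import numpy as np

# Mean and standard deviation of Cost per destination (values from the input matrix)
cost_mean = {"Alps": 350, "Tuscany": 300, "Amsterdam": 400, "Barcelona": 380}
cost_std = {"Alps": 30, "Tuscany": 20, "Amsterdam": 50, "Barcelona": 40}

rng = np.random.default_rng(seed=42)  # arbitrary seed, for reproducibility only
n_samples = 10_000  # assumed number of Monte Carlo draws

# One array of sampled costs per destination
cost_samples = {
    name: rng.normal(loc=cost_mean[name], scale=cost_std[name], size=n_samples)
    for name in cost_mean
}

for name, samples in cost_samples.items():
    print(f"{name}: mean={samples.mean():.1f}, std={samples.std():.1f}")
```

Each destination is then represented by a cloud of plausible cost values instead of a single number, which is what later propagates into the spread of the MCDA scores.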

### Outline
1. Load input matrix - with uncertainties - and set preferences
2. Instantiate ProMCDA
3. Normalization
4. Aggregation
5. Sensitivity analysis

In [11]:
import os
import sys
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from plotly.io import show

In [12]:
import promcda
from utilities import *
from promcda.models.ProMCDA import ProMCDA
from promcda.enums import PDFType, NormalizationFunctions, AggregationFunctions

### 1. Load data

In [13]:
data = pd.read_csv("data/matrix_probabilistic.csv")

Note: the column that holds the alternatives (here "Destination") must be set as the index of the DataFrame.

In [14]:
data.set_index(data.columns[0], inplace=True)
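
An alternative is to set the index while reading the file, through the `index_col` argument of `pd.read_csv`. The sketch below uses a small in-memory CSV (a subset of the columns shown in this notebook) as a stand-in for `data/matrix_probabilistic.csv`, so that it runs on its own.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV file (subset of columns, for illustration only)
csv_text = """Destination,Cost_mean,Cost_std
Alps,350,30
Tuscany,300,20
Amsterdam,400,50
Barcelona,380,40
"""

# index_col=0 makes the first column the index in a single step
demo = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(demo.index.name)
print(demo.columns.tolist())
```

This avoids the in-place mutation of `set_index(..., inplace=True)` and leaves only the criteria as columns.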

In [15]:
data.head()

| Destination | Cost_mean | Cost_std | Time_mean | Time_std | Rain_min | Rain_max | Activities_lambda | CO2_mean | CO2_std | Preference |
|---|---|---|---|---|---|---|---|---|---|---|
| Alps | 350 | 30 | 5 | 0.5 | 30 | 50 | 8 | 20 | 5 | 7 |
| Tuscany | 300 | 20 | 3 | 0.3 | 15 | 25 | 10 | 15 | 3 | 8 |
| Amsterdam | 400 | 50 | 6 | 1.0 | 60 | 80 | 12 | 80 | 20 | 6 |
| Barcelona | 380 | 40 | 4 | 0.7 | 40 | 60 | 15 | 50 | 15 | 9 |


In [16]:
data.columns

Index(['Cost_mean', 'Cost_std', 'Time_mean', 'Time_std', 'Rain_min',
       'Rain_max', 'Activities_lambda', 'CO2_mean', 'CO2_std', 'Preference'],
      dtype='object')

### 2. Instantiate ProMCDA

#### Preference (weights)

In [17]:
preference = "travel time"

In [18]:
# Run the setup and store parameters in a variable
setup = setup_robustness(data, preference)
# Check the setup parameters
setup

{'input_matrix':              Cost_mean  Cost_std  Time_mean  Time_std  Rain_min  Rain_max  \
 Destination                                                                 
 Alps               350        30          5       0.5        30        50   
 Tuscany            300        20          3       0.3        15        25   
 Amsterdam          400        50          6       1.0        60        80   
 Barcelona          380        40          4       0.7        40        60   
 
              Activities_lambda  CO2_mean  CO2_std  Preference  
 Destination                                                    
 Alps                         8        20        5           7  
 Tuscany                     10        15        3           8  
 Amsterdam                   12        80       20           6  
 Barcelona                   15        50       15           9  ,
 'polarity': ('-', '-', '-', '+', '-', '+'),
 'marginal_distributions': (<PDFType.NORMAL: 'normal'>,
  <PDFType.LOGNORMAL: 

In [19]:
promcda = ProMCDA(
    input_matrix=setup['input_matrix'],
    polarity=setup['polarity'],
    marginal_distributions=setup['marginal_distributions'],
    weights=setup['weights'],
    robustness=setup['robustness'],
)

INFO: 2025-06-26 17:13:10,659 - ProMCDA - Alternatives are: ['Alps', 'Tuscany', 'Amsterdam', 'Barcelona']


### 3. Normalization

In [20]:
# Min-Max normalization, consistent with the variable name and with the commentary on the results
minmax = promcda.normalize(NormalizationFunctions.MINMAX)

INFO: 2025-06-26 17:13:13,514 - ProMCDA - Number of alternatives: 4
INFO: 2025-06-26 17:13:13,515 - ProMCDA - Number of indicators: 6
INFO: 2025-06-26 17:13:13,516 - ProMCDA - Polarities are checked: ('-', '-', '-', '+', '-', '+')


ValueError: too many values to unpack (expected 2)

In [None]:
minmax

In [None]:
# Uncomment to inspect the normalized values for every sampled run (the output is a very long dictionary)
# promcda.get_normalized_values_with_robustness()
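
For intuition, Min-Max normalization with polarity can be sketched in plain pandas as below. This is not ProMCDA's internal code: the helper name `minmax_with_polarity` is hypothetical, and the example works on the mean values of three criteria only, ignoring the uncertainty.

```python
import pandas as pd

def minmax_with_polarity(df: pd.DataFrame, polarity: dict) -> pd.DataFrame:
    """Scale each column to [0, 1]; flip '-' columns so that 1 is always the best value."""
    # Note: a constant column would give a zero range and produce NaN values
    scaled = (df - df.min()) / (df.max() - df.min())
    for column, sign in polarity.items():
        if sign == "-":
            scaled[column] = 1 - scaled[column]
    return scaled

# Mean values of three criteria, taken from the input matrix
means = pd.DataFrame(
    {"Cost": [350, 300, 400, 380], "Time": [5, 3, 6, 4], "Preference": [7, 8, 6, 9]},
    index=["Alps", "Tuscany", "Amsterdam", "Barcelona"],
)
polarity = {"Cost": "-", "Time": "-", "Preference": "+"}

normalized = minmax_with_polarity(means, polarity)
print(normalized.round(2))
```

With a `-` polarity, the cheapest and fastest destination receives a normalized value of 1, so that higher is better for every criterion before aggregation.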

### 4. Aggregation

In [None]:
promcda.aggregate(AggregationFunctions.WEIGHTED_SUM)

In [None]:
average_scores, normalized_avg_scores, std = promcda.get_aggregated_values_with_robustness_indicators()
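
Conceptually, aggregation under uncertainty repeats a weighted sum for every sampled input matrix and then summarizes the resulting scores by their mean and standard deviation. The sketch below illustrates that idea with NumPy on random, already normalized values; the weights, the number of runs, and the array names are assumptions made for this example, not ProMCDA internals.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_runs, n_alternatives, n_criteria = 1000, 4, 3

# Stand-in for normalized indicator values in [0, 1]: one matrix per Monte Carlo run
normalized_runs = rng.uniform(0, 1, size=(n_runs, n_alternatives, n_criteria))

# Hypothetical weights favouring the second criterion, rescaled to sum to 1
weights = np.array([1.0, 3.0, 1.0])
weights = weights / weights.sum()

# Weighted sum per run and per alternative -> shape (n_runs, n_alternatives)
scores = normalized_runs @ weights

# Summary statistics across runs, one value per alternative
mean_scores = scores.mean(axis=0)
std_scores = scores.std(axis=0)
print(mean_scores.round(3), std_scores.round(3))
```

The per-alternative standard deviation computed this way is the kind of quantity that the error bars in the plot represent.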

In [None]:
alt_names = ["Alps", "Tuscany", "Amsterdam", "Barcelona"]

In [None]:
plot_bar_with_std(average_scores, std, alt_names)

<div style="border: 1px solid #ccc; padding: 10px; border-radius: 5px; background-color: #f9f9f9;">

<b>MCDA ranking of travel destinations under Travel Time preference, with uncertainty</b><br><br>

The plot shows the <b>mean MCDA scores</b> for each destination, including <b>uncertainty</b> represented as ± standard deviation error bars. The analysis used a <b>weighted sum aggregation</b> and <b>Min-Max normalization</b>, with a higher weight assigned to <b>Travel Time</b> (shorter time = better score).<br><br>

<b>Observations:</b>
<ul>
<li><b>Tuscany</b> stands out with the <b>highest average score</b> and the <b>smallest uncertainty</b> (narrow error bar). This reflects both its good performance across criteria and the fact that its input variables were modeled with relatively low variability. Tuscany is a robust best alternative.</li>

<li><b>Amsterdam</b> shows the <b>lowest score</b> and a <b>larger uncertainty</b>, mainly due to its long travel time and the use of input variables with higher variance.</li>

<li><b>Barcelona</b> and <b>Alps</b> fall in between, with <b>moderate scores and moderate uncertainties</b>.</li>
</ul>

This highlights how both the <b>performance data</b> and the <b>uncertainty assumptions</b> behind the criteria influence the final decision and its robustness.

</div>
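
One way to make "robust best alternative" concrete is to count how often each alternative ranks first across Monte Carlo runs. The sketch below does this with NumPy on synthetic scores; the score means and standard deviations are invented for illustration and are not the values produced in this notebook.

```python
import numpy as np

alt_names = ["Alps", "Tuscany", "Amsterdam", "Barcelona"]

# Synthetic per-alternative score distributions (illustrative values only)
score_means = np.array([0.55, 0.80, 0.30, 0.60])
score_stds = np.array([0.08, 0.04, 0.12, 0.08])

rng = np.random.default_rng(seed=1)
n_runs = 5000
scores = rng.normal(score_means, score_stds, size=(n_runs, len(alt_names)))

# Index of the best-scoring alternative in each run
winners = scores.argmax(axis=1)

# Share of runs won by each alternative
win_share = np.bincount(winners, minlength=len(alt_names)) / n_runs
for name, share in zip(alt_names, win_share):
    print(f"{name}: ranked first in {share:.1%} of runs")
```

An alternative that wins in nearly all runs is robust to the assumed uncertainty; overlapping error bars between two alternatives usually show up here as a split share of first places.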

<div style="border: 1px solid #ccc; padding: 10px; border-radius: 5px; background-color: #f9f9f9;">

<b>Effect of different Probability Density Functions (PDFs) on data variability in MCDA</b><br><br>

Different assumptions about input uncertainty (modeled through various PDFs) impact the variability of MCDA outputs. The following table summarizes the key characteristics of each PDF type and their typical influence on the spread (standard deviation) of results:<br><br>

| <b>PDF Type</b> | <b>Main Characteristic</b> | <b>Effect on Output Variability (Std)</b> |
|---|---|---|
| <b>Normal</b> | Symmetric distribution centered around the mean, variance easily controlled | Moderate variability, evenly spread around the mean |
| <b>Lognormal (NotNormal)</b> | Skewed distribution with a long right tail | Often leads to higher variability, especially with larger mean or std values |
| <b>Uniform</b> | Equal probability for all values within a fixed range | Variability limited to the specified range; depends on how wide the interval is |
| <b>Poisson</b> | Discrete distribution of counts, with mean equal to lambda (event rate) | Variance equals lambda, so the standard deviation grows as the square root of lambda; sampled values remain non-negative integers |
| <b>Exact</b> | Fixed, deterministic value with no randomness | No added uncertainty from this variable |

<br>

<b>Key Insights:</b><br>
- Distributions like <b>Lognormal</b> and <b>Poisson</b> can amplify input variability, resulting in a larger spread in MCDA outputs.<br>
- <b>Normal</b> and <b>Uniform (with a narrow range)</b> lead to more controlled and predictable variability.<br>
- <b>Exact</b> inputs contribute no additional uncertainty.<br><br>
</div>
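
The spread implied by each distribution can be checked numerically. The sketch below draws samples with NumPy and compares the empirical standard deviation with the textbook formula for each distribution. The normal, uniform, and Poisson parameters are borrowed from the Barcelona row of the input matrix; the lognormal parameters are arbitrary, because this notebook does not show how ProMCDA parameterizes the lognormal.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n = 200_000

samples = {
    "normal": rng.normal(loc=380, scale=40, size=n),          # Cost_mean, Cost_std
    "uniform": rng.uniform(low=40, high=60, size=n),          # Rain_min, Rain_max
    "poisson": rng.poisson(lam=15, size=n),                   # Activities_lambda
    "lognormal": rng.lognormal(mean=0.0, sigma=0.5, size=n),  # arbitrary parameters
}

# Textbook standard deviations for the same parameters
theoretical_std = {
    "normal": 40.0,
    "uniform": (60 - 40) / np.sqrt(12),
    "poisson": np.sqrt(15),
    "lognormal": np.sqrt((np.exp(0.5**2) - 1) * np.exp(0.5**2)),
}

for name, values in samples.items():
    print(f"{name}: empirical std={values.std():.3f}, theoretical std={theoretical_std[name]:.3f}")
```

Note that NumPy's `lognormal` takes the mean and sigma of the underlying normal distribution, so its parameters are not directly comparable to a `mean`/`std` pair given on the original scale.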

### 5. Sensitivity analysis

In [None]:
norm = promcda.normalize()

In [None]:
#scores = promcda.aggregate()
#ranks = promcda.evaluate_ranks(scores)

In [None]:
ranks