# Code Python - ELECTRE Tri 

## Introduction to the project

Multi-criteria Decision Analysis (MCDA) is a decision support protocol for ranking elements by evaluating them on different criteria. It deals with a problem that has several aspects depending on the wishes of a decision maker. MCDA's makes a problem more understandable, transparent, and accessible. MCDA has developed well in environmental management such as renovation problems which are dynamic systems that bring together many actors, many factors with long-term applications and a lot of uncertainty. 

To take into account the fluctuations of the input data in environmental projects and thus to obtain results less sensitive to variations, this method is coupled with the Monte Carlo principle. Instead of using crisp data, Monte Carlo allows the use of distributions for each of the data and thus allows variations in the data to be incorporated into the analysis. 

In this technical report, the implementation of this new procedure is presented regarding a case study (Daniel, 2022). A company wanted to find a method to choose an energy refurbishment scenario for a group of three buildings located in the Lyon region, in France.  The Multi Criteria Decision Analysis used is ELECTRE Tri. The mechanisms used and the input data are presented first. Then, the different steps of the new procedures are described one by one. 

## ELECTRE Tri

### ELECTRE Tri method

ELECTRE Tri is the multi-criteria decision analysis method chosen for the project which aim to sort all the alternatives, i.e. the different possibilities for which there is a choice process, in predefined categories. They correspond to the ranks of the alternatives. These alternatives are evaluated by several criteria, perspectives of evaluations of different natures. In its process, the input data including criteria weights and performance matrix is compared to reference profiles i.e. the limits, for each criterion, of the categories. From these alternative/profile comparisons, preference relations are determined, indicating how an alternative relates to a profile. This method results in an optimistic and pessimistic sorting of the alternatives according to the direction of classification. 

In this method, the input data and parameters used are : 
- Alternatives: options from which the decision-maker must choose
- Criteria: Perspectives on which the alternatives are evaluated, quantitative or qualitative
- Weight: Degree of importance of each criterion in \%. 
- Performance Matrix: Matrix with the performance of each alternatives regarding each criterion
- Reference profiles: Values that define the different performance boundaries for each criterion
- Thresholds: Boundaries, defined by decision-makers, to measure the indifference or the preference between an alternative and a reference profile, or the very bad performance of an alternative compared to a profile.  


First of all, ELECTRE Tri is a multicriteria decision-making process that begins by comparing alternatives to profiles, which enables the classification of alternatives into specific categories. This method allows for the independent comparison of alternatives, without the ranking of one alternative being influenced by the ranking of others (Corrente, 2016). Thus, it not only identifies the alternatives that best meet the decision-makers' requirements, but also provides an overall performance assessment of each alternative. 

Also, a specificty of ELECTRE Tri is that criteria are given weights. Criteria are then established hierarchically, which allows some criteria to be performed with greater interest than others (Corrente, 2016). The method can include numerous criteria which corresponds well to complex issues such as environmental problems. Finally, This method includes thresholds, values which defined the objectives of the decision makers. This is crucial in an environmental decision-making process, since it prevents a very poor performance in one criterion from being compensated by a very good performance in another criterion. 

In all its aspects, the ELECTRE Tri multi-criteria decision analysis method appeared interesting to develop and use in the framework of environmental projects. Here are the different steps of calculation of ELECTRE Tri that will be followed during this notebook :
- Partial concordance $C_j(a_i,b_k)$ and $C_j(b_k,a_i)$ : for each criterion $j$, each alternative $a_i$ is compared to each reference profile $b_k$ to determine if it is consistent with the statement "$a_i$ is at least as good as $b_k$"
- Discordance $D_j(a_i,b_k)$ and $D_j(b_k,a_i)$ : for each criterion $j$, each alternative $a_i$ is compared to each reference profile $b_k$ to determine if it is discordant with the statement "$a_i$ is at least as good as $b_k$"
- Global concordance $C(a_i,b_k)$ and $C(b_k,a_i)$: the partial concordance values calculated for each criterion $j$ are aggregated to obtain a global concordance per alternative $a_i$ and reference profile $b_k$ pair.
- Degree of credibility $\delta(a_i,b_k)$ and $\delta(b_k,a_i)$ : the degree of credibility is the global concordance weakened by the eventual veto effects that can be found in the discordance
- Over-ranking relations : thanks to the credibility degrees computed previously, the preference relations between each alternative $a_i$ and each reference profile $b_k$ are determined 
- Pessimistic and optimistic ranking : rank each alternative $a_i$ in a category

In the first 4 calculation steps: concordance, discordance, global concordance and degree of credibility, the calculations will be made twice. The analysis is carried out by comparing alternatives to profiles and profiles to alternatives in order to have a precise notion of the distance between the two. As explained in Figure 1, the performance of an alternative to a profile does not indicate a performance of a profile to an alternative. 
 
<center>
<figure>
  <img src="drawbacks.png" width="50%" height="50%">
  <figcaption><i> Figure 1: Schema of the comparison of an alternative with a reference profile</i></figcaption>
</figure>
</center>


### ELECTRE Tri applied to the renovation decision making process

#### Input data

The input data correspond to all the data collected by the technicians in order to establish the method. These data are collected from decision-makers and consultancy firms. 

##### Criteria $g$

According to Roy B. (Roy, 1985), a criterion is a "tool" that allows to evaluate an action by a specific "point of view". Since these criteria will allow us to establish preference relations between many alternatives, it quality of construction is crucial. Also it is important that all the actors adhere to the choice of criteria and understand what each criterion represents, its precise definition and its evaluation method. Criteria should be diversified, precise but not redundant to avoid assessing the same element twice. Thus, the assessment methods for each of the criteria should be precisely described so that the same data is not used to assess different criteria. Each criterion is defined by it unit and it weight. A criterion can have a direction of preference that can be either increasing or decreasing.


*In order to cover all aspects of the project, 4 categories of criteria are defined: economic, social, technical and environmental where several criteria are formulated. For this project, a total of 16 criteria are finally used.*

##### Alternatives $a$

The alternatives are the different possible outcomes of the choice process. In this project, the method should show which type of renovation best fits the building and the decision makers's objectives. In order to make the method undestandable, the actions to be compared represent the different renovation possibilities that exist, named as scenarios of renovation. The method gives the performance of each scenario regarding the others. 

The energy renovation of a building affects several areas and in each of these areas there are several possibilities. Thus renovation scenarios are formed with coherent elementary actions. Families of alternatives are formed according to the different possible alternatives in each field.

*In the project, seven areas have been identified. For each of these areas, different alternatives are developed to obtain a total of 24 basic renovation actions. From the elementary actions, Thus, 28 renovation solutions are identified in total with 7 groups, first renovation solution being the one where no changes are made. In the data file, the alternatives are named $S$.*


##### Performance Matrix 
 
Each alternative is evaluated regarding each criterion previously established. The evaluation of the performance $a$ of the alternative $i$ regarding the criterion $j$ will be noted $u_j(a_i)$. In the performance matrix, each column corresponds to an alternative and each line to a criterion.

In a case of a criterion with an increasing preference direction, the higher the evaluation of the alternative on this criterion $u_j(a_i)$, the better the alternative performs on this criterion. Conversely, for a criterion with a decreasing performance direction, the lower the evaluation of the alternative on this criterion $u_j(a_i)$, the lower the performance of this alternative on this criterion. 

In order to unify the calculations and not to have to differentiate between the two cases described above, the performance values in criteria with a decreasing preference direction will be multiplied by "-1". Thus, these criteria will also get an increasing performance direction. 

*To sum up, in this project 28 alternatives, renovation scenarios, will be evaluated thanks to 16 criteria.* 




#### Parameters

Parameters are the data involved in the method. They are values defined by the decision makers and the technician. 

##### Reference profiles $b$
The alternatives are not compared with each other but to reference profiles. Reference profiles can be seen as boundary reference actions that allows to define the upper and lower bounds of each category (Almeida-Dias et. al, 2010). As represented in the Figure 2, these reference profiles are specific to each criterion. All the profiles for all the criteria form the categories. 

<center>
<figure>
  <img src="ref_profiles2.png" width="50%" height="50%">
  <figcaption><i> Figure 2: Reference profiles </i></figcaption>
</figure>
</center>

In order to have these boundaries for all the categories, if $q$ is the number of categories, there is $q$ reference profiles starting from zero ($q+1$ reference profile in total). 
As for the performances, the values of the reference profiles with a decreasing preference direction are multiplied by "-1" in order to obtain only criteria with increasing preference direction. 

##### Thresholds $q$, $p$, $v$

Thresholds are parameters that quantify the difference between the alternatives and the reference profiles in order to determine whether this difference is indifferent or significant. Indeed, the difference between the alternatives and the reference profiles will be calculated and compared to these thresholds. In this objective, three thresholds are necessary:
- The indifference threshold $q$ : indicates whether the alternative is equivalent to the reference profile
- The preference threshold $p$ : indicates wehther the alternative or the profile is preferred
- The veto threshold $v$ : indicates whether the difference is too high to be acceptable. 

These 3 thresholds are determined for each criterion $j$, going from 1 to $n$. Thresholds are thus noted for each criterion $j$: indifference threshold: $q_j$, preference threshold, $p_j$ and veto threshold $v_j$.

##### Cut-off threshold $\lambda $

The cut-off threshold is a value between 0 and 1 that defines the desired level of requirement. The closer the value is to 1, the higher the level of requirement is chosen, the closer it is to 0 the lower the level of requirement. The default value used in the ELECTRE Tri method is 0.75, but it can be adapted according to the case studied. To choose the right cut-off threshold, the desired precision in ranking the alternatives, the goals to be achieved, and the constraints of the problem should be considered. 

In the ELECTRE Tri method, the alternatives are compared to reference profiles, it is necessary to establish the relation between the two. The details of this step are described in the "Outranking relations" part. At the beginning of this step, the "degree of credibility" describing the proximity between each alternative and each reference profile is compared to the cut-off threshold. It determines if a preference of the alternative compared to the reference profile can be established or not.


## Uncertainty on the performances

The process of filling in the performance matrix is often complicated. It is necessary to find the values for each alternative for each criterion. This value is a fixed number representing the performance of the alternative against a criterion. However, this is not representative of reality; the values are actually subject to fluctuations. This is due to two factors: uncertainty and variability of the data. Uncertainty reflects the fact that measurements are subject to random errors and systematic errors caused by biases or systematic deviations in the measurement process, which can introduce variations in the measured values (Faber, 2005). The variability represents the fluctuation due to kinematic, kinetic and spatio-temporal effect (Chau et. al, 2005).

In order to obtain more robust data, another parameter is integrated into the method, **the uncertainty of the data**. Instead of applying the method to a data table, the method is applied to a distribution table. These data fluctuations are not represented in the performance matrix data, however, if they were included, they would have significant consequences for the rest of the ELECTRE Tri process.

The uncertainty is introduced through two main elements:
- Representation of data as Probability Density Functions (PDFs), as opposed to crisp values. 
- The Monte Carlo method, a statistical technique that utilizes random sampling, is applied to these distributions. This method and its implementation will be developed in details later. 

A probability density function (PDF) is a mathematical expression that describes the probability distribution of a discrete random variable (Kenton, 2022). There are various types of PDFs that can be used to represent the distribution of different types of phenomena, including the uniform, exponential, normal, and Poisson distribution (Harrison, 2010). In the context of the case study, it was decided to use only one type of distribution to represent the entire dataset: the normal distribution.

The normal distribution needs two parameters to be described : 
- The mean value $ \mu$
- The standard deviation $ \sigma$

The normal distribution is noted $N(\mu, \sigma^2)$.
*Note that the variance is $v=\sigma^2$. For this reason, in the rest of the project the calculations will be done with the mean value $ \mu$ and the variance $v$ as parameters.*


### Python environment

The code is developed with the library Pandas, Numpy and Math. The input data are imported from a csv file. 

In [1]:
import pandas as pd
import numpy as np
from numpy import random, vstack, empty
import math

### Import of data from csv file as a Pandas Dataframe

The input of the whole analysis is a `csv.file` made of 16 lines and 39 columns.

The 16 lines correponds to 16 criteria defined earlier. 
The indices of the lines are therefore the names of the criteria: <br>
`g1.1, g1.2, g1.3, g1.4, g1.5, g2.1, g2.2, g2.3, g2.4, g3.1, g3.2, g3.3, g3.4, g4.1, g4.2, g4.3, g4.4, g5.1, g5.2, g5.3`.

The columns contain the following informations : 
- The **mean value of the performance** of each scenario regarding each criterion (columns 0 to 27) <br>
Names of the columns : `'S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1',`
`'S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2',`
`'S7.3','S7.4'`
- The **weight** of each criterion (column 28) <br>
Name of the column : `Weights`
- The **variance** of each criterion (column 29) <br>
Name of the column : `Var`
- The **6 reference profiles** : $b0, b1, b2, b3, b4$ and $b5$ (columns 30 to 35) <br>
Names of the columns : `b0`, `b1`,`b2`, `b3`, `b4` and `b5`
- The **3 thresholds** : $q$ (the indifference threshold), $p$ (the preference threshold), $v$ (the veto threshold) (columns 36 to 38) <br>
Names of the columns : `q`, `p` and `v`


It is imported as a dataframe `d`.<br>
Two others parameters are also defined : 
- `λ` : the **cut-off threshold**
- `repetition` : the **number of repetition** of the ELECTRE Tri method desired


In [2]:
d = pd.read_csv('Input_data.csv')
λ = 0.75
repetition = 10

### Monte Carlo Function

#### How does it works ###
Our hypothesis is to use the **Monte Carlo method** to obtain data sets from distributions and use those data sets in the ELECTRE Tri procedure. 

Monte-Carlo simulation is used in complex systems in order to estimate some operations by using random sample and statistical modeling. 
1. Pick a value from Probability Density Functions
2. Run the calculation multiple times: ELECTRE Tri in our case
3. Obtain a set of results to be analyzed 

The first step involve to be given Probability Distribution Functions as inputs. For our study, all the values will be represented as normal distributions. To describe these distributions 2 parameters are needed : 
- the mean value : `m` given per scenario $S$ and per criterion $g$
- the variance : `variance` given per criterion $g$

These values are given in the `d` DataFrame given as input of the code. 

The following function allows to :
1. Creates the Normal Distribution from the input data present in `data`
2. Pick a random value in each of it
3. Return a DataFrame called `ndata` with the random values picked 

*The DataFrame returned will also contain all the parameters initially present in the `data` DataFrame.*

In [3]:
def MC(data):
    """
    Build a new performance matrix from the distribution
    with m : mean value and v : variance per criterion
    :param data: Data Frame with input data and parameters
    :return: ndata: Data frame with the new performance Data Frame with random value picked
     in the distribution
    """
    ndata = data.copy()
    variance = ndata['VAR'].values  # general variance located in the column "VAR"
    m = ndata.iloc[:, 0:28].values  # for each scenario : columns 0 to 27
    v = np.abs(m * variance[:, np.newaxis])  # variance v of the performance matrix
    perf = np.random.normal(m, v)  # random value in the normal distribution
    ndata.iloc[:, 0:28] = perf
    return ndata

### Partial concordance

The partial concordance refers to the degree of concordance, i.e. agreement between the evaluations of pairs of alternatives and reference profiles. In other words, it evaluates how well each option performs relative to the others with respect to the set of criteria. 

This function takes as input the `data` DataFrame containing all the performances as well as all the others parameters and input of the method, but only the performances, the reference profiles, and the thresholds will be used.

The objective is to calculate, regarding each criterion $j$ the concordance between each pair of alternative $a_i$ and reference profiles $b_k$ i.e. the alternatives regarding the profiles and the profiles regarding the alternatives: 
- The concordance $C_j(a_i,b_k)$
- The concordance $C_j(b_k,a_i)$ <br>
*for $i$ the scenarios, $k$ the reference profiles and $j$ the criteria*

The Figure 3 shows how the value of the corcordance $C_j(a_i,b_k)$ is determined:

<center>
<figure>
  <img src="conc2.png" width="70%" height="70%">
  <figcaption><i> Figure 3: Partial Concordance </i></figcaption>
</figure>
</center>

*with : <br>*
- *$u_j(a_i)$ : value of the performance of the scenario $i$ in the criterion $j$*
- *$u_j(b_k)$ : value of the reference profile $k$ in the criterion $j$*


It can be therefore interpreted as follow : <br>
 The difference between the performance of an alternative $u_j(a_i)$ and the performance of a reference profile $u_j(b_k)$ regarding the criterion $j$ is calculated. This difference is then compared to the two thresholds $q_j, p_j$, respectively the indifference threshold and the preference threshold. 
- if $u_j(a_i)-u_j(b_k) > -q_j$    <br>
$C_j(a_i,b_k)=1$, the alternative is as good as the profile. 
- if $u_j(a_i)-u_j(b_k) < -p_j $ <br>
$C_j(a_i,b_k)=0$, the alternative $a_i$ is not as good as the profile $b_k$ for the criterion $j$. 
- if $-p_j < u_j(a_i)-u_j(b_k) < -q_j$   <br>
 It not possible to neither agree nor disagree with the statement "the alternative is as good as the profile", so an intermediate value between 0 and 1 which qualifies the degree of agreement is calulated. The closer it is to 1 the more it agrees with the assumption, the closer it is to 0, the less it agrees with the assumption. 

Thus, in this case, the two types of concordance can be calculated in the function as follow: <br>
<center>

$C_j(a_i,b_k) = u_j(a_i)-u_j(b_j)+p_j/(p_j-q_j)$<br>

</center>

*with : <br>*
- *$u_j(a_i)$ : value of the performance of the scenario $i$ in the criterion $j$*
- *$u_j(b_k)$ : value of the reference profile $k$ in the criterion $j$*
- *$p_j$ : the preference threshold of the criterion $j$* 
- *$q_j$ : the indiference threshold of the criterion $j$*

If the value of the concordance is higher than one it is replaced by `1`, and if it is smaller than zero it is replaced by `0`. 

The calculattions are done in the same way for the concordance $C_j(b_k,a_i)$. 

Finally, the function returns two DataFrames : 
- `new_df` : The concordance between the performances of the alternatives and the reference profiles $C_j(a_i,b_k)$
- `new_df2` : The concordance between the performances of the reference profiles and the alternatives $C_j(b_j,a_k)$


In [4]:
def conc(data):
    """
    Calculates the concordance coefficient between a performance and a profile

    :param data: new performance Data Frame and original parameters
    :return: new_df: DataFrame with concordance Cj(ai,bk) of each alternative ai
    regarding each profile bk for each criterion j
    :return: new_df2: DataFrame with concordance Cj(bk,ai) of each profile bk
    regarding each alternative ai for each criterion j
    """
    new_df = pd.DataFrame()
    new_df2 = pd.DataFrame()
    for sc in data.iloc[:, 0:28]:  # for each scenario : columns 0 to 27
        for pr in data.iloc[:, 30:36]:  # for each reference profile : columns 30 to 35
            alpha = (data[sc] - data[pr] + data[data.columns[37]]) \
                    / (data[data.columns[37]] - data[data.columns[36]])
            beta = (data[pr] - data[sc] + data[data.columns[37]]) \
                   / (data[data.columns[37]] - data[data.columns[36]])
            new_df = pd.concat([new_df, alpha], axis=1, ignore_index=True)
            new_df2 = pd.concat([new_df2, beta], axis=1, ignore_index=True)
    new_df[new_df < 0] = 0
    new_df[new_df > 1] = 1
    new_df2[new_df2 < 0] = 0
    new_df2[new_df2 > 1] = 1
    return new_df, new_df2


### Discordance

The discordance matrix is a matrix that is used to represent the degree of discordance between pairs of alternatives and reference profiles. It is typically constructed by comparing the values of each alternative on each criterion, and determining whether the difference between the values is significant enough to cause discordance. In contrast to calculating the concordance with the sentence, the discordance with the sentence is studied, i.e. how far apart the alternative and the profile are. 

This function takes as input the `data` DataFrame containig all the performances as well as all the others parameters and input of the method. In this function, only the performances, the reference profiles, and the thresholds will be used.

The objective is to calculate, regarding each criterion $j$, the discordance between each pair of alternative $a_i$ and reference profiles $b_k$ and in both ways: 
- The discordance $D_j(a_i,b_k)$
- The discordance $D_j(b_k,a_i)$ <br>
*for $i$ the scenarios, $k$ the reference profiles and $j$ the criteria*

The Figure 4 shows how the value of the discordance $D_j(a_i,b_k)$ is determined: 

<center>
<figure>
  <img src="disc.png" width="70%" height="70%">
  <figcaption><i> Figure 4: Discordance </i></figcaption>
</figure>
</center>

It can be interpreted as follow : <br>
The difference between the performance of an alternative $u_j(a_i)$ and the performance of a reference profile $u_j(b_k)$ regarding the criterion $j$ is calculated. This difference is then compared to the two thresholds $p_j, v_j$, respectively the preference threshold and the veto threshold. 
- if $u_j(a_i)-u_j(b_k) < -p_j$    <br>
$D_j(a_i,b_k)=0$, the alternative is as good as the profile $b_k$ for the criterion $j$.
- if $u_j(a_i)-u_j(b_k) > -v_j $ <br>
$D_j(a_i,b_k)=1$, the alternative $a_i$ is not "as good as the profile" $b_k$ for the criterion $j$. 
- if $-v_j < u_j(a_i)-u_j(b_k) < -p_j$<br>
 It not possible to establish neither the discordance or not with the statement "the alternative is as good as the profile", so an intermediate value between 0 and 1 which qualifies the degree of disagreement is calulated. The closer it is to 1 the more it is discordant with the assumption, the closer it is to 0, the less it is discrodant with the assumption. 

Thus, in this case, the two types of discordance can be calculated in the function as follow: <br>
<center>

$D_j(a_i,b_k) = u_j(b_k)-u_j(a_i)-p_j/(v_j-p_j)$<br>

</center>

*with : <br>*
- *$u_j(a_i)$ : value of the performance of the scenario $i$ in the criterion $j$*
- *$u_j(b_k)$ : value of the reference profile $k$ in the criterion $j$*
- *$p_j$ : the preference threshold of the criterion $j$* 
- *$v_j$ : the veto threshold of the criterion $j$*

If the value is higher than one it is replaced by `1`, and if it is smaller dans zero it is replaced by `0`. 

The calculations are done the same way for the discordance $D_j(b_k,a_i)$.

The function takes as input the `d` Dataframe.
Finally, the function returns two DataFrames : 
- `new_df` : The discordance between the performances of the alternatives and the reference profiles $D_j(a_i,b_k)$
- `new_df2` : The discordance between the performances of the reference profiles and the alternatives $D_j(b_k,a_i)$

In [5]:
def disco(data):
    """
    Calculates the discordance coefficient between a performance and a profile

    :param data: new performance Data Frame and original parameters
    :return: new_df: DataFrame with discordance Dj(ai,bk) of each alternative ai
    regarding each profile bk for each criterion j
    :return: new_df2: DataFrame with discordance Dj(bk,ai) of each profile bk
    regarding each alternative ai for each criterion j
    """
    new_df = pd.DataFrame()
    new_df2 = pd.DataFrame()
    for sc in data.iloc[:, 0:28]:  # for each scenario : columns 0 to 27
        for pr in data.iloc[:, 30:36]:  # for each reference profile : columns 30 to 35
            alpha = (data[pr] - data[sc] - data[data.columns[37]]) / (
                    data[data.columns[38]] - data[data.columns[37]])
            beta = (data[sc] - data[pr] - data[data.columns[37]]) / (
                    data[data.columns[38]] - data[data.columns[37]])
            new_df = pd.concat([new_df, alpha], axis=1, ignore_index=True)
            new_df2 = pd.concat([new_df2, beta], axis=1, ignore_index=True)
    new_df[new_df < 0] = 0
    new_df[new_df > 1] = 1
    new_df2[new_df2 < 0] = 0
    new_df2[new_df2 > 1] = 1
    return new_df, new_df2

### Global concordance

The aim of this step is to calculate the global concordance of each scenario regarding all the criteria. The partial concordance values calculated for each criterion $j$ is aggregated to obtain one unique value of global concordance per per of alternative $a_i$ and reference profile $b_k$. In other words, it expresses to which extend the performance of the alternative $a_i$ with $i$ the scenario and the performance of the profile $b_k$, with $k$ the profile number regarding all the criteria are concordant with the assertion ”$a_i$ outranks $b_k$". <br>

As previously, the calculation are made twice: 
- $C(a_i,b_k)$: for the alternatives $a_i$ regarding the profiles $b_k$ 
- $C(b_k,a_i)$: for the profiles $b_k$ regarding the alternatives $a_i$

For each case, the following global concordance is calculated regarding each scenario: 

<center>

$C(a_i,b_k) = \frac {\sum_{j} C_j(a_i,b_k)  w_j}{\sum_{j} w_j}$

$C(b_k,a_i) = \frac {\sum_{j} C_j(b_k,a_i)  w_j}{\sum_{j} w_j}$

</center>

*with i the alternative, j the criteria and k the reference profile*


The function takes as input  :
- Weights of each criterion, located in the `data` DataFrame, in the column 28 named `Weights`
- Partial Concordance Matrix: `dconc1`

The DataFrame `new_df` returns the Global Concordance for each alternative or for each profile. 

In [6]:
def gconc(data, dconc1):
    """
    Calculates the global concordance

    :param data: new performance Data Frame and original parameters
    :param dconc1: concordance Data Frame
    :return: new_df: global concordance Data Frame
    """
    new_df = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'],
                          columns=['S1.1', 'S1.2', 'S1.3', 'S1.4',
                                   'S2.1', 'S2.2', 'S2.3', 'S2.4',
                                   'S3.1', 'S3.2', 'S3.3', 'S3.4',
                                   'S4.1', 'S4.2', 'S4.3', 'S4.4',
                                   'S5.1', 'S5.2', 'S5.3', 'S5.4',
                                   'S6.1', 'S6.2', 'S6.3', 'S6.4',
                                   'S7.1', 'S7.2', 'S7.3', 'S7.4'])
    i = 0
    for j in range(0, len(dconc1.columns), 6):  # for each scenario : one line out of 6
        # C(ai,bk) for the scenario for each reference profile
        a = sum(dconc1[j] * data[data.columns[28]]) / sum(data[data.columns[28]])
        b = sum(dconc1[j + 1] * data[data.columns[28]]) / sum(data[data.columns[28]])
        c = sum(dconc1[j + 2] * data[data.columns[28]]) / sum(data[data.columns[28]])
        dr = sum(dconc1[j + 3] * data[data.columns[28]]) / sum(data[data.columns[28]])
        e = sum(dconc1[j + 4] * data[data.columns[28]]) / sum(data[data.columns[28]])
        f = sum(dconc1[j + 5] * data[data.columns[28]]) / sum(data[data.columns[28]])
        th = [a, b, c, dr, e, f]
        new_df[new_df.columns[i]] = th  # add the global concordance as a new column
        i = i + 1
    return new_df

### Degree of credibility

The degree of credibility evaluates if the assumption that a scenario outperforms a profile is plausible and to which extent "$a_i$ outranks $b_k$", resulting in a value between 0 (the assumption is not plausible) and 1 (the assumption is very plausible). The calculation are made twice, once for the alternatives in relation to the profiles and once for the profiles in relation to the alternatives.  The degree of credibility evaluating the outranking of the alternative $a_i$ over the reference profile $b_k$ is noted : $ \delta(a_i,b_k)$ and conversely the degree of credibility evaluating the outranking of the reference profile $b_k$ over the alternative $a_i$ is noted $ \delta(b_k,a_i)$.

The degree of credibility is calculated thanks to :
- the Global Concordance: `dgconc`
- the Discordance Matrix: `ddsic` 

The objective is, for each alternatives, to follow these steps : 

If, for all the criteria $j$, the discordance is lower or equal to the global concordance : $D_j(a_i,b_k) \le C(a_i,b_k)$ or respectively $D_j(b_k,a_i) \le C(b_k,a_i)$, the credibility is equal to the global concordance:
<center>

$ \delta(a_i,b_k) = C(a_i,b_k) $

$ \delta(b_k,a_i) = C(b_k,a_i) $

</center>

Else, if at least one of the discordance is higher than the global concordance, the credibility is calculated as follow : 

<center>

$ \delta(a_i,b_k) = C(a_i,b_k) * \prod_{j \in J } \frac{(1-D_j(a_i,b_k))}{(1-C(a_i,b_k))} $

$ \delta(b_k,a_i) = C(b_k,a_i) * \prod_{j \in J } \frac{(1-D_j(b_k,a_i))}{(1-C(b_k,a_i))} $

</center>

*With :*
- *J : all the criteria for whom the discordance is lower than the concordance : $D_j(a_i,b_k) \ge C(a_i,b_k)$ or respectively $D_j(b_k,a_i) \le C(b_k,a_i)$*
- *$C(a_i,b_k)$ : the global concordance of the alternative $a_i$ with the reference profile $b_k$* 
- *$C(b_k,a_i)$ : the global concordance of the reference profile $b_k$ with the alternative $a_i$*
- *$D(a_i,b_k)$ : the global discordance of the alternative $a_i$ with the reference profile $b_k$*
- *$D(b_k,a_i)$ : the global discordance of the reference profile $b_k$ with the alternative $a_i$*

In order to better understand the steps of this calculation the degree of credibility is calculated as follows: 
-  If within the criteria, none of them is discordant, the degree of credibility is equal to the global concordance :
<center>

$ \delta(a_i,b_k) = C(a_i,b_k) $ or $ \delta(b_k,a_i) = C(b_k,a_i) $

</center>

- If one of them is discordant (equal to one), that means that it is above the veto threshold, the degree of credibility is equal to zero. The degree of credibility is the global concordance weakened by the eventual veto effects that can be found in the partial discordance : 
<center>

$ \delta(a_i,b_k) = 0$ or $ \delta(b_k,a_i) = 0$

</center>

- Finally, if some criteria are lower than $1$ but higher that the concordance, the degree of credibility is lowered by these effects, the calculation is therefore developed in the formula above.

The function return `dcred` Data Frame which is the credibility degrees calculated from `dgconc` and `ddisc`. 



In [7]:
def credibility(dgconc, ddisc):
    """
    Calculates the credibility degree

    :param dgconc: Global concordance Data Frame
    :param ddisc: Discordance Data Frame
    :return: dcred: Credibility degree Data Frame
    """
    # initialization
    dcred = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'],
                         columns=['S1.1', 'S1.2', 'S1.3', 'S1.4',
                                  'S2.1', 'S2.2', 'S2.3', 'S2.4',
                                  'S3.1', 'S3.2', 'S3.3', 'S3.4',
                                  'S4.1', 'S4.2', 'S4.3', 'S4.4',
                                  'S5.1', 'S5.2', 'S5.3', 'S5.4',
                                  'S6.1', 'S6.2', 'S6.3', 'S6.4',
                                  'S7.1', 'S7.2', 'S7.3', 'S7.4'])
    for j in range(0, len(ddisc.columns), 6):
        sc = int(j / 6)
        degree = [0, 0, 0, 0, 0, 0]
        for pr in range(len(dcred.index)):
            # verification if all Dj < C
            verif = sum(ddisc[j + pr][c] > dgconc[dgconc.columns[sc]][pr]
                        for c in ddisc.index)
            # case 1
            if verif == 0:
                degree[pr] = dgconc[dgconc.columns[sc]][pr]
            # case 2
            else:
                degree[pr] = (((1 - ddisc[j + pr][ddisc[j + pr]
                                                  > dgconc[dgconc.columns[sc]][pr]])
                               / (1 - dgconc[dgconc.columns[sc]][pr])).prod()) * dgconc[dgconc.columns[sc]][pr]
        dcred[dcred.columns[sc]] = degree
    return dcred

### Over-ranking

The objective of this step is to establish preference relations between the alternatives $a$ and the reference profiles $b$. 
These relations are established thanks to the degree of credibility determined just before and thanks to the cut-off threshold $\lambda$.  

There are 4 types of relations that can be established between each $a_i$ and each $b_k$
- $a_i$  `I`  $b_k$ : $a_i$  is Indifferent to  $b_k$ 
- $a_i$  `>`  $b_k$ : $a_i$  is preferred to  $b_k$ 
- $a_i$  `<`  $b_k$ : $a_i$  is not preferred to  $b_k$ 
- $a_i$  `R`  $b_k$ : $a_i$  incomparable to $b_k$ 

These relations are represented in the Figure 5. 

<center>
<figure>
  <img src="overrank2.png" width="50%" height="50%">
  <figcaption>Figure 5: Preference relations</figcaption>
</figure>
</center>

This is how these relations are determined : 



- if $\delta(a_i,b_k) > \lambda$ and $\delta(b_k,a_i) > \lambda$ <br>
    $a_i$ I $b_k$ : $a_i$  is Indifferent to  $b_k$ 
- if $\delta(a_i,b_k) > \lambda$ and $\delta(b_k,a_i) < \lambda$ <br>
     $a_i > b_k$ : $a_i$  is preferred to  $b_k$
- if $\delta(a_i,b_k) < \lambda$ and $\delta(b_k,a_i) > \lambda$ <br>
    $a_i < b_k$ : $a_i$  is not preferred to  $b_k$
- if $\delta(a_i,b_k) < \lambda$ and $\delta(b_k,a_i) < \lambda$ <br>
    $a_i$  R  $b_k$ : $a_i$  incomparable to $b_k$ 

The input data of the function are :
- The credibility degrees of the alternatives in relation to the profiles: `cred1`
- The credibility degrees of the profiles in relation to the alternatives: `cred2`
- The cut-off threshold: `param`

The function returns a single Dataframe `new_df` containing all these relations between the alternatives and the profiles.



In [8]:
def over_ranking_relations(cred1, cred2, param):
    """
    Calculates the relations between each alternative and each profile

    :param cred1: Credibility degree Data Frame of the alternatives regarding
    each profile
    :param cred2: Credibility degree Data Frame of the profiles regarding
    each alternative
    :param param: Cut-off threshold
    :return: new_df: Data Frame with the relation of each alternative regarding
    each profile
    """
    # initialization
    new_df = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'],
                          columns=['S1.1', 'S1.2', 'S1.3', 'S1.4',
                                   'S2.1', 'S2.2', 'S2.3', 'S2.4',
                                   'S3.1', 'S3.2', 'S3.3', 'S3.4',
                                   'S4.1', 'S4.2', 'S4.3', 'S4.4',
                                   'S5.1', 'S5.2', 'S5.3', 'S5.4',
                                   'S6.1', 'S6.2', 'S6.3', 'S6.4',
                                   'S7.1', 'S7.2', 'S7.3', 'S7.4'])
    classementa = cred1.apply(lambda x: x - param)
    classementb = cred2.apply(lambda x: x - param)
    # 1 if outperform (S), 0 if not
    classementa[classementa > 0] = 1
    classementa[classementa < 0] = 0
    classementb[classementb > 0] = 1
    classementb[classementb < 0] = 0
    mask = (classementa == classementb) & (classementa == 1)
    new_df = new_df.mask(mask, "I")
    mask = (classementa == classementb) & (classementa == 0)
    new_df = new_df.mask(mask, "R")
    mask = (classementb != 0) & (classementa == 0)
    new_df = new_df.mask(mask, "<")
    mask = (classementa != 0) & (classementb == 0)
    new_df = new_df.mask(mask, ">")
    return new_df


## Sorting

The relations previously established allow to reach the final goal of the method, i.e. to assign to each alternative a category. 
Two sorting procedures are performed: optimistic and pessimistic sorting. The major difference between the two is that the pessimistic sort "pushes the alternative down" starting from the best category, while the optimistic sort "pushes the alternative up" starting from the worst category. 

A median ranking can be obtained as an average of these two rankings.

### Pessimistic sorting

The following function permits to obtain the pessimistic sorting thanks to the over-ranking relations we just established. The objective is to place each scenario in one of the 5 predefined categories. This type of sorting "pushes the action down". 

This is how the ranking works : <br>

The 6 reference profiles $b0, b1, b2, b3, b4$ and $b5$ delineate 5 categories : <br>
$C1, C2, C3, C4$ and $C5$, C5 being the best one and C1 the worse as shown in the Figure 6. 

<center>
<figure>
  <img src="pessi_sort.jpg" width="10%" height="10%">
  <figcaption> <i> Figure 6: Pessimistic sorting </figcaption>
</figure>
</center>

</i>

For each scenario, these categories are browsed from the best to the worst ( from C5 to C1 ). 
For each reference profile encountered the credibility $ \delta(a_i,b_k)$ are compared to the cutting threshold $\lambda$ : 
- if $ \delta(a_i,b_k) > \lambda $ : the alternative is ranked in the category with the same number as $b_k$
- if $ \delta(a_i,b_k) < \lambda $ : it continues to the next reference profile 

This function takes as input: 
- The relations between the alternatives and the profiles: `ranking`
- The memory of the ranking of the alternatives in the categories: `mpessi`

It returns the updating of the Data Frame `mpessi` with the ranking obtained. 

In [9]:
def pessimistic_sort(ranking, mpessi):
    """
    Builds the pessimistic sorting

    :param ranking: Data Frame with the relation of each alternative regarding
    each profile
    :param mpessi: Data Frame storing the pessimist ranking of each alternative
    :return: mpessi: Updates of the Data Frame storing the pessismist sorting
    of alternatives
    """
    for sc in ranking:
        step = mpessi[sc]
        for pr in reversed(range(len(ranking.index))):
            if ranking[sc][pr] == '>' or ranking[sc][pr] == 'I':
                step[step.index[pr]] = step[step.index[pr]] + 1  # classified
                break
        mpessi[sc] = step
    return mpessi

### Optimistic sorting

The following function permits to obtain the optimistic ranking thanks to the over-ranking relations established.

The ranking works as follow: <br>

As previously 6 reference profiles delineate 5 categories, C5 being the best one and C1 the worse as shown in the Figure 7. 

<center>
<figure>
  <img src="opti_sort.jpg" width="10%" height="10%">
  <figcaption><i> Figure 7: Optimistic sorting</figcaption>
</figure>
</center>

</i>

The difference is that for this ranking, for each scenario, these categories are browsed from the worst to the best ( from C1 to C5 ). 
For each reference profile encountered the over-ranking relation are analyzed : 
- if $a_i$ `<` $b_k$ : the scenario is ranked in the category with the same number as $b_k$
- if $a_i$ `>` $b_k$, $a_i$ `R` $b_k$ or $a_i$ `I` $b_k$ : it continues to the next reference profile 

This function takes as inputs: 
- The relations between the alternatives and the profiles: `ranking`
- The memory of the ranking of the alternatives in the categories: `mopti`

It returns the updating of the Data Frame `mopti` with the ranking obtained. 

In [10]:
def optimistic_sort(ranking, mopti):
    """
        Builds the optimistic sorting

        :param ranking: Data Frame with the relation of each alternative regarding
        each profile
        :param mopti: Data Frame storing the optimistic sorting of each alternative
        :return: mopti: Updates of the Data Frame storing the optimistic sorting
        of alternatives
        """
    for sc in ranking:
        step = mopti[sc]
        for pr in (range(len(ranking.index))):
            if ranking[sc][pr] == '<' or ranking[sc][pr] == 'R':
                step[step.index[pr]] = step[step.index[pr]] + 1  # classified
                break
        mopti[sc] = step
    return mopti

### ELECTRE Tri application function
This final method permits to run all the previous methods in order to compute all the steps of the ELECTRE Tri method. 

First of all, it takes as input : 
- `data` : the input Dataframe containing the performances, the weights, the variances, the reference profiles and the thresholds
- `rep` : the number of times the Electre Tri method will be run, defined at the beginning of the code

It creates two data frames :
- `pessi_sort` : it allows to keep in memory the pessimistic ranking obtained at each iteration of the method
- `opti_sort` : it allows to keep in memory the optimistic ranking obtained at each iteration of the method

They are both build in the same way : <br>
They are made of 5 lines (corresponding to the 5 categories) and 28 columns (corresponding to the 28 alternatives).
Here are the `index` names : <br>
`'C1', 'C2', 'C3', 'C4', 'C5'` <br>
Here are the `columns` names : <br>
`'S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1',`
`'S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2',`
`'S7.3','S7.4'` <br>
Initially, they are composed only of zeros .

Thereafter the following functions will be executed one after the other, the number of times `rep` which was defined at the very beginning of the code : <br>
*(note that the functions below are clearly defined and explained one by one right above their code,including detailed explanations of input and output data)*

- `MCarlo` : Monte Carlo function <br>
    Takes as input : the input dataframe `d`<br>
    Return : the dataframe `newdata` : the mean values have been replaced by the performances  <br>
- `conce` : Partial Concordance function <br>
    Takes as input :  the input dataframe `newdata` <br>
    Return : the two concordance matrix `dconca, dconcb` <br>
- `disco` : Discordance function <br>
    Takes as input : the input dataframe `newdata` <br>
    Return : the two discordance dataframes `ddisca, ddiscb`<br>
- `global_conc` : Global Concordance function <br>
    This function is called twice : 
    - Once taking in input : the input dataframe `newdata` and the concordance dataframe `dconca` <br>
        Return : the global concordance dataframe `dgconca`
    - Once taking in input : the input dataframe `newdata` and the concordance dataframe `dconcb` <br>
        Return : the global concordance dataframe`dgconcb`
- `credibility` : Credibility Degree function <br>
    This function is called twice : 
    - Once taking in input : the global concordance and discordance dataframes `dgconca` and `ddisca`<br>
        Return : credibility dataframe `dcreda`
    - Once taking in input : the global concordance and discordance dataframes `dgconcb` and `ddiscb`<br>
        Return : credibility dataframe `dcredb`
- `over_ranking_relations` : Over-ranking function <br>
    Takes as input : the two credibility dataframes `dcreda` and `dcredb`<br>
    Return : the overanking dataframe `dranking` <br>
- `optimistic_sort` : Optimistic sorting function <br>
    Takes as input : the overanking datadrame `dranking` and the optimistic sorting dataframe obtained at the previous iteration `opti_sort` <br>
    Return : the optimistic sorting daframe modified, i.e. with the optimistic sorting added to the previous `opti_sort`
- `pessimistic_sort`: Pessimistic sorting function <br>
    Takes as input : the overanking datadrame `dranking` and the pessimistic sorting dataframe `pessi_sort` <br>
    Return : the pessimistic sorting daframe modified, i.e. with the pessimistic sorting added to the previous `pessi_sort`




In [11]:
def Elec_tri(data, rep):
    """
    Function which capitalises ELECTRE_Tri calculations and repeats them
     a rep number of times

    :param data: Input data and parameters
    :param rep: Number of repetition
    :return: opti_sort: Data Frame with percentage of time each alternative are classified in each
    category within a optimistic sorting
    :return: pessi_sort: Data Frame with percentage of time each alternative are classified in each
     category within a pessimistic sorting
    """
    pessi = np.zeros((5, 28))
    opti = np.zeros((5, 28))
    pessi_sort = pd.DataFrame(pessi, index=['C1', 'C2', 'C3', 'C4', 'C5'],
                              columns=['S1.1', 'S1.2', 'S1.3', 'S1.4',
                                       'S2.1', 'S2.2', 'S2.3', 'S2.4',
                                       'S3.1', 'S3.2', 'S3.3', 'S3.4',
                                       'S4.1', 'S4.2', 'S4.3', 'S4.4',
                                       'S5.1', 'S5.2', 'S5.3', 'S5.4',
                                       'S6.1', 'S6.2', 'S6.3', 'S6.4',
                                       'S7.1', 'S7.2', 'S7.3', 'S7.4'])
    opti_sort = pd.DataFrame(opti, index=['C1', 'C2', 'C3', 'C4', 'C5'],
                             columns=['S1.1', 'S1.2', 'S1.3', 'S1.4',
                                      'S2.1', 'S2.2', 'S2.3', 'S2.4',
                                      'S3.1', 'S3.2', 'S3.3', 'S3.4',
                                      'S4.1', 'S4.2', 'S4.3', 'S4.4',
                                      'S5.1', 'S5.2', 'S5.3', 'S5.4',
                                      'S6.1', 'S6.2', 'S6.3', 'S6.4',
                                      'S7.1', 'S7.2', 'S7.3', 'S7.4'])
    # repetitions
    for i in range(rep):
        newdata = MC(data)
        dconca, dconcb = conc(newdata)
        ddisca, ddiscb = disco(newdata)
        dgconca = gconc(newdata, dconca)
        dgconcb = gconc(newdata, dconcb)
        dcreda = credibility(dgconca, ddisca)
        dcredb = credibility(dgconcb, ddiscb)
        dranking = over_ranking_relations(dcreda, dcredb, λ)
        opti_sort = optimistic_sort(dranking, opti_sort)
        pessi_sort = pessimistic_sort(dranking, pessi_sort)
    pessi_sort = pessi_sort.apply(lambda x: (x / rep) * 100)  # %
    opti_sort = opti_sort.apply(lambda x: x / rep * 100)  # %
    return opti_sort, pessi_sort

The `electre_tri` function is run returning two DataFrames : `o_sorting` and `p_sorting`. 

Then two csv files are created containing the repartition of the scenarios in the categories as percentages : 
- `pessimistic_sorting.csv` for the pessimistic sorting 
- `optimistic_sorting.csv` for the optimistic sorting

In [12]:
o_sorting, p_sorting = Elec_tri(d, repetition)
o_sorting_transposed = o_sorting.transpose()
o_sorting_transposed['Total'] = 100
p_sorting_transposed = p_sorting.transpose()
p_sorting_transposed['Total'] = 100

# p_sorting.to_csv('pessimistic_sorting.csv')
# o_sorting.to_csv('optimistic_sorting.csv')

### Printing of the optimistic sorting

The optimistic ranking is printed. Each column corresponds to a category (from `C1` to `C5`). The last column `Total` correspond to the sum of the percentages of the line. 
Each line corresponds to an alternative, from `S1.1` to `S7.4`.

The results are given as percentage: for each alternative it gives the proportion of times it was classified in each category. 


In [13]:
print("The optimistic sorting of the scenarios is:")
print(o_sorting_transposed)


The optimistic sorting of the scenarios is:
       C1     C2     C3     C4    C5  Total
S1.1  0.0  100.0    0.0    0.0   0.0    100
S1.2  0.0   30.0   70.0    0.0   0.0    100
S1.3  0.0    0.0   90.0   10.0   0.0    100
S1.4  0.0   40.0   60.0    0.0   0.0    100
S2.1  0.0   20.0   70.0   10.0   0.0    100
S2.2  0.0    0.0    0.0   70.0  30.0    100
S2.3  0.0    0.0   20.0   80.0   0.0    100
S2.4  0.0    0.0   10.0   70.0  20.0    100
S3.1  0.0   10.0   90.0    0.0   0.0    100
S3.2  0.0    0.0    0.0  100.0   0.0    100
S3.3  0.0   20.0   20.0   60.0   0.0    100
S3.4  0.0    0.0   10.0   90.0   0.0    100
S4.1  0.0    0.0  100.0    0.0   0.0    100
S4.2  0.0    0.0  100.0    0.0   0.0    100
S4.3  0.0   30.0   70.0    0.0   0.0    100
S4.4  0.0    0.0  100.0    0.0   0.0    100
S5.1  0.0  100.0    0.0    0.0   0.0    100
S5.2  0.0  100.0    0.0    0.0   0.0    100
S5.3  0.0  100.0    0.0    0.0   0.0    100
S5.4  0.0  100.0    0.0    0.0   0.0    100
S6.1  0.0    0.0   20.0   80.0  

### Printing of the pessimistic sorting

The pessimistic ranking is printed. Each column corresponds to a category (from `C1` to `C5`). Each line corresponds to an alternative, from `S1.1` to `S7.4`. The last column `Total` correspond to the sum of the percentages of the line. 

The results are given as percentage: for each alternative it gives the proportion of times it was classified in each category. 


In [14]:
print("The pessimistic sorting of the scenarios is:")
print(p_sorting_transposed)

The pessimistic sorting of the scenarios is:
         C1     C2     C3    C4   C5  Total
S1.1  100.0    0.0    0.0   0.0  0.0    100
S1.2   30.0   70.0    0.0   0.0  0.0    100
S1.3    0.0   90.0   10.0   0.0  0.0    100
S1.4   40.0   60.0    0.0   0.0  0.0    100
S2.1   20.0   70.0   10.0   0.0  0.0    100
S2.2    0.0    0.0   70.0  30.0  0.0    100
S2.3    0.0   20.0   80.0   0.0  0.0    100
S2.4    0.0   10.0   70.0  20.0  0.0    100
S3.1   10.0   90.0    0.0   0.0  0.0    100
S3.2    0.0    0.0  100.0   0.0  0.0    100
S3.3   20.0   20.0   60.0   0.0  0.0    100
S3.4    0.0   10.0   90.0   0.0  0.0    100
S4.1    0.0  100.0    0.0   0.0  0.0    100
S4.2    0.0  100.0    0.0   0.0  0.0    100
S4.3   30.0   70.0    0.0   0.0  0.0    100
S4.4    0.0  100.0    0.0   0.0  0.0    100
S5.1  100.0    0.0    0.0   0.0  0.0    100
S5.2  100.0    0.0    0.0   0.0  0.0    100
S5.3  100.0    0.0    0.0   0.0  0.0    100
S5.4  100.0    0.0    0.0   0.0  0.0    100
S6.1    0.0   20.0   80.0   0.0

### Analysis

##### 1. Scenarios are spread in the categories

The output obtained is a table with the percentage for each alternatives to be classified in each category. This new representation of the results adds significant information. 
In the Table 1, the results obtained without the new procedure can be compared to the one just obtained above with the code. This was applied to the case study, and the focus is made on the optimistic sorting on the four scenarios : 'S2.1', 'S2.2', 'S2.3' and 'S2.4'. This analysis is also valid for the other scenarios and for the pessimistic ranking.

<center>
<figure>
  <img src="analysis1.png" width="35%" height="35%">
  <figcaption><i> Table 1: Optimistic ranking obtained without the new procedure for 4 scenarios<i></figcaption>
</figure>
</center>




This gives major information on the scenarios. Firstly, the **results are more nuanced**, which makes it easier to compare the alternatives. Indeed, by focusing on scenario `S2.1`, it can be observed that in the classic ELECTRE Tri method, it is classified in category 3. The new procedure shows that by integrating the fluctuation of the data, this scenario is indeed classified the majority of the time in category 3 but is also classified in category 2 and 4 (see Optimistic ranking above). This provides information on the possibility of ranking alternatives in the face of uncertainty and variance. Between two alternatives that seem equivalent in ranking, the percentage of ranking in other categories allows to differentiate them. When looking at two scenarios that are initially classified in the same category, `S2.2` and `S2.4` in category 4, it is difficult to distinguish between them with this information alone. Thanks to the integration of fluctuation, these two scenarios are also classified in categories 2, 3 and 5. However, the scenario `S2.2` is more often classified in category 5 than the scenario. These results can therefore lead to believe that the `S2.2` scenario performs better than the `S2.4` scenario and therefore **allows the decision makers to separate two alternatives that at first seemed equal**. 

This allocation allows for a more **specific study of elementary actions**. The alternatives are cleverly constructed to observe the impact of elementary actions on rankings. This added information makes it possible to observe more precisely whether or not an elementary action improves the overall performance of an alternative.  
As an exemple, it is possible to have a look at the results obtained for the 4 following scenarios: S2.1, S2.2, S3.1, S3.2. All Group 2 scenarios have an electric radiant panel and existing electric floor and all Group 3 scenarios have electric storage heating without electric floor heating. Otherwise, these two families of scenarios are built the same with the index 2 scenarios diverging from the index 1 scenarios by solar panels on the roof. The index 1 scenarios are not provided with any autonomous energy system. By analysing the results obtained previsouly, it is easily observable that the scenarios including solar panels are better categorised. 

##### 2. Impact of the variance

Alternatives do not respond uniformly to the distributions applied by criteria. Some alternatives can
be classified in 4 different categories while some of them are always classified in only 1 category. If we look at the
scenarios of family 2 : `S2.1`, `S2.2`, `S2.3` and `S2.4`, and family 7 :  `S7.1`, `S7.2`, `S7.3` and `S7.4`: 

Both families of scenarios were exposed to fluctuations with similar standard deviations. However, it is noteworthy that the scenarios of family 2 can be classified into four distinct categories based on their fluctuation values, while the scenarios of family 7 are consistently classified in the same category. This discrepancy in classification is influenced by various factors, including the proximity of the alternatives’ performance to the established thresholds. Even a minor deviation in the data can cause the threshold to be exceeded, resulting in a different classification. Furthermore, the weight assigned toeach criterion being evaluated also plays a role, as a variation in performance for a criterion with a high weight is more likely to affect the overall results

# Conclusion

The proposed new procedure adds a significant amount of information by providing the percentage of ranking for each alternative within each category, which allows for a more detailed differentiation of alternatives. This method also facilitates the comparison of alternatives that were previously ranked at the same level, thereby providing a more precise examination of the performance of individual actions within scenarios. To fully leverage the benefits of this method, it is crucial for decision-makers to have a solid understanding of the process, including how different data and parameters affect the results.

Furthermore, this procedure can be effectively combined with other multi-criteria analysis methods to ensure that no important information is overlooked, since they use the same performance matrix to group the input data for the analysis process. By generalizing the use of distributions, instead of fixed values, this method allows for a more comprehensive understanding of the sensitivity of the results to the input data and can help identify a wider range of potential outcomes.

### Bibliography 
*Almeida-Dias, J., J.R. Figueira, and B. Roy (2010). “ELECTRE TRI-C: A multiple criteria sorting
method based on characteristic reference actions”. In: European Journal of Operational Research.*<br>
*Chau, T., S. Young, and S. Redekop (2005). “Managing variability in the summary and comparison of
gait data”. In: Journal of NeuroEngineering and Rehabilitation 22.* <br>
*Corrente, S., S. Greco, and R. Slowinski (2016). “Multiple Criteria Hierarchy Process for ELECTRE Tri
methods”. In: European Journal of Operational Research, 252, pp. 191–203* <br>
*Daniel, S. and C. Ghiaus (2022). “Multi-criteria decision analysis of energy retrofit of residential buildings:
methodology and feedback from real application”. In: Energies 15* <br>
*Faber, M. H. (2005). “On the Treatment of Uncertainties and Probabilities in Engineering Decision
Analysis”. In: Journal of Offshore Mechanics and Artic Engeneering 127, pp. 243–248.* <br>
*Harrison, R. L. (2010). “Introduction to Monte Carlo Simulation.” In: AIP Conference Proceedings 1204,
pp. 17–21.* <br>
*Kenton, W. (2022). “The Basics of Probability Density Function (PDF), With an Example.” In: Investo-
pedia.*<br>
*Roy, B. (1985). “Methodologie Multicritère d’Aide à la Decision.” In: Economica, Paris.*