# Code Python - Electre Tri 

## Introduction to the project

The company at the origin of the request is the social landlord 3F, part of the national group Action Logement. Its request concerns the renovation of three of its housing buildings located in the Lyon region. These buildings having been built in the years 2014, the company thus considered necessary to carry out renovation works for the whole of these 3 buildings. 

The company has found it difficult in the past to find out how to renovate a building in view regarding the different aspects that come into play. For example, when looking at the minimum energy loss between primary and final energy, gas appears to be the most interesting form of energy. However, when looking at the environmental impact of this form of energy, gas is badly ranked. Thus, they wish to take into account several aspects of energy renovation in this project.  

Since several options of renovation are possible and the decision is based on multiple criteria, it has been chosen thaht a multi-criteria analysis should be carried out.


### Electre Tri as a multi-criteria analysis 

The Electre Tri method is the multi-criteria analysis that is selected for the project. In its process, the input data for each item is normalised using thresholds and compared to difference profiles that separate categories. This method results in an optimistic and pessimistic ranking of the elements in which every actions are ranked in categories.

The following code allows to execute step by step the calculations of the Electre Tri method :

*expliquer peut être comment fonctionne la méthode vite fait en mode il y a 28 scenarios 16 critères et il faut poids thresholds etc*

In [455]:
import csv
import pandas as pd
import numpy as np
from numpy import random, vstack, empty
import math


### Import of data from csv file as a Pandas Dataframe

The input of the whole analysis is a csv file containing the following informations : 
- The mean value of the performance of each scenario regarding each criteria 
- The weight of each criteria 
- The variance of each criteria
- The 5 reference profiles : b0, b1, b2, b3, b4 and b5 
- The 3 thresholds : q (the indiference threshold), p (the preference threshold), v (the veto threshold)

It is imported as a dataframe 'd'.


In [456]:
d = pd.read_csv('Input_data.csv')
λ = 0.75

### Monte Carlo Function

Explanation of what is Monte Carlo & how the function works 

In [457]:
def MCarlo(d):
    for i in d.index:
        variance = d['VAR'][i]
        for j in d.iloc[:, 0:28]:
            m = d[j][i]
            v = abs(m*variance)
            perf = random.normal(m, v, 1)
            d[j][i] = perf[0]
    return d

autre version 

In [458]:
def MCarlo2(d):
    variance = d['VAR'].values
    m = d.iloc[:, 0:28].values
    v = np.abs(m * variance[:, np.newaxis])
    perf = np.random.normal(m, v)
    d.iloc[:, 0:28] = perf
    return d

### Concordance

The concordance matrix is a table that compare each pair of alternatives being considered, in our case, the sceanarios. In other words, it evaluates how well each option performs relative to the others with respect to the set of criteria. 

This function take as input the DataFrame containig all the performances as well as all the others parameters and input of the method, but only the performances, the reference profiles, and the thresholds will be used.

The objective is to calculate the concordance between each pair of alternative and reference profiles and in both ways: 
- The concordance $C_j(a_i,b_k)$
- The concordance $C_j(a_i,b_k)$ <br>
*for $i$ the scenarios, $k$ the reference profiles and $j$ the criteria*

Here is how the two types of concordance are calculated in the function: <br>
<center>

$C_j(a_i,b_k) = uj(a_i)-u_j(b_j)+p_j/p_j-q_j$<br>
$C_j(b_k,a_i) = uj(b_j)-u_j(a_i)+p_j/p_j-q_j$<br>

</center>


If the value is higher than one it is replaced by one, and if it is smaller dans zero it is replaced by zero. 

Finally, the function returns two DataFrames : 
- `dconca` : The concordance between the performances and the reference profiles $C_k(a_i,b_j)$
- `dconcb` : The concordance between the reference profiles and the performances $C_k(a_i,b_j)$


In [459]:
def conce(d):
    new_df = pd.DataFrame()
    new_df2 = pd.DataFrame()
    q = d[d.columns[36]]
    p = d[d.columns[37]]
    v = d[d.columns[38]]
    for sc in d.iloc[:, 0:28]: #pour chaque scénario sc
        dscenar = d[sc]
        for pr in d.iloc[:, 30:36]: #the scenario sc is compared to each profil pr
            alpha = (dscenar-d[pr]+p)/(p-q)
            beta = (d[pr]-dscenar+p)/(p-q)
            new_df = pd.concat([new_df, alpha], axis=1, ignore_index=True)
            new_df2 = pd.concat([new_df2, beta], axis=1, ignore_index=True)
    new_df[new_df<0]=0
    new_df[new_df>1]=1
    new_df2[new_df2<0]=0
    new_df2[new_df2>1]=1
    return new_df, new_df2

### Discordance

The discordance matrix is a matrix that is used to represent the degree of discordance between pairs of alternatives. It is typically constructed by comparing the values of each alternative on each criterion, and determining whether the difference between the values is significant enough to cause discordance. 

The objective is to calculate the discordance between each pair of alternative and reference profiles and in both ways: 
- The discordance $D_j(a_i,b_k)$
- The discordance $D_j(b_k,a_i)$ <br>
*for $i$ the scenarios, $k$ the reference profiles and $j$ the criteria*

Here is how the two types of discordance are calculated in the function: <br>
<center>

$D_j(a_i,b_k) = uj(b_k)-u_j(a_i)-p_j/v_j-p_j$<br>
$D_j(b_k,a_i) = uj(a_i)-u_j(b_k)-p_j/v_j-p_j$<br>

</center>


If the value is higher than one it is replaced by one, and if it is smaller dans zero it is replaced by zero. 


Finally, the function returns two DataFrames : 
- `ddiscoa` : The discordance between the performances and the reference profiles $D_j(a_i,b_k)$
- `ddiscob` : The discordance between the reference profiles and the performances $D_j(b_k,ba_i)$

In [460]:
def disco(d):
    new_df = pd.DataFrame()
    new_df2 = pd.DataFrame()
    for sc in d.iloc[:, 0:28]: #pour chaque scénario sc
        for pr in d.iloc[:, 30:35]: #pour chaque profils
            alpha = (d[sc]-d[pr]+d[d.columns[37]])/(d[d.columns[38]]-d[d.columns[37]])
            beta = (d[pr]-d[sc]+d[d.columns[37]])/(d[d.columns[38]]-d[d.columns[37]])
            new_df = pd.concat([new_df, alpha], axis=1, ignore_index=True)
            new_df2 = pd.concat([new_df2, beta], axis=1, ignore_index=True)
    new_df[new_df<0]=0
    new_df[new_df>1]=1
    new_df2[new_df2<0]=0
    new_df2[new_df2>1]=1
    return new_df, new_df2


### Global concordance

The function allows to calculate the global concordance of each scenario regarding each threshold. It takes as input the concordance matrix and the weights for each criteria. <br>
*explain what si the global concordance for*

The function takes as input the weights of each criterion, located in the `d` DataFrame as well as the concordance matrix, separated into 2 DataFrames previously : `dconca` and `dconcb`. 

The objective is, for each scenario calculate the following global concordance : 

<center>

$C(a_i,b_k) = \frac {\sum_{j} C_j(a_i,b_k) * w_j}{\sum_{j} w_j}$

</center>

*with i the scenarios and k the reference profiles*








autre version optimisée je sais pas encore laquelle choisir

In [461]:
def global_conc(d,dconc1):
    new_df = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'], columns=['S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1','S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2','S7.3','S7.4']) 
    i = 0
    for j in range(0, len(dconc1.columns),6):
        a = sum(dconc1[j]*d[d.columns[28]])/sum(d[d.columns[28]]) 
        b = sum(dconc1[j+1]*d[d.columns[28]])/sum(d[d.columns[28]])  
        c = sum(dconc1[j+2]*d[d.columns[28]])/sum(d[d.columns[28]]) 
        dr = sum(dconc1[j+3]*d[d.columns[28]])/sum(d[d.columns[28]])  
        e = sum(dconc1[j+4]*d[d.columns[28]])/sum(d[d.columns[28]]) 
        f = sum(dconc1[j+5]*d[d.columns[28]])/sum(d[d.columns[28]]) 
        th = [a,b,c,dr,e,f]
        new_df[new_df.columns[i]]= th
        i = i+1
    return new_df

### Degree of credibility

In [462]:
def credibility(dgconc, ddisc):
    dcred = dgconc.copy()
    for j in range(0, len(ddisc),6):           #pour toutes les colonnes de discordance,toutes les 6 colonnes donc pour chaque scénario
        s = int(j/6)
        cglobal = [dgconc[dgconc.columns[s]][0], dgconc[dgconc.columns[s]][1],dgconc[dgconc.columns[s]][2], dgconc[dgconc.columns[s]][3], dgconc[dgconc.columns[s]][4], dgconc[dgconc.columns[s]][5]]
        dc = [0, 0, 0, 0, 0, 0]
        for i in range(len(cglobal)):        #pour chaque profil de référence
            verif = 0
            for c in ddisc.index:           #parcours les valeurs de la colonne
                if  ddisc[j+i][c] > dgconc[dgconc.columns[s]][i]:           #si une valeur de la colonne est supérieur au coef de concordance global 
                    verif = verif + 1
            if verif == 0 :
                dc[i] = dgconc[dgconc.columns[s]][i]
            else: 
                df_mask = ddisc[j+i] > dgconc[dgconc.columns[s]][i]
                filtered_ddisc = ddisc[df_mask]
                degree = (((1-filtered_ddisc[j+i])/(1-dgconc[dgconc.columns[s]][i])).prod())*dgconc[dgconc.columns[s]][i] #le degré de credibilité du profil i et du scénario j
        dcred[dcred.columns[s]] = dc
    print(dcred)
    return dcred


Ranking

In [463]:
def over_ranking_relations2(creda, credb, λ):
    #new_df = creda.copy()
    new_df = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'], columns=['S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1','S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2','S7.3','S7.4']) 
    i = 0
    classementa = creda.apply(lambda x: x-λ)
    classementb = credb.apply(lambda x: x-λ)
    classementa[classementa>0]= 1 #surclasse
    classementa[classementa<0]= 0 #ne surclasse pas
    classementb[classementb>0]= 1
    classementb[classementb<0]= 0
    for i in creda:
        for j in creda.index:
            if classementa[i][j] == classementb[i][j] == 1: #si les 2 surclassent
                new_df[i][j] = 'I'
            elif classementa[i][j] == classementb[i][j] == 0: #si les 2 ne surclassement pas
                new_df[i][j] = 'R'
            elif classementa[i][j] == 0: #si seulement b surclasse s
                new_df[i][j] = '<'
            elif classementa[i][j] == 1: #si seulement s surclasse b
                new_df[i][j] = '>'
    return new_df

In [464]:
def over_ranking_relations(creda, credb, λ):
    #new_df = creda.copy()
    new_df = pd.DataFrame(index=['b0', 'b1', 'b2', 'b3', 'b4', 'b5'], columns=['S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1','S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2','S7.3','S7.4']) 
    classementa = creda.apply(lambda x: x-λ)
    classementb = credb.apply(lambda x: x-λ)
    classementa[classementa > 0] = 1  # surclasse
    classementa[classementa < 0] = 0  # ne surclasse pas
    classementb[classementb > 0] = 1
    classementb[classementb < 0] = 0
    mask = (classementa == classementb) & (classementa == 1)
    new_df = new_df.mask(mask, "I")
    mask = (classementa == classementb) & (classementa == 0)
    new_df = new_df.mask(mask, "R")
    mask = (classementb != 0) & (classementa == 0)
    new_df = new_df.mask(mask, "<")
    mask = (classementa != 0) & (classementb == 0)
    new_df = new_df.mask(mask, ">")
    return new_df

Pessimiste sorting

In [465]:
def pessimistic_sort(dov,new_df):
    cat = new_df.index
    for col in dov: #pour le scéénario col  
        etape = new_df[col] 
        for j in reversed(range(len(dov.index))): 
            if dov[col][j] == '>' or dov[col][j] == 'I':
                etape[etape.index[j]] = etape[etape.index[j]] +1
                break
        new_df[col] = etape
    return new_df 



Optimistic sorting

In [466]:
def optimistic_sort(dov,new_df):
    cat = new_df.index
    for col in dov: 
        etape = new_df[col] 
        for j in range(len(dov.index)): 
            if dov[col][j] == '<' or dov[col][j] == 'R':
                etape[etape.index[j-1]] = etape[etape.index[j-1]] +1
                break
        new_df[col] = etape 
    return new_df 



In [467]:
def electre_tri (d,rep):
    temp = np.zeros((5,28))
    pessi_sort = pd.DataFrame(temp, index=['C1', 'C2', 'C3', 'C4', 'C5'], columns=['S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1','S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2','S7.3','S7.4'])
    opti_sort = pd.DataFrame(temp, index=['C1', 'C2', 'C3', 'C4', 'C5'], columns=['S1.1','S1.2','S1.3','S1.4','S2.1','S2.2','S2.3','S2.4','S3.1','S3.2','S3.3','S3.4','S4.1','S4.2','S4.3','S4.4','S5.1','S5.2','S5.3','S5.4','S6.1','S6.2','S6.3','S6.4','S7.1','S7.2','S7.3','S7.4'])
    for i in range(rep) :
        d = MCarlo(d)
        dconca, dconcb = conce(d)
        ddisca, ddiscb = disco(d)
        dgconca = global_conc(d,dconca)
        dgconcb = global_conc(d,dconcb)
        dcreda = credibility(dgconca, ddisca)
        dcredb = credibility(dgconcb, ddiscb)
        dranking = over_ranking_relations(dcreda, dcredb, λ)
        pessi_sort = pessimistic_sort(dranking,pessi_sort)
        opti_sort = optimistic_sort(dranking,opti_sort)
    pessi_sort = pessi_sort.apply(lambda x: (x/rep)*100)
    opti_sort = opti_sort.apply(lambda x: x/rep*100)
    return opti_sort, pessi_sort, dranking

In [468]:
repet = 10
opti_sort, pessi_sort, dranking = electre_tri (d, repet)

    S1.1  S1.2  S1.3      S1.4      S2.1      S2.2      S2.3      S2.4  \
b0   1.0   1.0   1.0  1.000000  1.000000  1.000000  1.000000  1.000000   
b1   0.0   0.0   0.0  0.850321  0.996735  0.920853  1.000000  1.000000   
b2   0.0   0.0   0.0  0.752925  0.810062  0.853884  0.942856  0.885218   
b3   0.0   0.0   0.0  0.637897  0.765654  0.762175  0.715518  0.761008   
b4   0.0   0.0   0.0  0.530701  0.644016  0.359809  0.402541  0.592953   
b5   0.0   0.0   0.0  0.098357  0.094607  0.096807  0.105299  0.104852   

        S3.1      S3.2  ...      S5.3      S5.4      S6.1      S6.2      S6.3  \
b0  1.000000  1.000000  ...  1.000000  1.000000  1.000000  1.000000  1.000000   
b1  0.879149  0.945154  ...  0.666602  0.784609  1.000000  0.973277  0.947671   
b2  0.789258  0.899098  ...  0.591905  0.622233  0.886330  0.814455  0.753138   
b3  0.546257  0.421179  ...  0.336690  0.363068  0.644364  0.722678  0.573299   
b4  0.388810  0.264341  ...  0.287765  0.322168  0.317395  0.432542  0.51549

Ici je pense on peut garder car prends directement les infos des csv mais prendre avec panda et pas faire "append"