# Supplier Selection Methods - TOPSIS
Prepared by: Nickolas Freeman, Ph.D.

This notebook presents the TOPSIS method for multi-criteria decision-making. First, we import the packages that we are going to use.

In [1]:
import pandas as pd
pd.set_option('display.float_format', lambda x: '%.2f' % x)
pd.options.display.max_rows = 100

import numpy as np

import matplotlib.pyplot as plt
plt.rcParams.update({'font.family': 'STIXGeneral', 'mathtext.fontset': 'stix'})

import seaborn as sns

%matplotlib inline

from ipywidgets import interact

We will be using a dataset that includes scores for 100 suppliers on several qualitative and quantitative factors (the same dataset from the first notebook). The following code block reads in the data, which is in a .csv file named "Supplier_Data.csv" that is stored in a "data" sub-folder of the current working directory. The 'Warranty Terms', 'Payment Terms', 'Technical Support', 'Sustainability Efforts', and 'Financial Stability' columns include subjective scores for each supplier that range from 1 to 10, with 1 being the lowest rating and 10 being the highest rating.

In [2]:
supplier_data = pd.read_csv("data/Supplier_Data.csv")
supplier_data.head(10)

| | Supplier | Warranty Terms | Payment Terms | Technical Support | Sustainability Efforts | Financial Stability | Unit Cost | Lead Time (Days) | On Time Delivery |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2.42 | 2.8 | 8.25 | 3.09 | 8.21 | 1.05 | 8 | 0.75 |
| 1 | 2 | 6.51 | 1.88 | 3.6 | 6.45 | 4.31 | 1.21 | 13 | 0.83 |
| 2 | 3 | 8.51 | 1.06 | 2.97 | 9.36 | 5.06 | 1.22 | 12 | 0.91 |
| 3 | 4 | 4.63 | 4.6 | 5.84 | 7.34 | 9.2 | 1.01 | 9 | 0.72 |
| 4 | 5 | 4.62 | 7.44 | 1.5 | 8.93 | 4.01 | 1.03 | 7 | 0.97 |
| 5 | 6 | 7.87 | 6.58 | 2.93 | 6.88 | 6.3 | 1.18 | 14 | 0.83 |
| 6 | 7 | 8.87 | 5.94 | 3.32 | 7.45 | 3.0 | 1.13 | 6 | 1.0 |
| 7 | 8 | 9.15 | 4.35 | 1.28 | 9.2 | 3.07 | 1.04 | 15 | 0.81 |
| 8 | 9 | 8.16 | 2.39 | 1.74 | 8.8 | 9.41 | 1.02 | 8 | 0.95 |
| 9 | 10 | 1.66 | 7.7 | 4.11 | 5.76 | 5.72 | 1.1 | 11 | 0.98 |


# TOPSIS

TOPSIS stands for **T**echnique for **O**rder of **P**reference by **S**imilarity to **I**deal **S**olution. From https://en.wikipedia.org/wiki/TOPSIS (accessed on 2/8/18):

> TOPSIS is a multi-criteria decision analysis method, which was originally developed by Hwang and Yoon in 1981 with further developments by Yoon in 1987, and Hwang, Lai and Liu in 1993. TOPSIS is based on the concept that the chosen alternative should have the shortest geometric distance from the positive ideal solution (PIS) and the longest geometric distance from the negative ideal solution (NIS). It is a method of compensatory aggregation that compares a set of alternatives by identifying weights for each criterion, normalising scores for each criterion and calculating the geometric distance between each alternative and the ideal alternative, which is the best score in each criterion. An assumption of TOPSIS is that the criteria are monotonically increasing or decreasing. Normalisation is usually required as the parameters or criteria are often of incongruous dimensions in multi-criteria problems. Compensatory methods such as TOPSIS allow trade-offs between criteria, where a poor result in one criterion can be negated by a good result in another criterion.

The following image depicts the underlying concept of the TOPSIS method as the distance between a particular solution and the *positive ideal* and *negative ideal* solutions.
<img src="images/TOPSIS_Visual.jpg" style="width: 900px;">

The TOPSIS process is carried out as follows (from https://en.wikipedia.org/wiki/TOPSIS (accessed on 2/8/18)):

> #### Step 1:
>Create an evaluation matrix consisting of m alternatives and n criteria, with the intersection of each alternative and criteria given as $\displaystyle x_{ij}$, we therefore have a matrix $\displaystyle (x_{ij})_{m\times n}$.
 
> #### Step 2:
> The matrix $\displaystyle (x_{ij})_{m\times n}$ is then normalized to form the matrix $R = \displaystyle (r_{ij})_{m\times n}$, using the normalization method $\displaystyle r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{m}x^{2}_{ij}}}, i = 1,2,\ldots,m,~j = 1,2,\ldots,n$.

> #### Step 3:
> Calculate the weighted normalized decision matrix $\displaystyle t_{ij}=r_{ij}\cdot w_{j},~i=1,2,...,m,~j=1,2,...,n$, where $\displaystyle w_{j}=W_{j}/\sum _{j=1}^{n}W_{j},~j=1,2,...,n$ so that $\displaystyle \sum _{j=1}^{n}w_{j}=1$, and $W_{j}$ is the original weight given to indicator $\displaystyle v_{j},~j=1,2,\ldots,n$.

> #### Step 4: 
> Determine the worst alternative $\displaystyle (A_{w})$ and the best alternative $\displaystyle (A_{b})$:

> $$A_{w}=\{\langle max(t_{{ij}}|i=1,2,...,m)|j\in J_{-}\rangle ,\langle min(t_{{ij}}|i=1,2,...,m)|j\in J_{+}\rangle \rbrace \equiv \{t_{{wj}}|j=1,2,...,n\rbrace,$$

> $$\displaystyle A_{b}=\{\langle min(t_{ij}|i=1,2,...,m)|j\in J_{-}\rangle ,\langle max(t_{ij}|i=1,2,...,m)|j\in J_{+}\rangle \rbrace \equiv \{t_{bj}|j=1,2,...,n\},$$

> where,
$\displaystyle J_{+}=\{j=1,2,...,n|j\}$ associated with the criteria having a positive impact, and
$\displaystyle J_{-}=\{j=1,2,...,n|j\}$ associated with the criteria having a negative impact.

> #### Step 5:
> Calculate the L2-distance between the target alternative $\displaystyle i$ and the worst condition $\displaystyle A_{w}$,

> $$\displaystyle d_{iw}={\sqrt {\sum _{j=1}^{n}(t_{ij}-t_{wj})^{2}}},i=1,2,...,m,$$

> and the distance between the alternative $\displaystyle i$ and the best condition $\displaystyle A_{b}$,

> $$\displaystyle d_{ib}={\sqrt {\sum _{j=1}^{n}(t_{ij}-t_{bj})^{2}}},i=1,2,...,m,$$

> where $\displaystyle d_{iw}$ and $\displaystyle d_{ib}$ are L2-norm distances from the target alternative $\displaystyle i$ to the worst and best conditions, respectively.

> #### Step 6:
> Calculate the similarity to the worst condition:

> $$\displaystyle s_{iw}=d_{iw}/(d_{iw}+d_{ib}),0\leq s_{iw}\leq 1,i=1,2,...,m.$$

> $\displaystyle s_{iw}=1$ if and only if the alternative solution has the best condition; and

> $\displaystyle s_{iw}=0$ if and only if the alternative solution has the worst condition.

> #### Step 7:
> Rank the alternatives according to $\displaystyle s_{iw}(i=1,2,...,m)$.
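
To make the seven steps concrete, the following is a minimal sketch on a small made-up example. The data, the weights, and the `is_benefit` mask are assumptions introduced purely for illustration; they are not part of the supplier dataset. The mask marks which criteria belong to $J_{+}$ (higher is better) and which belong to $J_{-}$ (lower is better), so the sketch also shows how Step 4 treats a cost-type criterion.

```python
import numpy as np

# Hypothetical data: 3 alternatives (rows) scored on 2 criteria (columns).
# Column 0 is a benefit criterion (higher is better);
# column 1 is a cost criterion (lower is better).
x = np.array([[8.0, 1.20],
              [6.0, 1.00],
              [9.0, 1.50]])
raw_weights = np.array([3.0, 1.0])      # W_j, illustrative values only
is_benefit = np.array([True, False])    # True for j in J+, False for j in J-

# Step 2: vector (L2) normalization of each column.
r = x / np.sqrt((x ** 2).sum(axis=0))

# Step 3: normalize the weights to sum to 1 and apply them column-wise.
w = raw_weights / raw_weights.sum()
t = r * w

# Step 4: best and worst alternatives, taking criterion direction into account.
a_best = np.where(is_benefit, t.max(axis=0), t.min(axis=0))
a_worst = np.where(is_benefit, t.min(axis=0), t.max(axis=0))

# Step 5: L2 distances to the best and worst alternatives.
d_best = np.sqrt(((t - a_best) ** 2).sum(axis=1))
d_worst = np.sqrt(((t - a_worst) ** 2).sum(axis=1))

# Step 6: closeness score in [0, 1]; larger is better.
s = d_worst / (d_best + d_worst)

# Step 7: alternative indices ordered from best to worst.
ranking = np.argsort(-s)
print(s)
print(ranking)
```

When every criterion is a benefit criterion, the `np.where` calls reduce to a plain column-wise maximum and minimum.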

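The following code block applies the TOPSIS method to the supplier data. All five criteria used here are subjective ratings for which a higher value is better, so the positive ideal solution (PIS) is the column-wise maximum of the weighted matrix and the negative ideal solution (NIS) is the column-wise minimum. Note that the weights specified below place all of the emphasis on 'Warranty Terms'.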

In [3]:
criteria = ['Warranty Terms', 'Payment Terms', 'Technical Support','Sustainability Efforts', 'Financial Stability']
criteria_weights = np.array([9,0,0,0,0])

#Step 1
evaluation_matrix = supplier_data[criteria].values

#Step 2
squared_evaluation_matrix = evaluation_matrix**2
normalized_evaluation_matrix = evaluation_matrix/np.sqrt(np.sum(squared_evaluation_matrix,axis=0))

#Step 3
weights = criteria_weights/criteria_weights.sum()
weighted_matrix = normalized_evaluation_matrix * weights

#Step 4
PIS = np.max(weighted_matrix, axis=0)
NIS = np.min(weighted_matrix, axis=0)

#Step 5
intermediate = (weighted_matrix - PIS)**2
Dev_Best = np.sqrt(intermediate.sum(axis = 1))

intermediate = (weighted_matrix - NIS)**2
Dev_Worst = np.sqrt(intermediate.sum(axis = 1))

#Step 6
Closeness = Dev_Worst/(Dev_Best+Dev_Worst)

#Step 7
supplier_data['TOPSIS_Score'] = Closeness.tolist()
supplier_data.sort_values(by='TOPSIS_Score',ascending=False)

| | Supplier | Warranty Terms | Payment Terms | Technical Support | Sustainability Efforts | Financial Stability | Unit Cost | Lead Time (Days) | On Time Delivery | TOPSIS_Score |
|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 21 | 9.97 | 3.4 | 7.98 | 6.43 | 9.3 | 1.07 | 6 | 0.86 | 1.0 |
| 68 | 69 | 9.94 | 9.48 | 9.12 | 2.16 | 7.14 | 1.15 | 2 | 0.9 | 1.0 |
| 11 | 12 | 9.9 | 2.64 | 6.02 | 2.34 | 8.75 | 1.18 | 3 | 0.71 | 0.99 |
| 77 | 78 | 9.85 | 2.84 | 5.07 | 8.7 | 3.57 | 1.06 | 11 | 0.86 | 0.99 |
| 42 | 43 | 9.77 | 3.18 | 8.64 | 2.08 | 2.26 | 1.1 | 7 | 0.75 | 0.98 |
| 60 | 61 | 9.65 | 8.21 | 4.28 | 6.35 | 4.43 | 1.21 | 5 | 0.72 | 0.96 |
| 18 | 19 | 9.48 | 8.47 | 3.42 | 7.91 | 2.62 | 1.25 | 2 | 0.83 | 0.94 |
| 63 | 64 | 9.36 | 8.77 | 5.19 | 2.4 | 6.0 | 1.21 | 8 | 0.87 | 0.93 |
| 47 | 48 | 9.34 | 5.29 | 7.61 | 8.26 | 9.71 | 1.21 | 8 | 0.75 | 0.93 |
| 39 | 40 | 9.34 | 3.97 | 7.94 | 1.09 | 1.3 | 1.14 | 9 | 0.78 | 0.93 |
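
Because all of the weight is on a single criterion, the resulting ordering simply follows 'Warranty Terms'. In that special case the TOPSIS score should reduce to min-max scaling of the one weighted column, since the distances to the best and worst points collapse to one dimension. The following sketch checks this on a small made-up array; the values and variable names are assumptions for illustration, not rows from the supplier data.

```python
import numpy as np

# Hypothetical scores for 5 suppliers on 2 criteria (illustrative values only).
scores = np.array([[2.4, 8.2],
                   [6.5, 4.3],
                   [8.5, 5.1],
                   [4.6, 9.2],
                   [9.9, 1.5]])
weights = np.array([9.0, 0.0])   # all weight on the first criterion

# Steps 2-6 of TOPSIS, assuming every criterion is "higher is better".
normalized = scores / np.sqrt((scores ** 2).sum(axis=0))
weighted = normalized * (weights / weights.sum())
dev_best = np.sqrt(((weighted - weighted.max(axis=0)) ** 2).sum(axis=1))
dev_worst = np.sqrt(((weighted - weighted.min(axis=0)) ** 2).sum(axis=1))
closeness = dev_worst / (dev_best + dev_worst)

# Min-max scaling of the single criterion that carries weight.
first = scores[:, 0]
min_max = (first - first.min()) / (first.max() - first.min())

# Compare the two results element by element.
print(np.allclose(closeness, min_max))
```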


The following code block defines a function that will compute the TOPSIS score for supplier scores passed to the function as a dataframe. As was the case for our function that computed the weighted score in notebook 1, this function expects three arguments:  
1. a dataframe that includes columns for the various criteria and the score for each supplier, 
2. a list of the column names for the criteria to be scored, and 
3. an array of the weights for the criteria. 

**Note that the number of criteria in the criteria list should be the same as the number of entries in the array!**

In [4]:
def Compute_TOPSIS(df,criteria_list,weights_array):
    '''This function computes a TOPSIS score for suppliers based on the specified criteria and weights
    
    Arguments
    df: the dataframe that includes the scores for suppliers on each criterion.
    The dataframe should be structured such that columns contain the scores for all suppliers
    
    criteria_list: a list object that specifies the columns in df that contain the various criteria scores
    
    weights_array: a numpy array that specifies the weights for the various criteria specified in the criteria_list. 
    The order in which the weights are specified should match the order in which the criteria are provided
    '''
    
    #Step 1
    evaluation_matrix = df[criteria_list].values

    #Step 2
    squared_evaluation_matrix = evaluation_matrix**2
    normalized_evaluation_matrix = evaluation_matrix/np.sqrt(np.sum(squared_evaluation_matrix,axis=0))

    #Step 3
    weights = weights_array/weights_array.sum()
    weighted_matrix = normalized_evaluation_matrix * weights

    #Step 4
    PIS = np.max(weighted_matrix, axis=0)
    NIS = np.min(weighted_matrix, axis=0)

    #Step 5
    intermediate = (weighted_matrix - PIS)**2
    Dev_Best = np.sqrt(intermediate.sum(axis = 1))

    intermediate = (weighted_matrix - NIS)**2
    Dev_Worst = np.sqrt(intermediate.sum(axis = 1))

    #Step 6
    Closeness = Dev_Worst/(Dev_Best+Dev_Worst)

    #Step 7
    df['TOPSIS Score'] = Closeness.tolist()
    df.sort_values(by='TOPSIS Score',ascending=False,inplace=True)

    return df
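
Note that `Compute_TOPSIS` adds the score column to the dataframe that is passed in and sorts that dataframe in place. Where that side effect is not wanted, one option is a variant that works on a copy and checks its inputs first. The following is a sketch under stated assumptions: the function name `compute_topsis_copy`, the validation rules, and the toy dataframe are all hypothetical, and every criterion is assumed to be "higher is better", as in the function above.

```python
import numpy as np
import pandas as pd

def compute_topsis_copy(df, criteria_list, weights_array):
    """Hypothetical non-mutating variant of Compute_TOPSIS (benefit criteria only)."""
    weights_array = np.asarray(weights_array, dtype=float)
    if len(criteria_list) != len(weights_array):
        raise ValueError("criteria_list and weights_array must have the same length")
    if weights_array.sum() <= 0:
        raise ValueError("weights must have a positive sum")

    # Steps 1-3: evaluation matrix, vector normalization, weighting.
    matrix = df[criteria_list].to_numpy(dtype=float)
    normalized = matrix / np.sqrt((matrix ** 2).sum(axis=0))
    weighted = normalized * (weights_array / weights_array.sum())

    # Steps 4-5: distances to the column-wise best and worst values.
    dev_best = np.sqrt(((weighted - weighted.max(axis=0)) ** 2).sum(axis=1))
    dev_worst = np.sqrt(((weighted - weighted.min(axis=0)) ** 2).sum(axis=1))

    # Steps 6-7: score a copy of the input and return it sorted.
    result = df.copy()
    result['TOPSIS Score'] = dev_worst / (dev_best + dev_worst)
    return result.sort_values(by='TOPSIS Score', ascending=False)

# Toy data with made-up column names and values.
toy = pd.DataFrame({'Supplier': [1, 2, 3],
                    'Quality': [7.0, 9.0, 4.0],
                    'Service': [6.0, 5.0, 8.0]})
ranked = compute_topsis_copy(toy, ['Quality', 'Service'], [2, 1])
print(ranked)
print(list(toy.columns))  # the original dataframe is left unchanged
```

The sketch does not handle the degenerate case in which all suppliers have identical weighted scores, where the two distances are both zero.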

For the sake of comparison, the following code block provides the function for computing weighted scores (sum and product) that we defined in notebook 2.

In [5]:
def Compute_Weighted_Scores(df,criteria_list,weights_array):
    '''This function computes a weighted score for suppliers based on the specified criteria and weights
    
    Arguments
    df: the dataframe that includes the scores for suppliers on each criterion.
    The dataframe should be structured such that columns contain the scores for all suppliers
    
    criteria_list: a list object that specifies the columns in df that contain the various criteria scores
    
    weights_array: a numpy array that specifies the weights for the various criteria specified in the criteria_list. 
    The order in which the weights are specified should match the order in which the criteria are provided
    '''
    
    normalized_weights = weights_array/weights_array.sum()
    
    df['Weighted Score (Sum)'] = 0
    df['Weighted Score (Product)'] = 1
    for i in range(len(criteria_list)):
        current_criteria = criteria_list[i]
        current_weight = normalized_weights[i]
        df['Weighted Score (Sum)'] += current_weight*df[current_criteria]
        df['Weighted Score (Product)'] *= df[current_criteria]**current_weight

    return df
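
The loop above accumulates the weighted sum and weighted product one criterion at a time. As a sketch of an equivalent formulation, the same two quantities can be written with NumPy array operations: the weighted sum is a matrix-vector product and the weighted product is a row-wise product of powers. The helper name `weighted_scores_vectorized` and the toy dataframe are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

def weighted_scores_vectorized(df, criteria_list, weights_array):
    """Hypothetical vectorized helper; returns (weighted sum, weighted product) arrays."""
    w = np.asarray(weights_array, dtype=float)
    w = w / w.sum()                                   # normalize weights to sum to 1
    values = df[criteria_list].to_numpy(dtype=float)
    weighted_sum = values @ w                         # sum_j w_j * x_ij
    weighted_product = np.prod(values ** w, axis=1)   # prod_j x_ij ** w_j
    return weighted_sum, weighted_product

# Toy data with made-up column names and values.
toy = pd.DataFrame({'A': [2.0, 8.0, 5.0], 'B': [9.0, 3.0, 5.0]})
ws, wp = weighted_scores_vectorized(toy, ['A', 'B'], [1, 3])
print(ws)
print(wp)
```

Unlike `Compute_Weighted_Scores`, this helper returns arrays instead of adding columns to the dataframe.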

The following code block uses the perturbation analysis presented in previous notebooks to compare the rankings for the weighted and TOPSIS methods. The ranking counts are merged into a single dataframe for the purpose of comparison.

In [6]:
from collections import Counter
np.random.seed(42)

criteria = ['Warranty Terms',
            'Payment Terms',
            'Technical Support',
            'Sustainability Efforts',
            'Financial Stability'
           ]

criteria_weights = np.array([2,4,6,7,9])

perturbations = 1000
min_perturbation = -2.0
max_perturbation = 2.0
Top_suppliers = 25

supplier_list_sum = []
supplier_list_product = []
supplier_list_TOPSIS = []

if len(criteria) != len(criteria_weights):
    print('The number of criteria and weights that you specified do not match!')
else:
    for _ in range(perturbations):
        perturbed_weights = np.random.uniform(low=min_perturbation,
                                              high=max_perturbation+0.1,
                                              size=len(criteria_weights)) + criteria_weights
        perturbed_weights = np.maximum(perturbed_weights,0)
        perturbed_weights = np.minimum(perturbed_weights,10)
        
        data = Compute_Weighted_Scores(supplier_data, criteria, perturbed_weights)
        temp = data.nlargest(Top_suppliers,columns = 'Weighted Score (Sum)')
        top_suppliers_list = temp['Supplier'].values.tolist()
        supplier_list_sum.append(top_suppliers_list)
        
        temp = data.nlargest(Top_suppliers,columns = 'Weighted Score (Product)')
        top_suppliers_list = temp['Supplier'].values.tolist()
        supplier_list_product.append(top_suppliers_list)
        
        data = Compute_TOPSIS(supplier_data, criteria, perturbed_weights)
        temp = data.nlargest(Top_suppliers,columns = 'TOPSIS Score')
        top_suppliers_list = temp['Supplier'].values.tolist()
        supplier_list_TOPSIS.append(top_suppliers_list)
        

counts_sum = Counter(x for sublist in supplier_list_sum for x in sublist)
counts_sum = pd.DataFrame.from_dict(counts_sum, orient='index').reset_index()
counts_sum = counts_sum.rename(columns={'index':'Supplier', 0:'Count'})
counts_sum.sort_values(by='Count',inplace=True,ascending=False)
counts_sum['WS Proportion'] = counts_sum['Count']/perturbations
# 'Supplier' is kept as a regular column so that it can serve as the merge key below

counts_product = Counter(x for sublist in supplier_list_product for x in sublist)
counts_product = pd.DataFrame.from_dict(counts_product, orient='index').reset_index()
counts_product = counts_product.rename(columns={'index':'Supplier', 0:'Count'})
counts_product.sort_values(by='Count',inplace=True,ascending=False)
counts_product['WP Proportion'] = counts_product['Count']/perturbations
# 'Supplier' is kept as a regular column so that it can serve as the merge key below

counts_TOPSIS = Counter(x for sublist in supplier_list_TOPSIS for x in sublist)
counts_TOPSIS = pd.DataFrame.from_dict(counts_TOPSIS, orient='index').reset_index()
counts_TOPSIS = counts_TOPSIS.rename(columns={'index':'Supplier', 0:'Count'})
counts_TOPSIS.sort_values(by='Count',inplace=True,ascending=False)
counts_TOPSIS['TOPSIS Proportion'] = counts_TOPSIS['Count']/perturbations
# 'Supplier' is kept as a regular column so that it can serve as the merge key below

merged_data = counts_sum.merge(counts_product,on='Supplier',how='outer')
merged_data = merged_data.merge(counts_TOPSIS,on='Supplier',how='outer')
merged_data.drop(labels = ['Count_x','Count_y','Count'], axis = 1,inplace=True)
merged_data.fillna(value=0,inplace=True)
merged_data

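| | Supplier | WS Proportion | WP Proportion | TOPSIS Proportion |
|---|---|---|---|---|
| 0 | 57 | 1.0 | 1.0 | 1.0 |
| 1 | 51 | 1.0 | 1.0 | 1.0 |
| 2 | 4 | 1.0 | 1.0 | 1.0 |
| 3 | 79 | 1.0 | 1.0 | 1.0 |
| 4 | 46 | 1.0 | 1.0 | 1.0 |
| 5 | 68 | 1.0 | 1.0 | 1.0 |
| 6 | 48 | 1.0 | 1.0 | 1.0 |
| 7 | 21 | 1.0 | 1.0 | 1.0 |
| 8 | 73 | 1.0 | 0.99 | 1.0 |
| 9 | 76 | 1.0 | 0.97 | 1.0 |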
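
The counting logic in the previous code block is repeated once per method. As a sketch of how that repetition could be factored out, the helper below turns a list of top-supplier lists into a series of proportions, and the series for each method are then aligned on the supplier identifier. The helper name `top_list_proportions`, the column labels, and the tiny run lists are hypothetical and are not results from the supplier data.

```python
from collections import Counter
import pandas as pd

def top_list_proportions(top_lists, label):
    """Hypothetical helper: share of runs in which each supplier made the top list."""
    counts = Counter(s for top in top_lists for s in top)
    return pd.Series(counts, name=label) / len(top_lists)

# Hypothetical top-2 lists from three perturbation runs of two methods.
runs_a = [[4, 57], [57, 4], [57, 21]]
runs_b = [[57, 4], [57, 4], [4, 21]]

# Align the two series on the supplier id; suppliers missing from a method get 0.
merged = pd.concat([top_list_proportions(runs_a, 'Method A Proportion'),
                    top_list_proportions(runs_b, 'Method B Proportion')],
                   axis=1).fillna(0)
merged = merged.rename_axis('Supplier').reset_index()
print(merged)
```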


The following code block determines the correlation among the proportions of perturbed runs in which each supplier appears in the top 25 under the various methods.

In [7]:
merged_data[['WP Proportion','WS Proportion','TOPSIS Proportion']].corr()

| | WP Proportion | WS Proportion | TOPSIS Proportion |
|---|---|---|---|
| WP Proportion | 1.0 | 0.88 | 0.85 |
| WS Proportion | 0.88 | 1.0 | 0.9 |
| TOPSIS Proportion | 0.85 | 0.9 | 1.0 |
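
The correlation above is computed on the proportion of runs in which a supplier appears in the top list, not on the rankings themselves. If the rankings are of direct interest, one option is a Spearman rank correlation, which can be obtained as the Pearson correlation of the ranks. The following sketch uses made-up scores for six suppliers; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical scores for six suppliers under two methods (illustrative values only).
scores = pd.DataFrame({'Method A Score': [0.91, 0.40, 0.75, 0.62, 0.18, 0.55],
                       'Method B Score': [0.88, 0.52, 0.70, 0.73, 0.10, 0.49]})

# Rank 1 is the best (highest) score under each method.
ranks = scores.rank(ascending=False)

# Spearman rank correlation computed as the Pearson correlation of the ranks.
rank_corr = ranks.corr()
print(rank_corr)
```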
