# TOPSIS
https://zhuanlan.zhihu.com/p/348704436

https://www.zhihu.com/question/546564958/answer/2705275789

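TOPSIS is a comprehensive evaluation method that requires a weight for each indicator. The entropy weight method derives these weights from information entropy, which makes it comparatively objective, whereas AHP derives them from subjective judgments. In practice TOPSIS is usually combined with the entropy weight method, so the entropy weight method can be treated as part of TOPSIS.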

If the seaborn dataset fails to load:

1. Use a proxy or VPN
2. https://blog.csdn.net/fightingoyo/article/details/106920773
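
Another fallback when `sns.load_dataset` cannot reach the network is to read the CSV from disk with pandas. This is a minimal sketch, assuming `mpg.csv` has already been downloaded (for example from the seaborn-data repository) to a local folder. The helper name `load_mpg_local` and the path in the usage note are hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_mpg_local(csv_path):
    """Read a locally saved copy of the mpg dataset (hypothetical helper)."""
    csv_path = Path(csv_path)
    if not csv_path.exists():
        # Fail early with a clear message instead of a network timeout
        raise FileNotFoundError(f"{csv_path} not found; download mpg.csv first")
    return pd.read_csv(csv_path)
```

With the file in place, `df = load_mpg_local("seaborn-data/mpg.csv")` could stand in for `sns.load_dataset('mpg')` in the cells below.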

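# Entropy Weight Method (EWM)

The entropy weight method (EWM) is an important information-based weighting model that has been widely studied and applied. Compared with subjective weighting models, its main advantage is that it keeps human judgment out of the indicator weights, which makes the comprehensive evaluation result more objective.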

In [3]:
import pandas as pd
import numpy as np
import seaborn as sns

In [4]:
df = sns.load_dataset('mpg')
df.head()

| | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | usa | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | usa | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | usa | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | usa | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | usa | ford torino |


## 1. Build the decision matrix

Drop the columns that are not needed and keep only the numeric data columns.

In [5]:
# Decision matrix: index rows by car name, then drop the non-numeric columns
df.index = df['name']
new_df = df.drop(['origin', 'name'], axis=1)
new_df.head()

| name | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| chevrolet chevelle malibu | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 |
| buick skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 |
| plymouth satellite | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 |
| amc rebel sst | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 |
| ford torino | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 |


## 2. Compute each indicator's proportions
Use the formula:
$$p_{ij}=\frac{x_{ij}}{\sum_{i=1}^{m}{x_{ij}}}$$

In [6]:
# Normalize each column so that it sums to 1 (the proportions p_ij)

def norm(X):
    return X/X.sum()


norm_df = new_df.apply(norm)
norm_df.head()

| name | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| chevrolet chevelle malibu | 0.001923 | 0.003685 | 0.003988 | 0.003174 | 0.002964 | 0.001937 | 0.002314 |
| buick skylark 320 | 0.001603 | 0.003685 | 0.004546 | 0.004029 | 0.003124 | 0.001856 | 0.002314 |
| plymouth satellite | 0.001923 | 0.003685 | 0.004131 | 0.003663 | 0.002906 | 0.001775 | 0.002314 |
| amc rebel sst | 0.00171 | 0.003685 | 0.003949 | 0.003663 | 0.002904 | 0.001937 | 0.002314 |
| ford torino | 0.001816 | 0.003685 | 0.003923 | 0.003419 | 0.002917 | 0.001695 | 0.002314 |


## 3. Compute the entropy of each indicator
$$
e_{j}=-k \sum_{i=1}^m p_{i j} \ln \left(p_{i j}\right)
$$

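where
$$
k=\frac{1}{\ln(m)}
$$

Then use the formula
$$
g_j=1-e_j
$$
to obtain the degree of differentiation $g_j$ of each indicator.

Finally, the formula
$$
w_j=\frac{g_j}{\sum_{l=1}^{n}{g_l}}
$$
gives the weight vector of the indicators, where $n$ is the number of indicators.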
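
As a quick sanity check on these formulas (a worked special case, not part of the original notebook): if indicator $j$ takes the same value for every alternative, then $p_{ij}=1/m$ for all $i$, and
$$
e_j=-\frac{1}{\ln m}\sum_{i=1}^m \frac{1}{m}\ln\frac{1}{m}=-\frac{1}{\ln m}\cdot\left(-\ln m\right)=1,\qquad g_j=1-e_j=0
$$
so a constant indicator receives zero weight. The more unevenly an indicator's values are spread across the alternatives, the smaller $e_j$ becomes and the larger its weight.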

In [7]:
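# Entropy values
# k = 1/ln(m); the leading minus sign of the entropy formula is folded into k
k = -(1/np.log(norm_df.shape[0]))

def entropy(X):
    return (X*np.log(X)).sum()*k

# Use a separate name so the function `entropy` is not overwritten
entropy_values = norm_df.apply(entropy)

# Degree of differentiation
dod = 1 - entropy_values

# Weights: degree of differentiation normalized to sum to 1
w = dod/dod.sum()
w.sort_values(ascending = False)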

displacement    0.369241
horsepower      0.208400
mpg             0.145895
cylinders       0.125612
weight          0.105799
acceleration    0.041895
model_year      0.003158
dtype: float64

In [9]:
w

mpg             0.145895
cylinders       0.125612
displacement    0.369241
horsepower      0.208400
weight          0.105799
acceleration    0.041895
model_year      0.003158
dtype: float64
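
The three EWM steps above can be collected into one function. The following is a minimal sketch, assuming every value is strictly positive (a zero would make `np.log` return `-inf`). The name `entropy_weights` and the toy data are made up for illustration.

```python
import numpy as np
import pandas as pd


def entropy_weights(data):
    """Return entropy-method weights for the columns of a positive-valued DataFrame."""
    m = data.shape[0]
    p = data / data.sum()                    # column-wise proportions p_ij
    e = -(p * np.log(p)).sum() / np.log(m)   # entropy e_j of each column
    g = 1 - e                                # degree of differentiation g_j
    return g / g.sum()                       # weights w_j, summing to 1


# Toy data: column "b" is constant, column "c" is the most uneven
toy = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                    "b": [5.0, 5.0, 5.0, 5.0],
                    "c": [1.0, 1.0, 1.0, 9.0]})
weights = entropy_weights(toy)
print(weights)
```

The constant column `b` ends up with a weight of essentially zero, and `c` outweighs `a`. This mirrors why `model_year`, which varies little in relative terms, gets the smallest weight above.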

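# TOPSIS method
The chosen alternative should have the shortest geometric distance to the positive ideal solution (PIS) and the longest geometric distance from the negative ideal solution (NIS).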

In [10]:
new_df.head()

| name | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| chevrolet chevelle malibu | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 |
| buick skylark 320 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 |
| plymouth satellite | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 |
| amc rebel sst | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 |
| ford torino | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 |


## 1. Make the data positively oriented and dimensionless
$$
r_{i j}=\frac{x_{i j}}{\sqrt{\sum_{i=1}^m x_{i j}^2}}
$$
where $i=1,2, \ldots, m$ and $j=1,2, \ldots, n$.

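This yields the normalized data matrix $B$. The code below applies only this normalization, with no direction conversion, so every indicator is treated as larger-is-better.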

In [12]:
# Normalize the data: divide each column by its Euclidean norm

def norm(X):
    return X/np.sqrt((X**2).sum())

norm_matrix = new_df.apply(norm)
norm_matrix.head()

| name | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| chevrolet chevelle malibu | 0.036416 | 0.070189 | 0.07005 | 0.058984 | 0.056869 | 0.038046 | 0.046108 |
| buick skylark 320 | 0.030347 | 0.070189 | 0.079862 | 0.074865 | 0.059937 | 0.036461 | 0.046108 |
| plymouth satellite | 0.036416 | 0.070189 | 0.07256 | 0.068059 | 0.055766 | 0.034876 | 0.046108 |
| amc rebel sst | 0.03237 | 0.070189 | 0.069366 | 0.068059 | 0.055717 | 0.038046 | 0.046108 |
| ford torino | 0.034393 | 0.070189 | 0.068909 | 0.063521 | 0.055976 | 0.03329 | 0.046108 |


## 2. Weight the normalized matrix
With the weight vector $w=[w_1,\ldots,w_n]$ already known and $B=(b_{ij})_{m\times n}$, build the weighted normalized evaluation matrix $\tilde{B}=(\tilde{b}_{ij})$, where $\tilde{b}_{ij}=w_jb_{ij}$.

In [13]:
w_norm_matrix = norm_matrix*w
w_norm_matrix.head()

| name | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| chevrolet chevelle malibu | 0.005313 | 0.008817 | 0.025865 | 0.012292 | 0.006017 | 0.001594 | 0.000146 |
| buick skylark 320 | 0.004427 | 0.008817 | 0.029488 | 0.015602 | 0.006341 | 0.001528 | 0.000146 |
| plymouth satellite | 0.005313 | 0.008817 | 0.026792 | 0.014183 | 0.0059 | 0.001461 | 0.000146 |
| amc rebel sst | 0.004723 | 0.008817 | 0.025613 | 0.014183 | 0.005895 | 0.001594 | 0.000146 |
| ford torino | 0.005018 | 0.008817 | 0.025444 | 0.013238 | 0.005922 | 0.001395 | 0.000146 |
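
The product of the normalized matrix and `w` works because pandas aligns the index labels of a Series with the column labels of a DataFrame, so the order of the entries in the weight vector does not matter. A small sketch with made-up numbers (the names `matrix`, `weights`, and `weighted` are illustrative only):

```python
import pandas as pd

# Made-up 2x2 matrix; the weights are listed in a different order than the columns
matrix = pd.DataFrame({"mpg": [0.5, 0.25], "weight": [0.2, 0.4]},
                      index=["car_a", "car_b"])
weights = pd.Series({"weight": 0.75, "mpg": 0.25})

# Multiplication is aligned on column labels, not on position
weighted = matrix * weights
print(weighted)
```

Each `mpg` entry is scaled by 0.25 and each `weight` entry by 0.75, regardless of where those labels sit in the Series.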


## 3. Find the positive and negative ideal solutions


In [14]:
# Column-wise best and worst values (every indicator treated as larger-is-better)
V_plus = w_norm_matrix.max()
V_minus = w_norm_matrix.min()
V_plus

mpg             0.013755
cylinders       0.008817
displacement    0.038335
horsepower      0.021748
weight          0.008826
acceleration    0.003294
model_year      0.000171
dtype: float64

## 4. Compute each alternative's distance to the positive and negative ideal solutions
$$
\begin{aligned}
S_i^* & =\sqrt{\sum_{j=1}^n\left(v_{i j}-v_j^*\right)^2} \\
S_i^{-} & =\sqrt{\sum_{j=1}^n\left(v_{i j}-v_j^{-}\right)^2}
\end{aligned}
$$
where $i=1,2, \ldots, m$ and $j=1,2, \ldots, n$.
We also calculate
$$
C_i^*=\frac{S_i^{-}}{S_i^*+S_i^{-}}, \text {where } i=1,2, \ldots, m
$$

In [15]:
# Euclidean distance of each row to the positive / negative ideal solution
S_plus = np.sqrt(((w_norm_matrix - V_plus)**2).sum(axis = 1))
S_minus = np.sqrt(((w_norm_matrix - V_minus)**2).sum(axis = 1))
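
One caveat worth checking before ranking: the `horsepower` column of the mpg dataset contains a few missing values. How they affect the distances depends on how the row sum is computed. Python's built-in `sum` propagates `NaN`, while `DataFrame.sum` skips it by default, which silently shrinks the distance. The sketch below uses made-up numbers; dropping incomplete rows first is one option and is not something the original notebook does.

```python
import numpy as np
import pandas as pd

# Made-up squared deviations; car_b has a missing value in column "y"
sq_dev = pd.DataFrame({"x": [0.04, 0.09], "y": [0.01, np.nan]},
                      index=["car_a", "car_b"])

builtin_total = sum(sq_dev.loc["car_b"])     # built-in sum: NaN propagates
pandas_total = sq_dev.sum(axis=1)["car_b"]   # pandas sum: NaN skipped (skipna=True)

complete = sq_dev.dropna()                   # keep only fully observed rows
print(builtin_total, pandas_total, list(complete.index))
```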

## 5. Rank by score from high to low

In [16]:
p_score = S_minus/(S_plus + S_minus)
p_score.sort_values(ascending = False).head(20)

name
pontiac catalina                0.790950
chevrolet impala                0.788325
buick electra 225 custom        0.783470
buick estate wagon (sw)         0.781055
plymouth fury iii               0.779876
chrysler new yorker brougham    0.777923
ford galaxie 500                0.769354
pontiac grand prix              0.763569
mercury marquis                 0.758324
mercury marquis brougham        0.758314
chrysler cordoba                0.741217
pontiac grand prix lj           0.733877
chrysler newport royal          0.731166
pontiac catalina                0.726640
amc ambassador dpl              0.724021
pontiac catalina brougham       0.722612
pontiac catalina                0.722415
pontiac safari (sw)             0.720975
ford country squire (sw)        0.715314
ford country                    0.709291
dtype: float64
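
The ranking above treats every column as larger-is-better, which is why large, heavy cars come out on top. Standard TOPSIS also allows cost criteria, for which the positive ideal is the column minimum rather than the maximum. The following is a minimal sketch of the whole procedure with that option; the function name `topsis`, the `benefit` flags, and the toy data are assumptions for illustration. Which mpg columns should count as cost criteria is a modelling decision the notebook does not make.

```python
import numpy as np
import pandas as pd


def topsis(data, weights, benefit):
    """Score the rows of `data` with TOPSIS.

    data    : DataFrame of positive numbers, one row per alternative
    weights : Series of weights indexed by column name
    benefit : Series of booleans, True where larger values are better
    """
    weights = weights.reindex(data.columns)
    benefit = benefit.reindex(data.columns)
    r = data / np.sqrt((data ** 2).sum())        # vector normalization
    v = r * weights                              # weighted normalized matrix
    v_plus = v.max().where(benefit, v.min())     # positive ideal solution
    v_minus = v.min().where(benefit, v.max())    # negative ideal solution
    s_plus = np.sqrt(((v - v_plus) ** 2).sum(axis=1))
    s_minus = np.sqrt(((v - v_minus) ** 2).sum(axis=1))
    return s_minus / (s_plus + s_minus)          # closeness to the ideal, in [0, 1]


# Toy data: higher mpg is better, lower weight is better
cars = pd.DataFrame({"mpg": [30.0, 20.0, 10.0],
                     "weight": [2000.0, 3000.0, 4000.0]},
                    index=["light", "medium", "heavy"])
w = pd.Series({"mpg": 0.5, "weight": 0.5})
benefit = pd.Series({"mpg": True, "weight": False})

scores = topsis(cars, w, benefit)
print(scores.sort_values(ascending=False))
```

In the toy data the `light` car is best on both criteria, so it coincides with the positive ideal solution and ranks first, while `heavy` coincides with the negative ideal solution and ranks last.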