## 1. Preliminaries

This notebook accompanies [this blog post](https://hanqiu92.github.io/blogs/2020/LP_dual_simplex_solver_1_202002/). It implements the basic elements of an LP problem and introduces the evaluation tools.

In [1]:
import numpy as np
import scipy.sparse as sp
import time
from enum import Enum,unique

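### Basic elements: problem and solution

The blog post introduced the basic concepts of a problem and a solution. This section covers their implementation.

We start with the solution. As described in the blog post, we maintain a basis and work with the solution derived from it. Because computing the solution from the basis is not trivial, the implementation stores both the solution and the basis.
* Each component of the solution $(x,\lambda,s)$ is a vector and can be stored directly as an np.array.
* The basis $(B,L,U)$ offers more storage options. The basis changes at every iteration of the simplex method, so keeping $B$, $L$ and $U$ in separate containers would mean inserting into and deleting from those containers at every iteration, which is slow and costly to maintain. A single vector is simpler and faster. We therefore introduce the enum type VarStatus as a label and use a vector sign to record which part of the basis each variable belongs to.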

In [2]:
@unique
class VarStatus(Enum):
    AT_UPPER_BOUND = -1 ## corresponds to U
    AT_LOWER_BOUND = 1 ## corresponds to L
    OTHER = 0 ## corresponds to B

class Solution(object):
    def __init__(self,x,lam,s,sign):
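        self.x = x ## primal solution, real-valued
        self.lam = lam ## dual solution, real-valued
        self.s = s ## dual solution, real-valued
        self.sign = sign ## labels for the basis (B,L,U); entries are the integer values of the VarStatus enum defined above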

    def copy(self):
        ## Returns a copy of the solution, for cases where several solutions must be kept.
        return Solution(x=self.x.copy(),s=self.s.copy(),
                        lam=self.lam.copy(),sign=self.sign.copy())
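
As a small illustration of why a single label vector is convenient, the sketch below rebuilds the index sets $(B,L,U)$ from sign and performs one basis change. The five-variable data, the helper index_sets and the entering/leaving indices are hypothetical and chosen only for this example; VarStatus is repeated so the snippet runs on its own.

```python
import numpy as np
from enum import Enum, unique

@unique
class VarStatus(Enum):
    AT_UPPER_BOUND = -1  # U
    AT_LOWER_BOUND = 1   # L
    OTHER = 0            # B

# Hypothetical 5-variable example: one integer label per variable.
sign = np.array([VarStatus.OTHER.value,
                 VarStatus.AT_LOWER_BOUND.value,
                 VarStatus.AT_UPPER_BOUND.value,
                 VarStatus.OTHER.value,
                 VarStatus.AT_LOWER_BOUND.value])

def index_sets(sign):
    """Recover the index sets (B, L, U) from the label vector."""
    B = np.flatnonzero(sign == VarStatus.OTHER.value)
    L = np.flatnonzero(sign == VarStatus.AT_LOWER_BOUND.value)
    U = np.flatnonzero(sign == VarStatus.AT_UPPER_BOUND.value)
    return B, L, U

# Keep the old labels around, in the same spirit as Solution.copy().
backup = sign.copy()

# A basis change only rewrites two labels; no container grows or shrinks.
entering, leaving = 1, 3  # hypothetical pivot choice
sign[entering] = VarStatus.OTHER.value
sign[leaving] = VarStatus.AT_UPPER_BOUND.value

B, L, U = index_sets(sign)
print(B, L, U)
```

The index sets are only materialized when they are needed, while the update itself touches two entries of the vector.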

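Next comes the problem class. It needs to support the following:
* evaluating the objective function;
* evaluating the feasibility of a solution (the amount of constraint violation, i.e. the infeasibility);
* checking that the basis sign=$(B,L,U)$ is consistent with the solution $(x,\lambda,s)$.

Note: since the solution is by default generated from the basis, there is no need to check the complementary slackness conditions.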
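
To make the note concrete, the short derivation below uses the same conventions as the Problem class (dual constraint $A^T\lambda + s = c$, dual objective $b^T\lambda + u^T s_U + l^T s_L$). It assumes the usual bounded-simplex construction, which this notebook does not restate: basic variables have $s_B = 0$, and nonbasic variables sit exactly at their bounds, $x_L = l_L$ and $x_U = u_U$.

```latex
\begin{aligned}
c^T x - \left(b^T\lambda + l_L^T s_L + u_U^T s_U\right)
  &= (A^T\lambda + s)^T x - b^T\lambda - l_L^T s_L - u_U^T s_U \\
  &= \lambda^T (Ax - b) + s_B^T x_B + s_L^T (x_L - l_L) + s_U^T (x_U - u_U) \\
  &= 0 \qquad \text{when } Ax = b,\; s_B = 0,\; x_L = l_L,\; x_U = u_U .
\end{aligned}
```

Under these assumptions every complementary slackness product vanishes by construction, so the primal and dual objectives agree, and what remains to be checked is primal feasibility ($l \le x \le u$) and dual feasibility ($s_L \ge 0$, $s_U \le 0$).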

In [3]:
## Constants related to numerical tolerances
INF = 1e16
PRIMAL_TOL = 1e-7
PRIMAL_RELA_TOL = 1e-9
DUAL_TOL = 1e-7
CON_TOL = 1e-5

@unique
class BoundType(Enum):
    ## Type of a variable's lower/upper bounds
    BOTH_BOUNDED = 3
    UPPER_BOUNDED = 2
    LOWER_BOUNDED = 1
    FREE = 0

class Problem(object):
    def __init__(self,A,b,c,l,u,AT=None):
        ## Basic inputs: A,b,c,l,u
        ## for the problem min c^T x
        ##        s.t. A x = b,
        ##          l <= x <= u.
        ## For efficiency, A is stored as a sparse matrix; b,c,l,u are dense vectors
        self.A,self.b,self.c,self.l,self.u = A,b,c,l,u
        
        ## Store the transpose of A
        if AT is None:
            self.AT = self.A.T
        else:
            self.AT = AT
            
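        ## Cache quantities that never change, to speed up later evaluations
        self.n,self.m = self.A.shape ## n = number of rows (constraints), m = number of columns (variables)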
        self.bounds_gap = self.u - self.l

        self.bool_upper_unbounded = self.u >= INF ## whether the upper bound is infinite
        self.bool_lower_unbounded = self.l <= -INF ## whether the lower bound is infinite
        self.bool_not_both_bounded = self.bool_upper_unbounded | self.bool_lower_unbounded

        self.bound_type = np.zeros((self.m,),dtype=int) ## bound type of each variable
        self.bound_type[self.bool_lower_unbounded & self.bool_upper_unbounded] = BoundType.FREE.value
        self.bound_type[self.bool_lower_unbounded & ~self.bool_upper_unbounded] = BoundType.UPPER_BOUNDED.value
        self.bound_type[~self.bool_lower_unbounded & self.bool_upper_unbounded] = BoundType.LOWER_BOUNDED.value
        self.bound_type[~self.bool_lower_unbounded & ~self.bool_upper_unbounded] = BoundType.BOTH_BOUNDED.value

        self.primal_lower_bound_tol = - (np.abs(self.l) * PRIMAL_RELA_TOL + PRIMAL_TOL)
        self.primal_upper_bound_tol = (np.abs(self.u) * PRIMAL_RELA_TOL + PRIMAL_TOL)
        self.l_margin = self.l + self.primal_lower_bound_tol ## lower bound relaxed by the numerical tolerance
        self.u_margin = self.u + self.primal_upper_bound_tol ## upper bound relaxed by the numerical tolerance

    def copy(self):
        ## Copy function, in case several problems need to be kept (A itself is shared, not copied)
        return Problem(self.A,b=self.b.copy(),c=self.c.copy(),l=self.l.copy(),u=self.u.copy())

    ## **************************
    ## Evaluation of the primal/dual objective functions
    def eval_primal_obj(self,sol):
        ## z = c^T x
        return np.dot(self.c,sol.x)

    def eval_dual_obj(self,sol):
        ## z = b^T \lambda + u^T s_u + l^T s_l
        return (np.dot(self.b,sol.lam) + \
                np.dot(self.u * (sol.sign == VarStatus.AT_UPPER_BOUND.value),sol.s) + \
                np.dot(self.l * (sol.sign == VarStatus.AT_LOWER_BOUND.value),sol.s))
    
    ## **************************
    ## Infeasibility of the primal/dual linear constraints
    ## (uses the public .dot method; the private _mul_vector is not available in every SciPy version)
    def eval_primal_con_infeas(self,sol):
        ## A x - b
        return (self.A.dot(sol.x) - self.b)

    def eval_dual_con_infeas(self,sol):
        ## A^T \lambda + s - c
        return (self.AT.dot(sol.lam) + sol.s - self.c)

    ## **************************
    ## Infeasibility of the primal/dual variable bounds
    def eval_unbnd(self,sol):
        ## x = INF or -INF
        return ( ((self.bool_upper_unbounded) & (sol.sign == VarStatus.AT_UPPER_BOUND.value)) | \
                 ((self.bool_lower_unbounded) & (sol.sign == VarStatus.AT_LOWER_BOUND.value)) )

    def eval_primal_inf(self,sol):
        ## l <= x <= u
        primal_inf = np.maximum(sol.x - self.u,0) - np.maximum(self.l - sol.x,0)
        bool_unbnd = self.eval_unbnd(sol)
        primal_inf[bool_unbnd] += INF
        return primal_inf

    def eval_dual_inf(self,sol):
        ## s_u <= 0, s_l >= 0
        dual_inf = np.maximum(sol.s,0) * (sol.sign == VarStatus.AT_UPPER_BOUND.value) + \
                    np.maximum(-sol.s,0) * (sol.sign == VarStatus.AT_LOWER_BOUND.value)
        bool_unbnd = self.eval_unbnd(sol)
        dual_inf[bool_unbnd] += np.abs(sol.s[bool_unbnd])
        return dual_inf

    ## **************************
    ## Consistency between the basis (B,L,U) and the solution (x,\lambda,s)
    def eval_sign(self,sol):
        ## x_L = l_L, x_U = u_U
        bool_sign = np.zeros((self.m,),dtype=bool)
        bool_lower = sol.sign == VarStatus.AT_LOWER_BOUND.value
        bool_upper = sol.sign == VarStatus.AT_UPPER_BOUND.value
        bool_sign[bool_lower] = sol.x[bool_lower] != self.l[bool_lower]
        bool_sign[bool_upper] = sol.x[bool_upper] != self.u[bool_upper]
        return bool_sign
        
    ## **************************
    ## Combine the evaluations above into a status string
    def check_sol_status(self,sol,print_func=None,print_header=''):
        infeas_dict = {'primal':False,'dual':False,'cons':False,'unbnd':False,'sign':False}

        ## Objective values
        primal_obj,dual_obj = self.eval_primal_obj(sol),self.eval_dual_obj(sol)
        status_str = 'Obj Primal {:.4e} Dual {:.4e}'.format(primal_obj,dual_obj)
        
        ## Infeasibility of the primal variables
        primal_inf = self.eval_primal_inf(sol)
        primal_inf_cnt = np.sum((primal_inf > self.primal_upper_bound_tol) | \
                                (primal_inf < self.primal_lower_bound_tol))
        if primal_inf_cnt > 0:
            status_str += '  Primal Inf {:.4e} ({:d})'.format(np.sum(np.abs(primal_inf)),primal_inf_cnt)
            infeas_dict['primal'] = True

        ## Infeasibility of the dual variables
        dual_inf = self.eval_dual_inf(sol)
        dual_inf_cnt = np.sum(np.abs(dual_inf) > DUAL_TOL)
        if dual_inf_cnt > 0:
            status_str += '  Dual Inf {:.4e} ({:d})'.format(np.sum(np.abs(dual_inf)),dual_inf_cnt)
            infeas_dict['dual'] = True

        ## Infeasibility of the linear constraints of the primal and dual problems
        primal_con_inf,dual_con_inf = self.eval_primal_con_infeas(sol),self.eval_dual_con_infeas(sol)
        con_inf_cnt = np.sum(np.abs(primal_con_inf) > CON_TOL) + np.sum(np.abs(dual_con_inf) > CON_TOL)
        if con_inf_cnt > 0:
            status_str += '  Con Inf {:.4e} ({:d})'.format(np.sum(np.abs(primal_con_inf)) + np.sum(np.abs(dual_con_inf)),con_inf_cnt)
            infeas_dict['cons'] = True

        ## Consistency of the bounds (a variable labelled as sitting at an infinite bound)
        bool_unbnd = self.eval_unbnd(sol)
        bool_unbnd_cnt = np.sum(bool_unbnd)
        if bool_unbnd_cnt > 0:
            status_str += '  Bnd err {:d}'.format(bool_unbnd_cnt)
            infeas_dict['unbnd'] = True

        ## Consistency between the solution and the basis
        bool_sign = self.eval_sign(sol)
        bool_sign_cnt = np.sum(bool_sign)
        if bool_sign_cnt > 0:
            status_str += '  Sign err {:d}'.format(bool_sign_cnt)
            infeas_dict['sign'] = True

        ## Print the result
        if print_func is not None:
            print_func('{}  {}'.format(print_header,status_str))

        return infeas_dict,status_str
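
To see the quantities above on concrete numbers, here is a minimal standalone sketch that applies the same formulas to a hypothetical three-variable LP. It deliberately does not call the Problem class: it uses dense NumPy arrays instead of a sparse A, and plain integer labels in place of VarStatus, so that the snippet runs on its own. The toy data and the chosen basis are assumptions made only for this illustration.

```python
import numpy as np

INF = 1e16
AT_UPPER, AT_LOWER, BASIC = -1, 1, 0  # same integer labels as VarStatus

# Hypothetical toy LP: min -x1 - 2*x2  s.t.  x1 + x2 + x3 = 4,
# 0 <= x1 <= 3, 0 <= x2 <= 2, 0 <= x3 (no upper bound).
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([4.0])
c = np.array([-1.0, -2.0, 0.0])
l = np.array([0.0, 0.0, 0.0])
u = np.array([3.0, 2.0, INF])

# Basis: x1 is basic, x2 sits at its upper bound, x3 at its lower bound.
sign = np.array([BASIC, AT_UPPER, AT_LOWER])
x = np.array([2.0, 2.0, 0.0])  # x1 follows from A x = b
lam = np.array([-1.0])         # solves B^T lam = c_B for the basic column
s = c - A.T @ lam              # so that A^T lam + s = c

# Objectives: c^T x and b^T lam + u^T s_U + l^T s_L
primal_obj = c @ x
dual_obj = (b @ lam
            + (u * (sign == AT_UPPER)) @ s
            + (l * (sign == AT_LOWER)) @ s)

# Linear-constraint residuals: A x - b and A^T lam + s - c
primal_con_inf = A @ x - b
dual_con_inf = A.T @ lam + s - c

# Bound violations: l <= x <= u, and s_U <= 0, s_L >= 0
primal_inf = np.maximum(x - u, 0) - np.maximum(l - x, 0)
dual_inf = (np.maximum(s, 0) * (sign == AT_UPPER)
            + np.maximum(-s, 0) * (sign == AT_LOWER))

print(primal_obj, dual_obj)
```

For this basis all residuals and violations are zero and the two objective values coincide, which is the situation check_sol_status reports without any infeasibility entries.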

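### Evaluation tools

To speed up development, we need a quick way to assess the accuracy and speed of a solver. A set of evaluation tools is provided in the current directory:
* The netlib folder contains the netlib LP problem set in MPS format ([source](www.numerical.rl.ac.uk/cute/netlib.html)).
* util.py provides
    * read_mps, a simple reader for the MPS format.
    * solve_pulp, which calls the [PuLP](https://github.com/coin-or/pulp) API to solve an LP problem stored in the form $(A,b,sense,c,l,u)$. The solver underneath this API is [Clp](https://github.com/coin-or/Clp), currently one of the faster open-source LP solvers. Calling Clp through PuLP incurs some IO overhead, but it is convenient, and for a single LP problem the overhead is negligible.
    * The Evaluator class. At present solve_pulp only retrieves the primal solution $x$, and the relation $=$ in the primal linear constraints $Ax=b$ may become $\leq$ or $\geq$ depending on the value of sense. We therefore introduce this class to evaluate the primal solution $x$ directly. Its evaluation methods are implemented in essentially the same way as those of the Problem class above.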
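
The sense-dependent check can be sketched as follows. This is not the code of Evaluator in util.py. The function name con_infeasibility is hypothetical, and the encoding of sense as the MPS row-type letters 'E', 'L' and 'G' is an assumption; the encoding actually used in util.py may differ.

```python
import numpy as np

def con_infeasibility(A, x, b, sense):
    """Per-row violation of A x (sense) b; 0 means the row is satisfied.

    `sense` is assumed to hold 'E' (=), 'L' (<=) or 'G' (>=) for each row.
    """
    r = A @ x - b
    sense = np.asarray(sense)
    viol = np.abs(r)                                         # '=' rows: any residual counts
    viol = np.where(sense == 'L', np.maximum(r, 0), viol)    # '<=' rows: only r > 0 counts
    viol = np.where(sense == 'G', np.maximum(-r, 0), viol)   # '>=' rows: only r < 0 counts
    return viol

# Hypothetical data: three rows with one sense each.
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [2.0, 0.0]])
b = np.array([2.0, 0.0, 1.0])
x = np.array([1.5, 1.0])

viol = con_infeasibility(A, x, b, ['E', 'L', 'G'])
print(viol)
```

Here the residual $Ax-b$ is positive in every row, so it counts as a violation for the $=$ and $\leq$ rows but not for the $\geq$ row.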
    
The following example shows an evaluation workflow. The code reads LP problems from the dataset, calls PuLP to obtain a solution, and then evaluates it.

In [4]:
from util import *
import time
import traceback
import glob
np.set_printoptions(suppress=False,precision=4)
    
evaluator = Evaluator()
fnames = sorted(glob.glob('netlib/*.SIF'))
for fname in fnames:
    model_name = fname.split('/')[-1].split('.')[0]
    if model_name == 'QAP15': 
        ## solving this model may take half an hour, so skip it
        continue
            
    A_dict,b_dict,sense_dict,c_dict,l_dict,u_dict,m,n,row_key,col_key = read_mps(fname)
    A,b,sense,c,l,u = dicts_to_computable(A_dict,b_dict,sense_dict,c_dict,l_dict,u_dict,m,n)

    evaluator.reset(A,b,sense,c,l,u)
    print('Problem name: {}, size:({},{}).'.format(model_name,A.shape,A.nnz))

    try:
        tt = time.time()
        x_pulp,lam_pulp,status_pulp = solve_pulp(A_dict,b_dict,sense_dict,c_dict,l_dict,u_dict,m,n)
        time_pulp = time.time() - tt
        print(f"Eval: {evaluator.eval_str(x_pulp)} Elapsed time: {time_pulp:.3f}.\n")

    except Exception as e:
        print(repr(e))
        traceback.print_exc()  ## print_exc() prints the traceback itself and returns None

Problem name: 25FV47, size:((821, 1571),10400).
Eval: con inf=6.7406e-04,var inf=0.0000e+00,obj=5.5018e+03. Elapsed time: 0.360.

Problem name: 80BAU3B, size:((2262, 9799),21002).
Eval: con inf=1.0557e-03,var inf=0.0000e+00,obj=9.8722e+05. Elapsed time: 0.759.

Problem name: ADLITTLE, size:((56, 97),383).
Eval: con inf=2.6769e-05,var inf=0.0000e+00,obj=2.2549e+05. Elapsed time: 0.017.

Problem name: AFIRO, size:((27, 32),83).
Eval: con inf=9.5400e-06,var inf=0.0000e+00,obj=-4.6475e+02. Elapsed time: 0.012.

Problem name: AGG, size:((488, 163),2410).
Eval: con inf=7.0118e-02,var inf=0.0000e+00,obj=-3.5992e+07. Elapsed time: 0.040.

Problem name: AGG2, size:((516, 302),4284).
Eval: con inf=6.1704e-02,var inf=0.0000e+00,obj=-2.0239e+07. Elapsed time: 0.062.

Problem name: AGG3, size:((516, 302),4300).
Eval: con inf=4.5744e-02,var inf=0.0000e+00,obj=1.0312e+07. Elapsed time: 0.055.

Problem name: BANDM, size:((305, 472),2494).
Eval: con inf=9.5659e-05,var inf=0.0000e+00,obj=-1.5863e+02. El

Eval: con inf=2.5395e-01,var inf=2.9490e-03,obj=-9.3808e+03. Elapsed time: 0.186.

Problem name: PILOT-JA, size:((940, 1988),14698).
Eval: con inf=3.4771e-01,var inf=2.4640e-03,obj=-6.1131e+03. Elapsed time: 0.552.

Problem name: PILOT-WE, size:((722, 2789),9126).
Eval: con inf=4.4547e-01,var inf=1.5447e-03,obj=-2.7201e+06. Elapsed time: 0.280.

Problem name: PILOT, size:((1441, 3652),43167).
Eval: con inf=1.0090e-03,var inf=3.2044e-05,obj=-5.5749e+02. Elapsed time: 1.599.

Problem name: PILOT4, size:((410, 1000),5141).
Eval: con inf=8.6423e-02,var inf=1.5930e-03,obj=-2.5811e+03. Elapsed time: 0.204.

Problem name: PILOT87, size:((2030, 4883),73152).
Eval: con inf=4.2261e-03,var inf=3.3500e-04,obj=3.0171e+02. Elapsed time: 5.484.

Problem name: PILOTNOV, size:((975, 2172),13057).
Eval: con inf=1.3991e-01,var inf=1.2000e-04,obj=-4.4973e+03. Elapsed time: 0.220.

Problem name: QAP12, size:((3192, 8856),38304).
Eval: con inf=3.7434e-06,var inf=3.0478e-09,obj=5.2289e+02. Elapsed time: 71.5