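## 0. Preparing the Evaluation Cases

In this notebook, I prepare a set of test cases (problems) for the later evaluation of the B&B solver. MIP problems are usually hard to solve and take a long time, so to get meaningful results within a reasonable time budget the evaluation cannot include too many test cases, and those cases cannot be too difficult. I therefore filter in several steps and end up with a few dozen relatively easy test cases. Interested readers can try adjusting the test set.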

In [1]:
import numpy as np
np.set_printoptions(suppress=True,precision=4)

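The first step is to find a source of test cases. MIPLIB (2017) is a widely used test set for MIP solvers; the problem files, the solution status file, and some additional information can be downloaded from [this page](https://miplib.zib.de/download.html). In this project I consider the test cases in the benchmark subset that are both "easy" and "feasible". The "easy" label is taken from the easy-v1.test file, and the "feasible" label from the miplib2017-v13.solu file (in the code below, a problem counts as feasible when its best known status is `opt`).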

In [2]:
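from util import load_benchmark
## Feasibility: keep problems whose best known status is "opt"
df_benchmark = load_benchmark('data/miplib2017-v13.solu')
opt_problems = set(df_benchmark.query('status_best == "opt"')['model'].values)

## "Easy" problems: each line is a file name ending in ".mps.gz"
with open('data/easy-v1.test','r') as f:
    lines = f.readlines()
## Strip whitespace first so a missing trailing newline cannot truncate a name
model_names = [line.strip()[:-len('.mps.gz')] for line in lines if line.strip()]
model_names = [model_name for model_name in model_names if model_name in opt_problems]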
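
For readers who do not have `util.load_benchmark` at hand, the following is a minimal sketch of how the two label files could be read and combined. It assumes that each line of the `.solu` file has the form `=tag= <model> [value]` (with tags such as `opt`, `inf`, `best`, `unkn`) and that the `.test` file lists one `<model>.mps.gz` per line. `parse_solu` and `parse_test_list` are hypothetical names, not part of the project's `util` module, and the model names in the demo are made up.

```python
def parse_solu(lines):
    """Map model name -> status tag, e.g. 'opt', 'inf', 'best', 'unkn' (assumed format)."""
    status = {}
    for line in lines:
        parts = line.split()
        # Expect at least "=tag= model"; skip blank or malformed lines
        if len(parts) < 2:
            continue
        tag = parts[0]
        if not (tag.startswith('=') and tag.endswith('=')):
            continue
        status[parts[1]] = tag.strip('=')
    return status

def parse_test_list(lines, suffix='.mps.gz'):
    """Return model names from a .test file, dropping the file suffix if present."""
    names = []
    for line in lines:
        name = line.strip()
        if not name:
            continue
        if name.endswith(suffix):
            name = name[:-len(suffix)]
        names.append(name)
    return names

# Made-up demo data (not real MIPLIB entries)
solu_lines = ['=opt= model-a 12.5', '=inf= model-b', '=best= model-c 7.0', '']
test_lines = ['model-a.mps.gz\n', 'model-b.mps.gz\n', 'model-c.mps.gz']

status = parse_solu(solu_lines)
easy_and_opt = [m for m in parse_test_list(test_lines) if status.get(m) == 'opt']
print(easy_and_opt)
```

Keeping the parsing in two small functions makes it easy to swap in a different status filter (for example, also accepting `best`) when adjusting the test set.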

Many problems still satisfy both conditions, so further filtering is needed. In the second step, I filter by problem file size and keep only small problems.

In [3]:
import os,gzip,shutil
from util import model2fname

size_limit = 2.5 ## upper bound on problem file size, in MB
## Loop over the test cases and decompress each one
model_names_remain = []
for model_name in model_names:
    fname_in = 'data/benchmark/{}.mps.gz'.format(model_name)
    fname_out = model2fname(model_name)
    try:
        ## Check the compressed size first so clearly oversized files are never decompressed
        if os.path.getsize(fname_in) / 1024 / 1024 <= size_limit:
            with gzip.open(fname_in,'rb') as f_in, open(fname_out,'wb') as f_out:
                shutil.copyfileobj(f_in,f_out)
            ## Then check the decompressed size
            size_mb = os.path.getsize(fname_out) / 1024 / 1024
            if size_mb <= size_limit:
                model_names_remain.append(model_name)
                print('{}: {:.2f} MB'.format(model_name,size_mb))
    except Exception as e:
        ## Skip missing or unreadable files, but report them instead of failing silently
        print('skipped {}: {}'.format(model_name,e))
print('test cases v1: total {}.'.format(len(model_names_remain)))
## Save the test case names
with open('data/test_cases_v1','w') as f:
    for model_name in model_names_remain:
        f.write(model_name+'\n')

50v-10: 0.28 MB
assign1-5-8: 0.11 MB
beasleyC3: 0.25 MB
binkar10_1: 0.19 MB
bnatt400: 1.35 MB
bppc4-08: 1.17 MB
cost266-UUE: 1.89 MB
csched007: 0.40 MB
csched008: 0.36 MB
dano3_3: 2.49 MB
dano3_5: 2.49 MB
eil33-2: 1.46 MB
enlight_hard: 0.05 MB
exp-1-500-5-5: 0.20 MB
fastxgemm-n2r6s0t2: 1.48 MB
gen-ip002: 0.05 MB
gen-ip054: 0.03 MB
glass4: 0.08 MB
gmu-35-40: 0.35 MB
gmu-35-50: 0.59 MB
graph20-20-1rand: 1.80 MB
graphdraw-domain: 0.21 MB
h80x6320d: 1.60 MB
ic97_potential: 0.19 MB
icir97_tension: 0.98 MB
lotsize: 0.30 MB
mad: 0.15 MB
markshare_4_0: 0.01 MB
mas74: 0.06 MB
mas76: 0.06 MB
mc11: 0.30 MB
mcsched: 0.49 MB
mik-250-20-75-4: 0.27 MB
milo-v12-6-r2-40-1: 0.73 MB
n5-3: 0.33 MB
n9-3: 0.98 MB
neos-1171737: 1.64 MB
neos-1445765: 1.56 MB
neos-1456979: 1.40 MB
neos-1582420: 1.23 MB
neos-2657525-crna: 0.07 MB
neos-2978193-inde: 2.22 MB
neos-3004026-krka: 1.91 MB
neos-3024952-loue: 0.66 MB
neos-3046615-murg: 0.06 MB
neos-3083819-nubu: 0.97 MB
neos-3381206-awhea: 0.18 MB
neos-3627168-kasai: 0
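
The cell above writes every candidate to disk before checking its decompressed size. If disk space or time matters, the decompressed size can instead be measured while streaming, without writing anything. A minimal sketch, where `decompressed_size_mb` is a hypothetical helper and not part of `util`:

```python
import gzip

def decompressed_size_mb(gz_path, chunk_size=1 << 20):
    """Return the decompressed size of a gzip file in MB (1 MB = 1024 * 1024 bytes)."""
    total = 0
    with gzip.open(gz_path, 'rb') as f:
        # Read in fixed-size chunks so memory use stays bounded
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total / 1024 / 1024
```

A file would then be decompressed to disk only when `decompressed_size_mb(fname_in) <= size_limit`. The price is that each kept file is decompressed twice.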

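After filtering by problem size, the number of test cases is still large. To make sure the B&B solver built later can make progress on them, I filter further using the results of an existing solver: if an existing solver cannot make reasonable progress on a test case within a short time, the solver we implement ourselves most likely will not either. Here I use the open-source solver [CBC](https://github.com/coin-or/Cbc) as the reference.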

In [6]:
## Filtering criteria
max_secs = 300 ## maximum running time, in seconds
opt_gap_rate_thres = 0.05 ## largest acceptable optimality gap

## Load the test case names
with open('data/test_cases_v1','r') as f:
    lines = f.readlines()
model_names = [line.strip('\n') for line in lines]

## cbc_solve and process_result are helper functions assumed to be defined earlier in the notebook
result = dict()
for model_name in model_names:
    ## Solve with CBC
    (dt,status,LB,obj) = cbc_solve(model2fname(model_name),time_limit=max_secs,solve_type=0)
    print('{}: time={:.2f}, LB={:.2e}, obj={:.2e}.'.format(model_name,dt,LB,obj))
    result[model_name] = (dt,status,LB,obj)
df_result = process_result(result,save_fname='result/cbc_benchmark.csv')
print('elapsed hours: {:.2f}.'.format(df_result['time'].sum() / 3600))

## Keep the test cases that meet the criteria
idxs = df_result['opt_gap_rate'] <= opt_gap_rate_thres
model_names_remain = df_result.loc[idxs,'model'].values
print('test cases v2: total {}.'.format(len(model_names_remain)))
## Save the updated list of test case names
with open('data/test_cases_v2','w') as f:
    for model_name in model_names_remain:
        f.write(model_name+'\n')

50v-10: time=320.41, LB=3.18e+03, obj=1.00e+50.
assign1-5-8: time=307.30, LB=1.95e+02, obj=2.12e+02.
beasleyC3: time=306.84, LB=6.37e+02, obj=9.37e+02.
binkar10_1: time=97.80, LB=6.74e+03, obj=6.74e+03.
bnatt400: time=303.99, LB=0.00e+00, obj=1.00e+50.
bppc4-08: time=315.27, LB=5.16e+01, obj=5.70e+01.
cost266-UUE: time=311.70, LB=2.20e+07, obj=2.76e+07.
csched007: time=306.26, LB=3.01e+02, obj=1.00e+50.
csched008: time=305.67, LB=1.71e+02, obj=1.75e+02.
dano3_3: time=51.62, LB=5.76e+02, obj=5.76e+02.
dano3_5: time=323.90, LB=5.76e+02, obj=5.77e+02.
eil33-2: time=167.97, LB=9.34e+02, obj=9.34e+02.
enlight_hard: time=378.36, LB=2.30e+01, obj=1.00e+50.
exp-1-500-5-5: time=307.76, LB=5.63e+04, obj=7.64e+04.
fastxgemm-n2r6s0t2: time=311.57, LB=2.70e+01, obj=2.36e+02.
gen-ip002: time=309.55, LB=-4.80e+03, obj=-4.78e+03.
gen-ip054: time=312.31, LB=6.81e+03, obj=6.84e+03.
glass4: time=316.11, LB=9.23e+08, obj=1.80e+09.
gmu-35-40: time=324.06, LB=-2.41e+06, obj=-2.41e+06.
gmu-35-50: time=323.07
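
The filter relies on the `opt_gap_rate` column produced by `process_result`, whose definition is not shown here. As an illustration only, the sketch below uses one common definition of the relative optimality gap for a minimization problem, $|obj - LB| / |obj|$, and treats `obj=1.00e+50` as a sentinel meaning that no feasible solution was found. Both the formula and the sentinel interpretation are assumptions, and the function name is hypothetical.

```python
import math

NO_SOLUTION = 1e50  # assumed sentinel: objective reported when no feasible solution was found

def opt_gap_rate(LB, obj, eps=1e-10):
    """Relative optimality gap |obj - LB| / |obj| (assumed definition, minimization)."""
    if not math.isfinite(obj) or abs(obj) >= NO_SOLUTION:
        return math.inf  # no incumbent, so the gap is unbounded
    return abs(obj - LB) / max(abs(obj), eps)

thres = 0.05
# (LB, obj) pairs are the rounded values printed in the output above
for name, LB, obj in [('binkar10_1', 6.74e3, 6.74e3),
                      ('beasleyC3', 6.37e2, 9.37e2),
                      ('50v-10', 3.18e3, 1e50)]:
    gap = opt_gap_rate(LB, obj)
    print('{}: gap={:.3f}, keep={}'.format(name, gap, gap <= thres))
```

Under this definition `binkar10_1` is kept, while `beasleyC3` (gap of roughly 0.32) and `50v-10` (no feasible solution) are dropped.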

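After filtering with the solver, the number of qualifying test cases drops significantly, but it is still fairly large. Next, I solve the remaining cases using only CBC's B&B capability and filter once more.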

In [7]:
## Filtering criteria
max_secs = 300 ## maximum running time, in seconds
opt_gap_rate_thres = 0.05 ## largest acceptable optimality gap

## Load the test case names
with open('data/test_cases_v2','r') as f:
    lines = f.readlines()
model_names = [line.strip('\n') for line in lines]

result = dict()
for model_name in model_names:
    ## Solve using only CBC's B&B capability
    (dt,status,LB,obj) = cbc_solve(model2fname(model_name),time_limit=max_secs,solve_type=1)
    print('{}: time={:.2f}, LB={:.2e}, obj={:.2e}.'.format(model_name,dt,LB,obj))
    result[model_name] = (dt,status,LB,obj)
df_result = process_result(result,save_fname='result/bb_benchmark.csv')
print('elapsed hours: {:.2f}.'.format(df_result['time'].sum() / 3600))

## Keep the test cases that meet the criteria
idxs = df_result['opt_gap_rate'] <= opt_gap_rate_thres
model_names_remain = df_result.loc[idxs,'model'].values
print('test cases v3: total {}.'.format(len(model_names_remain)))
## Save the updated list of test case names
with open('data/test_cases_v3','w') as f:
    for model_name in model_names_remain:
        f.write(model_name+'\n')

binkar10_1: time=497.14, LB=6.67e+03, obj=6.75e+03.
csched008: time=347.13, LB=1.71e+02, obj=1.88e+02.
dano3_3: time=201.51, LB=5.76e+02, obj=5.76e+02.
dano3_5: time=323.19, LB=5.76e+02, obj=5.77e+02.
eil33-2: time=62.93, LB=9.34e+02, obj=9.34e+02.
gen-ip002: time=558.04, LB=-4.80e+03, obj=-4.78e+03.
gen-ip054: time=505.39, LB=6.81e+03, obj=6.85e+03.
gmu-35-40: time=527.46, LB=-2.41e+06, obj=-2.40e+06.
gmu-35-50: time=395.43, LB=-2.61e+06, obj=-2.60e+06.
ic97_potential: time=314.79, LB=3.87e+03, obj=4.17e+03.
icir97_tension: time=326.85, LB=6.32e+03, obj=6.48e+03.
markshare_4_0: time=279.52, LB=1.00e+00, obj=1.00e+00.
mas74: time=323.93, LB=1.14e+04, obj=1.18e+04.
mas76: time=86.06, LB=4.00e+04, obj=4.00e+04.
mik-250-20-75-4: time=316.29, LB=-5.66e+04, obj=-5.23e+04.
neos-1171737: time=325.83, LB=-1.95e+02, obj=1.00e+50.
neos-1445765: time=336.01, LB=-2.47e+04, obj=1.00e+50.
neos-1582420: time=347.94, LB=8.92e+01, obj=9.90e+01.
neos-2978193-inde: time=338.46, LB=-2.42e+00, obj=-2.39e+0
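
How `cbc_solve` maps `solve_type` to solver settings is not shown in this notebook. One plausible reading of "B&B only" is to switch off preprocessing, cutting planes, and primal heuristics so that CBC relies on plain branch and bound. The sketch below builds a command line for the standalone `cbc` executable under that assumption. The function name is hypothetical, and the option names (`-sec`, `-preprocess`, `-cuts`, `-heuristics`, `-solve`) are assumptions that should be checked against the help output of your CBC build.

```python
def cbc_command(mps_path, time_limit, bb_only=False):
    """Build an argument list for the standalone cbc executable (option names assumed)."""
    cmd = ['cbc', mps_path, '-sec', str(time_limit)]
    if bb_only:
        # Assumed switches: disable everything except branch and bound
        cmd += ['-preprocess', 'off', '-cuts', 'off', '-heuristics', 'off']
    cmd += ['-solve']
    return cmd

# 'model.mps' is a placeholder path
print(' '.join(cbc_command('model.mps', 300)))
print(' '.join(cbc_command('model.mps', 300, bb_only=True)))
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`; extracting the running time, bound, and objective from CBC's log is left out here because the log format depends on the CBC version.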

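After the filtering steps above, the number of test cases used for evaluation is finally down to a manageable level. We can now turn to the main topic: how to implement our own B&B solver.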