# Predict bad LS moves

This notebook develops simple models for predicting bad local search (LS) moves. Specifically, given nodes $U$ and $V$ in routes $R_U$ and $R_V$, it predicts, for each LS operator we currently have, whether applying that operator to the node pair $(U, V)$ is likely to produce an improving solution.

The motivation is that fully evaluating an operator move is typically somewhat slow, whereas a fast and reasonably accurate predictor would let us skip the evaluation of moves that are unlikely to improve the solution.

In [14]:
%cd ..

D:\Projects\Python\Euro-NeurIPS-2022


In [1]:
%matplotlib inline

In [73]:
from collections import defaultdict
from dataclasses import dataclass
from glob import iglob
import itertools
from pathlib import Path
import re

import numpy as np
import matplotlib.pyplot as plt

from sklearn.linear_model import SGDClassifier

import tools

In [74]:
DATA_PATH = Path("data/raw/")
INST_PATH = Path("instances/")

## Utilities

These utilities parse the raw results for a single instance into a more workable format that contains the same data. Parsing is done with a generator, so the memory overhead is minimal. A helper for consuming such generators in fixed-size chunks is also provided.

In [75]:
def chunked_iterable(iterable, size):
    "After https://alexwlchan.net/2018/12/iterating-in-fixed-size-chunks/"
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, size))
        if not chunk:
            break
        yield chunk

In [76]:
@dataclass
class Record:
    op: int
    U: int
    V: int
    delta: int
    Ru: list[int]
    Rv: list[int]

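def parse_file(file: str):
    "Yields the records in the given raw results file, in chunks of up to 1024."
    def parse_record(record: list[str]) -> Record:
        # Each record spans four lines: the operator, "U V delta", and the
        # routes of U and V.
        op = int(record[0].strip())
        U, V, delta = map(int, record[1].strip().split(" "))
        # The first number on each route line is discarded; the remaining
        # numbers are kept as the route.
        _, *Ru = map(int, re.findall(r'[0-9]+', record[2].strip()))
        _, *Rv = map(int, re.findall(r'[0-9]+', record[3].strip()))

        return Record(op, U, V, delta, Ru, Rv)

    with open(file, 'r') as fh:
        # Repeating the same iterator four times makes zip() group the file
        # into tuples of four consecutive lines.
        args = [iter(fh)] * 4
        records = zip(*args)

        yield from chunked_iterable(map(parse_record, records), 1024)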
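
To make the four-lines-per-record grouping concrete, here is a minimal sketch that runs the same idiom on an in-memory sample. The sample text, including the `Route #1:` prefix, is invented for illustration; the real raw files may look different, and only the layout implied by `parse_record` is assumed.

```python
import io
import re

# Invented sample in the four-line layout that parse_record appears to expect:
# operator id, "U V delta", route of U, route of V. All values are made up.
sample = io.StringIO(
    "0\n"
    "3 7 -12\n"
    "Route #1: 1 3 5\n"
    "Route #2: 7 9\n"
    "1\n"
    "5 9 4\n"
    "Route #1: 1 3 5\n"
    "Route #2: 7 9\n"
)


def group_lines(fh, n):
    # The same iterator repeated n times makes zip() pull n consecutive
    # lines into each tuple.
    return zip(*[iter(fh)] * n)


parsed = []
for op_line, move_line, ru_line, rv_line in group_lines(sample, 4):
    op = int(op_line.strip())
    u, v, delta = map(int, move_line.split())
    # Drop the first number on each route line, keep the rest as the route.
    _, *ru = map(int, re.findall(r"[0-9]+", ru_line))
    _, *rv = map(int, re.findall(r"[0-9]+", rv_line))
    parsed.append((op, u, v, delta, ru, rv))

print(parsed[0])
```

Note that `zip` stops at the shortest input, so a trailing record with fewer than four lines is dropped silently rather than raising an error.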

## Features

In [77]:
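def make_features(instance, records):
    "Builds one row per record with the scaled durations U -> V and V -> U."
    dist = instance['duration_matrix']
    dist_max = dist.max()
    data = []

    for record in records:
        # Scale by the largest duration so that features lie in [0, 1].
        dist_uv = dist[record.U, record.V] / dist_max
        dist_vu = dist[record.V, record.U] / dist_max

        data.append([dist_uv, dist_vu])

    return np.array(data)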
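
Since these features are just scaled lookups in the duration matrix, they can also be computed without a Python-level loop. The following is a sketch rather than a drop-in replacement: `make_features_vectorised` is a hypothetical name, the toy matrix and records are invented, and the first column holds the scaled $U \to V$ duration while the second holds $V \to U$.

```python
from types import SimpleNamespace

import numpy as np


def make_features_vectorised(instance, records):
    # Hypothetical helper: computes both scaled durations for all records
    # at once using NumPy integer-array indexing.
    dist = instance["duration_matrix"]
    U = np.fromiter((r.U for r in records), dtype=int)
    V = np.fromiter((r.V for r in records), dtype=int)
    scaled = dist / dist.max()
    return np.column_stack([scaled[U, V], scaled[V, U]])


# Invented 3x3 asymmetric duration matrix and two stand-in records.
toy_instance = {"duration_matrix": np.array([[0, 2, 4], [1, 0, 8], [3, 5, 0]])}
toy_records = [SimpleNamespace(U=1, V=2), SimpleNamespace(U=0, V=1)]

X = make_features_vectorised(toy_instance, toy_records)
print(X)
```

For chunks of 1024 records the difference is unlikely to matter much, but the vectorised form keeps the feature definition to a single line, which helps when more features are added.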

In [78]:
# instance = tools.read_vrplib(INST_PATH / 'ORTEC-VRPTW-ASYM-1de83915-d1-n262-k15.txt')

# record = next(parse_file(DATA_PATH / 'ORTEC-VRPTW-ASYM-1de83915-d1-n262-k15.txt'))
# make_features(instance, record)

## Training

In [82]:
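# Note: max_iter only affects fit(). Each partial_fit() call makes a single
# pass over the data it is given.
model = SGDClassifier(max_iter=5)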

In [83]:
instance = tools.read_vrplib(INST_PATH / 'ORTEC-VRPTW-ASYM-1de83915-d1-n262-k15.txt')

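for records in parse_file(DATA_PATH / 'ORTEC-VRPTW-ASYM-1de83915-d1-n262-k15.txt'):
    # Label 1 when the move has a negative delta, 0 otherwise.
    y = [int(record.delta < 0) for record in records]
    X = make_features(instance, records)
    # All classes must be passed explicitly, since a single chunk may not
    # contain both labels.
    model.partial_fit(X, y, classes=[0, 1])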

In [85]:
# for records in parse_file(DATA_PATH / 'ORTEC-VRPTW-ASYM-1de83915-d1-n262-k15.txt'):
#     y = [int(record.delta < 0) for record in records]
#     X = make_features(instance, records)
    
#     print(model.score(X, y))
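
The commented-out cell scores the model on the same file it was trained on, which tends to give an optimistic estimate. One alternative is to hold some chunks out of training and score only on those. The sketch below shows the pattern on synthetic data: `synthetic_chunk`, the labelling rule, and the 8/2 split are all invented for illustration and say nothing about how the real records behave.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)


def synthetic_chunk(n=1024):
    # Invented stand-in for one chunk of (features, labels): two features in
    # [0, 1], with label 1 more likely when both features are small.
    X = rng.random((n, 2))
    y = (X.sum(axis=1) + rng.normal(scale=0.2, size=n) < 0.8).astype(int)
    return X, y


chunks = [synthetic_chunk() for _ in range(10)]
train, held_out = chunks[:8], chunks[8:]

model = SGDClassifier(random_state=0)
for X, y in train:
    # Incremental training, one chunk at a time.
    model.partial_fit(X, y, classes=[0, 1])

# Accuracy on chunks the model has never seen.
scores = [model.score(X, y) for X, y in held_out]
mean_accuracy = float(np.mean(scores))

# Majority-class baseline: the accuracy of always predicting the most
# common label in the held-out data.
y_all = np.concatenate([y for _, y in held_out])
baseline = float(max(y_all.mean(), 1 - y_all.mean()))

print(f"held-out accuracy: {mean_accuracy:.3f}, baseline: {baseline:.3f}")
```

Comparing against the majority-class baseline matters here because improving moves are probably much rarer than non-improving ones, in which case plain accuracy can look high even for a model that never predicts an improvement.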