
# **REFERENCES**
* https://www.kaggle.com/starohub/st-21-a-minmax-ctsp
* https://www.kaggle.com/yamqwe/permutations-rebalancing-multiprocessing
* http://webhotel4.ruc.dk/~keld/research/LKH-3/

In [1]:
import os
import random
import itertools
import numpy as np
import pandas as pd
from tqdm.contrib.concurrent import process_map

In [2]:
!wget http://webhotel4.ruc.dk/~keld/research/LKH-3/LKH-3.0.7.tgz
!tar xvfz LKH-3.0.7.tgz
!cd LKH-3.0.7; make; cp LKH ..

--2025-07-24 12:57:11--  http://webhotel4.ruc.dk/~keld/research/LKH-3/LKH-3.0.7.tgz
Resolving webhotel4.ruc.dk (webhotel4.ruc.dk)... 130.225.220.230
Connecting to webhotel4.ruc.dk (webhotel4.ruc.dk)|130.225.220.230|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2316081 (2.2M) [application/x-gzip]
Saving to: ‘LKH-3.0.7.tgz’


2025-07-24 12:57:17 (505 KB/s) - ‘LKH-3.0.7.tgz’ saved [2316081/2316081]

LKH-3.0.7/
LKH-3.0.7/pr2392.par
LKH-3.0.7/whizzkids96.atsp
LKH-3.0.7/Makefile
LKH-3.0.7/whizzkids96.par
LKH-3.0.7/pr2392.tsp
LKH-3.0.7/DOC/
LKH-3.0.7/README.txt
LKH-3.0.7/SRC/
LKH-3.0.7/SRC/Penalty_CVRPTW.c
LKH-3.0.7/SRC/RestoreTour.c
LKH-3.0.7/SRC/SolveKMeansSubproblems.c
LKH-3.0.7/SRC/IsCommonEdge.c
LKH-3.0.7/SRC/Penalty_TSPPD.c
LKH-3.0.7/SRC/ReadProblem.c
LKH-3.0.7/SRC/BestKOptMove.c
LKH-3.0.7/SRC/Distance_SPECIAL.c
LKH-3.0.7/SRC/Penalty_TSPDL.c
LKH-3.0.7/SRC/Penalty_PDPTW.c
LKH-3.0.7/SRC/Penalty_ACVRP.c
LKH-3.0.7/SRC/CreateCandidateS

The starting solution of length 2440 was obtained by running the CTSP model for several days.

In [3]:
LETTERS = {
    1: '🎅',  # father christmas
    2: '🤶',  # mother christmas
    3: '🦌',  # reindeer
    4: '🧝',  # elf
    5: '🎄',  # christmas tree
    6: '🎁',  # gift
    7: '🎀',  # ribbon
    8: '🌟',  # star
}
INV_LETTERS = {v: k for k, v in LETTERS.items()}

solution = pd.read_csv('../input/ctsp-2440/submission_no_wildcards_2440_2440_2440.csv')
strings = [[INV_LETTERS[c] for c in s] for s in solution.schedule]
strings.sort(key=len, reverse=True)
print(f'Strings lengths are {[len(_) for _ in strings]}.')

Strings lengths are [2440, 2440, 2440].


# A closer look at the permutations in each string

Rebalancing reduces the number of nodes (permutations) assigned to each string, which may help the solver reach a solution faster.
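To make the idea concrete, here is a toy-scale sketch of rebalancing, assuming 3-symbol permutations instead of 7. It follows the same rule as the notebook's `rebalance_perms`: a non-mandatory permutation is dropped from a string when a later string already contains it. The helper `rebalance_toy` and the sample data are hypothetical.

```python
def rebalance_toy(groups, is_mandatory):
    """Drop non-mandatory permutations that also occur in a later group."""
    # dicts give fast lookup and preserve the original order
    groups = [dict.fromkeys(g) for g in groups]
    for idx, group in enumerate(groups):
        later = groups[idx + 1:]
        for p in list(group):  # iterate over a copy so we can delete
            if not is_mandatory(p) and any(p in other for other in later):
                del group[p]
    return [list(g) for g in groups]

toy_groups = [
    [(1, 2, 3), (3, 1, 2), (2, 3, 1)],
    [(1, 2, 3), (3, 1, 2)],
    [(2, 3, 1), (1, 2, 3)],
]
# Assumption: permutations starting with (1, 2) are mandatory in every string.
balanced = rebalance_toy(toy_groups, lambda p: p[:2] == (1, 2))
print([len(g) for g in balanced])
```

The mandatory permutation stays in all three groups, while each of the other permutations survives in only one group.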

In [4]:
def find_strings_perms(strings, verbose=False):
    all_perms = set(itertools.permutations(range(1, 8), 7))
    perms = []
    for s in strings:
        perms.append([])
        for i in range(len(s)-6):
            p = tuple(s[i:i+7])
            if p in all_perms:
                perms[-1].append(p)
    if verbose:
        lens = [len(_) for _ in  perms]
        print(f'There are {lens} permutations in strings, {sum(lens)} in total.')
        lens = [len(set(_)) for _ in  perms]
        print(f'There are {lens} unique permutations in strings, {sum(lens)} in total.')
    return perms

strings_perms = find_strings_perms(strings, verbose=True)

There are [1941, 1904, 1941] permutations in strings, 5786 in total.
There are [1891, 1852, 1891] unique permutations in strings, 5634 in total.


In [5]:
def rebalance_perms(strings_perms, verbose=False):
    # convert to dicts for fast lookup and to keep permutations order
    strings_perms = [dict.fromkeys(_) for _ in strings_perms] 
    for p in strings_perms[0].copy():  # iterate over the copy to allow modification during iteration
        if p[:2] != (1, 2) and (p in strings_perms[1] or p in strings_perms[2]):
            strings_perms[0].pop(p)
    for p in strings_perms[1].copy():
        if p[:2] != (1, 2) and p in strings_perms[2]:
            strings_perms[1].pop(p)
    if verbose:
        lens = [len(_) for _ in  strings_perms]
        print(f'There are {lens} permutations left in strings after rebalancing, {sum(lens)} in total.')
    return [list(_) for _ in strings_perms] 

strings_perms = rebalance_perms(strings_perms, verbose=True)

There are [1652, 1737, 1891] permutations left in strings after rebalancing, 5280 in total.


## Adding wildcards
* Wildcard options are currently added only for mandatory permutations, since these probably contribute the most extra length to a string and are therefore the best candidates for a wildcard.
* The wildcard is placed at the beginning of the permutation, for example *234567.

Possible improvements:
* Adding wildcard options at other positions.
* Adding wildcard options for non-mandatory permutations.
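As a small illustration of the current rule, the sketch below builds the first-position wildcard variant for mandatory permutations only (those starting with 1, 2). The helper name `first_position_wildcard` and the sample permutations are hypothetical; symbol 8 is the wildcard id used in `LETTERS`.

```python
WILDCARD = 8  # wildcard symbol id, as in the notebook's LETTERS mapping

def first_position_wildcard(perm):
    """Return a copy of perm with its first symbol replaced by the wildcard."""
    return (WILDCARD,) + tuple(perm[1:])

sample_perms = [
    (1, 2, 3, 4, 5, 6, 7),  # mandatory: starts with (1, 2)
    (3, 1, 2, 4, 5, 6, 7),  # not mandatory, so it gets no wildcard variant
    (1, 2, 7, 6, 5, 4, 3),  # mandatory
]
wildcard_options = [first_position_wildcard(p)
                    for p in sample_perms if p[:2] == (1, 2)]
print(wildcard_options)
```

Supporting other wildcard positions would mean generating one variant per position, which multiplies the number of nodes the solver has to handle.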

### Optimization trick
The distance from a wildcard permutation to its own non-wildcard permutation is defined as 0, and the distance in the opposite direction as 7. This way the optimizer can still visit every non-wildcard permutation while also using wildcards.
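The asymmetry can be sketched in isolation. The snippet below covers only this special pair (a wildcard permutation and the permutation it was derived from) and returns `None` for everything else, so it is not a replacement for the full distance function. The names `is_wildcard_of` and `special_case_distance` are hypothetical.

```python
WILDCARD = 8  # wildcard symbol id

def is_wildcard_of(w, p):
    """True if w equals p everywhere except at w's single wildcard slot."""
    if WILDCARD not in w or WILDCARD in p:
        return False
    i = w.index(WILDCARD)
    return w[:i] == p[:i] and w[i + 1:] == p[i + 1:]

def special_case_distance(p, q):
    """Distance for the wildcard/original pair only; None otherwise."""
    if is_wildcard_of(p, q):
        return 0  # wildcard -> its original permutation costs nothing
    if is_wildcard_of(q, p):
        return 7  # original -> its wildcard costs a full permutation
    return None   # every other pair needs the general overlap rule

original = (1, 2, 3, 4, 5, 6, 7)
wild = (8, 2, 3, 4, 5, 6, 7)
print(special_case_distance(wild, original))
print(special_case_distance(original, wild))
```

Because the two directions differ, the resulting distance matrix is asymmetric, which is why an asymmetric problem type is needed.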

In [6]:
def perm_dist_wildcard(p, q):
    # Ad hoc generalization of the original distance calculation that takes wildcards into account.
    # Case: q contains a wildcard but p does not (q[0] == 8 is the first-position wildcard).
    if (8 in q) and (8 not in p):
        if (q[0] == 8):
            i = p.index(q[1])
            if p[i:] == q[1:7-i+1]:
                if i - 1 == 0:
                    #Going from non-wildcard permutation to its wildcard will result in length 7
                    return 7
                else:
                    return i - 1
            else:
                return 6
        else:
            i = p.index(q[0])
            w = q.index(8)
            return i if (p[i:i+w] == q[:w]) and (p[i+w+1:] == q[w+1:7-i]) else 7
    
    #Distance from wildcard to its normal permutation is 0.
    if (8 in p) and (8 not in q):
        wi = p.index(8)
        if (p[:wi] == q[:wi]) and (p[wi+1:] == q[wi+1:]):
            return 0
        
    #Distance from wildcard to non-wildcard
    if q[0] not in p:
        i = p.index(8)
        return i if p[i+1:] == q[1:7-i] else 7
    
    i = p.index(q[0])
    return i if p[i:] == q[:7-i] else 7

def create_perms_wildcards(perms):
    #Create wildcards with only first member as wildcard
    #Currently creates wildcards for only mandatory members
    perms_wildcards = perms.copy()
    wildcards = []
    for perm in perms:
        if perm[:2] == (1,2):
            perm_w = list(perm)
            perm_w[0] = 8
            perm_w = tuple(perm_w)
            wildcards.append(perm_w)
    for wildcard in wildcards:
        perms_wildcards.append(wildcard)
    return perms_wildcards

def perms_to_string_wildcards(perms):
    perms = list(perms)
    s = [*perms[0]]
    for p, q in zip(perms, perms[1:]):
        d = perm_dist_wildcard(p, q)
        if 8 in q:
            if d == 7:
                s[-7:] = q[:-d]
            else:
                s[-7+d:] = q[:-d]
        if d > 0:
            s.extend(q[-d:])
    return s

def distances_matrix_wildcards(perms):
    m = np.zeros((len(perms), len(perms)), dtype='int8')
    for i, p in enumerate(perms):
        for j, q in enumerate(perms):
            m[i, j] = perm_dist_wildcard(p, q)
    return m

def write_params_file_acvrp(uid):
    with open('santa_%s.par' % uid, 'w') as f:
        print('SPECIAL', file=f)
        print('PROBLEM_FILE = santa_%s.vrp' % uid, file=f)
        print('MTSP_OBJECTIVE = MINSUM', file=f)
        print('TOUR_FILE = best_tour_%s.txt' % uid, file=f)
        print('OUTPUT_TOUR_FILE = output_tour_%s_$.txt' % uid, file=f)
        print('INITIAL_TOUR_FILE = initial_tour_%s.txt' % uid, file=f)
        print('SALESMEN = 1', file=f)
        print('PATCHING_C = 4', file=f)
        print('PATCHING_A = 3', file=f)
        print('GAIN23 = YES', file=f)
        print('SEED = 42', file=f)
        print('MAX_TRIALS = 100000', file=f)
        print('TIME_LIMIT = 30000', file=f) #seconds
        print('TRACE_LEVEL = 2', file=f)
        print('PRECISION = 1', file=f)

def write_problem_file_acvrp(uid, distances, capacity=2, wildcard_count=0):
    with open('santa_%s.vrp' % uid, 'w') as f:
        print('TYPE: ACVRP', file=f)
        print(f'DIMENSION: {len(distances)}', file=f)
        print(f'CAPACITY : {capacity}', file=f)
        print('VEHICLES : 1', file=f)
        print('EDGE_WEIGHT_TYPE: EXPLICIT', file=f)
        print('EDGE_WEIGHT_FORMAT: FULL_MATRIX\n', file=f)
        print('EDGE_WEIGHT_SECTION', file=f)
        for row in distances:
            print(' '.join(str(_) for _ in row), file=f)
        print('DEMAND_SECTION', file=f)
        r = 0
        non_wildcard_count = len(distances) - wildcard_count
        for row in distances:
            r += 1
            if r > non_wildcard_count:
                print(f'{r}    1', file=f)
            else:
                print(f'{r}    0', file=f)

def write_initial_tour_file(uid, perms):
    with open('initial_tour_%s.txt' % uid, 'w') as f:
        print('TOUR_SECTION', file=f)
        print(' '.join(str(_) for _ in range(1, len(perms)+1)), -1, file=f)

def read_output_tour_wildcard(uid, perms, exclusion_count = 0):
    perms = list(perms)
    with open('best_tour_%s.txt' % uid) as f:
        lines = f.readlines()
    tour = lines[lines.index('TOUR_SECTION\n')+1:-2-exclusion_count]
    return [perms[int(_) - 1] for _ in tour] 

def solve_acvrp(perms, verbose=False):
    uid = str(random.randint(1, 9999))
    write_params_file_acvrp(uid)
    perms_wildcard = create_perms_wildcards(perms)
    wildcard_option_count = len(perms_wildcard) - len(perms)
    distances = distances_matrix_wildcards(perms_wildcard)
    write_problem_file_acvrp(uid, distances, 2, wildcard_option_count)
    write_initial_tour_file(uid, perms_wildcard)
    
    # Run LKH-3 to solve ACVRP instance
    if verbose:
        os.system('./LKH santa_%s.par' % uid)
    else:
        os.system('touch lkh_%s.log' % uid)
        os.system('./LKH santa_%s.par >> lkh_%s.log' % (uid, uid))
    
    # The unused wildcard options sit at the end of the tour file and must be excluded.
    # Two wildcards are actually used, so the tour has 2 more members than a normal tour;
    # that is why wildcard_option_count - 2 is passed as the exclusion count.
    tour = read_output_tour_wildcard(uid, perms_wildcard, wildcard_option_count-2)
    return perms_to_string_wildcards(tour)

# Solving wildcards using ACVRP (Asymmetric Capacitated Vehicle Routing Problem)
The ACVRP solver from LKH-3 is used.

## General idea
Constrain how many wildcards can be "traveled through"; in this task the limit is 2. Non-wildcard nodes get a demand of 0 and wildcard nodes a demand of 1. With the vehicle capacity set to 2, the solution should use only 2 wildcards.
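A minimal sketch of this capacity bookkeeping, assuming (as in `write_problem_file_acvrp`) that the wildcard nodes are listed after all non-wildcard nodes. The helpers `node_demands` and `route_is_feasible` are hypothetical and only mimic the check; the actual constraint is enforced by LKH-3.

```python
CAPACITY = 2  # at most two wildcards may be used

def node_demands(n_nodes, wildcard_count):
    """Demand 0 for non-wildcard nodes, 1 for the trailing wildcard nodes."""
    non_wildcards = n_nodes - wildcard_count
    return [0] * non_wildcards + [1] * wildcard_count

def route_is_feasible(route, demands, capacity=CAPACITY):
    """A route is feasible if the total demand it picks up fits the capacity."""
    return sum(demands[node - 1] for node in route) <= capacity

# Nodes 1-3 are normal permutations, nodes 4-6 are wildcard options.
demands = node_demands(n_nodes=6, wildcard_count=3)
print(demands)
print(route_is_feasible([1, 4, 2, 5, 3], demands))     # uses 2 wildcards
print(route_is_feasible([1, 4, 2, 5, 3, 6], demands))  # uses 3 wildcards
```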

## How to read the results from tour file?
All unused wildcards end up at the end of the tour, so they should be ignored.
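The trimming can be sketched on an in-memory tour, assuming the usual TSPLIB tour layout (a `TOUR_SECTION` line, one node id per line, then `-1` and `EOF`). The sample lines and the helper `trim_tour` are hypothetical; a real LKH-3 tour file has more header lines.

```python
def trim_tour(lines, unused_wildcards):
    """Return the node ids of the tour without the trailing unused wildcards."""
    start = lines.index('TOUR_SECTION\n') + 1
    body = lines[start:-2]  # drop the '-1' terminator and the 'EOF' line
    nodes = [int(x) for x in body]
    # slicing by length also works when unused_wildcards is 0
    return nodes[:len(nodes) - unused_wildcards]

sample_lines = ['NAME : toy\n', 'TOUR_SECTION\n',
                '1\n', '4\n', '2\n', '3\n', '5\n', '6\n',
                '-1\n', 'EOF\n']
# Assumption: nodes 5 and 6 are unused wildcard options parked at the end.
print(trim_tour(sample_lines, unused_wildcards=2))
```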

## Why VRP?
LKH-3 did not seem to offer a good option for a capacitated TSP, but that problem is the same as a VRP with a single vehicle.

In [7]:
print('='*91)
#Solve ACVRP for wildcards!
new_strings = list(process_map(solve_acvrp, strings_perms))
new_strings.sort(key=len, reverse=True)
new_lens = [len(_) for _ in new_strings]
strings = new_strings



  0%|          | 0/3 [00:00<?, ?it/s]

In [8]:
sub = pd.DataFrame()
sub['schedule'] = [''.join(LETTERS[x] for x in s) for s in strings]
sub_name = 'submission.csv'
sub.to_csv(sub_name, index=False)

Because of the time limit, this may not be the best solution the model can find, but with enough running time a solution of length 2430 can be obtained.