# Local search algorithms - LS, SA, VNS

These algorithms will not give particularly good solutions, but they are a solid option for inputs where we know the strings will not overlap much.

### Helper functions

In [1]:
import random

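def merge(arr):
    """Merge the strings in the given order, overlapping each string
    with the end of the superstring built so far.

    Returns the superstring and its length.
    """
    solution = arr[0]
    for string in arr[1:]:
        # Longest prefix of `string` that is also a suffix of `solution`
        max_len = 0
        for j in range(1, min(len(solution), len(string)) + 1):
            if solution.endswith(string[:j]):
                max_len = j
        solution += string[max_len:]
    return solution, len(solution)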

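def swap(arr):
    """Return a copy of arr with two randomly chosen positions swapped."""
    # Work on a copy so the caller's current solution is not modified
    new_arr = arr.copy()
    index1, index2 = random.sample(range(len(new_arr)), 2)
    new_arr[index1], new_arr[index2] = new_arr[index2], new_arr[index1]
    return new_arr

def shaking(arr, k):
    """Return a copy of arr after k random swaps."""
    # Work on a copy so the caller's current solution is not modified
    new_arr = arr.copy()
    for _ in range(k):
        index1, index2 = random.sample(range(len(new_arr)), 2)
        new_arr[index1], new_arr[index2] = new_arr[index2], new_arr[index1]
    return new_arr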
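
The searches below explore different orderings of the input strings, because the length returned by `merge` depends on that order. The following is a minimal, self-contained sketch of the same overlap idea; the helper names `overlap` and `merge_in_order` are illustrative and not part of the notebook.

```python
def overlap(left, right):
    """Length of the longest prefix of `right` that is also a suffix of `left`."""
    best = 0
    for j in range(1, min(len(left), len(right)) + 1):
        if left.endswith(right[:j]):
            best = j
    return best


def merge_in_order(strings):
    """Merge strings left to right, reusing the overlap found at each step."""
    result = strings[0]
    for s in strings[1:]:
        result += s[overlap(result, s):]
    return result


# The same four strings in two different orders (illustrative input)
order_a = ["AAB", "BAA", "ABA", "BAB"]
order_b = ["BAA", "AAB", "ABA", "BAB"]

print(merge_in_order(order_a), len(merge_in_order(order_a)))  # AABAABAB 8
print(merge_in_order(order_b), len(merge_in_order(order_b)))  # BAABAB 6
```

Moving a single string to a different position shortens the result from 8 to 6 characters, which is the kind of improvement the swap-based moves look for.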

### Main local search algorithm - LS

In [2]:
import time
num_iter = 100
def LS(arr):
    start = time.time()
    
    arr = list(set(arr))  # remove all duplicates from the input array
    
    scs, length = merge(arr)
    
    for i in range(num_iter):
        new_arr = swap(arr)
        new_scs, new_length = merge(new_arr)
        
        if new_length < length:
            arr, scs, length = new_arr, new_scs, new_length
    
    
    end = time.time()
    delta = end-start
    if delta < 0.0001:
        print("LS:  Execution time: <0.0001 seconds.")
    else:
        print("LS:  Execution time: " + str(round(delta, 4)) + " seconds.")
        
    return scs, length

### Main simulated annealing algorithm - SA

In [3]:
def SA(arr):
    start = time.time()
    
    arr = list(set(arr))  # remove all duplicates from the input array
    
    scs, length = merge(arr)
    best_arr, best_scs, best_length = arr, scs, length
    
    for i in range(1, num_iter):
        new_arr = swap(arr)
        new_scs, new_length = merge(new_arr)
        
        if new_length < length:
            arr, scs, length = new_arr, new_scs, new_length
            if new_length < best_length:
                best_arr, best_scs, best_length = new_arr, new_scs, new_length
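        else:
            # Accept an equal or worse solution with probability 1/sqrt(i),
            # which shrinks as the iteration counter i grows
            p = 1 / i ** 0.5
            q = random.random()
            if q < p:
                arr, scs, length = new_arr, new_scs, new_length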
                
    end = time.time()
    delta = end-start
    if delta < 0.0001:
        print("SA:  Execution time: <0.0001 seconds.")
    else:
        print("SA:  Execution time: " + str(round(delta, 4)) + " seconds.")
        
    return best_scs, best_length
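
To get a feel for how quickly `SA` stops accepting non-improving moves, the sketch below evaluates the acceptance probability `1 / i ** 0.5` used above at a few iterations. The function name `acceptance_probability` is only illustrative.

```python
def acceptance_probability(i):
    """Probability of accepting a non-improving move at iteration i (i >= 1)."""
    return 1 / i ** 0.5


for i in (1, 4, 25, 99):
    print(i, round(acceptance_probability(i), 3))
# 1 1.0
# 4 0.5
# 25 0.2
# 99 0.101
```

At the first iteration every move is accepted, while by the last of the 99 iterations only about one non-improving move in ten is.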

### Main variable neighborhood search algorithm - VNS

In [4]:
def VNS(arr, k_max, move_prob):
    start = time.time()
    
    arr = list(set(arr))  # remove all duplicates from the input array
    
    scs, length = merge(arr)
    
    for i in range(num_iter):
        for k in range(1, k_max):  # neighborhoods k = 1, ..., k_max - 1 (k_max itself is not tried)
            new_arr = shaking(arr, k)
            new_scs, new_length = merge(new_arr)
        
            if new_length < length or (new_length == length and random.random() < move_prob):
                arr, scs, length = new_arr, new_scs, new_length
                break
    
    
    end = time.time()
    delta = end-start
    if delta < 0.0001:
        print("VNS: Execution time: <0.0001 seconds.\n")
    else:
        print("VNS: Execution time: " + str(round(delta, 4)) + " seconds.\n")
        
    return scs, length
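
The parameter `k` controls how far a shaken ordering can move away from the current one. The sketch below illustrates this under the assumption that shaking means `k` random swaps, as in `shaking`. The names `shaking_copy` and `moved_positions` are hypothetical helpers introduced only for this illustration.

```python
import random


def shaking_copy(arr, k):
    """Return a copy of arr after k random swaps (hypothetical helper)."""
    new_arr = list(arr)
    for _ in range(k):
        i, j = random.sample(range(len(new_arr)), 2)
        new_arr[i], new_arr[j] = new_arr[j], new_arr[i]
    return new_arr


def moved_positions(a, b):
    """Number of positions at which two orderings differ."""
    return sum(x != y for x, y in zip(a, b))


base = list("ABCDEFGH")  # illustrative ordering of 8 distinct items
for k in range(1, 5):
    shaken = shaking_copy(base, k)
    # Each swap touches two positions, so at most 2 * k positions can change
    print(k, moved_positions(base, shaken))
```

Larger values of `k` therefore allow bigger jumps, which is why the loop in `VNS` moves on to the next `k` only when the smaller neighborhood did not yield an accepted move.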

Helper functions for printing the results

In [5]:
def print_all_solutions(sol1, sol2, sol3):
    print("Found shortest superstrings:\nLS: " + sol1 + "\tSA: " + sol2 + "\tVNS: " + sol3)
        
def print_all_sizes(size1, size2, size3):
    print("Sizes of found shortest superstrings:\nLS: " + str(size1) + "\tSA: " + str(size2) + "\tVNS: " + str(size3))

## Examples

### Example 1

The input is an array of 4 strings of 3 characters each.

In [6]:
arr = ["AAB", "BAA", "ABA", "BAB"]

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.0005 seconds.
SA:  Execution time: 0.0005 seconds.
VNS: Execution time: 0.0028 seconds.

Found shortest superstrings:
LS: BABAAB	SA: BAABAB	VNS: BAABAB
Sizes of found shortest superstrings:
LS: 6	SA: 6	VNS: 6
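
For an input this small, the result can be checked by trying every ordering of the strings. The sketch below does that with `itertools.permutations`; `merge_in_order` and `best_ordering` are illustrative names, and the merge is a self-contained restatement of the overlap idea used in `merge`. It only finds the best length reachable by that merge procedure, and the number of orderings grows factorially, so it is practical only for a handful of strings.

```python
from itertools import permutations


def merge_in_order(strings):
    """Left-to-right overlap merge, same idea as the notebook's merge."""
    result = strings[0]
    for s in strings[1:]:
        best = 0
        for j in range(1, min(len(result), len(s)) + 1):
            if result.endswith(s[:j]):
                best = j
        result += s[best:]
    return result


def best_ordering(strings):
    """Try every ordering and keep the shortest merged string."""
    return min((merge_in_order(list(p)) for p in permutations(strings)), key=len)


best = best_ordering(["AAB", "BAA", "ABA", "BAB"])
print(len(best))  # 6
```

The exhaustive check also gives length 6, so all three heuristics reached the best ordering on this input.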


### Example 2

The input is an array of 5 strings of 4 characters each.

In [7]:
arr = ["bloa", "bubl", "gabl", "abpo", "ublm"]

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.0005 seconds.
SA:  Execution time: 0.0006 seconds.
VNS: Execution time: 0.003 seconds.

Found shortest superstrings:
LS: gabloabpobublm	SA: bublmgabloabpo	VNS: gabloabpobublm
Sizes of found shortest superstrings:
LS: 14	SA: 14	VNS: 14


### Example 3

The input is an array of 20 strings of 4 characters each.

In [8]:
arr = ["wobj" , "bfqp", "pzlb", "rfcs", "atha", 
       "npjp", "tfgu", "izjx", "dven", "tksn", 
       "fqws", "cusc", "qlpy", "fepk", "cbzj", 
       "ecrx", "cpsp", "zqdp", "liqu", "rdyu"]

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

#print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.0016 seconds.
SA:  Execution time: 0.0016 seconds.
VNS: Execution time: 0.0111 seconds.

Sizes of found shortest superstrings:
LS: 77	SA: 77	VNS: 77


### More complex examples

Helper function for loading strings from a .txt file into an array

In [9]:
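def LoadStringsFromTxt(filename):
    with open(filename) as f:
        # One string per line; skip empty lines such as the one after a
        # trailing newline, so a file without one does not lose its last string
        return [line for line in f.read().splitlines() if line]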

### Example 4

In [10]:
arr = []
arr = LoadStringsFromTxt("data/test1.txt")

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

#print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.0272 seconds.
SA:  Execution time: 0.0235 seconds.
VNS: Execution time: 0.0712 seconds.

Sizes of found shortest superstrings:
LS: 1593	SA: 1588	VNS: 1586


### Example 5

In [11]:
arr = []
arr = LoadStringsFromTxt("data/test2.txt")

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

#print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.0966 seconds.
SA:  Execution time: 0.0926 seconds.
VNS: Execution time: 0.3288 seconds.

Sizes of found shortest superstrings:
LS: 9586	SA: 9587	VNS: 9580


### Example 6

In [12]:
arr = []
arr = LoadStringsFromTxt("data/DNA_Sequence5.txt")

solution1, size1 = LS(arr)
solution2, size2 = SA(arr)
solution3, size3 = VNS(arr, k_max=5, move_prob = 0.5)

#print_all_solutions(solution1, solution2, solution3)
print_all_sizes(size1, size2, size3)

LS:  Execution time: 0.067 seconds.
SA:  Execution time: 0.0614 seconds.
VNS: Execution time: 0.2237 seconds.

Sizes of found shortest superstrings:
LS: 4650	SA: 4653	VNS: 4627


All three algorithms give very similar results. The only difference is that VNS takes longer to run and, on somewhat larger inputs, gives slightly better solutions on average.
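
Since all three algorithms are randomized, a single run per input says little about average behavior. One possible way to compare them more reliably is to repeat each search with several seeds and look at the mean, minimum and maximum length. The sketch below shows the idea with a simple stand-in search; `random_swap_search` and `summarize_runs` are hypothetical names, not the notebook's `LS`, `SA` or `VNS`.

```python
import random
import statistics


def merge_in_order(strings):
    """Left-to-right overlap merge, same idea as the notebook's merge."""
    result = strings[0]
    for s in strings[1:]:
        best = 0
        for j in range(1, min(len(result), len(s)) + 1):
            if result.endswith(s[:j]):
                best = j
        result += s[best:]
    return result


def random_swap_search(strings, iterations, rng):
    """Swap-based local search used here as a stand-in for LS/SA/VNS."""
    current = list(strings)
    best_len = len(merge_in_order(current))
    for _ in range(iterations):
        candidate = current.copy()
        i, j = rng.sample(range(len(candidate)), 2)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        cand_len = len(merge_in_order(candidate))
        if cand_len < best_len:  # keep only strict improvements
            current, best_len = candidate, cand_len
    return best_len


def summarize_runs(strings, runs=20, iterations=100):
    """Mean, minimum and maximum length over several seeded runs."""
    lengths = [random_swap_search(strings, iterations, random.Random(seed))
               for seed in range(runs)]
    return statistics.mean(lengths), min(lengths), max(lengths)


strings = ["bloa", "bubl", "gabl", "abpo", "ublm"]  # input from Example 2
print(summarize_runs(strings))
```

Using a separate seeded `random.Random` per run keeps the comparison reproducible without touching the global random state.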