# Local Beam Search

## Algorithm Description

Beam search is a heuristic search technique that explores a graph by expanding only the most promising nodes in a bounded set. It builds its search tree level by level, in breadth-first order: at each level it generates all successors of the states on the current level, ranks them by their heuristic value, and keeps only the W best ones, where W is the beam width. The search then descends exclusively from those W nodes; all other nodes are discarded and never reconsidered.
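
The level-by-level procedure described above can be sketched in a few lines of Python. This is a minimal, generic illustration rather than the implementation used later in this notebook: the function name `beam_search_levels`, its parameters, and the toy successor function are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")

def beam_search_levels(start: List[S],
                       successors: Callable[[S], Iterable[S]],
                       score: Callable[[S], float],
                       width: int,
                       max_depth: int) -> List[List[S]]:
    """Return the beam kept at each level (level 0 is the start beam)."""
    beam = sorted(start, key=score, reverse=True)[:width]
    levels = [beam]
    for _ in range(max_depth):
        # Generate every successor of every state on the current level...
        candidates = [child for state in beam for child in successors(state)]
        if not candidates:
            break
        # ...but keep only the `width` best ones; the rest are discarded.
        beam = sorted(candidates, key=score, reverse=True)[:width]
        levels.append(beam)
    return levels

# Toy usage (hypothetical problem): states are integers, state n has the
# successors 2n and 2n+1, and a larger number counts as "better".
levels = beam_search_levels([1], lambda n: [2 * n, 2 * n + 1],
                            lambda n: n, width=2, max_depth=3)
print(levels)  # [[1], [3, 2], [7, 6], [15, 14]]
```

With `width=2`, only two of the four candidates survive at each level below the first, which is exactly the pruning that distinguishes beam search from plain breadth-first search.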

## Pseudo Code

*Pseudocode for the function that checks whether a knapsack contains at least one item from every class:*

```
    class_flag <- list of FALSE, one entry per class
    FOR item IN knapsack
        BEGIN
        class_flag[class[item]] = TRUE
        END
    FOR i FROM 1 TO length of class_flag
        IF (NOT class_flag[i])
            RETURN FALSE
        ENDIF
    RETURN TRUE
```
*Pseudocode for Local Beam Search:*
```
FUNCTION LBS(k=5)
    kmax_val <- the k largest values in val
    INITIALIZE frontier AS AN ARRAY OF Node
    FOR i=0 TO len(val)-1
        IF (weight[i] <= capacity AND val[i] IS IN kmax_val)
            ADD Node(i,[]) TO frontier
    result <- copy of frontier
    REPEAT
        INITIALIZE successor AS AN ARRAY
        FOR EACH cur IN frontier
            ADD ALL OF cur.successor() TO successor
        frontier <- the k highest-value nodes in successor
        FOR i=0 TO len(frontier)-1
            result[i] = frontier[i]
    UNTIL (len(frontier)==0)

    result_node <- None
    FOR EACH node IN result
        IF (path from root to node contains every class)
            result_node = node
            BREAK

    RETURN result_node
```

## Algorithm Explanation

In this exercise, local beam search is applied to find the maximum total value achievable within a weight capacity limit. We start with the K highest-value items as the initial frontier and expand the tree level by level: each node in the frontier generates successors by adding one item that is not yet in its path and still fits within the capacity, and the K highest-value successors form the next frontier. The process terminates when no successor can be generated, that is, when every remaining item would exceed the capacity or all items have already been chosen.

Step-by-step algorithm:
1. Start with the k single-item states that have the highest value.
2. At each iteration, generate all successors of all k states.
3. If no state can be extended any further, stop; otherwise select the k best successors from the complete list and repeat.
4. Check which of the k returned paths contain every class, and select the highest-value one among them.

## Evaluation

The heuristic cost associated with each node is used to choose the best nodes. W denotes the width of the beam. If B is the branching factor, W × B nodes are under evaluation at every level, but only W of them are kept. Decreasing the beam width prunes more states.
When W = 1, the search becomes hill climbing, in which the single best successor is always picked. When the beam width is infinite, no states are pruned and beam search is identical to breadth-first search.
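
To make the W × B count concrete, the sketch below tabulates how many nodes are generated and how many are kept at each level. It assumes an idealized uniform tree in which every node has exactly B successors; the helper name `beam_level_counts` is hypothetical and not part of the notebook's code.

```python
def beam_level_counts(width, branching, depth):
    """For a uniform tree, return (generated, kept) for each level below the root."""
    kept = 1  # the root is the only node on level 0
    counts = []
    for _ in range(depth):
        generated = kept * branching   # every kept node is expanded
        kept = min(width, generated)   # only the best `width` nodes survive
        counts.append((generated, kept))
    return counts

print(beam_level_counts(width=3, branching=4, depth=4))
# [(4, 3), (12, 3), (12, 3), (12, 3)]   -> W x B = 12 once the beam is full

print(beam_level_counts(width=1, branching=4, depth=3))
# [(4, 1), (4, 1), (4, 1)]              -> hill climbing: one survivor per level

print(beam_level_counts(width=float("inf"), branching=2, depth=3))
# [(2, 2), (4, 4), (8, 8)]              -> nothing pruned: breadth-first search
```

Once the beam is full, the number of generated nodes stays at W × B per level, whereas the unbounded case grows exponentially with depth.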

## Comments

The beam width bounds the memory required to carry out the search, but at the expense of completeness and optimality: the search may fail to find a solution, or may not find the best one. This risk arises because the desired state may have been pruned.
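
The loss of optimality can be seen on a tiny instance. The sketch below is a simplified, self-contained variant of the value-guided search used in this notebook (it ignores the class constraint), compared against an exhaustive search. The instance and the helper names `beam_knapsack` and `brute_force` are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical instance: the single most valuable item blocks the best pair.
weights = [6, 5, 5]
values = [7, 5, 5]
capacity = 10

def total(items, table):
    return sum(table[i] for i in items)

def beam_knapsack(width):
    """Value-guided beam search; returns the best total value seen."""
    beam = [(i,) for i in range(len(values)) if weights[i] <= capacity]
    beam = sorted(beam, key=lambda s: total(s, values), reverse=True)[:width]
    best = max((total(s, values) for s in beam), default=0)
    while beam:
        # Extend every state by one unused item that still fits.
        children = [s + (i,) for s in beam for i in range(len(values))
                    if i not in s and total(s, weights) + weights[i] <= capacity]
        beam = sorted(children, key=lambda s: total(s, values), reverse=True)[:width]
        best = max([best] + [total(s, values) for s in beam])
    return best

def brute_force():
    """Exhaustive search over all subsets, for comparison."""
    best = 0
    for r in range(len(values) + 1):
        for combo in combinations(range(len(values)), r):
            if total(combo, weights) <= capacity:
                best = max(best, total(combo, values))
    return best

print(beam_knapsack(1), beam_knapsack(2), brute_force())  # 7 10 10
```

With a beam width of 1 the search commits to the item worth 7 and can add nothing else, so the state leading to the optimum (two items worth 5 each) is pruned at the first level. A width of 2 keeps that state alive and reaches the optimal value of 10.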

## Conclusion
Local beam search is a local search method that begins with k randomly generated states and, at each level of the search tree, keeps the k best states among all successors of the current ones, until it reaches a goal. Because local beam search often gets stuck in local maxima, a popular variant (stochastic beam search) selects the next k states at random, with probability determined by the heuristic value of each state. In short, the approach is most often used to keep large problems tractable when there is not enough memory to store the entire search tree, since it minimizes the storage required.
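
The random selection step mentioned above could look roughly like the following. This is only a sketch of one possible scheme: it assumes all scores are positive and draws k distinct states one at a time, each draw weighted by score. The function name `stochastic_select` and the example states are hypothetical.

```python
import random

def stochastic_select(candidates, scores, k, rng):
    """Pick up to k distinct candidates, each draw weighted by its score.

    Assumes every score is positive (random.choices rejects all-zero weights).
    """
    pool = list(zip(candidates, scores))
    chosen = []
    while pool and len(chosen) < k:
        weights = [s for _, s in pool]
        # Draw one index with probability proportional to its score.
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx)[0])  # remove it so it cannot be drawn twice
    return chosen

rng = random.Random(0)  # fixed seed so the run is repeatable
states = ["a", "b", "c", "d", "e"]
scores = [10, 8, 5, 1, 1]
picked = stochastic_select(states, scores, k=3, rng=rng)
print(picked)
```

High-scoring states are still the most likely to survive, but low-scoring ones occasionally get through, which gives the search a chance to escape a local maximum.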


# Visualization

The tree below visualizes how the algorithm processes a test case with the following parameters:
- capacity: 8
- number of classes: 2
- list of values: [1,2,3,4,5]
- list of weights: [1,1,3,2,5]
- list of class labels: [1,1,1,2,2]

![title](visualize/LBS.png)

## Result
- Path: [0,1,0,1,1] 
- Total value: 11
- Total weight: 8
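
The reported result can be checked by hand. The short snippet below recomputes the totals from the parameters listed above, reading the "Path" as a 0/1 selection vector over the five items (an interpretation that matches the output format used later in this notebook).

```python
values = [1, 2, 3, 4, 5]
weights = [1, 1, 3, 2, 5]
labels = [1, 1, 1, 2, 2]
capacity, num_class = 8, 2
selection = [0, 1, 0, 1, 1]  # 1 means the item at that index is taken

chosen = [i for i, bit in enumerate(selection) if bit]
total_value = sum(values[i] for i in chosen)
total_weight = sum(weights[i] for i in chosen)
classes_covered = {labels[i] for i in chosen}

print(total_value, total_weight, sorted(classes_covered))  # 11 8 [1, 2]
```

Items 1, 3 and 4 are taken, giving 2 + 4 + 5 = 11 in value and 1 + 2 + 5 = 8 in weight, which exactly fills the capacity while covering both classes.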

# Demo Code

## Library

In [None]:
import os #list the input files
from tqdm import tqdm #progress bar
import time #measure running time

### Variable Initialization

In [None]:
num_class=0 #number of classes
capacity=0 #weight capacity
weight=[] #weight list of items
val=[] #value list of items
label=[] #class label list of items

### Helper Function

In [None]:
#Check whether a path contains at least one item from each class (labels are 1-based)
def check_class(path,num_class):
    class_flag=[False]*(num_class+1) #index 0 is unused
    for item in path:
        class_flag[label[item]]=True
    for i in range(1,len(class_flag)):
        if not class_flag[i]: return False
    return True

class Node:
    def __init__(self,index,path):
        self.index = index
        self.path=path+[index] #items chosen from the root down to this node
    def current_value(self): #compute total value in path
        return sum(val[i] for i in self.path)
    def current_weight(self): #compute total weight in path
        return sum(weight[i] for i in self.path)
    def successor(self): #generate successors
        lst=[]
        w=self.current_weight() #computed once instead of once per candidate
        for i in range(len(val)):
            #add item i only if it is unused and still fits within the capacity
            if i not in self.path and w+weight[i]<=capacity:
                lst.append(Node(i,self.path))
        return lst

In [None]:
def LBS(k=5):
    # initial state: single-item nodes whose value is among the k largest values
    kmax_val=sorted(val,reverse=True)[:min(k,len(val))]
    frontier=[Node(i,[]) for i in range(len(val)) if weight[i]<=capacity and val[i] in kmax_val]
    result=list(frontier) #copy, so that result is not an alias of frontier
    while True:
        successor=[]
        for cur in frontier:
            successor+=cur.successor()
        #keep only the k highest-value successors
        frontier=sorted(successor, key=lambda x: x.current_value(), reverse=True)[:k]
        if len(frontier)==0: break
        #overwrite the leading entries of result; slice assignment also grows
        #result when the new frontier is longer, which avoids an IndexError
        result[:len(frontier)]=frontier

    #return the first stored node whose path covers every class
    result_node=None
    for node in result:
        if check_class(node.path,num_class):
            result_node=node
            break
    return result_node
   

# Run on Test Files

In [None]:
input_dir='data/input/'
output_dir='data/output03/'
for file in tqdm(os.listdir(input_dir)):
    if (file[-4:]!='.txt'): continue #skip non-input files such as .DS_Store
    with open(input_dir+file) as f_read:
        capacity,num_class,weight,val,label=f_read.readlines()
        
    #set capacity and number of class
    capacity,num_class=int(capacity),int(num_class)

    #set weight
    weight=weight.replace(' ','').replace('\n','')
    weight=weight.split(',')
    weight=[eval(i) for i in weight]

    #set value
    val=val.replace(' ','').replace('\n','')
    val=val.split(',')
    val=[eval(i) for i in val]

    #set label
    label=label.replace(' ','').replace('\n','')
    label=label.split(',')
    label=[eval(i) for i in label]
    
    #Calculate execution time
    start=time.time()
    result=LBS()
    end=time.time()
    file_index=file[6:-4]
    
    #Process file
    output_file=f'output_{file_index}.txt'
    chosen_item=[]
    if result:
        for item in range(len(val)):
            if item not in result.path:
                chosen_item.append('0')
            else:
                chosen_item.append('1')
        result_val=result.current_value()
        total_weight=result.current_weight()
    else:
        result_val=0
        total_weight=0
    with open(output_dir+output_file,'w+') as f_write:
        f_write.write(str(result_val)+'\n')
        f_write.write(str(', '.join(chosen_item)))
    
    # print the result list, highest value and total time needed to run the algorithm
    print(file)
    print(f'The result is: {chosen_item}\nWith the value of: {result_val} and total weight: {total_weight}')
    print(f'Complete searching in: {end-start} seconds\n')



input_001.txt
The result is: ['0', '1', '0', '0', '1', '0', '0', '0', '0', '1']
With the value of: 117 and total weight: 100
Complete searching in: 0.005988359451293945 seconds

input_002.txt
The result is: ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1']
With the value of: 1000 and total weight: 10
Complete searching in: 0.0009968280792236328 seconds

input_003.txt
The result is: ['1', '1', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1']
With the value of: 14211 and total weight: 8729
Complete searching in: 0.0049898624420166016 seconds

input_004.txt
The result is: []
With the value of: 0 and total weight: 0
Complete searching in: 0.0015821456909179688 seconds




input_010.txt
The result is: ['0', '0', '0', '0', '0', '1', '1', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '1', '0', '0', '1', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '1', '1', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '1', '0', '0', '0', '0', '0', '0', '0', 


input_007.txt
The result is: ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', 


input_008.txt
The result is: ['1', '0', '1', '1', '0', '0', '0', '0', '0', '1', '1', '1', '0', '0', '0', '0', '0', '1', '0', '0', '0', '1', '1', '1', '0', '0', '1', '0', '1', '1', '0', '1', '0', '0', '0', '0', '1', '1', '1', '1', '0', '1', '1', '0', '0', '1', '0', '0', '0', '0', '1', '0', '1', '1', '0', '0', '0', '0', '1', '1', '0', '1', '1', '0', '1', '1', '1', '0', '1', '0', '0', '1', '0', '0', '1', '1', '1', '1', '1', '0', '0', '1', '1', '1', '1', '0', '1', '0', '1', '0', '0', '0', '0', '0', '0', '1', '1', '1', '0', '1', '0', '1', '1', '0', '0', '0', '0', '0', '0', '0', '1', '0', '1', '1', '0', '0', '0', '1', '1', '1', '0', '0', '1', '0', '1', '0', '0', '0', '0', '0', '0', '1', '0', '1', '0', '0', '1', '1', '0', '0', '0', '1', '0', '1', '0', '0', '1', '1', '0', '0', '1', '1', '1', '0', '1', '0', '0', '0', '1', '0', '0', '1', '1', '1', '0', '0', '0', '0', '1', '1', '1', '0', '1', '0', '1', '0', '0', '0', '0', '0', '1', '1', '0', '0', '0', '1', '1', '1', '0', '0', '0', '1', '0', '1', 

100%|███████████████████████████████████████████| 11/11 [03:07<00:00, 17.00s/it]

input_009.txt
The result is: ['1', '1', '1', '1', '1', '1', '0', '0', '1', '0', '1', '0', '1', '1', '0', '1', '1', '0', '1', '1', '0', '0', '1', '0', '1', '0', '1', '1', '1', '1', '0', '0', '1', '1', '0', '1', '0', '1', '1', '1', '1', '0', '1', '1', '1', '1', '1', '0', '1', '0', '0', '1', '1', '1', '1', '1', '1', '1', '1', '1', '0', '0', '1', '1', '1', '1', '1', '1', '1', '0', '1', '1', '0', '0', '0', '1', '0', '0', '1', '0', '1', '1', '1', '0', '1', '0', '1', '1', '0', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '0', '0', '0', '1', '0', '0', '1', '0', '1', '1', '0', '0', '0', '1', '1', '1', '0', '1', '1', '1', '0', '1', '1', '1', '1', '0', '1', '1', '1', '1', '1', '0', '0', '0', '1', '0', '0', '1', '1', '0', '1', '1', '0', '1', '1', '1', '1', '0', '1', '1', '1', '1', '1', '1', '0', '0', '0', '1', '1', '1', '1', '1', '1', '0', '1', '0', '1', '1', '1', '1', '1', '0', '1', '0', '0', '1', '0', '1', '1', '1', '1', '1', '1', '0', '1', '1', '1', '0', '1', '1', '0', '0', '0', 


