# Workbook 4: Local Search in categorical and continuous spaces
## Introduction
This workbook focusses on the final search algorithm that we have discussed but not asked you to implement so far: Local Search.  

We will focus on perturbative approaches, which start from a complete candidate solution and repeatedly make small changes to it.

To develop your understanding you will:
- Start with a simple binary problem that local search should be able to solve.
- Look at a binary problem local search cannot solve without some changes.
- Adapt the **SingleMemberSearch** class to work with continuous decision variables,  
   using a continuous version of the first binary problem. 

## Aims of this practical
1. To give you experience of implementing and evaluating the behaviour of local search in categorical problems.
2. To give you experience of comparing the behaviour of different search algorithms.
3. To give you experience of evaluating the efficiency of an algorithm for a problem (in this case path-planning) by creating different instances of a problem (mazes) to *stress-test* different methods. 

# This is not an assessed workbook.




## Reminder: pseudocode for function SelectAndMoveFromOpenList in Local Search
### This assumes the search process keeps track of *bestSoFar*
<div style="background:#F0FFFF;font-size:18pt">
<p style="color:darkred;font-size:18pt;margin-bottom:0pt"><em>SelectAndMoveFromOpenList</em></p>
<dl style="font-size:18pt;margin-top:0pt">
    <dt>&nbsp;&nbsp;&nbsp;<b>IF</b> IsEmpty(openList) <b>THEN</b> </dt>
    <dd> RETURN None</dd>
    <dt> &nbsp;&nbsp;&nbsp;<b>ELSE</b></dt>
    <dd>bestChild &larr; <b>GetMemberWithHighestQuality</b>(openList)</dd>
    <dd> <b>EMPTY</b>(openList)&nbsp;&nbsp;&nbsp;&nbsp;<span style="background:pink">This prevents backtracking</span></dd>
    <dd>  <b>IF</b> BetterThan(bestChild, bestSoFar) <b>THEN</b> <br>
        &nbsp;&nbsp;&nbsp;&nbsp;bestSoFar &larr; bestChild <br>
        &nbsp;&nbsp;&nbsp;&nbsp;RETURN bestChild </dd>
    <dd> <b>ELSE</b> <br>&nbsp;&nbsp;&nbsp;&nbsp; RETURN None</dd>
</dl>
</div>    
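
The pseudocode above can be sketched in plain Python without the workbook classes. This is only an illustration under stated assumptions: candidates are represented as hypothetical `(quality, variable_values)` tuples, the problem is a maximisation one, and a missing `best_so_far` (the very first move) is always accepted. The function name `select_and_move` is made up for this sketch and is not part of the supplied code.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a candidate solution: (quality, variable_values)
Candidate = Tuple[int, List[int]]


def select_and_move(
    open_list: List[Candidate], best_so_far: Optional[Candidate]
) -> Tuple[Optional[Candidate], Optional[Candidate]]:
    """Return (chosen candidate or None, updated best_so_far). Assumes maximisation."""
    if not open_list:  # IF IsEmpty(openList) THEN RETURN None
        return None, best_so_far

    # GetMemberWithHighestQuality(openList)
    best_child = max(open_list, key=lambda candidate: candidate[0])

    # EMPTY(openList): discarding the other candidates prevents backtracking
    open_list.clear()

    # BetterThan(bestChild, bestSoFar); assumption: accept anything on the first move
    if best_so_far is None or best_child[0] > best_so_far[0]:
        return best_child, best_child
    return None, best_so_far


neighbours = [(3, [1, 1, 1, 0]), (2, [1, 1, 0, 0]), (1, [1, 0, 0, 0])]
chosen, best = select_and_move(neighbours, best_so_far=(2, [0, 1, 1, 0]))
print(chosen, best, neighbours)
```

Returning the updated `best_so_far` alongside the chosen candidate is a choice made to keep the sketch free of shared state; in the workbook class the equivalent value lives on the search object.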

<div class="alert alert-block alert-warning" style="color:black">
    <h2> Activity 1: implementing local search for a binary problem</h2>
    <ol>
        <li>Run the first cell to do some standard imports.</li>
    <li>Then complete the second cell which contains an incomplete implementation of local search.</li>
        <ul>
            <li> We have provided an <code>__init__()</code> method which overrides the default behaviour and creates a random starting point</li>
            <li> <b> You need to complete</b> the method <code>select_and_move_from_openlist()</code>.</li>
            <li> We have broken this down into <b>4</b> clearly marked small steps</li>
            <li> The first step you need to complete uses similar code to BestFirstSearch()</li>
            </ul>
    <li> Test your implementation by running the third cell which uses your implementation to solve the <em>oneMax</em> problem. <br>
        This is a simple binary maximisation problem where the quality is the number of the decision variables set to 1.</li>
    </ol>
 </div>
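
To make the <em>oneMax</em> quality function concrete, here is a small standalone sketch. It assumes a neighbourhood made of all solutions that differ from the current one by a single bit flip; that choice, and the helper names `onemax_quality` and `one_flip_neighbours`, are illustrative assumptions and not taken from the supplied `OneMaxBinary` class.

```python
from typing import List


def onemax_quality(bits: List[int]) -> int:
    """Quality for oneMax: the number of decision variables set to 1."""
    return sum(bits)


def one_flip_neighbours(bits: List[int]) -> List[List[int]]:
    """All solutions that differ from `bits` in exactly one position."""
    neighbours = []
    for pos in range(len(bits)):
        child = list(bits)           # copy so the parent is not modified
        child[pos] = 1 - child[pos]  # flip 0 <-> 1
        neighbours.append(child)
    return neighbours


current = [1, 0, 0, 1]
print("current", current, "quality", onemax_quality(current))
for child in one_flip_neighbours(current):
    print(child, onemax_quality(child))
```

Printing the quality of each neighbour next to the quality of the current solution is a quick way to see what a single step of local search has to choose between.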

In [None]:
import numpy as np

In [None]:
# YOU MUST RUN THIS CELL BUT DO NOT EDIT IT OR YOU WILL BREAK THE NOTEBOOK
import copy
import importlib
import sys, os
sys.path.append(os.path.join(os.path.dirname(sys.path[0]), 'common'))

In [None]:
from candidatesolution import CandidateSolution
from singlemembersearch import SingleMemberSearch
from problem import Problem
from onemaxproblem import OneMaxBinary, OneMaxContinuous

In [None]:
class LocalSearch(SingleMemberSearch):
    """Implementation of local search."""

    def __str__(self) -> str:
        """ return name"""
        return "local search"
    
    def __init__( self,
        problem: Problem,
        constructive: bool = False,
        max_attempts: int = 50,
        minimise=True,
        target_quality=1):
        """ call super class 
        then change to random starting point
        """
        super().__init__(problem, constructive=constructive,
                       max_attempts=max_attempts,
                       minimise=minimise,
                       target_quality=target_quality)
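        # over-ride default: use the problem held by this search, not global variables
        start_point =  self.open_list[0]
        num_vars = len(start_point.variable_values)
        arrays_of_rands = np.random.choice(self.problem.value_set, size=num_vars)
        start_point.variable_values= list(arrays_of_rands)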
        #measure quality 
        start_point.quality = self.problem.evaluate(start_point.variable_values)
        if start_point.quality == self.target_quality:
            self.trials = 1
            self.result = start_point.variable_values
            self.solved = True
        
        

    def select_and_move_from_openlist(self) -> CandidateSolution:
        """Pops best thing from list, 
        clears rest of list, 
        then returns best thing
        relies on the presence of self.best_so_far

        Returns
        -------
        next
           working candidate (solution) taken from open list
           **if it is an improvement**
        None
           IF list is empty OR next thing is worse than best so far
        """
        next_soln = CandidateSolution()

        # edge cases
        if len(self.open_list) == 0:
            self.runlog += "LS:empty open list\n"
            return None

        # get best child: start looking for it in position 0
        best_index = 0
        quality = self.open_list[0].quality
        best_so_far: int = quality
        # ====>> insert your code below  to copy the best solution from the open list into next_soln
        
        # ====>> insert your code above  to copy the best solution from the open list into next_soln

        self.runlog += (
            f"\t best child quality {best_so_far},"
            f"\n\t best so far {self.best_so_far}\n"
        )
        # clear the openlist
        # =====>> insert your code below here to clear the openlist
        
        # <<===== insert your code above here to clear the openlist

        # always accept first move
        improvement_found: bool 
        if self.trials == 1:
            improvement_found = True
        # otherwise there must be an improvement
        else:
            pass
            #value will depend on whether next_soln.quality improves on self.best_so_far
            # ====>> insert your code below to set the value of variable improvement_found after first trial
        
            # ====>> insert your code above to set the value of variable improvement_found after first trial

        

        # return best offspring from open list, or None if it doesn't improve on self.best_so_far
        # =====> insert your code below to manage the return
        
        # =====> insert your code above to manage the return
        
        


In [None]:
#define and create problem instance
num_vars = 20

my_binary_onemax = OneMaxBinary(N=num_vars)

#create search
mysearch = LocalSearch( my_binary_onemax,
                        constructive = False,
                        max_attempts= 500,
                        minimise=False,
                        target_quality=num_vars)

starting_quality = mysearch.open_list[0].quality
success = mysearch.run_search()
if success:
    print ( f'Run found the goal ({num_vars}) '
            f'starting from point with quality {starting_quality}'
            f' after examining {mysearch.trials} solutions.'
          )
else:
    print(f'Run  failed to solve the problem in {mysearch.max_attempts} trials\n'
          f' runlog is:\n {mysearch.runlog}'
         )
    completed_ok=False


<div class="alert alert-block alert-warning" style="color:black">
    <h2> Activity 2: Evaluating your implementation of local search</h2>
    <p>Once your code works and the cell above runs and finds a solution, it is time to evaluate its performance.</p>
    <p> Because it usually starts from a different random place every time, Local Search is a <b> stochastic</b> algorithm 
            ( the technical term for an algorithm that has a <b> random</b> element).<br>
             This means that to analyse its behaviour we should run several times and report the <b>average</b> number of solutions it tries before it finds the goal.</p><p><b> Steps to do</b></p><ol>
    <li>Run the cell above ten times (i.e. ten repetitions) with <em>num_vars= 10</em> and note the number of attempts needed to solve the problem.
        <ul>
        <li> You might like to record these in an excel spreadsheet or similar</li>
        <li> You might also choose to edit the code to automatically run 10 repetitions and calculate the mean and standard deviation of the number of trials</li>
        </ul>
    </li>
    <li>Then repeat, increasing the value of <em>num_vars</em> from 10 to 30 in steps of five </li>
    <li> Plot your results as a curve of mean values (y-axis) vs num_vars (x-axis) with error bars showing the standard deviation.<br>
        The cell below gives you a first introduction to the graphics package <b> matplotlib</b> - just comment out the last few lines if you prefer to use something like excel</li>
        </ol>
    Can you explain what it is that makes this problem so easy?
 </div>

<div>
<div style="padding:10px;width:45%;color:black;background-color:yellow;float:left">
    <h3> How to examine results when the algorithm contains randomness</h3>
    <p>Lots of AI algorithms- both for search/optimisation and machine learning - use some form of randomness.
        This means that you might get a different result each time you run them on the same problem (or dataset).
        So to understand or compare results (scientists typically call these <i>observations</i>) we need to look at:</p>
    <div>
         <div style="float:right">
            <img src="https://curvebreakerstestprep.com/wp-content/uploads/2021/04/standard-deviation.png" width="300" height="300">
        </div>
        <div style="padding:10px;width:25%;color:black;background-color:yellow;float:left">
        <ol> 
        <li>The average case behaviour.<br> Normally we use the <b>mean</b>, which is calculated as the sum of the observed values, divided by the number of observations.</li>
        <li> The amount of difference between observations. <br>
            Usually we use the <b> Standard Deviation</b>, a measure of how much, on average, results differ from the mean (ignoring the sign of the difference).</li>
        </ol>
        </div>
    </div>
    <p> To give a simple example, let's say you run a test in which 5 people score 10, and 5 people score 0.<br>
        The mean = (5*10 + 5*0)/10 = 5, and the standard deviation is 5 as well, since every score is 5 away from the mean. <br> Suppose we rerun the test and this time everyone scores 4 or 6. The mean is still 5 ((5*4 + 5*6)/10 = 5), but the standard deviation is now 1. <br>So a smaller standard deviation means the results are more similar to each other.</p>
</div>
<div style="padding:10px;width:45%;background-color:lightgreen;float:right">
<h3> HINTS on how to do this efficiently</h3>
<ol>
    <li> If you have three arrays for the problem size (x-axis),  number of attempts for each size (y-axis),  and standard deviations  for each size, then you can make a nice plot using the code snippets provided below </li>
<li> You can automate finding the values for these arrays with two loops:<br>
    The first loop is over problem sizes (10,15,20,25,30) inside which:<ul>
    <li> make an array called <code>attempts</code> full of zeros of size REPEATS (e.g. 10)</li>
    <li> then have an inner loop, <code>for run in range(REPEATS)</code>, inside which you: <ul>
        <li> make a new instance of the problem, of the appropriate size</li>
    <li> make a new search object <code>mysearch</code></li>
    <li> call the <code>mysearch.run_search()</code> method</li>
    <li> store the number of solutions it looked at in <code>attempts[run]</code></li></ul>
    <li> Now you can use numpy's built in functions <br> e.g.<code>np.mean(attempts)</code> and <code>np.std(attempts)</code> <br>
    to calculate and store the mean and standard deviation of the number of attempts for this problem size</li>
    </ul>
</li>
</ol>
</div>
</div>
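
The sketch below first checks the numbers in the worked example with NumPy, then shows the shape of the two loops described in the hints. The function `fake_search_attempts` is a hypothetical stand-in that returns made-up attempt counts so that the block runs on its own; in the workbook you would replace it with creating a problem, creating a search object, calling `run_search()` and reading the number of trials. Note that `np.std` computes the population standard deviation by default.

```python
import numpy as np

# Check the worked example (np.std uses the population formula by default)
scores_a = np.array([10] * 5 + [0] * 5)
scores_b = np.array([4] * 5 + [6] * 5)
print(np.mean(scores_a), np.std(scores_a))  # 5.0 5.0
print(np.mean(scores_b), np.std(scores_b))  # 5.0 1.0


def fake_search_attempts(size: int, rng: np.random.Generator) -> int:
    """Hypothetical stand-in for one run of a search.

    Returns a made-up number of attempts; replace with a real search run.
    """
    return int(rng.integers(low=size, high=10 * size))


REPEATS = 10
sizes = [10, 15, 20, 25, 30]
means = np.zeros(len(sizes))
std_deviations = np.zeros(len(sizes))
rng = np.random.default_rng(seed=42)

for idx, size in enumerate(sizes):      # outer loop: problem sizes
    attempts = np.zeros(REPEATS)
    for run in range(REPEATS):          # inner loop: repetitions
        attempts[run] = fake_search_attempts(size, rng)
    means[idx] = np.mean(attempts)      # average-case behaviour
    std_deviations[idx] = np.std(attempts)  # spread between runs

print(means)
print(std_deviations)
```

The resulting `sizes`, `means` and `std_deviations` arrays have the shapes that `plt.errorbar(sizes, means, yerr=std_deviations)` expects.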

In [None]:
# max attempts 10000
MAX_ATTEMPTS =10000

#number of repetitions
REPEATS = 10

sizes= [10,15,20,25,30]
means = np.zeros(len(sizes))
std_deviations=np.zeros(len(sizes))


# ===>your code below here

# copy-paste the code from the cell above and wrap it in a loop 
# that stores the number of solutions tested in each run 

# after that loop   report mean, and standard deviation of these

# <====== your code above here

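# for making the plots
from matplotlib import pyplot as plt

# plot mean attempts against problem size, with error bars for the standard deviation
plt.errorbar(sizes, means, yerr=std_deviations)
plt.xlabel("num_vars")
plt.ylabel("mean number of solutions tested")
plt.show()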


<div class="alert alert-block alert-warning" style="color:black">
    <h2> Activity 3: Adapting local search for a continuous problem</h2>
    <h3> This is a stretch activity for the more confident coders.</h3>
    <p>For continuous problems you will need to adapt your local search class.</p>
    <p>This requires adapting more of the methods from the single member search class</p>
    <p>I've suggested code that changes the <em>__init__</em> method 
            to initialise with appropriate continuous values,
        and stores the number of samples to take from the neighbourhood each iteration, and whether to use gradient-based search or not.</p>
    <p> <b>So the first thing you need to do</b> is over-ride the <em>select_and_move_from_openlist(self)</em> method
        from your LocalSearch class so that it now accepts solutions that are as good 
        as <em>self.best_so_far</em> and not just improvements.</p>
    <p> <b>The second thing you need to do </b> is to over-ride and change the <em>run_search()</em> method so that it:</p>
        <ol> 
            <li>generates a number of neighbours defined by self.sample_size</li>
            <li> for each neighbour, creates a set of changes (one for each decision variable), adds them to the current values, then truncates the results to the valid range of values (using the <em>truncate_to_range()</em> function provided)</li>
            <li> If  <em> self.use_gradient</em> is <em>False</em> it generates the list of changes at random <br>
                If it is <em>True</em> it calls <em>self.problem.get_gradient()</em> then multiplies the result by <em>self.learning_rate</em> to get the changes</li>
            <li> after looking at all of the neighbours, if they were all worse than what we had already, the open_list will be empty, so you need to put the <em>working_candidate</em> back on the open list instead of the closed list.</li> 
        </ol>

 <h3> This version of the problem has a quality function that measures the difference from the target, so it needs to be minimised</h3>   
    <p>It also provides a <em>get_gradient()</em> method so you can try both approaches described in the lecture</p>   

 </div>
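
The two ways of generating changes described above can be sketched with NumPy on their own, outside the search classes. Everything problem-specific here is an assumption made for illustration: the valid range `[0, 1]`, the `quality` function (total distance from a target of 1.0), its `gradient`, zero-mean Gaussian noise with standard deviation `sigma`, and the sign convention of stepping against the gradient. The real `OneMaxContinuous` class may define these differently, so check what `get_gradient()` returns before copying the sign.

```python
import numpy as np

LOW, HIGH = 0.0, 1.0  # hypothetical valid range for every decision variable


def quality(values: np.ndarray) -> float:
    """Hypothetical quality: total distance from the target 1.0 (to be minimised)."""
    return float(np.sum(np.abs(1.0 - values)))


def gradient(values: np.ndarray) -> np.ndarray:
    """Gradient of the hypothetical quality with respect to each variable."""
    return -np.sign(1.0 - values)


def random_neighbour(values: np.ndarray, rng: np.random.Generator,
                     sigma: float = 0.1) -> np.ndarray:
    """Option 1: add Gaussian noise to every variable, then truncate to the range."""
    changes = rng.normal(loc=0.0, scale=sigma, size=values.shape)
    return np.clip(values + changes, LOW, HIGH)


def gradient_neighbour(values: np.ndarray, learning_rate: float = 0.5) -> np.ndarray:
    """Option 2: step against the gradient (we are minimising), then truncate."""
    changes = -learning_rate * gradient(values)
    return np.clip(values + changes, LOW, HIGH)


rng = np.random.default_rng(seed=1)
current = rng.uniform(LOW, HIGH, size=5)
start_quality = quality(current)

for _ in range(200):
    # sample a fixed number of neighbours and keep the best of them
    neighbours = [random_neighbour(current, rng) for _ in range(10)]
    best = min(neighbours, key=quality)
    if quality(best) <= quality(current):  # accept equal quality to keep moving
        current = best

print(start_quality, quality(current))
print(quality(gradient_neighbour(np.array([0.2, 0.9]))))
```

`np.clip` plays the same role here as the `truncate_to_range()` helper, applied to the whole array at once.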

In [None]:
   
class LocalSearchContinuous(SingleMemberSearch):
    """Implementation of local search for continuous problems.
      Assumes the search mode is perturbative.
      Extends single member search by doing explicit sampling of neighbourhood
      and by not stopping if no improvement is found in an iteration
      Parameters
      ---------
      sample_size(int): 
          number of neighbours to generate each iteration
          default 10
      use_gradient(bool): 
          whether to use the gradient instead of random changes
          if the problem supports it.
          If set, assume sample_size is 1
          default False
      learning_rate(float)
          multiplier for gradient if used
          default 0.5

    """

    def __str__(self) -> str:
        return "local search continuous"
    
    def __init__(
        self,
        problem: Problem,
        constructive: bool = False,
        max_attempts: int = 50,
        minimise:bool=True,
        target_quality:float=1,
        sample_size:int = 10,
        use_gradient:bool=False,
        learning_rate=0.5
    ):   
        super().__init__(problem, constructive=constructive,
                       max_attempts=max_attempts,
                       minimise=minimise,
                       target_quality=target_quality)
        print(f'self.target_quality is {self.target_quality}') 
        #reinitialise to random continuous values in right range
        self.num_vars  = len(self.open_list[0].variable_values)        
        for decision in range(self.num_vars):
            self.open_list[0].variable_values[decision]= self.rand_in_range()
        #re-evaluate
        quality = self.problem.evaluate(self.open_list[0].variable_values)
        self.open_list[0].quality=quality    

        #store the number of neighbours to examine each iteration 
        self.sample_size = sample_size

        #does the problem support calculation of gradients
        self.use_gradient= use_gradient
        self.learning_rate = learning_rate
        if self.use_gradient:
            try:
                # probe once: fall back to random changes if get_gradient() is not usable
                _=self.problem.get_gradient()
                self.sample_size = 1
            except Exception:
                self.use_gradient=False

    def rand_in_range(self)->float:
        """ generates a random number in the range
        specified by the problem
        """
        lowest_val = self.problem.value_set[0]
        val_range = self.problem.value_set[1] - self.problem.value_set[0]
        return np.random.random()*val_range +lowest_val
    
    def get_rand_normals_in_range(self)->list:
        """ 
        generates a list of random numbers from a normal distribution,
        one for each decision variable
        mean = midpoint of valid range for problem
        sdev = 10% of valid range for problem
        """
        changes=[]
        valrange = self.problem.value_set[1]-self.problem.value_set[0]
        valmean =  (self.problem.value_set[1]+ self.problem.value_set[0])/2
        for pos in range(self.num_vars):
            randval= np.random.normal() *0.1*valrange + valmean
            changes.append(randval)
        return changes
        
    
    
    def truncate_to_range(self, val:float)->float:
        """ truncates a value to the valid range
        defined by a problem"""
        if val>self.problem.value_set[1]:
            val = self.problem.value_set[1]
        if val < self.problem.value_set[0]:
            val = self.problem.value_set[0]
        return val
    
    
    def select_and_move_from_openlist(self) -> CandidateSolution:
        """Pops best thing from list, clears rest of list, then returns best thing
        relies on the presence of self.best_so_far

        Returns
        -------
        next
           working candidate (solution) taken from open list
           if it is at least as good as the best so far
        None
           IF list is empty OR next thing is worse than best so far
        """
        next_soln = CandidateSolution()

        # edge cases
        if len(self.open_list) == 0:
            self.runlog += "LS:empty open list\n"
            return None

        # get best child
        best_index = 0
        quality = self.open_list[0].quality
        best_so_far: int = quality
        ## your code to put the right value from the open list into next_soln
        
        self.runlog += (
            f"\t best child quality {best_so_far},\n\t best so far {self.best_so_far}\n"
        )
        # clear the openlist
        ## Your code here
        # always accept first move
        if self.trials == 1:
            better: bool = True
        # otherwise must be an improvement or at least as good (to keep search going)
        #i.e best_so_far must be at least as good as self.best_so_far
        
        # your code to return best from open list 
        #or None if it doesn't improve on self.best_so_far

In [None]:
#search using option 1 from the lectures- adding gaussian noise
num_vars = 10
continuous_onemax = OneMaxContinuous(N=num_vars)
mysearch2 = LocalSearchContinuous( continuous_onemax,
                        constructive = False,
                        max_attempts= 500,
                        minimise=True,
                        target_quality=0.0)
    

success = mysearch2.run_search()
if success:
    print ( 'Local Search solved the problem '
           f' after {mysearch2.trials} attempts.\n'
           f'solution {mysearch2.result}\n'
           f' quality {mysearch2.problem.evaluate(mysearch2.result)[0]}'
          )
else:
    print(f'failed to solve the problem in {mysearch2.max_attempts} trials\n'
          f' runlog is:\n {mysearch2.runlog}'
         )
    

In [None]:
# Now search using the gradient information

mysearch3 = LocalSearchContinuous( continuous_onemax,
                        constructive = False,
                        max_attempts= 500,
                        minimise=True,
                        target_quality=0.0,
                        use_gradient=True,
                        learning_rate=0.5)    

success = mysearch3.run_search()
if success:
    print ( 'Local Search solved the problem '
           f' after {mysearch3.trials} attempts.\n'
           f'solution {mysearch3.result}\n'
           f' quality {mysearch3.problem.evaluate(mysearch3.result)[0]}'
          )
else:
    print(f'failed to solve the problem in {mysearch3.max_attempts} trials\n'
          f' runlog is:\n {mysearch3.runlog}'
         )
