# Local search

During the previous classes we were concerned with finding a sequence of actions (a plan) to reach one of the goal states from a predefined initial state. In a sense, we were more interested in the *journey* rather than in the *destination* itself: our goal was to transform the initial state to the goal state, not to construct the goal state from scratch. Local search is also different from tree/graph search algorithms like A*: it assumes that all intermediate states are acceptable, but they differ in their quality and the goal is to find the best solution.

We will reuse the definition of the class `Problem` from earlier, but extend it with a new function: `cost`, returning a number representing the cost of the state given as its argument `state`. We want to minimize the cost, so the lower the number the better the state. We make no assumptions about the cost being non-negative, i.e., there may be states with a negative cost (i.e., a gain) - such an approach enables us to consider both maximization and minimization problems within a single framework. We also replace the property `intial_state` with the funtion `random_state` which is supposed to return a randomly generated state for the problem. Finally, we remove `is_goal`, as there may be no clear goal definition.

In [1]:
class Problem:
    def random_state(self):
        ...
        
    def available_actions(self, state):
        ...        
        
    def do_action(self, state, action):
        ...
        return new_state
        
    def cost(self, state) -> float:
        ...

Throughout this assignment we will be using a pseudo-random number generator. A pseudo-random number generator is an algorithm that, given some initial state (usually called *seed*) returns a sequence of numbers. The algorithm itself is deterministic, meaning that the same initial state = the same sequence of numbers. To exploit this property, in the following cell we fix the seed to be 42 - this way every time you restart the notebook and execute it you will get the same results.

In [2]:
import random
random.seed(42)

As an example, we will consider the problem of finding a minimum of some complex function of 2 variables within the domain of integer from the set $\{-10, -9, \ldots, 9, 10\}$. Observe that, in general, moving from the real domain to the integer domain makes the usual optimization algorithms unsuitable. 

In [3]:
class FunctionOptimizationProblem:
    def random_state(self):
        x = random.randrange(-10, 11)
        y = random.randrange(-10, 11)
        return (x, y)
    
    def available_actions(self, state):
        x, y = state
        actions = []
        if x > -10:
            actions += [(-1, 0)]
        if y > -10:
            actions += [(0, -1)]
        if x < 10:
            actions += [(1, 0)]
        if y < 10:
            actions += [(0, 1)]
        return actions
    
    def do_action(self, state, action):
        x, y = state
        dx, dy = action
        return (x+dx, y+dy)
    
    def cost(self, state) -> float:
        x, y = state
        cost = -5*x-8*y
        if x+y>6:
            cost += 10000
        if 5*x+9*y>45:
            cost += 10000
        if x < 0:
            cost += 10000
        if y < 0:
            cost += 10000
        return cost

Let's test it a bit. We start by creating the object representing the problem and generating two random states. Observe that they are different.

In [4]:
problem = FunctionOptimizationProblem()
print("Random state 1", problem.random_state())
print("Random state 2", problem.random_state())

Random state 1 (10, -7)
Random state 2 (-10, -2)


Now let's compute the cost of few different states

In [5]:
print("The cost of an acceptable state (3, 3):", problem.cost((3,3)))
print("The cost of a terrible state (3, 7):", problem.cost((3,7)))
print("The cost of an optimal solution (0, 5):", problem.cost((0,5)))

The cost of an acceptable state (3, 3): -39
The cost of a terrible state (3, 7): 19929
The cost of an optimal solution (0, 5): -40


## Task 1: Implement hill climbing

Complete the following cell of code with an implemention of the hill climbing algorithm.
The algorithm starts in the initial state of the given search problem `problem` and loops forever.
During each repetition of the loop it "looks around" and tests all the states achievable through actions available in the current state.
If none of them is better than the current state (i.e., the algorithm reached a peak or a plateau according to the `cost` function), it breaks the loop and returns the reached state.
Otherwise, it goes to the best of these neighbouring states and computes the next repetition of the loop.

In [6]:
import math

def hill_climbing(problem):
    curr_state = problem.random_state()
    prev_state = []
    curr_cost = problem.cost(curr_state)
    prev_cost = math.inf
    while curr_cost < prev_cost:
        adjacent_nodes = []
        for action in problem.available_actions(curr_state):
            new_state = problem.do_action(curr_state, action)
            new_cost = problem.cost(new_state)
            adjacent_nodes.append((new_cost, new_state))
        prev_cost, prev_state = curr_cost, curr_state
        if adjacent_nodes:
            curr_cost, curr_state = min(adjacent_nodes)
    return prev_state

Lets test your implementation. Try running the cell multiple times. Observe that on some runs it is capable of finding the optimal solution. On others, the result is terrible.

In [7]:
problem = FunctionOptimizationProblem()
solution = hill_climbing(problem)
print("Solution", solution)
print("Cost", problem.cost(solution))

Solution (-2, 6)
Cost 9962


## Task 2: Implement random-restarts hill-climbing

Complete the cell below to implement random-restarts hill-climbing. Randomization is already taken care of in the problem, so basically your task is to call `hill_climbing` number of times given by the argument `n` and return the best solution.

In [8]:
def random_restarts_hill_climbing(problem: Problem, n: int):
    # I'm aware that this code may not be well readable but I can't resist write it as one-liner
    return min([hill_climbing(problem) for _ in range(n)], key = lambda x: problem.cost(x))

In [9]:
problem = FunctionOptimizationProblem()
solution = random_restarts_hill_climbing(problem, 100)
print("Solution", solution)
print("Cost", problem.cost(solution))

Solution (0, 5)
Cost -40


## Task 3: The n-queens

Complete the following cell of code with the implementation of the n-queens problem. The gist of the problem is to place $n$ queens on a $n \times n$ cheesboard so that no queen attacks another. In other words: there is at most one queen in every row, column and diagonal of the cheesboard. The value $n$ is given as the constructor parameter and available in the class as `self.n`.

In [10]:
class NQueens(Problem):
    def __init__(self, n):
        self.n = n
        
    def random_state(self):
        # random state is a tuple of n ints which tells on which index the queen in n-th row is placed
        # state for N = 8 looks like (2, 3, 6, 0, 4, 5, 1, 7)
        state = list(range(self.n))
#         print(state)
        random.shuffle(state)
#         print(state)
        state = tuple(state)
#         print(state)
        return state
        
    def available_actions(self, state):
        actions = []
        for first in range(self.n):
            for second in range(self.n-first, first):
                actions.append((first, second))
        return actions
        
    def do_action(self, state, action):
        first, second = action
        new_state = list(state)
        new_state[first], new_state[second] = new_state[second], new_state[first]
        return tuple(new_state)
        
    def cost(self, state) -> float:
        under_attack = [False for _ in range(self.n)]
        
        for row, queen in enumerate(state):
            
            for spread in range(1, max(self.n - row, row)):
                """max() is for optimization purpose, it decreases number of iterations by 
                quite a lot (of course there is still the same time complexity)
                from n^2 to something like 0.75n^2, it seems like not much but runtimes are much better"""
                upper_row = row - spread
                lower_row = row + spread
                left_col = queen - spread
                right_col = queen + spread
                
                if upper_row >= 0:
                    under_attack[upper_row] = state[upper_row] == left_col or state[upper_row] == right_col
                if lower_row < self.n:
                    under_attack[lower_row] = state[lower_row] == left_col or state[lower_row] == right_col
                    
        return sum(under_attack)

Let's test your implementations. If everything went well, the following cell should terminate after a few seconds and yield a perfect configuration of queens on a $8 \times 8$ board.

In [11]:
# import time
# import matplotlib.pyplot as plt

# times = []
# for n in range(21):
#     print(n, end = " ")
#     start = time.time_ns()
#     problem = NQueens(n)
#     solution = random_restarts_hill_climbing(problem, 100)
#     end = time.time_ns() - start
#     times.append(end / 10**6)

problem = NQueens(8)
solution = random_restarts_hill_climbing(problem, 100)

print("Solution", solution)
print("Cost", problem.cost(solution))
# plt.plot(times)
# plt.xlabel("Number of queens")
# plt.ylabel("Time for 100 restart in ms")
# plt.show()

Solution (3, 6, 2, 4, 7, 1, 0, 5)
Cost 0
