# Goal Based Agent

The environment I have chosen here is the sliding tile puzzle. It is a small puzzle with one gap (represented here as 0), and all the other tiles are numbered from 1 up to the largest number that fits (on a 3x3 board the largest number is 8). The goal is to slide the tiles until they are all in order, starting from 1, with the gap in the last position.
Example:

<pre>
Initial state:      Final State:
2 4 3               1 2 3 
1 8 5               4 5 6 
7 0 6               7 8 0
</pre>

The agent can move any tile next to the gap into it; in the example above that means moving 8 down, 7 to the right, or 6 to the left. This is a goal-based agent, so it knows the final goal state. For the utility function I have used the number of misplaced tiles plus the total shortest-path distance between each tile's current location and its goal location. The shortest path is measured with the Manhattan distance ( |x1-x2|+|y1-y2| ). In the code, the gap is scored like any other tile. At each step the agent picks the move that gives the lowest total score (number of misplaced tiles + total distance) and keeps going until the score is zero, which means the goal state has been reached.
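
As a quick check of the scoring rule, here is a minimal standalone sketch. The helper names `goal_position` and `board_score` are illustrative and are not the names used in the notebook, and the gap is counted like any other tile, which is assumed to match the class defined later.

```python
def goal_position(tile, size):
    """Return the (row, col) where a tile belongs; the gap (0) goes last."""
    if tile == 0:
        return (size - 1, size - 1)
    return divmod(tile - 1, size)


def board_score(board):
    """Number of misplaced cells plus the summed Manhattan distance."""
    size = len(board)
    misplaced = 0
    distance = 0
    for r, row in enumerate(board):
        for c, tile in enumerate(row):
            gr, gc = goal_position(tile, size)
            if (gr, gc) != (r, c):
                misplaced += 1
            distance += abs(r - gr) + abs(c - gc)
    return misplaced + distance


start = [[2, 4, 3],
         [1, 8, 5],
         [7, 0, 6]]
goal = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 0]]

print(board_score(start))  # 7 misplaced cells + distance 8 = 15
print(board_score(goal))   # 0
```

A score of zero is only possible when every cell, including the gap, is in its goal position.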

In [1]:
import random
import math

## Puzzle board generator

This function generates a random puzzle board that is guaranteed to be solvable. To build the board, I start from the solved puzzle and scramble it by swapping the gap with a randomly chosen neighbor a random number of times, so that the final board is well scrambled. Because only legal moves are used, the result can always be solved. Here `n` is the total number of cells, so `n=9` gives a 3x3 board.
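
The solvability claim can also be checked independently with the standard inversion-parity test for sliding puzzles. The sketch below is not part of the notebook's approach; `is_solvable` is a hypothetical helper, and it assumes the goal state has the gap in the bottom-right corner.

```python
def is_solvable(board):
    """Inversion-parity test, assuming the goal has the gap in the last cell."""
    size = len(board)
    # Flatten the board in row-major order, leaving out the gap.
    tiles = [t for row in board for t in row if t != 0]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    if size % 2 == 1:
        # Odd width: the number of inversions must be even.
        return inversions % 2 == 0
    # Even width: the gap's row, counted from the bottom starting at 1,
    # also affects the parity.
    gap_row = next(r for r, row in enumerate(board) if 0 in row)
    row_from_bottom = size - gap_row
    return (inversions + row_from_bottom) % 2 == 1


print(is_solvable([[2, 4, 3], [1, 8, 5], [7, 0, 6]]))  # True
print(is_solvable([[1, 2, 3], [4, 5, 6], [8, 7, 0]]))  # False: 7 and 8 swapped
```

A board produced by scrambling with legal moves should always pass this test, so it can serve as a sanity check on hand-written boards.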

In [2]:
def puzzle_generator(n):
    row_num = int(math.sqrt(n))
    num = 1
    puzzle = []
    for i in range(row_num):
        row = []
        for j in range(row_num):
            row.append(num)
            num += 1
        puzzle.append(row)
    
    puzzle[row_num-1][row_num-1] = 0
    rand_shuffle = random.randint(n, n*n)
    x = row_num - 1
    y = row_num - 1
    for i in range(rand_shuffle):
        possible_moves = [0, 1, 2, 3]
        if(x - 1 < 0):
            possible_moves.remove(0)
        if(x + 1 >= row_num):
            possible_moves.remove(1)
        if(y - 1 < 0):
            possible_moves.remove(2)
        if(y + 1 >= row_num):
            possible_moves.remove(3)
        rand_move = random.choice(possible_moves)
        if(rand_move == 0): #UP
            puzzle[x][y] = puzzle[x-1][y] 
            puzzle[x-1][y] = 0
            x -= 1
        elif(rand_move == 1): #Down
            puzzle[x][y] = puzzle[x+1][y] 
            puzzle[x+1][y] = 0
            x += 1
        elif(rand_move == 2): #Left
            puzzle[x][y] = puzzle[x][y-1] 
            puzzle[x][y-1] = 0
            y -= 1
        elif(rand_move == 3): #Right
            puzzle[x][y] = puzzle[x][y+1] 
            puzzle[x][y+1] = 0
            y += 1
    return (puzzle, (x, y))

In [3]:
class puzzle:
    def __init__(self, n=9, puzzle=None, start=None):
        self.n = n
        self.row_len = int(math.sqrt(self.n))
        if(puzzle):
            self.puzzle = puzzle
            self.start = start
        else:
            create_puzzle = puzzle_generator(n)
            self.puzzle = create_puzzle[0]
            self.start = create_puzzle[1]
    
    def print_puzzle(self):
        for i in range(self.row_len):
            for j in range(self.row_len):
                print(self.puzzle[i][j], end=" ")
            print()
    
    def get_correct_index(self, element):
        # The gap (0) belongs in the bottom-right corner.
        if(element == 0):
            return (self.row_len - 1, self.row_len - 1)
        # Tile k belongs at row-major position k-1.
        return divmod(element - 1, self.row_len)
    
    def manhattan_dist(self, pos, element):
        idx, idy = self.get_correct_index(element)
        d = abs(pos[0] - idx) + abs(pos[1] - idy)
        return d
    
    def get_total_distance(self):
        total_dist = 0
        num_disp = 0
        for i in range(self.row_len):
            for j in range(self.row_len):
                if(self.get_correct_index(self.puzzle[i][j]) != (i, j)):
                    num_disp += 1
                total_dist += self.manhattan_dist((i, j), self.puzzle[i][j])
        return total_dist + num_disp
    
    def make_move(self, pos, move):
        x = pos[0]
        y = pos[1]
        if(move == 0): #UP
            self.puzzle[x][y] = self.puzzle[x-1][y] 
            self.puzzle[x-1][y] = 0
            x -= 1
        elif(move == 1): #Down
            self.puzzle[x][y] = self.puzzle[x+1][y] 
            self.puzzle[x+1][y] = 0
            x += 1
        elif(move == 2): #Left
            self.puzzle[x][y] = self.puzzle[x][y-1] 
            self.puzzle[x][y-1] = 0
            y -= 1
        elif(move == 3): #Right
            self.puzzle[x][y] = self.puzzle[x][y+1] 
            self.puzzle[x][y+1] = 0
            y += 1
        return (x, y)

In [4]:
p = puzzle()

In [5]:
p.print_puzzle()

1 5 2 
4 0 3 
7 8 6 


In [6]:
p.start

(1, 1)

In [7]:
p.manhattan_dist(p.start, 0)

2

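## Agent

The solver keeps track of the gap position, its previous move, and a short memory of roughly its last 10 moves. Moves are encoded as 0 = up, 1 = down, 2 = left, 3 = right, describing the direction in which the gap travels. On each call to `move`, the agent stops if the score is already zero. Otherwise it tries every legal move except the one that would undo its previous move, and plays the one with the lowest score. If the memory contains the same run of four moves twice, the agent treats this as a loop and instead plays a random legal move, excluding one move taken from the repeated run.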

In [8]:
class puzzle_solver:
    def __init__(self, start):
        self.x = start[0]
        self.y = start[1]
        self.prev_move = -1
        self.memory = []
    
    def move(self, env):
        if(env.get_total_distance() == 0):
            print("Solved the puzzle...")
            return False
        if(len(self.memory) > 10):
            self.memory.pop(0)
        
        loop_check = 0
        for i in range(len(self.memory)):
            for j in range(i+1, len(self.memory)):
                if(self.memory[i] == self.memory[j]):
                    loop_count = 0
                    for k in range(4):
                        if((i+k < len(self.memory)) and (j+k < len(self.memory)) and (self.memory[i+k] == self.memory[j+k])):
                            loop_count += 1
                    if(loop_count == 4):
                        loop_check = 1
                        next_move = self.memory[i+1]
                        break
        
        # The solved check at the top of this method already covers the goal state.
        if(loop_check):
            possible_moves = [0, 1, 2, 3]
            if(self.x - 1 < 0):
                possible_moves.remove(0)
            if(self.x + 1 >= env.row_len):
                possible_moves.remove(1)
            if(self.y - 1 < 0):
                possible_moves.remove(2)
            if(self.y + 1 >= env.row_len):
                possible_moves.remove(3)
            if(next_move in possible_moves):
                possible_moves.remove(next_move)
            
            rand_move = random.choice(possible_moves)
            nx, ny = env.make_move((self.x, self.y), rand_move)
            self.x = nx
            self.y = ny
            self.prev_move = rand_move
            self.memory.append(rand_move)
            print("I moved: ", rand_move)
            return True
        else:
            possible_moves = [0, 1, 2, 3]
            if(self.x - 1 < 0):
                possible_moves.remove(0)
            if(self.x + 1 >= env.row_len):
                possible_moves.remove(1)
            if(self.y - 1 < 0):
                possible_moves.remove(2)
            if(self.y + 1 >= env.row_len):
                possible_moves.remove(3)
                
            if(self.prev_move != -1):
                if(self.prev_move == 0):
                    possible_moves.remove(1)
                elif(self.prev_move == 1):
                    possible_moves.remove(0)
                elif(self.prev_move == 2):
                    possible_moves.remove(3)
                elif(self.prev_move == 3):
                    possible_moves.remove(2)

            best_move = -1
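            # Start from infinity so that some legal move is always selected,
            # even when every candidate score is large.
            best_score = float("inf")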
            for move in possible_moves:
                nx, ny = env.make_move((self.x, self.y), move)
                score = env.get_total_distance()
                if(score < best_score):
                    best_score = score
                    best_move = move
                if(move == 0):
                    oppo_move = 1
                elif(move == 1):
                    oppo_move = 0
                elif(move == 2):
                    oppo_move = 3
                elif(move == 3):
                    oppo_move = 2
                env.make_move((nx, ny), oppo_move)

            nx, ny = env.make_move((self.x, self.y), best_move)
            self.x = nx
            self.y = ny
            self.prev_move = best_move
            self.memory.append(best_move)
            print("I moved: ", best_move, "Current score: ", best_score)
            return True
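
The loop check in `move` can be read as "does any run of four moves in memory occur again later?". Below is a simplified standalone sketch of that idea. `find_repeated_run` is a hypothetical helper, not part of the class above, and it returns the first match, whereas the class keeps scanning after a match.

```python
def find_repeated_run(memory, run_length=4):
    """Return the start index of the first run that repeats later, else None."""
    last_start = len(memory) - run_length
    for i in range(last_start + 1):
        for j in range(i + 1, last_start + 1):
            # Compare the two windows of moves slice by slice.
            if memory[i:i + run_length] == memory[j:j + run_length]:
                return i
    return None


print(find_repeated_run([0, 3, 1, 2, 0, 3, 1, 2]))  # 0
print(find_repeated_run([0, 3, 1, 2]))              # None
```

Comparing slices avoids the explicit bounds checks on `i+k` and `j+k`, because the loop ranges already guarantee both windows fit inside the list.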

## Running the agent

In [9]:
cust_puzzle = [[0,2],
               [1,3],]
cust_start = (0, 0)

In [10]:
env = puzzle(n=4, puzzle=cust_puzzle, start=cust_start)
env.print_puzzle()

0 2 
1 3 


In [11]:
agent = puzzle_solver(env.start)

In [12]:
status = True
while(status):
    status = agent.move(env)
    env.print_puzzle()

I moved:  1 Current score:  4
1 2 
0 3 
I moved:  3 Current score:  0
1 2 
3 0 
Solved the puzzle...
1 2 
3 0 


In [13]:
cust_puzzle2 = [[2, 4, 3],
               [1, 8, 5],
               [7, 0, 6]]
cust_start2 = (2, 1)

In [14]:
env = puzzle(n=9, puzzle=cust_puzzle2, start=cust_start2)
env.print_puzzle()

2 4 3 
1 8 5 
7 0 6 


In [15]:
agent = puzzle_solver(env.start)
status = True
while(status):
    status = agent.move(env)
    env.print_puzzle()

I moved:  0 Current score:  14
2 4 3 
1 0 5 
7 8 6 
I moved:  3 Current score:  11
2 4 3 
1 5 0 
7 8 6 
I moved:  1 Current score:  7
2 4 3 
1 5 6 
7 8 0 
I moved:  2 Current score:  11
2 4 3 
1 5 6 
7 0 8 
I moved:  0 Current score:  14
2 4 3 
1 0 6 
7 5 8 
I moved:  0 Current score:  14
2 0 3 
1 4 6 
7 5 8 
I moved:  2 Current score:  13
0 2 3 
1 4 6 
7 5 8 
I moved:  1 Current score:  10
1 2 3 
0 4 6 
7 5 8 
I moved:  3 Current score:  7
1 2 3 
4 0 6 
7 5 8 
I moved:  1 Current score:  4
1 2 3 
4 5 6 
7 0 8 
I moved:  3 Current score:  0
1 2 3 
4 5 6 
7 8 0 
Solved the puzzle...
1 2 3 
4 5 6 
7 8 0 


In [16]:
cust_puzzle3 = [[0, 5, 2],
               [1, 8, 3],
               [4, 7, 6]]
cust_start3 = (0, 0)
env = puzzle(n=9, puzzle=cust_puzzle3, start=cust_start3)
env.print_puzzle()

0 5 2 
1 8 3 
4 7 6 


In [17]:
agent = puzzle_solver(env.start)
status = True
while(status):
    status = agent.move(env)
    env.print_puzzle()

I moved:  1 Current score:  18
1 5 2 
0 8 3 
4 7 6 
I moved:  1 Current score:  15
1 5 2 
4 8 3 
0 7 6 
I moved:  3 Current score:  12
1 5 2 
4 8 3 
7 0 6 
I moved:  0 Current score:  11
1 5 2 
4 0 3 
7 8 6 
I moved:  0 Current score:  10
1 0 2 
4 5 3 
7 8 6 
I moved:  3 Current score:  7
1 2 0 
4 5 3 
7 8 6 
I moved:  1 Current score:  4
1 2 3 
4 5 0 
7 8 6 
I moved:  1 Current score:  0
1 2 3 
4 5 6 
7 8 0 
Solved the puzzle...
1 2 3 
4 5 6 
7 8 0 


In [18]:
env = puzzle(n=9)
env.print_puzzle()

2 4 0 
1 5 3 
7 8 6 


In [19]:
agent = puzzle_solver(env.start)
status = True
while(status):
    status = agent.move(env)
    env.print_puzzle()

I moved:  1 Current score:  11
2 4 3 
1 5 0 
7 8 6 
I moved:  1 Current score:  7
2 4 3 
1 5 6 
7 8 0 
I moved:  2 Current score:  11
2 4 3 
1 5 6 
7 0 8 
I moved:  0 Current score:  14
2 4 3 
1 0 6 
7 5 8 
I moved:  0 Current score:  14
2 0 3 
1 4 6 
7 5 8 
I moved:  2 Current score:  13
0 2 3 
1 4 6 
7 5 8 
I moved:  1 Current score:  10
1 2 3 
0 4 6 
7 5 8 
I moved:  3 Current score:  7
1 2 3 
4 0 6 
7 5 8 
I moved:  1 Current score:  4
1 2 3 
4 5 6 
7 0 8 
I moved:  3 Current score:  0
1 2 3 
4 5 6 
7 8 0 
Solved the puzzle...
1 2 3 
4 5 6 
7 8 0 
