# Solving Sudoku Puzzles
## Assignment Preamble
Please ensure you carefully read all of the details and instructions on the assignment page, this section, and the rest of the notebook. If anything is unclear at any time please post on the forum or ask a tutor well in advance of the assignment deadline.

In addition to all of the instructions in the body of the assignment below, you must also follow the following technical instructions for all assignments in this unit. *Failure to do so may result in a grade of zero.*
* [At the bottom of the page](#Submission-Test) is some code which checks you meet the submission requirements. You **must** ensure that this runs correctly before submission.
* Do not modify or delete any of the cells that are marked as test cells, even if they appear to be empty.
* Do not duplicate any cells in the notebook – this can break the marking script. Instead, insert a new cell (e.g. from the menu) and copy across any contents as necessary.

Remember to save and backup your work regularly, and double-check you are submitting the correct version.

This notebook is the primary reference for your submission. You may write code in separate `.py` files but it must be clearly imported into the notebook so that it runs without needing to reference those files, and you must explain clearly what functionality is contained in those files (through comments, markdown cells, etc).

As always, **the work you submit for this assignment must be entirely your own.** Do not copy or work with other students. Do not copy answers that you find online. These assignments are designed to help improve your understanding first and foremost – the process of doing the assignment is part of *learning*. They are also used to assess your ability, and so you must uphold academic integrity. Submitting plagiarised work risks your entire place on your degree.

**The pass mark for this assignment is 40%.** We expect that students, on average, will be able to produce a submission which gets a mark between 50-70% within the normal workload allocation for the unit, but this will vary depending on individual backgrounds. Please ask for help if you are struggling.

## Getting Started
For this assignment, you will be writing an agent that can solve sudoku puzzles. You should be familiar with sudoku puzzles from the unit material. You are given a 9x9 grid with some fixed values. To solve the puzzle, the objective is to fill the empty cells of the grid such that the numbers 1 to 9 appear exactly once in each row, column, and 3x3 block of the grid. 

Below is a sample puzzle along with its solution. 

<img src="images/sudoku.png" style="width: 50%;"/>

For this assignment you will need to submit:
1. The implementation for an agent which can solve sudoku puzzles – this notebook
 * You can use any algorithm you like, from the unit material or otherwise
 * Your code will be subject to automated testing, from which grades will be assigned based on whether it can solve sudokus of varying difficulty
 * To get a high grade on this assignment, the speed of your code will also be a factor – the quicker the better
 * There are some sample tests included below, make sure your code is compatible with the format of these tests
2. A text file that explains your approach and the decisions you made in your own words – a readme file
 * Submissions that do not include the written section will receive zero marks – **this part is mandatory**
 * You may write your file in plain text (.txt) or [Markdown](https://www.markdownguide.org/basic-syntax/) (.md)
 * To get top marks on this assignment, as well as getting a high grade from your implementation, you must also demonstrate excellent academic presentation in your written section

### Choice of Algorithm
The choice of algorithm to solve sudoku puzzles is up to you. We expect you will use search techniques from the unit, but you could make something up yourself, or do some independent research to find something else. You will need to evaluate and balance the trade-off between how well suited you think the algorithm is and how difficult it is to write, but there is some advice below.

I suggest you implement *constraint satisfaction* as it is described in the unit material. You can use the code you have previously been given as a guide. A good implementation of a backtracking depth-first search with constraint propagation should be sufficient to get a good grade in the automated tests (roughly 60-70%).

You could also write a successful agent that uses the other search techniques you have seen in the unit so far: basic search, heuristic search, or local search. You may find these easier to implement, though they may perform less well. 

To get a high grade on this assignment will require a particularly efficient implementation of constraint satisfaction, or something which goes beyond the material we have presented. *This is left unguided and is not factored into the unit workload estimates.*

If you choose to implement more than one algorithm, please feel free to include your code and write about it in part two (readme file), but only the code in this notebook will be used in the automated testing.

## Sample Sudoku Puzzles
To get started, the cell below will load in some sample sudoku puzzles for you so you can see the format. There are sudokus provided of multiple difficulties (easier sudokus typically start with more digits provided). The cell below only loads the easiest, but there is another test cell lower in the notebook which will run your code against all of the provided puzzles.

Each sudoku is a 9x9 NumPy array of integers, where zero represents an empty square. Each difficulty comes with 15 sudokus, so when you load the file, it is stored in a 15x9x9 array.

In [11]:
import numpy as np

# Load sudokus
sudoku = np.load("data/very_easy_puzzle.npy")
print("very_easy_puzzle.npy has been loaded into the variable sudoku")
print(f"sudoku.shape: {sudoku.shape}, sudoku[0].shape: {sudoku[0].shape}, sudoku.dtype: {sudoku.dtype}")

# Load solutions for demonstration
solutions = np.load("data/very_easy_solution.npy")
print()

# Print the first 9x9 sudoku...
print("First sudoku:")
print(sudoku[0], "\n")

# ...and its solution
print("Solution of first sudoku:")
print(solutions[0])

very_easy_puzzle.npy has been loaded into the variable sudoku
sudoku.shape: (15, 9, 9), sudoku[0].shape: (9, 9), sudoku.dtype: int8

First sudoku:
[[1 0 4 3 8 2 9 5 6]
 [2 0 5 4 6 7 1 3 8]
 [3 8 6 9 5 1 4 0 2]
 [4 6 1 5 2 3 8 9 7]
 [7 3 8 1 4 9 6 2 5]
 [9 5 2 8 7 6 3 1 4]
 [5 2 9 6 3 4 7 8 1]
 [6 0 7 2 9 8 5 4 3]
 [8 4 3 0 1 5 2 6 9]] 

Solution of first sudoku:
[[1 7 4 3 8 2 9 5 6]
 [2 9 5 4 6 7 1 3 8]
 [3 8 6 9 5 1 4 7 2]
 [4 6 1 5 2 3 8 9 7]
 [7 3 8 1 4 9 6 2 5]
 [9 5 2 8 7 6 3 1 4]
 [5 2 9 6 3 4 7 8 1]
 [6 1 7 2 9 8 5 4 3]
 [8 4 3 7 1 5 2 6 9]]


## Part One
You should write all of your code for solving sudokus below this cell.

You must include a function called `sudoku_solver(sudoku)` which takes one sudoku puzzle (a 9x9 NumPy array) as input, and returns the solved sudoku as another 9x9 NumPy array. This is the function which will be tested. 

In [12]:
import copy
import math

class Sudoku_Solution:
    
    def __init__ (self, sudoku): 
        self.sudoku = sudoku
        self.possible_values = list(range(0,10))
        self.empty_location_list = []
        self.unsolvable_sudoku = -1*np.ones((9,9))
        
    #newly added, check if a certain is allowed in a certain location in the board
#     def is_valid_input(self, row, column, value):
        
    def is_valid_sudoku(self):
        return self.all_valid_numbers() and self.is_row_final_state() and self.is_column_final_state() and self.is_box_final_state()
    
    def is_goal(self):
        """This partial state is a goal state if every column/ row has a final value"""
        return self.all_goal_numbers() and self.is_row_final_state() and self.is_column_final_state() and self.is_box_final_state()

    def all_goal_numbers(self):
        return np.all(self.sudoku > 0) and np.all(self.sudoku < 10)  #and sum no repetitions in column,rows, and 3x3 np.arrays
    
    def all_valid_numbers(self):
        return np.all(self.sudoku >= 0) and np.all(self.sudoku < 10)  #and sum no repetitions in column,rows, and 3x3 np.arrays
   
    #Returns a sudoku board without the zeros to properly check duplication
    def sudoku_without_zeros(self):
        zeroless_sudoku = copy.deepcopy(self.sudoku)
        number_to_replace = 10
        for row_index, row  in enumerate(zeroless_sudoku):
            for col_index, item in enumerate(zeroless_sudoku[row_index]):  
                if item ==0:
                    zeroless_sudoku[row_index][col_index] = number_to_replace
                    number_to_replace +=1 #To avoid duplicates

        return zeroless_sudoku
                    
 
    def is_row_final_state(self):
        #checking for duplicates in rows
        row_number = 1
    
        #TODO: use enumerate 
        for row in self.sudoku_without_zeros():
            unique_row = np.unique(row)
            if unique_row.size != len(row):
#                 print 'There are duplicates in row:', row_number
                return False
            else: row_number+=1
        return True
            
              
    def is_column_final_state(self):
        #checking for duplicates in columns
        column_number = 1
        for column in self.sudoku_without_zeros().T:
            unique_column = np.unique(column)
            if unique_column.size != len(column):
#                 print('There are duplicates in column:', column_number)
                return False
            else: column_number+=1
        return True
    
    def is_box_final_state(self):
        #checking for duplicates in each box
        zeroless_sudoku = self.sudoku_without_zeros()
        #could be improved by running some for loop on it
        box_1 = zeroless_sudoku[0:3,0:3]
        box_2 = zeroless_sudoku[0:3,3:6]
        box_3 = zeroless_sudoku[0:3,6:9]
        box_4 = zeroless_sudoku[3:6,0:3]
        box_5 = zeroless_sudoku[3:6,3:6]
        box_6 = zeroless_sudoku[3:6,6:9]
        box_7 = zeroless_sudoku[6:9,0:3]
        box_8 = zeroless_sudoku[6:9,3:6]
        box_9 = zeroless_sudoku[6:9,6:9]
        boxes_9 = [box_1,box_2,box_3,box_4,box_5,box_6,box_7,box_8,box_9]
        box_number = 1
        for box in boxes_9:
            #check if each box is unique
            unique_box = np.unique(box)
            if unique_box.size != 9:
#                 print('There are duplicates in box', box_number)
                return False
            else: box_number+=1  
        return True
    
    #return next empty location
    def get_next_empty_location(self):
        #looks for the location of zeros.
        empty_location_arrays=np.where(self.sudoku==0)
        empty_location_list=list(zip(empty_location_arrays[0],empty_location_arrays[1])) 
        return empty_location_list[0]
        
    def get_possible_values(self):
        return self.possible_values.copy()
    
    #Fetch a subgrid from the sudoku board based on row and column
    def fetchSubgrid(self, row, column):
        grid = copy.deepcopy(self.sudoku)
        rowStart = math.floor(row/3) * 3
        rowEnd = rowStart +3
    
        columnStart = math.floor(column/3) * 3
        columnEnd = columnStart +3
    
        return grid[rowStart : rowEnd, columnStart : columnEnd]
    
    def get_updated_possible_values(self, row, column):
        possible_values = self.get_possible_values()

        #remove duplicates in rows
        row_array = self.sudoku[row]
        unique_row_list=list(np.unique(row_array))
#         print('unique_row_list:', unique_row_list)
#         print('possible_values_before',possible_values)
        for number in unique_row_list:   
            if number in possible_values:
                possible_values.remove(number)
            
            if len(possible_values) == 1:
                solution = possible_values[0]
#                 print('1) solution:',solution)
                return possible_values                
#         print('possible_values_after_row',possible_values)
                        
        #remove duplicates in columns
        column_array = self.sudoku.T[column]
        unique_column_list = list(np.unique(column_array))
#         print('unique_column_list:', unique_column_list)
#         print('possible_values_before',possible_values)
        for number in unique_column_list:
            if number in possible_values:
                possible_values.remove(number)
                if len(possible_values) == 1:
                    solution = possible_values[0]
#                     print('2) solution:',solution)
                    return possible_values
            
#         print('possible_values_after_column',possible_values)

            
        #remove duplicates in boxes #check with Hasan how to make this optimal# I know that this is very bad lol
        
        box = self.fetchSubgrid(row, column)
        
        box_list = sum(box.tolist(),[])
#         print('box_list', box_list)
#         print('possible_values_before',possible_values)
        for number in box_list:
            if number in possible_values:
                possible_values.remove(number)
                if len(possible_values) == 1:
                    solution = possible_values[0]
#                     print('solution:',solution)
                    return possible_values    
#         print('possible_values_after_box',possible_values)
        
        return possible_values
    
    #Select a value to add to a certain place in the sudoku board
    def set_value(self,row,column, value):
#         print('***inside set_value')
        
        # create a deep copy: the method returns a new state, does not modify the existing one
        state = copy.deepcopy(self)       
        state.sudoku[row][column] = value
        return state


In [13]:
def depth_first_search(sudoku):
    current_sudoku = copy.deepcopy(sudoku)
#     print('\n\nincoming sudoku: \n', current_sudoku.sudoku)
    
    if(current_sudoku.is_goal()):
        print('goal reached!')
        return current_sudoku
    
    empty_location = current_sudoku.get_next_empty_location()
    possible_values = current_sudoku.get_updated_possible_values(empty_location[0],empty_location[1])
#     print("possible values --> ", possible_values)
    
    for possible_value in possible_values:
#         print("\nCurrent Possible value --> ", possible_value, 'for location ',empty_location)
        
        new_state = current_sudoku.set_value(empty_location[0],empty_location[1], possible_value)
#         print('new state --> ', new_state.sudoku)
        
        if new_state is not None: 
            if(new_state.is_goal()):
                print("new_state goal state reached!!!!")
                return new_state
        
            if not new_state.is_valid_sudoku():
#                 print('invalid state found, returning None')
                return None
        
            deep_state = depth_first_search(new_state)

            #Goal reached, return state
            if deep_state is not None and deep_state.is_goal():
                #print('deep_state goal state reached!!!')
                return deep_state
    
    #print('\n\n\nNOTE:\nall possible values did not work, backtracking!')
    return None

In [14]:
def sudoku_solver(sudoku):
    """
    Solves a Sudoku puzzle and returns its unique solution.

    Input
        sudoku : 9x9 numpy array
            Empty cells are designated by 0.

    Output
        9x9 numpy array of integers
            It contains the solution, if there is one. If there is no solution, all array entries should be -1.
    """
    
    sudoku_obj = Sudoku_Solution(sudoku)
    
    #TODO: Needs fixing for duplicate zeros not to be considered a wrong start
    if(not sudoku_obj.is_valid_sudoku()):
        print('invalid sudoku from the start')
        return sudoku_obj.unsolvable_sudoku
        
    
    solved_sudoku = depth_first_search(sudoku_obj)
    
    if solved_sudoku is None:
        return -1*np.ones((9,9))
    
    
    ### YOUR CODE HERE
    
    return solved_sudoku.sudoku

All of your code must go above this cell. You may add additional cells into the notebook if you wish, but do not duplicate or copy/paste cells as this can interfere with the grading script.

### Testing Details
There are four difficulties of sudoku provided: very easy, easy, medium, and hard. There are 15 sample sudokus in each category, with solutions as well. Difficulty was determined using reference solvers, but your code may vary; it is conceivable that your code will find some sudokus much easier or harder within a given category, or even between categories.

*All categories that are easy and above will contain* ***invalid initial states***, that is, sudoku puzzles with no solution. In this case, your function should return a 9x9 NumPy array whose values are all equal to -1.

When we test your code, we will firstly test it on the *same* very easy puzzles that you have been given. Then we will test it on additional *hidden* sudokus from each difficulty in turn, easy and up. Grades are awarded based on whether your code can solve the puzzles. For high grades on the hard puzzles, execution time will also be a factor. 

All puzzles must take under 30 seconds each on the test machine to count as successful, but you should be aiming for an average of under a second per puzzle. Hardware varies, but all tests will take place on the same modern desktop machine. Our ‘standard constraint satisfaction’ implementation takes about 0.001 seconds per puzzle for the very easy category, but struggles to solve some of the hard puzzles within the time limit.

***The hard sudokus are labelled as hard for a reason.*** We expect most submissions will not be able to solve them in a reasonable length of time. Use the stop button (■) on the toolbar if you need to terminate your code because it is taking too long.

The best way to improve the performance of your code is through a detailed understanding and smart choice of AI algorithms. This assignment is ***not*** meant to test your ability to write multi-threaded code or any other kind of high-performance code optimisations. 

#### Test Cell
The following code will run your solution over the provided sudoku puzzles. To enable it, set the constant `SKIP_TESTS` to `False`. If you fail any tests of one difficulty, the code will stop, but you can modify this behaviour if you like.

**IMPORTANT**: you must set `SKIP_TESTS` back to `True` before submitting this file!

In [10]:
SKIP_TESTS = False

if not SKIP_TESTS:
    import time
#     difficulties = ['very_easy', 'easy', 'medium', 'hard']
    difficulties = ['hard']


    for difficulty in difficulties:
        print(f"Testing {difficulty} sudokus")
        
        sudokus = np.load(f"data/{difficulty}_puzzle.npy")
        solutions = np.load(f"data/{difficulty}_solution.npy")
        
        count = 0
        for i in range(len(sudokus)):
            sudoku = sudokus[i].copy()
            print(f"This is {difficulty} sudoku number", i)
            print(sudoku)
            
            start_time = time.process_time()
            your_solution = sudoku_solver(sudoku)
            end_time = time.process_time()
            
            print(f"This is your solution for {difficulty} sudoku number", i)
            print(your_solution)
            
            print("Is your solution correct?")
            if np.array_equal(your_solution, solutions[i]):
                print("Yes! Correct solution.")
                count += 1
            else:
                print("No, the correct solution is:")
                print(solutions[i])
            
            print("This sudoku took", end_time-start_time, "seconds to solve.\n")

        print(f"{count}/{len(sudokus)} {difficulty} sudokus correct")
        if count < len(sudokus):
            break

Testing hard sudokus
This is hard sudoku number 0
[[0 0 0 0 0 7 5 4 0]
 [9 0 6 0 5 0 0 3 0]
 [0 0 0 0 0 0 2 0 0]
 [2 0 0 0 0 0 7 9 0]
 [0 0 3 0 4 1 0 0 0]
 [7 0 0 0 0 0 0 5 0]
 [0 3 0 0 0 4 0 2 0]
 [0 9 4 1 0 0 0 0 0]
 [0 0 0 5 9 0 0 0 4]]
This is your solution for hard sudoku number 0
[[-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]
 [-1. -1. -1. -1. -1. -1. -1. -1. -1.]]
Is your solution correct?
Yes! Correct solution.
This sudoku took 35.861891 seconds to solve.

This is hard sudoku number 1
[[1 0 0 7 0 0 0 0 0]
 [0 3 2 0 0 0 0 0 0]
 [0 0 0 6 0 0 0 0 0]
 [0 8 0 0 0 2 0 7 0]
 [5 0 7 0 0 1 0 0 0]
 [0 0 0 0 0 3 6 1 0]
 [7 0 0 0 0 0 2 0 9]
 [0 0 0 0 5 0 0 0 0]
 [3 0 0 0 0 4 0 0 5]]


KeyboardInterrupt: 

## Submission Test
The following cell tests if your notebook is ready for submission. **You must not skip this step!**

Restart the kernel and run the entire notebook (Kernel → Restart & Run All). Now look at the output of the cell below. 

*If there is no output, then your submission is not ready.* Either your code is still running (did you forget to skip tests?) or it caused an error.

As previously mentioned, failing to follow these instructions can result in a grade of zero.

In [10]:
import sys
import pathlib

fail = False;

if not SKIP_TESTS:
    fail = True;
    print("You must set the SKIP_TESTS constant to True in the cell above.")
    
p1 = pathlib.Path('./readme.txt')
p2 = pathlib.Path('./readme.md')
if not (p1.is_file() or p2.is_file()):
    fail = True;
    print("You must include a separate file called readme.txt or readme.md in your submission.")
    
p3 = pathlib.Path('./sudoku.ipynb')
if not p3.is_file():
    fail = True
    print("This notebook file must be named sudoku.ipynb")
    
if "sudoku_solver" not in dir():
    fail = True;
    print("You must include a function called sudoku_solver which accepts a numpy array.")
else: 
    sudoku = np.load("data/very_easy_puzzle.npy")[0]
    solution = np.load("data/very_easy_solution.npy")[0]

    if not np.array_equal(sudoku_solver(sudoku), solution):
        print("Warning:")
        print("Your sudoku_solver function does not correctly solve the first sudoku.")
        print()
        print("Your assignment is unlikely to get any marks from the autograder. While we will")
        print("try to check it manually to assign some partial credit, we encourage you to ask")
        print("for help on the forum or directly to a tutor.")
        print()
        print("Please use the readme file to explain your code anyway.")
    
if fail:
    print()
    sys.stderr.write("Your submission is not ready! Please read and follow the instructions above.")
else:
    print("All checks passed. When you are ready to submit, upload the notebook and readme file to the")
    print("assignment page, without changing any filenames.")
    print()
    print("If you need to submit multiple files, you can archive them in a .zip file. (No other format.)")

You must set the SKIP_TESTS constant to True in the cell above.
You must include a separate file called readme.txt or readme.md in your submission.
new_state goal state reached!!!!



Your submission is not ready! Please read and follow the instructions above.

In [11]:
# This is a TEST CELL. Do not delete or change.

## TODO:
1) Test that sudoku doesn't have duplicate values (EXCLUDING ZEROS)
2) Don't return inside get_possible_values --> if you get to a state where there is no possible value return (None?)

In [15]:
def depth_first_search(sudoku):
#     solution = -1*np.ones((9,9))
    current_sudoku = copy.deepcopy(sudoku)
    print('\n\nincoming sudoku: \n', current_sudoku.sudoku)
    
    if(current_sudoku.is_goal()):
        print('goal reached!')
        return current_sudoku
    
    empty_location = current_sudoku.get_next_empty_location()
    possible_values = current_sudoku.get_updated_possible_values(empty_location[0],empty_location[1])
    print("possible values --> ", possible_values)
    
    for possible_value in possible_values:
        print("\nCurrent Possible value --> ", possible_value, 'for location ',empty_location)
        
        new_state = current_sudoku.set_value(empty_location[0],empty_location[1], possible_value)
        print('new state --> ', new_state.sudoku)
        
        if(new_state.is_goal()):
            print("new_state goal state reached!!!!")
            return new_state
        
        deep_state = depth_first_search(new_state)

        #Goal reached, return state
        if deep_state is not None and deep_state.is_goal():
            print('deep_state goal state reached!!!')
            return deep_state

    return None