<a href="https://colab.research.google.com/github/JamieBali/hopfieldSudokuSolver/blob/main/hopfieldSudokuSolver.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction
We are creating a neural network that can solve sudoku problems. We will begin by solving a simpler 4x4 sudoku puzzle as a test of the system, before we implement the full 9x9 sudoku. We will work out how to extend this later; some options are listed below.

\> We could vary sizes (eg. 16x16, 25x25)

\> We could vary rules (eg. Knight's Puzzle, King's Puzzle, Killer Sudoku)

\> We could implement other solvers and compare them (eg. Algorithmic Solving, Convolutional NN, Feed-Forward NN)

In [None]:
import numpy as np
import pandas as pd
import math
import torch
import torch.nn as nn



# Creating the Value Function

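For us to be able to use a Hopfield network to solve sudoku puzzles, we need a binary representation of the grid and a value function that measures how well a grid satisfies the rules.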

The binary rules of a sudoku solution are:

> if $X(i,j) = 0,    V(i,j,k) = 0$ for all $k$
> 
> if $X(i,j) = k \neq 0, V(i,j,k) = 1.$

Here $X(i,j)$ is the value at position $(i,j)$ on the grid (0 for an empty cell) and $k$ is any value that *could* go in that position. $V$ is therefore a binary representation of which value, if any, each position contains. This lets us create an effective optimisation function using the following rules:

> $V(i,j,k) = 0$ or $1$ for all $i,j,k$
>
> $\sum_{i}V(i,j,k) = 1$ for all $j,k$
>
> $\sum_{j}V(i,j,k) = 1$ for all $i,k$
>
> $\sum_{k}V(i,j,k) = 1$ for all $i,j$
>
> $\sum_{i,j}V(i,j,k) = 1$ for all $k$, with the sum on $i$ and $j$ taken over one of the 3x3 $i,j$ squares bounded by thicker lines.

(Hopfield, 2008)

This means that, according to the rules of sudoku, each number (1 through 9 in the full puzzle) can appear only once in each row, column, and square; any repeat violates the constraints.
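
As a small illustration of the encoding, the sketch below builds $V$ from a grid $X$ using plain Python lists and evaluates the per-cell sum $\sum_{k}V(i,j,k)$. The helper names `encode` and `cell_sums` are made up for this example and are not part of the notebook's own code.

```python
def encode(X):
    """Build V[i][j][k] = 1 if X[i][j] == k + 1, else 0 (k is zero-based)."""
    n = len(X)
    return [[[1 if X[i][j] == k + 1 else 0 for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def cell_sums(V):
    """Sum over k for every cell: 1 for a filled cell, 0 for an empty one."""
    return [[sum(cell) for cell in row] for row in V]


# A partially filled 4x4 grid; 0 marks an empty cell.
X = [[3, 1, 0, 0],
     [4, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 3, 0, 0]]

V = encode(X)
print(V[0][0])       # encoding of the 3 in the top-left cell
print(cell_sums(V))  # 1 where X holds a digit, 0 where it is empty
```

A grid only satisfies the third constraint once every entry of `cell_sums(V)` equals 1, which is why a partially filled puzzle scores below the optimum.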

In [None]:
def getGridValue(grid, size):
  ###
  # Subject to the above constraints, we can get the value of a grid by running it through the described sums and accumulating a total.
  # Each sum that equals 1 adds 1 to the total; the running total is printed after each of the four passes.
  ###
  
  totalSum = 0

  # sum across i for all j,k: grid[i][j] indexes row i and column j, so this checks that each number appears once and only once in each column
  # k is equal to the value in the grid - 1, since it begins indexing at 0
  for k in range(0,size):
    for j in range(0,size): 
      lineSum = 0    # named lineSum so the built-in sum() is not shadowed
      for i in range(0,size):
        lineSum += grid[i][j][k]
      if lineSum == 1:
        totalSum += 1

  # if every sum across i is correct, totalSum should now be size^2
  print("Optimal: " + str(size*size) + " | Actual: " + str(totalSum))
  
  # sum across j for all i,k: this checks that each number appears once and only once in each row
  # k is equal to the value in the grid - 1, since it begins indexing at 0
  for k in range(0,size):
    for i in range(0,size): 
      lineSum = 0
      for j in range(0,size):
        lineSum += grid[i][j][k]
      if lineSum == 1:
        totalSum += 1

  # if every sum across j is also correct, totalSum should now be 2(size^2)
  print("Optimal: " + str(size*size*2) + " | Actual: " + str(totalSum))

  # sum across k for all i,j (confirms that every tile on the grid contains exactly one number and isn't still 0)
  # k is equal to the value in the grid - 1, since it begins indexing at 0
  for i in range(0,size):
    for j in range(0,size): 
      cellSum = 0
      for k in range(0,size):
        cellSum += grid[i][j][k]
      if cellSum == 1:
        totalSum += 1

  # if every sum across k is also correct, totalSum should now be 3(size^2)
  print("Optimal: " + str(size*size*3) + " | Actual: " + str(totalSum))

  # visit every sub-grid of dimensions (sqrt(size) x sqrt(size)) and sum across k for each tile inside it
  # NOTE: this counts the filled tiles of each sub-grid, repeating the per-tile check above. It does not yet sum across i,j
  # for each k, so a number repeated inside a sub-grid is not penalised by this pass.
  temp = int(math.sqrt(size))
  for iincrement in range(0,temp):               # this i and j incrementer allows each individual sub-grid to be searched, and allows for easy grid size change
    for jincrement in range(0,temp):
      for i in range(0,temp):
        for j in range(0,temp):
          cellSum = 0
          for k in range(0, size):
            cellSum += grid[i + (iincrement*temp)][j+(jincrement*temp)][k]
          if cellSum == 1:
            totalSum += 1

  # if all four passes are correct, totalSum should now be 4(size^2)
  print("Optimal: " + str(size*size*4) + " | Actual: " + str(totalSum))
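
The same scoring can be sketched without explicit loops using NumPy axis sums. This is an illustrative variant rather than the notebook's implementation: `one_hot` and `grid_value` are hypothetical names, and it returns the total instead of printing it. Its box check sums over the cells of each box for a fixed $k$, as in the last constraint above, so its total can differ from `getGridValue` on a grid that repeats a digit inside a box.

```python
import numpy as np


def one_hot(grid):
    """Encode a grid of digits (0 = empty) as a binary (size, size, size) array."""
    grid = np.asarray(grid)
    size = grid.shape[0]
    # Row d of the identity matrix is the code for digit d; dropping
    # column 0 makes an empty cell encode as all zeros.
    return np.eye(size + 1)[grid][:, :, 1:]


def grid_value(V):
    """Count how many of the four families of sums are equal to 1."""
    size = V.shape[0]
    box = int(round(size ** 0.5))

    over_i = V.sum(axis=0)  # shape (j, k): sum over i
    over_j = V.sum(axis=1)  # shape (i, k): sum over j
    over_k = V.sum(axis=2)  # shape (i, j): sum over k

    # Split both grid axes into (box index, offset inside the box), then
    # sum over the two offsets: shape (box row, box column, k).
    over_box = V.reshape(box, box, box, box, size).sum(axis=(1, 3))

    return sum(int((s == 1).sum()) for s in (over_i, over_j, over_k, over_box))


solved = [[3, 1, 4, 2],
          [4, 2, 3, 1],
          [1, 4, 2, 3],
          [2, 3, 1, 4]]

# Same grid with a repeated 4 in the first row (and in the top-right box).
broken = [[3, 1, 4, 4],
          [4, 2, 3, 1],
          [1, 4, 2, 3],
          [2, 3, 1, 4]]

print(grid_value(one_hot(solved)))
print(grid_value(one_hot(broken)))
```

Returning the total rather than printing it makes the score usable as an objective inside a solver loop.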

In [None]:
def networkFormat(grid, size):
  # we need a binary representation of the grid in order to put it through a neural network.
  # since we get the data in with integers up to 9 in each slot on the grid, we must construct a binary, 3-dimensional matrix to represent our puzzle.
  puzzle = np.zeros((size,size,size))
  for x in range(0, size):
    for y in range(0,size):
      temp = grid[x][y]
      if temp != 0:
        puzzle[x][y][temp-1] = 1
  return puzzle

###
#
# grid[x][y] uses x as the row (first index) and y as the column, so if x is read as the horizontal
# coordinate the grid appears transposed:
#
# [1,2]                  [1,3]
# [3,4]   would become   [2,4]
#
# we could flip the data, but it shouldn't matter as long as we are consistent, since the rules of sudoku treat rows and columns alike.
#
###

def readableFormat(grid, size):
  # we also need a function to take the neural network format and turn it back into a human-readable format.
  grid = np.reshape(grid,(size,size,size))
  puzzle = np.zeros((size,size))
  for x in range(0, size):
    for y in range(0,size):
      temp = 0                 # stays 0 if no k is set, i.e. the tile is empty
      for k in range(0,size):
        if grid[x][y][k] == 1:
          temp = k + 1         # increased by 1 because k begins indexing at 0, not 1
      puzzle[x][y] = temp
  return puzzle
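
For comparison, the same round trip can be sketched with NumPy indexing instead of nested loops. The names `to_one_hot` and `from_one_hot` are hypothetical and chosen only for this example; the sketch assumes a square grid of integers in which 0 marks an empty tile, and decodes an empty tile back to 0.

```python
import numpy as np


def to_one_hot(grid):
    """Digits (0 = empty) -> binary array of shape (size, size, size)."""
    grid = np.asarray(grid)
    size = grid.shape[0]
    # Row d of the identity matrix is the one-hot code for digit d;
    # dropping column 0 makes an empty tile encode as all zeros.
    return np.eye(size + 1)[grid][:, :, 1:]


def from_one_hot(V):
    """Binary array -> digits, with 0 wherever no k is set."""
    filled = V.sum(axis=2) > 0
    # argmax gives the zero-based digit, so add 1 for filled tiles.
    return np.where(filled, V.argmax(axis=2) + 1, 0)


partial = np.array([[3, 1, 0, 0],
                    [4, 2, 0, 0],
                    [0, 0, 2, 0],
                    [0, 3, 0, 0]])

restored = from_one_hot(to_one_hot(partial))
print(restored)
```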

# Testing The Value and Formatting Functions 

In [None]:
# test the above constraints with a solved 4x4 sudoku (a solved 9x9 sudoku is kept below, commented out)

twobytwo = [[3,1,4,2],
            [4,2,3,1],
            [1,4,2,3],
            [2,3,1,4]]

tbtFormatted = networkFormat(twobytwo, 4)

print(tbtFormatted)

getGridValue(tbtFormatted, 4)

print(readableFormat(tbtFormatted, 4))



#print (" ~ ~ ~ ")

#threebythree = [[2,1,6,9,3,8,4,5,7],
#                [9,5,4,7,6,2,8,3,1],
#                [3,7,8,5,1,4,2,6,9],
#                [6,8,2,1,9,5,3,7,4],
#                [7,3,5,4,2,6,1,9,8],
#                [4,9,1,8,7,3,6,2,5],
#                [8,2,9,6,5,1,7,4,3],
#                [1,6,7,3,4,9,5,8,2],
#                [5,4,3,2,8,7,9,1,6]]

#getGridValue(networkFormat(threebythree, 9), 9)

[[[0. 0. 1. 0.]
  [1. 0. 0. 0.]
  [0. 0. 0. 1.]
  [0. 1. 0. 0.]]

 [[0. 0. 0. 1.]
  [0. 1. 0. 0.]
  [0. 0. 1. 0.]
  [1. 0. 0. 0.]]

 [[1. 0. 0. 0.]
  [0. 0. 0. 1.]
  [0. 1. 0. 0.]
  [0. 0. 1. 0.]]

 [[0. 1. 0. 0.]
  [0. 0. 1. 0.]
  [1. 0. 0. 0.]
  [0. 0. 0. 1.]]]
Optimal: 16 | Actual: 16
Optimal: 32 | Actual: 32
Optimal: 48 | Actual: 48
Optimal: 64 | Actual: 64
[[3. 1. 4. 2.]
 [4. 2. 3. 1.]
 [1. 4. 2. 3.]
 [2. 3. 1. 4.]]


In [None]:
# demonstrate the value increasing as the problem becomes more solved

twobytwo = [[0,1,0,0],[4,2,0,0],[0,0,2,0],[0,3,0,0]]

getGridValue(networkFormat(twobytwo, 4), 4)

print (" ~ ~ ~ ")

twobytwo = [[3,1,0,0],[4,2,0,0],[0,0,2,0],[0,3,0,0]]

getGridValue(networkFormat(twobytwo, 4), 4)

print (" ~ ~ ~ ")

twobytwo = [[3,1,0,0],[4,2,0,0],[0,4,2,0],[0,3,0,0]]

getGridValue(networkFormat(twobytwo, 4), 4)

Optimal: 16 | Actual: 5
Optimal: 32 | Actual: 10
Optimal: 48 | Actual: 15
Optimal: 64 | Actual: 20
 ~ ~ ~ 
Optimal: 16 | Actual: 6
Optimal: 32 | Actual: 12
Optimal: 48 | Actual: 18
Optimal: 64 | Actual: 24
 ~ ~ ~ 
Optimal: 16 | Actual: 7
Optimal: 32 | Actual: 14
Optimal: 48 | Actual: 21
Optimal: 64 | Actual: 28


In [None]:
# and lastly we will test an incorrectly solved puzzle

twobytwo = [[3,1,4,4], # this row is incorrect: it should be 3,1,4,2. The repeated 4 violates the rules of sudoku and should give us a sub-optimal value (expected 60, since the first two checks each lose 2).
            [4,2,3,1],
            [1,4,2,3],
            [2,3,1,4]]

getGridValue(networkFormat(twobytwo,4),4)

Optimal: 16 | Actual: 14
Optimal: 32 | Actual: 28
Optimal: 48 | Actual: 44
Optimal: 64 | Actual: 60


# Creating The Network

We will be creating a Hopfield Neural Network to solve our puzzles.

A Hopfield Neural Network is a single-layer recurrent neural network, used here in its continuous form, in which every neurone connects to every other neurone symmetrically.
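
To make the symmetric, fully connected structure concrete, here is a minimal NumPy sketch of one update of such a network. It is illustrative only: the weights are random placeholders rather than weights derived from the sudoku rules, the absence of self-connections is an assumption, and the names `random_symmetric_weights`, `logsig` and `update` are invented for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_symmetric_weights(n):
    """Placeholder weights: symmetric, with no self-connections (assumed)."""
    a = rng.normal(size=(n, n))
    w = (a + a.T) / 2         # enforces w[p, q] == w[q, p]
    np.fill_diagonal(w, 0.0)  # a neurone does not feed itself
    return w


def logsig(x):
    """Logistic sigmoid, mapping any input into the range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-x))


def update(state, weights):
    """One synchronous update of every neurone from the current state."""
    return logsig(weights @ state)


n = 4 ** 3  # one neurone per (i, j, k) triple of a 4x4 puzzle
weights = random_symmetric_weights(n)
state = update(rng.random(n), weights)
print(state.shape)
```

Because the weight matrix is symmetric, the influence of neurone p on neurone q equals the influence of q on p, which is the property the description above relies on.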

In [None]:
###
#
# we will construct this network so it can be size adapted.
# since we initially want to solve a 4x4 sudoku, we will focus on this first.
#
###
def createNetwork(grid, size):
  
  # first we must construct the neurones
  # we've already made a network formatter, so we just need to flatten the binary puzzle into its respective neurones.
  # networkFormat returns a numpy array, so it is converted to a tensor before flattening.
  neurones = torch.flatten(torch.from_numpy(networkFormat(grid,size)))

  # next we need to construct weights
  # the initial weights of the network 

  return neurones

###
#
# Next we will construct a step function which runs a single matrix multiplication step,
# divides by 8, and then performs the logsig function. These are the steps described in 
# the paper "Solving Sudoku Puzzles by using Hopfield Neural Networks" (Mladenov, 2011)
#
###
def step(neurones, weights):
  # matrix multiplication, division by 8, then logsig (the logistic sigmoid)
  return torch.sigmoid(torch.matmul(weights, neurones) / 8)


createNetwork(twobytwo,4)

tensor([0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0.,
        0., 1., 0., 1., 0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0., 0., 0.,
        0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 0., 0.,
        1., 0., 1., 0., 0., 0., 0., 0., 0., 1.], dtype=torch.float64)
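
Flattening is row-major, so each position in the tensor above corresponds to one $(i,j,k)$ triple. The sketch below shows that mapping in plain Python; `flat_index` and `grid_index` are hypothetical helper names introduced only for this illustration.

```python
def flat_index(i, j, k, size):
    """Position of V[i][j][k] after a row-major flatten."""
    return (i * size + j) * size + k


def grid_index(n, size):
    """Inverse of flat_index: recover (i, j, k) from a flat position."""
    ij, k = divmod(n, size)
    i, j = divmod(ij, size)
    return i, j, k


size = 4
print(flat_index(0, 0, 2, size))  # neurone for "tile (0,0) holds the digit 3"
print(grid_index(4, size))        # which (i, j, k) the neurone at position 4 represents
```

In the printed tensor the first 1 sits at position 2, which is $(i,j,k) = (0,0,2)$: the digit 3 in the first tile of the grid.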
