# Sudoku Solver
## What is Sudoku?
A Sudoku puzzle consists of a 9x9 grid in which each cell contains an integer from 1 to 9.
The 9x9 grid is divided into nine 3x3 subgrids.
The solution to the puzzle must satisfy the following conditions:\
    **>Each row contains digits 1-9 without repetition**\
    **>Each column contains digits 1-9 without repetition**\
    **>Each subgrid contains digits 1-9 without repetition**\
The puzzle is given with some of the cells empty, and solving it means filling them in.
This Python project aims to create an AI algorithm that solves Sudoku puzzles.

## Approaching the Problem
The complexity of a Sudoku puzzle depends on the number of empty cells in the grid.\
If 41 of the 81 cells are empty, there are up to 9^41 (roughly 1.3 x 10^39) possible combinations that could be the solution.
A naive brute-force algorithm that tries every one of these combinations will not be feasible.
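
To get a feel for the size of nine to the power of 41, the short calculation below evaluates it and estimates how long a full enumeration would take. The rate of one billion candidates per second is an assumed figure chosen only for illustration, not a measurement.

```python
# Upper bound on the number of ways to fill 41 empty cells with digits 1-9.
empty_cells = 41
combinations = 9 ** empty_cells

# Assumed checking rate, for illustration only.
checks_per_second = 10 ** 9
seconds_per_year = 60 * 60 * 24 * 365
years = combinations / (checks_per_second * seconds_per_year)

print(f"{combinations:.3e} combinations")
print(f"about {years:.1e} years at one billion checks per second")
```

Most of these combinations break a row, column or subgrid rule almost immediately, which is why pruning the candidates first pays off.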


## The Dataset
The dataset used in this project was downloaded from **Kaggle**.

Kaggle has made publicly available a dataset of 9 million Sudoku puzzles and their solutions.



### Importing the Required Libraries and the Dataset

In [34]:
#importing the required libraries and the dataset

import numpy as np
import pandas as pd
import time
import itertools

#using pandas to import the first 100 rows of the data set
sudokuDf = pd.read_csv('sudoku.csv',nrows=10**2)

In [2]:
sudokuDf.sample(5)

| | puzzle | solution |
|---|---|---|
| 18 | 0870020102040170030068007055080010006400081000... | 9875324162546179833168497255782613496437981521... |
| 14 | 0073000542450809000030400700709600000000207600... | 8673192542457869319135426784729631853814257695... |
| 0 | 0700000430400096108006349000940520003584600200... | 6795182435437296188216349577943521863584617292... |
| 70 | 0728340690450200080000750200140800008670400032... | 1728345699451267386389754213147896528675421932... |
| 84 | 0000507800006000100900206031004000000542600900... | 2361597848756439124917286531684972353542618979... |


#### Representing the Data in the Standard 9x9 Sudoku Grid

As the sample shows, each puzzle and each solution is stored as a string of 81 digits, with 0 marking an empty cell, so we have to reshape them into the standard 9x9 Sudoku grid.

In [35]:
def shape(sudokuDf):
    for n in range(sudokuDf.shape[0]):
        #shaping the puzzle part
        sudokuDf.iloc[n,0] = np.reshape(list(sudokuDf.puzzle.values[n]),(9,9)).astype(int)
        #shaping the solution part
        sudokuDf.iloc[n,1] = np.reshape(list(sudokuDf.solution.values[n]),(9,9)).astype(int)
    return sudokuDf
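
As a minimal sketch of the same reshaping step on a single string, outside the DataFrame, the snippet below converts one 81-digit string into a 9x9 integer array. The helper name `to_grid` is hypothetical and not part of the project; the string is row 0 of the dataset written out in full, one Sudoku row per line.

```python
import numpy as np

def to_grid(digits):
    """Turn an 81-character digit string into a 9x9 integer array (hypothetical helper)."""
    if len(digits) != 81:
        raise ValueError("expected a string of 81 digits")
    return np.array([int(ch) for ch in digits]).reshape(9, 9)

# Puzzle from row 0 of the dataset; adjacent string literals are concatenated.
first_puzzle = (
    "070000043"
    "040009610"
    "800634900"
    "094052000"
    "358460020"
    "000800530"
    "080070091"
    "902100005"
    "007040802"
)

grid = to_grid(first_puzzle)
print(grid.shape)
print(grid[0])
```

Reading the string row by row works because `reshape` fills the array in row-major order by default.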

**How the first Puzzle looks**

In [36]:
sudokuDf = shape(sudokuDf)
sudokuDf.iloc[0,0]

array([[0, 7, 0, 0, 0, 0, 0, 4, 3],
       [0, 4, 0, 0, 0, 9, 6, 1, 0],
       [8, 0, 0, 6, 3, 4, 9, 0, 0],
       [0, 9, 4, 0, 5, 2, 0, 0, 0],
       [3, 5, 8, 4, 6, 0, 0, 2, 0],
       [0, 0, 0, 8, 0, 0, 5, 3, 0],
       [0, 8, 0, 0, 7, 0, 0, 9, 1],
       [9, 0, 2, 1, 0, 0, 0, 0, 5],
       [0, 0, 7, 0, 4, 0, 8, 0, 2]])

**This is its Solution**

In [37]:
sudokuDf.iloc[0,1]

array([[6, 7, 9, 5, 1, 8, 2, 4, 3],
       [5, 4, 3, 7, 2, 9, 6, 1, 8],
       [8, 2, 1, 6, 3, 4, 9, 5, 7],
       [7, 9, 4, 3, 5, 2, 1, 8, 6],
       [3, 5, 8, 4, 6, 1, 7, 2, 9],
       [2, 1, 6, 8, 9, 7, 5, 3, 4],
       [4, 8, 5, 2, 7, 6, 3, 9, 1],
       [9, 6, 2, 1, 8, 3, 4, 7, 5],
       [1, 3, 7, 9, 4, 5, 8, 6, 2]])

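## Implementing a Checker for the Solution
This function checks whether a candidate solution satisfies the three conditions listed above: every row, every column and every subgrid must contain the digits 1-9. If the check passes, the solved grid is also printed.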

In [None]:
def checkPuzzle(sudokuPuzzle):
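    # every row must contain each digit 1-9
    checkRow = all(x in sudokuPuzzle[nrow,:] for nrow in range(9) for x in range(1,10))
    # every column must contain each digit 1-9 (index the second axis, not the first)
    checkColumn = all(x in sudokuPuzzle[:,ncol] for ncol in range(9) for x in range(1,10))
    # every 3x3 subgrid must contain each digit 1-9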
    checkUpperLeft = all([x in sudokuPuzzle[0:3,0:3] for x in range(1,10)])
    checkUpperMid = all([x in sudokuPuzzle[0:3,3:6] for x in range(1,10)])
    checkUpperRight = all([x in sudokuPuzzle[0:3,6:9] for x in range(1,10)])

    checkMidLeft = all([x in sudokuPuzzle[3:6,0:3] for x in range(1,10)])
    checkMidMid = all([x in sudokuPuzzle[3:6,3:6] for x in range(1,10)])
    checkMidRight = all([x in sudokuPuzzle[3:6,6:9] for x in range(1,10)])

    checkLowerLeft = all([x in sudokuPuzzle[6:9,0:3] for x in range(1,10)])
    checkLowerMid = all([x in sudokuPuzzle[6:9,3:6] for x in range(1,10)])
    checkLowerRight = all([x in sudokuPuzzle[6:9,6:9] for x in range(1,10)])

    solved = all([checkRow,checkColumn,checkUpperLeft,checkUpperMid,checkUpperRight,
                  checkMidLeft,checkMidMid,checkMidRight,checkLowerLeft,checkLowerMid,checkLowerRight])
    if solved:
        for line in sudokuPuzzle:
            print(*line)
    return solved
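
The nine subgrid checks in `checkPuzzle` all follow the same pattern, so they can also be generated in a loop. The sketch below is one possible compact variant rather than the project's checker: the name `is_valid_solution` is hypothetical, and unlike `checkPuzzle` it does not print the grid. It is tried on the solution of the first puzzle shown earlier.

```python
import numpy as np

def is_valid_solution(grid):
    """Return True if every row, column and 3x3 subgrid holds the digits 1-9 exactly once."""
    grid = np.asarray(grid)
    digits = list(range(1, 10))
    rows = [grid[r, :] for r in range(9)]
    cols = [grid[:, c] for c in range(9)]
    boxes = [grid[r:r + 3, c:c + 3].ravel()
             for r in range(0, 9, 3) for c in range(0, 9, 3)]
    # A unit of nine cells is valid exactly when its sorted values are 1..9.
    return all(sorted(unit.tolist()) == digits for unit in rows + cols + boxes)

solution = np.array([[6, 7, 9, 5, 1, 8, 2, 4, 3],
                     [5, 4, 3, 7, 2, 9, 6, 1, 8],
                     [8, 2, 1, 6, 3, 4, 9, 5, 7],
                     [7, 9, 4, 3, 5, 2, 1, 8, 6],
                     [3, 5, 8, 4, 6, 1, 7, 2, 9],
                     [2, 1, 6, 8, 9, 7, 5, 3, 4],
                     [4, 8, 5, 2, 7, 6, 3, 9, 1],
                     [9, 6, 2, 1, 8, 3, 4, 7, 5],
                     [1, 3, 7, 9, 4, 5, 8, 6, 2]])

# Swapping two cells in one row keeps that row valid but breaks two columns.
broken = solution.copy()
broken[0, 0], broken[0, 1] = solution[0, 1], solution[0, 0]

print(is_valid_solution(solution))
print(is_valid_solution(broken))
```

The swapped grid is a useful test case because it only fails if the columns are really being checked.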

## Using the Brute-Force Approach
We can define a function that scans each cell and determines which values are already present in the cell's row, column and subgrid.\
It then removes those known values from the list of candidates 1-9.\
If the cell is already filled, there is only a single possible value: the given one.

In [52]:
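def determineValues(sudokuPuzzle):
    puzzleValue = list()
    for r in range(9):
        for c in range(9):
            if sudokuPuzzle[r,c]==0:
                # start from all digits 1-9 and drop those already in the row and column
                cellValues = np.array(range(1,10))
                cellValues = np.setdiff1d(cellValues,sudokuPuzzle[r,:])
                cellValues = np.setdiff1d(cellValues,sudokuPuzzle[:,c])
                # drop the digits already in the cell's 3x3 subgrid
                r0, c0 = 3*(r//3), 3*(c//3)
                cellValues = np.setdiff1d(cellValues,sudokuPuzzle[r0:r0+3,c0:c0+3]).tolist()
            else:
                # a filled cell has exactly one possible value
                cellValues = [sudokuPuzzle[r,c]]
            # record the candidates for every cell, filled or empty
            puzzleValue.append(cellValues)
    return puzzleValue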

In [56]:
puzzleValue = determineValues(sudokuDf.iloc[5,1])
puzzleValue

[[5],
 [6],
 [1],
 [4],
 [9],
 [2],
 [7],
 [3],
 [8],
 [3],
 [2],
 [4],
 [7],
 [8],
 [6],
 [1],
 [9],
 [5],
 [9],
 [8],
 [7],
 [3],
 [1],
 [5],
 [2],
 [4],
 [6],
 [6],
 [5],
 [9],
 [8],
 [3],
 [1],
 [4],
 [2],
 [7],
 [4],
 [1],
 [8],
 [2],
 [7],
 [9],
 [5],
 [6],
 [3],
 [2],
 [7],
 [3],
 [5],
 [6],
 [4],
 [8],
 [1],
 [9],
 [1],
 [3],
 [5],
 [9],
 [2],
 [8],
 [6],
 [7],
 [4],
 [7],
 [4],
 [6],
 [1],
 [5],
 [3],
 [9],
 [8],
 [2],
 [8],
 [9],
 [2],
 [6],
 [4],
 [7],
 [3],
 [5],
 [1]]

Now that we know the possible values of each cell, we could move on to trying out different combinations until we find the solution. However, this is a mammoth computational task that requires a lot of both time and resources. A better method is to use backtracking.
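
A minimal sketch of what such a backtracking search could look like is given below. It is an illustration under stated assumptions rather than this project's implementation: the names `candidates` and `solve` are hypothetical, empty cells are assumed to be encoded as 0 as in the dataset, and the grid is modified in place. The example grid is the first puzzle of the dataset.

```python
import numpy as np

def candidates(grid, r, c):
    """Digits not yet used in the row, column or 3x3 subgrid of cell (r, c)."""
    r0, c0 = 3 * (r // 3), 3 * (c // 3)
    used = set(grid[r, :].tolist())
    used |= set(grid[:, c].tolist())
    used |= set(grid[r0:r0 + 3, c0:c0 + 3].ravel().tolist())
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Fill the zeros of grid in place; return True once a full solution is found."""
    empties = np.argwhere(grid == 0)
    if len(empties) == 0:
        return True              # no empty cell left: the grid is complete
    r, c = empties[0]
    for d in candidates(grid, r, c):
        grid[r, c] = d           # tentatively place a digit
        if solve(grid):
            return True
    grid[r, c] = 0               # no digit worked: undo and backtrack
    return False

puzzle = np.array([[0, 7, 0, 0, 0, 0, 0, 4, 3],
                   [0, 4, 0, 0, 0, 9, 6, 1, 0],
                   [8, 0, 0, 6, 3, 4, 9, 0, 0],
                   [0, 9, 4, 0, 5, 2, 0, 0, 0],
                   [3, 5, 8, 4, 6, 0, 0, 2, 0],
                   [0, 0, 0, 8, 0, 0, 5, 3, 0],
                   [0, 8, 0, 0, 7, 0, 0, 9, 1],
                   [9, 0, 2, 1, 0, 0, 0, 0, 5],
                   [0, 0, 7, 0, 4, 0, 8, 0, 2]])

original = puzzle.copy()         # keep the clues for later comparison
solved = solve(puzzle)
if solved:
    print(puzzle)
```

Because a digit is only placed when it does not clash with its row, column and subgrid, whole branches of invalid combinations are discarded early instead of being enumerated one by one.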