<a href="https://colab.research.google.com/github/kameda-yoshinari/DataAlgo-T/blob/master/DataAlgo_T(013)_NumberPlace.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 6.2.3. Number Place

Number place is a very famous pencil puzzle, also known as Sudoku. It is another application of the problem-solving approach.

Note that the algorithm a computer uses to solve number place is very different from what human beings do, because humans and computers have very different strengths.

**Reminder**  
On GitHub, the rendering might not be in good shape.  
To see the expected layout, open this page in Google Colaboratory.  
To run one specific code cell in Colab, click the icon on its left or just press Ctrl + Enter.  


# Preparation

Connect to the Jupyter environment and start a runtime.  
Mount your Google Drive by the procedure below.  
Change directory to the mount point and use it as the working folder.  
That way, files are preserved even after you terminate the runtime environment.

In [None]:
!echo "Mounting your Google Drive"
from google.colab import drive 
drive.mount('/content/drive')

In [None]:
!echo "Make a working folder and change directory to it"
%cd /content/drive/My\ Drive
%mkdir -p DataAlgo-T/013
%cd       DataAlgo-T/013
!ls

# Number place

**Problem definition**

We assume you know the rules of number place.  
If not, just visit Wikipedia, or ask someone around you who enjoys solving the puzzle.  
> https://en.wikipedia.org/wiki/Sudoku

The board size is 9 by 9.  
There are three conditions to follow when deciding which number to fill in.

* Each column must contain each of the figures 1-9 exactly once.
* Each row must contain each of the figures 1-9 exactly once.
* Each 3x3 block must contain each of the figures 1-9 exactly once.

Obviously, the number of solutions that satisfy all these conditions is finite (as shown on Wikipedia).

In addition, as a puzzle game, some cells are filled in and fixed beforehand, and the solution is (usually) limited to exactly one. In many cases, around 30 cells are fixed, so the problem is to fill in the rest of the cells (around 50 cells, since the board has 81 cells in total).

**Tip to solve number place (for humans)**

If you investigate algorithms for solving number place, you will soon find plenty of advice from your friends, books, or the internet.
**DO NOT FOLLOW IT**, as that advice is meant for humans like you.  

Here, the solver is a computer, which is designed to remember many things exactly for a long time. 

So, in this section, we develop an algorithm to solve number place based on the problem-solving approach, just as we did for the Knight's tour and N Queens.




# Brute-force search of number place

There are many ways to design a brute-force approach.

**Plan A**

If we use only one of the three conditions when generating solution candidates, the number of candidates would be (9!)<sup>9</sup>: each of the nine rows (or columns, or blocks) is one of the 9! arrangements of 1-9. Since the conditions are similar, the number is the same regardless of which one you pick.

Verifying the remaining two conditions probably costs about (9 x 9) x 2 operations per candidate. So in total, the required computation cost would be (9!)<sup>9</sup> x 2 x 9<sup>2</sup>.

In reality, there are only around 50 cells to fill, so the amount is much smaller, yet it is still a huge number. 

**Plan B**

Assume that the number of empty cells is K (around 50). Since each cell takes one of the figures 1-9, the number of solution candidates would be 9<sup>K</sup>. Each candidate must be verified against all three conditions, which costs about (9 x 9) x 3 operations.

**Plan C**

Since the number of solutions satisfying these three conditions is finite, if we could expand all of them in memory, solving a puzzle would be just a simple database query keyed by the fixed cells. The good point is that the query result should be unique (as it is a puzzle). So the time cost of this approach is constant. Of course, we need a huge amount of memory (how much? see Wikipedia).  

--  
Discuss which plan is the least feasible.  
You may also think of your own brute-force approaches.  


# Backtracking search for number place

The algorithm should be very similar to those for the Knight's tour and N Queens.

Of course, we need to write the code for the three conditions, but the structure of the procedure is exactly the same as before.

The one special piece of coding we need here is reading the problem (handling the fixed cells).

Note that the order in which empty cells are processed does not matter here (unlike in the advice given to humans!). A clever processing order might speed up the search, but writing such a program is troublesome, and it is probably unnecessary for computers, since a computer can remember many things at the same time.






# C program of the backtracking search

**Purpose**

Develop a backtracking program to solve number place. The problem is fed to the program as a text file.

**Explanation**

We adopt DFS with recursive calls to generate solution candidates.

**Program**

Lines 105-108 mark the conditions on the way down, and lines 112-115 restore the original state on backtracking.

The empty cells are collected into the celltoexamine[] array while the problem is read from a text file.

checkx[x][] records which figures are already used in the column at position x, and checky[y][] does the same for the row at position y.  
To express the condition on a 3x3 block, we use checkb[b][], where b is given by b = (x / 3) + (y / 3) * 3.

**Remarks**

To measure the elapsed time, we use the gettimeofday() function (on Linux). Unlike the time command, it reports only wall-clock (real) time. 

The verboselevel variable is set by the second command-line argument, which is required. It should be 0 (minimum output), 1 (announce each answer), or 2 (also show each solved board).



In [None]:
%%writefile NumberPlace_E.c 
// Number Place, a.k.a. Sudoku by backtrack method
//  kameda[at]ccs.tsukuba.ac.jp, 2020.
#include <stdio.h>
#include <stdlib.h> // atoi()
#include <string.h> // strcmp()
#include <sys/time.h> // gettimeofday()

#define N 9 // Sudoku board size

typedef struct {int x; int y;} vec2i; // 2D coordinates

int board[N][N];       // 0 ... undetermined, 1-9 ... answers
int numlockedcell = 0; // number of locked cells (given as the problem)

vec2i celltoexamine[N*N]; // List of empty cells in the problem
int numemptycell = 0;     // Number of empty cells. numlockedcell + numemptycell = N * N

// three kinds of the conditions. binary representation.
int checkx[N][N]; // if figure f(1-9) has been already used on line-x,  checkx[x][f-1] = 1, 0 otherwise
int checky[N][N]; // if figure f(1-9) has been already used on line-y,  checky[y][f-1] = 1, 0 otherwise
int checkb[N][N]; // if figure f(1-9) has been already used on block-b, checkb[b][f-1] = 1, 0 otherwise

int num_answer = 0; // Number of the answers (expected to be 1 for Sudoku)
int verboselevel = 0; // Verbose level (0 minimum, 1 show answers, 2 show boards)


// Return block-ID from (x,y) coord
int xy2b(int x, int y) {
    return ((x / 3) + (y / 3) * 3); // Based on the C language syntax
}

// Show board
//   Empty cell is denoted by '.'
void showboard(void){
    vec2i pt;

    for (pt.y = 0; pt.y < N; pt.y++) {
        printf("%d ", pt.y);
        for (pt.x = 0; pt.x < N; pt.x++) {
            if (board[pt.x][pt.y] != 0) {
                printf("%d", board[pt.x][pt.y]);
            } else {
                printf(".");
            }
        }
        printf("\n");
    }
    return ;
}

//  Read the board (the problem) info from a file.
//    If "-" is specified, read the data from stdin.
int readboard(char *filename){
    FILE *fd;
    vec2i pt;
	
    if (strcmp(filename, "-") == 0) {
        fd = stdin;
    } else if ((fd = fopen(filename, "r")) == NULL) {
        printf("readboard: cannot open %s\n", filename);
        return -1;
    }

    for (pt.y = 0; pt.y < N; pt.y++) {
        for (pt.x = 0; pt.x < N; pt.x++) {
            if (fscanf(fd, "%d", &(board[pt.x][pt.y])) != 1) {
                printf("readboard: error at (x,y) = (%d, %d)\n", pt.x, pt.y);
                return -2;
            }
            if (board[pt.x][pt.y] < 1 || board[pt.x][pt.y] > 9) {
                board[pt.x][pt.y] = 0; // cleaning up to zero / foolproof
            }
            if (board[pt.x][pt.y] == 0) {
                celltoexamine[numemptycell++] = pt; // Append this empty cell to the list
            }
        }
    }
    fclose(fd);

    showboard();
    printf("readboard: empty cells = %d\n", numemptycell);

    return 0;
}

// Find the right figure (1-9) for the pos-th empty cell
void sudokuonestep(int pos){
    int x, y, b;
    int f;
	
    // Reached an answer
    if (pos == numemptycell) {
        if (verboselevel >= 1) printf("======== %d ========\n", num_answer);
        if (verboselevel >= 2) showboard();
        num_answer++;
        return ;
    }
	
    x = celltoexamine[pos].x;
    y = celltoexamine[pos].y;
    b = xy2b(x, y);
    for (f = 1; f <= N; f++) {
        if (checkx[x][f-1] == 0 && checky[y][f-1] == 0 && checkb[b][f-1] == 0) {
            checkx[x][f-1] = 1; // Mark
            checky[y][f-1] = 1; // Mark
            checkb[b][f-1] = 1; // Mark
            board[x][y] = f;    // Record
            
            sudokuonestep(pos+1);
            
            checkx[x][f-1] = 0; // Unmark
            checky[y][f-1] = 0; // Unmark
            checkb[b][f-1] = 0; // Unmark
            board[x][y] = 0;    // Clear
        }
    }
}

// Main function
int main(int argc, char *argv[]){
    int r = 0;
    int i, f;
    int x, y, b;
    struct timeval ts, te;

    // examine options
    if (argc != 3) {
        printf("This command needs Problem-file and Verbose-Level.\n");
        printf("Please set a problem file and verbose level (0 or upper).\n");
        return -1;
    }

    // Read file
    if ((r = readboard(argv[1])) != 0) {
        printf("Error: file reading status : %d\n", r);
        return r;
    }

    // verbose level
    verboselevel = atoi(argv[2]);

    // Initialization of the conditions
    for (i = 0; i < N; i++) {
        for (f = 0; f < N; f++) {
            checkx[i][f] = checky[i][f] = checkb[i][f] = 0;
        }
    }

    // Mark the figures that are set in the problem
    for (x = 0; x < N; x++) {
        for (y = 0; y < N; y++) {
            if (board[x][y] != 0) {
                b = xy2b(x, y);
                checkx[x][board[x][y]-1] = checky[y][board[x][y]-1] = checkb[b][board[x][y]-1] = 1;
            }
        }
    }

    // Go	
    gettimeofday(&ts, NULL);
    sudokuonestep(0);
    gettimeofday(&te, NULL);

    // Show the result
    printf("===================\n");
    printf("Number of the answers: %d\n", num_answer);
    printf("Time = %.6f [sec]\n", (float)(te.tv_sec - ts.tv_sec) + (te.tv_usec - ts.tv_usec) / 1000000.0);

    return 0;
}


Compile it and check that there are no errors.

In [None]:
!gcc -Wall -o NumberPlace_E NumberPlace_E.c

Before running the program, we need to prepare a problem file.  
Pick p5252.txt from the Problem files section below as input, and run its cell to save it as a file on the system.

According to [the original site of the problem](https://www.sudoku.name/index-jp.php), it is classified as "top level ++". It is probably very difficult for humans to solve.





In [None]:
!time ./NumberPlace_E p5252.txt 2

Note that it usually takes less than 1 second (Wow!). The time does not depend on the problem level, but rather on the number of empty cells. 

# Problem files

The problem file should be prepared as plain text.

The first 9 rows of figures express the problem.  
An empty cell is expressed by 0. 

After reading the 81 figures, the program ignores the rest of the file.
You can write additional comments there.


In [None]:
%%writefile p5252.txt
0 2 0  9 0 0  0 0 6
1 0 0  0 0 0  0 9 0
0 0 8  0 0 7  0 0 0

5 0 0  0 0 0  7 0 0
0 0 7  8 0 0  1 0 0
0 0 9  4 0 0  0 0 2

0 0 0  2 0 0  4 0 0
0 6 0  0 0 0  0 0 3
3 0 0  0 0 5  0 8 0

Taken from:
http://www.sudoku.name/puzzles/jp/5252/f0fcf351df4eb6786e9bb6fc4e2dee02#5252
High level ++

Result

0 .2.9....6
1 1......9.
2 ..8..7...
3 5.....7..
4 ..78..1..
5 ..94....2
6 ...2..4..
7 .6......3
8 3....5.8.
readboard: empty cells = 59
======== 0 ========
0 423918576
1 176523894
2 958647321
3 512396748
4 647852139
5 839471652
6 781239465
7 265784913
8 394165287
===================
Number of the answers: 1
Time = 0.029590 [sec]


In [None]:
%%writefile p22833.txt
0 0 0  0 9 0  0 0 0
0 8 9  0 0 0  3 4 0
7 0 0  8 0 3  0 0 2

0 1 0  0 0 0  0 6 0
0 0 8  3 0 4  7 0 0
0 7 0  0 0 0  0 5 0

8 0 0  5 0 7  0 0 1
0 5 4  0 0 0  9 7 0
0 0 0  0 2 0  0 0 0

https://www.sudoku.name/puzzles/jp/22833/42fe880812925e520249e808937738d2
#22833
High level ++


In [None]:
%%writefile p22833-cut1.txt
0 0 0  0 9 0  0 0 0
0 8 9  0 0 0  3 4 0
7 0 0  8 0 3  0 0 2

0 1 0  0 0 0  0 6 0
0 0 8  3 0 4  7 0 0
0 7 0  0 0 0  0 5 0

8 0 0  5 0 7  0 0 0
0 5 4  0 0 0  9 7 0
0 0 0  0 2 0  0 0 0

https://www.sudoku.name/puzzles/jp/22833/42fe880812925e520249e808937738d2
#22833 
High level ++
remove "1" at (9,7) and see how many answers are available

In [None]:
%%writefile p22833-cut4.txt
0 0 0  0 9 0  0 0 0
0 8 9  0 0 0  3 4 0
7 0 0  8 0 3  0 0 2

0 1 0  0 0 0  0 6 0
0 0 8  3 0 4  7 0 0
0 7 0  0 0 0  0 5 0

0 0 0  0 0 0  0 0 0
0 5 4  0 0 0  9 7 0
0 0 0  0 2 0  0 0 0

https://www.sudoku.name/puzzles/jp/22833/42fe880812925e520249e808937738d2
#22833 
High level ++
remove all the figures in the 7th row and see how many answers are available


# Problems

1. Relationship between computation time and problem level  
Prepare problems of at least three different difficulty levels and measure the computation times (to measure precisely, you should run each problem 5-10 times).
Discuss your findings on the relationship between computation time and problem level.

2. Unique answer  
By removing one or more fixed figures from a problem, you will usually see that the answer becomes non-unique (multiple answers are accepted). However, in some cases the answer is still unique. What is the aim of leaving such figures in the problem?

3. xy2b() function  
Explain why xy2b() can tell the appropriate 3x3 block by giving (x,y) position. You should mention C language property.

4. (9!)<sup>9</sup> x 2 x 9<sup>2</sup>  
Try to express (9!)<sup>9</sup> x 2 x 9<sup>2</sup> in the form of s x 10<sup>t</sup>. A rough estimate is acceptable.

5. X-SUDOKU  
There is a variation of number place where two diagonal lines have the same limitation too. It is called X-SUDOKU. Problems could be found at [Puzzle madness](https://puzzlemadness.co.uk/xsudoku). Develop the program to give the answer to X-SUDOKU problem based on NumberPlace_E.c program.



# **Course Info**

https://github.com/kameda-yoshinari/DataAlgo-T  
Course: Data structure and algorithm  
Department of Engineering Systems, University of Tsukuba, Japan.  
Author: KAMEDA, Yoshinari  
2020.05.19. -

# Memo

Answer example of X-SUDOKU (X-Number place).