***FCIM.FIA - Fundamentals of Artificial Intelligence***

> **Lab 3:** *Domains, Constraints* \\
> **Performed by:** *Bajenov Sevastian*, group *FAF-213* \\
> **Verified by:** Elena Graur, asist. univ.

## Imports and Utils

Create a virtual environment, install all the necessary dependencies so that you can run the notebook using your virtual environment as a kernel.

In [1]:
# pip install -r requirements.txt

The following sudoku solving algorithms were testes using `puzzles/grid1.txt` file, however it is possible to choose another input board (both simpler and harder options are present).

## Task 1

A `backtracking algorithm` is a problem-solving algorithm that uses a brute force approach for finding the desired output. The `Brute force approach` tries out all the possible solutions and chooses the desired/best solutions. The term backtracking suggests that if the current solution is not suitable, then backtracking happens and other solutions are tried. Thus, recursion is used in this approach.

In the context of solving sudoku boards backtracking can be used in a very straightforward way. The algorithm tries to fill empty squares on the board, exploring different game paths. If at one point it is impossible to place any number in the square, the algorithm returns to the last step where a number was placed and chooses another option. In this way all the squares on the board are being filled.

The code below implements basic `backtracking algorithm`. The initial board is being loaded from the text file, empty cells are being identified using `*` symbols (they are further replaced by zeros). The `is_valid_number` method plays crucial role in deciding whether it is possible to fill the square or not.

In [99]:
import time

ZERO_DELIMITER = '*'


def load_sudoku(file_path):
    board = []

    with open(file_path, 'r') as file:
        for line in file:
            row = [int(char) if char !=
                   ZERO_DELIMITER else 0 for char in line.strip()]
            board.append(row)

    return board


def print_board(board):
    for row in board:
        print(" ".join(str(num) for num in row))
    print()


def is_valid_number(num, pos, board):
    row, col = pos

    for i in range(9):
        if board[row][i] == num or board[i][col] == num:
            return False

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num:
                return False

    return True


def find_empty_cell(board):
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == 0:
                return (i, j)

    return None


def solve_backtrack(board):
    empty_cell = find_empty_cell(board)

    if not empty_cell:
        return True
    else:
        row, col = empty_cell

    for i in range(1, 10):
        if is_valid_number(i, (row, col), board):
            board[row][col] = i

            if solve_backtrack(board):
                return True

            board[row][col] = 0

    return False

board = load_sudoku("./puzzles/grid1.txt")

print("Initial board:")
print_board(board)

start_time = time.time()
solve_backtrack(board)
end_time = time.time()

print("Elapsed time: {:.4f} millis\n".format((end_time - start_time) * 1000))

print("Solved board:")
print_board(board)

Initial board:
0 0 2 0 0 0 0 0 0
8 7 1 0 5 0 0 9 0
3 0 0 0 8 9 2 0 0
0 0 6 5 0 0 9 0 0
0 5 0 0 9 0 0 0 0
7 0 0 0 0 0 0 0 0
6 0 7 8 2 0 0 3 0
0 2 0 3 0 4 0 8 6
0 0 8 9 0 0 7 5 2

Elapsed time: 726.3079 millis

Solved board:
5 9 2 1 4 3 6 7 8
8 7 1 2 5 6 3 9 4
3 6 4 7 8 9 2 1 5
1 4 6 5 3 8 9 2 7
2 5 3 4 9 7 8 6 1
7 8 9 6 1 2 5 4 3
6 1 7 8 2 5 4 3 9
9 2 5 3 7 4 1 8 6
4 3 8 9 6 1 7 5 2



## Task 2-3

One of the ways to optimize the basic backtracking algorithm is to apply the concepts of `Domain Reduction` and `Constraint Propagation`. `Constraint propagation` is a fundamental concept in constraint satisfaction problems (CSPs). A CSP involves variables that must be assigned values from a given domain while satisfying a set of constraints. Constraint propagation aims to simplify these problems by reducing the domains of variables, thereby making the search for solutions more efficient.

`Variables` are the elements that need to be assigned values. `Domains` represent possible values that can be assigned to the variables. `Constraints` are the rules that define permissible combinations of values for the variables. `Constraint propagation` works by iteratively narrowing down the domains of variables based on the constraints. This process continues until no more values can be eliminated from any domain. The primary goal is to reduce the search space and make it easier to find a solution. 

In my implementation the domains of each sudoku square are first inititalized (an empty square has the domain [1-9] and the square which contains a number has the domain of exactly that number) and then reduced using the `propagate_constarints` method. This method verifies whether it is possible to place each number from the domain into a particular square using the `is_valid_number` method and discards invalid domain members. Below is presented a demonstration of the domain reduction technique. It can be easily observed that the domains are significantly reduced which will further help decrease the number of computations performed to check possible numbers to be inserted.

In [83]:
ZERO_DELIMITER = '*'


def load_sudoku(file_path):
    board = []

    with open(file_path, 'r') as file:
        for line in file:
            row = [int(char) if char !=
                   ZERO_DELIMITER else 0 for char in line.strip()]
            board.append(row)

    return board

def print_board(board):
    for row in board:
        print(" ".join(str(num) for num in row))
    print()


def initialize_domains(board):
    domains = {}

    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                domains[(i, j)] = set(range(1, 10))
            else:
                domains[(i, j)] = {board[i][j]}

    return domains


def propagate_constraints(domains, board):
    for domain_key in domains.keys():
        if board[domain_key[0]][domain_key[1]] == 0:
            for num in range(1, 10):
                if not is_valid_number(num, domain_key, board):
                    domains[domain_key].discard(num)

board = load_sudoku("./puzzles/sudoku.txt")

print("Initial board:")
print_board(board)

domains = initialize_domains(board)
print(f"Initial domains: {domains}")
print()

propagate_constraints(domains, board)
print(f"Reduced domains: {domains}")

Initial board:
0 0 0 0 0 4 0 9 0
8 0 2 9 7 0 0 0 0
9 0 1 2 0 0 3 0 0
0 0 0 0 4 9 1 5 7
0 1 3 0 5 0 9 2 0
5 7 9 1 2 0 0 0 0
0 0 7 0 0 2 6 0 3
0 0 0 0 3 8 2 0 5
0 2 0 5 0 0 0 0 0

Initial domains: {(0, 0): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 1): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 2): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 3): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 4): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 5): {4}, (0, 6): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (0, 7): {9}, (0, 8): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (1, 0): {8}, (1, 1): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (1, 2): {2}, (1, 3): {9}, (1, 4): {7}, (1, 5): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (1, 6): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (1, 7): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (1, 8): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (2, 0): {9}, (2, 1): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (2, 2): {1}, (2, 3): {2}, (2, 4): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (2, 5): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (2, 6): {3}, (2, 7): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (2, 8): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (3, 0): {1, 2, 3, 4, 5, 6, 7, 8, 9}, (3, 1): 

In order to integrate the domain reduction technique into the initial backtracking algorithm it is necessary, first of all, to add `forward_checking` method. It is very similar to the `propagate_constraints` method, however, it enforces the constraints only in the vicinity of the square which was filled with a number, i.e in the same box, column and row, not the whole board. With that in my mind, the `solve_advanced` method can be introduced now. Instead of checking numbers from 1 to 9 for each empty cell, it only checks the numbers from the cell's domain. Moreover, after placing the number into the cell it performs forward checking and then starts recursive verification. Having been forced to change the domains during the recursive check, I came to the decision to keep the copy of the domains before calling the `solve_advanced` recursively in order to restore them if the game path will not lead anywhere.

Below is presented the code for the for the updated backtracking algorithm and its timing. It can be observed that the algorithm performs slightly worse than the basic approach. It can happen due to the fact that the domain copy operation is too expensive because it is being performed at each iteration. Moreover, before running the algorithm it is necessary to propagate the constraints according to the initial domains; it also takes some time.

In [98]:
import time

ZERO_DELIMITER = '*'


def load_sudoku(file_path):
    board = []

    with open(file_path, 'r') as file:
        for line in file:
            row = [int(char) if char !=
                   ZERO_DELIMITER else 0 for char in line.strip()]
            board.append(row)

    return board


def print_board(board):
    for row in board:
        print(" ".join(str(num) for num in row))
    print()


def initialize_domains(board):
    domains = {}

    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                domains[(i, j)] = set(range(1, 10))
            else:
                domains[(i, j)] = {board[i][j]}

    return domains


def propagate_constraints(domains, board):
    for domain_key in domains.keys():
        if board[domain_key[0]][domain_key[1]] == 0:
            for num in range(1, 10):
                if not is_valid_number(num, domain_key, board):
                    domains[domain_key].discard(num)


def forward_checking(row, col, num, domains):
    for i in range(9):
        if (row, i) in domains:
            domains[(row, i)].discard(num)
        if (i, col) in domains:
            domains[(i, col)].discard(num)

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y*3, box_y*3 + 3):
        for j in range(box_x*3, box_x*3 + 3):
            if (i, j) in domains:
                domains[(i, j)].discard(num)


def is_valid_number(num, pos, board):
    row, col = pos

    for i in range(9):
        if board[row][i] == num or board[i][col] == num:
            return False

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num:
                return False

    return True


def find_empty_cell(board):
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == 0:
                return (i, j)

    return None


def solve_advanced(board, domains):
    empty_cell = find_empty_cell(board)

    if not empty_cell:
        return True

    row, col = empty_cell

    for num in list(domains[(row, col)]):
        if is_valid_number(num, (row, col), board):
            board[row][col] = num
            local_domains = {key: value.copy() for key, value in domains.items()}
            forward_checking(row, col, num, domains)

            if solve_advanced(board, domains):
                return True

            board[row][col] = 0
            domains = local_domains
            
    return False

board = load_sudoku("./puzzles/grid1.txt")

print("Initial board:")
print_board(board)

start_time = time.time()
domains = initialize_domains(board)
propagate_constraints(domains, board)
solve_advanced(board, domains)
end_time = time.time()

print("Elapsed time: {:.4f} millis\n".format((end_time - start_time) * 1000))

print("Solved board:")
print_board(board)

Initial board:
0 0 2 0 0 0 0 0 0
8 7 1 0 5 0 0 9 0
3 0 0 0 8 9 2 0 0
0 0 6 5 0 0 9 0 0
0 5 0 0 9 0 0 0 0
7 0 0 0 0 0 0 0 0
6 0 7 8 2 0 0 3 0
0 2 0 3 0 4 0 8 6
0 0 8 9 0 0 7 5 2

Elapsed time: 2834.0967 millis

Solved board:
5 9 2 1 4 3 8 6 7
8 7 1 2 5 6 3 9 4
3 6 4 7 8 9 2 1 5
1 4 6 5 3 7 9 2 8
2 5 3 4 9 8 6 7 1
7 8 9 6 1 2 5 4 3
6 1 7 8 2 5 4 3 9
9 2 5 3 7 4 1 8 6
4 3 8 9 6 1 7 5 2



## Task 4

In order to further optimize the sudoku solving algorithm it is possible to use `heuristic approach`. I decided to use two heuristics to integrate into the `solve_advanced` method: `Minimum Remaining Value (MRV)` and `Least Constraining Value (LCV)`. `MRV's` main idea is to choose the empty square with the fewest possible values in its domain instead of an arbitrary square. `LCV` aims at choosing a value for a particular square that rules out the smallest number of values in variables connected to the current variable by constraints.

Practically the heuristics are applied as follows: `MRV` enhances `find_empty_cell` method and `LCV` optimizes the for loop inside of the `solve_advanced` method (the iteration happens over the list provided by the `order_domains_lcv` method, which sorts the domain values from least constraining to the most constraining). The result of this approach is a significant reduction of the sudoku solving speed comparing to the initial approach. Moreover, the time spent on solving sudoku remains almost constant even if the board becomes more difficult to solve. An example of execution is presented below as well as the code implementation:

In [105]:
import time

ZERO_DELIMITER = '*'


def load_sudoku(file_path):
    board = []

    with open(file_path, 'r') as file:
        for line in file:
            row = [int(char) if char !=
                   ZERO_DELIMITER else 0 for char in line.strip()]
            board.append(row)

    return board


def print_board(board):
    for row in board:
        print(" ".join(str(num) for num in row))
    print()


def initialize_domains(board):
    domains = {}

    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                domains[(i, j)] = set(range(1, 10))
            else:
                domains[(i, j)] = {board[i][j]}

    return domains


def propagate_constraints(domains, board):
    for domain_key in domains.keys():
        if board[domain_key[0]][domain_key[1]] == 0:
            for num in range(1, 10):
                if not is_valid_number(num, domain_key, board):
                    domains[domain_key].discard(num)


def forward_checking(row, col, num, domains):
    for i in range(9):
        if (row, i) in domains:
            domains[(row, i)].discard(num)
        if (i, col) in domains:
            domains[(i, col)].discard(num)

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y*3, box_y*3 + 3):
        for j in range(box_x*3, box_x*3 + 3):
            if (i, j) in domains:
                domains[(i, j)].discard(num)


def is_valid_number(num, pos, board):
    row, col = pos

    for i in range(9):
        if board[row][i] == num or board[i][col] == num:
            return False

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num:
                return False

    return True


def find_empty_cell_mrv(domains, board):
    min_domain_size = float('inf')
    best_cell = None

    for cell in domains:
        if board[cell[0]][cell[1]] == 0:
            domain_size = len(domains[cell])

            if domain_size < min_domain_size:
                min_domain_size = domain_size
                best_cell = cell

    return best_cell


def order_domains_lcv(var, domains):
    def count_constraints(value):
        count = 0
        row, col = var

        for i in range(9):
            if (row, i) in domains and value in domains[(row, i)]:
                count += 1

        for i in range(9):
            if (i, col) in domains and value in domains[(i, col)]:
                count += 1

        box_x = col // 3
        box_y = row // 3

        for i in range(box_y * 3, box_y * 3 + 3):
            for j in range(box_x * 3, box_x * 3 + 3):
                if (i, j) in domains and value in domains[(i, j)]:
                    count += 1

        return count

    return sorted(domains[var], key=count_constraints)


def solve_advanced(board, domains):
    empty_cell = find_empty_cell_mrv(domains, board)

    if not empty_cell:
        return True

    row, col = empty_cell

    for num in order_domains_lcv((row, col), domains):
        if is_valid_number(num, (row, col), board):
            board[row][col] = num
            local_domains = {key: value.copy() for key, value in domains.items()}
            forward_checking(row, col, num, domains)

            if solve_advanced(board, domains):
                return True

            board[row][col] = 0
            domains = local_domains
            
    return False

board = load_sudoku("./puzzles/grid1.txt")

print("Initial board:")
print_board(board)

start_time = time.time()
domains = initialize_domains(board)
propagate_constraints(domains, board)
solve_advanced(board, domains)
end_time = time.time()

print("Elapsed time: {:.4f} millis\n".format((end_time - start_time) * 1000))

print("Solved board:")
print_board(board)

Initial board:
0 0 2 0 0 0 0 0 0
8 7 1 0 5 0 0 9 0
3 0 0 0 8 9 2 0 0
0 0 6 5 0 0 9 0 0
0 5 0 0 9 0 0 0 0
7 0 0 0 0 0 0 0 0
6 0 7 8 2 0 0 3 0
0 2 0 3 0 4 0 8 6
0 0 8 9 0 0 7 5 2

Elapsed time: 11.3974 millis

Solved board:
9 6 2 7 3 1 5 4 8
8 7 1 4 5 2 6 9 3
3 4 5 6 8 9 2 7 1
1 8 6 5 4 3 9 2 7
2 5 3 1 9 7 8 6 4
7 9 4 2 6 8 3 1 5
6 1 7 8 2 5 4 3 9
5 2 9 3 7 4 1 8 6
4 3 8 9 1 6 7 5 2



## Task 5-6

Before implementing `sudoku generator` I decided to create a complete `validator` for the sudoku boards. I took into consideration several important criteria which covered not only impossible number placing but also multiple solutions cases. The criteria having been implemented are as follows:

1. **Empty** - The ultimate in `not enough input`, this puzzle contains no givens at all. To say that it has multiple solutions is something of an overstatement.
2. **Single Given** - Originally described as `Zen Sudoku`, this setup can be considered either as the one with multiple solutions or the one with not enough starting values.
3. **Insufficient Givens** - This puzzle has only sixteen givens, which is one less than the accepted minimum number for a classic Sudoku puzzle.
4. **Duplicate Given (Box, Column, Row)** - The puzzle contains duplicate values within a 3x3 box, column or row.
5. **Unsolvable Square** - It is impossible to place any value into a particular square.
6. **Unsolvable Box, Column or Row** - The puzzle 3x3 box, column or row in which it is impossible to place a certain value.
7. **Multiple Solutions** - The puzzle has more than one valid solution.

The given validation criteria are described in more details in the article from the `[4]` bibliography link. The code below is the implmentation of the sudoku validator functionality. The `verify_board` method prints the validation error, if any occur, or just an empty string. It is also worth mentioning that `check_multiple_solutions` method uses `solve_advanced` method from the sudoku solver functionality in order to optimize solution checking.

In [4]:
from sudoku_solver import load_sudoku, print_board, solve_advanced, initialize_domains, initialize_neighbors

MINIMAL_GIVENS = 17


def verify_board(board, domains, neighbors):
    empty_board_check, empty_board_msg = check_empty_board(board)
    single_given_check, single_given_msg = check_single_given(board)
    insufficient_givens_check, insufficient_givens_msg = check_insufficient_givens(
        board)
    duplicates_check, duplicates_msg = check_duplicates(board)
    square_solvability_check, square_solvability_msg = check_unsolvable_square(
        board, domains)
    box_column_row_solvability_check, box_column_row_solvability_msg = check_unsolvable_box_column_row(
        board, domains)
    multiple_solutions_check, multiple_solutions_msg = check_multiple_solutions(
        board, domains, neighbors)

    print(empty_board_msg, single_given_msg, insufficient_givens_msg, duplicates_msg,
          square_solvability_msg, box_column_row_solvability_msg, multiple_solutions_msg)

    sudoku_validity_checks = empty_board_check and single_given_check and \
        insufficient_givens_check and duplicates_check and square_solvability_check and \
        box_column_row_solvability_check and multiple_solutions_check

    return sudoku_validity_checks


def check_empty_board(board):
    if not all(cell == 0 for row in board for cell in row):
        return True, ""

    return False, "The board is empty."


def check_single_given(board):
    given_count = sum(cell != 0 for row in board for cell in row)

    if given_count == 1:
        return False, "Only one given value."

    return True, ""


def check_insufficient_givens(board):
    given_count = sum(cell != 0 for row in board for cell in row)

    if given_count < 17:
        return False, "Insufficient givens (less than 17)."

    return True, ""


def check_duplicates(board):
    for i in range(9):
        row_values = [board[i][j]
                      for j in range(9) if board[i][j] != 0]

        if len(row_values) != len(set(row_values)):
            return False, f"Duplicate values found in row {i + 1}."

        col_values = [board[j][i]
                      for j in range(9) if board[j][i] != 0]

        if len(col_values) != len(set(col_values)):
            return False, f"Duplicate values found in column {i + 1}."

    for box_row in range(0, 9, 3):
        for box_col in range(0, 9, 3):
            box_values = [
                board[r][c]
                for r in range(box_row, box_row + 3)
                for c in range(box_col, box_col + 3)
                if board[r][c] != 0
            ]

            if len(box_values) != len(set(box_values)):
                return False, f"Duplicate values found in 3x3 box starting at ({box_row + 1}, {box_col + 1})."

    return True, ""


def check_unsolvable_square(board, domains):
    for i in range(9):
        for j in range(9):
            if board[i][j] == 0 and len(domains[(i, j)]) == 0:
                return False, f"Unsolvable square at ({i + 1}, {j + 1})."

    return True, ""


def check_unsolvable_box_column_row(board, domains):
    for num in range(1, 10):
        for i in range(9):
            if num not in [board[i][j] for j in range(9)]:
                if not any(num in domains[(i, j)] for j in range(9) if board[i][j] == 0):
                    return False, f"Number {num} cannot be placed in row {i + 1}."

        for j in range(9):
            if num not in [board[i][j] for i in range(9)]:
                if not any(num in domains[(i, j)] for i in range(9) if board[i][j] == 0):
                    return False, f"Number {num} cannot be placed in column {j + 1}."

        for box_row in range(0, 9, 3):
            for box_col in range(0, 9, 3):
                if num not in [
                    board[r][c]
                    for r in range(box_row, box_row + 3)
                    for c in range(box_col, box_col + 3)
                ]:
                    if not any(
                        num in domains[(r, c)]
                        for r in range(box_row, box_row + 3)
                        for c in range(box_col, box_col + 3)
                        if board[r][c] == 0
                    ):
                        return False, f"Number {num} cannot be placed in 3x3 box starting at ({box_row + 1}, {box_col + 1})."

    return True, ""


def check_multiple_solutions(board, domains, neighbors):
    solution_count = solve_advanced(board, domains, neighbors,
                                    count_solutions=True)

    if solution_count > 1:
        return False, "Multiple solutions found."

    return True, ""

board = load_sudoku("./puzzles/invalid_board.txt")
print_board(board)

domains = initialize_domains(board)
neighbors = initialize_neighbors()

verify_board(board, domains, neighbors)

0 3 9 0 0 0 1 2 0
0 0 0 9 0 7 0 0 0
8 0 0 4 0 1 0 0 6
0 4 2 0 0 0 7 9 0
0 0 0 0 0 0 0 0 0
0 9 1 0 0 0 5 4 0
5 0 0 1 0 9 0 0 3
0 0 0 8 0 5 0 0 0
0 1 4 0 0 0 8 7 0

      Multiple solutions found.


False

The example which is used for demonstration actually has 2 solutions. The algorithm checks only until the second solution appears in order to avoid redundant computations.

Moving on to the `sudoku generation` logic, it can be noticed that it is rather simple and reuses previously implemented methods. The algorithm for generation is the following: an empty board is created and then solved using `solve_advanced`. After that, a certain number of random "holes" are placed on the board. This process repeats until the validator finds a unique solution for the generated sudoku. It is not necessary to verify all the validation criteria beacuse the sudoku is solved initially, thus meaning that there will not be situations with unsolvable columns, rows, etc.

The resulting puzzle may seem predictable because the solving algorithm works with an empty board, however the random placing of holes decreases generation predictability.

In [10]:
import random
from sudoku_solver import solve_backtrack, initialize_domains, initialize_neighbors
from sudoku_validator import check_multiple_solutions


def generate_full_board():
    board = [[0] * 9 for _ in range(9)]
    solve_backtrack(board)
    return board


def remove_numbers(board, num_holes):
    count = 0

    while count < num_holes:
        row = random.randint(0, 8)
        col = random.randint(0, 8)

        if board[row][col] != 0:
            board[row][col] = 0
            count += 1


def generate_sudoku(num_holes=40):
    board = generate_full_board()

    while True:
        board_copy = [row.copy() for row in board]
        board_check_copy = [row.copy() for row in board]
        remove_numbers(board, num_holes)

        domains = initialize_domains(board)
        neighbors = initialize_neighbors()

        if check_multiple_solutions(board_check_copy, domains, neighbors):
            return board
        else:
            board = board_copy

board = generate_sudoku()
print("Initial board:")
print_board(board)

domains = initialize_domains(board)
neighbors = initialize_neighbors()

solve_advanced(board, domains, neighbors)

print("Solved board:")
print_board(board)

Initial board:
1 0 3 4 5 0 0 8 0
0 5 0 7 8 0 0 0 0
7 0 0 1 2 0 0 5 6
2 1 0 3 6 0 0 9 0
0 0 5 8 9 0 0 1 4
0 0 0 0 1 4 0 6 5
0 3 0 0 0 0 9 7 8
0 4 0 9 7 8 5 3 1
0 0 0 5 0 1 0 0 2

Solved board:
1 2 3 4 5 6 7 8 9
4 5 6 7 8 9 1 2 3
7 9 8 1 2 3 4 5 6
2 1 4 3 6 5 8 9 7
3 6 5 8 9 7 2 1 4
8 7 9 2 1 4 3 6 5
5 3 1 6 4 2 9 7 8
6 4 2 9 7 8 5 3 1
9 8 7 5 3 1 6 4 2



## Task 7

The final change which will be made to the sudoku solving algorithm is the implementation of the `AC-3 (Arc Consistency)` constraint propagation. The `arc consistency algorithm` basically performs two main functions. The first function is to check (or to revise) a particular arc for consistency, i.e., removing those inconsistent values from the variable domain of the arc. The second function is to propagate domain modifications to other related constraints and bring them up to be rechecked.

The algorithm maintains a queue of arcs (or constraints) to be checked. Each arc in the queue is checked in turn and the algorithm terminates when the queue becomes empty. `AC-3` checks the consistency of each arc (or constraint) using the general `revise` procedure. It returns true or false depending on whether the domain `Di` is modified. `Revise` checks every value `x` in the domain `Di` to see whether there is a value `y` in the domain `Dj` that satisfies the constraint `Cij(x, y)`. If a value `y` is not found in `Dj`, `x` will be removed from `Di`. This process prunes the domain `Di`.

The code implementation below differs from the previous step only by the constraint propagation before the solving procedure itself. The algorithm still produces stable and relatively fast results but is slower than the implementation from `Task 4`. It may happen due to the fact that `AC-3` requires checking arcs for each possible pair of squares on the board, as well as creating the list of neighbors. Anyway the approach shows decent results when solving different sudoku boards.

In [119]:
import time

ZERO_DELIMITER = '*'


def load_sudoku(file_path):
    board = []

    with open(file_path, 'r') as file:
        for line in file:
            row = [int(char) if char !=
                   ZERO_DELIMITER else 0 for char in line.strip()]
            board.append(row)

    return board


def print_board(board):
    for row in board:
        print(" ".join(str(num) for num in row))
    print()


def initialize_domains(board):
    domains = {}

    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                domains[(i, j)] = set(range(1, 10))
            else:
                domains[(i, j)] = {board[i][j]}

    return domains


def initialize_neighbors():
    neighbors = {}

    for row in range(9):
        for col in range(9):
            neighbors[(row, col)] = get_neighbors((row, col))

    return neighbors


def get_neighbors(var):
    row, col = var
    neighbors = set()

    for i in range(9):
        if (row, i) != var:
            neighbors.add((row, i))
        if (i, col) != var:
            neighbors.add((i, col))

    box_x = col // 3
    box_y = row // 3
    for i in range(box_y*3, box_y*3 + 3):
        for j in range(box_x*3, box_x*3 + 3):
            if (i, j) != var:
                neighbors.add((i, j))

    return neighbors


def propagate_constraints_AC(board, domains, neighbors):
    queue = [(var, neighbor)
             for var in domains for neighbor in get_neighbors(var)]
    while queue:
        (xi, xj) = queue.pop(0)

        if revise(xi, xj, board, domains, neighbors):
            if len(domains[xi]) == 0:
                return False

            for xk in get_neighbors(xi):
                if xk != xj:
                    queue.append((xk, xi))


def revise(xi, xj, board, domains, neighbors):
    revised = False

    for x in set(domains[xi]):
        if not any(is_valid_number(x, (xi[0], xi[1]), board) for xi in neighbors[xj]):
            domains[xi].remove(x)
            revised = True

    return revised


def forward_checking(row, col, num, domains):
    for i in range(9):
        if (row, i) in domains:
            domains[(row, i)].discard(num)
        if (i, col) in domains:
            domains[(i, col)].discard(num)

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y*3, box_y*3 + 3):
        for j in range(box_x*3, box_x*3 + 3):
            if (i, j) in domains:
                domains[(i, j)].discard(num)


def is_valid_number(num, pos, board):
    row, col = pos

    for i in range(9):
        if board[row][i] == num or board[i][col] == num:
            return False

    box_x = col // 3
    box_y = row // 3

    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num:
                return False

    return True


def find_empty_cell_mrv(domains, board):
    min_domain_size = float('inf')
    best_cell = None

    for cell in domains:
        if board[cell[0]][cell[1]] == 0:
            domain_size = len(domains[cell])

            if domain_size < min_domain_size:
                min_domain_size = domain_size
                best_cell = cell

    return best_cell


def order_domains_lcv(var, domains):
    def count_constraints(value):
        count = 0
        row, col = var

        for i in range(9):
            if (row, i) in domains and value in domains[(row, i)]:
                count += 1

        for i in range(9):
            if (i, col) in domains and value in domains[(i, col)]:
                count += 1

        box_x = col // 3
        box_y = row // 3

        for i in range(box_y * 3, box_y * 3 + 3):
            for j in range(box_x * 3, box_x * 3 + 3):
                if (i, j) in domains and value in domains[(i, j)]:
                    count += 1

        return count

    return sorted(domains[var], key=count_constraints)


def solve_advanced(board, domains):
    empty_cell = find_empty_cell_mrv(domains, board)

    if not empty_cell:
        return True

    row, col = empty_cell

    for num in order_domains_lcv((row, col), domains):
        if is_valid_number(num, (row, col), board):
            board[row][col] = num
            local_domains = {key: value.copy() for key, value in domains.items()}
            forward_checking(row, col, num, domains)

            if solve_advanced(board, domains):
                return True

            board[row][col] = 0
            domains = local_domains
            
    return False

board = load_sudoku("./puzzles/grid1.txt")

print("Initial board:")
print_board(board)

start_time = time.time()
domains = initialize_domains(board)
neighbors = initialize_neighbors()
propagate_constraints_AC(board, domains, neighbors)
solve_advanced(board, domains)
end_time = time.time()

print("Elapsed time: {:.4f} millis\n".format((end_time - start_time) * 1000))

print("Solved board:")
print_board(board)

Initial board:
0 0 2 0 0 0 0 0 0
8 7 1 0 5 0 0 9 0
3 0 0 0 8 9 2 0 0
0 0 6 5 0 0 9 0 0
0 5 0 0 9 0 0 0 0
7 0 0 0 0 0 0 0 0
6 0 7 8 2 0 0 3 0
0 2 0 3 0 4 0 8 6
0 0 8 9 0 0 7 5 2

Elapsed time: 329.2139 millis

Solved board:
5 9 2 1 4 3 8 6 7
8 7 1 6 5 2 3 9 4
3 6 4 7 8 9 2 1 5
2 4 6 5 3 8 9 7 1
1 5 3 2 9 7 6 4 8
7 8 9 4 1 6 5 2 3
6 1 7 8 2 5 4 3 9
9 2 5 3 7 4 1 8 6
4 3 8 9 6 1 7 5 2



## Conclusions:

In this laboratory work I familiarized myself with the concepts of constraint specific problems (CSP) and the techniques for solving them, like domain reduction and constraint propagation. Having started from the traditional backtracking algorithm, I enhanced it using various approaches. In the process I learned some common heuristics which are used for CSP solution optimization and other techniques for constraint propagation, like AC-3, for instance. Moreover, I investigated the problem of generating and validating sudoku boards. Finally, I came to the conclusion that the most optimal way to solve sudoku puzzles and validate them is to use backtracking algorithm with default constraint propagation and MRV/LCV optimizations.

## Acknowledgements

In this laboratory work I was assisted by Sorin Iatco from FAF-213. He provided with some sample boards for testing my sudoku solving algorithms.

## Bibliography:

1. https://www.programiz.com/dsa/backtracking-algorithm
2. https://www.geeksforgeeks.org/constraint-propagation-in-ai/
3. https://www.cs.cornell.edu/courses/cs4700/2011fa/lectures/05_CSP.pdf
4. http://sudopedia.enjoysudoku.com/Invalid_Test_Cases.html#Not_Unique_.E2.80.94_2_Solutions
5. https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwj62JbfzKaJAxUMRvEDHe4GOyMQFnoECCwQAQ&url=https%3A%2F%2Fwww.cs.uic.edu%2F~liub%2Fteach%2Fcs511-spring-06%2Fcs511-CSP.doc&usg=AOvVaw2vKulI7WedEZOkL9BOS5s-&opi=89978449