# A* Search Algorithm

The A* (pronounced "A-star") search algorithm is a widely used pathfinding algorithm, common in artificial intelligence, robotics, and video games. A* extends Dijkstra's algorithm with a heuristic and finds the shortest path from a starting point to a goal in a graph, grid, or map.

Here's a high-level explanation of the A* search algorithm:

1. **Initialization:**

   - You start with a graph or grid representing your environment. Each location in this environment is called a "node" or "cell."
   - You have a starting node and a goal node. The goal is the destination you want to reach.

2. **Open List and Closed List:**

   - A* uses two lists: the "Open List" and the "Closed List."
   - The Open List holds nodes that need to be explored or considered.
   - The Closed List holds nodes that have already been explored.

3. **G and H Scores:**

   - Each node on the Open List has two values associated with it: G and H scores.
   - G (the "cost-so-far") represents the actual cost to reach a particular node from the starting node.
   - H (the "heuristic") is an estimate of the cost from the current node to the goal. It's a heuristic because it doesn't give you the exact cost but an estimation.

4. **F Score:**

   - The F score is the sum of G and H: `F = G + H`.
   - A* selects the node with the lowest F score from the Open List, as this is the node that is estimated to be the most promising.

5. **Expanding Nodes:**

   - For the selected node, you expand it by considering its neighbors (often called "successor" nodes).
   - Calculate G and H scores for each neighbor.
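   - If a neighbor is not on the Open or Closed List, add it to the Open List with the calculated F, G, and H scores, and record the selected node as its parent.
   - If a neighbor is already on the Open List but the new G score is lower, update its scores and parent to reflect the cheaper route.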

6. **Choosing the Next Node:**

   - The node with the lowest F score on the Open List becomes the next node to explore.
   - If the goal node is selected, the path is found, and you can reconstruct the path by tracing back from the goal node to the start node.

7. **Repeat:**

   - Continue this process until you reach the goal node, or the Open List becomes empty (indicating that there is no path to the goal).

8. **Optimality:**

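   - A* guarantees the shortest path, if one exists, provided the heuristic is admissible, that is, it never overestimates the true remaining cost. When a Closed List is used so that nodes are never re-expanded, the heuristic should also be consistent.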

A* uses a combination of the G and H scores to prioritize exploring nodes. The G score ensures that the algorithm explores paths with lower actual costs, and the H score guides it toward the goal. By considering the total cost (F score), A* efficiently finds the optimal path. However, it's important to choose an appropriate heuristic for the problem to ensure A* works effectively.
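The steps above can be sketched in plain Python. This is a minimal illustration, not the solver used later in this notebook: the grid format (`'#'` for walls, unit move cost), the function names `manhattan` and `astar_grid`, and the use of `heapq` as the Open List are all assumptions made for the example.

```python
import heapq

def manhattan(a, b):
    """Heuristic H: grid distance ignoring walls (never overestimates for 4-way moves)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_grid(grid, start, goal):
    """Return the shortest path as a list of (row, col) cells, or None if unreachable.

    grid is a list of strings; '#' marks a wall and every move costs 1.
    """
    open_heap = [(manhattan(start, goal), 0, start)]  # Open List entries: (F, G, node)
    g_score = {start: 0}   # best known cost-so-far (G) per node
    came_from = {}         # parent links used to rebuild the path
    closed = set()         # Closed List: nodes already expanded

    while open_heap:
        _, g, node = heapq.heappop(open_heap)  # node with the lowest F
        if node == goal:
            path = [node]
            while node in came_from:           # trace back to the start
                node = came_from[node]
                path.append(node)
            return path[::-1]
        if node in closed:
            continue                           # stale entry, already expanded
        closed.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue                       # outside the grid
            if grid[nr][nc] == '#':
                continue                       # wall
            new_g = g + 1
            if new_g < g_score.get((nr, nc), float('inf')):
                g_score[(nr, nc)] = new_g
                came_from[(nr, nc)] = node
                f = new_g + manhattan((nr, nc), goal)
                heapq.heappush(open_heap, (f, new_g, (nr, nc)))
    return None                                # Open List empty: no path

grid = ["....",
        ".##.",
        "...."]
path = astar_grid(grid, (0, 0), (2, 3))
print(len(path) - 1)  # 5 moves
```

Instead of updating an entry already on the Open List, the sketch pushes a new, cheaper entry and skips the stale one when it is popped; this is a common simplification when the Open List is a binary heap.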

In [1]:
# import the only module needed
from simpleai.search import SearchProblem, astar

### Data structure
To represent the irregular 6x6 sudoku board, two strings are used: one holds the numbers on the board and the other holds the structure of the board (the shape of the subgroups).
As an example, this document works with the following board:
<div style='text-align: center;'> <img src='https://i.imgur.com/ckQ8rn7.png' alt='Alt Text'  width='50%'> </div>

The representation of the numbers on the above board would be:

In [5]:
initial_board_str = '''0-0-0-0-1-0
0-3-0-0-2-0
0-2-0-0-0-0
0-0-0-0-5-0
0-0-0-1-0-0
0-6-5-2-0-0'''

initial_board_str

'0-0-0-0-1-0\n0-3-0-0-2-0\n0-2-0-0-0-0\n0-0-0-0-5-0\n0-0-0-1-0-0\n0-6-5-2-0-0'

In [6]:
# Arbitrary labelling of the jigsaw groups

initial_group_str = '''1-1-1-1-1-2
3-3-4-1-2-2
3-4-4-4-2-2
3-3-4-4-2-6
3-5-6-6-6-6
5-5-5-5-5-6'''

initial_group_str

'1-1-1-1-1-2\n3-3-4-1-2-2\n3-4-4-4-2-2\n3-3-4-4-2-6\n3-5-6-6-6-6\n5-5-5-5-5-6'

The number assigned to each group does not affect the result of the algorithm. The labels need not even be consecutive, since the algorithm uses them only to tell which cells belong to the same group.
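A short sketch can make this concrete. The helpers `cells_by_group` and `partition` are hypothetical names introduced only for this illustration and are not part of `sudokutils.py`; the sketch relabels the groups with arbitrary letters and checks that the resulting partition of cells is unchanged.

```python
from collections import defaultdict

def cells_by_group(group_str):
    """Map each group label to the set of (row, col) cells carrying it."""
    groups = defaultdict(set)
    for ir, row in enumerate(group_str.split('\n')):
        for ic, label in enumerate(row.split('-')):
            groups[label].add((ir, ic))
    return groups

def partition(group_str):
    """The grouping itself, with the labels thrown away."""
    return {frozenset(cells) for cells in cells_by_group(group_str).values()}

layout = '''1-1-1-1-1-2
3-3-4-1-2-2
3-4-4-4-2-2
3-3-4-4-2-6
3-5-6-6-6-6
5-5-5-5-5-6'''

# Same shapes, but with arbitrary, non-consecutive labels.
relabelled = layout.translate(str.maketrans('123456', 'ACEGIK'))

print(partition(layout) == partition(relabelled))        # True
print(all(len(cells) == 6 for cells in partition(layout)))  # True
```

The second check also confirms that every group in this layout covers exactly six cells, as a 6x6 jigsaw sudoku requires.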

# Sudoku Utilities
This section collects the helper functions the algorithm uses to manage states when the solution is framed as a search problem. They live in the `sudokutils.py` file of the source code available on [GitHub](https://github.com/Bielos/irregular-sudoku-a-star).

In [9]:
# 'string_to_list' and 'list_to_string' convert the string representation of a state
# to a matrix (list of lists) and back, which makes the data structure easier to handle.

def string_to_list(my_string):
    # Return a real list of lists so rows can be indexed and iterated more than once.
    return [row.split('-') for row in my_string.split('\n')]

def list_to_string(my_list):
    # str.join glues the cells of each row with '-' and the rows with '\n'.
    return '\n'.join('-'.join(row) for row in my_list)
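As a quick check of the round trip, the sketch below repeats the two helpers so that it runs on its own; it assumes `string_to_list` returns a list of lists, which is what allows a single cell to be edited by index. The two-row board is made up for the example.

```python
def string_to_list(my_string):
    # List of rows, each row a list of cell strings.
    return [row.split('-') for row in my_string.split('\n')]

def list_to_string(my_list):
    return '\n'.join('-'.join(row) for row in my_list)

board = '0-0-0-0-1-0\n0-3-0-0-2-0'

# Converting to a matrix and back gives the original string.
print(list_to_string(string_to_list(board)) == board)  # True

rows = string_to_list(board)
rows[0][0] = '4'               # fill the top-left cell
print(list_to_string(rows))
# 4-0-0-0-1-0
# 0-3-0-0-2-0
```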

To keep track of the empty space being worked on, the '0' in that space is converted to 'X'. The function below returns the position of the cell where 'X' is located.

In [10]:
# Find the location of the current position marker ('X') on the board.
# Returns a tuple: row, column (or None if the board has no 'X').

def find_actual_position(board):
    rows = string_to_list(board)  # parse the board passed in, not the global initial board
    for ir, row in enumerate(rows):
        for ic, element in enumerate(row):
            if element == 'X':
                return ir, ic
    return None
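The marking step itself is not shown in this section, so the following is only a sketch under an assumption: that the cell to mark is the first empty one in reading order. `mark_first_empty` and `locate` are hypothetical helpers written for this example; `locate` mirrors the search above but reads the board it is given.

```python
def mark_first_empty(board):
    """Replace the first '0' (reading order) with 'X'. Hypothetical helper."""
    # Cells hold single characters, so '0' can only be an empty cell.
    return board.replace('0', 'X', 1)

def locate(board, marker='X'):
    """Return (row, column) of the marker, or None if it is absent."""
    for ir, row in enumerate(board.split('\n')):
        for ic, element in enumerate(row.split('-')):
            if element == marker:
                return ir, ic
    return None

board = '5-3-0-0-1-0\n0-4-0-0-2-0'
marked = mark_first_empty(board)
print(marked.split('\n')[0])  # 5-3-X-0-1-0
print(locate(marked))         # (0, 2)
print(locate(board))          # None, nothing is marked yet
```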

In [11]:
# This function receives a board and a row, and returns the numbers that can still be placed in that row.
# Returns a list of strings

def possible_nos_in_row(board, a_row):
    rows = string_to_list(board)  # parse the board passed in, not the global initial board
    elements = []
    for ir, row in enumerate(rows):
        for ic, element in enumerate(row):
            if ir == a_row:
                elements.append(element)

    # Keep every digit from 1 to 6 that does not yet appear in the row.
    results = []
    for i in range(1,7):
        if str(i) not in elements:
            results.append(str(i))

    return results

In [12]:
# This function receives a board and a column, and returns the numbers that can still be placed in that column.
# Returns a list of strings

def possible_nos_in_col(board, a_col):
    rows = string_to_list(board)  # parse the board passed in, not the global initial board
    elements = []
    for ir, row in enumerate(rows):
        for ic, element in enumerate(row):
            if ic == a_col:
                elements.append(element)

    # Keep every digit from 1 to 6 that does not yet appear in the column.
    results = []
    for i in range(1,7):
        if str(i) not in elements:
            results.append(str(i))

    return results
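One way the row and column checks could be combined is to intersect their results for a single cell. The sketch below is an illustration only: `digits_missing` and `candidates_row_col` are hypothetical names, and how the original solver merges the checks is not shown in this section.

```python
def digits_missing(cells, size=6):
    """Digits 1..size (as strings) that do not appear in cells."""
    return [str(d) for d in range(1, size + 1) if str(d) not in cells]

def candidates_row_col(board, a_row, a_col):
    """Digits allowed at (a_row, a_col) by the row and column rules only."""
    rows = [line.split('-') for line in board.split('\n')]
    row_ok = digits_missing(rows[a_row])
    col_ok = digits_missing([row[a_col] for row in rows])
    return [d for d in row_ok if d in col_ok]

board = '''0-0-0-0-1-0
0-3-0-0-2-0
0-2-0-0-0-0
0-0-0-0-5-0
0-0-0-1-0-0
0-6-5-2-0-0'''

# Row 5 already holds 6, 5, 2 and column 4 already holds 1, 2, 5.
print(candidates_row_col(board, 5, 4))  # ['3', '4']
```

The group constraint would narrow this list further; it is left out here because it depends on the group layout string.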

In [None]:
# This function receives a board, the group layout and a group label, and returns a list of valid numbers for that group.
# groups: A string with the same shape as the board, giving the group label of each cell.
# group: An integer representing the group number for which you want to find valid numbers.
def possible_nos_in_group(board, groups, group):
    