# Harry Ellingham Submission for Coursework 1

## 0. Getting Started 

In [1]:
# Requirements: This script requires numpy (tested with numpy==2.2.1) 
# and the built-in modules typing, importlib and sys.

import importlib.util
import sys

def install_and_import(package: str):
    """
    Ensures a package is installed and imported.

    Parameters:
        package (str): The module name to import and install if missing.
    """
    if importlib.util.find_spec(package) is None:
        print(f"Installing {package}...")
        !{sys.executable} -m pip install {package}
    globals()[package] = __import__(package)

# Test if numpy is installed, else install. Note this function does not work for packages whose download 
# name and import name are different (e.g. scikit-learn and sklearn)
install_and_import("numpy")


import numpy as np
from typing import List, Tuple, Dict, Callable, Optional

The code was run on Python 3.12.0. Note that the validity of our returned Gray code relies on dictionaries preserving the insertion order of their keys, which is guaranteed by the language from Python 3.7 onwards.

## 1. Backtracking Background

A backtracking algorithm solves a problem by making sequential decisions in a depth-first search: it extends a partial solution as far as possible until it reaches an invalid state or a dead end, then backtracks to an earlier state and proceeds from there.

We will use backtracking to solve a variety of problems. Initially, we describe application-specific backtracking algorithms for each of the two cases. This is the order in which I solved the problem, and I think it's helpful to build the abstraction in these steps. We then extract the common elements to arrive at the algorithmic structure on which the final code is based.

In Sections 1.1 and 1.2, we introduce integer partitions and Gray codes, and explain a backtracking implementation for solving each problem.

Section 2 contains our code implementations, along with a measure of computational complexity: the number of recursive calls made while running the function.

Finally, Section 3 concludes with a critical review of the methods, explaining design principles including data structures, various implementation choices, and avenues for improvement of the code.


### 1.1 Integer Partitions
In the integer partitions problem, we want to find all the integer partitions of a given input number $n$: all distinct ways of writing $n$ as a sum of positive integers, where the order of the summands does not matter. For example, the number 3 has 3 partitions: [(1, 1, 1), (1, 2), (3)].
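As a sanity check on the definition, the number of partitions can be counted without any backtracking. The sketch below uses a standard coin-change style dynamic programme; the helper name `count_partitions` is our own and is not part of the coursework code.

```python
def count_partitions(n: int) -> int:
    """Count the integer partitions of n (hypothetical cross-checking helper)."""
    # ways[t] = number of partitions of t using only the part sizes seen so far
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


print([count_partitions(n) for n in range(1, 6)])  # [1, 2, 3, 5, 7]
```

These counts agree with the 3 partitions of 3 listed above, and give a reference value to compare any enumeration against.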

There are algorithms which generate the partitions directly, without ever producing a duplicate, but for the purposes of this coursework we propose and implement a backtracking solution. The backtracking approach to this problem is as follows:


> **IntegerPartitionBacktracking** :  
> $ (current\_set, target\_number, \mathcal{P}) \longrightarrow m $
>
> *initialize* :  
> - $ current\_set = [\quad] $  : This is a dynamic store of possible solutions 
> - $ \mathcal{P} = \emptyset $ : This is a set of (non-repeated) sorted integer partitions
>
> *input* :  
> - $ current\_set \subseteq \mathbb{N}^+ $ : current partition sequence  
> - $ target\_number \in \mathbb{N}^+ $ : target integer to partition  
> - $ \mathcal{P} = \left\{ (p_1, p_2, \dots, p_k) \mid \sum_{i=1}^{k} p_i = target\_number, \quad p_1 \leq p_2 \leq \dots \leq p_k, \quad p_i \in \mathbb{N}^+ \right\} \subseteq \mathcal{P}(\mathbb{N}^+) $
>
> *output* :  
> $ m = |\mathcal{P}| $

### **Algorithm**:
1. **Compute total sum** of elements in $ current\_set $:  
   - $ \text{total} \leftarrow \sum current\_set $
2. **If** $ \text{total} = target\_number $ and $ current\_set $ (sorted) is **not** in $ \mathcal{P} $:
   - **Store** partition:
     $ \mathcal{P} \leftarrow \mathcal{P} \cup \{\text{sorted}(current\_set)\} $
   - **Print** the partition $ current\_set $.
   - **Return**.
3. **Otherwise**, iterate over all valid next numbers $ i $ in the range:  
   $ i \in \{1, 2, \dots, target\_number - \text{total} \} $
   - Append $ i $ to $ current\_set $:
     $ current\_set \leftarrow current\_set \cup \{i\} $
   - **Recursively call** IntegerPartitionBacktracking(current\_set, target\_number, $\mathcal{P}$).
   - **Backtrack** by removing $ i $ from $ current\_set $:
     $ current\_set \leftarrow current\_set \setminus \{i\} $
4. **Return** the number of unique partitions:
   $ m = |\mathcal{P}| $



To clarify how this algorithm proceeds, we work through how the first few solutions are stored for target_number = 5.
1. As the algorithm begins, it does not reach the backtrack in step 3 until the list is populated with all 1s, [1,1,1,1,1]. 
2. At this point, the partition is stored, and the algorithm backtracks one step, removing the final 1, so we have [1,1,1,1]. 
3. This does not sum to 5, so we step through to the next value in the iteration over the final entry, moving it to the number 2. 
4. This means we now have [1,1,1,2]. This sums to 5, hence we store it as a solution.
5. This final element is then removed, leaving us with [1,1,1], and we begin looping over the values of the rightmost entry here.
6. The solution [1,1,2,1] is found, but its sorted form (1,1,1,2) is already stored, so it is skipped. 
7. We then arrive at [1,1,3]. This is a valid solution, and it is stored. 

The full set of solutions is given below, in the order in which they are found:
- (1, 1, 1, 1, 1)
- (1, 1, 1, 2)
- (1, 1, 3)
- (1, 2, 2)
- (1, 4)
- (2, 3)
- (5)
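The algorithm and worked example above can be sketched as a standalone recursive function. This is a minimal illustration rather than the final implementation in Section 2: the function name is ours, and a dict is used as an insertion-ordered set so that the order of discovery is preserved.

```python
from typing import Dict, List, Tuple


def integer_partition_backtracking(
    current_set: List[int],
    target_number: int,
    partitions: Dict[Tuple[int, ...], None],
) -> int:
    """Recursive sketch; returns the number of unique partitions stored so far."""
    total = sum(current_set)
    if total == target_number:
        key = tuple(sorted(current_set))
        if key not in partitions:
            partitions[key] = None  # store the sorted partition once
        return len(partitions)
    # Try every value that does not overshoot the target
    for i in range(1, target_number - total + 1):
        current_set.append(i)
        integer_partition_backtracking(current_set, target_number, partitions)
        current_set.pop()  # backtrack
    return len(partitions)


found: Dict[Tuple[int, ...], None] = {}
m = integer_partition_backtracking([], 5, found)
print(m)            # 7
print(list(found))  # same order as the list above
```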

### 1.2 Gray Code Generation

A Gray code is an ordering of the $n$-bit binary strings such that successive strings differ in exactly one bit. This is useful in applications such as error correction. An example Gray code which is $n = 3$ bits long is given below.
-  (0, 0, 0)
-  (1, 0, 0)
-  (1, 1, 0)
-  (0, 1, 0)
-  (0, 1, 1)
-  (1, 1, 1)
-  (1, 0, 1)
-  (0, 0, 1)
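The defining property can be checked mechanically. The sketch below uses a hypothetical helper `is_gray_code` (not part of the coursework code) which assumes all codewords have the same length, and tests that they are distinct and that each neighbouring pair differs in exactly one position.

```python
from typing import Sequence, Tuple


def is_gray_code(codes: Sequence[Tuple[int, ...]]) -> bool:
    """Return True if codewords are distinct and neighbours differ in exactly one bit."""
    if len(set(codes)) != len(codes):
        return False
    return all(
        sum(a != b for a, b in zip(prev, curr)) == 1
        for prev, curr in zip(codes, codes[1:])
    )


example = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 1),
]
print(is_gray_code(example))  # True
```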

The Gray code problem lends itself to a recursive algorithm slightly more naturally. Note that we do not have to generate every possible Gray code, only a single solution. 


> **GrayCodeBacktracking** :  
> $ (number\_of\_bits) \longrightarrow \mathcal{P} $
>
> *initialize* :  
> - $ current\_bits = (0, 0, \dots, 0) $ of length $ number\_of\_bits $  
> - $ \mathcal{P} = [\quad] $ : list of generated Gray codes
>
> *input* :  
> - $ number\_of\_bits \in \mathbb{N}^+ $ : length of the bit sequence
>
> *output* :  
> $ \mathcal{P} $ : list of all generated Gray codes of length $ number\_of\_bits $

### **Algorithm**:
1. **If** $ current\_bits $ is already in $ \mathcal{P} $, **return** False.
2. **Otherwise**, append a copy of $ current\_bits $ to $ \mathcal{P} $.
3. **For** each bit position $ i $ from $ 0 $ to $ number\_of\_bits - 1 $:
   - Flip bit $ i $:
     $ \text{current\_bits}[i] \leftarrow 1 - \text{current\_bits}[i] $
   - Recursively call `GrayCodeBacktracking` with the updated bit sequence.
   - **Backtrack** by flipping the bit back:
     $ \text{current\_bits}[i] \leftarrow 1 - \text{current\_bits}[i] $
4. **Return** the list of generated Gray codes $ \mathcal{P} $.
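The steps above can be sketched as a standalone function. This is an illustrative version under our own naming, not the final class-based implementation. Note that a depth-first visiting order is not guaranteed to be a Gray code in general; with bits flipped in the order 0 to number_of_bits - 1, it reproduces the 3-bit example given earlier.

```python
from typing import Dict, List, Tuple


def gray_code_backtracking(number_of_bits: int) -> List[Tuple[int, ...]]:
    """Depth-first sketch; returns the codewords in the order they are first visited."""
    current_bits = [0] * number_of_bits
    visited: Dict[Tuple[int, ...], None] = {}

    def visit() -> bool:
        key = tuple(current_bits)
        if key in visited:
            return False                 # step 1: already generated
        visited[key] = None              # step 2: store a copy of the state
        for i in range(number_of_bits):  # step 3: try flipping each bit
            current_bits[i] = 1 - current_bits[i]
            visit()
            current_bits[i] = 1 - current_bits[i]  # backtrack
        return True

    visit()
    return list(visited)                 # step 4


for code in gray_code_backtracking(3):
    print(code)
```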


This becomes clearest with a diagram. It takes the form of a tree, for which we provide a worked example with $n = 3$.

![Sample Image](gn_algo.png)


## 2. Code
### 2.1 Class Definition

In [2]:
class BacktrackingSolver:
    '''
    Class for implementing backtracking solver. 

    Backtracking is an algorithm for the purpose of solving problems where you can make sequential 
    decisions, some of which may end up being incorrect. The power of backtracking is it implements 
    a depth first search through your solution space, undoing a previous step (backtracking) if an 
    impossible/ not allowed state is reached. 

    Class is initialised with a collection of functions designed for each problem purpose

    Parameters:
        n (int): Parameter for board size. Interpretation will vary depending on application.
        populate_board (callable): Function to initialise the board state, taking n as an input.
        validity (callable): Check if a proposed solution is valid, if so proceed down the solution tree.
        record (callable): Note down the new and correct point in the solution tree.
        next_moves (callable): From current position, return the possible next states. 
        apply_step (callable): Move along the path of the next move.
        undo_step (callable): Undo the last found move if dead end is reached. 
        stop_after_single_solution (bool): Stop after a single solution is found (True) or find all (False).

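    Returns (from solve): 
        number of permutations (int): The number of possible routes through the sample space.
        permutations (list[tuple[int, ...]]): The possible routes through the sample space, in the order found.
        recursive calls (int): The total number of calls made to solve.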

    Example: 
    >>> solver = BacktrackingSolver(
    >>>                             n=4,
    >>>                             populate_board=fn_populate_board, 
    >>>                             validity=fn_validity, 
    >>>                             record=fn_record, 
    >>>                             next_moves=fn_next_moves, 
    >>>                             apply_step=fn_apply_step, 
    >>>                             undo_step=fn_undo_step, 
    >>>                             stop_after_single_solution=False
    >>>                             )
    >>> num_solutions, solutions, calls = solver.solve()
    '''
    
    def __init__(
        self, 
        n: int, 
        populate_board: Callable[[int], List[int]],
        validity: Callable[[int, List[int], Dict[Tuple[int, ...], None]], bool],
        record: Callable[[List[int], Dict[Tuple[int, ...], None]], Dict[Tuple[int, ...], None]],
        next_moves: Callable[[int, List[int]], List[int]],
        apply_step: Callable[[int, List[int]], List[int]],
        undo_step: Callable[[int, List[int]], List[int]],
        stop_after_single_solution: bool
    ) -> None:
        
        '''Initialisation of class method'''
        self.n = n
        self.populate_board = populate_board
        self.validity = validity
        self.record = record
        self.next_moves = next_moves
        self.apply_step = apply_step
        self.undo_step = undo_step
        self.stop_after_single_solution = stop_after_single_solution
        self.permutations: Dict[Tuple[int, ...], None] = {}   # dictionary to store the permutations
        self.recursive_calls:int = 0

    def solve(
        self, 
        current_solution: Optional[List[int]] = None
    ) -> Tuple[int, List[Tuple[int, ...]], int]:
        """
        Runs the backtracking algorithm.

        Parameters:
            current_solution: Current state of the solution (defaults to a populated board if none)
        """
        if current_solution is None:
            current_solution = self.populate_board(self.n)

        self.recursive_calls += 1   # count every call to solve (total calls, not stack depth)
        
        # Check validity
        if self.validity(self.n, current_solution, self.permutations):
            self.permutations = self.record(current_solution, self.permutations)
        else:
            if self.stop_after_single_solution:
                return (len(self.permutations.keys()), list(self.permutations.keys()), self.recursive_calls)

        # Explore next moves
        for candidate_move in self.next_moves(self.n, current_solution):
            current_solution = self.apply_step(candidate_move, current_solution)
            self.solve(current_solution)
            current_solution = self.undo_step(candidate_move, current_solution)

        return (len(self.permutations.keys()), list(self.permutations.keys()), self.recursive_calls)


A note on design principles

This is not exactly the implementation found on Wikipedia, and in fact it sacrifices efficiency for the sake of being a 'pure' backtracking algorithm. For example, in self.next_moves, we could check which of the next moves lead to states that have not yet been visited (an $\mathcal{O}(1)$ check, since the states are the keys of our self.permutations dict) before applying the move, recursively calling the function, and exploring a path further. This would amount to a more effective pruning of possible paths. 

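The self.stop_after_single_solution flag is a hacky fix to the issue of early termination. When it is set, any call that reaches an invalid state (for Gray codes, a state that has already been visited) returns immediately instead of exploring further moves. This ensures that once you have explored the full space (which we do in the case of Gray codes) the returns propagate all the way up the recursive stack.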

Lists are mutable and not hashable, so we turn them into tuples before we add them to the dictionary as keys. We make use of a dict since we can search the keys in $\mathcal{O}(1)$ average time, and also because we can return an ordered list (keys are returned in the order they were added, which is guaranteed from Python 3.7 onwards). You could achieve a similar thing with a list which stores the ordered Gray code numbers/integer partitions, and a set to do lookup, but this is redundant in memory. 
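A minimal sketch of this dict-as-ordered-set pattern, using made-up candidate lists purely for illustration:

```python
from typing import Dict, List, Tuple

seen: Dict[Tuple[int, ...], None] = {}

for candidate in ([1, 3], [2, 2], [3, 1], [4]):
    key = tuple(sorted(candidate))  # lists are unhashable, tuples are hashable
    if key not in seen:             # average-case O(1) membership test
        seen[key] = None            # the value is a placeholder; only the key matters

ordered: List[Tuple[int, ...]] = list(seen)
print(ordered)  # [(1, 3), (2, 2), (4,)]
```

The duplicate [3, 1] is dropped because its sorted form is already a key, and the remaining keys come back in insertion order.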

### 2.2 Example Calls to Class

#### 2.2.1 Integer Partitions

In [3]:
# Functions for integer partitions
def ip_populate_board(n: int) -> List[int]:
    '''Set up the board function based on the problem, here just an empty list'''
    return []

def ip_validity(n: int, current_solution: List[int], permutations: Dict[Tuple[int, ...], None]) -> bool:
    '''test if a current solution is valid, here that means that you should sort, 
    since the partitions are not an ordered set'''
    sorted_solution = sorted(current_solution)
    return (sum(current_solution) == n) and (tuple(sorted_solution) not in permutations.keys())

def ip_record(current_solution: List[int], permutations: Dict[Tuple[int, ...], None]) -> Dict[Tuple[int, ...], None]:
    '''Note down the current solution'''
    sorted_solution = sorted(current_solution)
    permutations[tuple(sorted_solution)] = None
    return permutations

def ip_next_moves(n: int, current_solution: List[int]) -> List[int]:
    '''Return a list of the possible next moves'''
    return list(range(1, n - sum(current_solution) + 1))

def ip_apply_step(candidate_move: int, current_solution: List[int]) -> List[int]:
    '''Step in the direction just decided'''
    current_solution.append(candidate_move)
    return current_solution

def ip_undo_step(candidate_move: int, current_solution: List[int]) -> List[int]:
    '''Remove the last appended element (i.e., backtrack)'''
    current_solution.pop()
    return current_solution

In [8]:
solver = BacktrackingSolver(
    n=4,
    populate_board=ip_populate_board, 
    validity=ip_validity, 
    record=ip_record, 
    next_moves=ip_next_moves, 
    apply_step=ip_apply_step, 
    undo_step=ip_undo_step, 
    stop_after_single_solution=False
)
num_solutions, solutions, calls = solver.solve()

if num_solutions == 1:
    print(f'There is {num_solutions} Integer Partition, found in {calls} recursive calls and this is {list(solutions)}')
else:
    print(f'There are {num_solutions} Integer Partitions, found in {calls} recursive calls and they are:')
    for partition in list(solutions):
        print(partition)

There are 5 Integer Partitions, found in 16 recursive calls and they are:
(1, 1, 1, 1)
(1, 1, 2)
(1, 3)
(2, 2)
(4,)


#### 2.2.2 Gray Numbers

In [26]:
# Functions for generating Gray numbers
def gn_populate_board(n: int) -> List[int]:
    '''Populate board method for the Gray Numbers case'''
    return [0] * n

def gn_validity(n: int, current_solution: List[int], permutations: Dict[Tuple[int, ...], None]) -> bool:
    '''Test if solution is valid, i.e. has not been found yet'''
    return (tuple(current_solution) not in permutations.keys()) 

def gn_record(current_solution: List[int], permutations: Dict[Tuple[int, ...], None]) -> Dict[Tuple[int, ...], None]:
    '''note down a correct solution'''
    permutations[tuple(current_solution)] = None
    return permutations

def gn_next_moves(n: int, current_solution: List[int]) -> List[int]:
    '''Return positions where a bit flip can occur'''
    return list(range(n))

def gn_apply_step(flip_position: int, current_solution: List[int]) -> List[int]:
    '''Flip the bit at flip_position'''
    current_solution[flip_position] = 1 - current_solution[flip_position]
    return current_solution

def gn_undo_step(flip_position: int, current_solution: List[int]) -> List[int]:
    '''Flip the bit back (same as apply in this scenario)'''
    current_solution[flip_position] = 1 - current_solution[flip_position]
    return current_solution

In [27]:
solver = BacktrackingSolver(
    n=4,
    populate_board=gn_populate_board, 
    validity=gn_validity, 
    record=gn_record, 
    next_moves=gn_next_moves, 
    apply_step=gn_apply_step, 
    undo_step=gn_undo_step, 
    stop_after_single_solution=True
)
num_solutions, solutions, calls = solver.solve()

print(f'A possible Gray code of length {int(np.log2(num_solutions))} bits, found in {calls} recursive calls is:')
for code in list(solutions):
    print(code)

A possible Gray code of length 4 bits, found in 65 recursive calls is:
(0, 0, 0, 0)
(1, 0, 0, 0)
(1, 1, 0, 0)
(0, 1, 0, 0)
(0, 1, 1, 0)
(1, 1, 1, 0)
(1, 0, 1, 0)
(0, 0, 1, 0)
(0, 0, 1, 1)
(1, 0, 1, 1)
(1, 1, 1, 1)
(0, 1, 1, 1)
(0, 1, 0, 1)
(1, 1, 0, 1)
(1, 0, 0, 1)
(0, 0, 0, 1)


## 3. Appraisal

There are a few possible avenues for development in my code. One is that the functions for one use case may require different arguments from those for another. For example, gn_undo_step needs to know the last step which took place, whereas ip_undo_step simply pops the last element. Defining these as partial functions could perhaps help clear up the API, allowing only the required arguments to be passed in. I did not do this in the end: for the malleability of the class it is easiest to offer each function everything it could possibly need, and let each function pick the relevant arguments on its own. 
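A rough sketch of what the partial-function idea might look like, using `functools.partial`. The function names here are illustrative and are not the ones used in Section 2.

```python
from functools import partial
from typing import Callable, List


def undo_flip(flip_position: int, current_solution: List[int]) -> List[int]:
    """Undo a bit flip; needs to know which position was flipped."""
    current_solution[flip_position] = 1 - current_solution[flip_position]
    return current_solution


def undo_append(current_solution: List[int]) -> List[int]:
    """Undo an append; does not need to know the move at all."""
    current_solution.pop()
    return current_solution


# Bind the move up front, leaving a callable that takes only the solution,
# which matches the one-argument signature of undo_append.
undo_bit_2: Callable[[List[int]], List[int]] = partial(undo_flip, 2)

print(undo_bit_2([0, 0, 1]))   # [0, 0, 0]
print(undo_append([1, 1, 2]))  # [1, 1]
```

With this shape the solver would call every undo function with the solution alone, at the cost of building a new partial object for each move.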

I think the choice of data structures throughout has been well suited to the problem. Using dictionary keys allows for $\mathcal{O}(1)$ average lookup time, and also preserves the order in which they are added. 

Efficiency gains could be made by more accurately/ thoroughly checking candidate next_moves, so that you cannot end up revisiting the same states you were in before. 
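As one possible illustration for integer partitions, the candidate moves can be restricted so that parts are only ever appended in non-decreasing order. Every partition is then generated exactly once and no duplicate check is needed. This is a standalone sketch with our own function name, not a drop-in replacement for ip_next_moves.

```python
from typing import List, Tuple


def partitions_nondecreasing(n: int) -> Tuple[List[Tuple[int, ...]], int]:
    """Enumerate partitions of n, only appending parts >= the last part.

    Returns the partitions found and the number of recursive calls made.
    """
    found: List[Tuple[int, ...]] = []
    calls = 0

    def extend(current: List[int], remaining: int) -> None:
        nonlocal calls
        calls += 1
        if remaining == 0:
            found.append(tuple(current))
            return
        smallest = current[-1] if current else 1
        for part in range(smallest, remaining + 1):
            current.append(part)
            extend(current, remaining - part)
            current.pop()  # backtrack

    extend([], n)
    return found, calls


partitions, calls = partitions_nondecreasing(4)
print(len(partitions), calls)  # 5 12
```

For n = 4 this makes 12 recursive calls, compared with the 16 reported in Section 2.2.1.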

Overall, I think the code is reusable and robust. Efficiency gains could be made with regard to navigating the solution space, which would also reduce the number of recursive calls. Further work would involve testing the implementation on other backtracking problems, to see whether the same structure applies in other cases. If not, we have likely benefitted from some commonalities between these two problems without realising.  